As an organization, Chitika has wholeheartedly invested in building out the Chitika Insights research program through staffing, rigorous process management, and a quality data infrastructure. Additionally, our reports rely on our vast trove of ad impression data and our knowledgeable and skilled team of data scientists. It’s this impression-level data and attention to statistical detail that has ultimately led our research to be cited by The New York Times, Wall Street Journal, CNN, Bloomberg, and many others.
In our last edition of Logging Data, we introduced Cluster Map Reduce, or CMR. The new tool acts as an alternative to Hadoop and HDFS when paired with a POSIX-compliant clustered file system, simplifying the movement of data through the analytical back end and helping to minimize the dependencies and potential points where the data pull process may slow or stop altogether. Today, we’re proud to provide this tool to the world as a free, open source release!
Thus far, our Logging Data series has focused on the nuts and bolts of our network operations and data infrastructure. While we employ some terrific software and hardware, our proverbial secret sauce consists of the various customizations we employ using these tools. Nowhere was this more evident than during the transition from HDFS to Gluster, and the subsequent porting of Hadoop resources. The team here is well versed in working around issues, so after some brainstorming, the solution pretty much morphed into “Let’s just build something internally that fulfills our needs better than Hadoop.” Not an easy task, but one that our Operations and DI teams took on readily.
We’ve briefly mentioned our implementation of Infiniband in both of the previous Logging Data posts without giving a thorough explanation of its function and capabilities within our architecture. In this latest installment, we’ll be doing just that, along with discussing our corresponding Hadoop framework.
The previous installment of our Logging Data series outlined how individual impressions move through our network. In this edition, we’ll discuss the storage considerations necessary for cataloguing all of these impressions effectively 24 hours a day, specifically focusing on the challenges that result from the requirements of ad network operations.
In this “Logging Data” series, we’ll provide some in-depth detail on the intricacies of data collection, infrastructure, and access here at Chitika, hopefully providing some useful lessons for both newcomers and veterans in the field. Our first post will focus on our logs – the baseline of our data collection – and the subsequent processes that coalesce the information they contain into more readily accessible formats for our data scientists.
As Chitika displays hundreds of millions of ads every day, a tremendous amount of data needs to be stored to fulfill business requirements. In our case, that figure amounts to roughly 1 to 1.2 TB per day. The Chitika Data Infrastructure and Engineering teams each have several Minecraft aficionados among them, and our recent side project visualizes roughly 10 terabytes of this data as large towers of 8-bit 3D-rendered blocks. We call it, aptly, the Great Wall of Data.
Earlier this week on Digital Point, Chitika publisher Chris discussed how he got his forum, DeadMansCrossZone.com, upgraded to Gold level while attracting a relatively small audience – tens of thousands of impressions per day. While we’ve discussed major considerations in terms of getting your site Gold account ready, here we’ll be discussing some of the unique considerations for smaller websites looking to achieve Gold account status as quickly as possible.
Formerly the Publisher Panel, our revamped Partner Center has undergone some major changes with our latest redesign. The aim was to make it as easy as possible for publishers to monitor and make changes to their accounts. In this blog post, we provide an overview of some of the newest tools and improvements that are available to all Chitika publishers.
A question we often get from new publishers relates to what they can expect to earn using Chitika. While the answer can vary significantly, this blog post outlines the factors that influence publisher revenues on our network.