How "Real-Time" Is Your RTB Platform?
by Ciaran O'Kane on 17th Feb 2012 in News
Brian O'Kelley is CEO of AppNexus. Here he discusses cache update cycles, how shorter cycles are critical for buyers using real-time platforms, and why traders should be placing greater importance on it.
Last week, Right Media posted a blog entry stating that it has "slashed" its cache update cycle to 45 minutes. This is a great excuse for the industry to take a closer look at trafficking update cycles and understand why they're so important.
A real-time ad platform is built around a trafficking database which stores all the campaigns, creatives and other targeting information. It also has thousands of servers in multiple datacentres which store an efficient representation, or "cache", of this database. The cache update cycle is the amount of time it takes for a change you make in the primary trafficking database - for instance, a new creative - to be syndicated out to each of the servers. When the change is out on every server, we consider the cache update complete.
A cache update cycle of 30 to 45 minutes doesn't sound very long (and in fact, I think it's better than the industry average). However, in the world of ad exchanges and real-time bidding, it's an eternity. In the blog post Right Media says that it serves 11 billion transactions a day. We can assume this has a peak volume of around 200,000 transactions per second. That means that in the 45 minutes (2,700 seconds) that a cache update cycle takes, around 540 million transactions will occur.
Let's imagine a worst-case scenario (one that happens all too often): a trafficker puts a campaign live and forgets to apply targeting. He quickly realises his mistake and fixes it, but it's too late: the untargeted campaign is already en route to the servers, and the fix can't reach them until the next cache update completes. If we assume a $0.60 global CPM, this campaign could spend $324,000 before it's possible to fix it. That's a lot of money on the line every time a trafficker hits "submit".
It's a lot like driving a car: at slow speeds, you have a long time to react without consequences. At high speeds, you need the ability to make split-second changes or you're in real trouble - and the consequences are potentially immense.
When Mike Nolet and I started AppNexus, we wanted to build the ad technology equivalent of a precision driving machine: an ad server that could get cache updates out to thousands of servers fast. The engineering challenge is, as our team likes to say, "non-trivial". In fact, this was my favorite interview question for new hires in the first year or two of AppNexus: "How do you build an ad server that can do fast cache updates at scale?"
What we came up with was a fundamental re-thinking of how an ad platform works. In the old days, when I was the CTO of Right Media, a cache update meant copying the entire database to a file, sending the file to every box, and having each box load it into memory. (I have no knowledge of how the system works today.) The problem with this approach is that the bigger the file gets, the longer the process takes. It would take a super-human engineering feat to get these massive cache files out to all of the Right Media servers in just 45 minutes.
At AppNexus, we do things a bit differently. Instead of sending the whole file out to each box on every update cycle, we only push out the changes. This dramatically reduces the amount of data that we have to send out each cache cycle, and means that we can update our caches as frequently as we want to. At the moment, any change you make through AppNexus Console or the AppNexus API is live on every server in three minutes or less. I'm making this sound a lot easier than it is, but hey, it's nice to be the CEO, not the CTO!
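To make the difference between the two approaches concrete, here is a minimal Python sketch. It is not AppNexus's or Right Media's actual implementation: the class names (`TraffickingDB`, `AdServerCache`), the version-numbered change log and the key format are all assumptions made for illustration. It only shows that a server holding an older copy can catch up by applying the changes made since its last known version, instead of reloading everything.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Change:
    version: int
    key: str
    value: Optional[Any]  # None means "delete this key"


class TraffickingDB:
    """Hypothetical primary database that logs every change with a version."""

    def __init__(self):
        self.data = {}
        self.log = []
        self.version = 0

    def put(self, key, value):
        self.version += 1
        if value is None:
            self.data.pop(key, None)
        else:
            self.data[key] = value
        self.log.append(Change(self.version, key, value))

    def snapshot(self):
        # Old-style update: the whole database, however big it has grown.
        return self.version, dict(self.data)

    def changes_since(self, version):
        # Delta update: only what changed after the given version.
        return [c for c in self.log if c.version > version]


class AdServerCache:
    """Hypothetical in-memory cache held by one ad server."""

    def __init__(self):
        self.data = {}
        self.version = 0

    def load_snapshot(self, version, data):
        self.data = dict(data)
        self.version = version

    def apply_changes(self, changes):
        for c in changes:
            if c.value is None:
                self.data.pop(c.key, None)
            else:
                self.data[c.key] = c.value
            self.version = c.version
        return len(changes)


db = TraffickingDB()
for i in range(1000):
    db.put(f"creative:{i}", {"status": "live"})

server = AdServerCache()
server.load_snapshot(*db.snapshot())  # full copy: 1,000 records

# A trafficker makes a mistake, then fixes it.
db.put("campaign:42", {"targeting": "none"})
db.put("campaign:42", {"targeting": "UK only"})

sent = server.apply_changes(db.changes_since(server.version))
print(f"records sent in delta update: {sent}")
```

In this toy setup the delta update ships two records while a fresh snapshot would ship more than a thousand, and the gap widens as the database grows. A real system would also have to handle lost or out-of-order updates and trimming of the change log, which the sketch ignores.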
Let's calculate the economic impact this has. AppNexus processes around 15 billion impressions a day, so at peak we do around 275,000 transactions per second. In the three minutes that it takes to do a cache update cycle, we see 50 million impressions, meaning that the worst case trafficking error would cost $30,000. It's still a lot of money for a trafficking error, but to use my driving analogy, we just dinged our Lamborghini instead of totaling it.
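The two worst-case estimates above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch using the figures quoted in this post (peak transactions per second, cycle length, and the assumed $0.60 CPM); the function name `worst_case_exposure` is made up for illustration.

```python
def worst_case_exposure(peak_tps, cycle_minutes, cpm_usd):
    """Return (impressions served during one cache cycle, spend in USD)."""
    impressions = int(peak_tps * cycle_minutes * 60)
    # CPM is the price per thousand impressions.
    spend = round(impressions / 1000 * cpm_usd, 2)
    return impressions, spend


scenarios = {
    "45-minute cycle at 200,000 tps": worst_case_exposure(200_000, 45, 0.60),
    "3-minute cycle at 275,000 tps": worst_case_exposure(275_000, 3, 0.60),
}

for name, (impressions, spend) in scenarios.items():
    print(f"{name}: {impressions:,} impressions, ${spend:,.0f} at risk")
```

The exact figure for the three-minute cycle is 49.5 million impressions, which rounds to the 50 million and roughly $30,000 quoted above.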
It's a shame that most of the companies in the RTB space aren't talking about this issue. The Forrester DSP analysis didn't include cache update cycle (though it did include its siblings, reporting update cycle and optimisation update cycle) as an evaluation criterion. I think this is a major oversight. The volume available to real-time bidders is doubling each year, and with it, the amount of risk buyers take with every trafficking update.
When driving on the RTB superhighway, you need a fast car - but you also need precision steering. Kudos to Right Media for making big investments here, and I hope the rest of the industry follows suit.