Google’s BigQuery: A Revolution Or An Evolution For Big Data Strategy?
by Ciaran O'Kane on 8th May 2012 in News
Damien Healy is Group Head of Technology at Havas Digital. Here he discusses Google's BigQuery roll-out and what it means for the big data requirements of digital media companies.
On Tuesday Google released ‘BigQuery’ to the public. BigQuery is a cloud platform enabling queries (questions) to be asked across potentially VERY large data sets. Is it a revolution or an evolution? To answer this, it’s worth backtracking slightly to view it in context of ‘Big Data’ and cloud services generally.
The ‘Big Data’ ecosystem is relatively new, but essentially it enables data to be processed at a scale that simply was not possible just a few years ago. The way to achieve this is to horizontally scale (i.e. break up the load across large numbers of computers), and to minimise how much the data needs to move around. Many platforms offer this kind of capability – the Amazon cloud offers services based on Hadoop, and we ourselves in Havas use Greenplum software & EMC server clusters to power our global data analytics platform, Artemis™. IBM, Informatica, Teradata, Microsoft, Oracle, and others also have Big Data solutions of varying capability. With Tuesday’s launch, Google’s BigQuery is now another addition to that competitive set.
Ironically, just about every ‘Big Data’ platform borrows a process called ‘MapReduce’, which Google itself described publicly back in 2004. This technology underpins Big Data systems’ ability to break big problems into very many, much smaller, problems (the ‘Map’) that can each be sent out to individual servers. The ‘Reduce’ step subsequently combines all of the partial results into a single response. This is the not-so-secret-sauce of Big Data – with enough servers, even the biggest queries can be trivial. With the core mechanics of Big Data largely understood, the question becomes where to keep data, and where to run the analytics. This is where “The Cloud” comes in.
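To make the ‘Map’ and ‘Reduce’ steps concrete, here is a minimal single-machine sketch in Python. It is an illustration only, not how BigQuery or Hadoop is actually implemented: the sample log lines, the campaign names and the function names (map_phase, shuffle, reduce_phase, map_reduce) are all hypothetical, and a real platform would run each chunk on a separate server rather than in a loop.

```python
from collections import defaultdict
from itertools import chain


def map_phase(chunk):
    """Map: turn one chunk of log lines into (campaign, 1) pairs.

    In a real cluster each chunk would be handled by a different server.
    """
    return [(line.split(",")[0], 1) for line in chunk]


def shuffle(pairs):
    """Group the mapped values by key so each key can be reduced on its own."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all partial values for one key into a single result."""
    return key, sum(values)


def map_reduce(chunks):
    """Run map, shuffle and reduce in sequence over a list of chunks."""
    mapped = chain.from_iterable(map_phase(chunk) for chunk in chunks)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample data: two chunks of "campaign,event" log lines.
chunks = [
    ["campaign_a,impression", "campaign_b,impression"],
    ["campaign_a,click", "campaign_a,impression"],
]

totals = map_reduce(chunks)
print(totals)
```

Because each chunk is mapped independently, adding more servers lets more chunks be processed at the same time, which is the horizontal scaling described above.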
Cloud-based services continue to generate a lot of hype. Why? The ability to simply throw problems at massive pools of servers ‘somewhere else’ definitely has its charms! Services like Google’s and Amazon’s mainly benefit two types of business – those that need to avoid up-front capex costs, and those that have highly variable or short-term needs. An example of the former would be a digital startup such as a DSP, while an example of the latter would be a small animation studio that needs a large number of machines to render animation on a variable basis. In cases like these, the economics of cloud services are hard to challenge – but this is only part of the story.
In the case of businesses like Havas, our Artemis™ platform has massive long-term data requirements that are continually growing. We store historical data for every individual touch point between consumers and our clients’ advertising, which can date back years. We also hold summarised performance data running back ten years. This amounts to a LOT of data. In addition, analytics – the process of creating insights through data queries – runs throughout our use of that data. In cases like this, the economics of paying cloud services for data storage over time, and paying compute-time or data-access charges for analytics, are not such an easy sell.
Data requirements in digital are constantly evolving; there seems to be literally no limit to the data that can be used for marketing analytics and reporting. Services like Google’s BigQuery and Amazon’s Elastic MapReduce offering are brilliant, as they offer firmly established, stable ways for companies to find advantage in data. Cloud-based analytics also drives innovation by massively altering the economics of Big Data for startups and those businesses with heavily fluctuating analytical demands.
Is Google BigQuery a revolution or an evolution then? I’d say that it’s simply an evolutionary step forwards, yet a very welcome one that re-aligns Google within the Big Data space. The broader emerging ability to extract value from Big Data, however – that is most certainly a revolution, driving change throughout organisations within and far outside of marketing. It’s very early days and the potential can only be imagined today.