Google's cloud-based analytics service lets advertisers and other external users store, query, and explore big data in a hurry.
Google Tuesday formally launched its BigQuery service, which promises to "analyze terabytes of data with just a click of a button."
The cloud-based service, which was previewed in March, combines a NoSQL-style data store with "SQL-like" querying capabilities. The service is geared to interactive analysis of very large datasets. Google said BigQuery is not a database, contrasting it with its cloud-hosted, MySQL-based Google Cloud SQL service. Where Cloud SQL offers full-SQL syntax and table management tools, BigQuery does not support table indexes, updates, deletes, or other SQL data-management features.
Like NoSQL systems, BigQuery uses no fixed schemas; Google said data will typically be added using a small number of very large, append-only tables. The service then supports ad-hoc querying, reporting, data-exploration, or even Web-based applications that run against these multi-terabyte data sets. Users interact with the service through a BigQuery Browser tool, a command-line tool, or through calls to a REST-based API using Java, Python, or other languages.
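For readers curious what a call to the REST-based API might look like from Python, here is a minimal sketch that only builds the HTTP request for a query and does not send it. The endpoint path follows Google's current public BigQuery v2 documentation and may differ from what was offered at launch. The project ID, table name, and column names are hypothetical, and a real call would also need an OAuth access token.

```python
import json

# Assumption: base URL as published in Google's current BigQuery v2 REST docs.
API_ROOT = "https://bigquery.googleapis.com/bigquery/v2"


def build_query_request(project_id, sql, max_results=100):
    """Return (url, headers, body) for a synchronous query request.

    Nothing is sent over the network; an "Authorization: Bearer <token>"
    header would have to be added before posting this to the service.
    """
    url = f"{API_ROOT}/projects/{project_id}/queries"
    headers = {"Content-Type": "application/json"}
    body = json.dumps({"query": sql, "maxResults": max_results})
    return url, headers, body


# Hypothetical project, table, and columns, for illustration only.
# Table-reference syntax depends on the SQL dialect the service accepts.
example_sql = (
    "SELECT keyword, SUM(clicks) AS total_clicks "
    "FROM ads.campaign_stats "
    "GROUP BY keyword ORDER BY total_clicks DESC LIMIT 10"
)

if __name__ == "__main__":
    url, headers, body = build_query_request("example-project", example_sql)
    print(url)
    print(body)
```

The same request could be issued from Java or any other language that can make an authenticated HTTPS POST.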
The primary use of the service is likely to be analysis of Google advertising data that already resides in Google's cloud. "When an advertiser wants to understand the ROI or effectiveness of a keyword campaign running across the globe, that's a big-data problem," said Google's Ju-Kay Kwek, a product manager on the Google Cloud Platform team, during a public preview of BigQuery at a March trade show.
Google AdWords customers typically extract data from that service using the AdWords API in order to build on-premises databases for further analysis. But these databases often become unwieldy, requiring complex sharding and indexing steps such that "customers sometimes lose track of the questions they wanted to ask by the time they have the data available," Kwek said.
Customers will also be able to upload their own data to BigQuery, and Google is counting on a variety of uses for external developers. Beta users of the service have taken on challenges such as Web ad and e-commerce targeting. French business intelligence vendor We Are Cloud has tested BigQuery as a high-scale, backend data store for its own query, analysis, and data-visualization capabilities. The firm says BigQuery will free its customers from the arduous task of running a big platform while supporting rapid query and analysis of tens of terabytes of data.
Attracting big-data customers would certainly bolster Google's online storage business, which just got a big shot in the arm by way of last week's launch of Google Drive. Storing data in the BigQuery service will cost 12 cents per gigabyte per month for up to two terabytes, with costs declining above that volume. The BigQuery analysis itself costs 3.5 cents per gigabyte of data processed.
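To make the quoted prices concrete, here is a rough sketch of a monthly bill estimate. It uses only the two rates cited above and assumes 1 terabyte equals 1,000 gigabytes. Because the discounted rates above two terabytes are not given, the hypothetical `estimate_monthly_cost` function refuses to estimate storage beyond that tier rather than guess.

```python
# Rates taken from the article.
STORAGE_USD_PER_GB_MONTH = 0.12     # applies to the first 2 TB stored
QUERY_USD_PER_GB_PROCESSED = 0.035  # charged per GB scanned by queries

# Assumption: decimal terabytes (1 TB = 1,000 GB).
FIRST_TIER_LIMIT_GB = 2 * 1000


def estimate_monthly_cost(stored_gb, processed_gb):
    """Return (storage_cost, query_cost, total) in U.S. dollars."""
    if stored_gb < 0 or processed_gb < 0:
        raise ValueError("volumes must be non-negative")
    if stored_gb > FIRST_TIER_LIMIT_GB:
        # The article says prices decline above 2 TB but gives no figures.
        raise ValueError("rates above 2 TB are not stated in the article")
    storage_cost = round(stored_gb * STORAGE_USD_PER_GB_MONTH, 2)
    query_cost = round(processed_gb * QUERY_USD_PER_GB_PROCESSED, 2)
    return storage_cost, query_cost, round(storage_cost + query_cost, 2)


if __name__ == "__main__":
    # Example: 500 GB stored, with queries scanning 2,000 GB in the month.
    storage, queries, total = estimate_monthly_cost(500, 2000)
    print(f"storage ${storage:.2f} + queries ${queries:.2f} = ${total:.2f}")
```

Under these assumptions, a customer storing 500 gigabytes and scanning 2,000 gigabytes in a month would pay about $60 for storage and $70 for analysis, so query volume, not storage, can dominate the bill.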
As InformationWeek reported in March, BigQuery joins a fast-growing field of analytic services, with leading lights including Amazon Elastic MapReduce, IBM BigInsights, and Microsoft Azure Hadoop-based services. Google's prominence in ad serving, keyword advertising, and Web analytics will undoubtedly make related analyses of Google-sourced data the biggest play for BigQuery. But The New York Times pointed out the potential for a budding rivalry with Amazon Web Services, noting that AWS storage prices start at 12.5 cents per gigabyte.
"We have huge respect for AWS, but we're different in terms of philosophy," Kwek told The Times. Where Amazon caters to technically able people, largely at start-ups, Google wants to attract lots of less-proficient executives.