We were extremely successful in building an analytics platform on Amazon for our customer. The stack included the following:
The primary reasons for the migration were better performance and cost savings. Google's BigData platform was the main selling point. When we ran simple benchmarks for loading and querying data, the performance for the cost was unprecedented. Our BigData costs dropped easily by a factor of 8. We also had complete support from our customer to migrate to Google Cloud, partly because of delays in loading data and delivering actionable analytics on time, along with exploding costs.
Since BigQuery is a managed service, there is no cost for managing the infrastructure, and the pricing is reasonable. The most important benefit was that we could experiment and recover very fast, which is almost impossible in a typical big data solution, where a redesign is expensive in terms of resources, people, and time.
In most BigData solutions, data storage is distinct from the compute engine. In our example above, the big data storage was Apache Cassandra and the compute engine was Spark. Another problem is that designing aggregate tables is also expensive, and the workflow of using Apache Cassandra with Apache Spark is non-trivial.
BigQuery addresses all these problems really well and makes BigData implementation accessible to anyone with reasonable interest and basic skills.
The Google Cloud and BigData stack used:
Clearly, the number of moving parts in the infrastructure has been reduced considerably. Although data growth is exponential, the infrastructure remains manageable, thanks to BigQuery.
A detailed post on the Google migration will soon be made available on our blog.