Cloudera Data Science Workbench enables secure self-service data science for the enterprise. It is a collaborative environment where developers can work with a variety of libraries and frameworks.
N/A
Google BigQuery
Score 8.6 out of 10
N/A
Google's BigQuery is part of the Google Cloud Platform, a database-as-a-service (DBaaS) supporting the querying and rapid analysis of enterprise data.
$6.25
per TiB (after the 1st 1 TiB per month, which is free)
Organizations that have already implemented the on-premises, Hadoop-based Cloudera Data Platform (CDH) for their big data warehouse architecture will get more value from the seamless integration of Cloudera Data Science Workbench (CDSW) with their existing CDH platform. However, for organizations with a hybrid (cloud and on-premises) data platform and no prior CDH implementation, implementing CDSW can be challenging both technically and financially.
Google BigQuery really shines in scenarios requiring real-time analytics on large data streams and predictive analytics with its machine learning integration. Teams have been using it extensively. However, it may not be the best fit for organizations dealing with small datasets because of its higher costs. It also might not be the best fit for highly complex data transformations, where simpler or more specialized solutions could be more appropriate.
Its serverless architecture and underlying Dremel technology are incredibly fast even on complex datasets. I can get answers to my questions almost instantly, without waiting hours for traditional data warehouses to churn through the data.
Previously, our data was scattered across various databases and spreadsheets and getting a holistic view was pretty difficult. Google BigQuery acts as a central repository and consolidates everything in one place to join data sets and find hidden patterns.
Running reports on our old systems used to take forever. Google BigQuery's crazy fast query speed lets us get insights from massive datasets in seconds.
It is challenging to predict costs due to BigQuery's pay-per-query pricing model. User-friendly cost estimation tools, along with improved budget alerting features, could help users better manage and predict expenses.
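As a rough illustration of the estimation problem described above, the sketch below turns a list of per-query scan sizes into a monthly on-demand bill, using the rate listed earlier on this page ($6.25 per TiB after the first 1 TiB each month). The function name and its input are hypothetical. It also assumes the scan sizes are already known (for example, from a query dry run) and ignores any per-query minimums or rounding that actual billing may apply.

```python
TIB = 2 ** 40                 # bytes in one tebibyte
PRICE_PER_TIB_USD = 6.25      # on-demand rate quoted on this page
FREE_TIB_PER_MONTH = 1.0      # free allowance quoted on this page


def estimate_monthly_query_cost(bytes_scanned_per_query):
    """Estimate a month's on-demand query cost in USD.

    bytes_scanned_per_query: iterable of byte counts, one per query
    (hypothetical input; billing minimums and rounding are ignored).
    """
    total_tib = sum(bytes_scanned_per_query) / TIB
    billable_tib = max(0.0, total_tib - FREE_TIB_PER_MONTH)
    return billable_tib * PRICE_PER_TIB_USD


if __name__ == "__main__":
    # Five queries scanning 1 TiB each: 4 billable TiB after the free tier.
    print(estimate_monthly_query_cost([TIB] * 5))
```

Because the free allowance is applied to the monthly total rather than to each query, the estimate only becomes meaningful once the expected query volume for the whole month is included.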
The BigQuery interface is less intuitive than it could be. A more user-friendly interface, enhanced documentation, and built-in tutorial systems could make BigQuery more accessible to a broader audience.
We have to use this product because it is our third-party supplier's choice for their data back end, so we are unlikely to move away from it unless the supplier decides to change data vendors.
The web UI is easy and convenient. Many RDBMS clients, such as Aqua Data Studio, DBeaver data grid, and others, can connect. A range of well-documented APIs is available. The range of features keeps expanding, increasingly resembling traditional RDBMSs such as Oracle and DB2.
Cloudera Data Science Workbench has excellent online resources, such as documentation and examples. In addition, the enterprise license comes with an SLA for tickets opened with Cloudera Services, plus support for complaint handling and troubleshooting by email or phone. Paid training services are also offered.
BigQuery can be difficult to support because it is so solid as a product. Many of the issues you will see are related to your own data sets. However, you may also see issues importing data and managing jobs. If this occurs, it can be a challenge to reach the correct person who can help you.
Both tools have similar features and are pretty easy to install, deploy, and use. Depending on your existing platform (Cloudera vs. Azure), you need to pick the corresponding Workbench. Another observation is that Cloudera has better support, where you can get feedback on your questions pretty fast (unlike MS). As it's a new product, I expect MS to become more efficient in handling customers' questions.
I have used Snowflake and DataGrip for data retrieval as well as Google BigQuery and can say that all these tools compete head to head. It is very difficult to say which is better than the others, but some features provided by Google BigQuery give it an edge. For example, the reliability of Google is unmatched. One thing that I really like is the ability to integrate Data Studio so easily with Google BigQuery.
Google Support has kindly provided individual support and consultants to assist with the integration work. When the consultants are not available, the Google Support helpline is always available to answer queries without a wait of more than 3 days.
Pricing has been very reasonable for us. The first 10 GB of storage is free each month, and costs start at 2 cents per GB per month after that. For example, if you store 1 terabyte (TB) for a month, the cost would be about $20. Streaming data inserts start at 1 cent per 200 megabytes (MB). The first 1 TB of queries is free, with additional analysis at $5 per TB thereafter. Metadata operations are free.
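The arithmetic in this review can be sketched as a small calculator. It uses only the rates the reviewer quotes (which differ from the $6.25 per TiB query rate listed earlier on this page and may be out of date). The function and parameter names are hypothetical, and 1 TB is treated as 1,000 GB. Amounts are tracked in cents to keep the rounding simple.

```python
def monthly_cost_usd(storage_gb, streamed_mb, queried_tb):
    """Monthly cost breakdown in USD using the reviewer's quoted rates.

    Hypothetical helper; the rates are the review's figures, not a price list.
    """
    # Storage: first 10 GB free, then 2 cents per GB per month.
    storage_cents = max(0, storage_gb - 10) * 2
    # Streaming inserts: 1 cent per 200 MB.
    streaming_cents = streamed_mb / 200
    # Queries: first 1 TB free, then $5 (500 cents) per TB.
    query_cents = max(0, queried_tb - 1) * 500
    total_cents = storage_cents + streaming_cents + query_cents
    return {
        "storage": storage_cents / 100,
        "streaming": streaming_cents / 100,
        "queries": query_cents / 100,
        "total": total_cents / 100,
    }


if __name__ == "__main__":
    # 1 TB stored, 2,000 MB streamed, 3 TB queried in one month.
    print(monthly_cost_usd(storage_gb=1000, streamed_mb=2000, queried_tb=3))
```

With the 10 GB free tier applied, storing 1,000 GB comes to $19.80, which matches the reviewer's rounded figure of $20.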
BigQuery helps lower the bar for data analytics, ML, and AI. BQ takes care of mundane tasks and streamlines data processing and consumption. The most impressive thing is the ML and AI integration as SQL functions, so the need for moving data around is minimized.
The visualizations of ML models are very helpful for fine-tuning training, model building, prediction, etc.