Apache Kafka is an open-source stream processing platform developed by the Apache Software Foundation and written in Scala and Java. The Kafka event streaming platform is used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications.
N/A
Google Cloud Pub/Sub
Score 9.2 out of 10
N/A
Google offers Cloud Pub/Sub, a managed message-oriented middleware supporting many-to-many asynchronous messaging between applications.
N/A
Pricing
Apache Kafka
Google Cloud Pub/Sub
Offerings
Pricing Offerings | Apache Kafka | Google Cloud Pub/Sub
Free Trial | No | No
Free/Freemium Version | No | No
Premium Consulting/Integration Services | No | No
Entry-level Setup Fee | No setup fee | No setup fee
Additional Details | - | -
Community Pulse
Apache Kafka
Google Cloud Pub/Sub
Considered Both Products
Apache Kafka
Verified User
Analyst
Chose Apache Kafka
Confluent Cloud is still based on Apache Kafka, but it has a subscription fee, so from a long-term perspective it is wiser to deploy your own Kafka instance that spans public and private cloud. Amazon Kinesis and Google Cloud Pub/Sub do not do well for a very large number of messages …
Kafka looks like an ordered queue: there is no delivery backoff, so if a message has a problem, the consumer doesn't advance to the next one. Google Cloud Pub/Sub looks more like a SET of messages, and Kafka like a LIST. In Kafka the same message will repeat instantaneously while it is being …
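The LIST versus SET comparison above can be sketched as a small simulation. This is only an illustration of the reviewer's mental model in plain Python, not the API of either product; the function names, the `max_attempts` limit, and the `"bad"` message are all hypothetical, and real Kafka consumers can be written to skip or dead-letter a failing record.

```python
from collections import deque


def handler(msg):
    """Hypothetical handler that cannot process the message 'bad'."""
    if msg == "bad":
        raise ValueError("cannot process")


def consume_ordered(log, handle, max_attempts=3):
    """LIST-like: read strictly in order; a failing record blocks the ones behind it."""
    processed, offset, attempts = [], 0, 0
    while offset < len(log) and attempts < max_attempts:
        try:
            handle(log[offset])
        except ValueError:
            attempts += 1  # the same record is retried immediately, with no backoff
            continue
        processed.append(log[offset])
        offset += 1
        attempts = 0
    return processed


def consume_unordered(messages, handle, max_attempts=3):
    """SET-like: a failing message is set aside for redelivery while others proceed."""
    pending = deque((m, 0) for m in messages)
    processed = []
    while pending:
        msg, tries = pending.popleft()
        try:
            handle(msg)
        except ValueError:
            if tries + 1 < max_attempts:
                pending.append((msg, tries + 1))  # redeliver later
            continue
        processed.append(msg)
    return processed
```

With the input `["a", "bad", "c"]`, the ordered consumer stops at the failing record, while the unordered one still gets to `"c"`.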
We considered several messaging platforms including Kafka and Kinesis but both would have required more developer work and didn't integrate as nicely with our ecosystem. RabbitMQ is another messaging platform I've researched and prototyped on; it also would have required more …
Apache Kafka is well suited to most data-streaming use cases. Unless you have a specific use case that calls for a cloud PaaS such as Amazon Kinesis or Azure Event Hubs for your data lakes, Apache Kafka, once set up well, will take care of everything else in the background. Azure Event Hubs is good for cross-cloud use cases. I have no real-world experience with Amazon Kinesis, but I believe it is the same.
If you want to stream high volumes of data, be it for ETL streaming or event sourcing, Google Cloud Pub/Sub is your go-to tool. It's easy to learn, its metrics are easy to observe, and it scales with ease without additional configuration, so if you have more producers or consumers, all you need to do is deploy your solutions on k8s so that you can autoscale your pods to match the data volume. The DLQ is also very transparent and easy to configure. Your code will have no logic whatsoever for orchestrating Pub/Sub; you just plug and play. However, if you are not in the Google Cloud environment, you might have trouble using it, or most likely be unable to, since I think it's a product of Google Cloud.
Really easy to configure. I've used other message brokers such as RabbitMQ and compared to them, Kafka's configurations are very easy to understand and tweak.
Very scalable: easily configured to run on multiple nodes allowing for ease of parallelism (assuming your queues/topics don't have to be consumed in the exact same order the messages were delivered)
Not exactly a feature, but I trust Kafka will be around for at least another decade because active development has continued to be strong and there's a lot of financial backing from Confluent and LinkedIn, and probably many other companies who are using it (which, anecdotally, is many).
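The ordering caveat in the scalability point above comes from how records are spread across partitions. A minimal sketch, assuming records are routed by a hash of their key: order is kept among records that share a key, but not across the whole topic. The helper names are hypothetical, and `zlib.crc32` is only a stand-in hash (Kafka's default partitioner uses murmur2).

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a stable hash (crc32 as a stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(records, num_partitions):
    """Append each (key, value) record to the partition chosen by its key."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in records:
        partitions[partition_for(key, num_partitions)].append((key, value))
    return partitions


records = [("user-1", 1), ("user-2", 1), ("user-1", 2), ("user-2", 2), ("user-1", 3)]
partitions = route(records, num_partitions=3)
```

Every record for `user-1` lands in one partition in its original order, so a consumer per partition can work in parallel without reordering any single key's events.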
With a pub/sub architecture the consumer is decoupled in time from the publisher, i.e., if the consumer goes down, it can replay any events that occurred during its downtime.
It also allows the consumer to throttle and batch incoming data, providing much-needed flexibility when working with multiple types of data sources.
A simple and easy to use UI on cloud console for setup and debugging
It enables event-driven architectures and asynchronous parallel processing, while improving performance, reliability and scalability
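The replay and batching points in the list above can be illustrated with a toy model. This is a sketch under simplifying assumptions, not the Pub/Sub client API: the `RetainedLog` and `Consumer` classes are hypothetical, and the consumer tracks a numeric position, whereas Pub/Sub itself tracks acknowledgements per subscription.

```python
class RetainedLog:
    """Hypothetical broker that keeps events until consumers have read them."""

    def __init__(self):
        self._events = []

    def publish(self, event):
        self._events.append(event)

    def read(self, position, max_batch):
        return self._events[position:position + max_batch]


class Consumer:
    """Pulls at its own pace, at most max_batch events per poll."""

    def __init__(self, log, max_batch=2):
        self.log = log
        self.position = 0
        self.max_batch = max_batch

    def poll(self):
        batch = self.log.read(self.position, self.max_batch)
        self.position += len(batch)  # remember how far we have read
        return batch


log = RetainedLog()
consumer = Consumer(log, max_batch=2)

log.publish("e1")
first = consumer.poll()

# The consumer is "down" while the publisher keeps going.
for event in ("e2", "e3", "e4"):
    log.publish(event)

# Back up: the missed events are replayed in batches of at most two.
second = consumer.poll()
third = consumer.poll()
```

The publisher never waits for the consumer, and the batch size is the consumer's own throttle.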
Sometimes it becomes difficult to monitor our Kafka deployments. We've been able to overcome it largely using AWS MSK, a managed service for Apache Kafka, but a separate monitoring dashboard would have been great.
Simplify the process for local deployment of Kafka and provide a user interface to get visibility into the different topics and the messages being processed.
The learning curve around creating brokers and topics could be simplified
It serves all of our purposes in the most transparent way I can imagine. After seeing other message queueing providers, I can only attest to its quality.
Apache Kafka is highly recommended for developing loosely coupled, real-time processing applications. Apache Kafka also provides property-based configuration: the producer, consumer, and broker each have their own separate properties file.
It has many libraries in many languages, and Google provides either good guides or AI-generated code libraries that are easy to understand. It has very good observability too.
Support for Apache Kafka (if you are willing to pay) is available from Confluent, which includes the same team that created Kafka at LinkedIn, so they know this software inside and out. Moreover, Apache Kafka is well known, and best-practice documents and deployment scenarios are easily available for download, for example from eBay, LinkedIn, Uber, and NYTimes.
They have decent documentation, but you need to pay for support. We weren't able to answer all our questions with the documentation and didn't have time to set up support before we needed it, so I can't give it a higher rating. I think support tends to be a bit slow unless you're a GCP enterprise support customer.
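As a rough illustration of property-based configuration, the sketch below parses a `key=value` properties text into a dictionary. The file contents are a hypothetical example (`bootstrap.servers` and `acks` are standard Kafka producer settings, but the values are assumptions), and the parser is deliberately minimal: it ignores parts of the Java properties format such as `:` separators and line continuations.

```python
def parse_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blanks and '#' or '!' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


# Hypothetical contents of a producer properties file.
PRODUCER_PROPERTIES = """
# where to find the cluster (assumed local broker)
bootstrap.servers=localhost:9092
# wait for all in-sync replicas to acknowledge a write
acks=all
"""

producer_config = parse_properties(PRODUCER_PROPERTIES)
```

Keeping the producer, consumer, and broker settings in separate files like this means each role can be tuned without touching the others.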
I used other messaging/queue solutions that are a lot more basic than Confluent Kafka, as well as another solution that is no longer in the market called Xively, which was bought and "buried" by Google. In comparison, these solutions offer way fewer functionalities and respond to other needs.
Having used Amazon Web Services SNS & SQS I can say that even if the latter may offer more features, Google Cloud Pub/Sub is easier to use. On the other hand, usage of SNS & SQS as well as documentation and troubleshooting is easier with the AWS solution. Since we are not using GCP only for Pub/Sub the choice depends on other variables.
You can just plug in consumers at will and it will respond; there's no need for further configuration or new concepts. You have a queue, and if it's slow, you plug in more consumers to process more messages. It's as simple as that.
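The "plug in more consumers" idea above is the competing-consumers pattern. Here is a minimal sketch using only Python's standard library, with threads standing in for consumer instances; the `drain` helper and the consumer names are hypothetical, and this is not how a real Pub/Sub subscriber is written.

```python
import queue
import threading


def drain(messages, num_consumers):
    """Let num_consumers workers compete for messages from one shared queue."""
    q = queue.Queue()
    for m in messages:
        q.put(m)

    results = []
    lock = threading.Lock()

    def worker(name):
        while True:
            try:
                msg = q.get_nowait()
            except queue.Empty:
                return  # nothing left to process
            with lock:
                results.append((name, msg))

    threads = [
        threading.Thread(target=worker, args=(f"consumer-{i}",))
        for i in range(num_consumers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


results = drain(range(20), num_consumers=4)
```

Each message is handled by exactly one consumer, and changing `num_consumers` requires no change to the publisher side. Which consumer receives which message is not deterministic.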
Positive: Get a quick and reliable pub/sub model implemented - data across components flows easily.
Positive: it's scalable, so we can develop small and scale for real-world scenarios.
Negative: it's easy to get into a confusing situation if you are not experienced yet or something strange has happened (rare, but it happens). Troubleshooting such situations can take time and effort.