Top 10 Problems When Using Apache Kafka

Apache Kafka is one of the most widely used open-source distributed event streaming platforms. Its use cases range from enabling mission-critical apps to building and maintaining high-performance data pipelines. If you are considering Apache Kafka for a future project, you should know the pros and cons of using it. In today’s article, we will stick to the cons. While Apache Kafka is a solid distributed messaging platform, it has some limitations. To paint the picture, we have put together the top 10 problems when using Apache Kafka.

1. In-Sync Replica Alerts

A Kafka in-sync replica (ISR) alert tells you that some topics are under-replicated: their data is not being fully replicated across brokers. These alerts indicate a potentially serious problem because the probability of losing data becomes higher. It can happen entirely unexpectedly, even if you change nothing on your side. It usually takes place when downlevel clients drive up the volume of data.

A spike in data volume causes the Kafka broker to fall behind on message conversion. Whatever the trigger, the problem has to be addressed as soon as possible. Usually, the affected broker has to be fixed before the entire system is fully operational again.
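To make "under-replicated" concrete, here is a minimal sketch in plain Python. It assumes you have already fetched, by whatever means your tooling offers, the assigned replicas and the in-sync replicas for each partition. The `PartitionState` type and the sample data are hypothetical and are not part of any Kafka client API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartitionState:
    """Hypothetical snapshot of one partition's replication state."""
    topic: str
    partition: int
    replicas: tuple  # broker IDs assigned to hold a copy
    isr: tuple       # broker IDs currently in sync with the leader


def under_replicated(partitions):
    """Return partitions whose in-sync set is smaller than the assigned set."""
    return [p for p in partitions if len(p.isr) < len(p.replicas)]


def lagging_brokers(partitions):
    """Count, per broker, how many partitions it is assigned but not in sync for."""
    counts = {}
    for p in under_replicated(partitions):
        for broker in set(p.replicas) - set(p.isr):
            counts[broker] = counts.get(broker, 0) + 1
    return counts


# Made-up sample data: broker 3 has fallen out of sync for two partitions.
SAMPLE = [
    PartitionState("orders", 0, (1, 2, 3), (1, 2, 3)),
    PartitionState("orders", 1, (2, 3, 1), (2, 1)),
    PartitionState("clicks", 0, (3, 1, 2), (1, 2)),
]
```

Grouping the missing replicas by broker is a quick way to see whether a single broker is behind the alert, which matches the usual fix described above.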

2. Kafka Liveness Check Problems and Automation

Kafka liveness check problems can quickly occur if the host where the liveness check is running cannot reach the host where the broker is running. If this happens, the broker will keep restarting. Meanwhile, none of the clients that depend on it will be able to run their apps. It can become a real nuisance if you want to automate some of your tasks on Kafka.

Why? Because automation relies on the liveness check to make sure that the broker’s client-serving port is open. You can simply write a piece of code to restart the broker when the port is not open. But if the broker falls into an endless loop and keeps restarting, your entire infrastructure is rendered useless. Is there a quick fix? Simply turn off the liveness check.
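If turning the check off entirely feels too blunt, one alternative is to cap how often the automation may restart the broker. The sketch below is one possible approach and not something Kafka ships: `port_is_open` uses a plain TCP connect, while `RestartGuard`, `check_once`, and the `restart_broker` callback are hypothetical names you would wire up to your own tooling.

```python
import socket
import time


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class RestartGuard:
    """Allow at most max_restarts restarts inside a sliding time window."""

    def __init__(self, max_restarts=3, window_seconds=600.0, clock=time.monotonic):
        self.max_restarts = max_restarts
        self.window_seconds = window_seconds
        self.clock = clock
        self._restarts = []

    def allow_restart(self):
        now = self.clock()
        # Forget restarts that have aged out of the window.
        self._restarts = [t for t in self._restarts if now - t < self.window_seconds]
        if len(self._restarts) >= self.max_restarts:
            return False
        self._restarts.append(now)
        return True


def check_once(host, port, guard, restart_broker, probe=port_is_open):
    """Run one liveness check and return 'healthy', 'restarted', or 'escalate'."""
    if probe(host, port):
        return "healthy"
    if guard.allow_restart():
        restart_broker()  # hypothetical callback supplied by the caller
        return "restarted"
    # Restart budget is spent: stop looping and hand over to a human.
    return "escalate"
```

With a budget of three restarts per ten minutes, a network partition between the checking host and the broker produces an "escalate" result after a few attempts instead of an endless restart loop.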

3. New Brokers Can Impact the Performance

Staging a new cluster and installing the Kafka broker software is straightforward. Adding new brokers should not cause any problems, right? Not quite. Pushing a new Kafka broker into production can impact performance and cause serious latency and missing file problems.

The broker cannot work properly until the partition reassignment process is completed. Devs usually forget about it and use the default commands from the documentation. Moving thousands of partitions to the new broker can take hours, and until all the partitions have been moved, performance will suffer. This is why you should be careful and have a plan when you want to add a new broker to the infrastructure.
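One way to "have a plan" is to move partitions in small batches rather than all at once. The sketch below only prepares the plan: it splits a list of moves into batches and renders each batch in the JSON layout that Kafka's partition reassignment tool reads (a `version` field plus a `partitions` list of topic, partition, and replicas). The topic names, broker IDs, and batch size are made up for illustration.

```python
import json


def batched(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


def reassignment_json(moves):
    """Render (topic, partition, replicas) tuples as a reassignment document."""
    doc = {
        "version": 1,
        "partitions": [
            {"topic": topic, "partition": partition, "replicas": list(replicas)}
            for topic, partition, replicas in moves
        ],
    }
    return json.dumps(doc, indent=2)


def plan_batches(moves, batch_size):
    """Return one JSON document per batch, to be applied one after another."""
    return [reassignment_json(batch) for batch in batched(moves, batch_size)]


# Hypothetical moves that place replicas on a newly added broker with ID 4.
MOVES = [
    ("orders", 0, (4, 2, 3)),
    ("orders", 1, (2, 4, 1)),
    ("clicks", 0, (3, 1, 4)),
    ("clicks", 1, (4, 3, 2)),
    ("clicks", 2, (1, 4, 3)),
]
```

Calling `plan_batches(MOVES, 2)` yields three documents. Applying them one at a time and waiting for each to finish keeps the amount of data in flight small, at the cost of a longer overall migration.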

4. Questionable Long-Term Storage Solution

If you are working with large sets of data, using Apache Kafka to store it might cause you several problems. The major problem comes from Kafka storing redundant copies of data. It can affect the performance, but, more importantly, it can significantly increase your storage costs.

The best solution would be to use Kafka only for storing data for a brief period and then migrate the data to a relational or non-relational database, depending on your specific requirements.

5. Finding Perfect Data Retention Settings

While we are discussing long-term storage problems, let’s point out one related issue. Downstream clients often have completely unpredictable data request patterns. This makes finding the optimal data retention settings somewhat of a problem.

Kafka stores messages in topics. This data can take up significant disk space on your brokers. To purge old data, you need to set a retention period or a maximum size. If you don’t tune the data retention settings correctly, you risk either deleting data before clients have read it or paying far more for storage than you should.
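A rough back-of-the-envelope calculation can help before you touch the settings. The sketch below assumes a steady ingest rate and ignores compression, segment granularity, and indexes, so treat the numbers as a first estimate only. `retention.ms` and `retention.bytes` are the topic-level settings for time-based and size-based retention (the size limit applies per partition); the helper functions and the figures in the usage note are made up.

```python
MS_PER_DAY = 24 * 60 * 60 * 1000
BYTES_PER_GIB = 1024 ** 3


def retention_ms(days):
    """Convert a retention period in days to milliseconds."""
    return int(days * MS_PER_DAY)


def estimated_disk_gb(ingest_gb_per_day, retention_days, replication_factor):
    """Cluster-wide disk needed if every replica keeps the full retention window."""
    return ingest_gb_per_day * retention_days * replication_factor


def retention_bytes_per_partition(disk_budget_gib, partitions, replication_factor):
    """Per-partition size cap that keeps all replicas inside the disk budget."""
    total_bytes = disk_budget_gib * BYTES_PER_GIB
    return int(total_bytes // (partitions * replication_factor))


def topic_config(days, per_partition_bytes):
    """Build the two retention settings as a string-valued config mapping."""
    return {
        "retention.ms": str(retention_ms(days)),
        "retention.bytes": str(per_partition_bytes),
    }
```

For example, 50 GB of ingest per day kept for 7 days with a replication factor of 3 works out to 50 × 7 × 3 = 1050 GB across the cluster, which also shows how quickly the redundant copies mentioned in the previous section add to the bill.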

6. Overly Complex Data Transformations on the Fly

Using Apache Kafka on big data integration and migration projects can become too complex. How come? Kafka was built to streamline delivering messages, and the platform excels at it. However, you will run into some problems if you want to transform data on the fly.

Even with the Kafka Streams API, you will have to spend days building complex data pipelines and managing the interaction between data producers and data consumers, not to mention operating a system this complex. There are other distributed messaging systems that are much better for streamlining ETL jobs, such as Apache Pulsar.

7. Upscaling and Topic Rebalancing

The volume of your data streams can go in both directions. This is why it is crucial to choose a distributed messaging platform that is easy to scale up and down. With Kafka, this is a problem because you need to rebalance things manually to reduce resource bottlenecks.

You will have to do it every time a major change in the data stream occurs, using both partition leadership balancing and the Kafka partition reassignment script. By contrast, with stateless brokers, Apache Pulsar makes the scale-out process significantly easier.

8. MirrorMaker Doesn’t Replicate the Topic Offsets

MirrorMaker is a Kafka feature that allows you to copy data from one cluster to another. This would be a great disaster recovery plan if it weren’t for one downside: MirrorMaker doesn’t replicate the topic offsets between the clusters. You will have to create unique keys in messages to overcome this problem, which can become a daunting task when you are working at scale.
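To illustrate the unique-key workaround, here is a minimal sketch of a consumer-side filter that skips messages it has already handled, so that re-reading part of a topic after failing over to the mirror cluster does no harm. It assumes every message carries a unique `id` field set by the producer; the `SeenIds` class, the `process_once` helper, and the message layout are hypothetical.

```python
from collections import OrderedDict


class SeenIds:
    """Bounded memory of recently handled message IDs (oldest evicted first)."""

    def __init__(self, capacity=100_000):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._ids = OrderedDict()

    def first_time(self, message_id):
        """Return True the first time an ID is seen, False on repeats."""
        if message_id in self._ids:
            self._ids.move_to_end(message_id)
            return False
        self._ids[message_id] = None
        if len(self._ids) > self.capacity:
            self._ids.popitem(last=False)  # drop the least recently seen ID
        return True


def process_once(messages, seen, handle):
    """Call handle() only for messages whose ID has not been seen before."""
    handled = 0
    for message in messages:
        if seen.first_time(message["id"]):
            handle(message)
            handled += 1
    return handled
```

The memory is bounded, so the capacity has to cover at least as many messages as you might re-read after a failover. At scale, that state usually has to live in a shared store, which is where the task starts to become daunting.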

9. Not All Messaging Paradigms Are Included

While Apache Kafka comes with many messaging paradigms, some are still missing. This can turn into a real problem if you need to extend your infrastructure use case. It limits Kafka’s ability to support building complex data pipelines.

Two major messaging paradigms not supported in Kafka are point-to-point queues and request/reply queues.

10. Changing Messages Reduces Performance

If you want to use Apache Kafka to deliver messages as they are, you will have no issues performance-wise. However, the problem occurs once you wish to modify the messages before you deliver them.

Manipulating data on the fly is possible with Kafka, but the approach has its limits. Kafka leans on efficient system calls to move unmodified bytes from disk to the network, so modifying messages along the way makes the entire platform perform significantly slower.

Nevertheless, many giants across industries use Apache Kafka, including Twitter, Netflix, and LinkedIn. These ten problems are quite specific to Kafka, and they might affect your implementation of the distributed messaging solution in a specific case. Feel free to check Pandio if you want to learn more about Apache Pulsar, a distributed messaging platform that outperforms Kafka in almost every possible use case and is positioned for the ML workloads of the future.
