My name is Sergey Kalinets. I am an architect at Parimatch Tech, and in this post I want to share our experience with searching for messages in Kafka.
For our company, Kafka is the central nervous system through which microservices exchange information. On its way from input to output, a message can pass through a dozen services that filter and transform it, moving it from one topic to another. These services are owned by different teams, so it is very useful to be able to see what a particular message contains. This matters most when something does not go according to plan: we need to understand at what stage everything turned into a pumpkin (and who deserves a smack so that it does not happen again). From a bird's-eye view the solution is simple: take the relevant messages from Kafka and see what is wrong with them. But, as usual, the interesting part starts in the details.
Let's start with the fact that Kafka is not just a message broker, which is how many people think of it and use it, but also a distributed log. This has many consequences, but the one that interests us is that messages are not removed from topics after consumers have read them, so technically you can read them again at any time and see what is inside. However, things are complicated by the fact that Kafka can only be read sequentially. We need to know the offset from which to start reading (for simplicity, think of it as the message's ordinal number in the topic; strictly speaking, offsets are counted per partition). It is also possible to specify a time as the starting point, but from there you can still only read all messages in order.
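To make the "log, not queue" idea concrete, here is a minimal in-memory sketch. It is a toy model and not the Kafka client API: `ToyLog`, `Record`, and their methods are hypothetical names invented for illustration, and the model assumes a single partition with non-decreasing timestamps.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import Iterator, List


@dataclass(frozen=True)
class Record:
    offset: int        # position of the record in the log
    timestamp_ms: int  # time at which the record was appended
    value: str         # message payload


class ToyLog:
    """Append-only log: reading never removes records (hypothetical toy model)."""

    def __init__(self) -> None:
        self._records: List[Record] = []

    def append(self, timestamp_ms: int, value: str) -> int:
        # The offset is simply the ordinal number of the record.
        offset = len(self._records)
        self._records.append(Record(offset, timestamp_ms, value))
        return offset

    def read_from_offset(self, offset: int) -> Iterator[Record]:
        # Sequential scan: every record from `offset` to the end, in order.
        yield from self._records[offset:]

    def read_from_time(self, timestamp_ms: int) -> Iterator[Record]:
        # A timestamp only selects the starting point (the first record whose
        # timestamp is >= the given one); after that the scan is sequential.
        # Assumes timestamps were appended in non-decreasing order.
        timestamps = [r.timestamp_ms for r in self._records]
        start = bisect_left(timestamps, timestamp_ms)
        yield from self.read_from_offset(start)


log = ToyLog()
for ts, value in [(1000, "a"), (2000, "b"), (3000, "c")]:
    log.append(ts, value)

first_pass = [r.value for r in log.read_from_offset(1)]
second_pass = [r.value for r in log.read_from_offset(1)]  # re-reading is allowed
from_time = [r.value for r in log.read_from_time(1500)]
```

Both passes return the same records, which is the property that makes after-the-fact investigation possible at all. The price is that there is no lookup by content: finding one message means scanning from some starting point.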
It turns out that in order to find the desired message, you need to read a batch of messages and pick out the interesting ones. For example, if we want to understand the problems of the player with id = 42, we need to find all the messages that mention that player (playerId: 42), line them up in order, and then see at what stage things went wrong.
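The "read everything, keep what matters" approach can be sketched in a few lines. This is only an illustration under stated assumptions: messages are JSON objects, and apart from `playerId`, which comes from the example above, the field names (`topic`, `ts`, `status`), the topic names, and the function `find_player_messages` are hypothetical.

```python
import json
from typing import Any, Dict, Iterable, List


def find_player_messages(raw_messages: Iterable[str], player_id: int) -> List[Dict[str, Any]]:
    """Scan messages sequentially and keep those that mention the player."""
    matches = []
    for raw in raw_messages:
        try:
            message = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip payloads that are not valid JSON
        if isinstance(message, dict) and message.get("playerId") == player_id:
            matches.append(message)
    # Line the matches up by the (assumed) "ts" field; the sort is stable,
    # so messages with equal timestamps keep their read order.
    return sorted(matches, key=lambda m: m.get("ts", 0))


# Hypothetical payloads, as they might be read from several topics.
sample = [
    '{"topic": "bets-validated", "ts": 2, "playerId": 42, "status": "ok"}',
    '{"topic": "bets-raw", "ts": 1, "playerId": 42}',
    '{"topic": "bets-raw", "ts": 1, "playerId": 7}',
    'not json',
    '{"topic": "bets-settled", "ts": 3, "playerId": 42, "status": "failed"}',
]

timeline = find_player_messages(sample, 42)
stages = [m["topic"] for m in timeline]
```

Reading `stages` from left to right gives the path of player 42's message through the pipeline, and the `status` field of the last entry shows where it broke.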
Unlike databases such as MySQL or MSSQL, which ship with GUI client applications out of the box, vanilla Kafka does not spoil us with such delights and offers only console utilities with rather narrow (at first glance) functionality.
But there is good news as well. There are a number of solutions on the market that make the whole process easier. Note right away that "on the market" here does not mean "for money": all the tools discussed below are free.
Kafka Tool
(https://www.kafkatool.com/features.html)
, , GUI . , . , . , Kafka Tool , . ( ): «Not great not terrible».
Kafka Console Consumer
, . . Kafka, JVM , Java. , Kafka Tool, Java — docker:
, docker run --rm -it taion809/kafka-cli:2.2.0, « , , , , , ».
Kafkacat
, , . , , kafka-console-consumer ( ).
10 messages ( JSON):
- ., , kafkacat .
( , ):
Kafka — Robin Moffatt. — kafkacat Kafka, kafkacat, . , , . .
. . , — - grep .
, kafkacat Avro , protobuf — .
Kafka Connect + ELK
, . — . QA ( 90% ) Kafka Tool, — . , Kibana, UI Elasticsearch. Kibana QA . « , Kibana». , , , — Kafka Connect.
Kafka Connect — Kafka . , ( ?) Kafka . , — Connect JSON. «» , , — , , , — Kubernetes.
Kafka Connect REST API, c , Kafka. , Elasticsearch :
HTTP PUT Connect, , , ElasticSinkConnector, Elastic.
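As a rough illustration of the HTTP PUT mentioned above, the sketch below builds (but does not send) a request that registers an Elasticsearch sink through the Kafka Connect REST API. It is not the author's actual configuration: the host names, ports, connector name, and topic are hypothetical, the helper `build_sink_request` is invented for this example, and the connector class assumes the Confluent Elasticsearch sink plugin is installed on the Connect workers.

```python
import json
from urllib.request import Request


def build_sink_request(connect_url: str, name: str, topics: str, elastic_url: str) -> Request:
    """Build a PUT /connectors/{name}/config request (creates or updates a connector)."""
    config = {
        # Assumes the Confluent Elasticsearch sink plugin is available.
        "connector.class": "io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
        "topics": topics,               # comma-separated list of source topics
        "connection.url": elastic_url,  # where Elasticsearch is listening
        # Ignore record keys; the connector then derives document IDs
        # from topic, partition, and offset.
        "key.ignore": "true",
        # Messages are schemaless JSON, so let Elasticsearch infer the mapping.
        "schema.ignore": "true",
    }
    return Request(
        url=f"{connect_url}/connectors/{name}/config",
        data=json.dumps(config).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# All values below are placeholders for illustration.
request = build_sink_request(
    connect_url="http://localhost:8083",
    name="bets-to-elastic",
    topics="bets-raw",
    elastic_url="http://localhost:9200",
)
# To actually send it: urllib.request.urlopen(request)
```

Using PUT on the `/config` endpoint makes the call idempotent: repeating it with the same body leaves the connector unchanged, which is convenient when the configuration lives in a deployment script.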
. , , , , - Elasticsearch.
. , , . Kafka , . ?
4 . , . , .
Elasticsearch , . . / . — .
, Kibana , , . Kafka. , , UTC . , Elasticsearch timestamp, , index template, « — »:
, , , , , Kibana, .
, Kafka Connect . , , , , . Kafka. — Kafka Elasticsearch. Elasticsearch, id .