Kafka misc

A good resource on Kafka internals: https://jaceklaskowski.gitbooks.io/apache-kafka/content/kafka-controller.html

Concepts:

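  • ISR: an in-sync replica (ISR) is a replica, hosted on some broker, that is caught up with the leader's data for a given partition.
  • Fenced broker: a broker whose heartbeats to the controller have failed. It is kept out of partition leadership until it heartbeats successfully again.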
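To make "in-sync" a bit more concrete, here is a minimal sketch of the idea, assuming a replica counts as in-sync when it last caught up with the leader within a maximum lag window (Kafka exposes this window as replica.lag.time.max.ms). The class name, method name, and map layout are hypothetical and are not Kafka's internal API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified model of ISR membership; not Kafka's real code.
final class IsrSketch {

    // Returns the ids of replicas whose last caught-up time is within maxLagMs of nowMs.
    // lastCaughtUpMs maps replica (broker) id -> last time the replica was fully caught up.
    static List<Integer> inSyncReplicas(Map<Integer, Long> lastCaughtUpMs, long nowMs, long maxLagMs) {
        List<Integer> isr = new ArrayList<>();
        // TreeMap gives a stable, ascending order of replica ids.
        for (Map.Entry<Integer, Long> entry : new TreeMap<>(lastCaughtUpMs).entrySet()) {
            if (nowMs - entry.getValue() <= maxLagMs) {
                isr.add(entry.getKey());
            }
        }
        return isr;
    }
}
```

A replica that falls out of this window is dropped from the ISR and can rejoin once it catches up again.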

Replication factor vs number of brokers

The replication factor can never be larger than the number of available brokers, because each replica of a partition must live on a different broker. See the source code.
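The constraint can be sketched as a simple validation step. This is only an illustration: the class and method names are hypothetical, and it throws IllegalArgumentException to stay within the standard library, whereas Kafka itself reports this case with InvalidReplicationFactorException.

```java
// Hypothetical helper illustrating the constraint; not Kafka's real code.
final class ReplicationFactorCheck {

    // Rejects a replication factor that cannot be satisfied by the available brokers.
    static void validate(int replicationFactor, int availableBrokers) {
        if (replicationFactor <= 0) {
            throw new IllegalArgumentException("Replication factor must be larger than 0");
        }
        if (replicationFactor > availableBrokers) {
            // Each replica needs its own broker, so there are not enough brokers to place them all.
            throw new IllegalArgumentException(
                "Replication factor " + replicationFactor
                    + " is larger than the number of available brokers " + availableBrokers);
        }
    }
}
```

For example, validate(3, 3) passes, while validate(4, 3) is rejected.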

Group coordinator vs Kafka Controller

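Each consumer group has a coordinator, a broker that manages the group's membership and committed offsets. The Kafka controller, by contrast, manages cluster-wide metadata such as partition leadership. The FindCoordinator API is used to find a group's coordinator.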
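As far as I understand it, the coordinator for a group is the leader of one partition of the internal __consumer_offsets topic, and that partition is chosen by hashing the group id. The sketch below reproduces that mapping with the standard library only. The class name is hypothetical, and the partition count of 50 is an assumption based on the default of offsets.topic.num.partitions; a real cluster may be configured differently.

```java
// Hypothetical sketch of how a group id maps to a __consumer_offsets partition.
final class GroupCoordinatorLookup {

    // Assumed default value of offsets.topic.num.partitions.
    static final int DEFAULT_OFFSETS_TOPIC_PARTITIONS = 50;

    // Maps a group id to a partition index in [0, offsetsTopicPartitions).
    static int partitionFor(String groupId, int offsetsTopicPartitions) {
        // Mask the sign bit instead of using Math.abs, which stays negative for Integer.MIN_VALUE.
        return (groupId.hashCode() & 0x7fffffff) % offsetsTopicPartitions;
    }

    public static void main(String[] args) {
        String groupId = args.length > 0 ? args[0] : "my-group";
        int partition = partitionFor(groupId, DEFAULT_OFFSETS_TOPIC_PARTITIONS);
        System.out.println(groupId + " -> __consumer_offsets-" + partition);
    }
}
```

The broker leading the resulting partition is the one FindCoordinator would point a client to, under the assumptions above.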

Build

Check out https://github.com/apache/kafka and run:

# you may need to use an older version of Java
export JAVA_HOME=/Users/xiongding/Library/Java/JavaVirtualMachines/azul-13.0.14/Contents/Home

# skip the tests, as they take a long time to run
./gradlew build -x test

# run a specific test
./gradlew clients:test --tests '*PemKeyStoreFileWithKeyPassword*'

Code structure

Auto generated files

Requests and responses specified in Kafka's binary protocol are auto-generated from JSON files under the resources folder. If you do not build the repo first, you will see lots of import errors in your IDE, for example in CreatePartitionsRequest.

Interestingly, these request and response schemas are defined in the clients project, and the core project depends on clients, so core can use these definitions. This structure feels a little odd to me.

Service entry point

Kafka's binary protocol is built on top of TCP (stream sockets in a Unix-like stack) and follows a request-response model. The handlers for these requests are defined in two places:

This post is licensed under CC BY 4.0 by the author.