Kafka misc
A good resource on Kafka internals: https://jaceklaskowski.gitbooks.io/apache-kafka/content/kafka-controller.html
Concepts:
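- ISR: an in-sync replica (ISR) is a replica, hosted on some broker, that is fully caught up with the leader of a given partition. The leader itself is always part of the ISR set.
- fenced broker: a broker that has failed to send heartbeats to the controller in time, so it is excluded from partition leadership.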
replication factor vs number of brokers
The replication factor can never be larger than the number of available brokers, because each replica of a partition must be placed on a different broker. See source code.
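The constraint follows from replica placement: every replica of a partition needs its own broker. Below is a minimal sketch of that check, not Kafka's actual assignment code. The class `ReplicaAssignmentSketch`, the method `assignReplicas`, and the simple round-robin placement are all hypothetical and only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ReplicaAssignmentSketch {
    // Hypothetical helper: picks `replicationFactor` distinct brokers for one
    // partition, walking the broker list round-robin from `startIndex`.
    public static List<Integer> assignReplicas(List<Integer> brokerIds,
                                               int startIndex,
                                               int replicationFactor) {
        if (replicationFactor <= 0) {
            throw new IllegalArgumentException("Replication factor must be positive");
        }
        // Each replica needs its own broker, so the factor cannot exceed the broker count.
        if (replicationFactor > brokerIds.size()) {
            throw new IllegalArgumentException(
                "Replication factor " + replicationFactor
                    + " is larger than the number of brokers " + brokerIds.size());
        }
        List<Integer> replicas = new ArrayList<>();
        for (int i = 0; i < replicationFactor; i++) {
            replicas.add(brokerIds.get((startIndex + i) % brokerIds.size()));
        }
        return replicas;
    }

    public static void main(String[] args) {
        List<Integer> brokers = List.of(0, 1, 2);
        System.out.println(assignReplicas(brokers, 1, 3)); // prints [1, 2, 0]
        try {
            assignReplicas(brokers, 0, 4); // 4 replicas cannot fit on 3 brokers
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With three brokers, a replication factor of 3 places one replica on each broker, while a factor of 4 is rejected up front because a fourth distinct broker does not exist.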
Group coordinator vs Kafka Controller
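Each consumer group has a coordinator, which is one of the brokers. The FindCoordinator API is used to find the coordinator for a group. The controller, by contrast, is a single cluster-wide role that manages partition leadership and broker membership.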
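To my understanding, the broker answers FindCoordinator by hashing the group id onto a partition of the internal `__consumer_offsets` topic, and the leader of that partition acts as the group's coordinator. The sketch below mirrors that mapping under stated assumptions: the partition count of 50 is the default for `offsets.topic.num.partitions`, and the class and method names are hypothetical, not Kafka's own.

```java
public class CoordinatorPartitionSketch {
    // Maps a consumer group id to a partition of __consumer_offsets.
    // The leader of that partition would be the group's coordinator.
    public static int partitionFor(String groupId, int offsetsTopicPartitions) {
        int hash = groupId.hashCode();
        // Math.abs(Integer.MIN_VALUE) is still negative, so handle it explicitly.
        int nonNegative = (hash == Integer.MIN_VALUE) ? 0 : Math.abs(hash);
        return nonNegative % offsetsTopicPartitions;
    }

    public static void main(String[] args) {
        int partitions = 50; // assumed default of offsets.topic.num.partitions
        for (String group : new String[] {"payments", "audit-log"}) {
            System.out.println(group + " -> __consumer_offsets-" + partitionFor(group, partitions));
        }
    }
}
```

Because the mapping depends only on the group id and the partition count, every broker can compute the same answer without coordination. It also means the partition count of `__consumer_offsets` should not be changed casually on a live cluster.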
Build
Check out https://github.com/apache/kafka and run:
# you may need to use a lower version of Java
export JAVA_HOME=/Users/xiongding/Library/Java/JavaVirtualMachines/azul-13.0.14/Contents/Home
# skip the tests, since running them takes a long time
./gradlew build -x test
# run a specific test
./gradlew clients:test --tests '*PemKeyStoreFileWithKeyPassword*'
Code structure
Auto generated files
The Java classes for the requests and responses specified in Kafka’s binary protocol are auto-generated from JSON files under the resources folder. Until you build the repo, you will see lots of import errors in the IDE, for example in CreatePartitionsRequest.
Interestingly, these request and response schemas are defined in the clients project. The core project depends on clients, so core can use these definitions. This structure feels a little odd to me.
Service entry point
Kafka’s binary protocol is built on top of TCP (stream sockets on a Unix-like stack) and follows a request-response model. The handlers for these requests are defined in two places: