Elasticsearch and OpenSearch
We know that OpenSearch was forked from Elasticsearch 7.10.2 by AWS after Elastic changed the Elasticsearch license. Most concepts and APIs are the same between them, but there are still some gotchas.
In both ES and OS, only one node is responsible for maintaining a consistent cluster state; it is called the master node in ES but the cluster-manager node in OS. If you run GET /_cat/nodes?v
in both ES and OS, you can see the name difference. It seems that the OpenSearch team deliberately renamed it. See the source file of the class TransportClusterManagerNodeAction.
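For example, the header of the elected-node column differs. The output below is abridged and illustrative, from memory; the exact columns vary by version:

# Elasticsearch (column is "master")
ip         node.role   master name
172.18.0.2 cdfhilmrstw *      elasticsearch-master-0

# OpenSearch (column is "cluster_manager")
ip         node.role cluster_manager name
172.18.0.2 dimr      *               opensearch-node-0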
Read the OpenSearch developer guide to get started locally.
./gradlew localDistro
./gradlew run --debug-server-jvm
The parameter --debug-server-jvm above turns on remote debugging, so you can debug the server using jdb.
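For instance, assuming the server JVM listens on the standard JDWP port 5005 (an assumption; check the Gradle output for the actual port), you can attach like this:

jdb -attach localhost:5005

Once attached, jdb commands such as stop in and where let you set breakpoints and inspect stacks.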
Helm chart
The Elasticsearch helm chart has a configuration replicas, which means the number of nodes, not index.number_of_replicas. The number of replicas of each shard is one by default unless changed.
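If you do want to change the per-shard replica count, that is a dynamic index setting. A minimal sketch, assuming a hypothetical index named my-index:

PUT /my-index/_settings
{
  "index": {
    "number_of_replicas": 2
  }
}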
Node discovery
Zen discovery
Reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/discovery-hosts-providers.html
Snapshot
Snapshot-related logic is mainly contained in the class SnapshotsService. One thing to note: while a snapshot is in progress, you cannot delete an index. Otherwise, you will see SnapshotInProgressException.
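For example, attempting the deletion mid-snapshot fails roughly like this (a sketch with a hypothetical index name, not verbatim output):

DELETE /my-index

{
  "error": {
    "type": "snapshot_in_progress_exception",
    "reason": "Cannot delete indices that are being snapshotted: [[my-index/...]]"
  },
  "status": 400
}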
Workflow
What is running inside an ES node?
elasticsearch@elasticsearch-master-0:~$ ps auxww
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
elastic+ 1 0.0 0.0 2500 592 ? Ss 13:43 0:00 /bin/tini -- /usr/local/bin/docker-entrypoint.sh eswrapper
elastic+ 7 0.0 0.5 2599688 89504 ? Sl 13:43 0:09 /usr/share/elasticsearch/jdk/bin/java -Xms4m -Xmx64m -XX:+UseSerialGC -Dcli.name=server -Dcli.script=/usr/share/elasticsearch/bin/elasticsearch -Dcli.libs=lib/tools/server-cli -Des.path.home=/usr/share/elasticsearch -Des.path.conf=/usr/share/elasticsearch/config -Des.distribution.type=docker -cp /usr/share/elasticsearch/lib/*:/usr/share/elasticsearch/lib/cli-launcher/* org.elasticsearch.launcher.CliToolLauncher
elastic+ 170 71.3 32.0 428683128 5197536 ? Sl 13:43 128:31 /usr/share/elasticsearch/jdk/bin/java -Des.networkaddress.cache.ttl=60 -Des.networkaddress.cache.negative.ttl=10 -Djava.security.manager=allow -XX:+AlwaysPreTouch -Xss1m -Djava.awt.headless=true -Dfile.encoding=UTF-8 -Djna.nosys=true -XX:-OmitStackTraceInFastThrow -Dio.netty.noUnsafe=true -Dio.netty.noKeySetOptimization=true -Dio.netty.recycler.maxCapacityPerThread=0 -Dlog4j.shutdownHookEnabled=false -Dlog4j2.disable.jmx=true -Dlog4j2.formatMsgNoLookups=true -Djava.locale.providers=SPI,COMPAT --add-opens=java.base/java.io=ALL-UNNAMED -Des.cgroups.hierarchy.override=/ -XX:+UseG1GC -Djava.io.tmpdir=/tmp/elasticsearch-7239405004551249449 -XX:+HeapDumpOnOutOfMemoryError -XX:+ExitOnOutOfMemoryError -XX:HeapDumpPath=data -XX:ErrorFile=logs/hs_err_pid%p.log -Xlog:gc*,gc+age=trace,safepoint:file=logs/gc.log:utctime,pid,tags:filecount=32,filesize=64m -Xmx4g -Xms4g -XX:MaxDirectMemorySize=2147483648 -XX:G1HeapRegionSize=4m -XX:InitiatingHeapOccupancyPercent=30 -XX:G1ReservePercent=15 -Des.distribution.type=docker --module-path /usr/share/elasticsearch/lib --add-modules=jdk.net -m org.elasticsearch.server/org.elasticsearch.bootstrap.Elasticsearch
elastic+ 191 0.0 0.0 114336 6164 ? Sl 13:43 0:00 /usr/share/elasticsearch/modules/x-pack-ml/platform/linux-x86_64/bin/controller
There are two Java processes. The CliToolLauncher class is the entry point. It loads server-cli, which launches the org.elasticsearch.bootstrap.Elasticsearch server process. Why this design? Because ES also provides other command-line tools, which also use CliToolLauncher as an entry point.
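You can see those other tools in the bin directory; the listing below is abridged and illustrative, since tool names vary by version and distribution:

elasticsearch@elasticsearch-master-0:~$ ls /usr/share/elasticsearch/bin
elasticsearch           elasticsearch-keystore  elasticsearch-plugin
elasticsearch-certutil  elasticsearch-node      elasticsearch-users
...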
Another thing to note is the memory usage. Elasticsearch uses 428G of virtual memory but only 5G of physical memory. Underneath, Lucene uses mmap to make searching faster. So the official ES guideline is to give at most 50% of memory to the JVM heap, so that the rest can be used for Lucene's mmap. The contributors of Lucene are even more aggressive: reserve only 25% for the JVM heap. Here is the relevant code for Lucene memory-mapped indices.
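Following the 50% guideline on, say, a node with 16G of RAM (a hypothetical example), you would pin the heap to 8G, either in config/jvm.options or via the ES_JAVA_OPTS environment variable:

# config/jvm.options
-Xms8g
-Xmx8g

# equivalently, when running the Docker image:
docker run -e ES_JAVA_OPTS="-Xms8g -Xmx8g" ...

Setting -Xms equal to -Xmx avoids heap resizing at runtime, which is also what the official images do.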
Shards
According to this elastic doc, ES searches run on a single thread per shard, so making shards smaller leads to faster searches. But in reality, more shards mean more overhead, so the commonly recommended shard size is between 10GB and 50GB. You can also use the query below to see the size of the search thread pool, which determines the maximal parallelism.
GET /_cat/thread_pool/search?v=true&h=node_name,name,core,largest,max,qs,size,type
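Illustrative output for a hypothetical 4-core node (values are a sketch: the search pool is a fixed pool whose size defaults to 3*cores/2+1, while core and max only apply to scaling pools and are left blank here):

node_name              name   core largest max qs   size type
elasticsearch-master-0 search      7           1000 7    fixed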
An ES data stream is a good way to control shard size, since its backing indices can be rolled over automatically; see the sketch below.
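A minimal sketch, assuming hypothetical names logs-policy and logs-template, which rolls the backing index over once a primary shard reaches 50GB:

PUT /_ilm/policy/logs-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_primary_shard_size": "50gb" }
        }
      }
    }
  }
}

PUT /_index_template/logs-template
{
  "index_patterns": ["logs-app-*"],
  "data_stream": {},
  "template": {
    "settings": { "index.lifecycle.name": "logs-policy" }
  }
}

Indexing into any name matching logs-app-* then creates the data stream, and ILM keeps each backing index's primary shards at roughly 50GB.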
Storage
The endpoint /_cat/indices shows basic information about ES indices. Example below.
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open filebeat-7.10.0-2023.02.19 pCBMkxMzQ1CAwoOkkyLY3g 1 1 43446487 0 118.6gb 59.5gb
green open filebeat-7.10.0-2023.02.20 xdKlPEsiQqu4Bt6NimFifA 1 1 45494980 0 128.6gb 64.4gb
Take note of the uuid field above. The index files are stored on disk under data/indices/<index_uuid>. Inside this folder, you will see files with various extensions: .tim, .cfs, etc. These are Lucene index files. Check out this description if you are interested.
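For example, listing a shard's index directory (a sketch; the exact directory layout and file names vary by ES and Lucene version):

elasticsearch@elasticsearch-master-0:~$ ls data/indices/pCBMkxMzQ1CAwoOkkyLY3g/0/index
_0.cfe  _0.cfs  _0.si  segments_5  write.lock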
Misc
How to get the ES version?
/usr/share/elasticsearch/bin/elasticsearch --version
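On a running cluster, you can also ask the REST root endpoint; the response (abridged here, with an illustrative version number) contains version.number:

GET /

{
  "name" : "elasticsearch-master-0",
  "version" : {
    "number" : "8.6.2",
    ...
  },
  "tagline" : "You Know, for Search"
}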