Istio

Istio used to have a micro-service architecture. However, starting from v1.5, it moved back to a monolithic architecture, which largely simplifies the management of Istio. After v1.5, we only need to worry about two components: the proxy agent and the istiod daemon. See more details in this blog.

It sounds simple with only two components, right? In reality it is not. See the pods below, grabbed from an old installation.

$ k get pods -A | grep istio
istio-operator               istio-operator-6f7d5894cc-mw5bp          1/1     Running
istio-system                 istio-egressgateway-58c576f5f-w9wht      1/1     Running
istio-system                 istio-ingressgateway-775747ccd7-66pdg    1/1     Running
istio-system                 istiod-888fc4cbb-4trqq                   1/1     Running

First, let’s talk about istio-operator. This pod actually should not exist. Installing Istio using the operator mode is discouraged: it is deprecated in Istio 1.23 and will be removed in version 1.24. Quote from the source code:

The operator formerly acted as an in-cluster operator, dynamically reconciling an Istio installation. This mode has now been removed, and it only serves as a client-side CLI tool to install Istio.

We can safely remove this pod/deployment if we use Helm or istioctl to install Istio.
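
If the operator deployment is still hanging around after switching to Helm or istioctl (as in the listing above), the cleanup could look like the sketch below. istioctl operator remove is istioctl’s built-in cleanup; the namespace name comes from the pod listing above, so double-check what else lives there before deleting anything.

# built-in cleanup of the in-cluster operator controller
istioctl operator remove
# or remove the deployment and its namespace directly
kubectl delete deployment istio-operator -n istio-operator
kubectl delete namespace istio-operator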

Second, let’s turn to the ingress/egress gateways. We should not confuse them with the k8s Gateway API. They are related, but not the same.

A quick recap on the k8s Gateway API. Gateway is similar to Ingress: GatewayClass is to Gateway what IngressClass is to Ingress. However, Gateway is more powerful. Gateway supports L4 and L7 routing, while Ingress only supports HTTP routing. Ingress can only route traffic within a single namespace, so it can’t be used as a unified gateway across multiple namespaces. The Gateway API has been available since k8s 1.24. However, it is not natively implemented inside k8s; it exists as an add-on. You need to manually install those CRDs or choose an implementation which handles it for you.
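
Since the CRDs do not ship with k8s, installing them is a one-liner against the gateway-api release manifests. A minimal sketch (the release tag here is an example; pin whatever version your implementation supports):

kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.0.0/standard-install.yaml
# sanity check that the CRDs are registered
kubectl get crd gateways.gateway.networking.k8s.io httproutes.gateway.networking.k8s.io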

Finally, John Howard has an excellent blog talking about all these Istio confusions. Please check it out!

Pilot

Pilot has two components: pilot-agent and pilot-discovery. If you have read my Envoy notes below, you know that the xDS protocol has two parts: proxy and management server. pilot-agent is the proxy part, and pilot-discovery is the management server part.

Pilot-agent

It runs in the istio-proxy container. Example below:

/usr/local/bin/pilot-agent proxy sidecar --domain filebeat.svc.cluster.local --proxyLogLevel=warning --proxyComponentLogLevel=misc:error --log_output_level=default:info --concurrency 2

/usr/local/bin/envoy -c etc/istio/proxy/envoy-rev0.json --restart-epoch 0 --drain-time-s 45 --drain-strategy immediate --parent-shutdown-time-s 60 --service-cluster airflow.data --service-node sidecar~172.31.76.240~airflow-worker-0.data~data.svc.cluster.local --local-address-ip-version v4 --bootstrap-version 3 --disable-hot-restart --log-format %Y-%m-%dT%T.%fZ.%l.envoy %n.%v -l warning --component-log-level misc:error --concurrency 2

It is mainly responsible for managing the Envoy proxy sidecar.
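
Besides launching Envoy with the flags above, pilot-agent also exposes a small request subcommand that proxies to the local Envoy admin endpoint, which is handy for poking at a sidecar. A sketch, assuming a pod from the examples above (subcommand availability can vary slightly across Istio versions):

# proxy a request to the local Envoy admin endpoint via pilot-agent
kubectl exec airflow-worker-0 -n data -c istio-proxy -- pilot-agent request GET server_info
kubectl exec airflow-worker-0 -n data -c istio-proxy -- pilot-agent request GET clusters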

Pilot-discovery

It runs in the istiod pod. Example below:

/usr/local/bin/pilot-discovery discovery --monitoringAddr=:15014 --log_output_level=default:info --domain cluster.local --keepaliveMaxServerConnectionAge 30m

Pilot-discovery is responsible for reacting to k8s events, converting these events to xDS responses and sending them to pilot-agents. We use two examples to illustrate how this part works.

Example 1: When a new service is deployed in a k8s cluster

As you can imagine, pilot-discovery subscribes to service create/update/delete events in a k8s cluster. When such an event happens, pilot-discovery will update its internal service registry and push the change to all Envoy proxies.

How does this process work? The answer is Kubernetes controllers. The controllers for all types of resources inside Kubernetes, such as namespaces, nodes, services, and pods, are defined in this file. For services in particular, it registers a handler for service events:

	c.registerHandlers(c.serviceInformer, "Services", c.onServiceEvent, nil)

Inside the handler, it calls xdsUpdater to push the change to remote Envoy proxies.
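
To see this path end to end, you can create a throwaway service and watch istiod react. A rough sketch (the exact log wording differs across versions, but a push should be triggered for the new service):

kubectl create namespace push-test
kubectl create service clusterip push-demo --tcp=80:80 -n push-test
# look for push activity in the istiod logs
kubectl logs deploy/istiod -n istio-system --since=1m | grep -i push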

Example 2: What happens when setting up a service mesh across two k8s clusters

Istio is able to set up a service mesh across multiple k8s clusters. How is this possible? One step of the installation process is to run the command below:

istioctl x create-remote-secret --context="${CTX_CLUSTER1}" --name=cluster1 | kubectl apply -f - --context="${CTX_CLUSTER2}"

It generates a YAML file like the following

apiVersion: v1
kind: Secret
metadata:
  annotations:
    networking.istio.io/cluster: cluster1
  creationTimestamp: null
  labels:
    istio/multiCluster: "true"
  name: istio-remote-secret-cluster1
  namespace: istio-system
stringData:
  cluster1: |
    apiVersion: v1
    clusters:
    - cluster:
        certificate-authority-data: <...>
        server: https://369A1EB0F8100069D5C96281F5BD706D.gr7.us-east-2.eks.amazonaws.com
      name: cluster1
    contexts:
    - context:
        cluster: cluster1
        user: cluster1
      name: cluster1
    current-context: cluster1
    kind: Config
    preferences: {}
    users:
    - name: cluster1
      user:
        token: <...>

and then applies it to cluster2. Basically, cluster2 now has credentials to operate on cluster1. See this code for more details about how this command works. The secret creation event will be captured by the secretcontroller, which listens to all events on secrets with the label istio/multiCluster and registers a set of ClusterHandlers to handle secret create/update/delete events.
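
To confirm that the secret landed in cluster2 and that istiod there picked up the remote cluster, something like the following should work (istioctl remote-clusters exists in recent releases; treat it as an assumption if you run an old version):

kubectl get secret -n istio-system -l istio/multiCluster=true --context="${CTX_CLUSTER2}"
istioctl remote-clusters --context="${CTX_CLUSTER2}"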

How to use pilot-discovery debug endpoints

Pilot-discovery provides a set of debug endpoints. Here, we show some example use cases.

Get all connected istio-proxy instances

This endpoint will return all pods that have istio-proxy injected.

$ ke istiod-64fb77c48d-x8cjj -- curl localhost:15014/debug/connections
{
    "totalClients": 21,
    "clients": [
      {
        "connectionId": "sidecar~172.31.90.176~airflow-scheduler-7bc4cf9fb8-bnb4r.data~data.svc.cluster.local-65372",
        "connectedAt": "2022-12-30T07:56:30.843188241Z",
        "address": "172.31.90.176:45502",
        "watches": null
      },
...

Get Envoy config dump

The Envoy admin API has an endpoint GET /config_dump. Istio has a debug endpoint that forwards requests to this config_dump endpoint.

ke istiod-64fb77c48d-x8cjj -- curl localhost:15014/debug/config_dump?proxyID=sidecar~172.31.90.176~airflow-scheduler-7bc4cf9fb8-bnb4r.data~data.svc.cluster.local-6537

The output is huge. Note the proxyID query parameter is required and it is the connectionId shown above.
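
Since this is a standard Envoy config dump, you can slice it with jq instead of reading it raw. For example, to list just the top-level config sections (assuming jq is available where you run this; the proxyID placeholder is the connectionId from the connections output above):

ke istiod-64fb77c48d-x8cjj -- curl -s 'localhost:15014/debug/config_dump?proxyID=<connectionId>' | jq '.configs[] | ."@type"'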

Get all services

$ ke istiod-64fb77c48d-x8cjj -- curl localhost:15014/debug/registryz
[
{
  "Attributes": {
    "ServiceRegistry": "Kubernetes",
    "Name": "airflow-datadog-cluster-agent-admission-controller",
    "Namespace": "data",
    "UID": "istio://data/services/airflow-datadog-cluster-agent-admission-controller",
    "ExportTo": null,
    "LabelSelectors": {
      "app": "airflow-datadog-cluster-agent"
    },
    "ClusterExternalAddresses": null,
    "ClusterExternalPorts": null
  },
  "ports": [
    {
      "port": 443,
      "protocol": "UnsupportedProtocol"
    }
  ],
  "creationTime": "2022-11-23T05:46:05Z",
  "hostname": "airflow-datadog-cluster-agent-admission-controller.data.svc.cluster.local",
  "address": "10.100.94.238",
  "Mutex": {},
  "cluster-vips": {
    "cluster1": "10.100.94.238"
  },
  "Resolution": 0,
  "MeshExternal": false
},

The cluster-vips field displays the virtual IP of the service in the corresponding cluster. If you have the same service set up in the same namespace in two clusters, then you will see two records in cluster-vips, and Istio will round-robin requests across the two clusters; in other words, this service is a cross-cluster service.

There is a simpler way provided by istioctl.

istioctl proxy-config cluster <pod-name>.<namespace>
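
To check a suspected cross-cluster service specifically, you can filter by FQDN and then look at the endpoints Envoy actually knows about; endpoints from both clusters should show up. A sketch with placeholder names and port (both flags exist in current istioctl):

istioctl proxy-config cluster <pod-name>.<namespace> --fqdn <service>.<namespace>.svc.cluster.local
istioctl proxy-config endpoints <pod-name>.<namespace> --cluster "outbound|80||<service>.<namespace>.svc.cluster.local"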

How does injection work

The Istio installation creates a MutatingWebhookConfiguration named istio-sidecar-injector. See the sample output below. As you can see, istiod has an endpoint /inject that does the injection work.

$ k describe mutatingwebhookconfiguration istio-sidecar-injector
...
    Service:
      Name:        istiod
      Namespace:   istio-system
      Path:        /inject
      Port:        443
...

Check out the code for more details.
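
Note that the webhook only mutates pods in namespaces that opt in. A quick way to check and enable injection for a namespace (using the data namespace from the earlier examples; revision-based labels work differently):

kubectl get namespace -L istio-injection
kubectl label namespace data istio-injection=enabled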

Useful commands

  • istioctl version: quickly get all running versions.