These are some of the handnotes that I've prepared over the course of a few years working with Kubernetes. Many of these points are not widely known, unless a person has dug deep into the official Kubernetes docs.
These are in no specific order, and are meant to be used as notes for quick, yet comparatively detailed, reference to Kubernetes.
The target audience for these handnotes is beginners who have familiarized themselves with the core concepts of Kubernetes and now wish to dig deeper for better understanding, and professionals who want a quick and thorough reference to the most important aspects of Kubernetes without referring to the official docs again and again.
These are in no way a replacement for the official Kubernetes docs, which are, for all purposes, the best available resource for learning Kubernetes. These handnotes just try to provide a better ratio of knowledge gained to time taken, compared to the official Kubernetes docs.
Resources covered:
- Pods
- ReplicaSets
- Controller
- Master Components
- Node components
- Objects
- UIDs
- Namespaces
- Services
- Labels
- Label Selectors
- Field Selectors
- Annotations
- Object Management
- Init Containers
- Secret generator
- Persistent volumes (PV) and Persistent volume claims (PVC)
- Ingress
- Ingress Controller
- Service discovery
- Endpoint
- Kube proxy
- Kube dns
- Etcd
Pods
- Each Pod in a Kubernetes cluster has a unique IP address, even Pods on the same Node.
- Every container in a Pod shares the network namespace, including the IP address and network ports.
- By default, Docker uses host-private networking, so containers can talk to other containers only if they are on the same machine.
- Containers inside a Pod can communicate with one another using localhost, and all pods in a cluster can see each other without NAT.
- All containers in the Pod can access the shared volumes, allowing those containers to share data.
- Although each Pod has a unique IP address, those IPs are not exposed outside the cluster without a Service. Services allow your applications to receive traffic.
ReplicaSets
- Scaling is accomplished by changing the number of replicas in a Deployment
- A ReplicaSet drives the cluster back to the desired state by creating new Pods whenever needed, to keep your application running.
- Kubernetes also supports autoscaling of Pods, but it is outside of the scope of this article. Scaling to zero is also possible, and it will terminate all Pods of the specified Deployment.
- Replica counts reported for a Deployment and its ReplicaSet (for example, by kubectl get deployments):
- 1. DESIRED : Configured number of replicas
- 2. CURRENT : Show how many replicas are running now
- 3. UP-TO-DATE : Number of replicas that were updated to match the desired (configured) state
- 4. AVAILABLE : Shows how many replicas are actually AVAILABLE to the users
- Updates in kubernetes are versioned, and any deployment update can be reverted to a previous (stable) version.
Controller
- A controller handles all aspects of pod management including, but not limited to, creation, scheduling, replication, and healing of pods.
Kubernetes Master Components
- kube-apiserver : Exposes the Kubernetes API. It is designed to scale horizontally
- etcd : Consistent and highly-available key value store used as Kubernetes’ backing store for all cluster data. Always have a backup plan for etcd’s data for your Kubernetes cluster.
- kube-scheduler : Schedules newly created pods to run on nodes based on individual and collective resource requirements, hardware/software/policy constraints, affinity and anti-affinity specifications, data locality, inter-workload interference and deadlines.
- kube-controller-manager : Runs controllers. These controllers include:
- 1. Node controller: Responsible for noticing and responding when nodes go down.
- 2. Replication Controller: Responsible for maintaining the correct number of pods for every replication controller object in the system.
- 3. Endpoints Controller: Populates the Endpoints object (that is, joins Services & Pods).
- 4. Service Account & Token Controllers: Create default accounts and API access tokens for new namespaces.
- cloud-controller-manager : Runs controllers that interact with the underlying cloud providers. The cloud-controller-manager binary is an alpha feature introduced in Kubernetes release 1.6. The following controllers have cloud provider dependencies:
- 1. Node Controller: For checking the cloud provider to determine if a node has been deleted in the cloud after it stops responding
- 2. Route Controller: For setting up routes in the underlying cloud infrastructure
- 3. Service Controller: For creating, updating and deleting cloud provider load balancers
- 4. Volume Controller: For creating, attaching, and mounting volumes, and interacting with the cloud provider to orchestrate volumes
Kubernetes Node components
- kubelet : An agent that runs on each node in the cluster. It makes sure that containers are running in a pod. It doesn’t manage containers which were not created by Kubernetes.
- Container runtime
- Addons : Provides cluster features
- DNS : DNS server. Serves DNS records for Kubernetes services. Containers started by Kubernetes automatically include this DNS server in their DNS searches.
- Web UI
- Container Resource Monitoring
- Cluster-level logging
Kubernetes Objects
- They are persistent entities in the Kubernetes system. Kubernetes uses these entities to represent the state of your cluster.
- By creating an object, you’re effectively telling the Kubernetes system what you want your cluster’s workload to look like; this is your cluster’s desired state.
- Every Kubernetes object includes two nested object fields that govern the object’s configuration: the object spec and the object status.
- 1. Spec : It describes your desired state for the object
- 2. Status : It describes the actual state of the object, and is supplied and updated by the Kubernetes system.
- Required fields in an object spec file:
- 1. apiVersion
- 2. kind
- 3. metadata: contains name (mandatory), uid (mandatory, system provided) and namespace (optional)
- 4. spec
- All objects in the Kubernetes REST API are unambiguously identified by a Name and a UID.
- Name is a client-provided string that refers to an object in a resource URL, such as /api/v1/pods/some-name. This becomes DNS_NAME?.
- Only one object of a given kind can have a given name at a time. However, if you delete the object, you can make a new object with the same name.
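As a rough illustration of the required fields listed above, the sketch below builds a minimal Pod manifest as a plain Python dict and reports which required fields are missing. The helper name `missing_required_fields` and the example values (`nginx-demo`, `nginx:1.25`) are assumptions made for illustration; this is not how the API server itself validates objects.

```python
import json

# Top-level fields the user supplies; uid is filled in by the system.
REQUIRED_TOP_LEVEL = ("apiVersion", "kind", "metadata", "spec")


def missing_required_fields(manifest):
    """Return the required fields that are absent from a manifest dict."""
    missing = [field for field in REQUIRED_TOP_LEVEL if field not in manifest]
    # metadata.name is treated as mandatory here; metadata.namespace is optional.
    if "name" not in manifest.get("metadata", {}):
        missing.append("metadata.name")
    return missing


# Hypothetical example object.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "nginx-demo"},
    "spec": {"containers": [{"name": "nginx", "image": "nginx:1.25"}]},
}

print(missing_required_fields(pod))
print(missing_required_fields({"kind": "Pod"}))
print(json.dumps(pod, indent=2))
```

The same dict, dumped as JSON or YAML, is what would go into an object config file.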
Kubernetes UIDs
- They are system-generated strings that uniquely identify objects.
- Every object created over the whole lifetime of a Kubernetes cluster has a distinct UID. It is intended to distinguish between historical occurrences of similar entities.
Namespaces
- Kubernetes supports multiple virtual clusters backed by the same physical cluster. These virtual clusters are called namespaces.
- Namespaces are intended for use in environments with many users spread across multiple teams, or projects.
- Namespaces provide a scope for names. Names of resources need to be unique within a namespace, but not across namespaces.
- Namespaces can not be nested inside one another.
- Default namespaces:
- 1. default
- 2. kube-system: For objects created by the Kubernetes system
- 3. kube-public: Readable by all users (including those not authenticated).
- If you want to reach a service across namespaces, you need to use the fully qualified domain name (FQDN).
- Default behavior of Kubernetes is to look up a service in the local namespace. A Service DNS entry is of the form: <service-name>.<namespace-name>.svc.cluster.local.
- Namespace resources are not themselves in a namespace.
- Low-level resources, such as nodes and persistentVolumes, are not in any namespace.
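To make the naming rule for cross-namespace lookups concrete, here is a minimal sketch that picks the DNS name a Pod could use for a Service. The function name `service_dns_name` is hypothetical, and `cluster.local` is assumed as the cluster domain (a common default, but configurable per cluster).

```python
def service_dns_name(service, service_namespace, caller_namespace,
                     cluster_domain="cluster.local"):
    """Return a name a Pod in caller_namespace can use to reach a Service."""
    if service_namespace == caller_namespace:
        # Same namespace: the short name resolves via the local lookup.
        return service
    # Different namespace: use the fully qualified domain name.
    return f"{service}.{service_namespace}.svc.{cluster_domain}"


print(service_dns_name("my-service", "my-ns", "my-ns"))
print(service_dns_name("my-service", "my-ns", "default"))
```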
Services
- A Service in Kubernetes is an abstraction which defines a logical set of Pods, and a policy by which to access them. Services enable a loose coupling between dependent Pods.
- Although each Pod has a unique IP address, those IPs are not exposed outside the cluster without a Service. Services allow your applications to receive traffic.
- A service is assigned a unique IP address (also called clusterIP).
- This address is tied to the lifespan of the Service, and will not change while the Service is alive
- A Service is backed by a group of Pods, and these pods are exposed through endpoints.
- The Service’s selector will be evaluated continuously and the results will be POSTed to an Endpoints object
- When a Pod dies, it is automatically removed from the endpoints, and new Pods matching the Service’s selector will automatically get added to the endpoints.
- A service IP is completely virtual; it never hits the wire.
- Kubernetes Services provide a stable, virtual IP (VIP) address.
- A virtual IP (VIP) address is one that is not attached to any network interface.
- The VIP's purpose is to forward traffic to Pods.
- Keeping the mapping between the VIP and the pods up-to-date is the job of kube-proxy, a process that runs on every node, which queries the API server to learn about new services in the cluster.
- The target of a Service does not necessarily have to be a Pod. It can be a component outside the cluster, a component in another namespace, or a non-Kubernetes component. To do this, define your Service without the selector attribute.
- With no selector attribute, no endpoints object is created.
- For multi-port services (services that expose more than one port), you must give all of your ports names, so that endpoints can be disambiguated
- Services can be exposed in different ways by specifying a type in the ServiceSpec:
- 1. ClusterIP (default) - Exposes the Service on an internal IP in the cluster. This type makes the Service only reachable from within the cluster.
- 2. NodePort - Exposes the Service on the same port of each selected Node in the cluster using NAT. Makes a Service accessible from outside the cluster using <NodeIP>:<NodePort>. This acts as a superset of ClusterIP.
- 3. LoadBalancer - Creates an external load balancer in the current cloud (if supported) and assigns a fixed, external IP to the Service. This acts as a superset of NodePort.
- 4. ExternalName - Exposes the Service using an arbitrary name (specified by externalName in the spec) by returning a CNAME record with the name. No proxy is used. This type requires v1.7 or higher of kube-dns.
- 5. To summarize, LoadBalancer is a superset of NodePort, which is a superset of ClusterIP. ExternalName stands apart: it only returns a CNAME record and does no proxying.
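- Difference between port, targetPort and nodePort:
- 1. port: Port on which the Service is exposed inside the cluster, making it visible to other workloads running within the same cluster
- 2. targetPort: Port on the Pod to which the Service forwards traffic, that is, the port on which the application is listening
- 3. nodePort: Port opened on every node, through which external users can reach the Service (traffic is forwarded by kube-proxy)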
- A Service routes traffic across a set of Pods.
- A few notes about NodePort type service:
- 1. It is not designed for production environments. Use a LoadBalancer or an Ingress Controller/Resource for that
- 2. Need to specify extra nodePort attribute to service definition
- 3. Opens specified (or automatically chosen if not specified) port on every node
- 4. You can only have one service per port
- 5. You can only use ports 30000–32767
- 6. If your Node/VM IP address changes, you need to deal with that
- A few notes about LoadBalancer type service:
- 1. Best method to expose a service to outside world, if your cloud provider supports it
- 2. There is no filtering, no routing, etc. This means you can send almost any kind of traffic to it, like HTTP, TCP, UDP, Websockets, gRPC, or whatever.
- 3. Each service exposed with LoadBalancer will get its own IP address, and you have to pay for a LoadBalancer per exposed service, which can get expensive.
- Use externalIPs in the Service spec to expose the Service on specific IP addresses that route to one or more cluster nodes. These IPs are not managed by Kubernetes.
- Services have an integrated load-balancer that distributes network traffic to all Pods of an exposed Deployment. Services continuously monitor the running Pods using endpoints, to ensure that traffic is sent only to available Pods.
- If you want to reach a service across namespaces, you need to use the fully qualified domain name (FQDN).
- Default behavior of kubernetes is to lookup a service in local namespace. Service DNS entry is of the form: <service-name>.<namespace-name>.svc.cluster.local.
- Kubernetes offers a DNS cluster addon service that automatically assigns dns names to other services.
- A few notes about Headless services:
- 1. When you don’t need or want load-balancing and a single service IP.
- 2. Specify None as the clusterIP value.
- 3. Allows developers to reduce coupling to the Kubernetes system by allowing them freedom to do discovery their own way.
- 4. Cluster IP is not allocated.
- 5. kube-proxy does not handle these services.
- 6. There is no load balancing or proxying done by the platform for them.
- 7. If selectors are defined, the endpoints controller creates Endpoints records in the API, and modifies the DNS configuration to return A records (addresses) that point directly to the Pods backing the Service.
- 8. If selectors are not defined, no Endpoints objects are created. However, DNS is still configured: CNAME records are created for ExternalName type services, and for all other types, A records are created for any Endpoints that share a name with the service.
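The relationship between port, targetPort and nodePort in a NodePort Service can be sketched as a manifest built in Python. The helper names, the `web` Service and the port numbers are assumptions for illustration, and the 30000 to 32767 range is the default mentioned above (a cluster may be configured differently).

```python
NODE_PORT_MIN, NODE_PORT_MAX = 30000, 32767  # default NodePort range


def is_valid_node_port(port):
    """Check a port against the default NodePort range (inclusive)."""
    return NODE_PORT_MIN <= port <= NODE_PORT_MAX


def node_port_service(name, selector, port, target_port, node_port):
    """Build a NodePort Service manifest as a plain dict."""
    if not is_valid_node_port(node_port):
        raise ValueError(
            f"nodePort {node_port} is outside {NODE_PORT_MIN}-{NODE_PORT_MAX}"
        )
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "NodePort",
            "selector": selector,
            "ports": [
                {
                    "name": "http",
                    "port": port,               # Service port inside the cluster
                    "targetPort": target_port,  # port the Pods listen on
                    "nodePort": node_port,      # port opened on every node
                }
            ],
        },
    }


# Hypothetical Service: node:30080 -> service:80 -> pod:8080
svc = node_port_service("web", {"app": "web"}, port=80, target_port=8080,
                        node_port=30080)
print(svc["spec"]["ports"][0])
```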
Labels
- Labels can be attached to objects at creation time or later on. They can also be modified at any time.
- A label key consists of an optional prefix and a name, separated by a slash (/). The name segment is required and must be 63 characters or less.
- The prefix must be a DNS subdomain: a series of DNS labels separated by dots (.), not longer than 253 characters in total, followed by a slash (/).
- Labels are used to specify identifying attributes of objects. Non-identifying information should be recorded using annotations.
- If the prefix is omitted, the label Key is presumed to be private to the user.
- The kubernetes.io/ and k8s.io/ prefixes are reserved for Kubernetes core components.
- Valid label values must be 63 characters or less. They could be empty too.
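The length rules above can be checked with a couple of regular expressions. This is a minimal sketch, not the validator Kubernetes uses: the function names are hypothetical, and it additionally assumes the documented character rules (names and values start and end with an alphanumeric character, with dashes, underscores and dots allowed in between), which are not spelled out in these notes.

```python
import re

# Name or value: 1 to 63 chars, alphanumeric at both ends.
NAME_RE = re.compile(r"[A-Za-z0-9]([A-Za-z0-9_.-]{0,61}[A-Za-z0-9])?")
# Prefix: lowercase DNS labels separated by dots.
PREFIX_RE = re.compile(
    r"[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*"
)


def is_valid_label_key(key):
    """Validate 'name' or 'prefix/name' against the rules above."""
    prefix, slash, name = key.rpartition("/")
    if slash and (len(prefix) > 253 or not PREFIX_RE.fullmatch(prefix)):
        return False
    return NAME_RE.fullmatch(name) is not None


def is_valid_label_value(value):
    """Values may be empty; otherwise they follow the name rules."""
    return value == "" or NAME_RE.fullmatch(value) is not None


print(is_valid_label_key("example.com/tier"))
print(is_valid_label_key("a" * 64))
print(is_valid_label_value(""))
```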
Label Selectors
- Using a label selector, a client or user can identify a set of objects. The label selector is the core grouping primitive in Kubernetes.
- Two types of label selectors are supported:
- 1. Equality-based label selectors
- 2. Set-based label selectors
- Equality-based label selectors:
- 1. Operators allowed: =, ==, !=
- 2. These can be specified as:
- New line terminated:
environment = production
tier != frontend
- Comma separated:
environment=production,tier!=frontend
- Set-based label selectors:
- 1. Operators allowed: in, notin and exists (only the key identifier)
- 2. Examples:
environment in (production, qa)
tier notin (frontend, backend)
partition
!partition
partition,environment notin (qa)
The last example selects resources with a partition key (no matter the value) and with an environment other than qa.
- Set-based label selectors can be mixed with equality-based selectors. Example:
partition in (customerA, customerB),environment!=qa
- Label selectors can be passed as query params in API calls and as arguments to kubectl commands to filter the returned results. Example:
partition in (customerA, customerB),environment!=qa
- Service and ReplicationController do NOT support set-based label selectors.
- Newer resources, such as Job, Deployment, ReplicaSet, and DaemonSet, support set-based label selectors.
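To show how an equality-based selector is evaluated against an object's labels, here is a small sketch covering only the comma-separated form with the =, == and != operators. The function names are hypothetical and this is a simplified model, not the Kubernetes parser; set-based expressions are deliberately left out.

```python
def parse_equality_selector(selector):
    """Parse 'k1=v1,k2!=v2' into (key, operator, value) tuples."""
    requirements = []
    for term in selector.split(","):
        # Try '!=' and '==' before '=' so operators are not mis-split.
        for op in ("!=", "==", "="):
            if op in term:
                key, value = term.split(op, 1)
                requirements.append((key.strip(), op, value.strip()))
                break
        else:
            raise ValueError(f"unsupported term: {term!r}")
    return requirements


def matches(labels, requirements):
    """Return True if a labels dict satisfies every requirement."""
    for key, op, value in requirements:
        if op in ("=", "=="):
            if labels.get(key) != value:
                return False
        elif labels.get(key) == value:
            # '!=' fails only when the key is present with that exact value.
            return False
    return True


reqs = parse_equality_selector("environment=production,tier!=frontend")
print(matches({"environment": "production", "tier": "backend"}, reqs))
print(matches({"environment": "production", "tier": "frontend"}, reqs))
```

Note that under `!=` an object without the key at all still matches, which is why the second requirement passes for objects that carry no tier label.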
Field Selectors
- They let you select Kubernetes resources based on the value of one or more resource fields. Examples:
status.phase=Pending
kubectl get pods --field-selector status.phase=Running
- Supported field selectors vary by Kubernetes resource type.
- All resource types support the metadata.name and metadata.namespace fields.
- Using unsupported field selectors produces an error.
- Multiple field selectors can be given using a comma as separator. Example:
kubectl get pods --field-selector=status.phase!=Running,spec.restartPolicy=Always
- Multiple resource types can also be filtered in one go. Example:
kubectl get statefulsets,services --all-namespaces --field-selector metadata.namespace!=default
Annotations
- They store non-identifying object data. They cannot be used to target/select objects based on their value(s).
- The metadata in an annotation can be small or large, structured or unstructured, and can include characters not permitted by labels.
- Annotations have the same syntax as labels.
Kubernetes Object Management
Following are the 3 ways to interact with kubernetes objects:
- Using Imperative commands: They operate on Live objects
- Using Imperative object configuration: They operate on Individual files
- Using Declarative object configuration: They operate on Directories of files
Imperative commands
- Used with kubectl with resource name in the command
Imperative object configuration
- Used with kubectl with operation(create, replace, etc.) and single object config file.
- The object config file specified must contain a full definition of the object in YAML or JSON format.
- Multiple files can also be specified. Example:
kubectl delete -f nginx.yaml -f redis.yaml
Declarative object configuration
- Operates on object configuration files stored locally.
- The user does not define the operations to be taken on the files.
- Create, update, and delete operations are automatically detected per-object by kubectl.
- Uses patch operation to preserve changes made by other writers, while applying new changes (diffs only).
- To see what changes are going to be made, use:
kubectl diff -f configs/
- To apply the changes:
kubectl apply -f configs/
- Use the command line flag -R to process directories recursively.
- These files are known as resource configs.
Init Containers
- They are specialized containers that run before app containers.
- They can contain utilities or setup scripts that are not present in the app image.
- They always run to completion.
- Each one must complete successfully before the next one is started.
- Custom containers can be specified as initContainers using initContainers field of PodSpec.
- They are almost exactly the same as regular containers in all aspects of the spec.
- They do not support readiness probes, as they must run to completion before the pod can be ready.
- They are started after network and volumes are initialized.
- Changes to the init container spec are limited to the container image field.
- The activeDeadlineSeconds is applicable on both types of containers.
- Changing an app container image only restarts the app container, not the init containers. To rerun the init containers, the init container image needs to be changed.
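The ordering rules above (init containers run one at a time, each must succeed, and app containers start only afterwards) can be pictured with a toy model. Everything here is hypothetical: `start_pod` and the container names are made up for illustration, and retries or restart policies are not modelled.

```python
def start_pod(init_containers, app_containers):
    """Return the ordered list of containers started in this toy model.

    init_containers: list of (name, run) pairs, where run() returns True
    on success. app_containers: list of names.
    """
    started = []
    for name, run in init_containers:
        started.append(f"init:{name}")
        if not run():
            # A failed init container blocks everything after it.
            return started
    started.extend(f"app:{name}" for name in app_containers)
    return started


ok = start_pod(
    [("wait-for-db", lambda: True), ("run-migrations", lambda: True)],
    ["web"],
)
blocked = start_pod(
    [("wait-for-db", lambda: False), ("run-migrations", lambda: True)],
    ["web"],
)
print(ok)
print(blocked)
```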
Secret generator: kustomization.yaml
- A Secret is an object that stores a piece of sensitive data like a password or key.
- Since 1.14, kubectl supports the management of kubernetes objects using a kustomization file.
- You can create a secret by generators in kustomization.yaml.
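As a rough sketch of the kind of object involved (not of kustomize itself), the snippet below builds a Secret manifest in Python. The values under `data` in a Secret are base64-encoded; the helper name `opaque_secret`, the Secret name `db-credentials` and the `username` value are assumptions for illustration.

```python
import base64


def opaque_secret(name, values):
    """Build a Secret manifest dict with base64-encoded data values."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": {
            key: base64.b64encode(value.encode("utf-8")).decode("ascii")
            for key, value in values.items()
        },
    }


secret = opaque_secret("db-credentials", {"username": "admin"})
print(secret["data"]["username"])  # YWRtaW4=
```

Base64 is an encoding, not encryption, so such a manifest should still be treated as sensitive.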
Persistent volumes (PV) and Persistent volume claims (PVC)
- Use the ReadWriteMany access mode, not ReadWriteOnce, when a volume needs to be shared (mounted read-write by multiple nodes).
- Access control: gid(group id) can be assigned to the created volume to restrict access to specific pods with the same gid.
- A PersistentVolume (PV) is a piece of storage in the cluster that has been manually provisioned by an administrator, or dynamically provisioned by kubernetes using a StorageClass.
- A PersistentVolumeClaim (PVC) is a request for storage by a user that can be fulfilled by a Persistent Volume.
- PersistentVolumes and PersistentVolumeClaims are independent from Pod lifecycles and preserve data through restarting, rescheduling, and even deleting Pods.
- Many cluster environments have a default StorageClass installed. When a StorageClass is not specified in the PersistentVolumeClaim, the cluster’s default StorageClass is used instead.
- In local clusters with the default storage class (hostPath), data is saved in the node's /tmp directory. Hence, it could be lost on reboot.
Kubernetes Secret object type
- Stores a piece of sensitive data like a password or key.
DNS
- Kubernetes offers a DNS cluster addon service that automatically assigns dns names to other services.
- To check whether it is running on your cluster, use the following:
kubectl get services kube-dns --namespace=kube-system
Ingress
- An API object that manages external access to the services in a cluster, typically HTTP.
- Ingress can provide load balancing, SSL termination and name-based virtual hosting.
- Kind of like a reverse proxy
- An Ingress can be configured to give services externally-reachable URLs, load balance traffic, terminate SSL, and offer name based virtual hosting.
- An Ingress does not expose arbitrary ports or protocols. Exposing services other than HTTP and HTTPS to the internet typically uses a service of type Service.Type=NodePort or Service.Type=LoadBalancer.
- You must have an Ingress controller to satisfy an Ingress. Only creating an Ingress resource has no effect.
- Note that NOT all ingress controllers support the full spec. Be careful while choosing.
- It supports TLS.
- Supports load balancing. A few of the common algorithms are supported. For others, a service load balancer can be used.
- Health checks are not exposed by default. But, readiness probes can be used.
- Cross availability zone deployments can be done, but this depends on cloud provider support. Refer to the federation documentation for details on deploying Ingress in a federated cluster.
- Types of Ingress:
- 1. Single Service Ingress: Exposes a single service. No host and path mapping
- 2. Simple fanout: Exposes multiple services using path-only mapping
- 3. Name based virtual hosting: Uses domain and path mappings
- At least on GKE, it spins up an L7 HTTP load balancer, hence it is not protocol agnostic.
- Many ingress controllers are available: Google Cloud Load Balancer, Nginx, Contour, Istio, and more.
Ingress Controller
- An Ingress Controller listens to the Kubernetes API for Ingress resources and then handles requests that match them.
- Can technically be any system capable of reverse proxying, but the most common is Nginx.
- Nginx controller needs a backend. Other controllers might not need one.
- An Ingress with no rules sends all traffic to a single default backend.
- The default backend is typically a configuration option of the Ingress controller and is not specified in your Ingress resources
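Host and path mapping, together with the default backend, can be pictured as a small routing table. This is a toy model only: the hosts, paths and backend names are hypothetical, and real controllers apply their own matching rules (the naive prefix check here would also match a path such as /apix).

```python
# (host, path prefix, backend service) - hypothetical rules
RULES = [
    ("foo.example.com", "/", "foo-svc:80"),
    ("bar.example.com", "/api", "bar-api-svc:8080"),
    ("bar.example.com", "/", "bar-web-svc:80"),
]
DEFAULT_BACKEND = "default-http-backend:80"  # hypothetical name


def route(host, path):
    """Pick the backend with the longest matching path prefix for a host."""
    candidates = [
        (prefix, backend)
        for rule_host, prefix, backend in RULES
        if rule_host == host and path.startswith(prefix)
    ]
    if not candidates:
        # No rule matched: traffic goes to the default backend.
        return DEFAULT_BACKEND
    return max(candidates, key=lambda c: len(c[0]))[1]


print(route("bar.example.com", "/api/users"))
print(route("bar.example.com", "/index.html"))
print(route("unknown.example.com", "/"))
```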
Service discovery
Following are the 2 ways in which service discovery can be provisioned:
- Using environment variables
- DNS (recommended)
Environment variables - for service discovery
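- kubelet adds environment variables of the form {SVCNAME}_SERVICE_HOST and {SVCNAME}_SERVICE_PORT to Pods for each active Service. For example, for a Service named redis-master with a cluster IP of 10.0.0.11, exposing TCP port 6379: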
REDIS_MASTER_SERVICE_HOST=10.0.0.11
REDIS_MASTER_SERVICE_PORT=6379
REDIS_MASTER_PORT=tcp://10.0.0.11:6379
REDIS_MASTER_PORT_6379_TCP=tcp://10.0.0.11:6379
REDIS_MASTER_PORT_6379_TCP_PROTO=tcp
REDIS_MASTER_PORT_6379_TCP_PORT=6379
REDIS_MASTER_PORT_6379_TCP_ADDR=10.0.0.11
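The naming pattern in the list above can be reproduced with a short sketch: the Service name is upper-cased and dashes become underscores. The function `service_env_vars` is hypothetical and only mirrors the variables shown here; it is not code taken from kubelet.

```python
def service_env_vars(name, cluster_ip, port, protocol="tcp"):
    """Build the discovery environment variables for one Service port."""
    prefix = name.upper().replace("-", "_")
    url = f"{protocol}://{cluster_ip}:{port}"
    port_prefix = f"{prefix}_PORT_{port}_{protocol.upper()}"
    return {
        f"{prefix}_SERVICE_HOST": cluster_ip,
        f"{prefix}_SERVICE_PORT": str(port),
        f"{prefix}_PORT": url,
        port_prefix: url,
        f"{port_prefix}_PROTO": protocol,
        f"{port_prefix}_PORT": str(port),
        f"{port_prefix}_ADDR": cluster_ip,
    }


env = service_env_vars("redis-master", "10.0.0.11", 6379)
for key, value in env.items():
    print(f"{key}={value}")
```

One practical consequence of this mechanism is ordering: a Pod only sees variables for Services that already existed when the Pod was started, which is one reason DNS is the recommended approach.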
DNS - for service discovery
- It's a cluster add-on.
- The DNS server watches the Kubernetes API for new services, and creates a set of DNS records for each.
- For service my-service in namespace my-ns, a DNS record for my-service.my-ns is created.
- There is no need to specify the namespace if the calling Pod is in the same namespace as the Service: a lookup for my-service is enough.
- The Kubernetes DNS server is the only way to access services of type ExternalName.
Endpoint
- An Endpoint is an object-oriented representation of a REST API endpoint that is populated on the Kubernetes API server. Thus, in Kubernetes terms, an endpoint is the way to access the resource behind it (e.g. a Pod).
- Contains EndpointSubset array.
- EndpointSubset is a group of addresses with a common set of ports. The expanded set of endpoints is the cartesian product of Addresses x Ports.
- EndpointAddress of EndpointSubset may NOT be loopback (127.0.0.0/8), link-local (169.254.0.0/16), or link-local multicast (224.0.0.0/24).
- IPv6 is also accepted, but not fully supported on all platforms.
- The Service’s selector is continuously evaluated, and the results are POSTed to an endpoints object
- When a Pod dies, it is automatically removed from the endpoints, and new pods matching the service’s selector are automatically added to the endpoints.
- Endpoints track the IP addresses of the objects the service sends traffic to.
- An Endpoints object can be loosely coupled with a service by giving both the same name. See https://kubernetes.io/docs/concepts/services-networking/service/#services-without-selectors for more.
- With no selector attribute mentioned in a service, no Endpoints object is created automatically.
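The "Addresses x Ports" expansion and the address restrictions described above can be sketched with the standard library. The function names are hypothetical, the example IPs are made up, and the output formatting is meant for IPv4 addresses only.

```python
import ipaddress
from itertools import product

LINK_LOCAL_MULTICAST = ipaddress.ip_network("224.0.0.0/24")


def is_allowed_endpoint_address(ip):
    """Reject loopback, link-local and link-local multicast addresses."""
    addr = ipaddress.ip_address(ip)
    return not (
        addr.is_loopback or addr.is_link_local or addr in LINK_LOCAL_MULTICAST
    )


def expand_subset(addresses, ports):
    """Cartesian product of addresses and ports, as 'ip:port' strings."""
    return [f"{ip}:{port}" for ip, port in product(addresses, ports)]


print(expand_subset(["10.0.0.1", "10.0.0.2"], [80, 443]))
print(is_allowed_endpoint_address("127.0.0.1"))
```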
Kube proxy
- It's a special daemon (application) running on every worker node.
- Can run in two modes, configurable with the --proxy-mode command line switch:
- 1. userspace
- 2. iptables
- For higher throughput and better latency, use iptables proxy mode.
- Not IPv6 ready.
- Maintains network rules and performs connection forwarding.
- This is useful for:
- 1. Debugging your services, or connecting to them directly from your laptop for some reason
- 2. Allowing internal traffic, displaying internal dashboards, etc.
Kube dns
- This allows accessing K8s services using their names directly, rather than VIP:PORT combination.
- When you use kube-dns, K8s injects certain nameservice lookup configuration into new pods that allows you to query the DNS records in the cluster.
- kube-dns creates an internal cluster DNS zone which is used for DNS and service discovery. This means that we can access the services from inside the pods via the service names directly. Example:
curl -I nginx-svc:8080
- You can run the following inside a pod to see the DNS config as set up by kube-dns:
cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local google.internal c.kube-blog.internal
nameserver 10.32.0.10
options ndots:5
- If the service is created in the default namespace, it can be accessed using the cluster internal DNS name, too:
curl -I nginx-svc.default.svc.cluster.local:8080
Etcd
- This is a consistent and highly-available key value store, used as kubernetes’ backing store for all cluster data.
- Make sure to always have a backup plan for etcd’s data for your kubernetes cluster.
- All data is stored in etcd under /registry keys. Example:
/registry/services
/registry/events
/registry/secrets
/registry/minions
- The following command can be used to query data in etcd:
etcdctl --ca-file=/etc/etcd/ca.pem get /registry/services/endpoints/default/kubernetes
That's all, folks ¯\(ツ)/¯
Drop me a mail at contact@nitinbansal.dev or DM me on twitter if you have any suggestions or need any help with software development.
Also, do visit my blog https://nitinbansal.dev if you like this. I have much more useful content planned to be added.