Kubernetes
AI-Optimized Primer | Production-ready container orchestration patterns for Claude Code, Cursor & AI agents
- Category: Infrastructure & DevOps
- Difficulty: Intermediate to Advanced
- Examples: 85+ copy-paste YAML configurations
- Tokens: 32,000+ comprehensive coverage
- Last Updated: January 28, 2025
- AI-Ready: Optimized for agent consumption
The Ultimate Guide to Container Orchestration and Cloud-Native Application Deployment
Tags: #kubernetes #containers #orchestration #devops #cloud-native
Table of Contents
- Introduction & Philosophy
- Installation & Setup
- Core Concepts & Architecture
- Workload Management
- Networking & Service Discovery
- Storage & Persistence
- Configuration Management
- Security Best Practices
- Monitoring & Observability
- Advanced Features
- Custom Resource Definitions (CRDs) and Custom Resources
- Helm Package Management
- Service Mesh
- Serverless and Knative
- Autoscaling
- Resource Management
- Multi-Cluster Management
- Real-World Patterns and Use Cases
- GitOps Workflows
- Troubleshooting
- CI/CD & GitOps Integration
- Best Practices
- Resources & Next Steps
Introduction & Philosophy
What is Kubernetes?
Kubernetes (often abbreviated as K8s) is an open-source container orchestration platform that automates the deployment, scaling, and management of containerized applications. It provides a framework for running distributed systems resiliently, handling scaling and failover for applications, and providing deployment patterns.
Core Philosophy
"Declarative Configuration and Desired State Management" - Kubernetes embodies the principle that you declare what you want, not how to achieve it. The system continuously works to match the actual state to your desired state.
Key philosophical principles:
- Declarative Configuration: Describe the desired state, and Kubernetes works to achieve and maintain it
- Automation: Automates manual processes involved in deploying, scaling, and managing containerized applications
- Self-Healing: Automatically restarts failed containers, replaces and reschedules containers when nodes die
- Extensibility: Designed to be extended and customized without changing the core system
- Portability: Runs anywhere - on-premises, hybrid cloud, public cloud, and edge computing
Key Differentiators
- Not a PaaS: Doesn't limit supported languages/runtimes or frameworks
- Container-Centric: Manages containers rather than VMs or bare metal
- Loosely Coupled: Components are extensible and can be swapped out
- Ecosystem Rich: Large and growing ecosystem of tools and integrations
- Cloud-Native: Designed for cloud architectures and microservices
Architecture Overview
┌────────────────────────────────────────────────────────────┐
│ Control Plane │
│ ┌─────────────┐ ┌──────────────┐ ┌──────────────────┐ │
│ │ API Server │ │ etcd │ │ Controller Mgr │ │
│ │ │ │ (Key-Value │ │ │ │
│ │ Frontend │ │ Store) │ │ Node Controller │ │
│ │ │ │ │ │ Replication Ctrl │ │
│ └─────────────┘ └──────────────┘ └──────────────────┘ │
│ │ │ │
│ ┌──────────────┐ ┌────────────────────┐ │
│ │ Scheduler │ │ Cloud Controller │ │
│ │ │ │ Mgr │ │
│ └──────────────┘ └────────────────────┘ │
└────────────────────────────────────────────────────────────┘
│
│ Watches/Communicates
│
┌─────────────────────────────────────────────────────────┐
│                      Worker Nodes                       │
│                                                         │
│      Node 1             Node 2             Node N       │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐  │
│  │   kubelet   │    │   kubelet   │    │   kubelet   │  │
│  └─────────────┘    └─────────────┘    └─────────────┘  │
│         │                  │                  │         │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐  │
│  │ kube-proxy  │    │ kube-proxy  │    │ kube-proxy  │  │
│  └─────────────┘    └─────────────┘    └─────────────┘  │
│         │                  │                  │         │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐  │
│  │  Container  │    │  Container  │    │  Container  │  │
│  │   Runtime   │    │   Runtime   │    │   Runtime   │  │
│  │  (Docker/   │    │  (Docker/   │    │  (Docker/   │  │
│  │  containerd │    │  containerd │    │  containerd │  │
│  │   /CRI-O)   │    │   /CRI-O)   │    │   /CRI-O)   │  │
│  └─────────────┘    └─────────────┘    └─────────────┘  │
│         │                  │                  │         │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐  │
│  │    Pods     │    │    Pods     │    │    Pods     │  │
│  │ (Containers)│    │ (Containers)│    │ (Containers)│  │
│  └─────────────┘    └─────────────┘    └─────────────┘  │
└─────────────────────────────────────────────────────────┘
Installation & Setup
System Requirements
Minimum Requirements for Production Clusters
- CPU: 2+ CPUs per machine
- RAM: 2GB+ RAM per machine
- Disk: 20GB+ free disk space
- Network: Full network connectivity between all machines
- Unique Identifiers: Unique hostname, MAC address, and product_uuid for each node
- Swap: Disabled (by default the kubelet refuses to start if swap is detected; see the commands after this list)
Required Ports
- Control Plane Nodes: 6443 (API server), 2379-2380 (etcd), 10250 (kubelet), 10259 (kube-scheduler), 10257 (kube-controller-manager)
- Worker Nodes: 10250 (kubelet), 30000-32767 (NodePort services)
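A minimal node-preparation sketch for kubeadm-style installs (the sed edit assumes a standard swap entry in /etc/fstab, and the connectivity check assumes nc is installed):
# Disable swap now and keep it disabled across reboots
sudo swapoff -a
sudo sed -i '/\sswap\s/ s/^/#/' /etc/fstab
# From a worker node, verify the API server port on the control plane is reachable
nc -zv <control-plane-ip> 6443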
Local Development Setups
Minikube
# Installation
curl -LO https://github.com/kubernetes/minikube/releases/latest/download/minikube-linux-amd64
sudo install minikube-linux-amd64 /usr/local/bin/minikube
# Start cluster
minikube start --driver=docker
# Access dashboard
minikube dashboard
# Example deployment
kubectl create deployment hello-minikube --image=kicbase/echo-server:1.0
kubectl expose deployment hello-minikube --type=NodePort --port=8080
Kind (Kubernetes in Docker)
# Installation
curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.30.0/kind-linux-amd64
chmod +x ./kind
sudo mv ./kind /usr/local/bin/kind
# Create single-node cluster
kind create cluster
# Create multi-node cluster
kind create cluster --config - <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
EOF
Production Installation Methods
kubeadm
# On all nodes:
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl gpg
# Add Kubernetes repository
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.34/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.34/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
# Install Kubernetes components
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
# Initialize control plane (on master node)
sudo kubeadm init --pod-network-cidr=192.168.0.0/16
# Set up kubectl for regular user
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Install network plugin (e.g., Calico)
kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml
Cloud Provider Options
Amazon EKS
# Install eksctl
curl --silent --location "https://github.com/weaveworks/eksctl/releases/latest/download/eksctl_$(uname -s)_amd64.tar.gz" | tar xz -C /tmp
sudo mv /tmp/eksctl /usr/local/bin
# Create cluster
eksctl create cluster \
--name my-cluster \
--region us-east-1 \
--node-type t3.medium \
--nodes 3
Google Kubernetes Engine (GKE)
# Create cluster
gcloud container clusters create my-cluster \
--zone us-central1-a \
--machine-type e2-medium \
--num-nodes 3
# Configure kubectl
gcloud container clusters get-credentials my-cluster --zone us-central1-a
Azure Kubernetes Service (AKS)
# Create resource group
az group create --name myResourceGroup --location eastus
# Create AKS cluster
az aks create \
--resource-group myResourceGroup \
--name myAKSCluster \
--node-count 3 \
--enable-addons monitoring \
--generate-ssh-keys
# Configure kubectl
az aks get-credentials --resource-group myResourceGroup --name myAKSCluster
Post-Installation Configuration
Common Setup Tasks
# Verify cluster status
kubectl cluster-info
kubectl get nodes
# Install Helm package manager
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
# Install Kubernetes Dashboard
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0/aio/deploy/recommended.yaml
Core Concepts & Architecture
Kubernetes Objects
Object Structure
All Kubernetes objects contain two main fields:
- spec: Describes the desired state
- status: Describes the current state, updated by Kubernetes
apiVersion: v1
kind: Pod
metadata:
name: my-pod
spec:
containers:
- name: my-container
image: nginx:latest
status:
phase: Running
Common Kubernetes Objects
- Pod: Smallest deployable unit
- Service: Network abstraction for pods
- Deployment: Manages ReplicaSets and provides declarative updates
- StatefulSet: Manages stateful applications
- ConfigMap: Stores non-sensitive configuration data
- Secret: Stores sensitive data
- Volume: Storage for containers
- Namespace: Virtual cluster within a physical cluster
- Ingress: Manages external access to services
Namespaces
Namespaces provide a scope for names within a cluster:
# List namespaces
kubectl get namespaces
# Create namespace
kubectl create namespace my-namespace
# Set current namespace
kubectl config set-context --current --namespace=my-namespace
Labels and Selectors
Labels are key/value pairs attached to objects:
metadata:
labels:
app: my-app
version: v1.2
environment: production
Selectors are used to filter objects:
selector:
matchLabels:
app: my-app
matchExpressions:
- key: environment
operator: In
values:
- production
- staging
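Labels also drive ad-hoc kubectl queries; for example:
# Equality-based selection
kubectl get pods -l app=my-app,environment=production
# Set-based selection
kubectl get pods -l 'environment in (production,staging)'
# Display labels alongside pods
kubectl get pods --show-labels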
Annotations
Annotations provide metadata about objects:
metadata:
annotations:
description: "Frontend web server"
contact: "team@example.com"
last-modified: "2024-01-15T10:00:00Z"
Workload Management
Pods
Pods are the smallest deployable units in Kubernetes:
apiVersion: v1
kind: Pod
metadata:
name: nginx-pod
spec:
containers:
- name: nginx
image: nginx:1.25
ports:
- containerPort: 80
Multi-Container Pods
apiVersion: v1
kind: Pod
metadata:
name: multi-container-pod
spec:
containers:
- name: nginx
image: nginx:1.25
volumeMounts:
- name: shared-data
mountPath: /usr/share/nginx/html
- name: content-creator
image: busybox
command: ["/bin/sh", "-c"]
args:
- while true; do
echo "$(date) - Hello from sidecar!" > /usr/share/nginx/html/index.html;
sleep 10;
done
volumeMounts:
- name: shared-data
mountPath: /usr/share/nginx/html
volumes:
- name: shared-data
emptyDir: {}
Deployments
Deployments manage ReplicaSets and provide declarative updates:
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
spec:
replicas: 3
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.25
ports:
- containerPort: 80
Rolling Updates
# Update deployment image
kubectl set image deployment/nginx-deployment nginx=nginx:1.26
# Rollback to previous version
kubectl rollout undo deployment/nginx-deployment
# Check rollout status
kubectl rollout status deployment/nginx-deployment
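Rollout behavior can also be tuned declaratively on the Deployment itself; a minimal sketch (the values shown are illustrative):
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # allow at most one extra pod above the desired replica count
      maxUnavailable: 0  # never drop below the desired replica count during the update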
StatefulSets
StatefulSets for stateful applications:
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: postgres-statefulset
spec:
serviceName: postgres
replicas: 3
selector:
matchLabels:
app: postgres
template:
metadata:
labels:
app: postgres
spec:
containers:
- name: postgres
image: postgres:15
env:
- name: POSTGRES_DB
value: mydb
- name: POSTGRES_USER
valueFrom:
secretKeyRef:
name: postgres-secret
key: username
- name: POSTGRES_PASSWORD
valueFrom:
secretKeyRef:
name: postgres-secret
key: password
volumeMounts:
- name: postgres-data
mountPath: /var/lib/postgresql/data
volumeClaimTemplates:
- metadata:
name: postgres-data
spec:
accessModes: ["ReadWriteOnce"]
resources:
requests:
storage: 10Gi
DaemonSets
DaemonSets ensure all (or some) nodes run a pod:
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: fluentd-logging
spec:
selector:
matchLabels:
name: fluentd-logging
template:
metadata:
labels:
name: fluentd-logging
spec:
tolerations:
- key: node-role.kubernetes.io/control-plane
operator: Exists
effect: NoSchedule
containers:
- name: fluentd
image: fluent/fluentd:v1.16
volumeMounts:
- name: varlog
mountPath: /var/log
- name: containers
mountPath: /var/lib/docker/containers
volumes:
- name: varlog
hostPath:
path: /var/log
- name: containers
hostPath:
path: /var/lib/docker/containers
Jobs and CronJobs
Jobs for batch processing:
apiVersion: batch/v1
kind: Job
metadata:
name: batch-job
spec:
template:
spec:
containers:
- name: batch
image: busybox
command: ["echo", "Hello from batch job!"]
restartPolicy: Never
CronJobs for scheduled tasks:
apiVersion: batch/v1
kind: CronJob
metadata:
name: hello-cronjob
spec:
schedule: "*/1 * * * *"
jobTemplate:
spec:
template:
spec:
containers:
- name: hello
image: busybox
command: ["echo", "Hello from cron job!"]
restartPolicy: OnFailure
Networking & Service Discovery
Cluster Networking Models
Kubernetes imposes the following fundamental requirements on any networking implementation:
- All pods can communicate with all other pods without NAT
- All nodes can communicate with all pods without NAT
- The IP that a pod sees itself as is the same IP that others see it as
Service Types
ClusterIP
Default service type, exposes the service on a cluster-internal IP:
apiVersion: v1
kind: Service
metadata:
name: my-service
spec:
selector:
app: my-app
ports:
- protocol: TCP
port: 80
targetPort: 8080
NodePort
Exposes the service on each Node's IP at a static port:
apiVersion: v1
kind: Service
metadata:
name: my-nodeport-service
spec:
type: NodePort
selector:
app: my-app
ports:
- port: 80
targetPort: 80
nodePort: 30007
LoadBalancer
Exposes the service externally using a cloud provider's load balancer:
apiVersion: v1
kind: Service
metadata:
name: my-loadbalancer-service
spec:
type: LoadBalancer
selector:
app: my-app
ports:
- port: 80
targetPort: 80
Service Discovery
Kubernetes provides two primary modes of service discovery:
- DNS: Services are assigned cluster DNS names (see the example below)
- Environment Variables: Services are available as environment variables
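With DNS-based discovery, a Service named my-service in namespace my-namespace resolves to my-service.my-namespace.svc.cluster.local from inside the cluster. A quick check, assuming a throwaway busybox pod is acceptable:
kubectl run -it --rm dns-test --image=busybox:1.36 --restart=Never -- nslookup my-service.my-namespace.svc.cluster.local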
Network Policies
Network policies control traffic flow between pods:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: test-network-policy
namespace: default
spec:
podSelector:
matchLabels:
role: db
policyTypes:
- Ingress
- Egress
ingress:
- from:
- namespaceSelector:
matchLabels:
project: myproject
ports:
- protocol: TCP
port: 6379
egress:
- to:
- ipBlock:
cidr: 10.0.0.0/24
ports:
- protocol: TCP
port: 5978
Ingress
Ingress manages external access to services:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: example-ingress
annotations:
nginx.ingress.kubernetes.io/rewrite-target: /
spec:
rules:
- http:
paths:
- path: /app1
pathType: Prefix
backend:
service:
name: app1-service
port:
number: 80
- path: /app2
pathType: Prefix
backend:
service:
name: app2-service
port:
number: 80
Storage & Persistence
Volume Types
emptyDir
EmptyDir is created when a Pod is assigned to a node and exists as long as that Pod is running on that node:
volumes:
- name: cache-volume
emptyDir:
sizeLimit: 500Mi
hostPath
HostPath mounts a file or directory from the host node's filesystem:
volumes:
- name: host-volume
hostPath:
path: /data
type: DirectoryOrCreate
Persistent Volumes and Claims
PersistentVolume (PV)
apiVersion: v1
kind: PersistentVolume
metadata:
name: my-pv
spec:
capacity:
storage: 10Gi
volumeMode: Filesystem
accessModes:
- ReadWriteOnce
persistentVolumeReclaimPolicy: Retain
storageClassName: slow
hostPath:
path: /mnt/data
PersistentVolumeClaim (PVC)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: my-pvc
spec:
accessModes:
- ReadWriteOnce
volumeMode: Filesystem
resources:
requests:
storage: 8Gi
storageClassName: slow
Storage Classes
StorageClasses enable dynamic provisioning of PersistentVolumes. Note that in-tree provisioners such as kubernetes.io/aws-ebs are deprecated in favor of CSI drivers (for example, ebs.csi.aws.com on AWS):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: fast-ssd
provisioner: kubernetes.io/aws-ebs
parameters:
type: gp2
reclaimPolicy: Delete
allowVolumeExpansion: true
mountOptions:
- debug
volumeBindingMode: Immediate
Using Volumes in Pods
apiVersion: v1
kind: Pod
metadata:
name: my-pod
spec:
containers:
- name: my-container
image: nginx
volumeMounts:
- name: my-volume
mountPath: /data
volumes:
- name: my-volume
persistentVolumeClaim:
claimName: my-pvc
Configuration Management
ConfigMaps
ConfigMaps store non-sensitive configuration data:
apiVersion: v1
kind: ConfigMap
metadata:
name: my-config
data:
database_url: "postgres://localhost:5432/mydb"
feature_flags: |
feature1=true
feature2=false
feature3=true
Using ConfigMaps
As environment variables:
envFrom:
- configMapRef:
name: my-config
As mounted files:
volumeMounts:
- name: config-volume
mountPath: /etc/config
volumes:
- name: config-volume
configMap:
name: my-config
Secrets
Secrets store sensitive data:
apiVersion: v1
kind: Secret
metadata:
name: my-secret
type: Opaque
data:
username: YWRtaW4= # base64 encoded
password: MWYyZDFlMmU2N2Rm # base64 encoded
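Rather than hand-encoding base64, Secrets are usually created (or rendered as manifests) with kubectl; the names and values below are illustrative:
# Create the Secret directly
kubectl create secret generic my-secret --from-literal=username=admin --from-literal=password='S3cr3t!'
# Or render a manifest without applying it (values are only base64-encoded, not encrypted)
kubectl create secret generic my-secret --from-literal=username=admin --dry-run=client -o yaml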
Kustomize
Kustomize enables configuration customization:
# kustomization.yaml
resources:
- deployment.yaml
- service.yaml
configMapGenerator:
- name: app-config
files:
- config.properties
images:
- name: my-app
newTag: v1.2.3
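Rendering and applying a kustomization, assuming the kustomization.yaml above lives in ./overlays/production:
# Render the final manifests without applying them
kubectl kustomize ./overlays/production
# Build and apply in one step
kubectl apply -k ./overlays/production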
Security Best Practices
Authentication and Authorization
RBAC (Role-Based Access Control)
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
namespace: default
name: pod-reader
rules:
- apiGroups: [""]
resources: ["pods"]
verbs: ["get", "watch", "list"]
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: read-pods
namespace: default
subjects:
- kind: User
name: jane
apiGroup: rbac.authorization.k8s.io
roleRef:
kind: Role
name: pod-reader
apiGroup: rbac.authorization.k8s.io
Pod Security
Security Context
securityContext:
runAsUser: 1000
runAsGroup: 3000
fsGroup: 2000
Pod Security Policies (legacy, removed in v1.25)
PodSecurityPolicy was removed in Kubernetes v1.25 and replaced by the built-in Pod Security Admission controller; the legacy example below only applies to older clusters.
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
name: restricted
spec:
privileged: false
allowPrivilegeEscalation: false
requiredDropCapabilities:
- ALL
volumes:
- 'configMap'
- 'emptyDir'
- 'projected'
- 'secret'
- 'downwardAPI'
- 'persistentVolumeClaim'
runAsUser:
rule: 'MustRunAsNonRoot'
seLinux:
rule: 'RunAsAny'
fsGroup:
rule: 'RunAsAny'
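Pod Security Standards (Pod Security Admission)
On current clusters, Pod Security Standards are enforced by labeling namespaces for the built-in Pod Security Admission controller; a minimal sketch:
apiVersion: v1
kind: Namespace
metadata:
  name: restricted-ns
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/warn: baseline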
Network Security
Network Policies
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: default-deny
spec:
podSelector: {}
policyTypes:
- Ingress
- Egress
Secrets Management
External Secrets Operator
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
name: database-credentials
spec:
refreshInterval: 1h
secretStoreRef:
name: vault-backend
kind: SecretStore
target:
name: db-credentials
data:
- secretKey: username
remoteRef:
key: secret/data/database
property: username
- secretKey: password
remoteRef:
key: secret/data/database
property: password
Monitoring & Observability
Metrics Collection
Prometheus
apiVersion: v1
kind: ConfigMap
metadata:
name: prometheus-config
data:
prometheus.yml: |
global:
scrape_interval: 15s
scrape_configs:
- job_name: 'kubernetes-pods'
kubernetes_sd_configs:
- role: pod
Metrics Server
# Install Metrics Server
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
# Check resource usage
kubectl top nodes
kubectl top pods
Logging
Fluentd Configuration
apiVersion: v1
kind: ConfigMap
metadata:
name: fluentd-config
data:
fluent.conf: |
<source>
@type tail
path /var/log/containers/*_app_*.log
pos_file /var/log/fluentd-containers.log.pos
tag kubernetes.*
format json
time_format %Y-%m-%dT%H:%M:%S.%NZ
</source>
<match kubernetes.**>
@type elasticsearch
host elasticsearch
port 9200
index_name fluentd
type_name _doc
</match>
Health Checks
Liveness and Readiness Probes
livenessProbe:
httpGet:
path: /health
port: 8080
initialDelaySeconds: 15
periodSeconds: 20
readinessProbe:
exec:
command:
- cat
- /tmp/healthy
initialDelaySeconds: 5
periodSeconds: 10
Advanced Features
Custom Resource Definitions (CRDs)
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
name: crontabs.stable.example.com
spec:
group: stable.example.com
versions:
- name: v1
served: true
storage: true
schema:
openAPIV3Schema:
type: object
properties:
spec:
type: object
properties:
cronSpec:
type: string
image:
type: string
scope: Namespaced
names:
plural: crontabs
singular: crontab
kind: CronTab
shortNames:
- ct
Operators
Operator SDK Example
// controllers/crontab_controller.go
func (r *CronTabReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
// Fetch the CronTab instance
instance := &stablev1.CronTab{}
err := r.Get(ctx, req.NamespacedName, instance)
if err != nil {
// Handle error
}
// Reconcile logic here
return ctrl.Result{}, nil
}
Service Mesh
Istio Gateway
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
name: bookinfo-gateway
spec:
selector:
istio: ingressgateway
servers:
- port:
number: 80
name: http
protocol: HTTP
hosts:
- "*"
Custom Resource Definitions (CRDs) and Custom Resources
Overview
CRDs extend Kubernetes API to define custom resources, allowing you to treat your applications as native Kubernetes objects.
Creating a CRD
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
name: databases.myapp.example.com
spec:
group: myapp.example.com
versions:
- name: v1alpha1
served: true
storage: true
schema:
openAPIV3Schema:
type: object
properties:
spec:
type: object
properties:
engine:
type: string
enum: [mysql, postgresql, mongodb]
version:
type: string
size:
type: string
pattern: '^\d+(Gi|Mi)$'
replicas:
type: integer
minimum: 1
maximum: 10
status:
type: object
properties:
phase:
type: string
message:
type: string
readyReplicas:
type: integer
additionalPrinterColumns:
- name: Engine
type: string
jsonPath: .spec.engine
- name: Size
type: string
jsonPath: .spec.size
- name: Status
type: string
jsonPath: .status.phase
scope: Namespaced
names:
plural: databases
singular: database
kind: Database
shortNames:
- db
Creating a Custom Resource
apiVersion: myapp.example.com/v1alpha1
kind: Database
metadata:
name: prod-mysql
spec:
engine: mysql
version: "8.0"
size: 20Gi
replicas: 3
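Once the CRD is registered, the custom resource behaves like any built-in object:
kubectl apply -f database.yaml
kubectl get databases        # or use the short name: kubectl get db
kubectl describe database prod-mysql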
Validation and Webhooks
Admission Webhook for Validation
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
name: database-validator
webhooks:
- name: database-validator.myapp.example.com
rules:
- apiGroups: ["myapp.example.com"]
apiVersions: ["v1alpha1"]
operations: ["CREATE", "UPDATE"]
resources: ["databases"]
clientConfig:
service:
name: database-webhook-service
namespace: myapp-system
path: "/validate"
admissionReviewVersions: ["v1"]
sideEffects: None
Conversion Webhook for Version Migration
Conversion webhooks are configured on the CRD itself via spec.conversion (ConversionReview is only the request/response payload the API server sends to the webhook service):
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.myapp.example.com
spec:
  # group, versions, and names as defined above
  conversion:
    strategy: Webhook
    webhook:
      conversionReviewVersions: ["v1", "v1alpha1"]
      clientConfig:
        service:
          name: database-converter-service
          namespace: myapp-system
Best Practices for CRDs
- Start with v1alpha1 for unstable APIs
- Use semantic versioning
- Provide clear validation schemas
- Include status subresource for reporting state
- Use labels and annotations for metadata
- Implement proper garbage collection
- Document API changes thoroughly
Operators and Operator Pattern
What is an Operator?
An operator extends Kubernetes to automate the management of complex applications using custom resources and controllers.
Operator Architecture
┌─────────────────────────────────────────────────┐
│ Kubernetes API │
└─────────────────────┬───────────────────────────┘
│
┌─────────────────────▼───────────────────────────┐
│ Custom Controller │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────┐ │
│ │ Watch CRDs │ │ Reconcile │ │ Act │ │
│ │ │──▶ Loop │──▶ │ │
│ └─────────────┘ └─────────────┘ └─────────┘ │
└─────────────────────────────────────────────────┘
Building an Operator with Operator SDK
Project Structure
my-operator/
├── api/
│ └── v1alpha1/
│ ├── database_types.go
│ ├── database_webhook.go
│ └── groupversion_info.go
├── controllers/
│ └── database_controller.go
├── config/
│ ├── crd/
│ ├── manager/
│ └── webhook/
├── main.go
└── go.mod
Controller Implementation
// controllers/database_controller.go
package controllers
import (
"context"
"k8s.io/apimachinery/pkg/runtime"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/client"
myappv1alpha1 "myapp.example.com/api/v1alpha1"
)
type DatabaseReconciler struct {
client.Client
Scheme *runtime.Scheme
}
func (r *DatabaseReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
// Fetch the Database instance
database := &myappv1alpha1.Database{}
if err := r.Get(ctx, req.NamespacedName, database); err != nil {
return ctrl.Result{}, client.IgnoreNotFound(err)
}
// Check if deployment exists
deployment := &appsv1.Deployment{}
if err := r.Get(ctx, req.NamespacedName, deployment); err != nil {
if errors.IsNotFound(err) {
// Create deployment
return r.createDeployment(ctx, database)
}
return ctrl.Result{}, err
}
// Update status
database.Status.ReadyReplicas = deployment.Status.ReadyReplicas
if err := r.Status().Update(ctx, database); err != nil {
return ctrl.Result{}, err
}
return ctrl.Result{RequeueAfter: time.Minute * 5}, nil
}
Operator Patterns
- Stateless Operator: Manages resources without persistent state
- Stateful Operator: Maintains state about managed resources
- Leader Election: Runs only one instance in HA setup
- Finalizers: Clean up resources before deletion
Example: Database Operator with Finalizer
func (r *DatabaseReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
// ... existing code ...
// Handle deletion
if !database.ObjectMeta.DeletionTimestamp.IsZero() {
if containsString(database.ObjectMeta.Finalizers, "database.finalizer") {
// Perform cleanup
if err := r.cleanupDatabase(ctx, database); err != nil {
return ctrl.Result{}, err
}
// Remove finalizer
database.ObjectMeta.Finalizers = removeString(database.ObjectMeta.Finalizers, "database.finalizer")
if err := r.Update(ctx, database); err != nil {
return ctrl.Result{}, err
}
}
return ctrl.Result{}, nil
}
// Add finalizer if not present
if !containsString(database.ObjectMeta.Finalizers, "database.finalizer") {
database.ObjectMeta.Finalizers = append(database.ObjectMeta.Finalizers, "database.finalizer")
if err := r.Update(ctx, database); err != nil {
return ctrl.Result{}, err
}
}
// ... rest of reconciliation logic ...
}
Popular Operators
- Prometheus Operator: Manages Prometheus, Alertmanager, and related components
- Elasticsearch Operator: Automates Elasticsearch cluster management
- PostgreSQL Operator: Manages PostgreSQL clusters
- Strimzi Operator: Runs Apache Kafka on Kubernetes
- Cert-Manager: Automates certificate management
Operator Maturity Levels
- Level 1: Basic installation
- Level 2: Seamless upgrades
- Level 3: Full lifecycle API
- Level 4: Auto-pilot (automatic healing, scaling)
- Level 5: Autonomous operations
Helm Package Management
Helm Architecture
Helm 3 is client-only: the helm CLI pulls charts from chart repositories or OCI registries, renders templates locally, and applies the result directly to the Kubernetes API (the Helm 2 Tiller server no longer exists).
┌─────────────────┐      ┌─────────────────┐
│   Helm Client   │─────▶│  Chart Repo /   │
│   (helm CLI)    │ pull │  OCI Registry   │
└────────┬────────┘      └─────────────────┘
         │ render & apply
         ▼
┌─────────────────┐
│ Kubernetes API  │
└─────────────────┘
Creating a Helm Chart
Chart Structure
my-chart/
├── Chart.yaml # Chart metadata
├── values.yaml # Default values
├── values.schema.json # Values schema validation
├── charts/ # Dependency charts
├── templates/ # Manifest templates
│ ├── deployment.yaml
│ ├── service.yaml
│ ├── ingress.yaml
│ ├── _helpers.tpl # Template helpers
│ └── NOTES.txt # Installation notes
└── tests/ # Test templates
└── test-connection.yaml
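Common commands for scaffolding and sanity-checking a chart like the one above:
helm create my-chart                   # generate a chart skeleton with this layout
helm lint my-chart/                    # static checks on templates and values
helm template my-release my-chart/     # render the manifests locally
helm install my-release my-chart/ --dry-run --debug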
Chart.yaml
apiVersion: v2
name: my-app
description: A Helm chart for my application
type: application
version: 0.1.0
appVersion: "1.16.0"
keywords:
- web
- nginx
home: https://github.com/myorg/my-app
sources:
- https://github.com/myorg/my-app
maintainers:
- name: John Doe
email: john@example.com
dependencies:
- name: redis
version: "^14.0.0"
repository: "https://charts.bitnami.com/bitnami"
Template with Values
# templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ include "my-app.fullname" . }}
labels:
{{- include "my-app.labels" . | nindent 4 }}
spec:
replicas: {{ .Values.replicaCount }}
selector:
matchLabels:
{{- include "my-app.selectorLabels" . | nindent 6 }}
template:
metadata:
labels:
{{- include "my-app.selectorLabels" . | nindent 8 }}
spec:
containers:
- name: {{ .Chart.Name }}
image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
imagePullPolicy: {{ .Values.image.pullPolicy }}
ports:
- name: http
containerPort: 80
protocol: TCP
env:
{{- range $key, $value := .Values.env }}
- name: {{ $key }}
value: {{ $value | quote }}
{{- end }}
resources:
{{- toYaml .Values.resources | nindent 12 }}
{{- with .Values.nodeSelector }}
nodeSelector:
{{- toYaml . | nindent 8 }}
{{- end }}
Values.yaml
# Default values for my-app.
replicaCount: 1
image:
repository: nginx
pullPolicy: IfNotPresent
tag: ""
env:
NODE_ENV: production
DEBUG: "false"
service:
type: ClusterIP
port: 80
ingress:
enabled: false
className: ""
annotations: {}
hosts:
- host: chart-example.local
paths:
- path: /
pathType: ImplementationSpecific
resources:
limits:
cpu: 100m
memory: 128Mi
requests:
cpu: 100m
memory: 128Mi
autoscaling:
enabled: false
minReplicas: 1
maxReplicas: 100
targetCPUUtilizationPercentage: 80
Advanced Helm Features
Template Functions
# Using built-in functions
env:
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: NAMESPACE
value: {{ .Release.Namespace | quote }}
- name: CONFIG_HASH
value: {{ .Values.config | toYaml | sha256sum | trunc 8 | quote }}
Conditionals and Loops
# Conditional inclusion
{{- if .Values.ingress.enabled -}}
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: {{ include "my-app.fullname" . }}
annotations:
{{- toYaml .Values.ingress.annotations | nindent 4 }}
spec:
# ... ingress spec
{{- end }}
# Loop through volumes
volumes:
{{- range .Values.volumes }}
- name: {{ .name }}
persistentVolumeClaim:
claimName: {{ .claimName }}
{{- end }}
Named Templates
# templates/_helpers.tpl
{{- define "my-app.labels" -}}
helm.sh/chart: {{ include "my-app.chart" . }}
{{ include "my-app.selectorLabels" . }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end }}
Helm Repository Management
Creating a Helm Repository
# Package chart
helm package my-chart/
# Create index
helm repo index .
# Serve with nginx
docker run -v $(pwd):/usr/share/nginx/html -p 8080:80 nginx
Adding and Using Repositories
# Add repository
helm repo add my-repo https://my-repo.example.com/charts
# Update repositories
helm repo update
# Install chart
helm install my-release my-repo/my-chart
# Search charts
helm search repo my-repo
Helm 3 Features
- No Tiller component
- Client-side only
- Improved security
- Better library support
- Helm tests
- OCI registry support (see the example below)
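A sketch of the OCI workflow mentioned above (requires Helm 3.8+; registry.example.com and the chart name are placeholders):
helm package my-chart/
helm push my-chart-0.1.0.tgz oci://registry.example.com/charts
helm install my-release oci://registry.example.com/charts/my-chart --version 0.1.0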
Service Mesh
Service Mesh Architecture
┌─────────────────────────────────────────────────────────────────┐
│ Service Mesh │
├─────────────────────┬─────────────────────┬─────────────────────┤
│ Control Plane │ Data Plane │ Application │
│ │ │ │
│ ┌─────────────┐ │ ┌─────────────┐ │ ┌─────────────┐ │
│ │ Istiod │ │ │ Envoy │ │ │ Micro- │ │
│ │ (Pilot) │ │ │ Proxy │ │ │ service │ │
│ │ │ │ │ │ │ │ │ │
│ └─────────────┘ │ └─────────────┘ │ └─────────────┘ │
│ ┌─────────────┐ │ ┌─────────────┐ │ ┌─────────────┐ │
│ │ Citadel │ │ │ Envoy │ │ │ Micro- │ │
│ │ (mTLS) │ │ │ Proxy │ │ │ service │ │
│ │ │ │ │ │ │ │ │ │
│ └─────────────┘ │ └─────────────┘ │ └─────────────┘ │
│ ┌─────────────┐ │ │ │
│ │ Galley │ │ │ │
│ │ (Config) │ │ │ │
│ │ │ │ │ │
│ └─────────────┘ │ │ │
└─────────────────────┴─────────────────────┴─────────────────────┘
Istio Installation and Configuration
Install Istio
# Download Istio
curl -L https://istio.io/downloadIstio | sh -
# Install demo profile
istioctl install --set profile=demo -y
# Enable sidecar injection
kubectl label namespace default istio-injection=enabled
Gateway and VirtualService
# Gateway
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
name: myapp-gateway
spec:
selector:
istio: ingressgateway
servers:
- port:
number: 80
name: http
protocol: HTTP
hosts:
- myapp.example.com
# VirtualService
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
name: myapp
spec:
hosts:
- myapp.example.com
gateways:
- myapp-gateway
http:
- match:
- headers:
user-agent:
regex: ".*Mobile.*"
route:
- destination:
host: myapp
subset: v2
- route:
- destination:
host: myapp
subset: v1
DestinationRule
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
name: myapp
spec:
host: myapp
trafficPolicy:
connectionPool:
tcp:
maxConnections: 100
http:
http1MaxPendingRequests: 100
maxRequestsPerConnection: 10
outlierDetection:
consecutiveGatewayErrors: 5
interval: 30s
baseEjectionTime: 30s
subsets:
- name: v1
labels:
version: v1
- name: v2
labels:
version: v2
Service Mesh Features
Traffic Management
- Request routing
- Load balancing
- Retries and timeouts
- Fault injection
- Mirroring traffic
Security
- mTLS between services
- RBAC for services
- JWT validation
- Audit logging
Observability
- Distributed tracing
- Metrics collection
- Access logging
- Custom metrics
Linkerd Service Mesh
Install Linkerd
# Install CLI
curl -sL https://run.linkerd.io/install | sh
# Install on cluster
linkerd install | kubectl apply -f -
# Check installation
linkerd check
# Inject mesh
kubectl get deploy -o yaml | linkerd inject - | kubectl apply -f -
Linkerd Service Profile
apiVersion: linkerd.io/v1alpha2
kind: ServiceProfile
metadata:
name: myapp.default.svc.cluster.local
spec:
routes:
- name: GET /api/users
condition:
method: GET
path: /api/users
- name: POST /api/users
condition:
method: POST
path: /api/users
retryBudget:
retryRatio: 0.2
minRetriesPerSecond: 10
ttl: 10s
Service Mesh Best Practices
- Start with monitoring before enabling policies
- Use incremental rollout
- Monitor performance impact
- Implement proper mTLS key rotation
- Use service profiles for optimization
Serverless and Knative
Knative Architecture
┌─────────────────────────────────────────────────────────────────┐
│ Knative │
├─────────────────────┬─────────────────────┬─────────────────────┤
│ Serving │ Eventing │ Build │
│ │ │ │
│ ┌─────────────┐ │ ┌─────────────┐ │ ┌─────────────┐ │
│ │ Autoscaler │ │ │ Broker │ │ │ Build │ │
│ │ │ │ │ │ │ │ │ │
│ └─────────────┘ │ └─────────────┘ │ └─────────────┘ │
│ ┌─────────────┐ │ ┌─────────────┐ │ ┌─────────────┐ │
│ │ Activator │ │ │ Trigger │ │ │ Tekton │ │
│ │ │ │ │ │ │ │ │ │
│ └─────────────┘ │ └─────────────┘ │ └─────────────┘ │
│ ┌─────────────┐ │ │ │
│ │ Queue │ │ │ │
│ │ │ │ │ │
│ └─────────────┘ │ │ │
└─────────────────────┴─────────────────────┴─────────────────────┘
Knative Serving
Install Knative Serving
# Install Serving
kubectl apply -f https://github.com/knative/serving/releases/download/knative-v1.10.0/serving-core.yaml
# Install networking layer (Contour)
kubectl apply -f https://github.com/knative/net-contour/releases/download/knative-v1.10.0/contour.yaml
# Configure default networking
kubectl patch configmap/config-network \
--namespace knative-serving \
--type merge \
--patch '{"data":{"ingress.class":"contour.ingress.networking.knative.dev"}}'
Knative Service
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
name: hello-world
spec:
template:
spec:
containers:
- image: gcr.io/knative-samples/helloworld-go
ports:
- containerPort: 8080
env:
- name: TARGET
value: "World"
Autoscaling Configuration
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
name: autoscaled-service
spec:
template:
metadata:
annotations:
autoscaling.knative.dev/min-scale: "1"
autoscaling.knative.dev/max-scale: "10"
autoscaling.knative.dev/target: "100"
autoscaling.knative.dev/target-utilization-percentage: "70"
spec:
containers:
- image: my-app:latest
Knative Eventing
Install Eventing
# Install Eventing
kubectl apply -f https://github.com/knative/eventing/releases/download/knative-v1.10.0/eventing-core.yaml
# Install Channel (MTChannelBasedBroker)
kubectl apply -f https://github.com/knative/eventing/releases/download/knative-v1.10.0/mt-channel-broker.yaml
Event Source (PingSource)
The older CronJobSource kind has been replaced by PingSource:
apiVersion: sources.knative.dev/v1
kind: PingSource
metadata:
  name: ping-source
spec:
  schedule: "*/1 * * * *"
  contentType: "application/json"
  data: '{"message": "Hello world!"}'
sink:
ref:
apiVersion: serving.knative.dev/v1
kind: Service
name: event-display
Trigger
apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
name: my-service-trigger
spec:
broker: default
filter:
attributes:
type: dev.knative.sources.ping
subscriber:
ref:
apiVersion: serving.knative.dev/v1
kind: Service
name: my-service
Serverless Patterns
- Event-driven architecture
- API Gateway integration
- Stream processing
- Scheduled tasks
- Chatbots and automation
Autoscaling
Horizontal Pod Autoscaler (HPA)
Basic HPA
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: my-app-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: my-app
minReplicas: 2
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 50
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
behavior:
scaleDown:
stabilizationWindowSeconds: 300
policies:
- type: Percent
value: 10
periodSeconds: 60
scaleUp:
stabilizationWindowSeconds: 60
policies:
- type: Percent
value: 50
periodSeconds: 60
- type: Pods
value: 2
periodSeconds: 60
selectPolicy: Max
HPA with Custom Metrics
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: custom-metrics-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: my-app
minReplicas: 1
maxReplicas: 20
metrics:
- type: Pods
pods:
metric:
name: requests_per_second
target:
type: AverageValue
averageValue: "1000"
- type: External
external:
metric:
name: queue_messages_ready
selector:
matchLabels:
queue: "tasks"
target:
type: AverageValue
averageValue: "30"
Vertical Pod Autoscaler (VPA)
Install VPA
# Install VPA
git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler/
./hack/vpa-up.sh
VPA Configuration
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: my-app-vpa
spec:
targetRef:
apiVersion: "apps/v1"
kind: "Deployment"
name: "my-app"
updatePolicy:
updateMode: "Auto"
resourcePolicy:
containerPolicies:
- containerName: "my-app"
minAllowed:
cpu: "100m"
memory: "100Mi"
maxAllowed:
cpu: "1"
memory: "1Gi"
controlledResources: ["cpu", "memory"]
Cluster Autoscaler
Cluster Autoscaler on GKE
# Enable cluster autoscaler
gcloud container clusters update my-cluster \
--enable-autoscaling \
--min-nodes 1 \
--max-nodes 10 \
--node-pool default-pool
Node Auto-Provisioning
# Enable node auto-provisioning
gcloud container clusters update my-cluster \
--enable-autoprovisioning \
--autoprovisioning-config-file=config.yaml
Configuration File
# config.yaml
resourceLimits:
- resourceType: 'cpu'
minimum: '4'
maximum: '100'
- resourceType: 'memory'
minimum: '4'
maximum: '1000'
autoprovisioningNodePoolDefaults:
diskSizeGb: 100
diskType: 'pd-ssd'
management:
autoRepair: true
autoUpgrade: true
KEDA (Kubernetes Event-Driven Autoscaler)
Install KEDA
# Install KEDA
kubectl apply -f https://github.com/kedacore/keda/releases/download/v2.8.0/keda.yaml
Scale based on RabbitMQ queue
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: rabbitmq-consumer
spec:
scaleTargetRef:
name: rabbitmq-consumer
pollingInterval: 30
cooldownPeriod: 300
minReplicaCount: 0
maxReplicaCount: 30
triggers:
- type: rabbitmq
metadata:
host: "amqp://user:password@rabbitmq:5672"
queueName: "myqueue"
queueLength: "20"
Scale based on Prometheus metrics
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
name: prometheus-scaler
spec:
scaleTargetRef:
name: my-app
minReplicaCount: 1
maxReplicaCount: 20
triggers:
- type: prometheus
metadata:
serverAddress: http://prometheus:9090
metricName: http_requests_total
threshold: '100'
query: sum(rate(http_requests_total{deployment="my-app"}[2m]))
Resource Management
Resource Quotas
Namespace Quota
apiVersion: v1
kind: ResourceQuota
metadata:
name: compute-resources
namespace: development
spec:
hard:
pods: "10"
requests.cpu: "4"
requests.memory: "8Gi"
limits.cpu: "10"
limits.memory: "16Gi"
persistentvolumeclaims: "4"
Object Count Quota
apiVersion: v1
kind: ResourceQuota
metadata:
name: object-counts
namespace: production
spec:
hard:
configmaps: "10"
persistentvolumeclaims: "4"
replicationcontrollers: "20"
secrets: "10"
services: "10"
services.loadbalancers: "2"
Limit Ranges
Default Limits
apiVersion: v1
kind: LimitRange
metadata:
name: default-limits
namespace: default
spec:
limits:
- default:
cpu: "500m"
memory: "512Mi"
defaultRequest:
cpu: "250m"
memory: "256Mi"
type: Container
Min/Max Constraints
apiVersion: v1
kind: LimitRange
metadata:
name: min-max-limits
namespace: production
spec:
limits:
- min:
cpu: "100m"
memory: "128Mi"
max:
cpu: "2"
memory: "4Gi"
type: Container
Quality of Service (QoS) Classes
Guaranteed QoS
apiVersion: v1
kind: Pod
metadata:
name: guaranteed-pod
spec:
containers:
- name: my-container
image: my-app:latest
resources:
limits:
cpu: "1"
memory: "1Gi"
requests:
cpu: "1"
memory: "1Gi"
Burstable QoS
apiVersion: v1
kind: Pod
metadata:
name: burstable-pod
spec:
containers:
- name: my-container
image: my-app:latest
resources:
requests:
cpu: "500m"
memory: "512Mi"
BestEffort QoS
apiVersion: v1
kind: Pod
metadata:
name: besteffort-pod
spec:
containers:
- name: my-container
image: my-app:latest
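The QoS class Kubernetes assigned is visible on the pod status:
kubectl get pod guaranteed-pod -o jsonpath='{.status.qosClass}'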
Priority Classes
Define Priority Classes
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
name: high-priority
value: 1000000
globalDefault: false
description: "High priority class for critical services"
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
name: low-priority
value: 1000
globalDefault: true
description: "Low priority class for batch jobs"
Use Priority Class
apiVersion: apps/v1
kind: Deployment
metadata:
name: critical-service
spec:
template:
spec:
priorityClassName: high-priority
containers:
- name: critical-app
image: critical-app:latest
Pod Disruption Budget
Ensure Minimum Availability
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: my-app-pdb
spec:
minAvailable: 2
selector:
matchLabels:
app: my-app
Allow Maximum Disruptions
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: my-app-pdb
spec:
maxUnavailable: 1
selector:
matchLabels:
app: my-app
Multi-Cluster Management
Cluster API (CAPI)
Install Cluster API
# Install clusterctl
curl -L https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.3.0/clusterctl-linux-amd64 -o clusterctl
chmod +x clusterctl
sudo mv clusterctl /usr/local/bin/
# Initialize management cluster
clusterctl init --infrastructure aws
Cluster Manifest
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
name: my-workload-cluster
namespace: default
spec:
clusterNetwork:
pods:
cidrBlocks:
- 192.168.0.0/16
controlPlaneRef:
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
name: my-workload-cluster-control-plane
infrastructureRef:
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AWSCluster
name: my-workload-cluster
Worker Machine Deployment
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
name: my-workload-cluster-md-0
spec:
clusterName: my-workload-cluster
replicas: 3
selector:
matchLabels: {}
template:
spec:
clusterName: my-workload-cluster
version: v1.25.0
infrastructureRef:
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AWSMachineTemplate
name: my-workload-cluster-md-0
KubeFed (Kubernetes Federation, now archived)
Install KubeFed
# Install KubeFed control plane
kubectl apply -f https://github.com/kubernetes-sigs/kubefed/releases/download/v0.9.0/kubefed-operator.yaml
# Create federation
kubefedctl join cluster1 --cluster-context cluster1 --host-cluster-context cluster1
kubefedctl join cluster2 --cluster-context cluster2 --host-cluster-context cluster1
Federated Deployment
apiVersion: types.kubefed.io/v1beta1
kind: FederatedDeployment
metadata:
name: my-app
namespace: default
spec:
template:
metadata:
labels:
app: my-app
spec:
replicas: 3
selector:
matchLabels:
app: my-app
template:
metadata:
labels:
app: my-app
spec:
containers:
- name: my-app
image: my-app:latest
placement:
clusters:
- name: cluster1
- name: cluster2
overrides:
- clusterName: cluster2
clusterOverrides:
- path: "/spec/replicas"
value: 5
Rancher Multi-Cluster Management
Install Rancher
# Install Rancher using Helm
helm repo add rancher-latest https://releases.rancher.com/server-charts/latest
helm install rancher rancher-latest/rancher \
--namespace cattle-system \
--create-namespace \
--set hostname=rancher.example.com
Multi-Cluster Application
# Cluster template
apiVersion: management.cattle.io/v3
kind: Cluster
metadata:
name: production-cluster
spec:
dockerEngine:
storageDriver: overlay2
kubernetesVersion: v1.25.0
rancherKubernetesEngineConfig:
rkeConfig:
network:
plugin: calico
services:
etcd:
snapshot: true
creation: "6h"
retention: "24h"
Cross-Cluster Service Discovery
Multi-Cluster Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: global-ingress
annotations:
kubernetes.io/ingress.class: "nginx"
nginx.ingress.kubernetes.io/rewrite-target: /
nginx.ingress.kubernetes.io/upstream-vhost: "$service_name.$namespace.svc.cluster.local"
spec:
rules:
- host: myapp.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: my-app
port:
number: 80
Real-World Patterns and Use Cases
Microservices Architecture Patterns
Backend for Frontend (BFF) Pattern
# Frontend Service
apiVersion: apps/v1
kind: Deployment
metadata:
name: frontend-service
spec:
replicas: 3
selector:
matchLabels:
app: frontend
template:
metadata:
labels:
app: frontend
spec:
containers:
- name: frontend
image: frontend:latest
env:
- name: API_GATEWAY_URL
value: "http://api-gateway:8080"
---
# API Gateway
apiVersion: apps/v1
kind: Deployment
metadata:
name: api-gateway
spec:
replicas: 2
template:
spec:
containers:
- name: gateway
image: gateway:latest
env:
- name: USER_SERVICE_URL
value: "http://user-service:8081"
- name: ORDER_SERVICE_URL
value: "http://order-service:8082"
Circuit Breaker Pattern with Istio
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
name: payment-service
spec:
host: payment-service
trafficPolicy:
connectionPool:
tcp:
maxConnections: 100
connectTimeout: 30ms
tcpKeepalive:
time: 7200s
interval: 75s
outlierDetection:
consecutiveGatewayErrors: 5
interval: 30s
baseEjectionTime: 30s
maxEjectionPercent: 50
Stateful Applications
PostgreSQL with StatefulSet
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: postgres
spec:
serviceName: postgres
replicas: 3
selector:
matchLabels:
app: postgres
template:
metadata:
labels:
app: postgres
spec:
containers:
- name: postgres
image: postgres:13
env:
- name: POSTGRES_USER
valueFrom:
secretKeyRef:
name: postgres-secret
key: username
- name: POSTGRES_PASSWORD
valueFrom:
secretKeyRef:
name: postgres-secret
key: password
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: PGDATA
value: /var/lib/postgresql/data/pgdata
ports:
- containerPort: 5432
name: postgres
volumeMounts:
- name: data
mountPath: /var/lib/postgresql/data
livenessProbe:
exec:
command:
- pg_isready
- -U
- $(POSTGRES_USER)
- -d
- postgres
initialDelaySeconds: 30
volumeClaimTemplates:
- metadata:
name: data
spec:
accessModes: [ "ReadWriteOnce" ]
storageClassName: fast-ssd
resources:
requests:
storage: 10Gi
Redis Cluster
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: redis-cluster
spec:
serviceName: redis-cluster
replicas: 6
podManagementPolicy: Parallel
template:
spec:
containers:
- name: redis
image: redis:6.2-alpine
command:
- redis-server
- /etc/redis/redis.conf
- --cluster-enabled
- --cluster-config-file
- /data/nodes.conf
- --cluster-node-timeout
- "5000"
- --appendonly
- "yes"
- --protected-mode
- "no"
ports:
- containerPort: 6379
name: client
- containerPort: 16379
name: gossip
volumeMounts:
- name: data
mountPath: /data
- name: config
mountPath: /etc/redis
volumes:
- name: config
configMap:
name: redis-cluster-config
volumeClaimTemplates:
- metadata:
name: data
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: 1Gi
Batch Processing and ETL Workflows
CronJob for Daily Reports
apiVersion: batch/v1
kind: CronJob
metadata:
name: daily-report
spec:
schedule: "0 2 * * *" # Run at 2 AM daily
jobTemplate:
spec:
template:
spec:
containers:
- name: report-generator
image: report-generator:latest
env:
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: db-credentials
key: url
- name: REPORT_DATE
value: "$(date -d 'yesterday' +%Y-%m-%d)"
resources:
requests:
cpu: "100m"
memory: "256Mi"
limits:
cpu: "500m"
memory: "1Gi"
restartPolicy: OnFailure
concurrencyPolicy: Forbid
Argo Workflows
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
name: etl-pipeline
spec:
entrypoint: etl-pipeline
volumes:
- name: workdir
persistentVolumeClaim:
claimName: etl-workspace
templates:
- name: etl-pipeline
dag:
tasks:
- name: extract
template: extract-data
- name: transform
template: transform-data
dependencies: [extract]
- name: load
template: load-data
dependencies: [transform]
- name: extract-data
script:
image: python:3.9
command: [python]
source: |
import requests
import pandas as pd
# Extract data from API
response = requests.get("https://api.example.com/data")
data = response.json()
df = pd.DataFrame(data)
df.to_csv("/work/raw_data.csv", index=False)
print("Data extracted successfully")
volumeMounts:
- name: workdir
mountPath: /work
- name: transform-data
script:
image: python:3.9
command: [python]
source: |
import pandas as pd
# Transform data
df = pd.read_csv("/work/raw_data.csv")
df['processed_date'] = pd.Timestamp.now()
df.to_parquet("/work/processed_data.parquet")
print("Data transformed successfully")
volumeMounts:
- name: workdir
mountPath: /work
- name: load-data
script:
image: python:3.9
command: [python]
source: |
import pandas as pd
import psycopg2
# Load to database
conn = psycopg2.connect(
host="postgres",
database="analytics",
user="analytics",
password="password"
)
df = pd.read_parquet("/work/processed_data.parquet")
# Load logic here
print("Data loaded successfully")
volumeMounts:
- name: workdir
mountPath: /work
Machine Learning Workloads
Kubeflow Pipeline
Kubeflow pipelines are normally authored with the KFP Python SDK and compiled to workflow YAML; treat the manifest below as an illustrative sketch of the pipeline structure rather than a verbatim API:
apiVersion: kubeflow.org/v1beta1
kind: KFPipeline
metadata:
name: ml-training-pipeline
spec:
pipelineSpec:
pipelines:
- name: training-pipeline
components:
- name: data-preprocessing
implementation:
container:
image: data-prep:latest
command: ["python", "/app/preprocess.py"]
inputs:
artifacts:
- name: raw-data
path: /data/raw
outputs:
artifacts:
- name: processed-data
path: /data/processed
- name: model-training
implementation:
container:
image: trainer:latest
command: ["python", "/app/train.py"]
inputs:
artifacts:
- name: training-data
path: /data/processed
outputs:
artifacts:
- name: model
path: /model
parameters:
- name: accuracy
valueFrom:
path: /accuracy.txt
dependencies:
model-training:
after: [data-preprocessing]
GPU-enabled Training Job
apiVersion: batch/v1
kind: Job
metadata:
name: model-training
spec:
template:
spec:
containers:
- name: trainer
image: nvidia/cuda:11.3.1-base
command: ["python", "train.py"]
resources:
limits:
nvidia.com/gpu: 2
memory: "16Gi"
requests:
nvidia.com/gpu: 2
memory: "16Gi"
volumeMounts:
- name: dataset
mountPath: /data
- name: models
mountPath: /models
volumes:
- name: dataset
persistentVolumeClaim:
claimName: dataset-pvc
- name: models
persistentVolumeClaim:
claimName: models-pvc
nodeSelector:
cloud.google.com/gke-accelerator: nvidia-tesla-t4
tolerations:
- key: nvidia.com/gpu
operator: Exists
effect: NoSchedule
Edge Computing and IoT
Edge Device Pattern
# Edge Controller Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
name: edge-controller
spec:
replicas: 3
template:
spec:
containers:
- name: controller
image: edge-controller:latest
env:
- name: DEVICE_REGISTRY_URL
value: "http://device-registry:8080"
- name: MESSAGE_BROKER_URL
value: "mqtt://mqtt-broker:1883"
resources:
limits:
cpu: "500m"
memory: "1Gi"
---
# Device Registry StatefulSet
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: device-registry
spec:
serviceName: device-registry
replicas: 1
template:
spec:
containers:
- name: registry
image: device-registry:latest
ports:
- containerPort: 8080
volumeMounts:
- name: data
mountPath: /data
volumeClaimTemplates:
- metadata:
name: data
spec:
accessModes: ["ReadWriteOnce"]
resources:
requests:
storage: 10Gi
Hybrid and Multi-Cloud Deployments
Hybrid Cloud with Cluster API
# Cloud cluster template
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
name: cloud-cluster
spec:
infrastructureRef:
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AWSCluster
name: cloud-cluster
---
# On-premises cluster template
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
name: onprem-cluster
spec:
infrastructureRef:
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VSphereCluster
name: onprem-cluster
Multi-Cloud Ingress Controller
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: global-app
annotations:
kubernetes.io/ingress.class: "global-ingress"
global-ingress/load-balancer: "multi-cloud"
global-ingress/health-check: "/health"
spec:
rules:
- host: app.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: app-service
port:
number: 80
GitOps Workflows
Argo CD
Install Argo CD
# Install Argo CD
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml
# Access Argo CD UI
kubectl port-forward svc/argocd-server -n argocd 8080:443
Argo CD Application
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: my-app
namespace: argocd
spec:
project: default
source:
repoURL: https://github.com/myorg/my-app.git
targetRevision: HEAD
path: kubernetes/manifests
destination:
server: https://kubernetes.default.svc
namespace: my-app
syncPolicy:
automated:
prune: true
selfHeal: true
allowEmpty: false
syncOptions:
- CreateNamespace=true
- Validate=false
- PrunePropagationPolicy=foreground
- PruneLast=true
ignoreDifferences:
- group: apps
kind: Deployment
jsonPointers:
- /spec/replicas
Argo CD App of Apps Pattern
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: root-app
namespace: argocd
spec:
project: default
source:
repoURL: https://github.com/myorg/infrastructure.git
targetRevision: HEAD
path: argocd/apps
destination:
server: https://kubernetes.default.svc
namespace: argocd
syncPolicy:
automated:
prune: true
selfHeal: true
Flux CD
Install Flux
# Install Flux CLI
curl -s https://toolkit.fluxcd.io/install.sh | sudo bash
# Bootstrap Flux
flux bootstrap github \
--owner=myorg \
--repository=infrastructure \
--path=clusters/my-cluster \
--personal
Flux Kustomization
apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
kind: Kustomization
metadata:
name: my-app
namespace: flux-system
spec:
interval: 5m
path: "./apps/my-app/overlays/production"
prune: true
sourceRef:
  kind: GitRepository
  name: flux-system
healthChecks:
- kind: Deployment
name: my-app
namespace: my-app
force: false
Flux HelmRelease
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
name: my-app
namespace: flux-system
spec:
interval: 5m
chart:
spec:
chart: my-app
version: "1.0.0"
sourceRef:
kind: HelmRepository
name: my-repo
namespace: flux-system
values:
replicaCount: 3
image:
tag: "1.0.0"
resources:
limits:
cpu: 500m
memory: 512Mi
GitOps Best Practices
- Structure your repository properly
infrastructure/
├── clusters/
│ ├── prod/
│ │ ├── flux-system/
│ │ └── apps/
│ └── staging/
│ ├── flux-system/
│ └── apps/
└── apps/
├── my-app/
│ ├── base/
│ └── overlays/
│ ├── production/
│ └── staging/
└── another-app/
- Use sealed secrets for sensitive data
# Install the Sealed Secrets controller
kubectl apply -f https://github.com/bitnami-labs/sealed-secrets/releases/download/v0.18.0/controller.yaml
# Seal a secret: encrypt a regular Secret manifest into a SealedSecret that is safe to commit
kubectl create secret generic db-creds --from-literal=password=S3cr3t --dry-run=client -o yaml | kubeseal --format yaml > sealed-secret.yaml
- Implement proper CI/CD pipelines
# GitHub Actions workflow
name: Deploy to Kubernetes
on:
push:
branches: [main]
pull_request:
branches: [main]
jobs:
deploy:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Validate manifests
run: |
kubectl apply -f kubernetes/ --dry-run=client
- name: Setup kubeconfig
run: |
mkdir -p $HOME/.kube
echo "${{ secrets.KUBECONFIG }}" | base64 -d > $HOME/.kube/config
- name: Deploy to cluster
if: github.ref == 'refs/heads/main'
run: |
kubectl apply -f kubernetes/
- Monitor deployments and rollbacks
# Argo CD Rollback
argocd app rollback my-app --revision HEAD~1
# Flux Rollback
flux suspend kustomization my-app
git revert HEAD
flux resume kustomization my-app
Troubleshooting
Common Issues and Solutions
Pod Issues
# Check pod status
kubectl get pods
# Describe pod for details
kubectl describe pod <pod-name>
# View pod logs
kubectl logs <pod-name>
# View logs from previous instance
kubectl logs <pod-name> --previous
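Cluster events and an interactive shell are often the fastest way to see why a pod is Pending or crash-looping:
# Recent events, oldest first
kubectl get events --sort-by=.metadata.creationTimestamp
# Events for a single pod
kubectl get events --field-selector involvedObject.name=<pod-name>
# Shell into a running container (assumes a shell exists in the image)
kubectl exec -it <pod-name> -- sh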
Networking Issues
# Check service endpoints
kubectl get endpoints <service-name>
# Test DNS resolution
kubectl run -it --rm dnsutils --image=tutum/dnsutils -- nslookup <service-name>
# Check network policies
kubectl get networkpolicy
Debugging Tools
kubectl debug
# Debug a running pod
kubectl debug <pod-name> --image=busybox --target=container-name
# Copy files from pod
kubectl cp <pod-name>:/path/to/file ./local-file
# Port forwarding
kubectl port-forward <pod-name> 8080:80
CI/CD & GitOps Integration
Argo CD Application
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: guestbook
namespace: argocd
spec:
project: default
source:
repoURL: https://github.com/argoproj/argocd-example-apps.git
targetRevision: HEAD
path: guestbook
destination:
server: https://kubernetes.default.svc
namespace: guestbook
syncPolicy:
automated:
prune: true
selfHeal: true
GitOps with Flux
apiVersion: kustomize.toolkit.fluxcd.io/v1beta2
kind: Kustomization
metadata:
name: apps
namespace: flux-system
spec:
interval: 5m0s
path: "./apps/production"
prune: true
sourceRef:
kind: GitRepository
name: flux-system
Real-World Workflows
Microservices Architecture
# Frontend Service
apiVersion: apps/v1
kind: Deployment
metadata:
name: frontend
spec:
replicas: 3
selector:
matchLabels:
app: frontend
template:
metadata:
labels:
app: frontend
spec:
containers:
- name: frontend
image: myapp/frontend:v1.0
env:
- name: BACKEND_URL
value: "http://backend-service:8080"
ports:
- containerPort: 3000
Stateful Application (Database)
# MySQL StatefulSet
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: mysql
spec:
serviceName: mysql
replicas: 1
template:
spec:
containers:
- name: mysql
image: mysql:8.0
env:
- name: MYSQL_ROOT_PASSWORD
valueFrom:
secretKeyRef:
name: mysql-secret
key: root-password
volumeMounts:
- name: mysql-persistent-storage
mountPath: /var/lib/mysql
volumeClaimTemplates:
- metadata:
name: mysql-persistent-storage
spec:
accessModes: ["ReadWriteOnce"]
resources:
requests:
storage: 10Gi
ML Workload with GPU
apiVersion: apps/v1
kind: Deployment
metadata:
name: ml-training
spec:
replicas: 1
template:
spec:
containers:
- name: training
image: tensorflow/tensorflow:latest-gpu
resources:
limits:
nvidia.com/gpu: 1
Best Practices
Development Best Practices
Use ConfigMaps and Secrets
# Externalize configuration
envFrom:
- configMapRef:
name: app-config
- secretRef:
name: app-secrets
Health Checks
# Always include health checks
livenessProbe:
httpGet:
path: /health
port: 8080
initialDelaySeconds: 30
periodSeconds: 10
readinessProbe:
httpGet:
path: /ready
port: 8080
initialDelaySeconds: 5
periodSeconds: 5
Production Best Practices
Resource Limits
resources:
limits:
cpu: "1"
memory: "1Gi"
requests:
cpu: "0.5"
memory: "512Mi"
Pod Disruption Budgets
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: my-app-pdb
spec:
minAvailable: 2
selector:
matchLabels:
app: my-app
Security Best Practices
Non-root User
securityContext:
runAsNonRoot: true
runAsUser: 1000
fsGroup: 1000
Read-only Filesystem
securityContext:
readOnlyRootFilesystem: true
Monitoring Best Practices
Structured Logging
env:
- name: LOG_FORMAT
value: json
Custom Metrics
# Expose custom metrics endpoint
ports:
- name: metrics
containerPort: 9090
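If your Prometheus setup uses the common annotation-based discovery convention (a convention, not something built into Kubernetes), the pod template can advertise the endpoint above:
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "9090"
    prometheus.io/path: "/metrics"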
Resources & Next Steps
Official Documentation
- Kubernetes Documentation: https://kubernetes.io/docs/
- Kubernetes API Reference: https://kubernetes.io/docs/reference/
Learning Resources
- Kubernetes Certified Administrator (CKA)
- Kubernetes Certified Application Developer (CKAD)
- Kubernetes Security Specialist (CKS)
Tools and Ecosystem
- Helm: Package manager for Kubernetes
- Kustomize: Native Kubernetes configuration management
- Istio: Service mesh
- Prometheus: Monitoring and alerting
- Grafana: Visualization and dashboards
- Jaeger: Distributed tracing
- Argo CD: GitOps continuous delivery
- Flux CD: GitOps toolkit
Community and Support
- Kubernetes Slack: https://slack.k8s.io/
- Kubernetes Forum: https://discuss.kubernetes.io/
- Stack Overflow: https://stackoverflow.com/questions/tagged/kubernetes
Next Steps
- Practice with Minikube or Kind for local development
- Deploy a simple application to understand the basics
- Explore advanced features like operators and service mesh
- Join the community and contribute to open source projects
- Consider certification to validate your knowledge
Kubernetes has revolutionized how we deploy and manage containerized applications at scale. By following these practices and leveraging its powerful features, teams can achieve reliable, scalable, and efficient deployments across hybrid and multi-cloud environments.