🔐 Kubernetes on EKS: Security, Resource Management & Probes — What I Recently Learned (Deep Dive)


Running containers is easy.
Running them securely, efficiently, and reliably on production EKS?
That’s engineering.

Over the past weeks, I’ve been diving deep into:

  • 🔐 Kubernetes security (especially in EKS)

  • ⚙️ Resource management (CPU/Memory control)

  • 🩺 Liveness & Readiness probes (health mechanisms)

This article breaks down the theory, real-world context, and YAML examples you can apply immediately.


☁️ First: What is EKS?

Amazon EKS (Elastic Kubernetes Service) is AWS’s managed Kubernetes service.

👉 AWS manages the control plane (API server, etcd, scheduler)
👉 You manage the worker nodes, workloads, security, and networking

That division of responsibility is critical.

Official Docs:
https://docs.aws.amazon.com/eks/latest/userguide/what-is-eks.html


🔐 Part 1 — Kubernetes Security in EKS

Security in Kubernetes is layered. Think of it like airport security:

| Layer | What it protects |
| --- | --- |
| IAM | Who can access AWS |
| RBAC | Who can access Kubernetes |
| Network Policies | Pod-to-pod traffic |
| Security Groups | Node-level traffic |
| Pod Security | Container privileges |

Let’s go deeper.


1️⃣ IAM + RBAC (Identity & Access Control)

In EKS:

  • IAM controls access to AWS and authenticates callers to the cluster's API server

  • RBAC controls access to Kubernetes resources

Example: Allow a user to only view pods.

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: production
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "watch", "list"]

Then bind it:

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-pods
  namespace: production
subjects:
- kind: User
  name: jane
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io

💡 Key Learning:
Never use cluster-admin in production.
Least privilege is non-negotiable.
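
One gap worth spelling out: the RoleBinding above names a Kubernetes user called jane, but on EKS that user has to come from an IAM identity. A minimal sketch of the classic mapping through the aws-auth ConfigMap is shown below. The account ID 111122223333 and the IAM user name are placeholders I made up, and newer clusters can use EKS access entries instead of this ConfigMap.

```yaml
# Fragment of the aws-auth ConfigMap in kube-system.
# Merge this into the existing ConfigMap rather than applying it as-is:
# replacing the whole object would drop the node role mappings (mapRoles).
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapUsers: |
    # Placeholder ARN: substitute your own account ID and IAM user
    - userarn: arn:aws:iam::111122223333:user/jane
      username: jane        # must match subjects[].name in the RoleBinding
      groups: []            # no extra groups; access comes only from the RoleBinding
```

With this mapping in place, jane authenticates with IAM credentials and RBAC then limits her to reading pods in the production namespace.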


2️⃣ Pod Security (Containers Should Not Be Root)

Many containers run as root by default — dangerous.

Secure configuration:

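# Container-level securityContext (under spec.containers[])
securityContext:
  runAsNonRoot: true                # refuse to start if the image would run as UID 0
  runAsUser: 1000                   # run as an unprivileged UID
  allowPrivilegeEscalation: false   # block setuid-style privilege gains
  readOnlyRootFilesystem: true      # mount the container's root filesystem read-only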

This reduces the risk of:

  • Privilege escalation

  • Container breakout attempts

  • Filesystem tampering

AWS EKS Security Best Practices:
https://aws.github.io/aws-eks-best-practices/security/docs/
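
To show where these fields live in a real manifest, here is a minimal sketch of a complete Pod. The pod name is hypothetical, the image is the placeholder used later in this article, and the writable /tmp volume is an assumption on my part: many apps need somewhere to write scratch files once the root filesystem is read-only.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hardened-example          # hypothetical name
  namespace: production
spec:
  securityContext:                # pod-level: applies to every container
    runAsNonRoot: true
    runAsUser: 1000
  containers:
  - name: app
    image: myapp:1.0              # placeholder image
    securityContext:              # these two fields exist only at container level
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
    volumeMounts:
    - name: tmp
      mountPath: /tmp             # writable scratch space (assumed requirement)
  volumes:
  - name: tmp
    emptyDir: {}
```

Note that runAsNonRoot and runAsUser can sit at either level, while allowPrivilegeEscalation and readOnlyRootFilesystem must be set per container.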


3️⃣ Network Policies (Zero Trust Inside the Cluster)

By default:
👉 Every pod can talk to every other pod.

That’s risky.

Example: Allow traffic only from frontend to backend.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: backend-policy
  namespace: production
spec:
  podSelector:
    matchLabels:
      app: backend
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend

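Now backend accepts ingress only from frontend pods in the same namespace, provided the cluster actually enforces NetworkPolicy (on EKS this has to be enabled, for example in the Amazon VPC CNI or with Calico).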

That’s micro-segmentation.
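
Network policies are additive: a pod that no policy selects still accepts traffic from anywhere. A common companion to the policy above is a namespace-wide default deny, sketched here for the same production namespace. The policy name is my own choice.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress      # hypothetical name
  namespace: production
spec:
  podSelector: {}                 # empty selector = every pod in the namespace
  policyTypes:
  - Ingress                       # no ingress rules listed, so all ingress is denied
```

With this applied, each workload needs an explicit allow rule like backend-policy before anything can reach it.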


⚙️ Part 2 — Resource Management (Why Pods Crash Randomly)

If you don’t define resources:

  • A container can keep consuming memory until the node itself runs out

  • Nodes can get unstable

  • The kernel can kill containers (OOMKilled)

That’s when chaos begins.


Requests vs Limits (Critical Concept)

| Field | Meaning |
| --- | --- |
| requests | Guaranteed minimum |
| limits | Maximum allowed |

Example:

resources:
  requests:
    memory: "256Mi"
    cpu: "200m"
  limits:
    memory: "512Mi"
    cpu: "500m"

Explanation:

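  • The Kubernetes scheduler uses requests to place the pod on a node with enough unreserved capacity

  • If memory usage exceeds the limit → the container is killed (OOMKilled)

  • If CPU usage reaches the limit → the container is throttled, not killed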

💡 200m CPU = 0.2 CPU core
💡 256Mi = 256 mebibytes (256 × 1024 × 1024 bytes, about 268 MB)


What Happens Without Limits?

Scenario:

  • One pod leaks memory

  • Node memory fills

  • The Linux OOM killer (or kubelet eviction) starts terminating pods, and not necessarily the one that leaked

  • Production outage

This is why resource governance matters.
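
One way to make sure nothing slips through without requests and limits is a namespace-level LimitRange, which fills in defaults for containers that declare none. This is a minimal sketch; the object name is hypothetical and the values simply mirror the example earlier in this section.

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-container-resources   # hypothetical name
  namespace: production
spec:
  limits:
  - type: Container
    defaultRequest:                    # applied when a container sets no requests
      cpu: 200m
      memory: 256Mi
    default:                           # applied when a container sets no limits
      cpu: 500m
      memory: 512Mi
```

The defaults are applied at admission time, so they affect only pods created after the LimitRange exists.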


🩺 Part 3 — Liveness vs Readiness Probes (Critical for Reliability)

This is where many teams make mistakes.


🩺 Liveness Probe

Question:

"Is the application alive?"

If it fails:
👉 Kubernetes restarts the container

Example:

livenessProbe:
  httpGet:
    path: /health
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10

🚦 Readiness Probe

Question:

"Is the application ready to receive traffic?"

If it fails:
👉 Pod is removed from Service endpoints
👉 No traffic is sent

Example:

readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5

Real Production Example

Imagine:

  • App starts

  • Needs 20 seconds to connect to DB

  • Without readiness probe:

    • Traffic hits immediately

    • Users get 500 errors

With readiness probe:

  • No traffic until /ready reports that the DB connection is up

  • Zero user-facing errors


Startup Probe (Advanced)

Used for slow-starting apps.

startupProbe:
  httpGet:
    path: /health
    port: 8080
  failureThreshold: 30
  periodSeconds: 10

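This stops the liveness probe from killing slow apps: liveness and readiness checks stay on hold until the startup probe succeeds, and with these values the app gets up to 30 × 10 = 300 seconds to start.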


🔥 Combined Production-Ready Deployment Example

apiVersion: apps/v1
kind: Deployment
metadata:
  name: secure-app
spec:
  replicas: 3
  selector:                       # required in apps/v1; must match the template labels
    matchLabels:
      app: secure-app
  template:
    metadata:
      labels:
        app: secure-app
    spec:
      containers:
      - name: app
        image: myapp:1.0
        resources:
          requests:
            memory: "256Mi"
            cpu: "200m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        securityContext:
          runAsNonRoot: true
          runAsUser: 1000         # needed if the image would otherwise run as root
          allowPrivilegeEscalation: false
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
        readinessProbe:
          httpGet:
            path: /ready
            port: 8080
This setup ensures:

  • 🔐 Secure container

  • ⚙️ Controlled resources

  • 🩺 Self-healing

  • 🚦 Smart traffic routing
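
The traffic-routing part only kicks in once a Service sits in front of the pods, because readiness decides which pods appear in the Service's endpoints. Here is a minimal sketch, assuming the pods carry the label app: secure-app and listen on 8080 (the port the probes use). The label and the Service port 80 are my assumptions.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: secure-app
spec:
  selector:
    app: secure-app       # assumed pod label; must match the Deployment's pod template
  ports:
  - port: 80              # port clients inside the cluster connect to (assumption)
    targetPort: 8080      # container port, same as the probe port
```

Only pods whose readiness probe is passing receive traffic through this Service.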


🎯 Final Takeaways

What I truly understood:

  1. Kubernetes security is layered — not a single setting

  2. Resource management prevents unpredictable crashes

  3. Probes are not optional — they are production essentials

  4. EKS handles the control plane, but YOU secure workloads

Rasika DevOps