Building a Production-Grade DevOps Platform on AWS (Terraform + EKS + GitOps + Monitoring)

Modern DevOps platforms require far more than simply deploying an application. Production systems demand infrastructure automation, container orchestration, CI/CD pipelines, GitOps deployment, and robust observability.

In this article, I walk through how I built a production-style DevOps platform on AWS using modern cloud-native tooling.

The platform integrates:

• Infrastructure as Code with Terraform
• Kubernetes orchestration using Amazon EKS
• Containerization with Docker
• CI automation with GitHub Actions
• GitOps deployments using ArgoCD
• Observability using Prometheus and Grafana

The project demonstrates a complete end-to-end DevOps workflow, from infrastructure provisioning to monitoring live application metrics.


Project Repositories

The platform is divided into three repositories, each responsible for a different layer of the system.

Infrastructure (Terraform)

https://github.com/rasika-08061998/eks-devops-platform-infra

This repository provisions the AWS infrastructure and Kubernetes cluster.


Application Code

https://github.com/rasika-08061998/three-tier-ai-app

This repository contains the frontend and backend microservices.


GitOps Deployment

https://github.com/rasika-08061998/eks-gitops-deployments

This repository contains Kubernetes manifests used by ArgoCD for deployments.

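Splitting the platform this way mirrors how many engineering teams organize production systems: infrastructure, application code, and deployment configuration can each be changed, reviewed, and rolled back independently.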


Project Architecture Overview

The platform follows a GitOps-based architecture, where infrastructure, application code, and deployment configuration are managed independently.

High-level architecture:

Developer
    |
    v
GitHub Repositories
    |
    v
GitHub Actions CI Pipeline
    |
    v
AWS ECR (Container Registry)
    |
    v
ArgoCD GitOps Deployment
    |
    v
AWS EKS Cluster
    |
    +---- Frontend (React)
    +---- Backend (FastAPI)
    +---- PostgreSQL
    |
    v
Prometheus Monitoring
    |
    v
Grafana Dashboards

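This is a typical layout for a cloud-native application platform on Kubernetes. One detail the diagram simplifies: ArgoCD does not read from ECR. The CI pipeline pushes images to ECR and updates the GitOps repository, ArgoCD syncs the manifests from that repository, and the cluster pulls the images from ECR.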


Infrastructure Layer (Terraform)

Infrastructure is defined using Terraform, which enables Infrastructure as Code.

This ensures that infrastructure is:

• reproducible

• version controlled

• automated

Repository:

https://github.com/rasika-08061998/eks-devops-platform-infra


Infrastructure Components

The Terraform configuration provisions the following AWS resources:

• AWS VPC

• Public and Private Subnets

• Internet Gateway

• NAT Gateway

• Bastion Host

• AWS EKS Cluster

• Managed Node Groups

• IAM Roles

• IRSA configuration

• AWS ECR Repository

This provides the complete foundation for running Kubernetes workloads.
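To make the subnet layout concrete, here is a small Python sketch of how a VPC address range can be split into public and private subnets. The 10.0.0.0/16 range, the /24 subnet size, the two-zone count, and the plan_subnets helper are illustrative assumptions, not values taken from the repository. In Terraform, the same arithmetic is usually done with the built-in cidrsubnet function.

```python
import ipaddress


def plan_subnets(vpc_cidr: str, az_count: int, new_prefix: int = 24) -> dict:
    """Split a VPC CIDR into one public and one private subnet per availability zone."""
    vpc = ipaddress.ip_network(vpc_cidr)
    blocks = list(vpc.subnets(new_prefix=new_prefix))
    needed = 2 * az_count
    if len(blocks) < needed:
        raise ValueError(f"{vpc_cidr} cannot hold {needed} /{new_prefix} subnets")
    return {
        # Public subnets have a route to the internet gateway (e.g. for the NAT gateway).
        "public": [str(block) for block in blocks[:az_count]],
        # Private subnets would typically host the EKS worker nodes (assumption).
        "private": [str(block) for block in blocks[az_count:needed]],
    }


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(plan_subnets("10.0.0.0/16", az_count=2))
```

Allocating the blocks from one function keeps the public and private ranges from overlapping, which is the main thing to get right before any other resource is created.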


Terraform Folder Structure

terraform
│
├── modules
│   ├── vpc
│   ├── eks
│   ├── bastion
│   └── ecr
│
└── environments
    └── dev

Terraform modules package infrastructure into reusable components.

Examples include:

• VPC module

• EKS module

• Bastion host module

• ECR repository module

This layout keeps reusable modules separate from per-environment configuration, which is a common Terraform practice in production environments.


Private Kubernetes Cluster

The EKS cluster is configured as a private cluster.

cluster_endpoint_public_access  = false
cluster_endpoint_private_access = true

This means the Kubernetes API server is not publicly exposed to the internet.

Access to the cluster occurs through a bastion host inside the VPC, improving the overall security posture.


Application Layer

Repository:

https://github.com/rasika-08061998/three-tier-ai-app

The platform runs a three-tier application architecture.

Frontend → Backend API → Database

Frontend

The frontend is built using:

React

It communicates with the backend using REST API calls.


Backend

The backend API is implemented using:

Python FastAPI

FastAPI is chosen because it provides:

• high performance

• async support

• automatic API documentation

Example endpoint:

@app.post("/chat")
def chat(request: schemas.MessageRequest):
    ...  # handler body omitted

The backend also exposes a Prometheus metrics endpoint.

/metrics

This allows monitoring tools to collect application metrics.
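For readers who have not seen what a /metrics endpoint returns, the sketch below renders request counters in the Prometheus text exposition format using only the standard library. The metric name http_requests_total, the label set, and the record_request and render_metrics helpers are hypothetical. A real FastAPI service would more likely rely on a Prometheus client library than build this output by hand.

```python
from collections import Counter

# Hypothetical in-memory counter keyed by (method, path, status).
REQUESTS = Counter()


def record_request(method: str, path: str, status: int) -> None:
    """Count one handled HTTP request."""
    REQUESTS[(method, path, str(status))] += 1


def render_metrics(requests: Counter) -> str:
    """Render the counters in the Prometheus text exposition format."""
    lines = [
        "# HELP http_requests_total Total HTTP requests handled.",
        "# TYPE http_requests_total counter",
    ]
    for (method, path, status), count in sorted(requests.items()):
        labels = f'method="{method}",path="{path}",status="{status}"'
        lines.append(f"http_requests_total{{{labels}}} {count}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    record_request("POST", "/chat", 200)
    record_request("POST", "/chat", 500)
    print(render_metrics(REQUESTS))
```

Each line is one time series: the metric name, a set of labels, and the current counter value. Prometheus stores the value on every scrape and derives rates from the differences.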


Database

The platform uses:

PostgreSQL

PostgreSQL runs inside Kubernetes as a StatefulSet backed by persistent volumes.

Because the data lives on the volume rather than in the container filesystem, it survives pod restarts and rescheduling.


Containerization with Docker

Both frontend and backend services are containerized.

Example backend Dockerfile:

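# Official Python base image
FROM python:3.11

WORKDIR /app

# Install dependencies first so this layer stays cached when only source code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application source
COPY . .

# Start the FastAPI app with uvicorn on port 8000
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]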

Docker images are pushed to:

AWS Elastic Container Registry (ECR)


CI Pipeline (GitHub Actions)

Continuous Integration is implemented using GitHub Actions.

Pipeline workflow:

Push to main
    |
    v
GitHub Actions
    |
    +---- Build Docker images
    |
    +---- Push images to AWS ECR
    |
    +---- Update GitOps repository

This ensures application images are automatically built and published.
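The "Update GitOps repository" step usually amounts to rewriting an image reference in a manifest and committing the change. The following is a minimal sketch of that rewrite, not the workflow's actual implementation: the account ID, region, repository name, tag, and both helper functions are hypothetical, and the real pipeline may use a different tool for this step.

```python
import re


def ecr_image_uri(account_id: str, region: str, repository: str, tag: str) -> str:
    """Build an ECR image reference in registry/repository:tag form."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


def set_image(manifest: str, repository: str, new_image: str) -> str:
    """Replace the image reference for `repository` on a manifest's `image:` line."""
    pattern = re.compile(
        r"^([ \t]*(?:-[ \t]+)?image:[ \t]*)\S*/"
        + re.escape(repository)
        + r"(?::\S+)?[ \t]*$",
        flags=re.MULTILINE,
    )
    # A function replacement avoids backslash handling in the new image string.
    updated, count = pattern.subn(lambda m: m.group(1) + new_image, manifest)
    if count == 0:
        raise ValueError(f"no image line found for repository {repository!r}")
    return updated


if __name__ == "__main__":
    # Hypothetical manifest fragment and values.
    manifest = (
        "      containers:\n"
        "        - name: backend\n"
        "          image: 123456789012.dkr.ecr.us-east-1.amazonaws.com/backend:old\n"
    )
    image = ecr_image_uri("123456789012", "us-east-1", "backend", "abc1234")
    print(set_image(manifest, "backend", image))
```

Tagging images with something unique per build, such as the commit SHA, means every change to the manifest is a visible Git diff that ArgoCD can pick up.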


GitOps Deployment with ArgoCD

Repository:

https://github.com/rasika-08061998/eks-gitops-deployments

Deployment is managed using ArgoCD, which follows the GitOps model.


What is GitOps?

GitOps is a deployment model where:

Git repository = source of truth

ArgoCD continuously monitors the Git repository and synchronizes the cluster state to match it.

Benefits include:

• declarative deployments

• version-controlled deployment configuration

• automated synchronization

• easy rollback
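The synchronization idea can be illustrated with a toy reconciliation function that compares the desired state from Git with the live state in the cluster and lists the actions needed to close the gap. This is a conceptual sketch only, not how ArgoCD is implemented, and the resource names and image tags are made up.

```python
def plan_sync(desired: dict, live: dict) -> list:
    """Return the actions needed to make `live` match `desired`.

    Both arguments map a resource name to its spec (any comparable value).
    """
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in live:
            actions.append(("create", name))
        elif live[name] != spec:
            actions.append(("update", name))
    # Resources present in the cluster but absent from Git are candidates for pruning.
    for name in sorted(set(live) - set(desired)):
        actions.append(("delete", name))
    return actions


if __name__ == "__main__":
    desired = {"backend": {"image": "backend:v2"}, "frontend": {"image": "frontend:v1"}}
    live = {"backend": {"image": "backend:v1"}, "debug-pod": {"image": "busybox"}}
    print(plan_sync(desired, live))
```

Rollback falls out of the same loop: reverting a commit changes the desired state back to the previous version, and the next sync plans the updates needed to return to it.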


GitOps Repository Structure

eks-gitops-deployments
│
├── frontend
│   ├── deployment.yaml
│   └── service.yaml
│
├── backend
│   ├── deployment.yaml
│   ├── service.yaml
│   └── servicemonitor.yaml
│
└── postgres
    ├── statefulset.yaml
    └── service.yaml

Each component has dedicated Kubernetes manifests.


Networking and Ingress

Application traffic is routed through an AWS Application Load Balancer.

This is implemented using:

AWS Load Balancer Controller

Routing rules:

/      → frontend
/api   → backend

This creates a single entry point to the application.
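The two routing rules can be read as segment-wise prefix matching in which the most specific prefix wins. The sketch below models that behavior in plain Python. The ROUTES table and pick_service helper are hypothetical, and the way the controller orders rules on the actual load balancer may differ.

```python
# Hypothetical rule table mirroring the two routes above.
ROUTES = {"/": "frontend", "/api": "backend"}


def pick_service(path: str, routes: dict = ROUTES) -> str:
    """Return the service whose path prefix matches the most path segments."""
    segments = [s for s in path.split("/") if s]
    best_len, best_service = -1, None
    for prefix, service in routes.items():
        prefix_segments = [s for s in prefix.split("/") if s]
        matches = segments[: len(prefix_segments)] == prefix_segments
        if matches and len(prefix_segments) > best_len:
            best_len, best_service = len(prefix_segments), service
    if best_service is None:
        raise LookupError(f"no route matches {path!r}")
    return best_service


if __name__ == "__main__":
    for path in ("/", "/index.html", "/api/chat", "/apix"):
        print(path, "->", pick_service(path))
```

Matching whole segments rather than raw string prefixes means a path such as /apix falls through to the frontend instead of being sent to the backend.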


Monitoring with Prometheus

Observability is implemented using:

kube-prometheus-stack

Prometheus collects metrics from:

• Kubernetes nodes

• Kubernetes pods

• application endpoints

The backend exposes metrics through:

/metrics

Prometheus scrapes this endpoint using a ServiceMonitor resource.

Example resource:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
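The two lines above are only the header of the resource. As a rough idea of what a complete ServiceMonitor contains, the sketch below builds one as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The resource name, the app label, the port name, and the 30s interval are assumptions for illustration, not values taken from the repository.

```python
import json


def service_monitor(name: str, app_label: str, port_name: str, interval: str = "30s") -> dict:
    """Build a ServiceMonitor manifest as a plain dictionary."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        "metadata": {"name": name},
        "spec": {
            # Select the Service(s) to scrape by label.
            "selector": {"matchLabels": {"app": app_label}},
            # Scrape the named Service port at /metrics.
            "endpoints": [{"port": port_name, "path": "/metrics", "interval": interval}],
        },
    }


if __name__ == "__main__":
    # Hypothetical names for illustration only.
    print(json.dumps(service_monitor("backend", "backend", "http"), indent=2))
```

Depending on how Prometheus is configured, the ServiceMonitor may also need specific metadata labels before Prometheus discovers it.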

Grafana Dashboards

Grafana provides visualization for metrics collected by Prometheus.

Example dashboards display:

• API request rate

• requests per endpoint

• HTTP status codes

• Kubernetes CPU and memory usage

This provides real-time observability into application performance.


Alerting

Prometheus alert rules detect abnormal system behavior.

Example alert rule:

High API error rate (5xx responses)

If the error rate exceeds a threshold, an alert is triggered.
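The threshold check behind such a rule is simple arithmetic: the share of responses with a 5xx status compared against a limit. The sketch below shows that calculation in Python. The 5% threshold and the helper names are assumptions; in Prometheus itself the rule would be written as a query over request counters within a time window rather than as application code.

```python
def error_ratio(status_counts: dict) -> float:
    """Return the fraction of requests that ended in a 5xx status."""
    total = sum(status_counts.values())
    if total == 0:
        # No traffic means there is nothing to alert on.
        return 0.0
    errors = sum(n for status, n in status_counts.items() if 500 <= int(status) <= 599)
    return errors / total


def should_alert(status_counts: dict, threshold: float = 0.05) -> bool:
    """Fire when the 5xx ratio is strictly above the threshold."""
    return error_ratio(status_counts) > threshold


if __name__ == "__main__":
    # Hypothetical request counts per status code.
    print(should_alert({"200": 90, "500": 10}))
```

Using a ratio instead of a raw error count keeps the alert meaningful as traffic grows or shrinks.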


Final System Architecture

Developer
    |
    v
GitHub
    |
    v
GitHub Actions
    |
    v
AWS ECR
    |
    v
ArgoCD
    |
    v
AWS EKS Cluster
    |
    +---- React Frontend
    +---- FastAPI Backend
    +---- PostgreSQL
    |
    v
Prometheus
    |
    v
Grafana


Key DevOps Concepts Demonstrated

This project demonstrates several real-world DevOps practices:

• Infrastructure as Code with Terraform

• Containerization with Docker

• CI pipelines using GitHub Actions

• GitOps deployment with ArgoCD

• Kubernetes orchestration using AWS EKS

• Application monitoring using Prometheus

• Observability dashboards with Grafana


Lessons Learned

Building this platform reinforced several important DevOps principles:

1️⃣ Infrastructure must be reproducible.

2️⃣ Git should be the source of truth for deployments.

3️⃣ Observability is critical for production systems.

4️⃣ Kubernetes environments require automation and strong CI/CD practices.


Future Improvements

Possible improvements for this platform include:

• Horizontal Pod Autoscaling

• Slack alert integrations

• Multi-environment deployments (dev / staging / prod)

• Distributed tracing with OpenTelemetry


Conclusion

This project demonstrates how to build a production-style DevOps platform on AWS using modern cloud-native tooling.

By combining Terraform, Kubernetes, GitOps, CI/CD pipelines, and observability tools, we can create scalable infrastructure capable of running real-world applications.

If you are learning DevOps or cloud engineering, building a complete end-to-end platform like this is one of the best ways to understand how production systems operate.


About the Author

Rasika Deshmukh is a DevOps and Cloud enthusiast focused on building cloud-native platforms using AWS, Kubernetes, Terraform, Docker, GitHub Actions, and GitOps with ArgoCD. She enjoys working on infrastructure automation, CI/CD pipelines, and observability systems that reflect real-world production environments.

She is currently exploring DevOps, Cloud Engineering, and Platform Engineering roles.

LinkedIn: https://www.linkedin.com/in/rasika-deshmukh
GitHub: https://github.com/rasika-08061998