GitHub Actions Multi-Environment Deployment Pipeline
Built a comprehensive CI/CD pipeline using GitHub Actions for a microservices architecture deployed to AWS ECS. Implemented automated testing, security scanning, multi-environment promotion workflows, and one-click rollback capabilities. Reduced deployment time from 2 hours to 8 minutes while improving the deployment success rate from 85% to 99.2%.
DevOps, SRE, Platform Engineer
GitHub Actions, Docker, AWS ECR, AWS ECS, Trivy, Node.js, Jest, AWS CLI, Slack API, YAML, Terraform
The Problem
Our legacy deployment process required 2 hours of manual work per release: building Docker images locally, manually testing in staging, SSH-ing to servers, pulling images, and restarting containers. We deployed only twice per week due to the friction, creating large change batches that increased risk. Rollbacks required reverting Git commits and re-running the entire manual process, often taking 30-45 minutes during incidents. We had no automated testing between environments, no audit trail of who deployed what, and frequent "works on my machine" issues. Deployment failures occurred in 15% of releases, and there was zero visibility into deployment status across teams.
The Solution
**Multi-Stage Pipeline Architecture**: Built a GitHub Actions workflow with four distinct stages: build (compile + Docker image), test (unit, integration, security scanning), deploy-staging (automatic), and deploy-production (manual approval gate). Configured matrix builds for parallel testing across multiple Node.js versions.
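The four stages map naturally onto four jobs chained with `needs`. The following is a minimal sketch, not the project's actual workflow: the workflow name, the Node.js versions in the matrix, the `npm` scripts, and the placeholder deploy steps are all assumptions made for illustration.

```yaml
# Hypothetical workflow sketch; names, versions, and commands are assumptions.
name: build-test-deploy
on:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Build the image and tag it with the commit SHA.
      # Pushing it to a registry for later jobs is omitted here.
      - run: docker build -t app:${{ github.sha }} .

  test:
    needs: build
    runs-on: ubuntu-latest
    strategy:
      matrix:
        node-version: [18, 20, 22]   # assumed versions; one parallel job per entry
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node-version }}
      - run: npm ci
      - run: npm test   # assumed to run the Jest suite

  deploy-staging:
    needs: test            # runs automatically once every matrix job passes
    runs-on: ubuntu-latest
    environment: staging
    steps:
      - run: echo "deploy to staging"   # placeholder for the real deploy step

  deploy-production:
    needs: deploy-staging
    runs-on: ubuntu-latest
    environment: production   # pauses here if the environment requires approval
    steps:
      - run: echo "deploy to production"   # placeholder for the real deploy step
```

Because `deploy-production` targets a protected environment, the job waits for review before any of its steps run, which is what turns the last stage into a manual gate.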
**Environment Management**: Implemented GitOps pattern with environment-specific configurations stored in separate YAML files. Created GitHub Environments with protection rules: staging (auto-deploy on merge to main), production (requires approval from 2 SRE team members + passing health checks). Used GitHub Secrets for credential management with environment-specific scoping.
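A job opts into an environment with the `environment` key, and secrets defined on that environment take precedence over repository-level secrets of the same name. The sketch below assumes a hypothetical `config/staging.yml` file and a repository or environment variable named `AWS_REGION`; the protection rules themselves are configured in the repository settings rather than in the workflow file.

```yaml
# Hypothetical sketch; the config path, variable, and secret names are assumptions.
name: deploy-staging
on:
  push:
    branches: [main]

jobs:
  deploy-staging:
    runs-on: ubuntu-latest
    environment: staging                 # selects the environment's secrets and rules
    env:
      CONFIG_FILE: config/staging.yml    # assumed per-environment YAML file
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          # Resolved from the staging environment's secrets when they exist there.
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ vars.AWS_REGION }}
      - run: echo "Deploying with $CONFIG_FILE"   # placeholder for the real deploy step
```

A production job would look the same apart from `environment: production` and its own config file, so the same secret names resolve to different credentials per environment.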
**Docker & Container Registry**: Optimized Dockerfile with multi-stage builds reducing image size from 1.2GB to 380MB. Configured AWS ECR with lifecycle policies for automatic cleanup of old images. Implemented image vulnerability scanning with Trivy, failing builds on HIGH/CRITICAL CVEs.
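One way to fail a build on HIGH or CRITICAL findings is to run the Trivy action against the freshly built image before pushing it. This is a sketch under assumptions: the image name `my-service`, the secret and variable names, and the choice to scan before pushing are illustrative, not confirmed details of the original pipeline.

```yaml
# Hypothetical sketch; image name, secrets, and step ordering are assumptions.
name: build-and-scan
on:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    env:
      IMAGE_NAME: my-service   # assumed ECR repository name
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ vars.AWS_REGION }}
      - id: login-ecr
        uses: aws-actions/amazon-ecr-login@v2
      - name: Build image
        env:
          REGISTRY: ${{ steps.login-ecr.outputs.registry }}
        run: docker build -t "$REGISTRY/$IMAGE_NAME:$GITHUB_SHA" .
      - name: Scan image with Trivy
        # Pin to a release tag or commit SHA in practice.
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: ${{ steps.login-ecr.outputs.registry }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
          severity: HIGH,CRITICAL
          exit-code: '1'       # a non-zero exit fails the job when findings match
      - name: Push image
        env:
          REGISTRY: ${{ steps.login-ecr.outputs.registry }}
        run: docker push "$REGISTRY/$IMAGE_NAME:$GITHUB_SHA"
```

Scanning before the push means an image with blocking CVEs never reaches ECR, at the cost of running the scan on the build runner.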
**Deployment Strategy**: Integrated with AWS ECS using blue-green deployments, running health checks before routing traffic. Created automated rollback triggers on increased error rates. Implemented Slack notifications with deployment status, approver tracking, and quick rollback buttons.
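A blue-green ECS deployment can be driven from a workflow through CodeDeploy, with a Slack message sent whether the job succeeds or fails. The sketch below is illustrative only: the file paths, service, cluster, and CodeDeploy names, and the `SLACK_WEBHOOK_URL` secret are assumptions. The error-rate rollback would typically be configured on the CodeDeploy deployment group (for example through CloudWatch alarms) rather than in the workflow, and interactive rollback buttons need more than the plain webhook message shown here.

```yaml
# Hypothetical sketch; all resource names, paths, and secrets are assumptions.
name: deploy-production
on:
  workflow_dispatch:

jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: production
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ vars.AWS_REGION }}
      - name: Blue-green deploy via CodeDeploy
        uses: aws-actions/amazon-ecs-deploy-task-definition@v2
        with:
          task-definition: deploy/task-definition.json   # assumed path
          service: my-service
          cluster: my-cluster
          wait-for-service-stability: true
          codedeploy-appspec: deploy/appspec.yaml         # assumed path
          codedeploy-application: my-codedeploy-app
          codedeploy-deployment-group: my-deployment-group
      - name: Notify Slack
        if: always()   # report success and failure alike
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
          STATUS: ${{ job.status }}
        run: |
          curl -sS -X POST -H 'Content-Type: application/json' \
            --data "{\"text\": \"Production deploy of ${GITHUB_SHA} finished: ${STATUS}\"}" \
            "$SLACK_WEBHOOK_URL"
```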
Key Highlights
- Reduced deployment time from 2 hours to 8 minutes (93% improvement)
- Increased deployment frequency from 2x/week to 50+ times/week
- Achieved 99.2% deployment success rate (from 85%)
- Implemented automated rollbacks completing in under 90 seconds
- Created comprehensive deployment audit trail for SOC 2 compliance
- Reduced Docker image size by 68% (1.2GB to 380MB), speeding up deployments
- Blocked 23 deployments automatically due to security vulnerabilities
- Configured deployment windows with automatic scheduling for off-peak hours
- Built staging environment auto-refresh from production data daily
- Integrated smoke tests running post-deployment with 95% code coverage
- Set up parallel builds reducing total pipeline time by 60%
- Created self-service deployment dashboard showing real-time status