Open to Graduate / Junior Cloud & DevOps Roles — UK & Europe | Requires Visa Sponsorship

Shubham Shah

London-based MSc Cloud Computing graduate specialising in infrastructure automation, cloud cost optimisation, and DevSecOps-driven, AI-led systems.

Shubham Shah | Graduate Cloud & DevOps Engineer

Engineering Projects

Cloud Cost Forecasting System

Built an ML-based forecasting pipeline to predict cloud spend trends and support cost optimisation decisions.

Azure · Python · XGBoost · LSTM · GitHub Actions (CI/CD)
View Project ↗

Self-Service Infrastructure Provisioning Platform

Designed a Terraform-based self-service platform allowing teams to provision Azure infrastructure through controlled GitHub Actions workflows.

Terraform · Azure · GitHub Actions · IaC · RBAC
View Project ↗

Secure Azure RBAC Automation Framework

Built a Terraform-based automation framework to manage Azure role-based access control (RBAC) securely, consistently, and at scale.

Azure RBAC · Terraform · IAM · Infrastructure as Code · Security Best Practices
View Project ↗

Automated Cost & Security Alert Dashboard on AWS

Created an automated AWS cost and security dashboard, reducing monitoring overhead and highlighting misconfigurations in real time.

AWS Lambda · CloudWatch · Boto3 (Python) · S3 Static Website Hosting · AWS SNS
View Project ↗

Automated Rollback System (Canary Deployment Demo)

Built a Kubernetes-based system that performs automated rollback of failing application versions using canary deployment, Prometheus monitoring, and a custom Python controller to maintain service reliability.

Kubernetes · Docker · Prometheus · Python · Canary Deployment · Observability
View Project ↗

Technical Skill Set

Experience

Oct 2025 — May 2026

Junior DevOps Engineer | University of East London (Placement Year)
Automated log monitoring and validation using Python scripts, reducing manual operational overhead by ~50% and improving SLA adherence.
Implemented structured root cause analysis and monitoring workflows, decreasing mean time to resolution (MTTR) by ~45% and improving system reliability.
Designed and deployed a full observability stack using Prometheus and Grafana across development, staging, and production environments, enabling proactive issue detection and preventing critical outages.
Built and deployed Kubernetes-based services with automated rollback mechanisms, strengthening release reliability and minimising deployment-related failures.
Gained hands-on experience in production incident response, observability design, and maintaining high-availability cloud-native systems.

Jan 2024 — May 2024

GenAI Developer Intern | Tata Consultancy Services (On-site)
Designed and automated deployment workflows using Python, reducing manual provisioning effort and improving infrastructure delivery speed by ~25%.
Investigated production incidents through structured root cause analysis, reducing recurring failures by ~30% and improving service reliability.
Supported cloud migration planning and execution, contributing to secure workload transitions and risk mitigation strategies.
Provided hands-on production support during active development sprints, helping maintain system stability and minimise downtime.
Worked closely with product and engineering stakeholders to align GenAI tooling improvements with business and platform requirements.

May 2023 — Jul 2023

Cloud Infrastructure & Security Intern | Celebal Technologies (Remote)
Designed and implemented an Azure Hub-and-Spoke network architecture, improving security controls and reducing internal service latency by ~20%.
Configured VNet peering, VPN gateways, and Network Security Groups (NSGs) to strengthen network isolation and access control.
Contributed to modernising legacy systems by supporting containerisation initiatives, reducing deployment time by ~40% and simplifying release workflows.
Implemented CI/CD pipelines using Azure DevOps, improving build reliability and deployment efficiency by ~30%.
Collaborated with infrastructure teams to optimise server configurations, reducing resource overhead and improving application responsiveness.

Year 2021 — 2023

Undergraduate Projects & Hackathons
Designed small-scale cloud automation tools, implemented containerisation, and explored Terraform for infrastructure management. Participated in club-organised hackathons focused on cloud solutions and DevOps practices.

Writing & Technical Notes

Short, practical write-ups on cloud engineering, DevOps workflows, and lessons learned while building production-grade systems.

Mar 2025

MLOps: Bringing AI Models from Notebooks to Production

Deploy AI models reliably at scale using automated pipelines, GPU infrastructure, and continuous monitoring.

MLOps · Model lifecycle management · AI/ML at scale
Read article →
Feb 2025

Platform Engineering: Building Golden Paths for Developers

Build self-service platforms that empower developers with golden paths, automation, and consistent guardrails.

IDPs · Self-service infrastructure · GitOps principles
Read article →
Nov 2025

FinOps: Taking Control of Your Cloud Spending

Manage cloud spending through cost visibility, optimisation strategies, and cross-team collaboration that ties spend to business value.

FinOps · Multi-cloud cost management · CloudHealth
Read article →

Interested in working together?

If you’re hiring for cloud or DevOps roles, I’d be happy to discuss how my work could fit your team.

Start a conversation →

Currently Learning & Working On