Job Description
We help the world run better
At SAP, we keep it simple: you bring your best to us, and we'll bring out the best in you. We're builders touching over 20 industries and 80% of global commerce, and we need your unique talents to help shape what's next. The work is challenging - but it matters. You'll find a place where you can be yourself, prioritize your wellbeing, and truly belong.

What's in it for you
Constant learning, skill growth, great benefits, and a team that wants you to grow and succeed.

What you'll build
Design, implement, and manage the observability platform
- Own the full lifecycle of telemetry pipelines (metrics and logs) across containerized and VM workloads, as well as cloud environments
- Manage architecture, capacity planning, and day-2 operations for the observability stack on Kubernetes and Docker
Develop and manage large-scale alerting across multi-customer environments
- Author, tune, and maintain alert rules based on metrics and log signals
- Ensure alerts are reliable, actionable, and low-noise across distributed systems and customer deployments
- Maintain enterprise-scale alert coverage where missed alerts have direct customer impact
Deliver the observability platform using DevOps practices
- Implement infrastructure as code, CI/CD pipelines, automated testing, and continuous delivery
- Ensure all changes are version-controlled and deployed through automated workflows, avoiding manual drift

Core Skills:
- Monitoring: Prometheus, Grafana, Loki, ELK, OpenTelemetry
- Automation & DevOps Tooling: Ansible, ArgoCD, GitHub Actions, Helm
- Containers & Linux: Kubernetes, Docker, Linux
- Programming/Scripting: Python/Go, Bash
- ITSM & Data Integration: ServiceNow, Jira, MS Teams

What you bring:
- 7 to 9 years of experience, with a deep technical understanding of server infrastructure
- Strong know-how of hyperscalers (AWS, Azure, GCP), container technologies, and orchestrating workloads on Kubernetes clusters on SAP Gardener
- Establish and maintain monitoring for container-based ap