Job Description
Platform Engineer (Mid+)

Claid.AI is delivering GenAI to enterprise. The platform team owns the cloud footprint and the developer experience around shipping those pipelines safely. Small team, lots of surface area, real ownership from day one.

What you'll work on
- Run our infrastructure as code in Terraform across GCP, AWS, and GPU/neo-clouds.
- Keep CI/CD healthy on GitHub Actions across a multi-repo setup.
- Fix whatever is slowing product and ML engineers down: local dev, preview environments, secrets, deploys, observability.
- Take part in on-call, own the runbooks, and follow incidents through to a fix.

You'll work closely with our Head of Platform/Engineering, who has 16 years of experience and previously ran platform at Grammarly and at Amazon Ring. He's been carrying most of this solo for a while; you're the second pair of hands on a two-person function, with real ownership from week one rather than a junior seat shadowing someone else's calls.

Stack
Terraform (heavy daily use: modules, state, drift), Kubernetes, GitHub Actions, GCP + AWS, GPU/neo-clouds.
Also in the mix: Cloudflare, RabbitMQ, SQL and NoSQL datastores, Grafana + Mimir for monitoring.
Nice to have: Argo / GitOps.

Who we're looking for
Mid+: 2–4 years of real platform/DevOps/SRE work. You prefer to own things end-to-end and ship them yourself, not just contribute around the edges. You don't need to know every item in the stack, but you are hungry for knowledge. You've debugged a blocked deploy at 11pm and come out the other side (or similar).

What we care about:
- You've run workloads in GCP or AWS and can reason about IAM, networking, and cost without hand-holding.
- You know your way around SQL/NoSQL databases.
- You know Terraform: modules, state, and drift.
- You've operated a CI/CD pipeline in production: built the stages, fixed the flaky ones, and debugged the failed deploy.
- You write tools/scripts in Python, Go, or Rust. We use AI heavily, but you can write code without it, and we'll check.
- You automate