About the company
Our mission is to bring blockchain to a billion people. The Alchemy Platform is a world-class developer platform designed to make building on the blockchain easy. We've built leading infrastructure in the space, powering over $105 billion in transactions for tens of millions of users in 99% of countries worldwide. The Alchemy team draws on decades of deep expertise in massively scalable infrastructure, AI, and blockchain, gained in leadership roles at leading companies and universities such as Google, Microsoft, Facebook, Stanford, and MIT. Alchemy recently raised a Series C1 at a $10.2B valuation led by Lightspeed and Silver Lake. Previously, Alchemy raised from a16z, Coatue, Addition, Stanford University, Coinbase, the Chairman of Google, Charles Schwab, and the founders and executives of leading organizations. Alchemy powers the top blockchain companies globally and has been featured in TechCrunch, Forbes, Bloomberg, and elsewhere.
Job Summary
Responsibilities:
📍Dual focus on developer productivity and product reliability
📍Improve important infrastructure and systems from an operational standpoint (e.g. deployment, logging, monitoring, alerting)
📍Work with engineering to architect a migration from ECS to Kubernetes on EKS
📍Develop and own best practices for managing production infrastructure: provisioning, application scaling, configuration management, capacity planning, monitoring, etc.
📍Develop and own best practices for developer processes: CI/CD, dev and staging environments, etc.
📍Provide input into long-term platform requirements and operational guidelines with a focus on reliability
📍Continuously raise our standard of engineering excellence by implementing best practices for coding, testing, and deployment
📍Build and maintain documentation around processes and workflows
What We're Looking For:
📍6+ years of experience as a DevOps or Site Reliability Engineer
📍Experience designing and operating large-scale, multi-region, multi-cloud production systems
📍Experience working with AWS and cloud infrastructure in general
📍Experience with container schedulers and runtimes such as Docker and Kubernetes
📍Experience with service mesh deployments such as Istio or Linkerd
📍Experience building deployment pipelines leveraging common CI/CD tools such as CircleCI and Spinnaker
📍Experience with real-time telemetry and tracing tools like Prometheus, Stackdriver, and DataDog
📍Experience with Infrastructure-as-Code (e.g. Terraform, Ansible, CloudFormation, Chef, Puppet)
📍Experience with networking and with configuring and managing VPC networks
📍An understanding of security best practices