About the company
Our mission is to bring blockchain to a billion people. The Alchemy Platform is a world-class developer platform designed to make building on the blockchain easy. We've built leading infrastructure in the space, powering over $105 billion in transactions for tens of millions of users in 99% of countries worldwide. The Alchemy team draws on decades of deep expertise in massively scalable infrastructure, AI, and blockchain, gained in leadership roles at leading companies and universities such as Google, Microsoft, Facebook, Stanford, and MIT. Alchemy recently raised a Series C1 at a $10.2B valuation led by Lightspeed and Silver Lake. Previously, Alchemy raised from a16z, Coatue, Addition, Stanford University, Coinbase, the Chairman of Google, Charles Schwab, and the founders and executives of leading organizations. Alchemy powers the top blockchain companies globally and has been featured in TechCrunch, Forbes, Bloomberg, and elsewhere.
Job Summary
Responsibilities:
📍 Dual focus on developer productivity and product reliability
📍 Improve important infrastructure and systems from an operational standpoint (e.g., deployment, logging, monitoring, alerting)
📍 Work with engineering to architect a migration from ECS to Kubernetes on EKS
📍 Develop and own best practices for managing production infrastructure: provisioning, application scaling, configuration management, capacity planning, monitoring, etc.
📍 Develop and own best practices for developer processes: CI/CD, dev and staging environments, etc.
📍 Provide input into long-term platform requirements and operational guidelines with a focus on reliability
📍 Continuously raise our standard of engineering excellence by implementing best practices for coding, testing, and deployment
📍 Build and maintain documentation around processes and workflows
What We're Looking For:
📍 6+ years of experience as a DevOps or Site Reliability Engineer
📍 Experience designing and operating large-scale, multi-region, multi-cloud production systems
📍 Experience working with AWS and cloud infrastructure in general
📍 Experience with container schedulers and runtimes such as Docker and Kubernetes
📍 Experience with service mesh deployments such as Istio or Linkerd
📍 Experience building deployment pipelines leveraging common CI/CD tools such as CircleCI and Spinnaker
📍 Experience with real-time telemetry and tracing tools like Prometheus, Stackdriver, and DataDog
📍 Experience with Infrastructure-as-Code (e.g., Terraform, Ansible, CloudFormation, Chef, Puppet)
📍 Experience with networking and configuring / managing VPC networks
📍 An understanding of security best practices