About the company
QuickNode is a cloud-based infrastructure company that powers the blockchain ecosystem. Our mission is to be the indispensable utility that empowers companies and innovators globally to build next-generation, Web3-enabled businesses and applications using blockchain technology. QuickNode is backed by some of the world's best investors, including Tiger Global, Y Combinator, SoftBank, and the Seven Seven Six Fund. The QuickNode team has over 120 people maintaining high-performance global data infrastructure for customers serving billions of requests daily. We are a global, remote-first company headquartered in Miami, Florida.
Job Summary
What You'll Do
- Blockchain Network Management: Lead the deployment, optimization, and operational management of new blockchain networks. Conduct thorough testing, benchmarking, and continuous improvement of chain reliability and performance.
- Complex Web3 Issue Resolution: Address high-impact Web3 incidents through rigorous troubleshooting, detailed log analysis, JSON-RPC response debugging, and direct coordination with blockchain foundations and ecosystem partners.
- Proactive System Monitoring: Develop and maintain comprehensive monitoring and alerting solutions using advanced dashboards (e.g., Grafana, DataDog), identifying trends, anomalies, and performance bottlenecks before they become critical.
- Incident & SLO Management: Define, implement, and enforce service-level objectives (SLOs) and agreements (SLAs), ensuring measurable standards of system reliability and performance are consistently met.
- Automation & Optimization: Implement and maintain automation solutions (Ansible, Terraform, Kubernetes) to streamline deployments, reduce manual tasks, and optimize cloud infrastructure cost and efficiency.
- Technical Collaboration: Actively collaborate with Tier-1 support, infrastructure, and development teams, ensuring alignment on system improvements, rapid issue resolution, and operational knowledge sharing.
- On-Call Support: Participate in a rotating 24/7 on-call schedule to swiftly address critical system incidents, maintain continuous service delivery, and uphold customer trust.
What You'll Bring
- Minimum of 5 years in Technical Operations, Site Reliability Engineering (SRE), or related roles. Proven Linux/Unix system administration and advanced troubleshooting capabilities.
- Deep experience managing complex Web3 infrastructures (RPC services, validator setups, node operations). Skilled in interpreting blockchain logs, JSON-RPC responses, and debugging intricate Web3 protocol issues.
- Solid hands-on experience with configuration management and infrastructure automation tools (Helm, Terraform, Ansible, Consul), including containerization expertise (Docker, Kubernetes), managing and scaling services in cloud environments.
- Competency in scripting/programming languages (Python, Go, JavaScript).
- Advanced proficiency in monitoring and analytics platforms (Grafana, DataDog), enabling proactive and data-driven operational decision-making.