About the company
Swan is a leading Bitcoin-only financial services company supporting individuals and companies throughout their Bitcoin journey. We hire passionate Bitcoiners who want to work with a self-motivated and fully distributed startup team.
Job Summary
The Role
📍In this position, you will work closely with our development team, the CTO, and cloud/infra engineers to develop and operate a robust, scalable platform that supports Swan’s business lines. You’ll cover a wide range of activities, from day-to-day operations, error monitoring, and proactive communication to engineering, bug fixes, and database analysis to improve query performance. While this position is focused on operational expertise, experience with and a desire to build software and systems will always be encouraged.
Skills and experience that will help you succeed:
📍Experience with Datadog or similar tools: setting up monitors, alerting systems, anomaly management, and forecasting. A desire to drive a proactive approach to scalability.
📍Medium to advanced understanding of Postgres databases, including experience with databases at scale, tuning parameters, optimizing SQL queries, and knowledge of AWS RDS in particular.
📍Excellent understanding of HA architectures built in AWS.
📍At least mid-level knowledge of DNS, SSL, AWS networking, Docker, and ECS.
📍Working knowledge of cloud security principles and familiarity with the AWS Well-Architected Framework.
📍Cool under pressure: able to manage incidents involving multiple systems, communicate effectively internally and externally using tools like StatusPage and PagerDuty, marshal resources, and get things resolved, including writing blameless postmortems.
📍Comfortable taking (very occasional) pager alerts during working hours and sometimes on weekends. We generally try to avoid nighttime pager alerts, as we have staff in Europe and can split pager duty across time zones. You will not be the only on-call staff member, but you will be in charge of primary incident response, and of leading and training other developers in response and mitigation.