About the company
OpenAI's mission is to ensure that general-purpose artificial intelligence benefits all of humanity. Our Communications team is composed of PR/Media Relations, Events, Design, and other external-facing functions. The team's ethos is to support OpenAI's mission and goals by clearly and authentically explaining our technology, values, and approach to safely building powerful AI. The Events team is a dynamic group dedicated to crafting extraordinary experiences that embody our company's values and mission. Our team is driven by a passion for bringing people together to connect in meaningful ways.
Job Summary
In this role, you will:
- Participate in architecture and engineering decisions, bringing your strong experience and knowledge to bear.
- Ensure the security, integrity, and compliance of data according to industry and company standards.
- Ensure our analytics and data platforms can scale reliably to the next several orders of magnitude.
- Accelerate company productivity by empowering your fellow engineers, researchers, and teammates with excellent data tooling and systems, providing a best-in-class experience.
- Bring new features and capabilities to the world by partnering with product engineers, trust & safety, and other teams to build the technical foundations.
- Like all other teams, we are responsible for the reliability of the systems we build. This includes an on-call rotation to respond to critical incidents as needed.
You might thrive in this role if you have:
- Experience building stream and batch data processing pipelines, using technologies such as (or equivalent to) Kafka, Spark, and Flink.
- Proficiency with modern infrastructure management tools, such as Kubernetes and Terraform.
- A passion for observability systems (bonus if for ML training). You are excited by the idea of building bespoke analytics systems that provide answers to key ML research questions.
- Experience working in an ML training organization, including with the problem of data transformation during pre-training.
- Strong software engineering skills, ideally in Python, and familiarity with large monorepo codebases.
- Experience with data lifecycle management systems for large organizations, including the problems of access control, provenance, auditing, data movement at scale, metadata management, etc.
- A self-starter mindset and comfort operating in a fast-paced environment.