Geospatial Data Engineer
Opportunity
- Be one of the initial hires at a remote startup, started by experienced entrepreneurs, developing a transformative approach to earth system modeling.
- Build the world’s best weather forecast and data analysis system using a data-driven, end-to-end learned approach.
- Join a multi-disciplinary team committed to open science and sharing results with the broader weather and climate communities.
Requirements
- BS or MS in computer science, mathematics, applied statistics, machine learning, physics, meteorology, geography, or equivalent industry experience.
- 3+ years of industry experience developing peta-scale data infrastructure, ideally working with a multitude of geospatial data from weather/climate models, satellites, radar, and other types of observation systems.
- Expert proficiency in Python.
- Practical experience working with diverse weather/environmental data formats, including HDF5, NetCDF, TIFF/GeoTIFF, BUFR, GRIB, and various weather radar formats.
- Experience and proficiency working with systems and tools designed for large-scale data processing and archival, including Parquet, Apache Beam/Google Cloud Dataflow, BigQuery, or equivalent systems and tools on AWS or Azure.
- Experience designing, rapidly prototyping, and evaluating complex data processing systems.
- Proficiency in communicating system designs to technical stakeholders, including scientists and engineers, and incorporating their input.
- Ability to work independently.
- Flexibility and adaptability to work on diverse projects and pivot when necessary.
Great to Have
- Knowledge of weather and climate observation systems and data, especially from satellite platforms.
- Familiarity with NOAA/NASA/ESA/JAXA satellite data and other datasets commonly leveraged for numerical weather prediction and data assimilation applications.
- Familiarity with developing in Fortran or C++.
- Familiarity with common open-source tools developed in the world of weather/climate for interacting with legacy data, including the ecCodes package from ECMWF and the various NCEPLIBS-* libraries from NOAA.
- Experience collaborating or working with stakeholders from diverse communities, including academic researchers and civil servants working at NOAA or similar agencies.
- Hands-on experience designing and building applications on Google Cloud Platform leveraging managed services/products.
- Expertise in designing data systems which feed into large-scale AI/ML model training and inference.
Responsibilities
- Collaborate with the founding team to advance the state of the art in weather forecasting using a data-driven, end-to-end learned approach.
- Design and implement peta-scale data processing systems for building AI/ML-ready datasets core to the company’s scientific research and product portfolio.
- Work closely with the research team to design and generate datasets for AI/ML modeling.
- Help develop and implement standards and frameworks for creating AI/ML-ready weather/climate observational datasets, and use these to publish datasets developed as part of the company’s scientific research agenda.
- Establish best practices and workflows for data engineering across the company’s development portfolio.
- Promote engineering best practices by conducting code reviews and ensuring high-quality code.