GOOGLE CLOUD
Google Cloud Storage at gs://nnja-ai
GitHub
Download our Python SDK for working with this data on GitHub
As part of a Cooperative Research and Development Agreement with NOAA, Brightband is building AI-ready observational datasets to power a new generation of machine learning-based weather and climate prediction tools. The first dataset we’re launching as part of this partnership is a re-processed version of the NOAA-NASA Joint Archive (NNJA) of Observations for Earth System Reanalysis. Originally designed to support R&D on improving weather and climate reanalysis modeling, the NNJA archive is also an ideal dataset for developing observation-driven weather forecasting tools: it includes a wide cross-section of data from many sensing platforms (satellites, surface stations, weather balloons, and more) and spans 1979 to the present.
The original NNJA dataset was published in BUFR format, which is difficult and awkward to work with. NNJA-AI (pronounced “Ninja AI”) greatly simplifies this by providing a well-structured archive of the data, re-processed into a contemporary, analysis-ready, cloud-optimized tabular format that slots easily into existing data workflows. Just point your favorite data analytics tools at the Hive-partitioned archive on Google Cloud Storage at gs://nnja-ai, which is well suited to users who want to curate data at scale for ML model training. Or use our Python SDK to quickly download snapshots of data for visualization or experimentation.
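For readers new to Hive partitioning: the convention encodes column values in directory names as key=value segments, so tools can skip whole directories when filtering. The sketch below shows how such paths can be parsed and built using only the Python standard library. Only the bucket URI comes from the text above; the dataset folder, the partition key `obs_date`, and the file name are hypothetical placeholders and may not match the actual archive layout.

```python
from urllib.parse import urlparse

BUCKET_URI = "gs://nnja-ai"  # bucket named in the announcement


def parse_hive_partitions(uri: str) -> dict[str, str]:
    """Extract key=value partition segments from a Hive-style object path."""
    path = urlparse(uri).path
    partitions = {}
    for segment in path.split("/"):
        key, sep, value = segment.partition("=")
        if sep and key:  # keep only segments of the form key=value
            partitions[key] = value
    return partitions


def partition_prefix(base: str, **partitions: str) -> str:
    """Build a Hive-style prefix (base/key=value/) for listing or globbing."""
    parts = [f"{key}={value}" for key, value in partitions.items()]
    return "/".join([base.rstrip("/"), *parts]) + "/"


# Hypothetical object path: the folder, partition key, and file name are
# made up for illustration and are NOT taken from the real archive.
example = f"{BUCKET_URI}/example_dataset/obs_date=2021-01-01/part-0.parquet"

print(parse_hive_partitions(example))
print(partition_prefix(f"{BUCKET_URI}/example_dataset", obs_date="2021-01-01"))
```

Analytics tools that understand Hive partitioning typically do this parsing for you and expose each key as a queryable column, so a filter on a partition column only reads the matching directories.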
We’re publishing a “Preview Release” of this data in January 2025 to gather feedback from users before we re-process the entire NNJA archive later in the quarter. Please give us a shout at hello@brightband.com if you’d like to chat with us more about this initiative!