-netCDF is a file format for storing multidimensional data, and it is the fundamental storage unit of most modelling and large-scale observational work carried out in climate, marine and atmospheric sciences. The format is self-describing, meaning that metadata is stored alongside the data, enabling computational methods to work for almost all netCDF files that follow suitable conventions. The volume of netCDF data used by scientists continues to grow rapidly. For example, the Coupled Model Intercomparison Project Phase 6 [@ONeill2016] produced approximately 20 PB of publicly available data [@Petrie2021]. This accumulation of data offers great opportunities to environmental scientists, but it also poses challenges because analysis software is often inadequate or difficult for non-specialists to use [@Bates2018]. nctoolkit is a Python package that aims to fill critical gaps in the current netCDF software ecosystem. It provides a clean interface for working with netCDF files, and it has a particular focus on ensuring the compatibility of methods with oceanic model output, which often has irregular vertical grids. In contrast to other netCDF libraries, nctoolkit uses Climate Data Operators (CDO) as a back-end, which allows users to carry out operations without having to specify the names of coordinates such as longitude, latitude and time. Code written for one dataset can therefore be applied easily to another.
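
To illustrate why self-describing metadata makes name-independent code possible, the following minimal sketch looks up coordinates from CF-style attributes (`standard_name`, `axis`, `units`) rather than from variable names. It is an illustration under stated assumptions, not a description of how nctoolkit or CDO identify coordinates: plain dictionaries stand in for netCDF variable attributes, the two datasets and their variable names are invented, and `find_coordinate` is a hypothetical helper.

```python
# Illustrative sketch only: plain dicts stand in for the attributes
# that a netCDF file stores alongside each variable.

# CF-style hints for each coordinate role (standard_name, axis, units).
CF_HINTS = {
    "longitude": {"standard_name": "longitude", "axis": "X", "units": "degrees_east"},
    "latitude": {"standard_name": "latitude", "axis": "Y", "units": "degrees_north"},
    "time": {"standard_name": "time", "axis": "T", "units": None},
}


def find_coordinate(variables, role):
    """Return the name of the variable playing `role`, or None if none matches.

    Hypothetical helper: a variable matches if its standard_name, axis
    or units attribute agrees with the hints for that role.
    """
    hints = CF_HINTS[role]
    for name, attrs in variables.items():
        if attrs.get("standard_name") == hints["standard_name"]:
            return name
        if attrs.get("axis") == hints["axis"]:
            return name
        if hints["units"] is not None and attrs.get("units") == hints["units"]:
            return name
    return None


# Two invented datasets that name their coordinates differently.
model_a = {
    "lon": {"standard_name": "longitude", "units": "degrees_east"},
    "lat": {"standard_name": "latitude", "units": "degrees_north"},
    "time": {"standard_name": "time", "units": "days since 1970-01-01"},
    "sst": {"standard_name": "sea_surface_temperature", "units": "K"},
}
model_b = {
    "nav_lon": {"units": "degrees_east"},
    "nav_lat": {"units": "degrees_north"},
    "time_counter": {"axis": "T", "units": "seconds since 1900-01-01"},
    "thetao": {"standard_name": "sea_water_potential_temperature", "units": "degC"},
}

# The same lookup code runs unchanged on both datasets.
for dataset in (model_a, model_b):
    print({role: find_coordinate(dataset, role) for role in CF_HINTS})
```

Because the lookup depends only on metadata, the calling code never mentions `lon`, `nav_lon` or `time_counter`, which is the property that lets an analysis written for one dataset be reused on another.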