Commit 043cd79

elaborate on netCDF file benefits for non-users
1 parent 04e09b6 commit 043cd79

paper.md

Lines changed: 3 additions & 1 deletion
@@ -31,7 +31,9 @@ nctoolkit is a Python package for the analysis and post-processing of netCDF fil

# Statement of need

- netCDF is a file format for storing multidimensional data, and it is the fundamental storage unit of most modelling and large-scale observational work carried out in climate, marine and atmospheric sciences. The format is self-describing, meaning that metadata is stored alongside the data, enabling computational methods to work for almost all netCDF files that follow suitable conventions. The scale and magnitude of netCDF data in use by scientists continues to grow rapidly. For example, the Coupled Model Intercomparison Project Phase 6 [@ONeill2016] produced approximately 20 PB of publicly available data [@Petrie2021]. This accumulation of data offers great opportunities to environmental scientists; however, it also poses challenges because analysis software is often difficult for non-specialists to use [@Bates2018] or is inadequate. nctoolkit is a Python package that aims to fill critical gaps in the current netCDF software ecosystem. It provides a clean interface for working with netCDF files, and it has a particular focus on ensuring the compatibility of methods with oceanic model output, which often has irregular vertical grids. In contrast to other netCDF libraries, the use of CDO as a back-end allows nctoolkit users to carry out operations without having to specify the names of coordinates, such as longitude, latitude and time, which enables code written for one dataset to be easily applied to another.
+ netCDF is a file format for storing multidimensional data, and it is the fundamental storage unit of most modelling and large-scale observational work carried out in climate, marine and atmospheric sciences. Files typically represent spatiotemporal data, such as atmospheric or oceanic temperatures. In contrast to other data formats, such as csv, netCDF files are self-describing and typically follow universally agreed conventions for coordinate names and file structure. As a result, it is possible to write software that can work with almost all netCDF files that follow those conventions, and users need not be burdened with identifying the names given to coordinates, such as time, within the files themselves. A key consequence is that software can carry out operations, such as calculating spatial averages, in one line of code where users might otherwise have to write many, and these operations will largely work on any netCDF file.
+
+ The scale and magnitude of netCDF data in use by scientists continues to grow rapidly. For example, the Coupled Model Intercomparison Project Phase 6 [@ONeill2016] produced approximately 20 PB of publicly available data [@Petrie2021]. This accumulation of data offers great opportunities to environmental scientists; however, it also poses challenges because analysis software is often difficult for non-specialists to use [@Bates2018] or is inadequate. nctoolkit is a Python package that aims to fill critical gaps in the current netCDF software ecosystem. It provides a clean interface for working with netCDF files, and it has a particular focus on ensuring the compatibility of methods with oceanic model output, which often has irregular vertical grids. In contrast to other netCDF libraries, the use of CDO as a back-end allows nctoolkit users to carry out operations without having to specify the names of coordinates, such as longitude, latitude and time, which enables code written for one dataset to be easily applied to another.


# Overview of Functionality
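
The claim in the revised Statement of need, that self-describing metadata lets software find coordinates without the user naming them, can be illustrated with a small standard-library sketch. It is not nctoolkit's or CDO's implementation: the dictionary layout, the variable names (`nav_lat`, `nav_lon`, `sst`) and the helpers `find_coord` and `spatial_mean` are hypothetical, and the only outside assumption is the CF-conventions practice of marking latitude and longitude through `standard_name` or `units` attributes.

```python
import math

# Unit strings the CF conventions accept for each coordinate type.
CF_UNITS = {
    "latitude": {"degrees_north", "degree_north", "degree_N", "degrees_N"},
    "longitude": {"degrees_east", "degree_east", "degree_E", "degrees_E"},
}

# Toy stand-in for a netCDF file (hypothetical layout): each variable
# carries its own metadata. Coordinate names are deliberately non-standard.
dataset = {
    "nav_lat": {"dims": ("y",), "data": [0.0, 60.0],
                "attrs": {"standard_name": "latitude", "units": "degrees_north"}},
    "nav_lon": {"dims": ("x",), "data": [0.0, 90.0, 180.0],
                "attrs": {"standard_name": "longitude", "units": "degrees_east"}},
    "sst": {"dims": ("y", "x"), "attrs": {"units": "degC"},
            "data": [[20.0, 22.0, 24.0], [10.0, 10.0, 10.0]]},
}


def find_coord(ds, kind):
    """Return the name of the variable whose metadata marks it as `kind`."""
    for name, var in ds.items():
        attrs = var["attrs"]
        if attrs.get("standard_name") == kind or attrs.get("units") in CF_UNITS[kind]:
            return name
    raise KeyError(f"no {kind} coordinate found")


def spatial_mean(ds, var_name):
    """Area-weighted mean of a 2D (latitude, longitude) variable.

    Weights are cos(latitude), a common approximation for regular grids.
    """
    lat_name = find_coord(ds, "latitude")
    var = ds[var_name]
    # This sketch only handles variables whose first dimension is latitude.
    if var["dims"][0] != ds[lat_name]["dims"][0]:
        raise ValueError("expected latitude as the first dimension")
    total = 0.0
    weight_sum = 0.0
    for lat, row in zip(ds[lat_name]["data"], var["data"]):
        w = math.cos(math.radians(lat))
        total += w * sum(row)
        weight_sum += w * len(row)
    return total / weight_sum


# The caller never names the latitude coordinate.
mean_sst = spatial_mean(dataset, "sst")
```

Because the lookup relies on attributes rather than names, the same call should work on a file whose latitude variable is called `lat`, `latitude` or `nav_lat`, provided the metadata follows the conventions.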
