Pandas and xarray in pyCIF

pyCIF uses pandas and xarray to store data in a convenient format. Time series are stored as pandas.DataFrame objects, and higher-dimensional datasets as xarray.Dataset or xarray.DataArray objects.
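As a rough illustration of these two kinds of containers, the sketch below builds a small time series and a small gridded field. The column name "value", the variable name "flux" and the dimension names are hypothetical and chosen for the example only; they are not the standardized pyCIF names.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Four daily time stamps shared by both containers
dates = pd.date_range("2020-01-01", periods=4, freq="D")

# A time series: one row per date (the column name is hypothetical)
timeseries = pd.DataFrame({"value": [1.0, 2.0, 3.0, 4.0]}, index=dates)

# A higher-dimensional field: time x lat x lon (dimension names are hypothetical)
field = xr.DataArray(
    np.zeros((4, 2, 3)),
    coords={"time": dates, "lat": [45.0, 46.0], "lon": [0.0, 1.0, 2.0]},
    dims=("time", "lat", "lon"),
)

# Several named DataArray objects can be grouped in a Dataset
dataset = xr.Dataset({"flux": field})
```

Both containers carry their coordinates with the values, so a function receiving them does not need the grid or the dates passed separately.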

pandas is used mainly for its convenient and efficient handling of dates and reindexing. xarray is used only as the default format for passing data between functions in pyCIF.
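The kind of date handling and reindexing meant here can be sketched with plain pandas calls. This is a generic example, not code taken from pyCIF; the column name "value" and the 3-hour target step are assumptions made for the illustration.

```python
import pandas as pd

# Irregular observation times parsed from strings
obs_dates = pd.to_datetime(
    ["2020-01-01 00:00", "2020-01-01 06:00", "2020-01-01 12:00"]
)
obs = pd.DataFrame({"value": [1.0, 2.0, 3.0]}, index=obs_dates)

# Regular 3-hourly target index covering the same period
target = pd.date_range(
    "2020-01-01 00:00", "2020-01-01 12:00", freq=pd.Timedelta(hours=3)
)

# Reindexing aligns the data on the target dates; missing dates become NaN
regular = obs.reindex(target)

# Alternatively, carry the last available value forward
filled = obs.reindex(target, method="ffill")
```

In this sketch, `regular` has missing values at 03:00 and 09:00, while `filled` repeats the previous observation at those times.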

Note

pyCIF does not use the full potential of pandas and xarray, for two main reasons:

  • to avoid compatibility issues between versions of the two modules

  • no adjoint is coded for the methods of the two modules, so they are replaced by custom equivalents

Intermediate data in pyCIF have a standardized format as detailed here. It is possible to dump intermediate temporary data for debugging purposes using the option save_debug in the observation operator: see details here.