obsparsers class

class pycif.utils.classes.obsparsers.ObsParser(plg_orig=None, orig_name='', **kwargs)

Bases: Plugin

Plugin type for handling time series parsing from different data providers and data file formats.

Concrete implementations live in pycif/plugins/obsparsers/.

initiate_template()

Initialise the ObsParser plugin template.

Loads the registered obs-parser module and attaches do_parse and parse_multiple_files as bound methods on this instance.

classmethod get_parser(plg)

Return the Parser registered for a given provider and file_format_id.

Parameters:
  • provider (str) – provider of the input file

  • file_format_id (str) – name of the type of file with a given format

Returns:

Parser for provider and file_format_id

Return type:

Parser

classmethod register_parser(provider, file_format_id, parse_module, **kwargs)

Register a parsing function for a given provider and file format, together with its default options.

Parameters:
  • provider (str) – provider of the input file

  • file_format_id (str) – name of the type of file with a given format

  • parse_module (Module) – module whose parsing function returns the file content as a pandas.DataFrame df[obssite_id, parameter]

  • **kwargs – default options for parse_function

Notes

The parse_function signature is the same as that of Parser.parse_file().
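The registration and lookup described above follow a registry pattern. A minimal sketch, assuming a plain dictionary keyed on (provider, file_format_id); the class `ParserRegistry`, its method bodies, and the `parse_csv` function are hypothetical stand-ins for illustration, not pyCIF's actual implementation:

```python
from functools import partial


class ParserRegistry:
    """Hypothetical registry keyed on (provider, file_format_id)."""

    _registry = {}

    @classmethod
    def register_parser(cls, provider, file_format_id, parse_function, **kwargs):
        # Store the function with its default options pre-bound.
        cls._registry[(provider, file_format_id)] = partial(parse_function, **kwargs)

    @classmethod
    def get_parser(cls, provider, file_format_id):
        try:
            return cls._registry[(provider, file_format_id)]
        except KeyError:
            raise KeyError(
                f"No parser registered for {provider!r} / {file_format_id!r}"
            ) from None


def parse_csv(obs_file, encoding="utf-8", sep=","):
    # Illustrative parser: it only reports how it was called.
    return {"file": obs_file, "encoding": encoding, "sep": sep}


# Register with a default option, then look the parser up again.
ParserRegistry.register_parser("demo", "csv", parse_csv, sep=";")
parser = ParserRegistry.get_parser("demo", "csv")
result = parser("obs.csv")
```

Binding the defaults with `functools.partial` keeps them overridable at call time: in this sketch, `parser("obs.csv", sep=",")` would replace the registered default.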

parse_file(obs_file, **kwargs)

Parse an observation file and post-process the result if necessary.

Parameters:

obs_file (str) – path to input file

Keyword Arguments:
  • encoding (str) – Encoding of input files

  • freq (str) – frequency after resampling; see the pandas Offset Aliases for valid strings

  • src_freq (str) – frequency of the input file; setting it explicitly should not normally be necessary

Returns:

Renamed, shifted and resampled DataFrame df[obssite_id, parameter], with t as index

Return type:

pandas.DataFrame
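To illustrate what a freq offset alias does, here is a minimal pandas sketch of resampling a time-indexed DataFrame. It is not pyCIF's code: the site name `site_a`, the parameter `CH4`, the values, and the choice of a mean aggregation are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical hourly series for one site and one parameter, indexed by time t.
index = pd.date_range("2020-01-01", periods=48, freq="60min", name="t")
columns = pd.MultiIndex.from_tuples(
    [("site_a", "CH4")], names=["obssite_id", "parameter"]
)
df = pd.DataFrame([[float(i)] for i in range(48)], index=index, columns=columns)

# Resample to daily means; "1D" is a pandas offset alias.
daily = df.resample("1D").mean()
```

The two days of hourly values collapse into two daily rows, one per calendar day.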

parse_multiple_files(**kwargs)

Parses multiple files specified by a glob pattern and stores the content into a datastore.

Parameters:

self – the plugin instance and its configuration arguments (in particular dir_obs)

Returns:

Dictionary mapping each obs_file to its DataFrame df[obssite_id, parameter]

Return type:

dict

Note

By default, the function calls self.parse_file, which filters out NaNs and checks that all required columns are available.
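The glob-then-parse loop can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's implementation: the standalone function signature, the `pattern` argument, and the `count_lines` stand-in parser are hypothetical.

```python
import glob
import os
import tempfile


def parse_multiple_files(dir_obs, parse_file, pattern="*.csv"):
    """Hypothetical helper: map each matching file path to its parsed content."""
    datastore = {}
    for obs_file in sorted(glob.glob(os.path.join(dir_obs, pattern))):
        datastore[obs_file] = parse_file(obs_file)
    return datastore


def count_lines(obs_file):
    # Stand-in for a real parser: returns the number of lines in the file.
    with open(obs_file, encoding="utf-8") as handle:
        return sum(1 for _ in handle)


# Build a throwaway directory with two matching files and one that is ignored.
with tempfile.TemporaryDirectory() as dir_obs:
    for name, n_lines in [("a.csv", 2), ("b.csv", 3), ("notes.txt", 1)]:
        with open(os.path.join(dir_obs, name), "w", encoding="utf-8") as handle:
            handle.write("x\n" * n_lines)
    datastore = parse_multiple_files(dir_obs, count_lines)
    parsed = {os.path.basename(path): n for path, n in datastore.items()}
```

Only the files matching the pattern end up as keys of the returned dictionary.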

static check_df(df, **kwargs)

Check that a parsed DataFrame contains all required columns.

Parameters:
  • df (pd.DataFrame) – DataFrame returned by the parser.

  • **kwargs – Accepted for compatibility; not used.

Returns:

True if all required columns are present.

Return type:

bool

Raises:

PluginError – If any of the station, network, parameter, duration or obserror columns is missing.
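A column check of this kind can be sketched in a few lines. The version below is a hypothetical re-implementation for illustration only: it raises `ValueError` as a stand-in for PluginError, and the error message is an assumption.

```python
import pandas as pd

REQUIRED_COLUMNS = ("station", "network", "parameter", "duration", "obserror")


def check_df(df):
    """Hypothetical re-implementation of the required-column check."""
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        # The documented method raises PluginError; ValueError stands in here.
        raise ValueError(f"Missing required columns: {missing}")
    return True


complete = pd.DataFrame(columns=list(REQUIRED_COLUMNS))
incomplete = complete.drop(columns=["obserror"])

ok = check_df(complete)
try:
    check_df(incomplete)
    error = None
except ValueError as exc:
    error = str(exc)
```

Listing every missing column in the message, rather than stopping at the first, makes a malformed parser output easier to diagnose.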