obsparsers class
- class pycif.utils.classes.obsparsers.ObsParser(plg_orig=None, orig_name='', **kwargs)
Bases: Plugin
Plugin type for handling time series parsing from different data providers and data file formats.
Concrete implementations live in pycif/plugins/obsparsers/.
- initiate_template()
Initialise the ObsParser plugin template.
Loads the registered obs-parser module and attaches do_parse and parse_multiple_files as bound methods on this instance.
- classmethod get_parser(plg)
Get the correct Parser for a provider and file_format_id.
- Parameters:
provider (str) – provider of the input file
file_format_id (str) – name of the type of file with a given format
- Returns:
Parser for provider and file_format_id
- Return type:
Parser
- classmethod register_parser(provider, file_format_id, parse_module, **kwargs)
Register a parsing function for a provider and file format, with default options.
- Parameters:
provider (str) – provider of the input file
file_format_id (str) – name of the type of file with a given format
parse_module (Module) – returns file content as pandas.DataFrame df[obssite_id, parameter]
**kwargs – default options for parse_function
Notes
The parse_function signature is the same as that of Parser.parse_file().
- parse_file(obs_file, **kwargs)
Parses the input file and applies post-processing if necessary.
- Parameters:
obs_file (str) – path to input file
- Keyword Arguments:
encoding (str) – Encoding of input files
freq (str) – frequency after resampling; see Offset Aliases for valid strings
src_freq (str) – explicit frequency of the input file; setting this should not normally be necessary
- Returns:
renamed, shifted, resampled DataFrame df[obssite_id, parameter] with t as index
- Return type:
pandas.DataFrame
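To illustrate what the freq keyword implies, the sketch below resamples an hourly series to daily means with pandas. The data, the column name `value`, and the use of a mean are assumptions for illustration only; the aggregation actually applied by parse_file is not stated here.

```python
import pandas as pd

# Hypothetical hourly series for one site; values and column name are made up.
index = pd.date_range("2020-01-01", periods=48, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"value": [float(i) for i in range(48)]}, index=index)

# Resample to daily means; "D" plays the role of the freq keyword argument.
daily = df.resample("D").mean()
```

The result has one row per calendar day, indexed by time, which matches the "t as index" shape described for the return value.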
- parse_multiple_files(**kwargs)
Parses multiple files specified by a glob pattern and stores their content in a datastore.
- Parameters:
self – the plugin with its describing arguments (in particular dir_obs)
- Returns:
dictionary mapping each obs_file to its df[obssite_id, parameter]
- Return type:
dict
Note
By default, the function calls self.parse_file, which filters out NaNs and checks that all required columns are available.
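The glob-then-parse loop described above might look roughly like this. It is a sketch under stated assumptions: the standalone function signature, the `pattern` argument, and the `count_lines` stand-in parser are hypothetical, and the real method takes its settings from the plugin instance instead.

```python
import glob
import os
import tempfile


def parse_multiple_files(dir_obs, pattern, parse_file):
    """Apply parse_file to every file matching pattern under dir_obs."""
    datastore = {}
    for obs_file in sorted(glob.glob(os.path.join(dir_obs, pattern))):
        datastore[obs_file] = parse_file(obs_file)
    return datastore


def count_lines(obs_file):
    # Stand-in for a real parser: returns the number of lines in the file.
    with open(obs_file, encoding="utf-8") as handle:
        return sum(1 for _ in handle)


with tempfile.TemporaryDirectory() as dir_obs:
    # Create three small files; only the two ".obs" files match the pattern.
    for name, n_lines in [("a.obs", 2), ("b.obs", 3), ("notes.txt", 1)]:
        with open(os.path.join(dir_obs, name), "w", encoding="utf-8") as handle:
            handle.write("x\n" * n_lines)
    datastore = parse_multiple_files(dir_obs, "*.obs", count_lines)
    parsed = {os.path.basename(path): n for path, n in datastore.items()}
```

Keying the dictionary by file path keeps each file's content separate, so a caller can decide later how to merge or filter the per-file results.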
- static check_df(df, **kwargs)
Check that a parsed DataFrame contains all required columns.
- Parameters:
df (pd.DataFrame) – DataFrame returned by the parser.
**kwargs – Accepted for compatibility; not used.
- Returns:
True if all required columns are present.
- Return type:
bool
- Raises:
PluginError – If any of the station, network, parameter, duration or obserror columns is missing.
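A required-column check of this kind can be sketched without pandas by working on the column names alone. The names `check_columns` and `MissingColumnsError` are hypothetical stand-ins (the latter for PluginError); only the list of required columns comes from the description above.

```python
REQUIRED_COLUMNS = ("station", "network", "parameter", "duration", "obserror")


class MissingColumnsError(ValueError):
    """Hypothetical stand-in for the PluginError mentioned above."""


def check_columns(columns, required=REQUIRED_COLUMNS):
    """Return True if every required name is in columns, else raise."""
    present = set(columns)
    missing = [name for name in required if name not in present]
    if missing:
        raise MissingColumnsError("Missing required columns: " + ", ".join(missing))
    return True


# A complete set of columns passes; extra columns are allowed.
ok = check_columns(["station", "network", "parameter", "duration", "obserror", "obs"])

# An incomplete set raises, and the message names what is missing.
try:
    check_columns(["station", "network", "parameter"])
except MissingColumnsError as err:
    message = str(err)
```

With a DataFrame, the same function could be called as `check_columns(df.columns)`, since it only needs an iterable of column names.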