SWXSchema

class swxsoc.util.schema.SWXSchema(global_schema_layers: list[str] | None = None, variable_schema_layers: list[str] | None = None, use_defaults: bool | None = True)

Bases: CdfAttributeManager

Class representing a schema for data requirements and formatting. The SWxSOC Default Schema includes only the attributes required for ISTP compliance. Mission-specific attributes or requirements should be added through additional global and variable schema layers. For an example of how to layer schema files, see the HERMES mission core package and its HermesDataSchema extension of the SWXSchema class.

The Space Weather Data Schema has two main components: global attribute information and variable attribute information.

Global schema information is loaded from YAML (dict-like) files in the following format:

attribute_name:
    description: >
        Include a meaningful description of the attribute and context needed to understand
        its values.
    default: <string> # A default value for the attribute if needed/desired
    derived: <bool> # Whether or not the attribute's value can be derived using a Python function
    derivation_fn: <string> # The name of a Python function to derive the value. Must be a function member of the schema class and match the signature below.
    required: <bool> # Whether the attribute is required
    overwrite: <bool> # Whether an existing value for the attribute should be overwritten if a different value is derived.

All functions that derive global attributes should follow the signature below. The function takes a single parameter, data, which is an SWXData object or an instance of an extended data class, and returns a single value for the attribute being derived.

def derivation_fn(self, data: SWXData):
    # ... do manipulations as needed from `data`
    return "attribute_value"
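As a minimal sketch of this signature, the example below uses hypothetical stand-ins (`StandInData`, `ExampleSchema`, `_derive_example_source`, `derive`) rather than the real `SWXData` and `SWXSchema` classes. It assumes only that the data object carries a `meta` mapping of global attributes, and it shows a derivation function being resolved by the name a schema file would list under `derivation_fn`.

```python
from dataclasses import dataclass, field


@dataclass
class StandInData:
    """Hypothetical stand-in for SWXData; assumes a `meta` dict of global attributes."""

    meta: dict = field(default_factory=dict)


class ExampleSchema:
    """Hypothetical stand-in for a schema class that owns its derivation functions."""

    def _derive_example_source(self, data):
        # Build a single attribute value from information carried by `data`.
        mission = data.meta.get("Mission_group", "unknown")
        return f"{mission}_example"

    def derive(self, derivation_fn: str, data):
        # Resolve the function by the name a schema file lists under `derivation_fn`.
        return getattr(self, derivation_fn)(data)


schema = ExampleSchema()
value = schema.derive("_derive_example_source", StandInData(meta={"Mission_group": "hermes"}))
print(value)
```

Keeping derivation functions as members of the schema class is what lets a plain string in the YAML file select them.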

Variable schema information is loaded from YAML (dict-like) files in the following format:

attribute_key:
    attribute_name:
        description: >
            Include a meaningful description of the attribute and context needed to understand
            its values.
        derived: <bool> # Whether or not the attribute's value can be derived using a Python function
        derivation_fn: <string> # The name of a Python function to derive the value. Must be a function member of the schema class and match the signature below.
        required: <bool> # Whether the attribute is required
        overwrite: <bool> # Whether an existing value for the attribute should be overwritten if a different value is derived.
        valid_values: <list> # A list of valid values that the attribute can take. The value of the attribute is checked against the `valid_values` in the Validation module.
        alternate: <string> # An additional attribute name that can be treated as an alternative of the given attribute.
data:
    - attribute_name
    - ...
support_data:
    - ...
metadata:
    - ...
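To make the layout above concrete, here is a small sketch of the same structure as a Python dict, with a helper that lists the required attributes for one variable type. The attribute entries are trimmed and their flag values are illustrative, the helper `required_attributes` is hypothetical, and the reading that the `data`, `support_data`, and `metadata` lists name the attributes applying to each variable type is an assumption drawn from the layout, not a statement of how SWXSchema processes the file.

```python
# Hypothetical, trimmed-down variable schema mirroring the YAML layout.
variable_schema = {
    "attribute_key": {
        "CATDESC": {"required": True, "derived": False},
        "DISPLAY_TYPE": {"required": True, "derived": False},
        "FIELDNAM": {"required": True, "derived": True},
        "VAR_NOTES": {"required": False, "derived": False},
    },
    "data": ["CATDESC", "DISPLAY_TYPE", "FIELDNAM", "VAR_NOTES"],
    "support_data": ["CATDESC", "FIELDNAM"],
    "metadata": ["CATDESC", "FIELDNAM"],
}


def required_attributes(schema: dict, var_type: str) -> list:
    """Return the required attribute names listed for one variable type."""
    definitions = schema["attribute_key"]
    return [name for name in schema[var_type] if definitions[name]["required"]]


data_required = required_attributes(variable_schema, "data")
support_required = required_attributes(variable_schema, "support_data")
print(data_required, support_required)
```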

The signature for all functions to derive variable attributes should follow the format below. The function takes in parameters var_name, var_data, and guess_type, where:

  • var_name is the variable name of the variable for which the attribute is being derived

  • var_data is the variable data of the variable for which the attribute is being derived

  • guess_type is the guessed CDF variable type of the data for which the attribute is being derived.

The function must return a single attribute value for the given attribute to be derived.

def derivation_fn(self, var_name: str, var_data: Union[Quantity, NDData, NDCube], guess_type: ctypes.c_long):
    # ... do manipulations as needed from data
    return "attribute_value"
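A short sketch of a function matching this signature follows. The class `ExampleSchema` and the method `_derive_fieldnam` are hypothetical, and a plain list and an arbitrary `ctypes.c_long` stand in for real variable data and a guessed CDF type; the rule used to build the value is purely illustrative.

```python
import ctypes


class ExampleSchema:
    """Hypothetical stand-in for a schema class; not the real SWXSchema."""

    def _derive_fieldnam(self, var_name, var_data, guess_type):
        # Only `var_name` is used here; `var_data` and `guess_type` are
        # accepted so the function matches the documented signature.
        return var_name.replace("_", " ").title()


schema = ExampleSchema()
# Placeholder arguments: a list for the data, c_long(0) for the guessed type.
fieldnam = schema._derive_fieldnam("electron_flux", [1.0, 2.0, 3.0], ctypes.c_long(0))
print(fieldnam)
```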
Parameters:
  • global_schema_layers (Optional[list[Path]]) – Absolute file paths to global attribute schema files. These schema files are layered on top of one another in a latest-priority ordering. That is, the latest file that modifies a common schema attribute will take precedence over earlier values for a given attribute.

  • variable_schema_layers (Optional[list[Path]]) – Absolute file paths to variable attribute schema files. These schema files are layered on top of one another in a latest-priority ordering. That is, the latest file that modifies a common schema attribute will take precedence over earlier values for a given attribute.

  • use_defaults (Optional[bool]) – Whether or not to load the default global and variable attribute schema files. These default schema files contain only the requirements for CDF ISTP validation.
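The latest-priority ordering described for the schema layers can be sketched with plain dictionaries. This is an illustration of the ordering rule only: the helper `layer_schemas` and the example values are hypothetical, and whether the real loader replaces a whole attribute entry or merges it field by field is not stated here, so whole-entry replacement is an assumption.

```python
def layer_schemas(layers):
    """Merge schema dicts so that later layers win for attributes they redefine."""
    merged = {}
    for layer in layers:  # earliest layer first, latest layer last
        merged.update(layer)
    return merged


# Stand-in for a default layer with two global attributes.
default_layer = {
    "Descriptor": {"required": True, "default": None},
    "Data_type": {"required": True, "default": None},
}
# Stand-in for a mission layer that redefines one of them.
mission_layer = {
    "Descriptor": {"required": True, "default": "EXAMPLE>Example Instrument"},
}

merged = layer_schemas([default_layer, mission_layer])
print(merged["Descriptor"]["default"])
```

Attributes that only the earlier layer defines, such as `Data_type` here, pass through unchanged.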

Attributes Summary

default_global_attributes

Function to load the default global attributes from the SWxSOC schema.

Methods Summary

derive_global_attributes(data)

Function to derive global attributes for the given measurement data.

derive_measurement_attributes(data, var_name)

Function to derive metadata for the given measurement.

global_attribute_info([attribute_name])

Function to generate an astropy.table.Table of information about each global metadata attribute.

global_attribute_template()

Function to generate a template of required global attributes that must be set for a valid CDF.

measurement_attribute_info([attribute_name])

Function to generate an astropy.table.Table of information about each variable metadata attribute.

measurement_attribute_template()

Function to generate a template of required measurement attributes that must be set for a valid CDF measurement variable.

Attributes Documentation

default_global_attributes

Function to load the default global attributes from the SWxSOC schema.

Returns:

default_global_attributes (dict) – A dictionary of default global attributes.

Methods Documentation

derive_global_attributes(data) → OrderedDict

Function to derive global attributes for the given measurement data.

Parameters:

data (swxsoc.swxdata.SWXData) – An instance of SWXData to derive metadata from.

Returns:

attributes (OrderedDict) – A dict containing key: value pairs of global metadata attributes.
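One way to picture how derived values interact with the `overwrite` flag from the schema is the sketch below. It is not the library's implementation: the helper `apply_derived`, the schema entries, and the attribute values are hypothetical, and the rule that an unset attribute is always filled is an assumption.

```python
from collections import OrderedDict


def apply_derived(existing: dict, derived: dict, schema: dict) -> OrderedDict:
    """Combine existing and derived values, honoring each attribute's `overwrite` flag."""
    result = OrderedDict(existing)
    for name, value in derived.items():
        already_set = result.get(name) is not None
        # Fill unset attributes; replace set ones only when overwrite is enabled.
        if not already_set or schema[name].get("overwrite", False):
            result[name] = value
    return result


schema = {
    "Generation_date": {"derived": True, "overwrite": True},
    "Logical_source": {"derived": True, "overwrite": False},
}
existing = {"Generation_date": "2020-01-01", "Logical_source": "user_value"}
derived = {"Generation_date": "2024-06-01", "Logical_source": "derived_value"}

attributes = apply_derived(existing, derived, schema)
print(dict(attributes))
```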

derive_measurement_attributes(data, var_name: str, guess_types: list[int] | None = None) → OrderedDict

Function to derive metadata for the given measurement.

Parameters:
  • data (swxsoc.swxdata.SWXData) – An instance of SWXData to derive metadata from

  • var_name (str) – The name of the measurement to derive metadata for

  • guess_types (list[int], optional) – Guessed CDF Type of the variable

Returns:

attributes (OrderedDict) – A dict containing key: value pairs of derived metadata attributes.

global_attribute_info(attribute_name: str | None = None) → Table

Function to generate an astropy.table.Table of information about each global metadata attribute. The astropy.table.Table contains all information in the SWxSOC global attribute schema including:

  • description: (str) A brief description of the attribute

  • default: (str) The default value used if none is provided

  • derived: (bool) Whether the attribute can be derived by the SWxSOC SWXSchema class

  • required: (bool) Whether the attribute is required by SWxSOC standards

  • overwrite: (bool) Whether the SWXSchema attribute derivations will overwrite an existing attribute value with an updated attribute value from the derivation process.

Parameters:

attribute_name (str, optional, default None) – The name of the attribute to get specific information for.

Returns:

info (astropy.table.Table) – A table of information about global metadata.

Raises:

KeyError – If attribute_name is not a recognized global attribute.

global_attribute_template() → OrderedDict

Function to generate a template of required global attributes that must be set for a valid CDF.

Returns:

template (OrderedDict) – A template for required global attributes that must be provided.

measurement_attribute_info(attribute_name: str | None = None) → Table

Function to generate an astropy.table.Table of information about each variable metadata attribute. The astropy.table.Table contains all information in the SWxSOC variable attribute schema including:

  • description: (str) A brief description of the attribute

  • derived: (bool) Whether the attribute can be derived by the SWxSOC SWXSchema class

  • required: (bool) Whether the attribute is required by SWxSOC standards

  • overwrite: (bool) Whether the SWXSchema attribute derivations will overwrite an existing attribute value with an updated attribute value from the derivation process.

  • valid_values: (str) List of allowed values the attribute can take for SWxSOC products, if applicable

  • alternate: (str) An additional attribute name that can be treated as an alternative of the given attribute. Not all attributes have an alternative, and only one of a given attribute or its alternate is required.

  • var_types: (str) A list of the variable types that require the given attribute to be present.

Parameters:

attribute_name (str, optional, default None) – The name of the attribute to get specific information for.

Returns:

info (astropy.table.Table) – A table of information about variable metadata.

Raises:

KeyError – If attribute_name is not a recognized variable attribute.
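The rule that only one of an attribute or its alternate is required can be sketched as a simple presence check. The helper `missing_required` and the schema entries are hypothetical; the `LABLAXIS` and `LABL_PTR_1` pairing follows common ISTP usage, and whether the SWxSOC schema defines exactly this pairing is an assumption.

```python
def missing_required(var_attrs: dict, schema: dict) -> list:
    """List required attributes that are absent and have no alternate present."""
    missing = []
    for name, spec in schema.items():
        if not spec.get("required", False):
            continue
        alternate = spec.get("alternate")
        # Either the attribute itself or its alternate satisfies the requirement.
        if name in var_attrs or (alternate is not None and alternate in var_attrs):
            continue
        missing.append(name)
    return missing


schema = {
    "CATDESC": {"required": True, "alternate": None},
    "LABLAXIS": {"required": True, "alternate": "LABL_PTR_1"},
}
var_attrs = {"LABL_PTR_1": "flux_labels"}

missing = missing_required(var_attrs, schema)
print(missing)
```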

measurement_attribute_template() → OrderedDict

Function to generate a template of required measurement attributes that must be set for a valid CDF measurement variable.

Returns:

template (OrderedDict) – A template for required variable attributes that must be provided.