input4mips-validation#

Entrypoint for the command-line interface

Usage:

$ input4mips-validation [OPTIONS] COMMAND [ARGS]...

Options:

  • --version: Print the version number and exit
  • --no-logging: Disable all logging. If supplied, overrides '--logging-config'.
  • --logging-level TEXT: Logging level to use. This is only applied if no other logging configuration flags are supplied.
  • --logging-config PATH: Path to the logging configuration file. This will be loaded with loguru-config (https://github.com/erezinman/loguru-config). If supplied, this overrides any value provided with --logging-level. For a sample configuration file, see "How to configure logging with input4MIPs-validation?".
  • --install-completion: Install completion for the current shell.
  • --show-completion: Show completion for the current shell, to copy it or customize the installation.
  • --help: Show this message and exit.
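
The precedence between the three logging options described above can be sketched as follows. This is only an illustration of the documented rules, not the package's own code; the function name `choose_logging_mode` and the returned labels are hypothetical.

```python
from __future__ import annotations


def choose_logging_mode(
    no_logging: bool,
    logging_level: str | None,
    logging_config: str | None,
) -> tuple[str, str | None]:
    """Return which logging option wins under the documented precedence.

    Hypothetical helper: the labels returned here are for illustration only.
    """
    # --no-logging overrides --logging-config (and everything else).
    if no_logging:
        return ("disabled", None)
    # --logging-config overrides any value given with --logging-level.
    if logging_config is not None:
        return ("config-file", logging_config)
    # --logging-level only applies if no other logging flag is supplied.
    if logging_level is not None:
        return ("level", logging_level)
    return ("default", None)
```

For instance, supplying both a level and a configuration file selects the configuration file, and adding --no-logging disables logging regardless of the other two.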

Commands:

  • validate-file: Validate a single file
  • validate-tree: Validate a tree of files
  • upload-ftp: Upload files to an FTP server
  • db: Create, extend and validate a database of file entries

input4mips-validation validate-file#

Validate a single file

This validation is only partial because some validation can only be performed if we have the entire file tree. See the validate-tree command for this validation.

Usage:

$ input4mips-validation validate-file [OPTIONS] FILE

Arguments:

  • FILE: The file to validate [required]

Options:

  • --cv-source TEXT: String identifying the source of the CVs. If not supplied, this is retrieved from the environment variable INPUT4MIPS_VALIDATION_CV_SOURCE. If this environment variable is also not set, we raise a NotImplementedError. If this starts with 'gh:', we retrieve the data from PCMDI's GitHub, using everything after the colon as the ID for the Git object to use (where the ID can be a branch name, a tag or a commit ID). Otherwise we simply return the path as provided and use the validators package (https://validators.readthedocs.io/en/stable) to decide if the source points to a URL or not (i.e. whether we should look for the CVs locally or retrieve them from a URL).
  • --write-in-drs PATH: If supplied, the file will be re-written into the DRS if it passes validation. The supplied value is assumed to be the root directory into which to write the file (following the DRS).
  • --bnds-coord-indicators TEXT: A semicolon (';') separated list of strings that indicate that a variable is a bounds coordinate. This helps us identify the input file's variables correctly in the absence of an agreed convention for doing this (xarray has a way, but it conflicts with the CF conventions and hence with iris). This interface gives limited control over this. For more complex control, use the Python API directly. [default: bnds;bounds]
  • --frequency-metadata-key TEXT: The key in the data's metadata which points to information about the data's frequency. [default: frequency]
  • --no-time-axis-frequency TEXT: The value of frequency_metadata_key in the metadata which indicates that the file has no time axis i.e. is fixed in time. [default: fx]
  • --time-dimension TEXT: The time dimension of the data. [default: time]
  • --allow-cf-checker-warnings: Allow validation to pass, even if the CF-checker raises warnings
  • --help: Show this message and exit.
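
The resolution order described for --cv-source can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the package's implementation: the function name `classify_cv_source` and its return labels are hypothetical, and the standard library's `urllib.parse` stands in for the validators package when deciding whether a source is a URL.

```python
from __future__ import annotations

import os
from urllib.parse import urlparse

# Environment variable named in the option description above.
ENV_VAR = "INPUT4MIPS_VALIDATION_CV_SOURCE"


def classify_cv_source(
    cv_source: str | None = None, environ=None
) -> tuple[str, str]:
    """Classify a CV source as a GitHub ref, a URL or a local path.

    Hypothetical helper that mirrors the documented rules only.
    """
    environ = os.environ if environ is None else environ

    # Fall back to the environment variable if no value is supplied.
    if cv_source is None:
        cv_source = environ.get(ENV_VAR)
    if cv_source is None:
        raise NotImplementedError(
            f"No CV source supplied and {ENV_VAR} is not set"
        )

    # 'gh:<id>': everything after the colon identifies the Git object
    # (a branch name, a tag or a commit ID).
    if cv_source.startswith("gh:"):
        return ("github", cv_source.split(":", 1)[1])

    # Assumption: a simple scheme check replaces the validators package here.
    if urlparse(cv_source).scheme in ("http", "https"):
        return ("url", cv_source)

    return ("local", cv_source)
```

Passing `environ` explicitly keeps the sketch easy to exercise without touching the real environment.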

input4mips-validation validate-tree#

Validate a tree of files

This checks things like whether all external variables are also provided and all tracking IDs are unique.

Usage:

$ input4mips-validation validate-tree [OPTIONS] TREE_ROOT

Arguments:

  • TREE_ROOT: The root of the tree to validate [required]

Options:

  • --cv-source TEXT: String identifying the source of the CVs. If not supplied, this is retrieved from the environment variable INPUT4MIPS_VALIDATION_CV_SOURCE. If this environment variable is also not set, we raise a NotImplementedError. If this starts with 'gh:', we retrieve the data from PCMDI's GitHub, using everything after the colon as the ID for the Git object to use (where the ID can be a branch name, a tag or a commit ID). Otherwise we simply return the path as provided and use the validators package (https://validators.readthedocs.io/en/stable) to decide if the source points to a URL or not (i.e. whether we should look for the CVs locally or retrieve them from a URL).
  • --bnds-coord-indicators TEXT: A semicolon (';') separated list of strings that indicate that a variable is a bounds coordinate. This helps us identify the input file's variables correctly in the absence of an agreed convention for doing this (xarray has a way, but it conflicts with the CF conventions and hence with iris). This interface gives limited control over this. For more complex control, use the Python API directly. [default: bnds;bounds]
  • --frequency-metadata-key TEXT: The key in the data's metadata which points to information about the data's frequency. [default: frequency]
  • --no-time-axis-frequency TEXT: The value of frequency_metadata_key in the metadata which indicates that the file has no time axis i.e. is fixed in time. [default: fx]
  • --time-dimension TEXT: The time dimension of the data. [default: time]
  • --rglob-input TEXT: String to use when applying rglob to find input files. [default: *.nc]
  • --allow-cf-checker-warnings: Allow validation to pass, even if the CF-checker raises warnings
  • --output-html PATH: Output the result as HTML to this file too.
  • --help: Show this message and exit.
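
To make the --bnds-coord-indicators format concrete, the sketch below splits the semicolon-separated value and uses it to flag variable names. It is an illustration only: the helper names are hypothetical, and matching by substring is an assumption, since the text above does not say how the package applies the indicators.

```python
def split_indicators(raw: str = "bnds;bounds") -> tuple:
    """Split the semicolon-separated option value into individual indicators."""
    return tuple(part for part in raw.split(";") if part)


def is_bounds_variable(name: str, indicators: tuple) -> bool:
    """Flag a variable as a bounds coordinate.

    Assumption: an indicator matches if it appears anywhere in the name.
    """
    return any(indicator in name for indicator in indicators)


indicators = split_indicators()  # uses the documented default value
flagged = [
    name
    for name in ("tas", "lat_bnds", "time_bounds")
    if is_bounds_variable(name, indicators)
]
```

With the default value, `lat_bnds` and `time_bounds` are flagged while `tas` is not.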

input4mips-validation upload-ftp#

Upload files to an FTP server

We recommend running this with a log level of INFO to start, then adjusting from there.

Usage:

$ input4mips-validation upload-ftp [OPTIONS] TREE_ROOT

Arguments:

  • TREE_ROOT: The root of the tree to upload [required]

Options:

  • --ftp-dir-rel-to-root TEXT: Directory, relative to the FTP server's root directory for receiving files (see --ftp-dir-root), in which to upload the files on the FTP server. For example, "my-institute-input4mips". [required]
  • --password TEXT: Password to use when logging in. If you are uploading to LLNL's FTP server, please use your email address here. [required]
  • --username TEXT: Username to use when logging in to the server. [default: anonymous]
  • --ftp-server TEXT: FTP server to upload to. [default: ftp.llnl.gov]
  • --ftp-dir-root TEXT: Root directory on the FTP server for receiving files [default: /incoming]
  • --n-threads INTEGER: Number of threads to use during upload [default: 4]
  • --cv-source TEXT: String identifying the source of the CVs. If not supplied, this is retrieved from the environment variable INPUT4MIPS_VALIDATION_CV_SOURCE. If this environment variable is also not set, we raise a NotImplementedError. If this starts with 'gh:', we retrieve the data from PCMDI's GitHub, using everything after the colon as the ID for the Git object to use (where the ID can be a branch name, a tag or a commit ID). Otherwise we simply return the path as provided and use the validators package (https://validators.readthedocs.io/en/stable) to decide if the source points to a URL or not (i.e. whether we should look for the CVs locally or retrieve them from a URL).
  • --dry-run: Perform a dry run. In other words, don't actually upload the files, but show what would be uploaded.
  • --continue-on-error: Continue trying to upload the rest of the files, even if an error is raised while trying to upload a file.
  • --help: Show this message and exit.
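
How --ftp-dir-root and --ftp-dir-rel-to-root combine can be sketched as a plain POSIX path join. This is an assumption about how the two values relate, based only on the option descriptions above; the helper name `ftp_target_dir` is hypothetical and no FTP connection is made.

```python
import posixpath


def ftp_target_dir(
    ftp_dir_rel_to_root: str, ftp_dir_root: str = "/incoming"
) -> str:
    """Join the server's root directory with the user-chosen sub-directory.

    Assumption: the upload directory is simply root + relative directory.
    FTP paths use forward slashes, hence posixpath rather than os.path.
    """
    return posixpath.join(ftp_dir_root, ftp_dir_rel_to_root)


target = ftp_target_dir("my-institute-input4mips")
```

With the documented default root, the example directory from the option description resolves to `/incoming/my-institute-input4mips`.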

input4mips-validation db#

Usage:

$ input4mips-validation db [OPTIONS] COMMAND [ARGS]...

Options:

  • --help: Show this message and exit.

Commands:

  • create: Create a database from a tree of files
  • add-tree: Add files from a tree to a database
  • validate: Validate the entries in a database

input4mips-validation db create#

Create a database from a tree of files

Usage:

$ input4mips-validation db create [OPTIONS] TREE_ROOT

Arguments:

  • TREE_ROOT: The root of the tree for which to create the database [required]

Options:

  • --db-dir DIRECTORY: The directory in which to write the database entries. [required]
  • --cv-source TEXT: String identifying the source of the CVs. If not supplied, this is retrieved from the environment variable INPUT4MIPS_VALIDATION_CV_SOURCE. If this environment variable is also not set, we raise a NotImplementedError. If this starts with 'gh:', we retrieve the data from PCMDI's GitHub, using everything after the colon as the ID for the Git object to use (where the ID can be a branch name, a tag or a commit ID). Otherwise we simply return the path as provided and use the validators package (https://validators.readthedocs.io/en/stable) to decide if the source points to a URL or not (i.e. whether we should look for the CVs locally or retrieve them from a URL).
  • --bnds-coord-indicators TEXT: A semicolon (';') separated list of strings that indicate that a variable is a bounds coordinate. This helps us identify the input file's variables correctly in the absence of an agreed convention for doing this (xarray has a way, but it conflicts with the CF conventions and hence with iris). This interface gives limited control over this. For more complex control, use the Python API directly. [default: bnds;bounds]
  • --frequency-metadata-key TEXT: The key in the data's metadata which points to information about the data's frequency. [default: frequency]
  • --no-time-axis-frequency TEXT: The value of frequency_metadata_key in the metadata which indicates that the file has no time axis i.e. is fixed in time. [default: fx]
  • --time-dimension TEXT: The time dimension of the data. [default: time]
  • --rglob-input TEXT: String to use when applying rglob to find input files. [default: *.nc]
  • --n-processes INTEGER: Number of parallel processes to use [default: 1]
  • --mp-context-id [fork|spawn]: Multiprocessing context ID. In short, forking is faster and will preserve logging. However, it is harder to debug and only available on POSIX systems. In contrast, spawn is slower and will (probably) not preserve logging in sub-processes, but it will work on Windows. [default: fork]
  • --help: Show this message and exit.
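
The trade-off behind --mp-context-id maps onto the start methods of Python's standard `multiprocessing` module. The sketch below shows one way to check whether the requested start method exists on the current platform and fall back to spawn otherwise. The helper names are hypothetical and the fallback is an assumption for illustration; the text above does not say what the CLI does when fork is unavailable.

```python
import multiprocessing


def pick_mp_context_id(requested: str = "fork") -> str:
    """Return the requested start method if available, else 'spawn'.

    'fork' only exists on POSIX systems; 'spawn' is available everywhere.
    """
    if requested in multiprocessing.get_all_start_methods():
        return requested
    return "spawn"


def make_context(requested: str = "fork"):
    """Build a multiprocessing context without starting any processes."""
    return multiprocessing.get_context(pick_mp_context_id(requested))
```

A context created this way can then supply a process pool, for example `make_context("spawn").Pool(processes=4)`, guarded by `if __name__ == "__main__":` as spawn requires.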

input4mips-validation db add-tree#

Add files from a tree to a database

Usage:

$ input4mips-validation db add-tree [OPTIONS] TREE_ROOT

Arguments:

  • TREE_ROOT: The root of the tree from which to add entries to the database [required]

Options:

  • --db-dir DIRECTORY: The database's directory. [required]
  • --cv-source TEXT: String identifying the source of the CVs. If not supplied, this is retrieved from the environment variable INPUT4MIPS_VALIDATION_CV_SOURCE. If this environment variable is also not set, we raise a NotImplementedError. If this starts with 'gh:', we retrieve the data from PCMDI's GitHub, using everything after the colon as the ID for the Git object to use (where the ID can be a branch name, a tag or a commit ID). Otherwise we simply return the path as provided and use the validators package (https://validators.readthedocs.io/en/stable) to decide if the source points to a URL or not (i.e. whether we should look for the CVs locally or retrieve them from a URL).
  • --bnds-coord-indicators TEXT: A semicolon (';') separated list of strings that indicate that a variable is a bounds coordinate. This helps us identify the input file's variables correctly in the absence of an agreed convention for doing this (xarray has a way, but it conflicts with the CF conventions and hence with iris). This interface gives limited control over this. For more complex control, use the Python API directly. [default: bnds;bounds]
  • --frequency-metadata-key TEXT: The key in the data's metadata which points to information about the data's frequency. [default: frequency]
  • --no-time-axis-frequency TEXT: The value of frequency_metadata_key in the metadata which indicates that the file has no time axis i.e. is fixed in time. [default: fx]
  • --time-dimension TEXT: The time dimension of the data. [default: time]
  • --rglob-input TEXT: String to use when applying rglob to find input files. [default: *.nc]
  • --n-processes INTEGER: Number of parallel processes to use [default: 1]
  • --mp-context-id [fork|spawn]: Multiprocessing context ID. In short, forking is faster and will preserve logging. However, it is harder to debug and only available on POSIX systems. In contrast, spawn is slower and will (probably) not preserve logging in sub-processes, but it will work on Windows. [default: fork]
  • --help: Show this message and exit.
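
The --rglob-input pattern corresponds to a recursive glob over the tree. A minimal sketch with the standard library's `pathlib` is shown below; the helper name `find_input_files` is hypothetical, and sorting the results is an assumption made here for reproducibility rather than something the text above specifies.

```python
from pathlib import Path


def find_input_files(tree_root, rglob_input: str = "*.nc") -> list:
    """Recursively collect files under tree_root that match the pattern."""
    return sorted(Path(tree_root).rglob(rglob_input))
```

With the default pattern, every file ending in `.nc` at any depth below the root is returned, and other files are ignored.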

input4mips-validation db validate#

Validate the entries in a database

Usage:

$ input4mips-validation db validate [OPTIONS]

Options:

  • --db-dir DIRECTORY: The database's directory. [required]
  • --cv-source TEXT: String identifying the source of the CVs. If not supplied, this is retrieved from the environment variable INPUT4MIPS_VALIDATION_CV_SOURCE. If this environment variable is also not set, we raise a NotImplementedError. If this starts with 'gh:', we retrieve the data from PCMDI's GitHub, using everything after the colon as the ID for the Git object to use (where the ID can be a branch name, a tag or a commit ID). Otherwise we simply return the path as provided and use the validators package (https://validators.readthedocs.io/en/stable) to decide if the source points to a URL or not (i.e. whether we should look for the CVs locally or retrieve them from a URL).
  • --bnds-coord-indicators TEXT: A semicolon (';') separated list of strings that indicate that a variable is a bounds coordinate. This helps us identify the input file's variables correctly in the absence of an agreed convention for doing this (xarray has a way, but it conflicts with the CF conventions and hence with iris). This interface gives limited control over this. For more complex control, use the Python API directly. [default: bnds;bounds]
  • --frequency-metadata-key TEXT: The key in the data's metadata which points to information about the data's frequency. [default: frequency]
  • --no-time-axis-frequency TEXT: The value of frequency_metadata_key in the metadata which indicates that the file has no time axis i.e. is fixed in time. [default: fx]
  • --time-dimension TEXT: The time dimension of the data. [default: time]
  • --allow-cf-checker-warnings: Allow validation to pass, even if the CF-checker raises warnings
  • --n-processes INTEGER: Number of parallel processes to use [default: 1]
  • --mp-context-id [fork|spawn]: Multiprocessing context ID. In short, forking is faster and will preserve logging. However, it is harder to debug and only available on POSIX systems. In contrast, spawn is slower and will (probably) not preserve logging in sub-processes, but it will work on Windows. [default: fork]
  • --force: Force re-validation of all entries. This means that any previous validation of the entries is ignored.
  • --help: Show this message and exit.