input4mips-validation
Entrypoint for the command-line interface
Usage: input4mips-validation [OPTIONS] COMMAND [ARGS]...
Options:
--version: Print the version number and exit.
--no-logging: Disable all logging. If supplied, overrides '--logging-config'.
--logging-level TEXT: Logging level to use. This is only applied if no other logging configuration flags are supplied.
--logging-config PATH: Path to the logging configuration file. This will be loaded with loguru-config (https://github.com/erezinman/loguru-config). If supplied, this overrides any value provided with --logging-level. For a sample configuration file, see "How to configure logging with input4MIPs-validation?".
--install-completion: Install completion for the current shell.
--show-completion: Show completion for the current shell, to copy it or customize the installation.
--help: Show this message and exit.
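The precedence between the three logging flags can be easier to follow as code. The following is a minimal sketch of the rules as described above, not the tool's implementation; the function name `resolve_logging_mode` and its return strings are hypothetical.

```python
from pathlib import Path
from typing import Optional


def resolve_logging_mode(
    no_logging: bool = False,
    logging_config: Optional[Path] = None,
    logging_level: Optional[str] = None,
) -> str:
    """Illustrate which logging flag wins (hypothetical helper)."""
    # --no-logging overrides --logging-config (and everything else).
    if no_logging:
        return "disabled"
    # --logging-config overrides --logging-level.
    if logging_config is not None:
        return f"config-file:{logging_config}"
    # --logging-level only applies if no other logging flag was given.
    if logging_level is not None:
        return f"level:{logging_level}"
    # Nothing supplied: whatever the tool's default is.
    return "default"
```

For example, supplying both `--no-logging` and `--logging-config` resolves to disabled logging under this reading of the help text.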
Commands:
validate-file: Validate a single file
validate-tree: Validate a tree of files
upload-ftp: Upload files to an FTP server
db
input4mips-validation validate-file
Validate a single file
This validation is only partial
because some validation can only be performed if we have the entire file tree.
See the validate-tree command for this validation.
Usage: input4mips-validation validate-file [OPTIONS] FILE
Arguments:
FILE: The file to validate [required]
Options:
--cv-source TEXT: String identifying the source of the CVs. If not supplied, this is retrieved from the environment variable INPUT4MIPS_VALIDATION_CV_SOURCE. If this environment variable is also not set, we raise a NotImplementedError. If this starts with 'gh:', we retrieve the data from PCMDI's GitHub, using everything after the colon as the ID for the Git object to use (where the ID can be a branch name, a tag or a commit ID). Otherwise we simply return the path as provided and use the validators (https://validators.readthedocs.io/en/stable) package to decide if the source points to a URL or not (i.e. whether we should look for the CVs locally or retrieve them from a URL).
--write-in-drs PATH: If supplied, the file will be re-written into the DRS if it passes validation. The supplied value is assumed to be the root directory into which to write the file (following the DRS).
--bnds-coord-indicators TEXT: A semi-colon (';') separated list of strings that indicate that a variable is a bounds co-ordinate. This helps us identify infile's variables correctly in the absence of an agreed convention for doing this (xarray has a way, but it conflicts with the CF-conventions, hence iris, so here we are). This interface gives limited control over this. For more complex control, use the Python API directly. [default: bnds;bounds]
--frequency-metadata-key TEXT: The key in the data's metadata which points to information about the data's frequency. [default: frequency]
--no-time-axis-frequency TEXT: The value of frequency_metadata_key in the metadata which indicates that the file has no time axis, i.e. is fixed in time. [default: fx]
--time-dimension TEXT: The time dimension of the data. [default: time]
--allow-cf-checker-warnings: Allow validation to pass, even if the CF-checker raises warnings.
--help: Show this message and exit.
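The resolution order described for --cv-source can be sketched as follows. This is an illustration under assumptions rather than the package's code: the function name `classify_cv_source` and the returned labels are hypothetical, and the standard library's `urllib.parse` stands in for the validators package when deciding whether a string is a URL.

```python
import os
from typing import Mapping, Optional, Tuple
from urllib.parse import urlparse

ENV_VAR = "INPUT4MIPS_VALIDATION_CV_SOURCE"


def classify_cv_source(
    cv_source: Optional[str] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> Tuple[str, str]:
    """Return a (kind, value) pair for a CV source (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    # Fall back to the environment variable if no value was supplied.
    if cv_source is None:
        cv_source = environ.get(ENV_VAR)
    if cv_source is None:
        raise NotImplementedError(f"Supply --cv-source or set {ENV_VAR}")
    # 'gh:<id>' means: fetch from GitHub at the given branch, tag or commit.
    if cv_source.startswith("gh:"):
        return ("github", cv_source[len("gh:"):])
    # Assumption: a simple scheme check approximates the URL test.
    parsed = urlparse(cv_source)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return ("url", cv_source)
    return ("local", cv_source)
```

Under these assumptions, `classify_cv_source("gh:main")` is treated as a Git reference named `main`, while a plain directory path is treated as a local source.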
input4mips-validation validate-tree
Validate a tree of files
This checks things like whether all external variables are also provided and all tracking IDs are unique.
Usage: input4mips-validation validate-tree [OPTIONS] TREE_ROOT
Arguments:
TREE_ROOT: The root of the tree to validate [required]
Options:
--cv-source TEXT: String identifying the source of the CVs. If not supplied, this is retrieved from the environment variable INPUT4MIPS_VALIDATION_CV_SOURCE. If this environment variable is also not set, we raise a NotImplementedError. If this starts with 'gh:', we retrieve the data from PCMDI's GitHub, using everything after the colon as the ID for the Git object to use (where the ID can be a branch name, a tag or a commit ID). Otherwise we simply return the path as provided and use the validators (https://validators.readthedocs.io/en/stable) package to decide if the source points to a URL or not (i.e. whether we should look for the CVs locally or retrieve them from a URL).
--bnds-coord-indicators TEXT: A semi-colon (';') separated list of strings that indicate that a variable is a bounds co-ordinate. This helps us identify infile's variables correctly in the absence of an agreed convention for doing this (xarray has a way, but it conflicts with the CF-conventions, hence iris, so here we are). This interface gives limited control over this. For more complex control, use the Python API directly. [default: bnds;bounds]
--frequency-metadata-key TEXT: The key in the data's metadata which points to information about the data's frequency. [default: frequency]
--no-time-axis-frequency TEXT: The value of frequency_metadata_key in the metadata which indicates that the file has no time axis, i.e. is fixed in time. [default: fx]
--time-dimension TEXT: The time dimension of the data. [default: time]
--rglob-input TEXT: String to use when applying rglob to find input files. [default: *.nc]
--allow-cf-checker-warnings: Allow validation to pass, even if the CF-checker raises warnings.
--output-html PATH: Output the result as HTML to this file too.
--help: Show this message and exit.
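To make the --rglob-input and --bnds-coord-indicators values more concrete, here is a small sketch of how such strings are typically interpreted in Python. Both helper names are hypothetical, and how the tool matches the indicators against variable names is not specified here, so only the splitting step is shown.

```python
from pathlib import Path
from typing import List, Tuple


def find_input_files(tree_root: Path, rglob_input: str = "*.nc") -> List[Path]:
    """Recursively collect files under tree_root that match the pattern."""
    # Path.rglob searches tree_root and all of its sub-directories.
    return sorted(p for p in tree_root.rglob(rglob_input) if p.is_file())


def split_bnds_indicators(raw: str = "bnds;bounds") -> Tuple[str, ...]:
    """Split the semi-colon separated option value into its parts."""
    return tuple(part for part in raw.split(";") if part)
```

With the defaults, every `.nc` file anywhere below the tree root is picked up, and the indicator string yields the two entries `bnds` and `bounds`.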
input4mips-validation upload-ftp
Upload files to an FTP server
We recommend running this with a log level of INFO to start, then adjusting from there.
Usage: input4mips-validation upload-ftp [OPTIONS] TREE_ROOT
Arguments:
TREE_ROOT: The root of the tree to upload [required]
Options:
--ftp-dir-rel-to-root TEXT: Directory, relative to root_dir_ftp_incoming_files, in which to upload the files on the FTP server. For example, "my-institute-input4mips". [required]
--password TEXT: Password to use when logging in. If you are uploading to LLNL's FTP server, please use your email address here. [required]
--username TEXT: Username to use when logging in to the server. [default: anonymous]
--ftp-server TEXT: FTP server to upload to. [default: ftp.llnl.gov]
--ftp-dir-root TEXT: Root directory on the FTP server for receiving files. [default: /incoming]
--n-threads INTEGER: Number of threads to use during upload. [default: 4]
--cv-source TEXT: String identifying the source of the CVs. If not supplied, this is retrieved from the environment variable INPUT4MIPS_VALIDATION_CV_SOURCE. If this environment variable is also not set, we raise a NotImplementedError. If this starts with 'gh:', we retrieve the data from PCMDI's GitHub, using everything after the colon as the ID for the Git object to use (where the ID can be a branch name, a tag or a commit ID). Otherwise we simply return the path as provided and use the validators (https://validators.readthedocs.io/en/stable) package to decide if the source points to a URL or not (i.e. whether we should look for the CVs locally or retrieve them from a URL).
--dry-run: Perform a dry run. In other words, don't actually upload the files, but show what would be uploaded.
--continue-on-error: Continue trying to upload the rest of the files, even if an error is raised while trying to upload a file.
--help: Show this message and exit.
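A dry run at log level INFO is a sensible first step before a real upload. The sketch below only assembles the argument list for such a call, using the flags listed above; the helper name `build_upload_ftp_command` is hypothetical, and placing --logging-level before the sub-command assumes the usual Click/Typer convention for top-level options.

```python
from typing import List, Optional


def build_upload_ftp_command(
    tree_root: str,
    ftp_dir_rel_to_root: str,
    password: str,
    cv_source: Optional[str] = None,
    n_threads: int = 4,
    dry_run: bool = True,
) -> List[str]:
    """Assemble (but do not run) an upload-ftp invocation."""
    cmd = [
        "input4mips-validation",
        # Top-level option: assumed to go before the sub-command.
        "--logging-level", "INFO",
        "upload-ftp", tree_root,
        "--ftp-dir-rel-to-root", ftp_dir_rel_to_root,
        "--password", password,
        "--n-threads", str(n_threads),
    ]
    if cv_source is not None:
        cmd += ["--cv-source", cv_source]
    if dry_run:
        # Show what would be uploaded without uploading anything.
        cmd.append("--dry-run")
    return cmd
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` once the dry-run output looks right and `dry_run` is set to `False`.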
input4mips-validation db
Usage: input4mips-validation db [OPTIONS] COMMAND [ARGS]...
Options:
--help: Show this message and exit.
Commands:
create: Create a database from a tree of files
add-tree: Add files from a tree to a database
validate: Validate the entries in a database
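A typical sequence is to create a database from a tree and then validate its entries. As a rough sketch, the two invocations could be assembled like this; the helper name `build_db_commands` is hypothetical, and only flags documented for the db sub-commands are used.

```python
from typing import List, Optional


def build_db_commands(
    tree_root: str,
    db_dir: str,
    cv_source: Optional[str] = None,
    n_processes: int = 1,
) -> List[List[str]]:
    """Assemble (but do not run) 'db create' followed by 'db validate'."""
    # Options accepted by both sub-commands.
    shared = ["--db-dir", db_dir, "--n-processes", str(n_processes)]
    if cv_source is not None:
        shared += ["--cv-source", cv_source]
    base = ["input4mips-validation", "db"]
    return [
        # 'create' takes the tree root as its positional argument.
        base + ["create", tree_root] + shared,
        # 'validate' works on the database directory only.
        base + ["validate"] + shared,
    ]
```

Each inner list can be handed to `subprocess.run` in order; to add a further tree later, the same pattern applies with `add-tree` in place of `create`.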
input4mips-validation db create
Create a database from a tree of files
Usage: input4mips-validation db create [OPTIONS] TREE_ROOT
Arguments:
TREE_ROOT: The root of the tree for which to create the database [required]
Options:
--db-dir DIRECTORY: The directory in which to write the database entries. [required]
--cv-source TEXT: String identifying the source of the CVs. If not supplied, this is retrieved from the environment variable INPUT4MIPS_VALIDATION_CV_SOURCE. If this environment variable is also not set, we raise a NotImplementedError. If this starts with 'gh:', we retrieve the data from PCMDI's GitHub, using everything after the colon as the ID for the Git object to use (where the ID can be a branch name, a tag or a commit ID). Otherwise we simply return the path as provided and use the validators (https://validators.readthedocs.io/en/stable) package to decide if the source points to a URL or not (i.e. whether we should look for the CVs locally or retrieve them from a URL).
--bnds-coord-indicators TEXT: A semi-colon (';') separated list of strings that indicate that a variable is a bounds co-ordinate. This helps us identify infile's variables correctly in the absence of an agreed convention for doing this (xarray has a way, but it conflicts with the CF-conventions, hence iris, so here we are). This interface gives limited control over this. For more complex control, use the Python API directly. [default: bnds;bounds]
--frequency-metadata-key TEXT: The key in the data's metadata which points to information about the data's frequency. [default: frequency]
--no-time-axis-frequency TEXT: The value of frequency_metadata_key in the metadata which indicates that the file has no time axis, i.e. is fixed in time. [default: fx]
--time-dimension TEXT: The time dimension of the data. [default: time]
--rglob-input TEXT: String to use when applying rglob to find input files. [default: *.nc]
--n-processes INTEGER: Number of parallel processes to use. [default: 1]
--mp-context-id [fork|spawn]: Multiprocessing context ID. In short, forking is faster and will preserve logging. However, it is harder to debug and only available on POSIX systems. In contrast, spawn is slower and will (probably) not preserve logging in sub-processes, but it will work on Windows. [default: fork]
--help: Show this message and exit.
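The fork/spawn choice maps onto the start methods of Python's standard multiprocessing module. The sketch below shows one way to check which start methods the current platform offers before choosing a value for --mp-context-id; the helper name `pick_mp_context_id` and the fallback to spawn are assumptions for illustration, not behaviour of the tool.

```python
import multiprocessing


def pick_mp_context_id(requested: str = "fork") -> str:
    """Return the requested start method if available, else 'spawn'."""
    # 'fork' is only offered on POSIX systems; 'spawn' exists everywhere.
    available = multiprocessing.get_all_start_methods()
    if requested in available:
        return requested
    # Assumption: falling back to 'spawn' is acceptable for the caller.
    return "spawn"


# A context object for the chosen start method, e.g. to create a Pool.
context = multiprocessing.get_context(pick_mp_context_id("fork"))
```

On Linux this would normally select fork, whereas on Windows the same call selects spawn, matching the trade-off described in the option's help text.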
input4mips-validation db add-tree
Add files from a tree to a database
Usage: input4mips-validation db add-tree [OPTIONS] TREE_ROOT
Arguments:
TREE_ROOT: The root of the tree from which to add entries to the database [required]
Options:
--db-dir DIRECTORY: The database's directory. [required]
--cv-source TEXT: String identifying the source of the CVs. If not supplied, this is retrieved from the environment variable INPUT4MIPS_VALIDATION_CV_SOURCE. If this environment variable is also not set, we raise a NotImplementedError. If this starts with 'gh:', we retrieve the data from PCMDI's GitHub, using everything after the colon as the ID for the Git object to use (where the ID can be a branch name, a tag or a commit ID). Otherwise we simply return the path as provided and use the validators (https://validators.readthedocs.io/en/stable) package to decide if the source points to a URL or not (i.e. whether we should look for the CVs locally or retrieve them from a URL).
--bnds-coord-indicators TEXT: A semi-colon (';') separated list of strings that indicate that a variable is a bounds co-ordinate. This helps us identify infile's variables correctly in the absence of an agreed convention for doing this (xarray has a way, but it conflicts with the CF-conventions, hence iris, so here we are). This interface gives limited control over this. For more complex control, use the Python API directly. [default: bnds;bounds]
--frequency-metadata-key TEXT: The key in the data's metadata which points to information about the data's frequency. [default: frequency]
--no-time-axis-frequency TEXT: The value of frequency_metadata_key in the metadata which indicates that the file has no time axis, i.e. is fixed in time. [default: fx]
--time-dimension TEXT: The time dimension of the data. [default: time]
--rglob-input TEXT: String to use when applying rglob to find input files. [default: *.nc]
--n-processes INTEGER: Number of parallel processes to use. [default: 1]
--mp-context-id [fork|spawn]: Multiprocessing context ID. In short, forking is faster and will preserve logging. However, it is harder to debug and only available on POSIX systems. In contrast, spawn is slower and will (probably) not preserve logging in sub-processes, but it will work on Windows. [default: fork]
--help: Show this message and exit.
input4mips-validation db validate
Validate the entries in a database
Usage: input4mips-validation db validate [OPTIONS]
Options:
--db-dir DIRECTORY: The database's directory. [required]
--cv-source TEXT: String identifying the source of the CVs. If not supplied, this is retrieved from the environment variable INPUT4MIPS_VALIDATION_CV_SOURCE. If this environment variable is also not set, we raise a NotImplementedError. If this starts with 'gh:', we retrieve the data from PCMDI's GitHub, using everything after the colon as the ID for the Git object to use (where the ID can be a branch name, a tag or a commit ID). Otherwise we simply return the path as provided and use the validators (https://validators.readthedocs.io/en/stable) package to decide if the source points to a URL or not (i.e. whether we should look for the CVs locally or retrieve them from a URL).
--bnds-coord-indicators TEXT: A semi-colon (';') separated list of strings that indicate that a variable is a bounds co-ordinate. This helps us identify infile's variables correctly in the absence of an agreed convention for doing this (xarray has a way, but it conflicts with the CF-conventions, hence iris, so here we are). This interface gives limited control over this. For more complex control, use the Python API directly. [default: bnds;bounds]
--frequency-metadata-key TEXT: The key in the data's metadata which points to information about the data's frequency. [default: frequency]
--no-time-axis-frequency TEXT: The value of frequency_metadata_key in the metadata which indicates that the file has no time axis, i.e. is fixed in time. [default: fx]
--time-dimension TEXT: The time dimension of the data. [default: time]
--allow-cf-checker-warnings: Allow validation to pass, even if the CF-checker raises warnings.
--n-processes INTEGER: Number of parallel processes to use. [default: 1]
--mp-context-id [fork|spawn]: Multiprocessing context ID. In short, forking is faster and will preserve logging. However, it is harder to debug and only available on POSIX systems. In contrast, spawn is slower and will (probably) not preserve logging in sub-processes, but it will work on Windows. [default: fork]
--force: Force re-validation of all entries. This means that any previous validation of the entries is ignored.
--help: Show this message and exit.