The [validation toolkit](https://github.com/Big-Life-Lab/PHES-ODM-Validation) for the [Population Health Environmental Surveillance Open Data Model (PHES-ODM)](https://github.com/Big-Life-Lab/PHES-ODM) ensures your ODM data is complete and interoperable. Users can check whether their data meets the ODM dictionary format. # At a glance Validation is performed based on a set of rules defined in a schema. The `validate_data` function is used to perform the validation on `data` with the `schema` specified. ``` python schema = import_schema("schema.yml") errors = validate_data(schema, data) ``` # Overview The ODM library has the following features: - Validate any ODM data table as a CSV file. - Generate a report with warning and errors that indicate which data field(s) contain invalid data. - Validate any version of the ODM dicionary. - Users can add rules for their specific program. For example, you can add a list of valid testing sites for your surveillance program. - Users can request additional default validation rules. There are three parts to the ODM validation toolkit. 1. **Python functions to validate ODM-formatted data**. These functions check ODM data for missing, incomplete, or incompatible data. The functions return a list of errors and warnings. 2. **A list of validation rules**. The rules define what are valid ODM data. Examples of rules include [mandatory data fields](validation-rules/missing_mandatory_column.md) and valid data types. For example, every measurent must have a date the measurement was performed, and the [date type](validation-rules/invalid_type.md#datetime) must follow [ISO 8601](https://www.iso.org/iso-8601-date-and-time-format.html) format. Rules are combined together in a validation schema. 3. **A standard schema method**. The validation toolkit extends the [cerebrus](https://docs.python-cerberus.org/en/stable/index.html) validation method to provide a standard approach to defining validation rules that allows the rule list to be easily extended. For example, a wastewater surveillance program can can generate a list of the sample sites within their program; and then use the ODM validation toolkit to ensure all their data include a `siteID` with a valid idenfitication. See [tutorial X]() for how to extend the ODM schema. # Installation Make sure you have the latest version of Python installed on your system before starting. The required version can be found in pyproject.toml. If you only want to install the package, you can run the following command: `pip install "git+https://github.com/Big-Life-Lab/PHES-ODM-Validation.git@main"` If you want to set up a development environment instead, run the following commands: git clone https://github.com/Big-Life-Lab/PHES-ODM-Validation.git odm-validation cd odm-validation pip install -r ./requirements.txt pip install -r ./requirements-dev.txt pip install -r ./tests/requirements.txt pip install -e . Tests can be run with the following commands (after setting up a dev. env.): # static typechecking mypy # unittests python -m unittest discover ./tests The package documentation can be generated by following the instructions in docs/README.md. # Usage Quick ‘how to’ - defined as the steps to solve a specific problem. These compliment our tutorials that are learning oriented documentation for newcommers. ## Validate a specific ODM version ## Validate with you own schema ## Generate a schema from the ODM dictionary You can genereate your own schema directly from the ODM parts table. ``` python parts = import_dataset("parts.csv") schema = generate_cerberus_schema(parts) errors = validate_data(schema, data) ``` ## Make you own validation rule {example of validating a list of siteIDs} # Fequently asked questions and support Get support on the [ODM Discourse community forum](https://odm.discourse.group/c/support). Use the `#validate` in your question. Use the `#faq` and `#validate` to find frequently asked questions. See [Contributing](links/contributing_link.md) for how to suggest a new rule. Error or bugs can be reported on GitHub [Issues](https://github.com/Big-Life-Lab/PHES-ODM-Validation/issues).