ODM How-to’s

This file specifies methods for navigating the ODM data dictionary that can be referenced by different specification files in this repo.

How to get the names of tables that are part of the ODM

We can use the partID and partType columns for this. Any part ID whose partType is tables is considered to be a table. For example in the following dictionary,

[
    {
        "partID": "addresses",
        "partType": "tables"
    },
    {
        "partID": "measures",
        "partType": "tables"
    },
    {
        "partID": "addId",
        "partType": "attributes"
    }
]

addresses and measures are table names.

How to get the columns names for a table

Once we have the table names part of the dictionary, the dictionary has columns that match with these table names giving information on whether a part is a column for it. For example in the dictionary snippet below,

[
    {
        "partID": "addresses",
        "partType": "tables",
        "addresses": "NA",
        "contacts": "NA"
    },
    {
        "partID": "addId",
        "partType": "attributes",
        "addresses": "pK",
        "contacts": "NA"
    },
    {
        "partID": "addL1",
        "partType": "attributes",
        "addresses": "addL1",
        "contacts": "NA"
    },
    {
        "partID": "contacts",
        "partType": "tables",
        "addresses": "addL1",
        "contacts": "NA"
    },
    {
        "partID": "contID",
        "partType": "attributes",
        "addresses": "NA",
        "contacts": "pK"
    },
]

The tables are addresses and contacts and the addresses and contacts columns give information on which parts are their columns. Here, addId and addL1 are columns in the addresses table and contID is a column in the contacts table. A part is a column in a table if it does not have a value of NA or input in the table column. In addition, the table column has additional possible values to describe the role a column plays in the table. All the possible values are,

  • pK: The column is the primary key for the table

  • fK The column is the foreign referencing a row in another table

  • header: No special role

  • input:

  • NA: The part is a not a column in the table

Getting the version 1 equivalent for a part

The parts sheet in the ODM dictionary has columns that indicate whether a part has a ODM version 1 equivalent. These columns can indicate whether a part was a table, column, or category in version 1 or if it does not have an equivalent in version 1. The column names and their metadata are shown below,

  • version1Location: Whether this part was a table, column, or category in version 1 or NA if it does not have a version 1 equivalent. The possible values are:

    • tables: This part was a table in version 1

    • variables: This part was a column in version 1

    • variableCategories: This was was a variable category in version 1

  • version1Table: The name of the version 1 table or NA. If version1Location is tables then this contains the version 1 table name for this part otherwise its the name of the table that the version 1 column or variable category belongs to.

  • version1Variable: The name of the version 1 column or NA. If version1Location is variables then this contains the version 1 column name for this part otherwise its the name of the column that the version 1 variable category belongs to.

  • version1Category: The name of the version 1 category or NA.

A part can have multiple version 1 equivalents each of which are seperated by a semi-colon (;) for example Site;WWMeasure.

Examples are given below.

The instruments part in version 2 is a table in version 1 called Instrument

[
    {
        "partID": "instruments",
        "version1Location": "tables",
        "version1Table": "Instrument",
        "version1Variable": "NA",
        "version1Category": "NA"
    }
]

The measRepID part in version 2 is a column in version 1 called uWwMeasureID in the WWMeasure version 1 table

[
    {
        "partID": "measRepID",
        "version1Location": "variables",
        "version1Table": "WWMeasure",
        "version1Variable": "uWwMeasureID",
        "version1Category": "NA"
    }
]

The refLink part in version 2 is a column in version 1 called referenceLink in the AssayMethod and Instrument version 1 tables

[
    {
        "partID": "refLink",
        "version1Location": "variables",
        "version1Table": "AssayMethod;Instrument",
        "version1Variable": "refLink",
        "version1Category": "NA"
    }
]

Finally, the airTemp part in version 2 is a category in version 1 called envTemp in the SiteMeasure version 1 table for the type column.

[
    {
        "partID": "airTemp",
        "version1Location": "variableCategories",
        "version1Table": "SiteMeasure",
        "version1Variable": "type",
        "version1Category": "envTemp"
    }
]

Checking if a column is mandatory for a table

This can be done by looking at the \<table_name\>Required columns in the parts sheet. The table names will need to be retreived first to identify these columns. The possible values for the \<table_name\>Required columns are:

  • mandatory: This column is mandatory

  • optional: This column is optional

  • NA: This column is not valid for this table

For example in the ODM snippet below,

[
    {
        "partID": "geoLat",
        "sites": "header",
        "sitesRequired": "mandatory"
    },
    {
        "partID": "geoLong",
        "sites": "header",
        "sitesRequired": "optional"
    },
    {
        "partID": "measureID",
        "sites": "NA",
        "sitesRequired": "NA"
    }
]

The geoLat and geoLong columns in the sites table are mandatory and optional required while the measureID column is not present in the sites table.

Getting the data type for a column

Once we have the columns for a table, we can get its data type by using the dataType column in the parts sheet. The column has one of the following values:

  • varchar

  • integer

  • float

  • boolean

  • categorical

  • blob

  • datetime

  • email

  • see measure: The data type for this part depends on the measure