# greater_than_max_length This rule ensures that the number of characters in a string type column is not greater than a defined maximum value. As an example, consider the `phone` column in the `contacts` table whose maximum length value is 10. The following dataset snippet would fail validation,
                                                  Invalid Dataset                                                  
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ contactID                                           phone                                                      ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 1                                                  │ 123-456-111                                                │
└────────────────────────────────────────────────────┴────────────────────────────────────────────────────────────┘
The `phone` column in the first row has 11 characters which is greater than the maximum length value of 10, whereas the following dataset snippet would pass validation,
                                                   Valid Dataset                                                   
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ contactID                                             phone                                                    ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ 1                                                    │ 123-456-23                                               │
│ 2                                                    │ 123-456-2                                                │
└──────────────────────────────────────────────────────┴──────────────────────────────────────────────────────────┘
The `phone` columns in both rows have a number of characters equal to 10 and 9 respectively which is less than or equal to the maximum length value of 10. ## Error report The error report should have the following fields, - **errorType**: greater_than_max_length - **tableName**: The name of the table with the invalid value - **columnName**: The name of the column with the invalid value - **rowNumber**: The index of the row with the invalid value - **row**: The dictionary containing the invalid row - **invalidValue**: The invalid value - **validationRuleFields**: The ODM data dictionary rows used to generate this rule - **message**: Value \ in row \ in column \ in table \ has length \ which is greater than the max length of \ The error report object for the example invalid row above is shown below,
{
'errors': [
│   │   {
│   │   │   'errorType': 'greater_than_max_length',
│   │   │   'tableName': 'contacts',
│   │   │   'columnName': 'phone',
│   │   │   'rowNumber': 1,
│   │   │   'row': {
│   │   │   │   'contactID': '1',
│   │   │   │   'phone': '123-456-111'
│   │   │   },
│   │   │   'invalidValue': '123-456-111',
│   │   │   'validationRuleFields': [
│   │   │   │   {
│   │   │   │   │   'partID': 'phone',
│   │   │   │   │   'maxLength': '10',
│   │   │   │   │   'contacts': 'header'
│   │   │   │   }
│   │   │   ],
│   │   │   'message': 'greater_than_max_length rule violated in table contacts, column phone, row(s) 1: Value "123-456-111" (of length 11) exceeds the max length of "10"'
│   │   }
],
'warnings': []
}
## Rule metadata All the metadata for this rule is contained in the parts sheet in the ODM dictionary. The steps involved are: 1. [Get the table names in the dictionary](../specs/odm-how-tos.md#how-to-get-the-names-of-tables-that-are-part-of-the-odm) 2. [Get all the columns for each table](../specs/odm-how-tos.md#how-to-get-the-names-of-tables-that-are-part-of-the-odm) 3. Filter the rows to only include those whose [data type is a string](../specs/odm-how-tos.md#getting-the-data-type-for-a-column). Currently, these are `varchar` columns. 4. Check whether the rows has a maximum length value. If it does then add the validation rule for that column. - The `maxLength` column contains this metadata - The possible values are a number or `NA` - If the value is `NA` then this column does not have a maximum length value For example in the following ODM dictionary snippet,
                                                     Parts v2                                                      
┏━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ partID      partType     addresses   contacts   dataType   maxLength  status  firstRelease  lastUpdated ┃
┡━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ addresses  │ tables      │ NA         │ NA        │ NA        │ NA        │ active │ 1.0.0        │ 2.0.0       │
│ addL2      │ attributes  │ header     │ NA        │ varchar   │ NA        │ active │ 1.0.0        │ 2.0.0       │
│ contacts   │ tables      │ NA         │ NA        │ NA        │ NA        │ active │ 1.0.0        │ 2.0.0       │
│ phone      │ attributes  │ NA         │ header    │ varchar   │ 10        │ active │ 1.0.0        │ 2.0.0       │
└────────────┴─────────────┴────────────┴───────────┴───────────┴───────────┴────────┴──────────────┴─────────────┘
The `phone` column in the `contacts` table would have this validation rule but the `addL2` column in the `addresses` table would not. ## Cerberus Schema We can use the [maxlength](https://docs.python-cerberus.org/en/stable/validation-rules.html#minlength-maxlength) validation rule in the cerberus library to implement this. The generated cerberus object for the example above is shown below,
{
'schemaVersion': '2.0.0',
'schema': {
│   │   'contacts': {
│   │   │   'type': 'list',
│   │   │   'schema': {
│   │   │   │   'type': 'dict',
│   │   │   │   'schema': {
│   │   │   │   │   'phone': {
│   │   │   │   │   │   'maxlength': 10,
│   │   │   │   │   │   'meta': [
│   │   │   │   │   │   │   {
│   │   │   │   │   │   │   │   'ruleID': 'greater_than_max_length',
│   │   │   │   │   │   │   │   'meta': [
│   │   │   │   │   │   │   │   │   {
│   │   │   │   │   │   │   │   │   │   'partID': 'phone',
│   │   │   │   │   │   │   │   │   │   'maxLength': '10',
│   │   │   │   │   │   │   │   │   │   'contacts': 'header'
│   │   │   │   │   │   │   │   │   }
│   │   │   │   │   │   │   │   ]
│   │   │   │   │   │   │   }
│   │   │   │   │   │   ]
│   │   │   │   │   }
│   │   │   │   },
│   │   │   │   'meta': [
│   │   │   │   │   {
│   │   │   │   │   │   'partID': 'contacts',
│   │   │   │   │   │   'partType': 'tables'
│   │   │   │   │   }
│   │   │   │   ]
│   │   │   }
│   │   }
}
}
The metadata for this rule should have the row from the ODM parts sheet for the column with its `partID` and `maxLength` column values. ## ODM Version 1 When generating the schema for version 1, we check whether the column has an [equivalent part in version 1](../specs/odm-how-tos.md#getting-the-version-1-equivalent-for-a-part). If it does, then we add it to the cerberus schema. For example,
                                                     Parts v1                                                      
┏━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┓
┃ partID   partT…  addres…  conta…  dataTy…  maxLe…  versio…  versi…  versio…  status  first…  lastUp… ┃
┡━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━╇━━━━━━━━╇━━━━━━━━━┩
│ addres… │ tables │ NA      │ NA     │ NA      │ NA     │ tables  │ Addre… │ NA      │ active │ 1.0.0  │ 2.0.0   │
│ addL2   │ attri… │ header  │ NA     │ varchar │ NA     │ variab… │ Addre… │ AddLin… │ active │ 1.0.0  │ 2.0.0   │
│ contac… │ tables │ NA      │ NA     │ NA      │ NA     │ tables  │ Conta… │ NA      │ active │ 1.0.0  │ 2.0.0   │
│ phone   │ attri… │ NA      │ header │ varchar │ 10     │ variab… │ Conta… │ phoneN… │ active │ 1.0.0  │ 2.0.0   │
└─────────┴────────┴─────────┴────────┴─────────┴────────┴─────────┴────────┴─────────┴────────┴────────┴─────────┘
The corresponding cerberus schema would be,
{
'schemaVersion': '1.0.0',
'schema': {
│   │   'Contact': {
│   │   │   'type': 'list',
│   │   │   'schema': {
│   │   │   │   'type': 'dict',
│   │   │   │   'schema': {
│   │   │   │   │   'phoneNumber': {
│   │   │   │   │   │   'maxlength': 10,
│   │   │   │   │   │   'meta': [
│   │   │   │   │   │   │   {
│   │   │   │   │   │   │   │   'ruleID': 'greater_than_max_length',
│   │   │   │   │   │   │   │   'meta': [
│   │   │   │   │   │   │   │   │   {
│   │   │   │   │   │   │   │   │   │   'partID': 'phone',
│   │   │   │   │   │   │   │   │   │   'contacts': 'header',
│   │   │   │   │   │   │   │   │   │   'maxLength': '10',
│   │   │   │   │   │   │   │   │   │   'version1Location': 'variables',
│   │   │   │   │   │   │   │   │   │   'version1Table': 'Contact',
│   │   │   │   │   │   │   │   │   │   'version1Variable': 'phoneNumber'
│   │   │   │   │   │   │   │   │   }
│   │   │   │   │   │   │   │   ]
│   │   │   │   │   │   │   }
│   │   │   │   │   │   ]
│   │   │   │   │   }
│   │   │   │   },
│   │   │   │   'meta': [
│   │   │   │   │   {
│   │   │   │   │   │   'partID': 'contacts',
│   │   │   │   │   │   'partType': 'tables',
│   │   │   │   │   │   'version1Location': 'tables',
│   │   │   │   │   │   'version1Table': 'Contact'
│   │   │   │   │   }
│   │   │   │   ]
│   │   │   }
│   │   }
}
}
The metadata should include the following columns, - The `partID` column value - The `maxLength` column value - The `version1Location` column value - The `version1Table` column value - The `version1Variable` column value