Data quality checks before upgrading to V6.0
Invalid / space coordinates (mandatory)
| How To |
|---|
|
{{#if:Space coordinates | |}}{{#if:MGRS coordinates | |}}{{#if:Coordinate display rules | |}}{{#if:Point ID Length | |}}{{#if:Missing approval information | |}}{{#if:Missing Date of Information | |}}{{#if:Missing Local ID | |}}{{#if:Templates | |}}{{#if:DIM Categories | |}}{{#if:Enum categories and values | |}}{{#if:Translations | |}}{{#if:CDF display format | |}}{{#if:CDF missing in field | |}}{{#if:Duplicate CDF | |}}{{#if:CDF not used | |}}{{#if:Empty CDF | |}}{{#if:Duplicate allow value | |}}{{#if:Link invalid date | |}}{{#if:Country structure orphans | |}}{{#if:Country structure duplicates | |}}{{#if:|
|
It was not possible to upgrade to V5.08.04 if there are invalid coordinates in the database. Please run this SQL query anyhow because importing may overrun application rules. If you get any rows here, you have more SQL queries that will identify which object has the invalid coordinates in “Invalid_coordinates_ver2.doc” on PI.
MGRS coordinates (mandatory)
In V5.xxx it was possible to enter MGRS with wrong format. The first query checks for coordinates written with lower-case letters e.g. 4qfj12345678 and the second query checks for blanks e.g. 4 QFJ 1234 5678.
Coordinates display rules
In IMSMA NG there are 3 rules on how to handle and display coordinates, based on how they were entered;
- MGRS,
- Distance/Bearing
- X and Y.
When data has been migrated or imported it is possible that the column userinputformat in the table geopoint, which controls coordinate display, has not been set correctly. When the display field is set correctly the columns are high-lighted with yellow. If the user tries to edit a row which does not have one of the rules set, the coordinates will not be populated in the Point window. This issue only affects the edit of Auxiliary data coordinates and Data Entry Forms in the Workbench (query).
Length of Point ID
Point (local) ID should be unique and clearly identify single and polygon points. But it is possible to type blanks instead of characters/numbers. If spaces have been used it is not possible for the users to see difference when entering/editing the points. If you get length 0, 1 or 2 I recommend to look into the values of Point ID (query).
Missing approval information (mandatory)
Due to changed behaviour of migration/import scripts in version 5.08.02 and a bug in 5.08.04 approval information may be missing. Note that Last updated gets updated e.g. when templates are switched and should not be used for setting Approved Date. If the Data Entry Form that misses Approved Date is an Activity or Education then you might be able to use End date (query).
| Column | Old label | New label |
|---|---|---|
| reportReceivedDate | Initiated | Data Entry Date |
| reportCompletedDate | Submitted | Submitted Date |
| reportVerifiedDate | Approved | Approved Date |
| dateofreport | Date of Report | Date of Information |
| dataentrydate | Last updated | (no change) |
Missing Date of Information (mandatory)
Date of Information (stored in table fieldreport) is used when IMSMANG calculates the Summary (former Current view). If this data is missing then the result may not be as the expected when the Summary is updated. Note that Last updated gets updated e.g. when templates are switched and should not be used for setting Date of Information (query).
Missing Local ID (mandatory)
In some database also Form ID and the items' local ID have been missing. The queries do not include Country structure, Organisation, Place, Task and WorkItem.
Data Entry Form Templates
Some countries have many published templates that have never been used which makes it difficult to know which template to update etc. It is also good to know which templates that have been used for data entry in case you need to update or switch them (queries).
DIM categories (mandatory)
If categories have been deleted there will be errors when the upgrade script is applied. As a quick indication if there will be errors, numbers of categories may be used. There should be at least 52 categories.
These 5 categories for Task and Work Item should exists (queries).
| Parent | Category |
|---|---|
| TASK | GENERAL_INFO |
| TASK | PLANNING_MONITORING |
| TASK | UNCATEGORISED |
| WORK_ITEM | GENERAL_INFO |
| WORK_ITEM | UNCATEGORISED |
Number of enumeration categories and values (mandatory)
If the country has deleted standard enums from the table imsmaenum then upgrade scripts will not give the expected result which will create a lot of problems. As a quick indication if there will be errors, numbers of categories and values may be used. There should be at least 117 categories and 978 values in the result set of these queries.
Number of translations (mandatory)
If the country has deleted standard translations from the table translation then upgrade scripts will not give the expected result which will create a lot of problems. The result of this query should be at least 1269 (English) translations.
CDF display format (mandatory)
The combination of cdf_datatype in table customdefinedfield and fieldtype in table field must be correct. When data type has been changed directly in table customdefinedfield or data has been migrated these combination might be wrong.
Common errors are.
- display_mechanism for multi select is set to RADIO_BUTTON;
- no display_mechanism is set for single select fields;
- wrong fieldtype for multi and single selects;
- display_mechanism is "" i.e. length is 0.
The combinations in the table below is the correct 5.08.04 values
| fieldtype | display_mechanism | cdf_datatype |
|---|---|---|
| cdf | (null) | DATE |
| cdf | (null) | GAZETTEER |
| cdf | (null) | MULTI_SELECT |
| cdf | (null) | NUMBER |
| cdf | (null) | ORGANISATION |
| cdf | (null) | PLACE |
| select_cdf | COMBO_BOX | SINGLE_SELECT |
| select_cdf | RADIO_BUTTONS | SINGLE_SELECT |
| cdf | (null) | TEXT_FIELD |
CDF missing in table field (mandatory)
We have had a few cases where CDFs have been missing in the table field or in the table customdefinedfield.
Duplicate CDFs (mandatory)
In 5.08.04 it was allowed that CDFs had the same name if they had different data type.
CDF never used for Data Entry
Note that a CDF may be included in Data Entry templates, Summary templates, Saved searches, etc. but have no values in table cdfvalue. This is not an error but if they have many non-used CDFs it is an indication that IM procedures may be improved.
Records with empty values in table cdfvalue
The next level of check is to check if the value ‘ ‘ have been stored AND if all different data types are NULL. An old bug created empty strings for specially Auxiliary data CDFs. This bug created unnecessary many rows in the table cdfvalue and makes it impossible to delete CDFs that actually never have been used for data entry.
Duplicated values in allow_value_set
Duplicates may have been by migration and import mistakes.
Invalid date in link table (mandatory)
The date value 0000-00-00 00:00:00 is invalid and has been found in the table link.
Attachment table (mandatory)
One country had changed the name of the column Filedescription. Note it has to be capital F.
Orphans in Country Structure (mandatory)
If there are orphans in the country structure there will be problems e.g. with creating a staging area database.
Duplicates in the Country Structure
Having duplicates on name (on the same level/node) in the country structure e.g. two villages called “Berg” in the one municipality creates a lot of problems when importing data and statistics. Note that the duplicates are not shown in the GUI.