Data quality checks before upgrading to V6.0: Difference between revisions
No edit summary |
No edit summary |
||
| Line 9: | Line 9: | ||
|[[Missing approval information]] | |[[Missing approval information]] | ||
|[[Missing Date of Information]] | |[[Missing Date of Information]] | ||
|[[Missing Local ID]] | |||
|[[SQL Templates]] | |[[SQL Templates]] | ||
|[[DIM Categories]] | |[[DIM Categories]] | ||
| Line 62: | Line 63: | ||
===Missing Date of Information===__NOEDITSECTION__ | ===Missing Date of Information===__NOEDITSECTION__ | ||
Date of Information (stored in table ''fieldreport'') is used when {{IMSMANG}} calculates the Summary (former Current view). If this data is missing then the result may not be as the expected when the Summary is updated. Note that ''Last updated'' gets updated e.g. when templates are switched and should not be used for setting Date of Information. If the Data Entry Form that misses Date of Information is an Activity or Education then you might be able to use End date. | Date of Information (stored in table ''fieldreport'') is used when {{IMSMANG}} calculates the Summary (former Current view). If this data is missing then the result may not be as the expected when the Summary is updated. Note that ''Last updated'' gets updated e.g. when templates are switched and should not be used for setting Date of Information. If the Data Entry Form that misses Date of Information is an Activity or Education then you might be able to use End date. | ||
===Missing Local ID===__NOEDITSECTION__ | |||
In some database also Form ID and the items' local ID have been missing. | |||
===Data Entry Form Templates===__NOEDITSECTION__ | ===Data Entry Form Templates===__NOEDITSECTION__ | ||
Revision as of 07:02, 12 February 2014
Invalid / space coordinates
| How To |
|---|
|
{{#if:Space coordinates | |}}{{#if:MGRS coordinates | |}}{{#if:Coordinate display rules | |}}{{#if:Point ID Length | |}}{{#if:Missing approval information | |}}{{#if:Missing Date of Information | |}}{{#if:Missing Local ID | |}}{{#if:SQL Templates | |}}{{#if:DIM Categories | |}}{{#if:Enum categories and values | |}}{{#if:Translations | |}}{{#if:CDF display format | |}}{{#if:CDF missing in field | |}}{{#if:Duplicate CDF | |}}{{#if:CDF not used | |}}{{#if:Empty CDF | |}}{{#if:Duplicate allow value | |}}{{#if:Link invalid date | |}}{{#if:Country structure orphans | |}}{{#if:Country structure duplicates | |}}{{#if:|
|
It was not possible to upgrade to V5.08.04 if there are invalid coordinates in the database. Please run this SQL query anyhow because importing may overrun application rules. If you get any rows here, you have more SQL queries that will identify which object has the invalid coordinates in “Invalid_coordinates_ver2.doc” on PI.
MGRS coordinates
In V5.xxx it was possible to enter MGRS with wrong format. The first query checks for coordinates written with lower-case letters e.g. 4qfj12345678 and the second query checks for blanks e.g. 4 QFJ 1234 5678.
Coordinates display rules
In IMSMA NG there are 3 rules on how to handle and display coordinates, based on how they were entered;
- MGRS,
- Distance/Bearing
- X and Y.
When data has been migrated or imported it is possible that the column userinputformat in the table geopoint, which controls coordinate display, has not been set correctly. When the display field is set correctly the columns are high-lighted with yellow. If the user tries to edit a row which does not have one of the rules set, the coordinates will not be populated in the Point window. This issue only affects the edit of Auxiliary data coordinates and Data Entry Forms in the Workbench (query).
Length of Point ID
Point (local) ID should be unique and clearly identify single and polygon points. But it is possible to type blanks instead of characters/numbers. If spaces have been used it is not possible for the users to see difference when entering/editing the points. If you get length 0, 1 or 2 I recommend to look into the values of Point ID (query).
Missing approval information
Due to changed behaviour of migration/import scripts in version 5.08.02 and a bug in 5.08.04 approval information may be missing. Note that Last updated gets updated e.g. when templates are switched and should not be used for setting Approved Date. If the Data Entry Form that misses Approved Date is an Activity or Education then you might be able to use End date (query).
| Column | Old label | New label |
|---|---|---|
| reportReceivedDate | Initiated | Data Entry Date |
| reportCompletedDate | Submitted | Submitted Date |
| reportVerifiedDate | Approved | Approved Date |
| dateofreport | Date of Report | Date of Information |
| dataentrydate | Last updated | (no change) |
Missing Date of Information
Date of Information (stored in table fieldreport) is used when IMSMANG calculates the Summary (former Current view). If this data is missing then the result may not be as the expected when the Summary is updated. Note that Last updated gets updated e.g. when templates are switched and should not be used for setting Date of Information. If the Data Entry Form that misses Date of Information is an Activity or Education then you might be able to use End date.
Missing Local ID
In some database also Form ID and the items' local ID have been missing.
Data Entry Form Templates
Some countries have many published templates that have never been used which makes it difficult to know which template to update etc. It is also good to know which templates that have been used for data entry in case you need to update or switch them.
DIM categories
If categories have been deleted there will be errors when the upgrade script is applied. As a quick indication if there will be errors, numbers of categories may be used. There should be at least 52 categories.
There should be 5 categories for Task and Work Item.
| Parent | Category |
|---|---|
| TASK | GENERAL_INFO |
| TASK | PLANNING_MONITORING |
| TASK | UNCATEGORISED |
| WORK_ITEM | GENERAL_INFO |
| WORK_ITEM | UNCATEGORISED |
Number of enumeration categories and values
If the country has deleted standard enums from the table imsmaenum then upgrade scripts will not give the expected result which will create a lot of problems. As a quick indication if there will be errors, numbers of categories and values may be used. There should be at least 117 categories and 978 values.
Number of translations
If the country has deleted standard translations from the table translation then upgrade scripts will not give the expected result which will create a lot of problems. The number of (English) translations should be at least 1269.
CDF display format
The combination of cdf_datatype in table customdefinedfield and fieldtype in table field must be correct. When data type has been changed directly in table customdefinedfield or data has been migrated these combination might be wrong.
Common errors are.
- display_mechanism for multi select is set to RADIO_BUTTON;
- no display_mechanism is set for single select fields;
- wrong fieldtype for multi and single selects;
- display_mechanism is "" i.e. length is 0.
The combinations in the table below is the correct 5.08.04 values
| fieldtype | display_mechanism | cdf_datatype |
|---|---|---|
| cdf | (null) | DATE |
| cdf | (null) | GAZETTEER |
| cdf | (null) | MULTI_SELECT |
| cdf | (null) | NUMBER |
| cdf | (null) | ORGANISATION |
| cdf | (null) | PLACE |
| select_cdf | COMBO_BOX | SINGLE_SELECT |
| select_cdf | RADIO_BUTTONS | SINGLE_SELECT |
| cdf | (null) | TEXT_FIELD |
CDF missing in table field
We have had a few cases where CDFs have been missing in the table field or in the table customdefinedfield.
Duplicate CDFs
In 5.08.04 it was allowed that CDFs had the same name if they had different data type.
CDF never used for Data Entry
Note that a CDF may be included in Data Entry templates, Summary templates, Saved searches, etc. but have no values in table cdfvalue. This is not an error but if they have many non-used CDFs it is an indication that IM procedures may be improved.
Records with empty values in table cdfvalue
The next level of check is to check if the value ‘ ‘ have been stored AND if all different data types are NULL. An old bug created empty strings for specially Auxiliary data CDFs. This bug created unnecessary many rows in the table cdfvalue and makes it impossible to delete CDFs that actually never have been used for data entry.
Duplicated values in allow_value_set
Duplicates may have been by migration and import mistakes.
Invalid date in link table
The date value 0000-00-00 00:00:00 is invalid and has been found in the table link.
Attachment table
One country had changed the name of the column Filedescription. Note it has to be capital F.
Orphans in Country Structure
If there are orphans in the country structure there will be problems e.g. with creating a staging area database.
Duplicates in the Country Structure
Having duplicates on name (on the same level/node) in the country structure e.g. two villages called “Berg” in the one municipality creates a lot of problems when importing data and statistics. Note that the duplicates are not shown in the GUI.