Name Rules Staging Area
When creating Custom Defined Fields (CDFs) in IMSMA, a name (and label) must be specified. IMSMA applies rules to the CDF name and its uniqueness, and these rules need to be considered when generating a staging area. In the IMSMANG database, a CDF name must be unique per data type and item. In version 5.x it was not possible to specify a name when creating a CDF. Only the label could be specified, and it was also set as the name. Therefore there are CDFs with names like "If the answer is No on question 14.1 then fill in the Comment field", "Medical and Physio" or "Specify".
In version 6.0 it is possible to set the CDF's name, and the name must follow PostgreSQL rules. The name should be no longer than 60 characters and should only contain a-z and 0-9; see the full rules on Add Custom Defined Fields. CDFs created with version 5.x do not follow these rules, however, so they may be up to 250 characters long and contain any character (\ ? é ô . space). They may also be written in another alphabet, e.g. Arabic.
In IMSMANG the CDF name is the content of a database column, but in the Staging area database it is used as a column name, which must follow PostgreSQL rules. The Staging Area Generator handles CDF names that do not follow the rules as follows:
- any name longer than 63 characters will be cut off
- any unsupported character, including trailing spaces, will be replaced with an underscore
- to facilitate working with the Staging area or using it with other applications, the column name will be in lower case.
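The three rules above can be sketched in a few lines of Python. This is only an illustration, not the Staging Area Generator's own code: the function name `normalize_cdf_name` is hypothetical, and the sketch assumes that the supported characters are a-z, 0-9 and underscore, and that truncation happens after the replacement.

```python
import re

MAX_COLUMN_NAME_LENGTH = 63  # limit quoted in the list above


def normalize_cdf_name(name: str, max_length: int = MAX_COLUMN_NAME_LENGTH) -> str:
    """Hypothetical helper that mimics the documented rules for column names."""
    # Rule: the column name will be in lower case.
    lowered = name.lower()
    # Rule: unsupported characters (including trailing spaces) become underscores.
    # Assumption: only a-z, 0-9 and underscore are treated as supported.
    replaced = re.sub(r"[^a-z0-9_]", "_", lowered)
    # Rule: names longer than the limit are cut off.
    return replaced[:max_length]


print(normalize_cdf_name("Economic aid received - from whom (comment) "))
print(normalize_cdf_name("Specify"))
```

Under these assumptions, the first call reproduces the column name shown for the same CDF in the table further down, and "Specify" and "specify" both end up as `specify`.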
The geospatial part of the Staging area creates SQL views that will be used by GIS software, and here the maximum length for column names is 31 characters. The Staging Area Generator automatically handles duplicates by renaming them, which sometimes results in confusing column names.
Geospatial part creation error caused by duplicate CDF names according to SAG geospatial rules
There are cases where the Staging Area Generator cannot handle the duplicates. The table below illustrates the behavior with some examples.
CDF name | SA column name | SA view | Comment |
---|---|---|---|
Economic aid received - from whom (comment) | economic_aid_received___from_whom__comment__ | economic_aid_received___from_wh | The CDF name has one trailing space |
Specify | specify | specify | In IMSMANG the CDF name starts with an upper-case letter |
specify | specify | specify | SAG will consider Specify and specify as duplicates |
Dangerous Area User Defined Field 1 | dangerous_area_user_defined_field_1 | dangerous_area_user_defined_f5 | Depending on data it could be confusing that CDF#1 is #5 in the geospatial part |
Dangerous Area User Defined Field 2 | dangerous_area_user_defined_field_2 | dangerous_area_user_defined_f4 | |
Dangerous Area User Defined Field 20 | dangerous_area_user_defined_field_20 | dangerous_area_user_defined_fie | |
Dangerous Area User Defined Field 3 | dangerous_area_user_defined_field_3 | dangerous_area_user_defined_f3 | |
Dangerous Area User Defined Field 4 | dangerous_area_user_defined_field_4 | dangerous_area_user_defined_f2 | |
Dangerous Area User Defined Field 5 | dangerous_area_user_defined_field_5 | dangerous_area_user_defined_f6 | |
Encouraged to complete education | encouraged_to_complete_education | encouraged_to_complete_educatio | |
Encouraged to complete education - comments | encouraged_to_complete_education___comments | encouraged_to_complete_educat2 | |
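To check a list of CDF names for such clashes before generating a staging area, something like the following Python sketch could be used. It is not part of the Staging Area Generator: `normalize` and `find_collisions` are hypothetical helpers, the supported character set (a-z, 0-9, underscore) is an assumption, and the sketch only reports names that collapse to the same column name. It does not reproduce how the generator renames duplicates.

```python
import re
from collections import defaultdict


def normalize(name: str, max_length: int) -> str:
    """Hypothetical normalization: lower case, underscores, then truncation."""
    # Assumption: only a-z, 0-9 and underscore are treated as supported.
    return re.sub(r"[^a-z0-9_]", "_", name.lower())[:max_length]


def find_collisions(cdf_names, max_length):
    """Group CDF names by normalized column name and keep only the clashes."""
    groups = defaultdict(list)
    for name in cdf_names:
        groups[normalize(name, max_length)].append(name)
    return {column: names for column, names in groups.items() if len(names) > 1}


names = [
    "Specify",
    "specify",
    "Encouraged to complete education",
    "Encouraged to complete education - comments",
]

# 63 characters: limit for Staging area column names.
print(find_collisions(names, 63))
# 31 characters: limit for column names in the geospatial views.
print(find_collisions(names, 31))
```

With the 63-character limit only "Specify" and "specify" clash. With the 31-character limit the two "Encouraged to complete education" names also collapse to the same column name.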
The error message "There are duplicate normalized CDF Column Names, staging processing cannot continue" may also indicate problems with CDFs other than their names.
Contact your GICHD IM advisor if you need help.