Name Rules Staging Area

From IMSMA Wiki

When creating Custom Defined Fields (CDFs) in IMSMA, a name (and a label) must be specified. IMSMA applies rules to the CDF name and its uniqueness, and these rules need to be considered when generating a staging area. In the IMSMANG database, the name must be unique per data type and item. In version 5.x it was not possible to specify a name when creating a CDF. Only the label could be specified, and it was then also used as the name. Therefore there are CDFs with names like "If the answer is No on question 14.1 then fill in the Comment field", "Medical and Physio" or "Specify".

In version 6.0 it is possible to set the CDF's name, and the name must follow PostgreSQL rules. The name should be no longer than 60 characters and should only contain the characters a-z and 0-9; see the full rules on Add Custom Defined Fields. CDFs created with version 5.x do not follow these rules, however, so their names may be up to 250 characters long and contain any character (\ ? é ô . space). They may also be written in another alphabet, e.g. Arabic.

In IMSMANG the CDF name is the content of a database column, but in the Staging area database it is used as a column name, which must follow PostgreSQL rules. The Staging Area Generator (SAG) handles CDF names that do not follow the rules as follows:

  • any name longer than 63 characters is cut off
  • any unsupported character, including trailing spaces, is replaced with an underscore
  • the column name is written in lower case, to facilitate working with the Staging area or using it with other applications.
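
The three rules above can be approximated with a short Python sketch. This is not the Staging Area Generator's actual code: the function name `normalize_cdf_name` is hypothetical, and the sketch assumes that only a-z, 0-9 and underscore count as supported characters and that lower-casing happens before the replacement.

```python
import re

MAX_COLUMN_LENGTH = 63  # cut-off for Staging area column names, from the list above


def normalize_cdf_name(name: str, max_length: int = MAX_COLUMN_LENGTH) -> str:
    """Approximate the Staging area column name derived from a CDF name."""
    lowered = name.lower()
    # Assumption: everything except a-z, 0-9 and underscore is "not supported".
    cleaned = re.sub(r"[^a-z0-9_]", "_", lowered)
    # Cut off anything beyond the maximum length.
    return cleaned[:max_length]


print(normalize_cdf_name("Economic aid received - from whom (comment) "))
print(normalize_cdf_name("Specify"))
```

With these assumptions, the first call prints `economic_aid_received___from_whom__comment__` and the second prints `specify`, which means that a CDF named Specify and one named specify end up with the same column name.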

The geospatial part of the Staging area creates SQL views that are used by GIS software, and here the maximum length for column names is 31 characters. The Staging Area Generator automatically handles duplicates by renaming them, which sometimes results in confusing column names.
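
Before generating a staging area, it can help to see which CDF names would collide once they are shortened to 31 characters. The following Python sketch groups names by their shortened form. The function names `view_name` and `find_view_collisions` are hypothetical, the character rule (a-z, 0-9, underscore) is an assumption, and the sketch only detects collisions; it does not try to reproduce how the Staging Area Generator renames them.

```python
import re
from collections import defaultdict

VIEW_COLUMN_LENGTH = 31  # maximum column name length in the geospatial views


def view_name(cdf_name: str) -> str:
    """Lower-case, replace unsupported characters, then cut to 31 characters."""
    # Assumption: everything except a-z, 0-9 and underscore is replaced.
    cleaned = re.sub(r"[^a-z0-9_]", "_", cdf_name.lower())
    return cleaned[:VIEW_COLUMN_LENGTH]


def find_view_collisions(cdf_names):
    """Return {shortened name: [CDF names]} for names that share a shortened form."""
    groups = defaultdict(list)
    for name in cdf_names:
        groups[view_name(name)].append(name)
    return {column: names for column, names in groups.items() if len(names) > 1}


names = [
    "Encouraged to complete education",
    "Encouraged to complete education - comments",
    "Specify",
]
for column, clashing in find_view_collisions(names).items():
    print(column, "<-", clashing)
```

Here the two "Encouraged to complete education" names both shorten to `encouraged_to_complete_educatio`, so they are reported as a collision, while Specify is not.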

Figure (Staging area geo error.png): Geospatial part creation error caused by duplicate CDF names according to SAG geospatial rules

There are cases where the Staging Area Generator cannot handle the duplicates. The table below illustrates the behavior with some examples.

| CDF name | SA column name | SA view | Comment |
|---|---|---|---|
| Economic aid received - from whom (comment) | economic_aid_received___from_whom__comment__ | economic_aid_received___from_wh | The CDF name has one trailing space |
| Specify | specify | specify | In IMSMANG the CDF name starts with an upper-case letter |
| specify | specify | specify | SAG will consider Specify and specify as duplicates |
| Dangerous Area User Defined Field 1 | dangerous_area_user_defined_field_1 | dangerous_area_user_defined_f5 | Depending on the data, it could be confusing that CDF #1 is #5 in the geospatial part |
| Dangerous Area User Defined Field 2 | dangerous_area_user_defined_field_2 | dangerous_area_user_defined_f4 | |
| Dangerous Area User Defined Field 20 | dangerous_area_user_defined_field_20 | dangerous_area_user_defined_fie | |
| Dangerous Area User Defined Field 3 | dangerous_area_user_defined_field_3 | dangerous_area_user_defined_f3 | |
| Dangerous Area User Defined Field 4 | dangerous_area_user_defined_field_4 | dangerous_area_user_defined_f2 | |
| Dangerous Area User Defined Field 5 | dangerous_area_user_defined_field_5 | dangerous_area_user_defined_f6 | |
| Encouraged to complete education | encouraged_to_complete_education | encouraged_to_complete_educatio | |
| Encouraged to complete education - comments | encouraged_to_complete_education___comments | encouraged_to_complete_educat2 | |
Note: The error message "There are duplicate normalized CDF Column Names, staging processing cannot continue" may also indicate problems with CDFs other than their names.
  1. Double-check that there are no issues with names according to the rules.
  2. Check if any records have the wrong value in the column value_type in table cdfvalue compared to the current cdf_datatype in table customdefinedfield.
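
The check in step 2 could be sketched as a query that joins the two tables and lists the records whose types differ. The table names and the columns `value_type` and `cdf_datatype` come from the step above. Everything else is an assumption made for illustration: the key columns `id` and `cdf_id`, the sample type values, and the idea that the two columns can be compared directly. The demo runs against an in-memory SQLite database with toy data, not against a real IMSMANG database, so verify the actual schema before adapting it.

```python
import sqlite3

# value_type and cdf_datatype are named in the text above.
# The key columns (id, cdf_id) are hypothetical placeholders.
MISMATCH_QUERY = """
SELECT v.id, v.value_type, c.cdf_datatype
FROM cdfvalue AS v
JOIN customdefinedfield AS c ON c.id = v.cdf_id
WHERE v.value_type <> c.cdf_datatype
"""


def find_type_mismatches(connection):
    """Return rows of cdfvalue whose value_type differs from the CDF's datatype."""
    return connection.execute(MISMATCH_QUERY).fetchall()


# Toy schema and data, invented for the demonstration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customdefinedfield (id INTEGER PRIMARY KEY, cdf_datatype TEXT);
CREATE TABLE cdfvalue (id INTEGER PRIMARY KEY, cdf_id INTEGER, value_type TEXT);
INSERT INTO customdefinedfield VALUES (1, 'String'), (2, 'Number');
INSERT INTO cdfvalue VALUES (10, 1, 'String'), (11, 2, 'String');
""")
print(find_type_mismatches(conn))
```

In the toy data, only the value with id 11 is reported, because it is stored as 'String' while its CDF is currently defined as 'Number'.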

Contact your GICHD IM advisor if you need help.