Difference between revisions of "IMSMA Staging Area"

From IMSMA Wiki
Jump to: navigation, search
m
(4 intermediate revisions by 2 users not shown)
Line 1: Line 1:
{{TOC right}}
+
The IMSMA staging area is a '''flattened''' and '''read-only''' version of the {{IMSMANG}} operational database. In terms of content, the data is an exact copy of the '''approved''' information in the {{IMSMANG}} database, but the structure is less complex and therefore easier to query.  
==What is the IMSMA Staging Area?==__NOEDITSECTION__
 
The IMSMA staging area is a '''flattened''' and '''read-only''' version of the {{IMSMANG}} operational database. In terms of content, the data is an exact copy of the '''approved''' information in the {{IMSMANG}} database, but the structure is much less complex and therefore easier to query. While in the operational {{IMSMANG}} database the core object data (e.g. the ID and name of a Land) is stored in one table (e.g. in the ''hazard'' table) and descriptive attributes (such as single- and multi-select standard and custom defined fields) are stored in separate tables, in the staging area all this information is available in one single table. Therefore, there is no need to write complex queries to get to all the attributes of an object. Ultimately, reporting and data analysis tools can easily be connected to the staging area.<br />
 
A staging area can be created out of an {{IMSMANG}} v 6.0 (and higher) database. It is typically used as a '''reporting database''', via [[lightMINT]], [[MINT]] or any other [[External Reporting Tools|reporting/analysis tool]].
 
  
==Structure of the IMSMA Staging Area==__NOEDITSECTION__
+
While in the operational {{IMSMANG}} database the item data (e.g. the ID, status and name of a Land) is stored in one table and other information about the Land (such as multi-select and custom defined fields) are stored in separate tables. Furthermore in the {{IMSMANG}} information are also stored with enumeration values and as links between objects. In the staging area all this information is flattened in one single table for each item. Therefore, there is lesser need to write complex queries to retrieve attributes of an object.  
===Flattening principles===__NOEDITSECTION__
 
{{Note | This section requires basic knowledge of the {{IMSMANG}} database table structure.}}
 
  
{| class="wikitable" width="1000"
+
A staging area can be created out of an {{IMSMANG}} v 6.0 (and higher) database. It is typically used as a '''reporting database''', via [[MINT]] or any other [[Other Reporting Tools|reporting/analysis tool]] such as Excel or ArcGIS.
|-
 
| align="center" colspan="3" | '''Where to find which data in the staging area'''
 
|-
 
| width="150pt" | '''Type of attribute'''
 
| width="250pt" | '''Where?'''
 
| width="600pt" | '''Example'''
 
|-
 
| align="left" | '''Standard text, numeric, date and time attributes'''
 
| align="left" | Stored directly in the main object table, similar to {{IMSMANG}}
 
| align="left" | The HAZARD (Land) attribute hazard_localid is stored in a column also called hazard_localid in the HAZARD table of the staging area.
 
|-
 
| align="left" | '''Standard single-selects'''
 
| align="left" | Stored directly in the main object table (whereas in {{IMSMANG}} the main object table only contains a GUID reference to the IMSMAENUM table)
 
| align="left" | The Land attribute ''status'' is stored in a column called status_enum in the HAZARD table of the staging area. It contains the '''translated value''' of the status base value, e.g. ''Open'' or ''Closed'' in English or the equivalent values in another language that has been specified at staging area generation.
 
|-
 
| align="left" | '''Standard multi-selects'''
 
| align="left" | Stored directly in the main object table as '''comma-separated list of values''' '''AND''' in a normalized way in the <OBJECT>_STD_MULTISELECT table (in {{IMSMANG}} the values for a multi select have to be looked up in the <OBJECT>_HAS_IMSMAENUM and IMSMAENUM tables)
 
| align="left" | The Land object has a multi-select attribute called ''Marking'' (Marking Method). In the staging area, the HAZARD table has an attribute marking_method with a comma-separated list of values, for example 'Official Signs, Local Signs'. Additionally, the table HAZARD_STD_MULTISELECT can be joined with the HAZARD table in order to get to the same values as rows. Reusing the above example, there will be two rows, one with 'Official Signs' and one with 'Local Signs' as values (or the translated values if any translations are available).
 
|-
 
| align="left" | '''CDF text, numeric, date and time attributes and CDF single-selects'''
 
| align="left" | Stored directly in the main object table (as opposed to {{IMSMANG}} where a lookup has to be made through the <OBJECT>_HAS_CDFVALUE, CDFVALUE and CUSTOMDEFINEDFIELD tables)
 
| align="left" | Each CDF will be turned into a column in the staging area. For example, a CDF called ''My Hazard CDF'' defined on Land will result in a column named my_hazard_cdf in the HAZARD table of the staging area. {{Note | Note that non-standard characters in CDF names, such as blanks, are replaced by underscores in the column names}}
 
|-
 
| align="left" | '''CDF multi-selects'''
 
| align="left" | Stored directly in the main object table as '''comma-separated list of values''' '''AND''' in a normalized way in the <OBJECT>_CDF_MULTISELECT table (in {{IMSMANG}} the values for a multi select have to be looked up in the <OBJECT>_HAS_CDFVALUE, CDFVALUE and CUSTOMDEFINEDFIELD tables)
 
| align="left" | Let's assume that the Land object has a multi-select CDF attribute called ''My Land Multi-Select''. In the staging area, this will result in a column in the HAZARD table called my_land_multi_select with a comma-separated list of values, for example 'Value1, Value2'. Additionally, the table HAZARD_CDF_MULTISELECT can be joined with the HAZARD table in order to get to the same values as rows. Reusing the above example, there will be two rows, one with 'Value1' and one with 'Value2' as values.
 
|-
 
| align="left" | '''Location data'''
 
| align="left" | The guid, localid and name of the location an object is assigned to are directly stored in the main object table.
 
| align="left" | The HAZARD table in the staging area has the columns location_guid, location_localid and location_name with the values of the location that each object is assigned to.
 
|-
 
| align="left" | '''Organisation data'''
 
| align="left" | The guid, localid and name of the organisation defined on an object are directly stored in the main object table.
 
| align="left" | The HAZARD table in the staging area has the columns org_guid, org_localid and org_name with the values of the organisation defined for each object. If CDFs of type organisation have been created in IMSMA, these will also be available in the staging area.
 
|-
 
| align="left" | '''Place data'''
 
| align="left" | The localid and name of the place linked to an object are directly stored in the main object table.
 
| align="left" | The HAZARD table in the staging are has the columns ammunition_storage_localid and ammunition_storage_name that refer to a place object.
 
|-
 
| align="left" | '''Classification data (Country structure, Assistance Classification, Cause Classification and Needs Assessment Classification)'''
 
| align="left" | The guid, localid and name of classifications (country structure (gazetteer), assistance classification, cause classification and needs assessment classification) associated to an object are directly stored in the main object table. Since a classification can have several levels, there is a placeholder for each level, up to the maximum number of levels.
 
| align="left" | In the HAZARD table in the staging area, there are the following columns regarding the country structure classification (gazetteer): gazetteer_guid, gazetteer_level1_localid, gazetteer_level1_name, gazetteer_level2_localid, gazetteer_level2_name, ..., gazetteer_level7_localid, gazetteer_level7_name. There are seven placeholders because the country structure in {{IMSMANG}} can have up to seven levels.
 
|-
 
|}
 
  
===Database model===__NOEDITSECTION__
 
A simplified staging area database model for the Land (Hazard) object is depicted below. For all the other main objects, the model looks very similar.
 
[[Image:Staging_Area_Model_Hazard.png|center|700px]]
 
 
===Special table===__NOEDITSECTION__
 
{| class="wikitable" width="1000'"
 
|-
 
| width="150pt" | '''Table name'''
 
| width="850pt" | '''Description'''
 
|-
 
| align="left" | '''ETLHISTORY'''
 
| align="left" | This table is the only one that does never get dropped from the staging area, but persists as the database is re-generated again and again (see [[Staging Area Generator]]). It contains information about all the staging area generations on a specific database (i.e. for example when a staging area database with the same name is generated regularly, e.g. once per day or week). It has two columns:
 
* '''etlruntime''': this is the timestamp of the start of the generation
 
* '''runtime''': this is the time in minutes that the generation took
 
Having x rows in this table means that the database has been generated x times, and allows to keep a history of the individual runs. It also allows to determine how old the information in the staging area is, by looking at latest timestamp.
 
|-
 
|}
 
===Point, Polyline and Polygon tables===__NOEDITSECTION__
 
If the option of generating geodata has been chosen on the [[Staging Area Generator]] interface, then three related tables with geodata are generated for every main object/table (ACCIDENT, VICTIM, QM, HAZARD (Land), etc.):
 
* '''<OBJECT>_POINT:''' contains all the data related to points defined on the object, with the following main information:
 
** '''point_type:''' e.g. Benchmark, Landmark, Turning Point (all points making up a polygon are also recorded in this table)
 
** '''lat:''' latitude
 
** '''lon:''' longitude
 
** '''coordinate_reference_system:''' e.g. "WGS 1984"
 
** '''coordinate_format:''' e.g. "Decimal Degrees"
 
** '''shape:''' PostGIS-representation of the shape that can be accessed with a PostGIS query
 
* '''<OBJECT>_POLYLINE:''' contains all the data related to polylines defined on the object. The individual points making up the polyline are stored in the <OBJECT>_POINT table, but the polyline object is additionally stored in the <OBJECT>_POLYLINE table. The main column is '''shape''', a PostGIS-representation of the shape that can be accessed with a PostGIS query.
 
* '''<OBJECT>_POLYGON:''' similar to <OBJECT>_POLYLINE, but for polygon objects.
 
 
===Views===__NOEDITSECTION__
 
A series of database views are pre-defined on the staging area, three for each object: <OBJECT>_POINT_VIEW, <OBJECT>_POLYGON_VIEW and <OBJECT>_POLYLINE_VIEW - where <OBJECT> is HAZARD, GAZETTEER, MRE, etc. - each object that potentially has associated geodata. Each view is defined as an '''INNER JOIN''' between the <OBJECT> table, e.g. HAZARD, and each of the three geodata tables. The inner join means that data from e.g. the HAZARD table that has '''no''' polygon associated will not be found in the HAZARD_POLYGON_VIEW. Depending on the use case, either the tables or the views can be accessed.
 
 
==Geographical data in the staging area==__NOEDITSECTION__
 
{{Under construction| This section is under construction}}
 
When polygon/polyline is linked with the main object the properties from the main object applies to the entire object. This also applies to calculated area and length e.g. if Land-35 has three polygons with (total) calculated area 200 m2 then 200 is repeated on the three polygon rows.
 
 
{{NavBox Business Intelligence}}
 
{{NavBox Business Intelligence}}
[[Category:VIE]]
+
[[Category:NAA]]

Revision as of 20:29, 20 February 2020

The IMSMA staging area is a flattened and read-only version of the IMSMANG operational database. In terms of content, the data is an exact copy of the approved information in the IMSMANG database, but the structure is less complex and therefore easier to query.

While in the operational IMSMANG database the item data (e.g. the ID, status and name of a Land) is stored in one table and other information about the Land (such as multi-select and custom defined fields) are stored in separate tables. Furthermore in the IMSMANG information are also stored with enumeration values and as links between objects. In the staging area all this information is flattened in one single table for each item. Therefore, there is lesser need to write complex queries to retrieve attributes of an object.

A staging area can be created out of an IMSMANG v 6.0 (and higher) database. It is typically used as a reporting database, via MINT or any other reporting/analysis tool such as Excel or ArcGIS.