Using Staging Area Generator

From IMSMA Wiki
Revision as of 16:47, 28 March 2015 by Alnaucler (talk | contribs) (Created page with "==Using the IMSMA Staging Area Generator==__NOEDITSECTION__ To start the Staging Area Generator, click on the Staging Area Generator Icon in the Start menu → All Programs...")
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to: navigation, search

Using the IMSMA Staging Area Generator

To start the Staging Area Generator, click on the Staging Area Generator Icon in the Start menu → All Programs → IMSMA. This will open the following graphical user interface.

Staging Area Generator GUI.png

Sections in the Staging Area Generator user interface
Section Description
1 - Source Database This section defines the connection of the source database, i.e. the IMSMANG database from which a staging area should be generated. If the source and the target databases should be on the same server, the default connection parameters can be kept. If the staging area is generated on another server, then the IP address of the source server hosting the IMSMANG database has to be provided in the Host field. Remember to update the password in the Password field if you have changed the default imsma password on your IMSMANG installation.
2 - Target Database This section defined the connection to the target database, i.e. the staging area database to be created. The parameters to be provided are the following:
  • Host: since the SAG tool has to be run on the target server, the value for this field should always be localhost
  • Port: port of your target Postgres database server installation - the default is 5432 (which is also the port on which IMSMA is running on the source)
  • Database: this is the name of the staging area database to be created. You can specify any name here, using lower-case alphanumeric characters and underscores. If a database with the provided name does not yet exist, the SAG tool will create a database with that name on the target database server. If a database with the provided name already exists, the content will be dropped and recreated by the SAG tool. If, for any reason, the database cannot be created, an error message will be displayed. For example, if the specified target database name contains upper-case letters, the following error message will be displayed:
SAG error msg wrong db name.png

Contact your GICHD IM advisor if you are unsure how to solve that.

  • User: this is the username of the database user that should be the owner of the staging area database. The default is imsma - however, this will only work if the staging area is created on the same database server on which IMSMA is running. On another, independent, database, there might not be any user named imsma. In that case, another, existing, user can be provided, or the database superuser named postgres can be used.
  • Password: password of the above user.
  • Postgres User: the above defined user that was specified as the owner of the staging area database might not have the permissions to actually create the new staging area database. Therefore, a user with enough permissions has to be provided here. However, if the staging area is generated on the same database where IMSMA resides, then the imsma user can be used, since it has all required privileges. The default is therefore imsma.
  • Postgres Password: password of the above user - in case of doubt, leave the default (password).
3 - Post Processing SQL Scripts (optional) In this optional field it is possible to provide the full path and name to a validated SQL file. Whatever is specified inside that file will be executed on the staging area database after its generation. This can be valuable in several use cases, for example:
  • If a staging area with only a subset of the IMSMA data should be generated (in this case the SQL file would include DELETE statements)
  • If sensitive data (e.g. victim names and addresses) should not be transferred to the staging area (in this case the SQL file would include UPDATE statements to set all the victim names and addresses to NULL)
  • If data has to be pre-computed for reporting purposes
  • If additional database views are necessary, also for reporting and analysis purposes

Contact your GICHD IM advisor for help in developing SQL statements according to specific requirements.

Validate Selections After having provided all required connection details (sections 1 and 2), click on Validate Selections. The following actions will be performed:
  • A connection to the source database will be attempted
  • A connection to the target database will be attempted - if a database with the specified name does not yet exist, it will be created
  • Some data will be read from the source database in order to populate the Additional Options section.

In case of an error, the related message usually specifies what the exact issue is.

4 - Additional Options This section is only enabled after a successful validation of the connection details.
  • Conversion Language: specifies the language in which the staging area database should be created. This refers to the translation concept in IMSMANG. For example, the English values for the status of a Land include Open, Worked on and Closed. However, in IMSMA translations of these terms into other languages can be defined. Considering this example, the option chosen in the Conversion Language field specifies which value will be written into the status_enum field of the HAZARD table in the staging area. If a translation does not exist in the specified language, it will default to English or, if this is not available, to the IMSMA base term. When translations are heavily used, it would be possible for instance to create one staging area in language 1 and another one in language 2, with the same data but values in different languages.
  • Gazetteer Seq No: this option allows to specify the flattening of the country structure into the main object tables, if gazetteer alternative names are used in IMSMANG. Alternative names in the country structure, if defined, have a sequence number. For example, if for each country structure element two alternative names are defined, the first one has sequence number 0 , the second one sequence number 1. If the standard gazetteer names should be used in the staging area, choose None. Otherwise, choose the appropriate sequence number.
  • Do Geodata: this is a switch to enable/disable the generation of geodata. The staging area can either be created with non-geodata only, or including all the coordinates (points, polylines and polygons), i.e. the data stored in the GEOPOINT, GEOSPATIALINFO and related tables in IMSMANG. Refer to the section on geodata below for more details.
  • Target SRID: when generating geodata in the staging area, the data can be re-projected. The Target SRID allows to define the projection. The default, 4326, corresponds to WGS 84 and 3857 corresponds to WGS84 Web Mercator. For a complete list of available projections, issue the following query on an empty staging area (i.e. the one created after clicking on Validate Selections):
select *
from spatial_ref_sys

You may also consult http://spatialreference.org/ to find the SRID that you would like to use. PostgreSQL is using the EPSG standard.

Staging Area Generator GUI options.png

Finally, click on Start to start the staging area generation. In the next window, click on Start once more:

Data Staging Progreess.png

A log with all the steps is written in this window as the generation process goes on. This might take some time, depending on the size of the source IMSMA database. The range is from very few minutes up to two hours for very big databases. Once the generation is finished, the newly generated staging area database can be explored with a database query tool such as pgAdmin3 or Navicat, and reporting tools can be connected to it.

Changing the log level of the Staging Area Generator

Note.jpg This is only required in case of doubt/error, when more details are needed in order to investigate what goes wrong.

In order to change the log level of the SAG, for example to write a more detailed output to the log file (etl.log in C:\IMSMAETLTool), the following steps are required:

  1. Open a command line window and go to C:\IMSMAETLTool\bin
  2. Extract the configuration file by typing the following in the command line:
     C:\IMSMAETLTool\jdk1.6.0_45\bin\jar xf etl-1.0-jar-with-dependencies.jar mylog4j.properties 
  3. Open the resulting file mylog4j.properties (in C:\IMSMAETLTool\bin) in an editor (e.g. Notepad++)
  4. Change the line
     log4j.category.com.fgm=INFO 
    to
     log4j.category.com.fgm=DEBUG 
  5. Bundle the log file in the jar again by typing the following in the command line:
     C:\IMSMAETLTool\jdk1.6.0_45\bin\jar uf etl-1.0-jar-with-dependencies.jar mylog4j.properties 

More output is now written to the etl.log file.

Note.jpg This is only required in case of doubt/error, when more details are needed in order to investigate what goes wrong.