Difference between revisions of "Staging Area Generator"
Line 90: | Line 90: | ||
A log with all the steps is written in this window as the generation process goes on. This might take some time, depending on the size of the source IMSMA database. The range is from very few minutes up to two hours for very big databases. Once the generation is finished, the newly generated staging area database can be explored with a database query tool such as ''pgAdmin3'' or ''Navicat'', and reporting tools can be connected to it. | A log with all the steps is written in this window as the generation process goes on. This might take some time, depending on the size of the source IMSMA database. The range is from very few minutes up to two hours for very big databases. Once the generation is finished, the newly generated staging area database can be explored with a database query tool such as ''pgAdmin3'' or ''Navicat'', and reporting tools can be connected to it. | ||
+ | ==Changing the log level of the Staging Area Generator==__NOEDITSECTION__ | ||
+ | In order to change the log level of the SAG, for example to write a more detailed output to the log file (etl.log in C:\IMSMAETLTool), the following steps are required: | ||
+ | # Open a command line window and go to C:\IMSMAETLTool\bin | ||
+ | # Extract the configuration file by typing the following in the command line: <pre> C:\IMSMAETLTool\jdk1.6.0_45\bin\jar xf etl-1.0-jar-with-dependencies.jar mylog4j.properties </pre> | ||
+ | # Open the resulting file mylog4j.properties (in C:\IMSMAETLTool\bin) in an editor (e.g. Notepad++) | ||
+ | # Change the line <pre> log4j.category.com.fgm=INFO </pre> to <pre> log4j.category.com.fgm=DEBUG </pre> | ||
+ | # Bundle the log file in the jar again by typing the following in the command line: <pre> C:\IMSMAETLTool\jdk1.6.0_45\bin\jar uf etl-1.0-jar-with-dependencies.jar mylog4j.properties </pre> | ||
+ | More output is now written to the '''etl.log''' file. | ||
+ | {{Note|This is only required in case of doubt/error, when more details are needed in order to investigate what goes wrong.}} | ||
==Scheduling the Staging Area creation==__NOEDITSECTION__ | ==Scheduling the Staging Area creation==__NOEDITSECTION__ | ||
The above described generation of a staging area needs human interaction. However, once all the options are fixed, there might be the requirement to re-generate the staging area regularly, in order to have an up-to-date reporting database. It is possible to automatize the staging area generation by following these steps: | The above described generation of a staging area needs human interaction. However, once all the options are fixed, there might be the requirement to re-generate the staging area regularly, in order to have an up-to-date reporting database. It is possible to automatize the staging area generation by following these steps: |
Revision as of 10:10, 8 December 2014
The IMSMA Staging Area Generator is a tool that creates a staging area from an IMSMANG database. It is thus part of the IMSMANG tool set, but not included in the IMSMANG installation package. IMSMA Staging Area Generator thus needs to be installed separately. Please contact your GICHD IM advisor in order to get the latest installation package.
The Staging Area Generator can only be used with IMSMANG 6.0 and higher. |
Basic Principles
The Staging Area Generator takes an IMSMANG database as input and outputs a staging area database, i.e. a flattened version of the IMSMA database, containing only approved data. The structure of the staging area and a details on the data it contains can be found on the IMSMA Staging Area page.
Where to install and use it
In terms of the technical architecture for IMSMA and the staging area, there are the following options:
- The staging area can be generated on the same server and database server as IMSMA. In this case, the tool is run locally and makes only local connections to the source and target databases.
- The staging area can be generated on another server and database server than IMSMA. For example, there could be a dedicated reporting server, in order not to overload the server on which data entry in IMSMA is done. In this case, the Staging Area Generator has to be run on the target server.
Installing the IMSMA Staging Area Generator
Prior to installing the Staging Area Generator
- To install the staging area generator (SAG) double-click on the etl.exe file.
- The graphical installer then starts. On the first screen, choose the language you would like to use for the installation and click OK.
- Navigate the Installation panels of the graphical installer as described below:
- On the Welcome screen, click Next.
- On the Infromation screen, click Next.
- We strongly recommend to use the default installation path for the Staging Area Generator. Click Next.
- If the target folder (C:\IMSMAETLTool) does not already exist a message will be shown, click OK.
- In the packages selection panel, click Next.
- Wait for the installation progress bar to finish, then click Next.
- When the Setup Shortcuts panel displays, click Next.
- When the installation has completed successfully, click Done.
After the installation, the following menu entry will be visible in the Start menu → All Programs → IMSMA:
Using the IMSMA Staging Area Generator
To start the Staging Area Generator, click on the Staging Area Generator Icon in the Start menu → All Programs → IMSMA. This will open the following graphical user interface.
Sections in the Staging Area Generator user interface | |
Section | Description |
1 - Source Database | This section defines the connection of the source database, i.e. the IMSMANG database from which a staging area should be generated. If the source and the target databases should be on the same server, the default connection parameters can be kept. If the staging area is generated on another server, then the IP address of the source server hosting the IMSMANG database has to be provided in the Host field. Remember to update the password in the Password field if you have changed the default imsma password on your IMSMANG installation. |
2 - Target Database | This section defined the connection to the target database, i.e. the staging area database to be created. The parameters to be provided are the following:
Contact your GICHD IM advisor if you are unsure how to solve that.
|
3 - Post Processing SQL Scripts (optional) | In this optional field it is possible to provide the full path and name to a validated SQL file. Whatever is specified inside that file will be executed on the staging area database after its generation. This can be valuable in several use cases, for example:
Contact your GICHD IM advisor for help in developing SQL statements according to specific requirements. |
Validate Selections | After having provided all required connection details (sections 1 and 2), click on Validate Selections. The following actions will be performed:
In case of an error, the related message usually specifies what the exact issue is. |
4 - Additional Options | This section is only enabled after a successful validation of the connection details.
select * from spatial_ref_sys You may also consult http://spatialreference.org/ to find the SRID that you would like to use. PostgreSQL is using the EPSG standard. |
Finally, click on Start to start the staging area generation. In the next window, click on Start once more:
A log with all the steps is written in this window as the generation process goes on. This might take some time, depending on the size of the source IMSMA database. The range is from very few minutes up to two hours for very big databases. Once the generation is finished, the newly generated staging area database can be explored with a database query tool such as pgAdmin3 or Navicat, and reporting tools can be connected to it.
Changing the log level of the Staging Area Generator
In order to change the log level of the SAG, for example to write a more detailed output to the log file (etl.log in C:\IMSMAETLTool), the following steps are required:
- Open a command line window and go to C:\IMSMAETLTool\bin
- Extract the configuration file by typing the following in the command line:
C:\IMSMAETLTool\jdk1.6.0_45\bin\jar xf etl-1.0-jar-with-dependencies.jar mylog4j.properties
- Open the resulting file mylog4j.properties (in C:\IMSMAETLTool\bin) in an editor (e.g. Notepad++)
- Change the line
log4j.category.com.fgm=INFO
tolog4j.category.com.fgm=DEBUG
- Bundle the log file in the jar again by typing the following in the command line:
C:\IMSMAETLTool\jdk1.6.0_45\bin\jar uf etl-1.0-jar-with-dependencies.jar mylog4j.properties
More output is now written to the etl.log file.
This is only required in case of doubt/error, when more details are needed in order to investigate what goes wrong. |
Scheduling the Staging Area creation
The above described generation of a staging area needs human interaction. However, once all the options are fixed, there might be the requirement to re-generate the staging area regularly, in order to have an up-to-date reporting database. It is possible to automatize the staging area generation by following these steps:
- In the C:\IMSMAETLTool folder, create a .bat script named for example runStagingArea.bat with the following content:
cd C:\IMSMAETLTool call bin\runnogui.bat
- Edit the properties file C:\IMSMAETLTool\conf\config.PROPERTIES by specifying the required values for all the parameters (they correspond to the ones from the graphical user interface described above). If a parameter should not be used, e.g. script_file, then comment it by placing a # at the beginning of the line:
conversion_language=en gazetteer_sequence_level=0 #script_file=scripfile.sql
- Schedule a task in an operating system's task scheduler. For example, using Windows 7, follow those steps:
- Go to Control Panel → Administrative Tools → Task Scheduler
- In the Task Scheduler interface, click on Create Basic Task...
- In the Create Basic Task Wizard, provide a name, e.g. Staging Area Generation and an optional description. Click Next.
- In the next window, specify the required recurrence (daily, weekly, etc.) and click Next.
- Next, refine recurrence criteria (e.g. time of the day) and click Next.
- In the next window, select Start a Program and click Next.
- Browser to the previously created script, e.g. C:\IMSMAETLTool\runStagingArea.bat. Click Next and then Finish.
The staging area generation should now start at the specified time.
It is advised to plan the generation time carefully, i.e. best during the night when no one is accessing and entering data into IMSMANG, as this might lead to inconsistencies. Ideally, the IMSMA server should be stopped before the staging area generation and then restarted. It also needs to be assured that no one (and no automatic reporting process) is accessing the staging area database at the time it has to be re-created. As long as there are open connections to the database, it cannot be dropped and re-created. Contact your GICHD IM advisor in case of questions.
|