Creating Data Sources in MINT

A connection to a database (for example to the IMSMA Staging Area) is called a Data Source. The first step for getting started with MINT is to create a data source. The tool needs to know where the data that has to be analysed is stored. A data source can be:

  • an IMSMA staging area database
  • any other database
  • an XML or Excel file -- the preferred way however is to import or dump the content of a file in a database and then connect MINT to the database. This solution simply scales better and is better manageable. Please get in touch with your GICHD IM advisor prior to including file-based data into MINT.

Contents

Types of data sources

There are different types of data sources. The most common ones to be defined in MINT are the following:

  • JDBC data source: Java Database Connectivity (JDBC) is a Java-based data access technology. It allows access to relational databases, such as Postgres, MySQL, etc. For each database technology, a specific driver is needed.Drivers for Postgres and MySQL are included in MINT, others might need to be installed on request.
  • JNDI data source: Java Naming and Directory Interface (JNDI) is also a Java-based technology to connect to and query data. It is defined directly on the MINT server, centrally and only once. The advantages over a JDBC connection are in maintainability and performance - JNDI manages pools of connections instead of opening many separate connections on the database if several processes are accessing the data simultaneously.
  • Virtual data source: this type of data source allows combining multiple JDBC and JNDI data sources in order to make them available through a common domain. This is very useful if several databases/data sources, e.g. IMSMA and an additional source, need to be analyzed together. In such a case, it is crucial though to identify a common key that allows joining the data sets.

Create a data source

Note.jpg Only administrators can create data sources.

There are two easy ways of creating a data source:

  1. Click on Create → Data Source
  2. Right-click on the folder in which the data source should be stored, and choose Add Resource → Data Source
MINT add data source.png

In both cases, the following page to enter the data source properties appears.

MINT add data source properties.png
Note.jpg It is recommended to save all data sources in one folder, e.g. named Data Sources, in the MINT repository.

Define a JDBC data source

To define a JDBC data source, select JDBC Data Source in the Type drop-down menu of the Add Data Source page. The following information then needs to be provided:

  • Name: name that allows identifying the data source
  • Resource ID: internal ID, is filled by the system but can be adapted (no special characters allowed). Once saved, the Resource ID cannot be changed.
  • Description: optional description of the data source
  • JDBC Driver: choose the appropriate driver in the drop-down list. For a connection to an IMSMA staging area, choose PostgreSQL (org.postgresql.Driver). In the case of other data source, select the corresponding driver.
  • Host: IP address of the host on which the database to connect to is running.
  • Port: port on which the database is running on the host defined above. The default port number for Postgres is 5432.
  • Database: name of the database to connect to, e.g. staging.
  • URL: this is the full URL connection to the database. It is filled automatically, in the right syntax, with the information entered above (host, port, database).
  • User Name: name of a database user that can connect to the database. For example, the user imsma can usually connect to the staging area. It is recommended though to define a read-only user on the database and use this one to connect from MINT.
  • Password: password of the above defined database user.
  • Time Zone: leave the default (Use database setting) - the time zone of the database is then the reference time zone.
  • Save Location: browse to a location of the MINT repository to save the data source.

Once all information has been filled, the connection can be tested and saved.

Note.jpg The data sources are usually set up at the beginning of a MINT implementation and should not change. However, in case of doubt or if a new data source has to be created, contact your GICHD IM advisor.

Define a JNDI data source

Once the data sources for a specific MINT implementation are fixed, it is recommended to define JNDI data sources instead of JDBC ones. However, a JNDI data source has to be defined directly on the server and thus has to be set up by the global MINT administrator. Therefore, the creation of a JNDI data source has to be requested to the GICHD IM advisor, and the following information has to be provided:

  • Name: name that allows identifying the data source
  • Driver: type of driver / type of database to connect to (e.g. Postgres)
  • Host: IP address of the host on which the database to connect to is running.
  • Port: port on which the database is running on the host defined above. The default port number for Postgres is 5432.
  • Database: name of the database to connect to, e.g. staging.
  • User Name: name of a database user that can connect to the database. For example, the user imsma can usually connect to the staging area. It is recommended though to define a read-only user on the database and use this one to connect from MINT.
  • Password: password of the above defined database user.

A typical configuration on the server will then look as follows:

    <Resource name="jdbc/Staging" auth="Container" type="javax.sql.DataSource"
        maxActive="100" maxIdle="30" maxWait="10000"
        username="imsma" password="password"
        driverClassName="org.postgresql.Driver"
        validationQuery="SELECT 1"
        testOnBorrow="true"
        url="jdbc:postgresql://x.x.x.x:5432/staging"
        factory="com.jaspersoft.jasperserver.tomcat.jndi.JSBasicDataSourceFactory"/>

Once the connection is defined on the server, the corresponding JNDI data source can be defined in MINT. To define a JNDI data source, select JNDI Data Source in the Type drop-down menu of the Add Data Source page. The resource name (e.g. jdbc/Staging in the example above) then needs to be provided in the Service Name box. Once all information has been filled, the connection can be tested and saved.

MINT add data source JNDI.png

Define a Virtual data source

A virtual data source is a combination of two or more other data sources. Therefore, the existence of other data sources (typically JDBC and/or JNDI) is a prerequisite. To define a virtual data source, choose Virtual Data Source in the drop-down menu of the Add Data Source page. The following page appears.

MINT add data source Virtual.png

As for other data sources, a name, ID, optional description and a repository location have to be provided. Then, the lower pane allows choosing several data sources from previously defined ones. Choose two or more data sources that need to be analyzed together. When creating a domain selected tables of these different data sources can then be joined. A virtual data source is particularly useful if there is the need to combine an IMSMA staging area with another database that contains additional data, e.g. on socioeconomic figures. The prerequisite though is the existence of a common key between the data sets.