Creating MINT Domains

From IMSMA Wiki
Revision as of 07:04, 14 October 2014 by Evinek (talk | contribs)
Jump to navigation Jump to search
This page is under construction

A Domain in MINT is a view on data residing in a data source, e.g. in an IMSMA staging area. It presents the data in a certain context and allows end-users to access that data in a comprehensive way. A domain can be compared to a view on the database - it in fact implements a view functionality within MINT. This means that instead of having a view object directly on the database, a domain is a metadata layer residing inside MINT.

Usually, different types of data (e.g. data on accidents, data on activities, etc.) are presented in different domains, unless these different types need to be put in context and presented in a single report.

A domain has to be designed by a MINT administrator. It is based on database tables (and potentially views), derived tables, joins between all the specified tables and derived tables, calculated fields, display names, and certain properties that can be defined, such as security and localisation.

In short, a domain presents selected data field to an audience, i.e. to end-users running reports and designing dashboards.

Basic relational database / SQL knowledge is required for creating a domain.

Add a domain

The Add New Domain page can be accessed via the top menu by selecting Create → Domain.

On the Add New Domain page, the following information has to be provided:

  • A Name and Resource ID for the domain. The Resource ID is a unique identifier and cannot be changed later.
  • An optional description
  • A location in the MINT repository. It is recommended to save all domains in a single folder called Domains.
  • The data source from which data should be read and presented.

This sets the metadata of the domain. However, the domain still needs to be designed, i.e. it needs to be specified what data should be displayed how. This can be done via the so called Domain Designer that is started by clicking on the Create with Domain Designer... link. Alternatively, if available, an existing domain file can be uploaded - this would be a predefined domain designed previously and exported as a template. This is out of the scope here though.

The Domain Designer

The Domain Designer is a graphical tool available within MINT that allows designing a domain. When opening the Domain Designer, MINT connects to the data source that has been specified in the previous step (when specifying the metadata of the new domain). A relational database is usually organised in so called database schemas which represent logical grouping of data on the database. When designing a domain, MINT first needs to know to which schema on the specified database it should connect.

On a PostgreSQL database (such as an IMSMA Staging Area), the actual data always resides in the schema called public. Therefore, the public schema always has to be selected in this step, no other one. Select public and click OK. For other database systems, the schema names might be different though. Please get in touch with your GICHD IM focal point in case of doubt.

The Domain Designer now opens on its first tab called Tables. The sections of the designer are the following.

  • Tables: this part allows selecting all the tables from which should should be integrated into the domain.
  • Derived Tables: This sections allows the specification of any query to be run on the database. The result is stored as a so called derived table that is subsequently treated in the same way as any table selected in the Tables tab.
  • Joins: when more than one table or derived table are specified, joins have to be defined between them.
  • Calculated Fields: specific fields can be calculated based on other fields. For example, for data on land (hazard), the difference between the estimated and the calculated area could be determined and the value stored as a field.
  • Pre-filters: this section allows to specify filters on the data. For example, data on accidents could be restricted based on the date of the accident; activities could be filtered for a certain organisation only, etc.
  • Display: this final section allows selecting the data to be displayed, and organising it in meaningful sets. Additionally, each field/attribute can be renamed.

Throughout the remaining of this page, it is assumed that a domain on accident data (including ordnance that caused the accident) is designed.

Tables

In the left-hand pane, all the tables available in the chosen schema (in this case, the public schema of the staging area) are displayed. Each table required for the domain can be select either by double-clicking on it, or by highlighting it and using the arrows in the middle of the screen, between the two panes.

In the example mentioned, two tables are required: acccident and accdeviceinfo. In MINT, they are prefixed with public_, indicating that they reside inside the schema called public.

In this tab, only tables of the database are displayed. If views need to be queried, this can be achieved via derived tables as described in the next section.

Once all required tables are selected, click on the next tab, Derived Tables.

Derived Tables

A derived table is the result of any query that can be run on the database. Let's look at the following example: assume that the requirement it to create a table displaying the number of accidents by year. Although the date of the accident is available as an attribute, let's create another attribute holding only the year of the accident, not the month and day. In PostgreSQL, this can be achieved with the following query:

SELECT extract(year from dateofaccident) as year_of_accident FROM accident

A derived table can be as simple as this query. However, since the derived table needs to be joined to the other tables (in this example, specifically to the accident table), the unique identifier of the row should be selected as well, leading to the following query:

SELECT accident_guid, extract(year from dateofaccident) as year_of_accidnet FROM accident

Enter this query in the Query (required) pane, give it a name in the Query ID (required) field, and click on Run Query. If the query runs successfully on the database specified for the domain, the attributes returned by the database are displayed. Select both fields and click on Select Table. The derived table now appears in the left-hand pane, under Available objects. The symbol next to the name suggests that it is a derived table, as opposed to a normal database table.

Once all required derived tables are selected, click on the next tab, Joins.

Joins

In the example mentioned above, three tables had been selected/created, two normal ones and one derived. As for any database query, when data from several tables is combined, these tables need to be put in context, i.e. it has to be specified how these tables can be combined. In the above example, the following joins need to be created:

  • Join between accident and accdeviceinfo on the attribute accident_guid present in both tables.
  • Join between accident and the derived table year_of_accident, also on the attribute accident_guid present in both tables.

In the Domain Designer, these joins can be created graphically, i.e. they do not have to be written in SQL. In the Join section of the Designer, the Left and Right tables can be expanded in order to select the attributes for the join. Once both on the left and on the right-hand side the required attributes have been selected, the type of join (inner, left outer, right outer, full outer) has to be specified by clicking on the corresponding symbol.

For the above example, the end result would look as follows:

Once all required joins are defined, click on the next tab, Calculated Fields.

Calculated Fields

The Calculated Fields section allows the specification of new fields derived from existing ones. For example, arithmetic operators can be applied to existing fields, e.g. multiplying, dividing, adding or subtracting two or more numeric values. An if-then-else logic can also be implemented, such as in the following example expression:

if (public_accident.sourcereliabilityenum == null, 'Unknown', public_accident.sourcereliabilityenum)

This expression acts like a CASE statement on the database: if there is no value for the attribute sourcereliabilityenum, then the string Unknown is displayed, otherwise (i.e. if there is a value), that value is taken into account. Such a manipulation can be required in the following use cases for example:

  • Eliminating null values (such as in the example above)
  • Grouping categories together (e.g. the values Agriculture and Agricultural of a single-value attribute, if they have the same meaning)
Take into consideration that there are several steps in the process where such manipulations can be made:
  • At the source, i.e. inside the IMSMA database: this fixes the problem at the source and is the best option
  • In a database view: create a view on the database managing e.g. elimination of null values, grouping of categories, etc.
  • Using derived tables when designing a MINT domain (use SQL)
  • Using calculated fields

The choice of the method depends on the requirements, the urgency, timeline, etc. It is important to be consistent though and document where each manipulation is implemented.

Pre-filters

Display

Edit a domain

Delete a domain

{{#switch:|subgroup|child=|none=|#default=

}}{{#ifeq:|Template|{{#ifeq:|child||{{#ifeq:|subgroup||{{#switch:creating mint domains

|doc
|sandbox
|testcases =
|#default = {{#switch:hlist
 |plainlist
 |hlist
 |hlist hnum
 |hlist vcard
 |vcard hlist = 
 |#default = 
 }}
}}

}}}}}}