Creating MINT Domains

From IMSMA Wiki
Revision as of 13:53, 20 June 2017 by Evinek (talk | contribs)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to: navigation, search

A Domain in MINT is a view on data from a data source, e.g. from an IMSMA staging area. It presents the data in a certain context and allows end-users to access that data in a comprehensive way. A domain can be compared to a view on the database - it in fact implements a view functionality within MINT. This means that instead of having a view object directly on the database, a domain is a metadata layer residing inside MINT.

Usually, different types of data (e.g. data on accidents, data on activities, etc.) are presented in different domains, unless these different types need to be put in context and presented in a single report.

A domain has to be designed by a MINT administrator. It is based on database tables (and potentially views), derived tables, joins between all the specified tables and derived tables, calculated fields, display names, and certain properties that can be defined, such as security and localisation.

In short, a domain presents selected data field to an audience, i.e. to end-users running reports and designing dashboards.

Note.jpg Basic relational database / SQL knowledge is required for creating a domain.

Add a domain

The Add New Domain page can be accessed via the top menu by selecting Create → Domain.

MINT create domain.png

On the Add New Domain page, the following information has to be provided:

  • A Name and Resource ID for the domain. The Resource ID is a unique identifier and cannot be changed later.
  • An optional description
  • A location in the MINT repository. It is recommended to save all domains in a single folder called Domains.
  • The data source from which data should be read and presented.
MINT add domain properties1.png

This sets the metadata of the domain. However, the domain still needs to be designed, i.e. it needs to be specified what data should be displayed how. This can be done via the so called Domain Designer that is started by clicking on the Create with Domain Designer... link. Alternatively, if available, an existing domain file can be uploaded - this would be a predefined domain designed previously and exported as a template. This is out of the scope here though.

MINT edit with domain designer.png

The Domain Designer

The Domain Designer is a graphical tool available within MINT that allows designing a domain. When opening the Domain Designer, MINT connects to the data source that has been specified in the previous step (when specifying the metadata of the new domain). A relational database is usually organised in so called database schemas which represent logical grouping of data on the database. When designing a domain, MINT first needs to know to which schema on the specified database it should connect.

MINT domain select schema.png

On a PostgreSQL database (such as an IMSMA Staging Area), the actual data always resides in the schema called public. Therefore, the public schema always has to be selected in this step, no other one. Select public and click OK. For other database systems, the schema names might be different though. Please get in touch with your GICHD IM advisor in case of doubt.

The Domain Designer now opens on its first tab called Tables. The sections of the designer are the following.

MINT domain designer tabs.png
  • Tables: this part allows selecting all the tables from which should should be integrated into the domain.
  • Derived Tables: This sections allows the specification of any query to be run on the database. The result is stored as a so called derived table that is subsequently treated in the same way as any table selected in the Tables tab.
  • Joins: when more than one table or derived table are specified, joins have to be defined between them.
  • Calculated Fields: specific fields can be calculated based on other fields. For example, for data on land (hazard), the difference between the estimated and the calculated area could be determined and the value stored as a field.
  • Pre-filters: this section allows to specify filters on the data. For example, data on accidents could be restricted based on the date of the accident; activities could be filtered for a certain organisation only, etc.
  • Display: this final section allows selecting the data to be displayed, and organising it in meaningful sets. Additionally, each field/attribute can be renamed.

Throughout the remaining of this page, it is assumed that a domain on accident data (including ordnance that caused the accident) is designed.

Tables

In the left-hand pane, all the tables available in the chosen schema (in this case, the public schema of the staging area) are displayed. Each table required for the domain can be select either by double-clicking on it, or by highlighting it and using the arrows in the middle of the screen, between the two panes.

In the example mentioned, two tables are required: acccident and accdeviceinfo. In MINT, they are prefixed with public_, indicating that they reside inside the schema called public.

MINT domain designer tables.png
Note.jpg In this tab, only tables of the database are displayed. If views need to be queried, this can be achieved via derived tables as described in the next section.

Once all required tables are selected, click on the next tab, Derived Tables.

Derived Tables

A derived table is the result of any query that can be run on the database. Let's look at the following example: assume that the requirement it to create a table displaying the number of accidents by year. Although the date of the accident is available as an attribute, let's create another attribute holding only the year of the accident, not the month and day. In PostgreSQL, this can be achieved with the following query:

SELECT extract(year from dateofaccident) as year_of_accident FROM accident

A derived table can be as simple as this query. However, since the derived table needs to be joined to the other tables (in this example, specifically to the accident table), the unique identifier of the row should be selected as well, leading to the following query:

SELECT accident_guid, extract(year from dateofaccident) as year_of_accidnet FROM accident

Enter this query in the Query (required) pane, give it a name in the Query ID (required) field, and click on Run Query. If the query runs successfully on the database specified for the domain, the attributes returned by the database are displayed. Select both fields and click on Select Table. The derived table now appears in the left-hand pane, under Available objects. The symbol next to the name suggests that it is a derived table, as opposed to a normal database table.

MINT domain designer derived tables.png

Once all required derived tables are selected, click on the next tab, Joins.

Joins

In the example mentioned above, three tables had been selected/created, two normal ones and one derived. As for any database query, when data from several tables is combined, these tables need to be put in context, i.e. it has to be specified how these tables can be combined. In the above example, the following joins need to be created:

  • Join between accident and accdeviceinfo on the attribute accident_guid present in both tables.
  • Join between accident and the derived table year_of_accident, also on the attribute accident_guid present in both tables.

In the Domain Designer, these joins can be created graphically, i.e. they do not have to be written in SQL. In the Join section of the Designer, the Left and Right tables can be expanded in order to select the attributes for the join. Once both on the left and on the right-hand side the required attributes have been selected, the type of join (inner, left outer, right outer, full outer) has to be specified by clicking on the corresponding symbol.

MINT domain designer join types.png

For the above example, the end result would look as follows:

MINT domain designer joins.png

Once all required joins are defined, click on the next tab, Calculated Fields.

Calculated Fields

The Calculated Fields section allows the specification of new fields derived from existing ones. For example, arithmetic operators can be applied to existing fields, e.g. multiplying, dividing, adding or subtracting two or more numeric values. An if-then-else logic can also be implemented, such as in the following example expression:

if (public_accident.sourcereliabilityenum == null, 'Unknown', public_accident.sourcereliabilityenum)

This expression acts like a CASE statement on the database: if there is no value for the attribute sourcereliabilityenum, then the string Unknown is displayed, otherwise (i.e. if there is a value), that value is taken into account. Such a manipulation can be required in the following use cases for example:

  • Eliminating null values (such as in the example above)
  • Grouping categories together (e.g. the values Agriculture and Agricultural of a single-value attribute, if they have the same meaning)
Note.jpg Take into consideration that there are several steps in the process where such manipulations can be made:
  • At the source, i.e. inside the IMSMA database: this fixes the problem at the source and is the best option
  • In a database view: create a view on the database managing e.g. elimination of null values, grouping of categories, etc.
  • Using derived tables when designing a MINT domain (use SQL)
  • Using calculated fields

The choice of the method depends on the requirements, the urgency, timeline, etc. It is important to be consistent though and document where each manipulation is implemented.

MINT domain designer calculated fields.png
Note.jpg The reference of the language used for defining calculated fields can be found here.

Once all required calculated fields are defined, click on the next tab, Pre-Filters.

Pre-filters

This section allows defining filters on the data. For example, if a domain containing accident data from a specific year only should be created, a filter can be created on the date of accident attribute, as in the example below. Any field from the selected tables and derived tables, as well as from the calculated fields, can be double-clicked or dragged and dropped to the right-hand filter pane. Depending on the data type of the selected attribute, several filter operators are suggested, such as equals, greater than, between, etc.

MINT domain designer prefilters.png
Note.jpg Take into account that filters can be applied later, when creating an ad-hoc analysis on a domain. In order to keep the number of domains manageable, it is thus recommended to create rather generic domains and apply filters later when analysing the data.

Once all required pre-filters are defined, click on the next tab, Display.

Display

Finally, this last tab lets administrators organise and name the attributes to be presented to the users who will make ad-hoc views and reports based on domains. This is an important step that should be done carefully, since it is about giving meaningful names and labels to each attribute.

Note.jpg The names of the attributes of the staging area might not always be meaningful, end-user names. For example, sourcereliabilityenum is the name of an attribute - display it as Source reliability will be much clearer for an end-user.

On the Display page, the left-hand pane contains all selected tables and derived tables, each with all their attributes. In order to build up an organisation and select useful attributes, the below steps can be followed:

  1. Think about how the attributes should be grouped together, in a folder-like manner. In a MINT domain, a folder is called a set. For example, in the example followed through this page, attributes from accidents and from devices causing the accidents are exposed. One possibility would thus be to group the core accident attributes in a folder, and organise the device-related attributes in a sub-set. Use the Add Set button to add a new set. If an existing set is highlighted, then the new set will be created as a sub-set.
    MINT domain designer display detail.png
  2. From the available attributes in the left-hand pane, select those that should be exposed to the user, and assign them to the set they logically belong to. While the tables can contain a lot of attributes, only a few might be really used and have values. It is thus recommended to only display those attributes that make sense to analyse in a specific context. Use the Add to sets and Add to selected set buttons to assign attributes to a set.
    MINT domain designer display add to sets.png
  3. Finally, each attribute can be renamed by editing its properties.
    MINT domain designer display edit properties.png
    Note.jpg It is recommended to change the Label only and keep the original attribute name as the ID.
MINT domain designer display.png

Finally, after having organised and renamed the attributes, click OK in the lower left corner, and then Submit on the following page in order to save the domain.

Note.jpg When defining a complex domain and/or making many changes that require some time, it is recommended to save/submit the domain design from time to time, to avoid losing changes due to e.g. a timeout or connection issue.

Edit a domain

To edit an existing domain, highlight the domain in the Repository and click Edit.

MINT edit domain.png

Then, click Edit with Domain Designer to open the domain in the designer and make any changes in one of the tabs described above.

Note.jpg When editing a domain in the Domain Designer, the view jumps to the Display tab. Click any other tab (Tables, Derived Tables, Joins, etc.) as required to make changes.

Make sure to click OK in the lower left corner, and then Submit on the following page in order to save the changes made to the domain.

Note.jpg Some changes are not permitted. For example, if a field from a domain is used in an Ad-Hoc View, then this field cannot be removed from the list of displayed field, because it would corrupt depending objects. Resulting error messages usually point to the exact issue.

Delete a domain

To delete a domain, highlight the domain in the Repository and click Delete.

MINT delete domain.png
Note.jpg Note that a domain can only be deleted if no object in the Repository is depending on it, i.e. there is no Ad-Hoc View in the Repository that is based on this domain.