Changes

Difference MySQL PostGreSQL

13,582 bytes added, 19:56, 29 September 2015
no edit summary
{{Under construction Warning| This page is under constructionThere are relation differences between {{IMSMANG}} 5.08.04 and 6.0 database meaning that FKeys constrains and cascading rules have been changed even for tables that existed in 5.08.04.}} {{TOC right}}
== Introduction ==__NOEDITSECTION__
{{IMSMANG}} version 6.0 is building on a PostgreSQL database management system, whereas previous {{IMSMANG}} versions were built on a MySQL database. Both systems are relational database systems and using SQL, the Structured Query Language, as means for interacting with it. However, there are some differences between the MySQL and PostgreSQL syntax. It is not possible to list all differences here, but the main ones encountered in the {{IMSMANG}} world are highlighted below.
The biggest differences are:
* PostGreSQL PostgreSQL is case sensitive* Command/functionBuilt-in functions (e.g. conditional expressions)* Dates are stored without format and time formats and queries* Custom functions For more details about the PostgreSQL syntax it is advised to consult the official PostgreSQL documentation at [http://www.postgresql.org/docs/9.1/static/index.html]. For more details about using SQL statements in PostGreSQL{{IMSMANG}} import scripts contact your [[Information Management Team | GICHD IM advisor]].* Round off * Customised functions{{Note | PostgreSQL complies to ANSI SQL standards, whereas MySQL does not.}}
== Changes Most common changes in database queries and views from mySQL MySQL to PostreSQL PostgreSQL ==__NOEDITSECTION__When upgrading from {{IMSMANG}} 5.08.04 to 6.0, the database views need to be adapted, i.e. changed from mySQL MySQL syntax to PostgreSQL syntax. The country focal point is taking care of this. For reference, this section describes the main changes encountered. This can also be used as a tutorial for creating views in PostgreSQL, for {{IMSMANG}} administrators who were used to creating them in mySQLMySQL.
=== Case sensitivity and column names ===__NOEDITSECTION__
PostgreSQL, as opposed to MySQL, is case-sensitive. This means that strings need to be provided in exactly the way they are stored in the database. For example, in the table IMSMAENUM, one out of many values in the column ENUMVALUE is ''Progress Report'', with capital P and capital R. In mySQL MySQL it was possible to write:
<pre>
SELECT *
Therefore, a create view statement like this:<br />
'''mySQLMySQL'''
<pre>
CREATE OR REPLACE VIEW my_view AS
SELECT ...
FROM `imsmaenum`, ...WHERE `imsmaenum`.`enumvalue ` = 'progress report';
</pre>
</pre>
{{Note | The single quote '''`''' present in the MySQL queries is an artifact of the tool Navicat from which these queries have been extracted.}} Similarly, in mySQL MySQL it is possible to define column names with a mix of upper and lower case as well as blanks and other special characters without any trouble. In PostgreSQL, this is also possible, but in this case the column name '''must''' be enclosed in double quotes. For example, the following create view statement:<br />'''mySQLMySQL'''
<pre>
CREATE VIEW my_view AS
SELECT `hazreducdeviceinfo`.`hazreduc_guid ` AS `hazreduc_guid`,`ordnance`.`model ` AS `Device Type`FROM `hazreducdeviceinfo`;
</pre>
</pre>
{{Note | This also applies to querying the data. For select rows where the device type is equal to 'xy', the syntax is the following: <pre> ... WHERE "Device Type" = 'xy'</pre>}} {{Note | If an column name in PostgreSQL is written with upper case but not enclosed into double quotes, it will be created in lower case in the database.}} === CAST AS CHAR CHARSET utf8 and _utf8 (mySQLMySQL) ===__NOEDITSECTION__In some (older) MySQL versions, UTF-8 was not the default character set and strings thus needed to be explicitly converted to UTF-8. This is not the case in PostgreSQL and UTF-8 related syntax as used in MySQL does not work in PostgreSQL. Below are two examples encountered in {{IMSMANG}} views. 
'''MySQL'''
<pre>
</pre>
=== IF and IFNULL (mySQLMySQL) / CASE (PostgreSQL) statements (conditional expressions) ===__NOEDITSECTION__The statements IF() and IFNULL() do not exist in PostgreSQL. In PostgreSQL, the CASE statement is used for conditional expressions. The syntax of the CASE statement is as follows:<pre>CASE WHEN condition THEN result [WHEN ...] [ELSE result]END</pre> Therefore, for example the following create view statement:<br />'''MySQL'''<pre>CREATE VIEW `my_view` ASSELECTsum(if((`my_view2`.`Device Type` = _utf8'SAA'),`my_view2`.`Quantity`,0)) AS`SAA`,sum(if((`my_view2`.`Device Type` = _utf8'AP'),`my_view2`.`Quantity`,0)) AS `AP`,sum(if((`my_view2`.`Device Type` = _utf8'AIED'),`my_view2`.`Quantity`,0)) AS `AIED`,sum(if((`my_view2`.`Device Type` = _utf8'UXO'),`my_view2`.`Quantity`,0)) AS `UXO`,sum(if((`my_view2`.`Device Type` = _utf8'AT'),`my_view2`.`Quantity`,0)) AS `AT`,sum(if((`my_view2`.`Device Type` = _utf8'Fragments'),`my_view2`.`Quantity`,0)) AS `Fragments`FROM`my_view2`;</pre> needs to be changed to:<br />'''PostgreSQL'''<pre>CREATE VIEW my_view ASSELECTsum(case when my_view2.Device Type = 'SAA') then my_view2.Quantity else 0 end) AS "SAA",sum(case when my_view2.Device Type = 'AP') then my_view2.Quantity else 0 end) AS "AP",sum(case when my_view2.Device Type = 'AIED') then my_view2.Quantity else 0 end) AS "AIED",sum(case when my_view2.Device Type = 'UXO') then my_view2.Quantity else 0 end) AS "UXO",sum(case when my_view2.Device Type = 'AT') then my_view2.Quantity else 0 end) AS "AT",sum(case when my_view2.Device Type = 'Fragments') then my_view2.Quantity else 0 end) AS "Fragments"FROMmy_view2;</pre> The ISNULL() function can also be translated into a CASE statement:<pre>CASE WHEN field IS NULL THEN resultEND</pre> Alternatively, the built-in function COALESCE() can be used in PostgreSQL:<pre>SELECT COALESCE(field, 'N/A')FROM table;</pre> This query will return the value of ''field'' if it is not null, or the string N/A if field is null. In general, COALESCE() returns the first value that is not null:<pre>SELECT COALESCE(field1, field2, field3, 'N/A')FROM table;</pre>If field1 is null, this query will evaluate field2 -- if field2 is null, it will evaluate field3, etc. If all the provided fields are null, it will return the last provided value, in this case the string N/A.
=== Date manipulations ===__NOEDITSECTION__
There are differences between MySQL and PostgreSQL when it comes to manipulating fields of type DATE, such as for example extracting the year or the month of a date. In MySQL, the related functions are year() and month(). In PostgreSQL, there is an extract() function that does the job. See the below example for reference. The following create view statement:<br />
'''MySQL'''
<pre>
CREATE VIEW `my_view` AS
SELECT
(year(`my_view2`.`DateofAccident`) - year(`my_view2`.`DateofBirth`)) AS `Age`,
year(`my_view2`.`DateofAccident`) AS `Year`,
month(`my_view2`.`DateofAccident`) AS `Month`
FROM `my_view2`;
</pre>
 
needs to be changed to:<br />
'''PostgreSQL'''
<pre>
CREATE VIEW my_view AS
SELECT
extract(year from my_view2."DateofAccident") - extract(year from my_view2."DateofBirth") AS "Age",
extract(year from my_view2."DateofAccident") AS "Year",
extract(month from my_view2."DateofAccident") AS "Month"
FROM my_view2;
</pre>
=== The GROUP BY clause ===__NOEDITSECTION__
MySQL has much more "relaxed" rules about using the GROUP BY clause than PostgreSQL has. Therefore, adaptations might need to be done in views using grouping with GROUP BY.
The things to consider are the following:
* '''PostgreSQL''': When GROUP BY is present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions or if the ungrouped column is functionally dependent on the grouped columns (cf. [http://www.postgresql.org/docs/9.1/static/sql-select.html#SQL-GROUPBY]). In other words, when the SELECT part of a query or create view statement contains three columns without aggregation and two with an aggregation function, all three columns without the aggregation '''must''' be specified in the GROUP BY clause. This is not the case in MySQL.
 
An example best illustrates that behavior. Consider this '''MySQL''' view from an {{IMSMANG}} deployment in a country:
<pre>
CREATE VIEW `annualreport_view_bac` AS
select `progress`.`hazreduc_guid` AS `ProgressGUID`,
cast(`progress`.`startdate` as date) AS `StartofReportingPeriod`,
cast(`progress`.`enddate` as date) AS `EndofReportingPeriod`,
`progress`.`areasize` AS `AreaCleared`,
sum(if((`annualreport_view_devicesdestroyed`.`Model` = _utf8'SAA'),`annualreport_view_devicesdestroyed`.`Quantity`,0)) AS `SAA`,
sum(if((`annualreport_view_devicesdestroyed`.`Model` = _utf8'AP'),`annualreport_view_devicesdestroyed`.`Quantity`,0)) AS `AP`,
sum(if((`annualreport_view_devicesdestroyed`.`Model` = _utf8'UXO'),`annualreport_view_devicesdestroyed`.`Quantity`,0)) AS `UXO`,
sum(if((`annualreport_view_devicesdestroyed`.`Model` = _utf8'AT'),`annualreport_view_devicesdestroyed`.`Quantity`,0)) AS `AT`,
sum(if((`annualreport_view_devicesdestroyed`.`Model` = _utf8'Fragments'),`annualreport_view_devicesdestroyed`.`Quantity`,0)) AS `Fragments`,
sum(if((`annualreport_view_devicesdestroyed`.`Model` = _utf8'AIED'),`annualreport_view_devicesdestroyed`.`Quantity`,0)) AS `AIED`,
`annualreport_view_clearancetype`.`Clearance Type` AS `ClearanceType`,
`my_view_organization`.`ShortName` AS `Agency`,
`my_view_gazetteer`.`Region` AS `Region`,
`my_view_gazetteer`.`Province` AS `Province`,
`my_view_gazetteer`.`District` AS `District`,
`my_view_gazetteer`.`Village` AS `Village`
from (((((`hazreduc` `Progress`
join `imsmaenum` on((`progress`.`hazreductypeenum_guid` = `imsmaenum`.`imsmaenum_guid`)))
left join `annualreport_view_devicesdestroyed` on((`annualreport_view_devicesdestroyed`.`ProgressGUID` = `progress`.`hazreduc_guid`)))
join `annualreport_view_clearancetype` on((`annualreport_view_clearancetype`.`Hazreduc_guid` = `progress`.`hazreduc_guid`)))
join `my_view_organization` on((`progress`.`org_guid` = `my_view_organization`.`org_guid`)))
join `my_view_gazetteer` on((`my_view_gazetteer`.`location_guid` = `progress`.`location_guid`)))
where (`imsmaenum`.`enumvalue` = _utf8'progress report')
group by `progress`.`hazreduc_guid`;
</pre>
 
Note the following regarding the above code:
* All the column definitions starting with ''sum(if''... use aggregation functions (in this case sum()).
* Only one column (hazreduc_guid) is present in the GROUP BY clause, but many columns without an aggregation function are specified in the SELECT clause.
 
In '''PostgreSQL''', even after performing all changes described above on this page, this view definition will not work. The reason is that all the non-aggregation columns of the SELECT statement need to be referred to in the GROUP BY clause. An exception can only be made for those that are dependent on another attribute that is specified in the GROUP BY. In this example, since the hazreduc_guid is in the GROUP BY and it is the primary key of the table HAZREDUC (renamed to ''progress'' in the above query), all other columns of that same table (in this case cast(progress.startdate as date), cast(progress.enddate as date) and progress.areasize can be omitted from the GROUP BY clause.
Taking into account all the changes to be made as described on this page, and looking into the GROUP BY clause issue, the following code would be equivalent in '''PostgreSQL''':
 
<pre>
CREATE VIEW annualreport_view_bac AS
select progress.hazreduc_guid AS "ProgressGUID",
cast(progress.startdate as date) AS "StartofReportingPeriod",
cast(progress.enddate as date) AS "EndofReportingPeriod",
progress.areasize AS "AreaCleared",
sum(case when annualreport_view_devicesdestroyed.Model = 'SAA' then annualreport_view_devicesdestroyed.Quantity else 0 end) AS "SAA",
sum(case when annualreport_view_devicesdestroyed.Model = 'AP' then annualreport_view_devicesdestroyed.Quantity else 0 end) AS "AP",
sum(case when annualreport_view_devicesdestroyed.Model = 'UXO' then annualreport_view_devicesdestroyed.Quantity else 0 end) AS "UXO",
sum(case when annualreport_view_devicesdestroyed.Model = 'AT' then annualreport_view_devicesdestroyed.Quantity else 0 end) AS "AT",
sum(case when annualreport_view_devicesdestroyed.Model = 'Fragments' then annualreport_view_devicesdestroyed.Quantity else 0 end) AS "Fragments",
sum(case when annualreport_view_devicesdestroyed.Model = 'AIED' then annualreport_view_devicesdestroyed.Quantity else 0 end) AS "AIED",
annualreport_view_clearancetype."Clearance Type" AS "ClearanceType",
my_view_organization.ShortName AS "Agency",
my_view_gazetteer.Region AS "Region",
my_view_gazetteer.Province AS "Province",
my_view_gazetteer.District AS "District",
my_view_gazetteer.Village AS "Village"
from hazreduc progress
join imsmaenum on (progress.hazreductypeenum_guid = imsmaenum.imsmaenum_guid)
left join annualreport_view_devicesdestroyed on (annualreport_view_devicesdestroyed.ProgressGUID = progress.hazreduc_guid)
join annualreport_view_clearancetype on (annualreport_view_clearancetype.Hazreduc_guid = progress.hazreduc_guid)
join my_view_organization on (progress.org_guid = my_view_organization.org_guid)
join my_view_gazetteer on (my_view_gazetteer.location_guid = progress.location_guid)
where (imsmaenum.enumvalue = 'Progress Report')
group by progress.hazreduc_guid,
cast(progress.startdate as date),
cast(progress.enddate as date),
progress.areasize,
annualreport_view_clearancetype."Clearance Type",
my_view_organization.ShortName,
my_view_gazetteer.Region,
my_view_gazetteer.Province,
my_view_gazetteer.District,
my_view_gazetteer.Village;
</pre>
 
Note that the columns cast(progress.startdate as date), cast(progress.enddate as date) and progress.areasize in the GROUP BY are optional.
 
{{Note | Although the PostgreSQL version can seem more "tedious" here, it is actually more correct and standard-compliant. In fact, it is important to specify all the groupings, otherwise the system (e.g. MySQL) will choose the remaining ones and apply some sorting to do so.}}
 
{{Warning| When upgrading views and making such changes/adaptations regarding the GROUP BY, it is important to compare the result set in both versions!}}
 
The MySQL way of using the GROUP BY can also be emulated by the '''DISTINCT ON''' in PostgreSQL. The '''MySQL''' query
<pre>
SELECT a,b,c,d,e FROM table GROUP BY a
</pre>
can be translated into the '''PostgreSQL''' query
<pre>
SELECT DISTINCT ON (a) a,b,c,d,e FROM table ORDER BY a,b,c
</pre>
 
== Postgres Tips & Tricks ==__NOEDITSECTION__
The following is a collection of useful PostgreSQL queries using special functions.
=== String manipulations===
'''Concatenation'''
<pre>
SELECT hazard_localid||'----'||hazardname AS full_haz_desc
FROM hazard;
</pre>
 
'''Substring'''
<pre>
SELECT DISTINCT substr(hazreduc_localid, 1, 2)
FROM hazreduc;
</pre>
 
=== Rounding numeric values ===__NOEDITSECTION__
<pre>
SELECT round(areasize)
FROM hazard;
 
SELECT round(areasize, 2)
FROM hazard;
</pre>
 
=== Type conversions ===__NOEDITSECTION__
* to_char
* to_date
* to_timestamp
* to_number
* cast
 
<pre>
SELECT to_char(areasize, '9999999999')
FROM hazard;
 
SELECT cast(areasize as varchar)
FROM hazard;
</pre>
 
=== Query for duplicates ===__NOEDITSECTION__
<pre>
SELECT hazard_localid, count(*)
FROM hazard
GROUP BY hazard_localid
HAVING count(*) > 1;
</pre>
 
=== WITH clause ===__NOEDITSECTION__
Provides a very convenient way to write auxiliary statements for use in a larger query
<pre>
WITH my_subquery1 AS (
SELECT *
FROM a
), my_subquery2 AS (
SELECT *
FROM b
)
SELECT *
FROM my_subquery1
WHERE my_subquery1.x IN (SELECT x FROM my_subquery2);
 
</pre>
== References ==__NOEDITSECTION__
* PostgreSQL 1.9.4 documentation: http://www.postgresql.org/docs/9.1/static/index.html
{{NavBox Technical Notes}}
[[Category:NAA]]
6,632
edits