Migrating from MySQL to PostgreSQL
----------------------------------
By: Jeroen Ruigrok van der Werven <asmodai@tendra.org>

NOTE: This is a pre-DocBook draft.  There's some initial markup which is my
own personal style.


Abstract
--------

This reference guide describes how to migrate from MySQL to PostgreSQL on all
levels.


Copyright information
---------------------

MySQL is a registered trademark of MySQL AB in the United States, the European
Union and other countries.


Preface [preface]
=======

This guide hopes to detail the changes a person must make to his/her database
design in order to switch from MySQL to PostgreSQL.  The database versions
used in the comparison/conversion testing are MySQL 4.x and PostgreSQL 7.3.x.
No doubt a lot of the information presented here also applied to past and
future versions.

This chapter has the following sections:

- Audience
- Prerequisite Knowledge
- Structure of this Guide
- Conventions Used


Audience [section]
--------

This guide is aimed at people who are migrating from MySQL to PostgreSQL.


Prerequisite Knowledge [section]
----------------------

You should be familiar with the operating system(s) on which you are running
MySQL and PostgreSQL.  You should also be familiar with relational database
concepts.


Structure of this Guide [section]
-----------------------

This reference guide is organized as follows:

Chapter 1, "Overview"

Describes MySQL and PostgreSQL and their respective features.

Chapter 2?, "Data Types"

Describes the data types from MySQL to PostgreSQL.  Also describes reserved
words.


Conventions Used [section]
----------------

This section describes the conventions used within this guide for text and
code examples.  It describes:

- Text Conventions Used
- Code Examples Conventions Used

Text Conventions Used [subsection]
- - - - - - - - - - -

Convention	Meaning			Example
------------------------------------------------------------------------------
Bold		The bold typeface	When you use this argument, you create
		indicates terms that	a -unique index-.
		are defined in the
		text, that appear in
		the glossary, or both.

/Italics/	The italic typeface	PostgreSQL 7.3 User's Guide
		indicates book titles,
		emphasis, placeholders,
		or syntax clauses.

UPPERCASE	The uppercase monospace	The N U M E R I C datatype allows for
monospace	typeface indicates	arbitrary precision.
		system-supplied
		elements, such as:
		SQL keywords, commands,
		datatypes, and
		privileges.

lowercase	The lowercase monospace	Use p s q l to start the command line
monospace	typeface indicates	client to PostgreSQL.
		user-supplied elements
		and executables.  Such
		elements include
		database names, column
		names, user names, and
		parameter values.

Code Examples Conventions Used [subsection]
- - - - - - - - - - - - - - - 

The code examples show typical use of SQL, PL/pgSQL, or other command-line
statements.  They are displayed using a monospace typeface and clearly
separated from the normal text.

SELECT firstname FROM customers WHERE lastname = 'Smith';

The following is a list of conventions used in the code examples, their
meaning, and an example.

Convention		Meaning			Example
------------------------------------------------------------------------------
[]			Indicates that the	CREATE SCHEMA schemaname
			enclosed arguments are	[ AUTHORIZATION username ]
			optional.

{}			Indicates that one of	CREATE { TEMPORARY | TEMP }
			the enclosed arguments	TABLE table_name
			is required.

|			Separates alternative	[ ENCRYPTED | UNENCRYPTED ]
			items that may be
			optional or required.	{ BEFORE | AFTER }

...			Indicates that the	CREATE USER username
			preceding item can be	[ [ WITH ] option [ ... ] ]
			repeated.  In code
			fragments an ellipsis	SELECT column1, column2, ... ,
			means that code not	column/n/ FROM table;
			relevant to the
			discussion has been
			omitted.

Italics			Indicates variables
			that you must supply
			values for.

UPPERCASE

lowercase


Chapter 1, Overview [chapter]
===================

Features [section]
--------

MySQL Features [subsection]
- - - - - - - 

PostgreSQL Features [subsection]
- - - - - - - - - -

heritance
data types
functions

Other features provide additional power and flexibility:

constraints
triggers
rules
transactional integrity


So, exactly why are you changing?
---------------------------------

There are some reasons a person might want to switch from MySQL to PostgreSQL.
Some are based on technical merit, others on personal feelings.

A big difference is that PostgreSQL is BSD-licensed.  This makes a whole of a
lot difference for some people.  This is but a small reason to make a change.
The true reasoning for switching databases comes from the technical side.
PostgreSQL, unlike MySQL, is a full-fledged relational database management
system (RDBMS).  This means PostgreSQL is slowly pushing its way up there with
the big names such as: Oracle, IBM, and Sybase.

The reason MySQL took off over PostgreSQL is that it is (was?) blazingly fast
on the select query side.  And given this and the average deployment of MySQL
(as a backend for websites) allowed for a big boost in use.
There comes a time however when a (beginning) DBA starts to feel the
shortcomings of MySQL and desires/needs a real RDBMS.

One of the big advantages MySQL has is the availability of a native Windows
binary.  This means a lot of people can get a pretty good database for no
cost, whatsoever.


Chapter 2
=========

Differences that stand out
--------------------------

PostgreSQL makes a difference between single quotes (') and double quotes (")
as the ANSI standard dictates.  MySQL does not follow this behaviour and
allows you to use both single and double quotes to quote values, whereas it
should only be single quotes.  Double quotes are meant to be used on quoting
field names, system identifiers, table names, and so on.

MySQL is case insensitive.  This means that doing a SELECT query on a lastname
like: WHERE lastname='Johnson' would match 'Johnson', 'JOHNSON', or 'johnson'.
This is solved easily, but depends on what you want to do:

	- use the correct case in the query, like in the example above,
	- convert the string to lowercase, e.g.: WHERE
	  lower(lastname)='johnson',
	- use a case-insensitive query operator, e.g. ILIKE or *~.

ANSI only allows '--' as a comment delimiter, whereas MySQL supports the
shell/perl's '#' use of a comment delimiter.  Just use '--' since both
understand this.


Data Types
----------

PostgreSQL has only 3 integer types, namely SMALLINT (2 bytes big), INTEGER
(INT, 4 bytes big) and BIGINT (8 bytes big).  MySQL supports 5, namely:
TINYINT (1 byte big), SMALLINT (2 bytes big), MEDIUMINT (3 bytes big), INT (4
bytes big), BIGINT (8 bytes big).  So just take the equivalent or the one
closest to it to contain the other in.

MySQL's TINYTEXT can be substituted by a VARCHAR(255).  For TEXT, MEDIUMTEXT,
and LONGTEXT just use PostgreSQL's TEXT type.

PostgreSQL's DECIMAL and NUMERIC are equivalent to eachother just like
MySQL's.

MySQL supports, as an extension to the SQL92 standard, a synonym for DOUBLE
PRECISION using just DOUBLE.


What's The Time?
----------------

MySQL's DATETIME is an ISO-8601-like date and timestamp format in the form of:
YYYY-MM-DD HH:MM:SS.  This excludes any timezone information whatsoever.
Within PostgreSQL use TIMESTAMP.  This automatically makes it TIMESTAMP
WITHOUT TIME ZONE.

To confuse matters MySQL also has a date/time type called TIMESTAMP.  This
type is updated under a number of conditions.  PostgreSQL does not support
something like this.  Most often you can get away with using the PostgreSQL
TIMESTAMP type with a default of 'now' or now() to have the field received the
current date and time.  To get the creation time of the table you would need
to use DEFAULT TIMESTAMP 'now'.


Constraints -- Hold What?
-------------------------


MySQL's schema (and Sybase too) often includes declarations like:

	CREATE TABLE blah (
	  id SMALLINT NOT NULL PRIMARY KEY);

PostgreSQL (and Oracle) also support this syntax.  Another way to have this
working is to define a constraint, e.g.:

	CREATE TABLE blah (
	  id SMALLINT NOT NULL);
	ALTER TABLE ONLY blah
	  ADD CONSTRAINT blah_pk PRIMARY KEY (id);

MySQL also knows a concept of enumerators (enum).  Basically you specify an
enumerated list of allowed values, e.g.:

	CREATE TABLE blah (
	  sentence enum('This', 'is', 'a', 'test'));

Within PostgreSQL you would solve this like this:

	CREATE TABLE blah (
	  sentence CHARACTER VARYING(50)
	  CONSTRAINT blah_sentence_cstr CHECK (sentence IN ('This', 'is', 'a',
	  'test');


More specialized differences
----------------------------

MySQL has an argument to fields, auto_increment, which automatically
increases the value of the field whenever a new entry is made.  PostgreSQL
does not work like this.  Within PostgreSQL (and Oracle) you normally use a
'sequence' where you get the next value from.  See below for an example:

MySQL definition:

	CREATE TABLE test (
	  id	AUTO_INCREMENT, 
	  name	VARCHAR(30));
	INSERT INTO test (name) VALUES ('a name');

PostgreSQL definition:

	CREATE SEQUENCE test_sequence;
	CREATE TABLE test (
	  id	int not null default nextval('test_sequence'),
	  name	VARCHAR(30));
	INSERT INTO test (name) VALUES ('a name');

PostgreSQL also allows you to use SERIAL to shorten the code above:

	CREATE TABLE test (
	  id	SERIAL,
	  name	CARCHAR(30));

MySQL also support the SERIAL datatype nowadays.


Portability reminders
---------------------

ANSI SQL-92 reserves a lot of words that cannot/should not be used as table or
column names.  Please avoid using these names as this increases portability
between different database backends.  Different RDBMSes treat the reserved
keyword list differently.


Random notes
------------

MySQL's IFNULL(x, y) is definately not equivalent with PostgreSQL's
NULLIF(x, y).  The first returns x if x is not NULL, otherwise it returns y.
PostgreSQL's NULLIF returns a NULL only if both x and y are equal.  MySQL also
knows a NULLIF which is the same as PostgreSQL's.
The equivalent to MySQL's IFNULL would be COALESCE(x, y).

Do note that while MySQL supports N/0 to be equivalent to NULL, apparently,
PostgreSQL reports this as a floating point operation division by zero.

MySQL does not do any date checking.  February is allowed to have 30 or 31
days and even leap years are not checked so you could also have a day 29 every
year.  PostgreSQL enforces correct dates.

MySQL has a UNIX_TIMESTAMP('date') which converts date back to seconds since the
Unix epoch.  PostgreSQL can also do this through a EXTRACT(EPOCH from
TIMESTAMP 'date').

MySQL's LENGTH(string) operator is the same as PostgreSQL's
CHAR_LENGTH(string);

PostgreSQL does not really know anything like MySQL's LAST_INSERT_ID().
Instead you need to use the sequences.  If you have inserted in the same
session you can use SELECT currval('sequence'); or otherwise you can use
SELECT last_value from sequence;.