Aug 29, 2020

Postgres vs MySQL

I'm no way an authority on this topic, but as part of refreshing my knowledge I wanted to dive into the topic once again. I have had experience using both, and in the end of the day the systems that I've been working with have just needed to use databases through the simplest SQL use cases. No complicated joins, not even transactions have been a big requirement.

One time we decided to migrate from a MySQL database to Postgres. From what I remember we landed at this decision because whenever we wanted to add new fields to our MySQL database, the schema migration would simply lock our database. Now, it should be noted that this can definitely happen for Postgres as well. There was a brilliant post that highlighted what kind of operations you could or could not do in order to not lock your database when adding new columns to a table. For example I remember that you can have a new NOT NULL column if you don't want to lock the table. For small databases locking a table is probably not a problem, but as the amount of data grows, you definitely want to prohibit an unexpected lock that in these cases can last hours.

Reading up on the internet I found that (naturally) MySQL and Postgres are adding a lot of each others' features which makes them more and more similar. This is to be expected.

I know one thing with Postgres that can easily be a gotcha if you are not prepared - the vacuuming. That's what's needed in order to cleanup the dead data that gets created so that Postgres can keep in being fast during read/write scenarios. It simply doesn't delete or change any data when the operation is performed. This job is performed during the vacuuming. This should happen automatically, but my experience is that if you don't schedule an explicit vacuuming, you will eventually run into a time when your database is very bloated and starts becoming slow. Succinctly put: I don't trust the automatic vacuuming.

My overall feeling is that the future is actually already starting to leave the classical RDBMS or ORDBMS as Postgres should be called, behind. My feeling is that NoSQL like DynamoDB where you with minimal effort gets a cluster and hence guaranteed scalable database performance is the future.

Code of the Day

In the API of my 2d Geometry library I started writing "apply" methods like they are called in scala, to make the API more succinct. I don't know if this is best practice, but I kind of like it.


/**
 * Creates a polygon based on the points that are given as an array of two dimensional
 * arrays with numbers.
 * @param points
 * array of two dimensional arrays with points.
 */
export function polygon(points: number[][]) {
  const p = points.map(num => new Point(num[0], num[1]));
  return Polygon.fromPoints(p);
}