MySQL (thing) by StrawberryFrog

MySQL is a lightweight, fast relational database program distributed under the GPL license. It is very fast. It runs on many popular operating systems, typically Linux, but also Microsoft OSs and BSD. It is stable (except, in my experience, in the case of a local client crashing on Windows 98, but hey, that's what you get with Win98).

The website at www.mysql.com is quite simple to use and contains good documentation on MySql. The site refers to MySQL as "the world's most popular Open Source database server". The company sells support and services, commercial licences, pro versions and suchlike.

MySQL is a fine program, a worthy open source, whoops, free software endeavour and the right tool for some jobs. It is widely used to serve web content. it is good for applications where there is much reading and not much writing, and where a failure or inconsistency will not cost money.

However, in it’s current form it is not the same kind of program as any of the others that are marketed as relational database management systems. Compared to these programs, mysql is a toy.

If Oracle or Microsoft SQL Server is what you are used to, then you might rapidly decide that MySQL is not a DBMS, just a program that reads and writes files of data.

Wait a minute. Isn’t that what a DBMS is?

No, a DBMS is more than that. Many hackers have a very limited conception of what a SQL database system can do, based on using MySQL. SQL in its own way a powerful and subtle language. It can ensure that all the heavy lifting is done on the server, and the client just specifies what it wants done and is never stuck doing read-a-row, process-the-row, write-the-row, next row loops.

Or at least, that's the potential when SQL is fully implemented.

If you have spent a lot of time using an industrial-strength DBMS, you will rapidly become frustrated at MySQL's limitations. To be blunt, it has no balls (see below).

It seems that MySQL is popular in open-source circles partly because of the herd-mentality – MySQL is well-known so it the first choice, partly because its speed is seen as more important – (speed means scalability, doesn't it? Erm, E2 lag anyone?), and because of low expectations of what the DBMS can do, i.e. its users don't know any better.

Whilst they take pride in subtle hacks and comprehensive knowledge in C or PERL, they are content with this junior league implementation of SQL.

Leaving aside the commercial alternatives such as Oracle, IBM's DB2, Informix, Sybase and Microsoft SQL Server, there are free open databases such as Interbase and PostgreSQL that support the features that MySQL lacks.

But MySQL is fast. Of course. A motorbike will easily outpace a pickup truck. That’s because the bike’s engine drags less stuff around with it. But which one are you going to use to move house? Which one do you want to be transporting you when it crashes? In fact, PostgreSQL is known to perform dramatically better than mySQL in some high volume, high load scenarios.

But UNIX is the standard, isn't it? One of the more annoying things about MySQL is that it favours C and UNIX standards over ANSI SQL standards, for e.g. in string escaping and comments. This suggests just plain blinkered ignorance on the part of its creators.

The missing balls of MySQL

The missing left testicle: Atomic, Consistent, Isolated and Durable transactions.

BEGIN TRANSACTION
Update Account set Balance = Balance – 100 where AccountId = 1234
Update Account set Balance = Balance + 100 where AccountId = 5678
COMMIT TRANSACTION

The updates above should succeed together or fail together. If the second one fails, the first must be undone.

This is one good reason why MySQL is so fast – it can’t do this. It doesn't even try to meet the ACID transaction test. It doesn't have to keep logs and audit trails because it will never be called upon to abort and roll back a pending transaction.

It is useless for applications where data integrity is of utmost importance, as in the financial example above.

The missing right testicle: Nested queries.

Select * from Person where
Person.email_address in
(select email_address from account where balance > 1000)

Imagine that you downloaded a C compiler, started to use it, and then discovered that it didn’t support {} blocks inside procedures. Is that still C? This is what MySQL does to the SQL language.

This limitation also applies to Delete and update statements. For e.g. you cannot do this:

/* remove all dead people’s addresses */
Delete Address
from Person
where Address.PersonId = Person.Id and
Person.Deceased = 1

As is rightly pointed out elsewhere, this limitation requires that you write code that does two sequential reads of the two tables, and then process rows on the client one at a time. Erm, what was that about a speed advantage?

Other missing bits that I have come across include

referential integrity constraints
The UNION operator
Outer join
Stored procedures and triggers. Ok, so there is no standard for these. But that's no excuse to leave them out altogether.

The latest versions of MySQL have indeed added a table format that allows transactions.

MySQL is catching up, soon it will reach the functionality that commercial databases had in oh, 1980 or so.

Now it should be noted that most databases do not feel the need to have more than one kind of table. It should also be noted that some of the performance optimisations that this feature allows the developer to make manually are performed automatically, in dynamic response to user load by more advanced servers. E.G. MS SQL server will store a table in memory automatically if the table is small and often used.

The SQL language goes back to IBM projects circa 1975 and has transactions around 1980. SQL databases are a very mature technology and we expect an incredible degree of functionally and reliability. We get it, even from free software like PostgreSQL.

The assumption that one table is one file is purely an implementation detail, and in fact this assumption is false in some higher-end SQL servers which store the entire database in one large file. This implementation detail should not concern the person writing the SQL query. But here it does.

What happens if you start a transaction, update tables of more than one format, then roll it back? Data in some tables can be restored, in others it can't. Oops. Transactions are supposed to be ACID, and C stands for Consistent. Rolling back part of a transaction and leaving the rest is never going to be consistent, but that's the best you can do here. ACID is better thought of as a property of the database as a whole than a property of a single table.

To say that a table is "doesn't need to use transactions because it is non-volatile data" is to miss the point by mixing up two different concepts - the frequency of an operation and the need for it to happen without error when it does happen. Furthermore, when a table is not updated, then how does using transactions for updates add overhead?

The fact that mySQL now supports multiple table formats, each with a subset of features, is an accident of history. It is cruft. Sure, as a mySQL user, you can get better performance by working with it, and taking a guess way up front as to which tables need to be transactional. But it is definitely not a reason why a new DB user should go with mySQL. Quite the reverse.

But if your database is so delicately poised on the knife-edge of performance viability that the change from transactional mySQL to non-transactional mySQL will make or break, then I heartily recommend that you shell out $100 for more RAM, and change over to PostgreSQL.

Sure it is poor mySQL database design to update transactional and non-transactional tables in a transaction. If you are already using mySQL and designing mySQL databases. If you want to arse around with multiple table formats and sorta-transactional databases, by all means use MySQL. You could, however, think about your database at a higher level, in terms of compliance to the SQL standard.

Or if you don't know what transactions are, and you will be serving lots of non-mission critical text, and you do not already have a real database running, mySQL might even be the right tool for your job.

PostgreSQL	RDBMS	SQL	Stupidest thing you've coded just to see if you could
Katie Holmes	Perl	If Linux distributions were Airlines	PHP
Infinite monkeys theorem	Oracle	ODBC	ACID transaction
FreeBSD	Linux	acid	Windows NT clock
Interbase	Capitalism vs. Communism	mod_perl	The air feels brittle
perl monks	Python	GPL