The Great Libre Database List

Please note that this list is a work in progress!  You are free to browse; just know that more is to come!


By far the most popular libre data product from about 2000 through 2010, MySQL is included with nearly every kind of web hosting plan.  Its ease of learning and good performance have kept it at the forefront for a long time.  It is supported by virtually every major libre content management system.  Its replication has a reputation for being easier to set up than that of other libre SQL databases.  Recent versions have continued to provide features demanded by large enterprises, and it remains a solid choice.  In recent years, however, it has been given a run for its money by NoSQL databases, PostgreSQL, and even forks of MySQL itself.  Its license is the GPL, including the client libraries, which may require you to buy a commercial license if you develop a non-free program for MySQL and distribute it to others (this does not apply to web applications that others merely use).

Documentation *


This is the enterprise giant of libre SQL databases.  It has a long and proud history starting at the University of California, Berkeley in the late 1980s.  It supported transactions long before any of its libre competitors.  It supports custom data types, triggers, stored procedures, and SQL window functions, and it offers many different ways to do replication.  Its documentation is best in class.  Its SQL dialect is somewhat more standards compliant than that of MySQL, and those coming from the likes of Oracle should feel more at home here.  An active developer community adds several new features each year, breaking further and further into enterprise territory each time.  A highly non-restrictive BSD-style license allows you to do virtually anything with the code, including selling modified versions of it without the source code (we don't recommend that, though!).  Needless to say, this author likes PostgreSQL a lot and would recommend it for nearly anything where SQL makes sense.

Documentation * Planet PostgreSQL


Probably the most featureful libre NoSQL database, MongoDB is a document store.  It stores and processes documents that are essentially JSON documents (more technically, they are BSON, Binary JSON).  It can search, index, aggregate, and perform simple computations on JSON attributes, including arrays and sub-documents.  It is quite flexible and powerful.  Although versions before 4.0 do not support multi-document atomic transactions, it can apply many simultaneous updates to a single document atomically.  Although its ordinary queries cannot join two documents, its flexible document structure eliminates some of the use cases of joins.  It uses a replica set model of replication, where if the master goes down, a new master can take over automatically.  It also supports sharding out of the box.  What makes MongoDB very useful in this author's opinion is that it can store most things related to a single web page in a single document.  Where SQL might require, for example, fetching document text out of one table and various attributes and extensions out of another, MongoDB could be set up to fetch everything from one document.
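
The single-document idea can be sketched in plain Python, with no MongoDB driver involved.  The page layout, the `pages` stand-in collection, and the `fetch_page` and `matches` helpers are all hypothetical and exist only for illustration; `matches` imitates querying a nested attribute by dotted path, where a scalar value also matches an array that contains it.

```python
# Hypothetical page document: text, attributes, and extensions travel together.
page = {
    "_id": "about-us",
    "title": "About Us",
    "body": "We build libre software.",
    "attributes": {"author": "jane", "published": True},
    "tags": ["company", "info"],
    "comments": [
        {"user": "sam", "text": "Nice page"},
        {"user": "lee", "text": "Thanks!"},
    ],
}

# Stand-in for a collection: a plain dict keyed by _id (not a real database).
pages = {page["_id"]: page}


def fetch_page(page_id):
    """One lookup returns everything needed to render the page."""
    return pages[page_id]


def matches(doc, dotted_path, value):
    """Imitate a query on a nested attribute such as "attributes.author"."""
    current = doc
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return False
        current = current[key]
    # A scalar query value matches an array that contains it.
    if isinstance(current, list):
        return value in current
    return current == value


doc = fetch_page("about-us")
by_author = [d["_id"] for d in pages.values() if matches(d, "attributes.author", "jane")]
by_tag = [d["_id"] for d in pages.values() if matches(d, "tags", "info")]
```

In a SQL schema the comments, tags, and attributes would typically live in separate tables and be joined at read time; here they all arrive with the one document.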



If you need a very lightweight SQL database for an embedded application, SQLite might just be the ticket.  An entire database is stored in a single file, which applications access directly through the SQLite library, with only one writer at a time.  It has a relatively full-featured dialect of SQL and full ACID compliance.  Surprisingly, its "license" is true public domain -- you can do absolutely anything with the code, with no restrictions at all!  (Every other product mentioned here is libre, meaning you can take the code, learn from it, modify it and pass on modifications, but they all have copyrights which do add some restrictions -- at minimum, you have to acknowledge the copyright owner.  Additionally, GPL licensed code says that if you pass on a modification, the modified code must also be distributed under the GPL.)  SQLite does not even have to be acknowledged in any way.  I'm sure they'd appreciate it though!
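
As a small illustration of the single-file storage and transactional behaviour described above, here is a sketch using Python's standard `sqlite3` module.  The file name, the `parts` table, and its columns are made up for the example.

```python
import os
import sqlite3
import tempfile

# Hypothetical file name; the entire database lives in this one file.
db_path = os.path.join(tempfile.mkdtemp(), "inventory.db")

conn = sqlite3.connect(db_path)
conn.execute(
    "CREATE TABLE parts (name TEXT PRIMARY KEY, "
    "qty INTEGER NOT NULL CHECK (qty >= 0))"
)

# Using the connection as a context manager commits on success...
with conn:
    conn.execute("INSERT INTO parts VALUES ('bolt', 10)")

# ...and rolls the whole transaction back if any statement fails.
try:
    with conn:
        conn.execute("UPDATE parts SET qty = qty - 3 WHERE name = 'bolt'")
        # This statement violates the CHECK constraint and raises an error.
        conn.execute("UPDATE parts SET qty = qty - 20 WHERE name = 'bolt'")
except sqlite3.IntegrityError:
    pass

# The first UPDATE was undone together with the failed one.
qty = conn.execute("SELECT qty FROM parts WHERE name = 'bolt'").fetchone()[0]
conn.close()
```

After the failed transaction the quantity is still 10, because the successful first update was rolled back along with the second.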


Another JSON document store, CouchDB uses an entirely RESTful API, which will be more convenient than MongoDB for some use cases.  It also supports master-master replication out of the box, so writes can go to any server.  Additionally, it emphasizes availability even when one server is disconnected from another: it can accumulate writes on both servers and combine them later.  Its query facilities are rather more complex than those of most databases, though, as you have to write map/reduce functions.
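
Because the API is plain HTTP, storing a document amounts to a PUT request to a URL of the form /database/document-id with a JSON body.  The sketch below only builds such a request with Python's standard library and does not send it; the server address, the database name `pages`, the document contents, and the helper `build_put_document` are assumptions made for illustration.

```python
import json
import urllib.request


def build_put_document(base_url, database, doc_id, document):
    """Build (but do not send) an HTTP PUT that would store one JSON document."""
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{database}/{doc_id}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Assumed local server on CouchDB's default port 5984; names are hypothetical.
request = build_put_document(
    "http://localhost:5984",
    "pages",
    "about-us",
    {"title": "About Us", "tags": ["company", "info"]},
)

# Sending it would need a running server:
# with urllib.request.urlopen(request) as response:
#     print(response.read())
```

Any language or tool that can speak HTTP can talk to the database in the same way, which is where the convenience comes from.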



This is technically more a search engine than a database, but it is basically both.  It is a document store with a RESTful interface that is optimized for full text searches.  It can scale to huge environments.



Graphite is a time-series store that is optimized for creating real-time graphs.  It can scale horizontally across nodes and has a nice user interface to generate desired graphs, whose URLs can then be copied and embedded into web pages (dashboards, etc).  Although it is often used for system administration tasks (server load average, etc), there's no reason why it can't be used to store, say, business or environmental data.
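
Feeding data in is simple: Graphite's plaintext protocol takes one line per data point, in the form "path value timestamp".  The following is a minimal sketch, assuming a listener on localhost at the default plaintext port 2003; the metric name and the helper functions `format_metric` and `send_metrics` are made up for the example.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one plaintext-protocol line: "<path> <value> <timestamp>\\n"."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metrics(lines, host="localhost", port=2003):
    """Send prepared lines to the listener (host and port are assumptions)."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall("".join(lines).encode("ascii"))


# A business metric rather than a server metric; the name is hypothetical.
line = format_metric("business.sales.daily_total", 1234.5, 1700000000)

# send_metrics([line])  # needs a running Graphite installation
```

The dotted path is what later shows up as a hierarchy in the graphing interface.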



If you have a torrential stream of data and need a data store that is capable of absorbing massive quantities of writes with ease, you would do well to look into Apache Cassandra.  It uses a masterless architecture and will automatically shard and replicate your data.  A prime feature is its ability to store up to 2 billion columns in a row; this is useful for situations where you might have a "normal" DB row containing a primary key, a sub key (such as a timestamp), and a value.  It is trusted by many large and popular Internet sites.
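
To make the wide-row idea concrete, here is a toy in-memory model in Python.  It is not the Cassandra driver and does no sharding or replication; the class `WideRowStore` and the sensor data are invented for illustration.  Each primary key owns a list of (sub key, value) columns kept sorted by sub key, so a range of timestamps can be read as one contiguous slice.

```python
import bisect
from collections import defaultdict


class WideRowStore:
    """Toy model: primary key -> columns sorted by sub key."""

    def __init__(self):
        self._rows = defaultdict(list)

    def write(self, primary_key, sub_key, value):
        columns = self._rows[primary_key]
        # (sub_key,) sorts just before any (sub_key, value) pair.
        index = bisect.bisect_left(columns, (sub_key,))
        if index < len(columns) and columns[index][0] == sub_key:
            columns[index] = (sub_key, value)  # overwrite an existing column
        else:
            columns.insert(index, (sub_key, value))

    def slice(self, primary_key, start, end):
        """Return columns with start <= sub_key < end, in sub key order."""
        columns = self._rows.get(primary_key, [])
        lo = bisect.bisect_left(columns, (start,))
        hi = bisect.bisect_left(columns, (end,))
        return columns[lo:hi]


store = WideRowStore()
# Writes may arrive out of order; the row stays sorted by timestamp.
store.write("sensor-1", 1700000060, 21.7)
store.write("sensor-1", 1700000000, 21.5)
store.write("sensor-1", 1700000120, 22.0)
store.write("sensor-2", 1700000000, 18.3)

readings = store.slice("sensor-1", 1700000000, 1700000120)
```

Here `readings` holds the first two points for sensor-1 in timestamp order, without touching any other sensor's row.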



You probably don't want to use Apache Hadoop.  But if you have to, you have to.  This is the granddaddy of Big Data products.  Hadoop provides the architecture needed to store and process a virtually infinite amount of data (limited only by available storage), in arbitrarily complex ways.  Map/reduce functions can be written to fetch and process data on the various nodes and aggregate the data on one server.  Its HDFS allows for something like a filesystem in this storage grid, and HBase works something like a database, which can be queried with the Pig and Hive languages.  There is a lot to learn here and it is not for beginners!
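
The map/reduce pattern itself is easy to show in miniature.  The sketch below runs in a single Python process and only imitates the flow; it does not use Hadoop, and the `node_data` sample and function names are invented.  Each "node" maps its lines to (word, 1) pairs, the pairs are grouped by word, and a reduce step sums each group.

```python
from collections import defaultdict

# Hypothetical data, split across two "nodes".
node_data = [
    ["libre databases are fun", "databases store data"],
    ["big data needs big storage"],
]


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in a line."""
    for word in line.split():
        yield word, 1


def shuffle(pairs):
    """Group emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce step: collapse all values for one key into a total."""
    return key, sum(values)


mapped = [
    pair
    for lines in node_data
    for line in lines
    for pair in map_words(line)
]
totals = dict(reduce_counts(key, values) for key, values in shuffle(mapped).items())
```

In a real cluster the map step runs where the data lives and only the grouped intermediate results travel over the network, which is what lets the approach scale.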



A fork of MySQL by one of its main authors, MariaDB is an attempt to develop the database more openly than Oracle develops MySQL.  It adds significant performance improvements, some additional storage engines, and other interesting features such as GIS support.  It strives for complete binary compatibility with MySQL in terms of its data structure and its protocol.

Documentation (more of a MariaDB-specific knowledge base, see also MySQL's documentation for full details)


This is another fork of MySQL, whose goal is to keep things simple.  It intends to be a relatively lightweight database optimized for the cloud, yet retaining all important features such as ACID compliance.