Five years of AcoustID

It's hard to tell the exact date when the AcoustID project started, but if we go by the first entry in the database, it was October 8, 2010. That means the project turned five this week! I thought it would be a good opportunity to gather some statistics from those five years.

Back in 2010, we were starting from scratch. We had an empty database, while the solution that AcoustID was replacing (MusicDNS/PUID) had fingerprints for 4.4 million MusicBrainz recordings (34% of all MusicBrainz recordings at that time). It took about two years to catch up with that number. Today, AcoustID can identify 8.3 million MusicBrainz recordings, which is 54% of all recordings in the MusicBrainz database. That is about twice the size, and the fingerprint database is growing faster than MusicBrainz itself, which means that eventually it might be able to identify most MusicBrainz recordings.

In early 2011, we also started accepting fingerprints without links to the MusicBrainz database. The number of those has grown even faster, so only a small part of the AcoustID fingerprint database is actually linked to MusicBrainz now. The total number of unique fingerprints ("AcoustIDs") in the database is currently 25.5 million.

Here you can see the numbers on a timeline:

Traffic has naturally grown over the five years as well, but similarly to the database size, the growth is mostly linear. This is because of the focus on full audio file tagging and integration with MusicBrainz, which means AcoustID only ends up being used in specialized applications.

Unfortunately, the first version, released in 2010, was pretty minimalistic and did not include request statistics, so we only have these numbers starting from August 2011.

MusicBrainz Picard is the biggest source of users, which is not surprising, because AcoustID was created for MusicBrainz Picard. But there are other free applications that use AcoustID -- beets, MusicBee, FileBot, VLC, Clementine, puddletag, Kid3, Quod Libet and many other smaller applications. There are also a few commercial applications that use AcoustID. The number of applications using the service every month is now above 100 and still growing.

It's quite easy to use AcoustID from just about any programming language now. Chromaprint fingerprints can be generated from Python, Ruby, Rust, Go and JavaScript, and I'm probably missing a few. There are wrappers for C# and Java, but those are always developed directly inside the apps that use them. There is direct support for generating Chromaprint fingerprints in GStreamer and, recently, also in FFmpeg. And there are also alternative implementations of the Chromaprint algorithm in C# (1, 2) and JavaScript.
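
To give a rough idea of what a client does once it has a fingerprint, here is a minimal Python sketch that builds a lookup request URL. It is only an illustration under stated assumptions: the endpoint and the parameter names (`client`, `duration`, `fingerprint`, `meta`) are not described in this post, `build_lookup_url` is a hypothetical helper, and the application key and fingerprint string are placeholders. The sketch does not compute a fingerprint or contact the service.

```python
from urllib.parse import urlencode

# Assumed address of the AcoustID lookup web service (not stated in this post).
LOOKUP_URL = "https://api.acoustid.org/v2/lookup"


def build_lookup_url(client_key, duration_seconds, fingerprint, meta="recordings"):
    """Build a lookup URL for an already-computed Chromaprint fingerprint.

    Hypothetical helper for illustration only. The duration is truncated
    to whole seconds before it is put into the query string.
    """
    params = {
        "client": client_key,               # application key (placeholder)
        "duration": int(duration_seconds),  # audio length in whole seconds
        "fingerprint": fingerprint,         # compressed Chromaprint string
        "meta": meta,                       # which metadata to ask for
    }
    return LOOKUP_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # Placeholder values; a real client would take the duration and the
    # fingerprint from one of the Chromaprint bindings mentioned above.
    print(build_lookup_url("YOUR_APP_KEY", 203.7, "AQAAT0mUaEkSRZEGAA"))
```

Keeping the URL construction separate from fingerprinting and from the HTTP request means the same function works no matter which of the bindings produced the fingerprint.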

I have not been working on AcoustID very actively lately, and I know that there are some things that need to be done. Still, I'm happy that the project is able to run pretty much on its own with very little support, and that the architecture designed five years ago is still capable of handling today's traffic. I'm not worried that it won't be able to handle the traffic five years from now.

Happy birthday, AcoustID!
