All Our Tools Are Belong To You!

Tuesday, 19 February 2013
by Marcus Holland-Moritz
filed under Code and Announcements
Comments: 14

Welcome to the second part of our Open Source series. Today we’re releasing moost, a C++ library with all the nice little tools and utilities our MIR team has developed over the past five years. If you’re a C++ developer yourself, you might notice that moost sounds quite similar to boost, and that’s on purpose. moost is the MIR team’s boost: there is hardly a project in our codebase that doesn’t depend on one or more parts of moost.

There are a lot of different things in moost. Some are really simple, yet very helpful in day-to-day work, like the which template that allows you to use pairs (and containers storing pairs) more easily with standard algorithms; or stringify, a function template that turns complex objects into strings. Other parts are slightly more sophisticated: for example, moost contains the framework that is shared by all our backend services, and that allows you to write a daemonisable service with logging, a set of standard options and even a service shell that multiple users can connect to when the service is running, all in a few lines of code.
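To give a flavour of what such helpers do, here is a minimal sketch in plain standard C++. It is not moost's actual API: `select_second`, `to_text` and `play_counts` are hypothetical names made up for this illustration, and the real `which` and `stringify` templates may look quite different.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the idea behind `which`: a functor that
// picks one member out of a pair so standard algorithms can use it.
struct select_second
{
  template <typename Pair>
  typename Pair::second_type const & operator()(Pair const & p) const
  {
    return p.second;
  }
};

// Hypothetical stand-in for the idea behind `stringify`.
// Declared up front so the overloads can call each other.
template <typename T>
std::string to_text(T const & value);
template <typename A, typename B>
std::string to_text(std::pair<A, B> const & p);
template <typename T>
std::string to_text(std::vector<T> const & v);

// Anything that can be streamed.
template <typename T>
std::string to_text(T const & value)
{
  std::ostringstream oss;
  oss << value;
  return oss.str();
}

// Pairs are rendered as "(first, second)".
template <typename A, typename B>
std::string to_text(std::pair<A, B> const & p)
{
  return "(" + to_text(p.first) + ", " + to_text(p.second) + ")";
}

// Vectors are rendered as "[a, b, c]".
template <typename T>
std::string to_text(std::vector<T> const & v)
{
  std::string out = "[";
  for (std::size_t i = 0; i < v.size(); ++i)
  {
    if (i != 0)
      out += ", ";
    out += to_text(v[i]);
  }
  return out + "]";
}

// Copy only the mapped values out of a container of pairs.
std::vector<int> play_counts(std::map<std::string, int> const & scrobbles)
{
  std::vector<int> counts(scrobbles.size());
  std::transform(scrobbles.begin(), scrobbles.end(), counts.begin(), select_second());
  return counts;
}
```

With a map holding two entries with counts 3 and 5, `to_text(play_counts(m))` would give `[3, 5]`.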

As our backend services are inherently multi-threaded, there’s also a bit of threading support in moost. For example, the safe_shared_ptr template is immensely useful for resources that are shared between threads and need to be updated atomically.
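The idea of a shared resource that is swapped atomically can be sketched roughly like this. This is an illustration under our own assumptions and not moost's `safe_shared_ptr`: the class name `guarded_ptr` and its interface are hypothetical, and the sketch simply protects a `std::shared_ptr` with a mutex.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical sketch: readers take a snapshot of the current resource,
// writers publish a complete replacement in a single step.
template <typename T>
class guarded_ptr
{
public:
  explicit guarded_ptr(std::shared_ptr<T const> initial)
    : ptr_(std::move(initial))
  {
    assert(ptr_ && "guarded_ptr must start with a valid resource");
  }

  // The returned snapshot stays valid even if a writer swaps in a
  // new resource while the reader is still using the old one.
  std::shared_ptr<T const> get() const
  {
    std::lock_guard<std::mutex> lock(mutex_);
    return ptr_;
  }

  // Replace the resource; readers see either the old or the new
  // object, never a half-updated one.
  void reset(std::shared_ptr<T const> replacement)
  {
    assert(replacement && "replacement must not be null");
    std::lock_guard<std::mutex> lock(mutex_);
    ptr_.swap(replacement);
  }

private:
  mutable std::mutex mutex_;
  std::shared_ptr<T const> ptr_;
};
```

A service thread would call `get()` once per request and work on that snapshot, while an updater thread builds the new dataset on the side and calls `reset()` when it is ready.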

If you’re working with large, static datasets, you’ll probably find the memory mapped dataset classes interesting. They allow you to build large datasets (like gigabytes of data) of vectors, multimaps or dense hash maps that can be simply mapped into memory — and thus shared between different processes — and accessed very much like a constant standard container.

moost also contains an abstraction for loading shared objects and instantiating objects defined inside these shared objects. It will take care of all the magic involved to avoid resource leaks.

There are more bits and bobs in there, like a simple client for the STOMP protocol, hashing and message digest functions, wrappers for key-value stores, template metaprogramming helpers and even a complete logging framework. So check it out, play with it and if you’ve got some nice tool to add, please contribute!

There’ll be more code coming up later this week that makes use of moost, so if you’re looking for some hands-on examples, stay tuned!

To be continued…

Build me, please!

Monday, 18 February 2013
by Marcus Holland-Moritz
filed under Announcements and Code
Comments: 8

Here at Last.fm we love Open Source. Most of the time we’re just using a lot of Open Source Software, sometimes we’re contributing changes or fixes back to existing projects, and sometimes we release our own software to the public. This week, we’ll be releasing some exciting projects to the C++ community. The first of these projects is a build system we’ve conceived for our C++ codebase and which has helped us a lot — and it might be useful for you, too!

Last.fm’s MIR team is responsible for maintaining more than a hundred libraries, tools and backend services, most of which are written in C++, although some projects are in Python, Perl or Java. Back in 2011, all these projects had to be built from one giant Subversion repository. They contained hard-coded relative paths to other projects they depended on, yet as developers we still had to know all the dependencies and build them in the correct order to actually build the project we were interested in. Also, every project contained a lot of boilerplate code that changed over time, so it could be substantially different between any two projects. All of this made it quite painful to build projects or set them up for continuous integration, let alone distribute them to our production servers.

As we were thinking about migrating our codebase to Git, we wondered whether there was an easier way to build our projects. Our ideal solution at that point would have been a tool that allowed us to build, test, install and package every project, regardless of the language it’s written in, with exactly the same command. We couldn’t find anything like that and so we decided to write our own tool, which we called mirbuild (for hopefully obvious reasons).

mirbuild is a meta-build-system, which means it’s basically delegating the actual build process to other build systems, but hides this behind a common interface. It is just a set of Python libraries, so the actual build scripts are written in Python. For a simple project, such a build script (usually called build.py) looks like this:

  #!/usr/bin/env python
  from mirbuild import *
  project = CMakeProject('libcyclone')
  project.depends('libmoost')
  project.find('boost')
  project.find('log4cxx')
  project.version('include/cyclone/version.h',
             namespace = ['cyclone', 'version'])
  project.run()

As you can guess from the class name, this project uses CMake under the hood. But if you just want to build the project, you don’t have to care. You just run

  ./build.py build

and it “just works”. But mirbuild does a bit more than just forwarding commands to CMake. For example, it will create a file that controls compile flags and include and link paths of project dependencies. It will also create a version header for your project if you ask it to do so.

Here are some of mirbuild’s features:

  • supports CMake (C/C++), Python, Thrift (C++/Python) and “simple” projects
  • can build, test, install and clean up projects
  • can resolve dependencies between projects
  • can create Debian packages
  • can build different configurations (release, debug, coverage) of a project
  • can run code coverage analysis tools

Over the last one and a half years, mirbuild has saved us from a lot of grief and it has made building projects a lot of fun. Thanks to mirbuild, we’ve also simplified our continuous integration framework and have now got all our production packages built on disposable virtual machines (but that’s a different story). If you’re maintaining lots of C++ code and aren’t happy with how you’re building it, check it out: it’s on GitHub.

To be continued…

Last.fm Desktop Scrobbler Released!

Thursday, 31 January 2013
by Michael Coffey
filed under Announcements and Code
Comments: 26

Hello, scrobble fans! Were you wondering where your desktop app updates had gone? Well, wonder no longer! With the last major version released back in 2007 (those were the days, eh?) you’d be forgiven for thinking there weren’t any more coming, but we’ve actually been hard at work on an update to bring us crashing into 2008, a little late.

We released this new desktop scrobbler as a beta a little under a year ago and have been spending the time since getting it ready for launch. A couple of weeks ago (15th Jan) that launch day finally arrived and we pushed it out to everyone on Windows, Mac, and Linux! If you’ve not already got it you can head over to our download page for a fresh copy.

Here’s a YouTube video of us reaching 200,000 authenticated users on the new app: https://www.youtube.com/watch?v=vy_VwcGazE4. Just look at how much fun we’re having!

The app comes with a new design and some features we hope you’ll really love. There’s a now playing tab where information about your currently scrobbling track will show up, including related artists, tags, biography, and scrobble statistics. Tracks played from radio stations will also show you a little context as to why the track is being played. There’s a scrobbles tab where you can see a history of what you’ve been scrobbling and find out more about those tracks, a profile tab where you can see your scrobble charts, and a friends tab where you can see what your friends are listening to and start their library radios. There’s also a radio tab where you can start all your usual Last.fm radio stations, including a history of your recent ones.

We’re looking at the app as a baseline that we can add to and improve upon. There have been a few ideas bubbling away that we can’t wait to add, but for now the focus is stability. With a large change such as this there are bound to be teething troubles, so we’ve been taking your feedback on the client support forum, making sure we address problems and implementing anything we might have missed that you loved in the old app.

A reminder that, like our iOS and Android apps, the desktop Scrobbler is open source and hosted on our Last.fm GitHub page (both the liblastfm and lastfm-desktop repositories make up the desktop app) where you’ll also be able to find other things Last.fm has open sourced. If you’d like to get involved with development then head over there and fork us!

It’s been a long road getting to this point and I’d like to thank all the client team members, contributors, and believers past and present for making it happen. You know who you are and you’re all very wonderful!

Last.fm Scrobbler for Linux

We at Last.fm love Linux. Not only does it power almost all of the server machines that bring Last.fm to you, it is also the operating system of choice of many of our developers at Last.HQ. For our desktop application Last.fm Scrobbler, Linux is a first-class supported operating system. The source code is available on GitHub if you want to have a go at building it yourself, but we also provide ready-built packages for those of you who are using Debian or Ubuntu. Just go to http://apt.last.fm and find out how to install them. Today we release an updated set of packages featuring the latest version of Last.fm Scrobbler (2.1.33).

We are also proud to release official packages of Last.fm Scrobbler for the Raspberry Pi today. If you have not heard about Raspberry Pi, it is an ambitious project to bring better teaching of programming and the technology behind computers to children. The Raspberry Pi Foundation is a charity that has designed and developed a mini computer that costs less than £40 and allows children, and not only children, to dive into the world of computer programming. Being so cheap, the Raspberry Pi has also attracted many hackers who build new things based on this mini computer. Media centre solutions are already very popular, which is not surprising because the Raspberry Pi has a network interface and video and audio outputs. We now contribute our Last.fm client application to the Raspberry Pi universe. If you have a Raspberry Pi and are running the Raspbian operating system on it, then head over to http://apt.last.fm quickly and install Last.fm Scrobbler for Raspberry Pi!

Scrobbler for iOS

Wednesday, 19 December 2012
by Michael Horan
filed under Announcements and Tips and Tricks
Comments: 38

Ten years ago was a boom time for MP3s. I remember ripping my first CD, thrilled with the prospect of storing my ever-expanding collection on a computer instead of taking up precious space in my cramped apartment. The shelves of CDs started collecting dust, my Discman gave way to MP3 players, iTunes was born and then the iPod allowed me to carry 1,000 songs in my pocket!

Eleven years later, I have transferred my music library between many computers, over dozens of portable devices and now in the ether of a cloud. My digital library has been a constant companion, traveling the world and growing with me. I love my library!

Recent developments in streaming services are making the maintenance of a digital collection obsolete. Seemingly endless libraries are available for monthly rental, and internet radio services like Last.fm offer unlimited personalised streaming. There are so many new ways to listen to music now that I sometimes forget about my carefully curated digital library.

It is with this in mind that the Scrobbler for iOS was created.

Introducing the Scrobbler for iOS, an iPhone and iPad application that not only natively scrobbles, but gives you several ways to re-discover your digital library.

We’ve long known that scrobbling from iPhones has not always been a seamless process, so we wanted to create an application that alleviates this pain. We also wanted to offer our users something new, so we built playlisting services that get applied to your digital library. For the first time, the algorithms that power Last.fm Radio can now be applied to the libraries you’ve spent years curating.

Every track in your library can be used to discover other, similar tracks. We use the power of machine tags and the knowledge of social tags to help you re-connect with the music you love.

Download the app here and join the group to keep up to date with announcements, forums and help.

So this is Christmas, and what have we done?

Tuesday, 18 December 2012
by David Whiting
filed under Trends and Data
Comments: 3

Well, since you asked, we’ve been playing with some Christmas scrobbling data to see how our users’ listening habits change around the festive season. I created a data set of Christmas tracks based on the top tracks tagged with “christmas” which were released before scrobble 0 in 2002. This gave me a list of “all time Christmas greats” which are unlikely to be particularly affected by annual variation. These include:

  • Mariah Carey – All I Want for Christmas Is You
  • Bing Crosby – White Christmas
  • Wham! – Last Christmas
  • Frank Sinatra – Have Yourself a Merry Little Christmas
  • Band Aid – Do They Know It’s Christmas?
  • Dean Martin – Let It Snow! Let It Snow! Let It Snow!
  • Brenda Lee – Rockin’ Around The Christmas Tree
  • Bobby Helms – Jingle Bell Rock
  • The Pogues – Fairytale of New York
  • Andy Williams – It’s the Most Wonderful Time of the Year

I think you’ll agree that they’re all songs that you hear a lot of during December and not a lot during the rest of the year.

With that in place, I attempted to answer an age-old question suggested to me by our good friend and Last.fm founder RJ, “Does Christmas really get earlier every year?” – a question which refers to the perceptual creep of seasonal products, music and decoration further and further towards the summer each year. I normalised the scrobble volume in the run-up to Christmas by the Christmas Eve volume for each year to yield a comparable listening curve for each year. I chose the point at which listening volume becomes 50% of the December 24th volume to call “the start of Christmas”, then compared that date across all the years for which we have complete and reliable scrobble data (2005-2011).
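For the curious, the normalisation and threshold step could be sketched along these lines. This is not the code used for the analysis: the function names are made up, and the sketch assumes one volume figure per day with 24 December as the last entry.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Divide each day's volume by the Christmas Eve volume, which is
// assumed to be the last element of `daily`.
std::vector<double> normalise_by_christmas_eve(std::vector<double> const & daily)
{
  assert(!daily.empty() && daily.back() > 0.0);
  std::vector<double> curve;
  curve.reserve(daily.size());
  for (std::size_t i = 0; i < daily.size(); ++i)
    curve.push_back(daily[i] / daily.back());
  return curve;
}

// Number of days before 24 December on which the normalised curve
// first reaches the threshold (50% by default).
std::size_t start_of_christmas(std::vector<double> const & daily,
                               double threshold = 0.5)
{
  std::vector<double> const curve = normalise_by_christmas_eve(daily);
  for (std::size_t i = 0; i < curve.size(); ++i)
  {
    if (curve[i] >= threshold)
      return curve.size() - 1 - i;
  }
  return 0;  // only reached if the threshold is above 1.0
}
```

Running this once per year and comparing the results shows whether the 50% point moves towards the summer (a larger number of days) or towards Christmas Eve (a smaller one).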

The result was a weak trend in the opposite direction, suggesting that Christmas might in fact be getting later, by as much as one day each year. This graph shows the difference between the listening curve for 2005 and 2011:

During the initial graphing for the above, Elliot noticed that without the 7-day moving average the graph looked a little like a Christmas tree on its side with the day-of-week variation creating the branches. Pursuing this I made a concept for a “Scrobble tree”, which I then handed to Graham – one of our design team – and he worked his magic to produce this awesome Christmas card.

If that’s not enough festive cheer for you, then you should check out last.fm/christmas created by web developer Marek. It shows data about the current Christmas music being listened to and a live indicator of what percentage Christmas it is right now.

Merry Christmas everybody!

What's cooking in the Last.fm playlisting lab

Thursday, 27 September 2012
by Mark Levy
filed under About Us
Comments: 47

In the Music Information Retrieval team here at Last.fm we’re currently developing a new generation of smart playlisting engines, and we’d like to take the chance to give you a sneak preview of what they can do, as well as explaining a bit more about playlisting services in general.

You can think of playlisting engines as falling into two categories: one repeatedly chooses which track to stream next when you’re listening to an internet radio station like any of Last.fm’s radio stations; the other selects a single set of tracks from a collection all in one go, like iTunes Genius or Google Music’s Instant Mix. While in theory these do similar jobs, as every good scientist knows, the difference between theory and practice is greater in practice than it is in theory, and in practice the requirements for these two types of playlists can be very different. Our new generation service is designed to provide instant playlists from collections of any size, and you can try a demo right now, or read on to find out more.

Last.fm instant playlisting

We’ll talk a bit more about radio playlisting in a separate post, but one of the main characteristics required from the other type of engine is the ability to choose from music collections of wildly varying sizes. Our existing engines have mostly been targeted at very large commercial catalogues containing millions of recordings – you can see them at work in the Last.fm Spotify app (start playing any track, go to the Now Playing tab in the app and click “Similar Tracks Playlist”).

The new generation of engines is designed to continue to do a really good job when choosing tracks from small personal collections. In practice that means we can’t rely on any single type of information to tell us which other tracks might be a good match for any particular playlist. Luckily thanks to your scrobbles and tags, and a bit of audio analysis and machine learning magic on our side, we have three independent types of information linking artists and tracks. Another new feature is the ability to generate playlists based on mood and other musical properties. Finally when playlisting from personal collections we’ve been able to experiment with ways of choosing the sequence of tracks that aren’t restricted by licensing rules.

But we know we still have a huge amount to learn before any machine can approach the skill of a human DJ, so we’ve built a simple demo to let you try out the services. Please let us know how you think we’re doing and we’ll incorporate your feedback into our final version of the new engines. Thanks for listening!

Genre Timelines and More Distinctive Lyrics

Thursday, 6 September 2012
by Janni Kovacs
filed under About Us
Comments: 10

For the past five months I have had the honour of being the next data team intern at Last.fm, building software and trying to make sense of what people now call Big Data™. In particular during my time here I looked at biographical data for artists, i.e. the place and the year a band was formed. This data is generated by Last.fm’s users and attached to artists’ wiki pages (see the factbox on the right of the page). There’s a nice number of artists where this type of data is available, so I was wondering what kind of analyses we could do with it.

When did this genre take off?

One thing that I was looking for in the data was empirical evidence of when certain genres became popular. Since we have a massive amount of user tag data available we can easily correlate tags and years and measure “popularity” of a genre by counting the number of artists formed in a specific year. Even with this data being skewed a bit towards the more popular artists, you can definitely see spikes of popularity for certain genres where you’d expect them:
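The counting itself is simple, and a sketch of it might look like this. The `artist_fact` record and the function names are hypothetical and only stand in for the real data, which lives in our tag and wiki datasets.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical record: one artist with its formation year and tags.
struct artist_fact
{
  std::string name;
  int year_formed;
  std::vector<std::string> tags;
};

// tag -> (year -> number of artists with that tag formed in that year)
typedef std::map<std::string, std::map<int, int> > genre_timeline;

genre_timeline count_formations(std::vector<artist_fact> const & artists)
{
  genre_timeline timeline;
  for (std::size_t i = 0; i < artists.size(); ++i)
  {
    artist_fact const & a = artists[i];
    for (std::size_t t = 0; t < a.tags.size(); ++t)
      ++timeline[a.tags[t]][a.year_formed];
  }
  return timeline;
}

// Year in which the most artists carrying `tag` were formed
// (the earliest such year if several are tied).
int peak_year(genre_timeline const & timeline, std::string const & tag)
{
  genre_timeline::const_iterator genre = timeline.find(tag);
  assert(genre != timeline.end() && !genre->second.empty());
  std::map<int, int>::const_iterator best = genre->second.begin();
  for (std::map<int, int>::const_iterator it = genre->second.begin();
       it != genre->second.end(); ++it)
  {
    if (it->second > best->second)
      best = it;
  }
  return best->first;
}
```

Plotting the per-year counts for each tag gives one line per genre; `peak_year` just picks out the highest point of such a line.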

Props to our users getting punk and post-punk in the right order!

If you’re a fan of metal music maybe the following chart, showing the progression of metal subgenres from hard rock to death metal, will be of interest:

Distinctive lyrics for cities

Andrew did a fantastic job a while ago generating distinctive lyrics for certain genres. I was wondering if we could generate distinctive lyrics for cities as well. By taking about 75,000 song lyrics, matching them to artists’ location metadata from our wikis and applying a simple term frequency function to each word, we can generate a list of words that occur in some cities more often than in others. Please take these results with a grain of salt as they are skewed by several factors, especially towards the more popular artists:
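One plausible form of such a term frequency function is sketched below. The exact scoring used for the images is not spelled out here, so treat this as an assumption: a word's score is its share of a city's lyrics divided by its share of all lyrics, and all names in the code are made up.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

typedef std::map<std::string, double> word_counts;
typedef std::vector<std::pair<double, std::string> > scored_words;

// Count whitespace-separated words (no case folding or stemming).
word_counts count_words(std::string const & lyrics)
{
  word_counts counts;
  std::istringstream words(lyrics);
  std::string word;
  while (words >> word)
    counts[word] += 1.0;
  return counts;
}

// Assumed scoring: (share of the word in the city's lyrics) divided by
// (share of the word in all lyrics). Most distinctive words come first.
scored_words distinctive_words(
    std::map<std::string, std::string> const & lyrics_by_city,
    std::string const & city)
{
  typedef std::map<std::string, std::string>::const_iterator city_iter;
  city_iter home = lyrics_by_city.find(city);
  assert(home != lyrics_by_city.end() && "unknown city");

  // Word counts over every city, including the one we are scoring.
  word_counts global;
  double global_total = 0.0;
  for (city_iter it = lyrics_by_city.begin(); it != lyrics_by_city.end(); ++it)
  {
    word_counts const counts = count_words(it->second);
    for (word_counts::const_iterator w = counts.begin(); w != counts.end(); ++w)
    {
      global[w->first] += w->second;
      global_total += w->second;
    }
  }

  word_counts const local = count_words(home->second);
  double local_total = 0.0;
  for (word_counts::const_iterator w = local.begin(); w != local.end(); ++w)
    local_total += w->second;

  scored_words scores;
  for (word_counts::const_iterator w = local.begin(); w != local.end(); ++w)
  {
    double const local_share = w->second / local_total;
    double const global_share = global[w->first] / global_total;
    scores.push_back(std::make_pair(local_share / global_share, w->first));
  }
  std::sort(scores.begin(), scores.end(),
            std::greater<std::pair<double, std::string> >());
  return scores;
}
```

A word that is common everywhere ends up with a score close to 1, while a word used mostly by artists from one city scores well above 1 for that city.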

Click to open full images in a new window.

Warning: they contain lyrics you may find offensive. Not safe for work.

London

Atlanta
 

Los Angeles

New York

Seattle

I really like that “sorry” is in London’s top 10.

In internships you’ll often find that you’re given pointless work just to occupy yourself. This is not the case at Last.fm. You’ll be able to work on in-production code and be given plenty of time to do things on your own, whatever interests you. So even though the ball pit is no more (turns out they have to be cleaned once in a while), if you enjoy working on backend software and exploring immense data sets then this is the right place to do it.

How are you feeling today?

Thursday, 30 August 2012
by Mark Levy
filed under Announcements
Comments: 23

Just over a year ago the Music Information Retrieval team here at Last.fm embarked on a project to see how well we might be able to identify musical characteristics of songs by a process of automatic analysis. Our aim was to fill in some of the gaps left by our existing tagging system.

Last.fm tags make up an astonishing encyclopedia of descriptions, and are a testament to the generosity, knowledge and enthusiasm of our community of users. Together with scrobbles, tags help us power recommendations, radio, and many of the most interesting services that we offer. Although you can make up any tag you like, we noticed that in practice most people use tags that describe genre, or closely related things such as the era or nationality of an artist. On the other hand tags rarely describe the sound of songs in musical terms, and they talk about subjective things like mood less often than you might imagine, given the close connection that most of us experience between music and our feelings about life.

Last.fm mood report

The potential benefits of having a new and separate strand of information about music were obvious, but the big challenge for this project was that existing methods of automatic music tagging simply didn’t work very well. Nine blog posts, two published research papers, three public and numerous internal demos, several hack days, and a great many man hours later, we think we’re starting to get somewhere, and we’d like to show you some results.

As a first taster we’ve put together a visualization of your musical mood over the past 120 days, based on automatically computed machine tags for the tracks which you’ve scrobbled during that time. While individual tags are still far from perfectly accurate, we think that when taken together over all your listening week by week they still paint an interesting picture – one that stands a chance of reflecting real changes in your musical life. Enjoy, and please let us know if you find them interesting!

last.json

Wednesday, 15 August 2012
by Sven Over
filed under Code and Announcements
Comments: 3

Our latest offering of open source software from the Last.fm headquarters is last.json, a JSON library for C++, that you can now find on GitHub. If you are coding in C++, need to work with JSON data and haven’t found a library that you like, do check it out.

We at Last.fm benefit a lot from open source software. Almost all our servers run Linux, the main database system is PostgreSQL, and our big data framework for data analysis is based on Hadoop, just to name a few examples. Of course, not all of the software needed to run Last.fm is freely available, so we have had to write lots of code ourselves. When a building block is missing from the open source universe and we have to carve it ourselves, and we think our solution is good and general enough to be useful for other people, we like to contribute back to the community and release it as free and open source software.

JSON has become hugely popular as a format for data exchange in the past few years. The name JSON stands for “JavaScript Object Notation”, and it is really just the subset of the programming language JavaScript’s syntax that is needed to describe data. A valid bit of JSON is either a number (say, 12 or -5.3), a truth value (true or false), a string literal (“hello world!”), the special value null (a placeholder for missing or unassigned data), or one of the following two: lists of JSON values and mappings of property names to JSON values. These last two data types allow you to express almost any data using JSON. A list could be [1,2,3] or [99, “bottles of beer”]. It is literally a list of data elements, which can be of identical type (like the all numbers list in the first example), or different types (like a number and some text in the second example). You can add structure to your data using mappings: { “object”: “bottle of beer”, “quantity”: 99 }. A mapping is basically a set of key-value pairs, where the key is a bit of text (“object” and “quantity” in the example) and the value can have the form of any of the JSON data types.

Now you know all the rules of JSON data. The reason why it is so versatile is that you can nest those data types: any element of a list or any value in a mapping can itself be a list or a mapping, or any of the primitive data types. This is perfectly valid JSON:

{
  "artist": "White Denim",
  "similar artists": ["Field Music", "Unknown Mortal Orchestra", "Pond"],
  "toptracks": [
    { "title": "Street Joy", "scrobbles_last_week": 739 },
    { "title": "It's Him!", "scrobbles_last_week": 473 },
    { "title": "Darlene", "scrobbles_last_week": 386 }
  ]
}

You can imagine how this can be used to describe virtually any data structure. It is much simpler than XML and many other data formats. And the good thing is that not only computers are able to read JSON, humans are, too! As you can see in the example, not only can you read the data, you understand immediately what it is about. More often than not, JSON data is self-explanatory.

So, as I said before, JSON has become very popular for data exchange. It is a breeze to use in JavaScript (which is not surprising, because any JSON is also valid JavaScript) and many other programming languages like Python, Perl or Ruby. If you are familiar with any of these languages, you probably see that these languages have data types very similar to the JSON types, and it is therefore easy to represent and work with JSON data in those languages.

Unfortunately, less so in C++. C++ is statically typed, which means that you always declare a variable with a specific type. It can be a number or a text string if you want, but you have to decide which one it shall be at the time you are writing your programme code. There are standard types for lists and mappings, too, but those require their data members to be of identical type. So you can have a list of numbers, or a list of strings, but not a list of items that could individually be a number or a string.
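To see what it takes to get around this, here is a rough sketch of a value type that remembers at runtime what it holds. It only illustrates the general technique: `any_value` is a hypothetical class written for this post, it covers just numbers, text and lists, and it says nothing about how any particular JSON library is implemented.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical sketch of a runtime-typed value: each instance carries
// a tag saying whether it is a number, a piece of text or a list.
class any_value
{
public:
  enum kind { number, text, list };

  static any_value from_number(double d)
  {
    any_value v(number);
    v.number_ = d;
    return v;
  }

  static any_value from_text(std::string const & s)
  {
    any_value v(text);
    v.text_ = s;
    return v;
  }

  static any_value from_list(std::vector<any_value> const & items)
  {
    any_value v(list);
    v.items_ = items;
    return v;
  }

  kind type() const { return kind_; }

  // Accessors throw if the value does not hold the requested type.
  double get_number() const { require(number); return number_; }
  std::string const & get_text() const { require(text); return text_; }
  any_value const & at(std::size_t i) const { require(list); return items_.at(i); }

private:
  explicit any_value(kind k) : kind_(k), number_(0.0) {}

  void require(kind expected) const
  {
    if (kind_ != expected)
      throw std::runtime_error("unexpected value type");
  }

  kind kind_;
  double number_;
  std::string text_;
  std::vector<any_value> items_;  // elements may be of different kinds
};
```

A `std::vector<any_value>` can now hold a number next to a string, and asking a text element for a number raises an exception instead of silently returning garbage.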

We use C++ for many of our backend data services, because it is fast and not resource hungry. If you have a good level of understanding, you can do great things in C++, and we love to use it for certain tasks. When we first wanted to use JSON for data exchange in our C++ programmes, we looked for a good library that makes it easy to juggle JSON data, but we couldn’t find one that really satisfied our needs. So we spent some time writing our own library. And because we think it’s not too bad, and other people might have the same needs, we have now open sourced it under the MIT license, which basically means that you can use it freely in your own projects, but we refuse any liability for bugs or whatever could go wrong with it.

So, how do you work with JSON using last.json? The library defines a datatype lastjson::value which can hold any JSON data. You can check at runtime what data type it actually holds, and then convert it (or parts of it) to standard C++ types. The best practice, however, is to use it much like you would in those scripting languages I mentioned earlier: you just access elements of lists or mappings as the data types you expect them to be. If the JSON data does not have the structure you assumed, the last.json library will throw an exception that you can catch. Imagine you have a variable std::string json_data that contains the JSON fragment from the example above (the one about White Denim):

lastjson::value data = lastjson::parse(json_data);

This parses the JSON string into the lastjson::value representation. And these are a few things you can do with the parsed JSON data:

try
{
  std::cout
    << "Artist name: "
    << data["artist"].get_string()
    << std::endl
    << "Second similar artist: "
    << data["similar artists"][1].get_string()
    << std::endl
    << "Top track last week: "
    << data["toptracks"][0]["title"].get_string()
    << std::endl
    << "... with "
    << data["toptracks"][0]["scrobbles_last_week"].get_int()
    << " scrobbles."
    << std::endl;
}
catch (lastjson::json_error const & e)
{
  std::cerr
    << "Error processing JSON data: "
    << e.what()
    << std::endl;
}

last.json tries to make working with the JSON data as easy as in scripting languages. This was just an example, and last.json has many more cool features. So if C++ is your language of choice, go and check it out now.

Design Changes to Last.fm

Friday, 3 August 2012
by Simon Moran
filed under Announcements and Design
Comments: 49

For the last few months, we’ve been working on some design improvements, and after a couple of weeks in beta, we’re ready for our first full release. We’re pretty excited, and we wanted to share some of the details of the new design with you.

What’s new?

On almost every page on the site, we’ve moved the secondary navigation menu from the left side of the page to the upper right. This gives you a wider page, with more space for what matters: the content. On pages where there are a lot of items in the navigation menu, we’ve grouped the less frequently used items into a small dropdown menu on the right.

Old navigation:

New navigation:

We’ve also redesigned Artist, Album and Track pages from scratch, and rebuilt the page templates completely. Have a look:

An Artist page: http://www.last.fm/music/The+Maccabees
An Album page: http://www.last.fm/music/Rihanna/Loud
A Track page: http://www.last.fm/music/Micachu/_/Golden+Phone

There are three main aspects to the changes:

Tidier, more rational layout.

These pages are very rich in information, and as the site has developed we’ve added more and more content to them. Our user research indicated that it was time to step back and take a fresh look at how the pages were laid out.

The new design groups actions and information together logically so that it’s easier to locate things on the page, and it’s laid out hierarchically, with the things most people use most often nearer the top of the page. We’ve also removed some less-important things from the main page, though most content is still accessible through the menu at the top of the page.

Fresher visual design

We regularly go out and talk to people about Last.fm, and ask how we can improve things. In response to user feedback, we’ve updated the visual design of the page with more emphasis on images, more legible text, and cleaner, simpler graphics.

New page templates

We’ve built brand new page templates, which are more flexible and dynamic, so that your pages load faster and you spend less time waiting for pages to refresh. We’ve only just started to explore the possibilities of the new templates, so expect more optimisations and speed improvements in coming weeks.

We’ve also taken the first steps towards “responsive design” – which means pages working just as well on your mobile and tablet as they do on a full-size web browser. There’s still more work to do before we can release this, so stay tuned!

What’s next?

We’re going to continue updating the site gradually, over the coming weeks and months. We’re also going to address the feedback we’ve already had, from the beta release, with further tweaks and improvements to get the pages just how you want them.

Thanks for reading! We’d love to hear what you think of the new designs, either in the comments here or in our forums.