Spotify is a neat place to work at. They are very much open about their methods and they publish their code and discoveries on a regular basis, which is not something you come across often in an industry. If one is interested in recommender systems, collaborative filtering and/or music information retrieval, they have loads of information out there to get you started.
Traditionally, Spotify has relied mostly on collaborative filtering approaches to power their recommendations. The idea of collaborative filtering is to determine the users’ preferences from historical usage data. For example, if two users listen to largely the same set of songs, their tastes are probably similar. Conversely, if two songs are listened to by the same group of users, they probably sound similar. This kind of information can be exploited to make recommendations.
Pure collaborative filtering approaches do not use any kind of information about the items that are being recommended, except for the consumption patterns associated with them: they are content-agnostic. This makes these approaches widely applicable: the same type of model can be used to recommend books, movies or music, for example.
Unfortunately, this also turns out to be their biggest flaw. Because of their reliance on usage data, popular items will be much easier to recommend than unpopular items, as there is more usage data available for them. This is usually the opposite of what is wanted. For the same reason, the recommendations can often be rather boring and predictable.
Another issue that is more specific to music, is the heterogeneity of content with similar usage patterns. For example, users may listen to entire albums in one go, but albums may contain intro tracks, outro tracks, interludes, cover songs and remixes. These items are atypical for the artist in question, so they aren’t good recommendations. Collaborative filtering algorithms will not pick up on this.
But perhaps the biggest problem is that new and unpopular songs cannot be recommended: if there is no usage data to analyze, the collaborative filtering approach breaks down. This is the so-called cold-start problem. Spotify wants to be able to recommend new music right when it is released, and they want to tell listeners about new bands they have never heard of. To achieve these goals, they are devising nifty techniques.