# Executive Summary

Music. It's there when you need to celebrate; it's there when you need to be pumped up; it's also there when you're feeling blue. Whether you are into Rock, Pop, Jazz, Hip Hop, Reggae, whatever, there is some kind of music for you and your mood. We all have our favorite artists and types of music, but what makes a song popular or "hot"? Are there audio and artist features that can predict whether a song will be in rotation on the radio or be labeled a top song? Is there some magical formula in a song's structure that determines whether it will succeed, or is it just lucky guessing? If a song's success could be predicted before it was even released, how valuable would that be to the artist? What about a song's genre? Can a song's genre be discovered using the same features? Are there certain tempos, beats, durations, and so on that set one genre apart from another? These are the questions I attempt to answer.

A research group gathered audio and artist information for one million songs spanning 1922 to 2011. This information includes various audio measurements, artist tagging information such as location and the genres they fall into, and some of the group's own algorithmic calculations for items such as artist familiarity. This dataset is known as the Million Song Dataset (MSD) (https://labrosa.ee.columbia.edu/millionsong/), developed specifically for machine learning purposes. While a million records is enormous, the project also provides a 10,000-song subset, which is what I used for my analysis to try to determine the hotness of a song and whether a song's genre could be predicted. Because of a fair amount of missing data, I also scraped the Billboard Hot 100 charts as far back as they go, from 1958 to 2011. Whether a song made those charts became the label I predicted, in relation to the other variables given in the original MSD.
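To make the data preparation concrete, here is a minimal sketch of how the MSD subset and the scraped Billboard data might be joined. The file names and column names are illustrative assumptions, not the project's actual code:

```python
import pandas as pd

# Load the MSD 10k subset and the scraped Billboard chart entries.
# File and column names here are assumptions for illustration.
msd = pd.read_csv("msd_10k_subset.csv")
billboard = pd.read_csv("billboard_charts_1958_2011.csv")

# Normalize join keys so "The Beatles" matches "the beatles", etc.
for df in (msd, billboard):
    df["artist_key"] = df["artist_name"].str.lower().str.strip()
    df["title_key"] = df["title"].str.lower().str.strip()

# A song is labeled "hot" if its (artist, title) pair ever charted.
charted = set(zip(billboard["artist_key"], billboard["title_key"]))
msd["hot"] = [
    (a, t) in charted for a, t in zip(msd["artist_key"], msd["title_key"])
]
msd["hot"] = msd["hot"].astype(int)
```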

During my analysis, I was able to identify the features most consistent with successful songs. Genre came in as the most important feature, followed by mode (whether the key is major or minor), the tempo of the song, and how loud (in decibels) the song is. On average, popular songs were in a major key, had a tempo of approximately 126 BPM, came from the US, had an average decibel rating, and ran around four minutes long. During the modeling process I compared several different models, including XGBoost, Random Forests, and Logistic Regression. The accuracy scores for predicting whether or not a song was hot generally came in around the same, but the top score was about 0.56, from Random Forest. Not overly strong. I took another step and applied the classification metric of sensitivity to the model, because the goal is to find out how many of the truly hot songs the model actually predicts as hot. Optimizing for sensitivity brought it to around 0.60 but lowered the accuracy to 0.547. My modeling suggests it might be a better idea to predict whether a song is *not* going to be popular. There does not appear to be a one-size-fits-all solution for predicting the popularity of a song.
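As a rough sketch of the model comparison and the sensitivity trade-off, assuming the merged frame from above (the feature columns and the 0.4 threshold below are purely illustrative):

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Feature columns are assumptions; "hot" is the Billboard label built above.
feature_cols = ["tempo", "loudness", "duration", "mode", "key"]
X, y = msd[feature_cols], msd["hot"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=500, random_state=42),
    "XGBoost": XGBClassifier(eval_metric="logloss"),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    preds = model.predict(X_test)
    print(f"{name}: accuracy={accuracy_score(y_test, preds):.3f}, "
          f"sensitivity={recall_score(y_test, preds):.3f}")

# Trading accuracy for sensitivity: lower the probability threshold so
# more borderline songs get called "hot". The 0.4 cutoff is illustrative.
probs = models["Random Forest"].predict_proba(X_test)[:, 1]
tuned = (probs >= 0.4).astype(int)
print(f"tuned sensitivity={recall_score(y_test, tuned):.3f}, "
      f"accuracy={accuracy_score(y_test, tuned):.3f}")
```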

The next question was whether a song's genre could be predicted from the given audio features. Using the same set of models, the initial tests proved nearly useless for predicting a song's genre, at least in the given dataset, with a score of 0.41. This has to do with the 457 different genre tags assigned to the songs, far too many classes to predict accurately. Given this challenge, I took the top 10 genres and created a new dataset with just those genres. This is more representative of how you and I generally think of genres: in bigger buckets. I chose the top genre, Hip Hop, and attempted to predict whether a song fell into it. Using just audio features, accuracy rose to about 0.83. Again, I looked at sensitivity as another metric to see how the model was really performing; after adjusting the probability threshold, the sensitivity score came in at 0.78 while accuracy fell to 0.73. There is some success at predicting Hip Hop, but a lot of songs still get classified as Hip Hop that are really something else. I repeated the process with Roots Reggae, the least common of the 10 genres, and had poorer results: accuracy was 0.89, but sensitivity peaked at 0.61. I simply could not group enough examples to make the model useful. Thinking through this, many genres bleed into other genres as far as sound goes. Think of rock and blues: different genres, but they can have similar-sounding songs. I believe this is why the models have a hard time predicting genres based solely on audio features.
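The reduction to a top-10 genre subset and the binary Hip Hop label might look like the following sketch; the `genre` column name is an assumption about the cleaned data:

```python
# Keep only songs tagged with one of the 10 most common genres,
# then build a binary "is it Hip Hop?" label for one-vs-rest modeling.
top_genres = msd["genre"].value_counts().nlargest(10).index
top10 = msd[msd["genre"].isin(top_genres)].copy()
top10["is_hip_hop"] = (top10["genre"] == "Hip Hop").astype(int)
```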

The given audio features alone were not sufficient to predict a genre. I took it one step further with the Hip Hop genre, adding in non-audio features such as artist name and location, and had improved results: accuracy was 0.76 and sensitivity was 0.76. So genre prediction needs more than just audio features to be accurate. Not overly surprising, but one would have thought there were enough audio differences between rock and reggae music to tell them apart. Music is very mathematically structured, so many genres share similar characteristics, with subtle changes here and there.
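Mixing the numeric audio features with categorical metadata calls for encoding. A minimal sketch using a scikit-learn pipeline with one-hot encoding, with column names again being assumptions:

```python
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

audio_cols = ["tempo", "loudness", "duration", "mode", "key"]
meta_cols = ["artist_name", "artist_location"]  # assumed column names

# One-hot encode the categorical metadata; pass numeric audio through as-is.
preprocess = ColumnTransformer(
    [("meta", OneHotEncoder(handle_unknown="ignore"), meta_cols)],
    remainder="passthrough",
)
pipe = Pipeline([
    ("prep", preprocess),
    ("clf", RandomForestClassifier(n_estimators=500, random_state=42)),
])
pipe.fit(top10[audio_cols + meta_cols], top10["is_hip_hop"])
```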

Overall, this dataset was a good sample for predicting a song's popularity and for predicting its genre. However, a great deal of missing information had to be removed, shrinking the sample size of the dataset. The immense amount of genre tagging also proved challenging and continued to shrink the dataset. If more high-level genre tagging could be done, it would help the modeling process without a doubt. Having more information filled in, such as the song's year (missing in over half the songs), could also have helped. The full million-song dataset is available to use, but at almost 300 GB it needs a server-based environment to store the data and run the models. An attempt was made to use Amazon Web Services (AWS) to mount their snapshot, but due to some problems with Amazon's service and instance availability, my brief attempts were unfortunately thwarted. It would be very interesting to see how the results hold up at a larger scale. A future project, which might help classify genres better, would be to take just the 90+ audio measurements and see how each affects genre classification. A neural network would be beneficial in this instance.
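As a loose illustration of that future direction, a small multilayer perceptron could be a first attempt; this sketch reuses the assumed `audio_cols` and top-10 `genre` labels from above, standing in for the 90+ raw measurements:

```python
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# A small neural net over per-song audio measurements; scaling the
# inputs first helps the network converge.
nn = Pipeline([
    ("scale", StandardScaler()),
    ("mlp", MLPClassifier(hidden_layer_sizes=(64, 32),
                          max_iter=500, random_state=42)),
])
nn.fit(top10[audio_cols], top10["genre"])
```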

While predicting hit songs is more art than science, I was able to provide some approaches that do a bit better than random guessing. Continued analysis of audio features and artist information is the key to success here, and I believe there is much to be done, and much to be gained, by analyzing the data at a greater scope. Can I find the next great YouTube hit for you? Probably not, but I might be able to narrow down a list of songs for you if the same information used in this project is available!