Python, Spotify, and 2021 Holidays

  • March 4, 2023
  • CJ
  • 5 min read

In the same way that I mindlessly browse streaming services looking for Matt Berry shows that I already know are unavailable to me, I browse datasets looking for something that I can’t define. For whatever reason, this usually happens at 1 A.M. So, I browsed until my eyes burned and settled on data from Spotify.

With the dataset chosen, I had to determine what it is that I wanted to know. I may have heard a Mariah Carey song in an ad or maybe I was singing a Mariah Carey song in my head. Regardless, I decided to look at the top five songs on holidays.

To see the code I used and to skip the rest of these useless words, go to my skimpy GitHub page.

The Data

The dataset was pulled from Kaggle. It contains weekly data from Spotify track charts between 2014 and 2022. It has 626,475 rows and the following 10 columns:

  • track_id – Spotify track ID
  • name – Name of the track
  • country – Country code of the country for the chart entry
  • date – Date the track appeared in the country’s chart
  • position – Position at which the track appeared in the country’s chart
  • streams – Number of streams the track had until ‘date’
  • artists – All artists involved in creating the track
  • artist_genres – Genres of the artists featured on the track
  • duration – Duration of the track in milliseconds
  • explicit – Whether or not the track has explicit content

Be Prepared, Data

After looking at the data, I found that all the columns had a datatype of “object” except for the “position”, “duration”, and “explicit” columns. Because I am such the rebel, I want the “date” column to have a datatype of “date”. I also want the “streams” column to have a datatype of “integer”.
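For anyone following along, here is a minimal sketch of that type conversion. It assumes the data is in a pandas DataFrame and that the dates and stream counts arrive as text. The tiny `df` below is a made-up stand-in with placeholder values, not the Kaggle file.

```python
import pandas as pd

# Stand-in for the chart data (placeholder values, not real chart entries).
df = pd.DataFrame({
    "name": ["Track A", "Track B"],
    "date": ["2021-12-25", "2021-10-31"],
    "streams": ["1500000", "900000"],
})

# Parse the text dates into pandas datetime values.
df["date"] = pd.to_datetime(df["date"])

# Convert the stream counts from text to 64-bit integers.
df["streams"] = pd.to_numeric(df["streams"]).astype("int64")

# Show the resulting column types.
print(df.dtypes)
```

If the real column holds anything that is not a clean number, `pd.to_numeric` will raise an error, which is a handy way to find out before doing any math on it.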

Though I believe in the life mantra of “all is null”, null fields in this case are not helpful. Fortunately, only 138 entries in the “name” column are null, and all of them are from 2017. So, away they go.
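A minimal sketch of that null check and cleanup, again assuming a pandas DataFrame. The three rows below are placeholders I made up so the example runs on its own.

```python
import pandas as pd

# Stand-in data with one missing track name (placeholder values).
df = pd.DataFrame({
    "name": ["Track A", None, "Track C"],
    "date": pd.to_datetime(["2021-12-25", "2017-03-17", "2021-10-31"]),
})

# Count the missing values in each column.
print(df.isna().sum())

# Check which years the rows with a missing name come from.
missing_years = df.loc[df["name"].isna(), "date"].dt.year.unique()
print(missing_years)

# Drop the rows that have no track name and renumber the index.
df = df.dropna(subset=["name"]).reset_index(drop=True)
```

Using `subset=["name"]` keeps the drop targeted, so a null in some other column would not silently remove a row.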

Next up is to get rid of the brackets and apostrophes in the “artists” and “artist_genres” columns. This is why I imported the re module. After running the lines shown below, the unwanted characters disappear. There may be an easier way to do this but this solution worked and made sense to me. Thank you, RegEx.
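As a rough sketch of that cleanup, assuming the values are stored as text that looks like a Python list, something like the following would do it. The helper name `strip_list_chars` is my own invention for illustration.

```python
import re

def strip_list_chars(text: str) -> str:
    """Remove square brackets and apostrophes from a list-like string."""
    return re.sub(r"[\[\]']", "", text)

raw_artists = "['Mariah Carey']"
raw_genres = "['dance pop', 'pop', 'urban contemporary']"

print(strip_list_chars(raw_artists))  # Mariah Carey
print(strip_list_chars(raw_genres))   # dance pop, pop, urban contemporary
```

On a pandas column, `df["artists"].str.replace(r"[\[\]']", "", regex=True)` should give the same result without a helper function. One caveat: this pattern also strips apostrophes that belong to a name, so an artist with an apostrophe in their name would lose it.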

Now that the data is clean enough for a newb like me, I will filter it for each chosen holiday by the holiday’s 2021 date and by the United States charts. I will also sort the values by number of streams in descending order and limit the returned values to five. So, the code for each holiday will look something like this:
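A minimal sketch of that filter, assuming a pandas DataFrame with “date” already converted to datetime. The function name `top_five`, the country code `"us"`, and the sample rows are all assumptions for illustration; the real dataset may spell the country code differently.

```python
import pandas as pd

def top_five(df, holiday_date, country="us"):
    """Return the five most-streamed chart entries for one date and country."""
    on_holiday = df[
        (df["date"] == pd.Timestamp(holiday_date)) & (df["country"] == country)
    ]
    return on_holiday.sort_values("streams", ascending=False).head(5)

# Stand-in data (placeholder names and stream counts).
df = pd.DataFrame({
    "name": ["A", "B", "C", "D", "E", "F", "G", "H"],
    "country": ["us"] * 6 + ["gb", "us"],
    "date": pd.to_datetime(["2021-12-25"] * 7 + ["2021-10-31"]),
    "streams": [10, 20, 30, 40, 50, 60, 999, 888],
})

result = top_five(df, "2021-12-25")
print(result[["name", "streams"]])
```

Wrapping the filter in a function means each holiday only needs a different date passed in, rather than a copy of the same three lines.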

Now, it is time to go on holiday.

Christmas

I will go out on a limb and say that most of the holiday-themed songs listened to in the United States are Christmas songs. “Holiday song” is basically synonymous with “Christmas song”. Search for “top holiday songs” in Google and click away until you find something not related to Christmas.

I expected “All I Want for Christmas Is You” by Mariah Carey to dominate on Christmas but after looking at the data, I found that it was in fourth place by position and number of streams. Jingle bell rock? More like jingle bell shocked. Am I right?

*crickets*

The top five songs are all Christmas songs. Even when looking at data from the week before Christmas, the top five songs were Christmas songs.

Halloween

People may not croon about Halloween like they do Christmas. However, there are several songs played during Halloween because they are on theme, whether that theme be spooky, scary, horror, chilling, or whatever else. The song “Monster Mash” was the first song that came to mind, and that seems to be true of Spotify users as well.

Independence Day/4th of July

Outside of “Born in the USA”, I did not know what to expect. Though, given the titles, “Courtesy of the Red, White, and Blue (The Angry American)” and “American Kids” seem to fit a 4th of July theme of “America”. One thing to note: two days later, these songs drop out of the top five.

“Juggernaut” does not seem to fit into a 4th of July theme. However, the song is from Tyler, The Creator’s album, Call Me If You Get Lost, and that album debuted on June 25, 2021—nine days before the 4th of July. So, it makes sense that a recent release would show in the top five.

Kid Rock’s “All Summer Long” is from his 2007 album, Rock n Roll Jesus. So, it was not a new release at the time. I could not find any info related to why it would have shown up on the chart in 2021. So, my guess is that it plays into the summer vibe.

St. Patrick’s Day

My expectation was that the top five would be a hodgepodge of recent releases or random songs. Though, after seeing the list, I was not surprised to see Celtic punk band Dropkick Murphys in the number one spot.

Thanksgiving

Because of November releases, the top five belonged to two artists.

Adele’s fourth studio album, 30, dropped on November 19, 2021, and the television special Adele One Night Only aired on CBS on November 14, 2021. Taylor Swift’s album, Red (Taylor’s Version), was released on November 12, 2021. Daydreamers and Swifties rejoice.

 
