
Exploratory Analysis of the Nobel Laureates Dataset

September 18, 2018

This is a fun data analysis project I did a few years ago, using data on Nobel laureates that I found in

The data had 969 entries giving information on the year of award, category, motivation, prize share, full name, birth date, birth city, birth country, organization country, death date and death country, among others. Some questions:

  • What are the categories?
  • What is the distribution of sex among the laureates?
  • What is the distribution of awards by country?
  • Around what age are Nobel laureates awarded?
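The first three questions reduce to counting rows per value of a column. Here is a minimal sketch with pandas, using a tiny made-up table in place of the real dataset. The column names `Category`, `Sex` and `Birth Country` are assumptions about how the file is laid out, the helper `count_by` is hypothetical, and the rows are placeholders rather than actual laureate records.

```python
import pandas as pd

# Placeholder rows standing in for the real file (not actual laureate records).
# The column names are assumed, not confirmed against the dataset.
df = pd.DataFrame({
    "Category": ["Physics", "Chemistry", "Medicine", "Medicine", "Peace"],
    "Sex": ["Male", "Female", "Male", "Male", "Female"],
    "Birth Country": ["A", "B", "A", "A", "C"],
})

def count_by(frame, column):
    """Return the number of rows per distinct value of `column`, largest first."""
    return frame[column].value_counts()

category_counts = count_by(df, "Category")
sex_counts = count_by(df, "Sex")
country_counts = count_by(df, "Birth Country")

print(category_counts)
print(sex_counts)
print(country_counts.head(25))  # keep only the 25 most frequent countries
```

Each result is a pandas Series indexed by the distinct values, which is a convenient shape to hand to a bar plot.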

I obtained the following plots using seaborn and matplotlib.

There are six categories, with the most awards given in medicine. [Figure: category]

If you’re a Nobel laureate, there’s a high chance you are a man. [Figure: sex]

The US has the most Nobel laureates, counted by organization country. [Figure: countries]

Update: This plot excludes numbers from Literature Nobel laureates since the data does not include their organization countries. By comparison, here is a plot of Nobel laureates by country of birth (top 25 only):


The US dominates in all of the categories shown. [Figure: catdistcountry]

Update: The categories above do not include “Literature”, which is odd. Looking at the pandas DataFrame I created, I found that the “Organization Country” values are missing for rows in the “Literature” category. I have no idea why; perhaps it is for political reasons. It is probably better to look at “Birth Country” instead of “Organization Country”.
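A check like this can be sketched in pandas by measuring, per category, the share of rows with no organization country. The table below is a made-up stand-in for the real DataFrame: the rows are placeholders, and the column names are assumed to match the field names mentioned in this post.

```python
import pandas as pd

# Placeholder rows (not actual laureate records); None marks a missing value.
df = pd.DataFrame({
    "Category": ["Literature", "Literature", "Physics", "Chemistry"],
    "Organization Country": [None, None, "X", "Y"],
    "Birth Country": ["A", "B", "A", "C"],
})

# Share of rows per category with no organization country recorded.
# isna() gives booleans, so the group mean is the fraction that is missing.
missing_share = df["Organization Country"].isna().groupby(df["Category"]).mean()
print(missing_share)

# Number of missing birth countries in this toy table (zero here).
missing_birth = int(df["Birth Country"].isna().sum())
print(missing_birth)
```

A category whose share is 1.0 drops out of any plot grouped by that column, which is the symptom described above.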

Here, however, is the plot of Nobel laureates by country of birth across all categories. [Figure: birthbycountryallcat]

Most Nobel laureates were around 60 years old when they received the award. [Figure: age]
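One simple way to get an age at award from the birth date and award year fields is to subtract calendar years. The sketch below assumes that approach, which may not be exactly how the plot was produced; the helper `age_at_award` is hypothetical and the records are placeholders, not actual laureates.

```python
from datetime import date
from statistics import median

def age_at_award(birth_date: date, award_year: int) -> int:
    """Approximate age at award as the difference in calendar years."""
    return award_year - birth_date.year

# Placeholder (birth date, award year) pairs, not actual laureate records.
records = [
    (date(1900, 5, 1), 1960),
    (date(1910, 1, 15), 1965),
    (date(1920, 9, 30), 1985),
]

ages = [age_at_award(born, year) for born, year in records]
print(ages)          # [60, 55, 65]
print(median(ages))  # 60
```

The year difference can be off by one for people born late in the year, which is small compared with the spread of ages in a histogram.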

And finally, since I am a chemist, I decided to look at the most frequently used words in the motivations for awarding the Nobel Prize in Chemistry. The word cloud below shows this. [Figure: wordcloudnobelchem]
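A word cloud is a picture of word frequencies, so the underlying step is counting words after dropping common filler words. Here is a minimal sketch using only the standard library. The stop-word list and the function `word_frequencies` are assumptions for illustration, and the motivation strings are made up rather than quoted from the dataset.

```python
import re
from collections import Counter

# A small, hand-picked stop-word list (an assumption; real lists are longer).
STOP_WORDS = {"for", "the", "of", "in", "and", "his", "her", "their",
              "to", "on", "a", "an"}

def word_frequencies(texts):
    """Count lowercase words across all texts, skipping stop words."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts

# Made-up motivation strings for illustration, not quotes from the dataset.
motivations = [
    "for the discovery of a new chemical reaction",
    "for studies of the structure of proteins",
    "for the discovery of the structure of an enzyme",
]

freqs = word_frequencies(motivations)
print(freqs.most_common(2))
```

A frequency mapping like `freqs` is the kind of input a word cloud renderer scales its font sizes from.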

In conclusion, the typical Nobel laureate is male, works in medicine, is around 60 years old, and is based in the United States.