Baby name word clouds in Python

Word clouds are a useful tool for generating a quick, visual depiction of large amounts of textual data. For example, word clouds of politicians’ speeches can highlight the central theme of each speech. In the example I present here, we will use the baby names data provided by the Social Security Administration, available here.

Considering baby names registered in the year 2013, we generate a name cloud that looks like this:

nameCloud_2013

where we can clearly see that Emma, Sophia and Olivia were popular baby names in 2013. The above word cloud was generated using the Python wordcloud module, available here.

The code for generating the above word cloud is pasted here:

#!/usr/bin/env python
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
from wordcloud import WordCloud

## reads data file
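my_data = np.genfromtxt('yob_2013.txt',
        delimiter=',',
        dtype=[('name','U50'),('gender','U1'),('count','i8')],
        encoding='utf-8')
names = my_data['name']
freqs = my_data['count']

## creates a {name: freq} dict, summing counts for names listed under both genders
## (recent wordcloud versions expect a dict rather than a list of tuples)
words = {}
for name, freq in zip(names, freqs):
    words[str(name)] = words.get(str(name), 0) + int(freq)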

## sets the geometry of the word cloud
cloud_mask = np.array(Image.open("cloud_outline.png"))
wc = WordCloud(background_color="white",mask=cloud_mask)

wordcloud = wc.generate_from_frequencies(words)

## prints to screen
plt.imshow(wordcloud)
plt.axis('off')
plt.show()
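
The frequency input can be checked on its own before any plotting. What follows is a minimal sketch, assuming rows in the `name,gender,count` layout implied by the dtype above. The sample rows and their counts are made up for illustration, and `build_frequencies` is a hypothetical helper, not part of the wordcloud module.

```python
import csv
import io
from collections import Counter

def build_frequencies(lines):
    """Sum counts per name from rows of the form name,gender,count."""
    totals = Counter()
    for name, _gender, count in csv.reader(lines):
        totals[name] += int(count)
    return dict(totals)

# Made-up rows for illustration; real input would be the lines of yob_2013.txt.
sample = io.StringIO("Emma,F,300\nNoah,M,280\nAvery,F,90\nAvery,M,20\n")
freqs = build_frequencies(sample)
print(freqs)
```

A name that is listed under both genders ends up with a single combined count, so it appears only once in the cloud. The resulting dict is the kind of mapping that generate_from_frequencies accepts.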

Interesting as it is, this cloud does not tell us much about popular boy names. It would be useful to filter the data and generate a name cloud of male names only. This is easy to do with NumPy’s boolean indexing:

my_data = np.genfromtxt('yob_2013.txt',
        delimiter=',',
        dtype=[('name','U50'),('gender','U1'),('count','i8')],
        encoding='utf-8')
names = my_data['name']
freqs = my_data['count']
genders = my_data['gender']

## uses NumPy boolean-mask indexing to keep only the male rows
is_male = genders == 'M'
names = names[is_male]
freqs = freqs[is_male]

After the filtering, we obtain the following word cloud of boy names registered in 2013:

nameCloud_2013_males
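
The filtering step can also be tried without the data file. The sketch below assumes the same three-field layout and that each name appears at most once per gender. The sample rows are made up, and `frequencies_for_gender` is a hypothetical helper name chosen for this example.

```python
import numpy as np

# Made-up sample rows in the (name, gender, count) layout used above.
sample = np.array(
    [("Emma", "F", 300), ("Noah", "M", 280), ("Avery", "F", 90), ("Avery", "M", 20)],
    dtype=[("name", "U50"), ("gender", "U1"), ("count", "i8")],
)

def frequencies_for_gender(data, gender):
    """Return {name: count} for rows whose gender field equals `gender`."""
    mask = data["gender"] == gender   # boolean array, one entry per row
    selected = data[mask]             # keeps only the rows where mask is True
    return {str(n): int(c) for n, c in zip(selected["name"], selected["count"])}

boy_freqs = frequencies_for_gender(sample, "M")
print(boy_freqs)
```

Passing "F" instead of "M" gives the corresponding dict for girl names, so the same function can feed either cloud.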

Good luck making your own interesting and useful word clouds!

-Simon
