Word clouds are a useful tool for generating a quick, visual depiction of large amounts of textual data. For example, word clouds of politicians’ speeches can highlight the central theme of each speech. In the example I present here, we will use the baby names data provided by the Social Security Administration, available here.
Considering baby names registered in the year 2013, we generate a name cloud that looks like this:

where we can clearly see that Emma, Sophia and Olivia were popular baby names in 2013. The above word cloud was generated using the Python wordcloud module, available here.
The code for generating the above word cloud is pasted here:
#!/usr/bin/env python
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
from wordcloud import WordCloud
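## reads data file (unicode dtypes so the names are str, not bytes, under Python 3)
my_data = np.genfromtxt('yob_2013.txt',
                        delimiter=',',
                        dtype=[('name','U50'),('gender','U1'),('count','i8')],
                        encoding='utf-8')
names = my_data['name']
freqs = my_data['count']
## creates a dict of name -> freq, which recent wordcloud versions expect;
## names listed under both genders have their counts summed
words = {}
for name, freq in zip(names, freqs):
    words[str(name)] = words.get(str(name), 0) + int(freq)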
## sets the geometry of the word cloud
cloud_mask = np.array(Image.open("cloud_outline.png"))
wc = WordCloud(background_color="white",mask=cloud_mask)
wordcloud = wc.generate_from_frequencies(words)
## prints to screen
plt.imshow(wordcloud)
plt.axis('off')
plt.show()
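Before drawing a cloud, it can help to check which names will dominate it. The following is a minimal sketch, using only the standard library, that parses rows in the same name,gender,count layout and sums the counts of names that appear under both genders. The helper name load_name_counts and the three sample rows are made up for illustration; they are not taken from the SSA file.

```python
import csv
import io
from collections import Counter


def load_name_counts(lines):
    """Sum the counts per name from rows shaped like 'name,gender,count'."""
    counts = Counter()
    for name, _gender, count in csv.reader(lines):
        counts[name] += int(count)
    return counts


# Made-up sample rows in the SSA layout; real input would be open('yob_2013.txt').
sample = io.StringIO("Emma,F,100\nAvery,F,40\nAvery,M,10\n")
name_counts = load_name_counts(sample)

# The largest counts are the names that will be drawn biggest in the cloud.
print(name_counts.most_common(2))
```

A Counter is a dict subclass, so the result should be usable wherever a plain dictionary of name frequencies is expected.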
Whilst interesting and useful, the cloud does not tell us much about popular boy names. It would be useful to filter the data and generate a name cloud of only male names. This is easy to do with NumPy’s boolean indexing:
my_data = np.genfromtxt('yob_2013.txt',
                        delimiter=',',
                        dtype=[('name','U50'),('gender','U1'),('count','i8')],
                        encoding='utf-8')
names = my_data['name']
freqs = my_data['count']
genders = my_data['gender']
## Uses a NumPy boolean mask to keep only the male rows
is_male = genders == 'M'
names = names[is_male]
freqs = freqs[is_male]
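To make the filtering step concrete, here is a small sketch of what the comparison produces, using a three-row array of made-up values in place of the SSA data. The comparison yields a boolean array, and indexing with that array keeps only the rows where it is True.

```python
import numpy as np

# Made-up values standing in for the columns read from the data file.
names = np.array(['Emma', 'Noah', 'Liam'])
genders = np.array(['F', 'M', 'M'])
freqs = np.array([100, 90, 80])

# Element-wise comparison gives one True/False per row.
is_male = genders == 'M'

# Indexing with the boolean mask keeps only the rows marked True.
male_names = names[is_male]
male_freqs = freqs[is_male]

print(is_male.tolist())      # [False, True, True]
print(male_names.tolist())   # ['Noah', 'Liam']
print(male_freqs.tolist())   # [90, 80]
```

Because the same mask is applied to both arrays, each name stays aligned with its count.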
After the filtering, we obtain the following word cloud of boy names registered in 2013:

Good luck making your own interesting and useful word clouds!
-Simon