Measuring consensus around given names in the United States from 1880 to 2017
Yesterday, I decided to dust off some SSA data on given names in the US I had been using for an old project. There was one question that I wanted to answer:
How have people’s preferences around naming children converged or diverged over time?
I began by iterating through the individual year files provided by the SSA, reading each one into a pandas DataFrame and combining them all into one enormous DataFrame.
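import pandas as pd

# path, start_year and end_year are assumed to be defined earlier
# collect one DataFrame per yearly SSA file
frames = []
for year in range(start_year, end_year):  # note: range() stops before end_year
    single_file = pd.read_csv(f'{path}yob{year}.txt', header=None, names=['name', 'gender', 'occurrence'])
    single_file['year'] = year
    frames.append(single_file)
# combine everything into the names DataFrame in a single concat
names = pd.concat(frames)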
After that procedure finished, I viewed the head of the resulting DataFrame.
Next, I wrote a function to build a DataFrame containing the number of children given names within the top 10, 100, 1000, etc. names. It dynamically adjusted to accommodate as many or as few top-name buckets as necessary.
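The original function is not shown here, so the following is only a minimal sketch of how such a bucketing step could look. The function name `top_name_buckets`, the choice to merge female and male rows of the same name, and the use of non-overlapping rank bands (1-10, 11-100, 101-1000, ...) are all assumptions made for illustration. Non-overlapping bands are used because a stacked chart needs each child counted exactly once.

```python
import pandas as pd


def top_name_buckets(names: pd.DataFrame) -> pd.DataFrame:
    """Count children per year in rank bands 1-10, 11-100, 101-1000, ..."""
    # Assumption: the F and M rows of a name are merged within each year.
    totals = names.groupby(['year', 'name'], as_index=False)['occurrence'].sum()
    # Rank names within each year (1 = most common); ties keep row order.
    totals['rank'] = (
        totals.groupby('year')['occurrence']
        .rank(method='first', ascending=False)
        .astype(int)
    )
    # Grow the band edges by powers of ten until every rank is covered,
    # so the number of bands adapts to the data.
    max_rank = int(totals['rank'].max())
    edges = [0, 10]
    while edges[-1] < max_rank:
        edges.append(edges[-1] * 10)
    labels = [f'{lo + 1}-{hi}' for lo, hi in zip(edges, edges[1:])]
    totals['bucket'] = pd.cut(totals['rank'], bins=edges, labels=labels).astype(str)
    # One row per year, one column per band, values = number of children.
    counts = totals.groupby(['year', 'bucket'])['occurrence'].sum().unstack(fill_value=0)
    return counts.reindex(columns=labels, fill_value=0)
```

Calling `top_name_buckets(names)` on the combined DataFrame would return one row per year, with as many band columns as the longest yearly name list requires.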
The result of this function was not used directly, but passed to a chart data preparation function, which converted the pandas DataFrame to a NumPy array, transposed it for Matplotlib, and normalized the values to percentages. This normalization was required to display the data in a 100% stacked area chart, where the bands for each year must sum to 100.
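As a rough illustration of that preparation step, here is a sketch under a few assumptions: the name `prepare_chart_data` is hypothetical, the input is taken to have one row per year and one column per bucket, and every year is assumed to contain at least one birth so the division is safe.

```python
import numpy as np
import pandas as pd


def prepare_chart_data(bucket_counts: pd.DataFrame) -> np.ndarray:
    """Return a (buckets x years) array whose columns each sum to 100."""
    # A stacked area plot takes one row per series, so transpose the
    # (years x buckets) table into (buckets x years).
    values = bucket_counts.to_numpy(dtype=float).T
    # Divide each year's column by its total to get percentage shares.
    totals = values.sum(axis=0, keepdims=True)
    return 100 * values / totals
```

The resulting array could then be handed to Matplotlib with something like `ax.stackplot(bucket_counts.index, shares, labels=bucket_counts.columns)`, where `shares` is the returned array.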
Here is the result generated for all children given names with at least 5 occurrences:
Overall, naming consensus among Americans seems to be on the downswing over the last 130+ years, with brief jumps post-WWII, in the 1960s, and in the 1980s. I have a few guesses as to why this is:
- Increase in population leads to a greater selection of names, and more name diversity.
- America is bored with ancient Anglo and Hebrew names.
- Technological acceleration is rending generations apart, compelling parents to reinvent the wheel rather than passing on the names of forefathers.
And a few guesses about the blips:
- These jumps may coincide with increases in birth rate, which could be a proxy for optimism and faith in national traditions, including what to name children.
- The jumps may be the result of periodic celebrity obsession, leading young parents to name their children after those celebrities.
To see versions of this chart focusing on male and female name trends, check out my project on GitHub: