Tuesday, June 21, 2016

Summer Fun with Google BigQuery

Nothing beats a game that can keep the whole family entertained for a few hours, while teaching kids some programming sense.

Just two weeks into summer, I already ran out of ideas to keep kids entertained.

Today, Google BigQuery come to the rescue. To be more precise, the USA Name Data provided hours of entertainment.
"This public dataset was created by the Social Security Administration and contains all names from Social Security card applications for births that occurred in the United States after 1879."

After showing my kids a few very simple queries of the USA Name Data, we started to play "Family Feud" game:

Most Popular Names

  • Everyone write down 5 names in secret
  • After we are done, each need to write their own query and run it in BQ to find out total count of these five names in all years
  • Whoever gets the most counts wins
  • Cannot reuse names in the following rounds
(After three rounds, we covered almost all the top 20 names in USA.)

Then we played another game with the dataset with these rules

Least Popular Names

  • Write down 5 names in secret
  • The names must draw at least one count
  • Whoever gets the least counts wins
(We actually got one name with only 5 in all the last 103 years!)

Here is a sample query

SELECT SUM(number), name 
FROM [bigquery-public-data:usa_names.usa_1910_2013] 
WHERE name in ('John', 'Jason', 'Mary', 'David', 'William')
GROUP BY name

Now, after we are done with the game, the following two queries will show us the top answers.

Least Popular Names

SELECT SUM(number), name 
FROM [bigquery-public-data:usa_names.usa_1910_2013] 
GROUP BY name
ORDER BY 1

Most Popular Names

SELECT SUM(number), name 
FROM [bigquery-public-data:usa_names.usa_1910_2013] 
GROUP BY name
ORDER BY 1 DESC

Some Follow Up Questions

While the kids are still keen on it, some follow up questions will keep their brains from getting rusty:
  • How many kids were born in the same year, and with the same name as yours?
  • Which year has the most kids with the same name as yours?
  • Are there any kids with your name in the opposite gender?
  • Which years are "baby boom" years? (Hint: save to Google Sheet, and plot year vs. count(number))
  • Which years have the most "Jacqueline"s and why? (Hint: opportunity for some history lessons about first ladies)
  • Most gender neutral names (Hint: refer to this page)