Downloading data from census.gov's ACS
The census.gov website hosts American Community Survey (ACS) data, which holds a wealth of information about demographics, median household income, and more. The data can be downloaded through an API. In this tutorial, I will explain how to download the data using API calls, and how to write a Python script that automates the download with multiple API calls.
Step 1: Get an API key. The API key is needed to make the API calls. Sign up and get the key at this URL: https://api.census.gov/data/key_signup.html
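Once you have the key, it is safer to keep it out of the script itself. Here is a minimal sketch that reads it from an environment variable. The variable name CENSUS_API_KEY and the helper load_census_api_key are my own choices for illustration, not something census.gov prescribes.

```python
import os


def load_census_api_key(env_var: str = "CENSUS_API_KEY") -> str:
    """Return the Census API key stored in an environment variable.

    The variable name is an assumption for this sketch; use any name you like.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Census API key."
        )
    return key
```

In the script further down, API_KEY = load_census_api_key() could then replace the hard-coded key.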
Step 2: Assemble the API call URL. The URL depends on the information we need.
For example: https://api.census.gov/data/2023/acs/acs5?get=NAME,B19013_001E&for=block%20group:*&in=state:36%20county:119&key=xxxxxxxxxxxxxxxxx
In the above API call URL:
B19013_001E is the data point for median household income.
in=state:36%20county:119 restricts the request to New York State (FIPS code 36) and Westchester County (FIPS code 119).
for=block%20group:* requests the data for all block groups in the specified state and county.
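The same URL can be put together in Python instead of by hand. This is a small sketch using only the standard library. The function name build_acs_url and the placeholder key YOUR_KEY are hypothetical; the endpoint and parameters are the ones from the example above.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.census.gov/data/2023/acs/acs5"


def build_acs_url(variables, for_clause, in_clause, key):
    """Assemble an ACS API URL from its query parts."""
    query = urlencode(
        {
            "get": ",".join(variables),
            "for": for_clause,
            "in": in_clause,
            "key": key,
        },
        quote_via=quote,  # encode spaces as %20, as in the example URL
        safe=",:*",       # leave the separators used by the API readable
    )
    return f"{BASE_URL}?{query}"


url = build_acs_url(
    ["NAME", "B19013_001E"],
    "block group:*",
    "state:36 county:119",
    "YOUR_KEY",  # placeholder, not a real key
)
print(url)
```

Changing the variable list or the geography clauses is then a matter of passing different arguments.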
Pasting the API call URL into a browser returns the data as JSON: a list of rows, where the first row holds the column names.
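The API responds with JSON shaped as a list of rows, with the column names in the first row and every value as a string. The sketch below shows one way to turn that shape into a list of dictionaries. The sample payload is made up for illustration (the names and income values are placeholders, not real census figures), and rows_to_records is a hypothetical helper.

```python
import json


def rows_to_records(payload):
    """Convert [header, row, row, ...] into a list of dicts keyed by column name."""
    header, *rows = payload
    return [dict(zip(header, row)) for row in rows]


# Illustrative payload only: placeholder names and values, same shape as the API output.
sample = json.loads("""
[
  ["NAME", "B19013_001E", "state", "county", "tract", "block group"],
  ["Placeholder block group A", "12345", "36", "119", "000100", "1"],
  ["Placeholder block group B", "67890", "36", "119", "000100", "2"]
]
""")

records = rows_to_records(sample)
for record in records:
    print(record["NAME"], record["B19013_001E"])
```

Note that the API appends the geography columns (state, county, tract, block group) after the requested variables.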
Step 3: Automation
Let's say we want to download this data for all of the USA. We can make iterative calls using Python. The Python program below gets the median household income for all census block groups in the USA and saves a separate file for each state (plus the District of Columbia).
#%%
import requests
import pandas as pd
import time
#%%
# Census API key
API_KEY = "xxxxxxxxxxxxxxxxxxxxxxxx"
#%%
# State FIPS codes from 01 to 56, including 11 for the District of Columbia
# (03, 07, 14, 43 and 52 are not assigned to any state)
STATE_FIPS_CODES = [
    '01', '02', '04', '05', '06', '08', '09', '10', '11', '12', '13', '15',
    '16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27',
    '28', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39',
    '40', '41', '42', '44', '45', '46', '47', '48', '49', '50', '51', '53',
    '54', '55', '56'
]
#%%
# API endpoint and parameters
BASE_URL = "https://api.census.gov/data/2023/acs/acs5"
VARIABLES = ["NAME", "B19013_001E"] # Add more variables if needed
#%%
for state_fips in STATE_FIPS_CODES:
    print(f"Downloading data for state {state_fips}...")
    params = {
        "get": ",".join(VARIABLES),
        "for": "block group:*",
        # Separate the geography levels with spaces; requests URL-encodes them.
        # A literal "+" here would be sent as %2B and not read as a separator.
        "in": f"state:{state_fips} county:* tract:*",
        "key": API_KEY
    }
    try:
        # The timeout keeps a stalled connection from hanging the whole run
        response = requests.get(BASE_URL, params=params, timeout=60)
        response.raise_for_status()
        data = response.json()
        # The first row holds the column names, the rest are data rows
        columns = data[0]
        rows = data[1:]
        # Convert to DataFrame
        df = pd.DataFrame(rows, columns=columns)
        # Save to CSV
        filename = f"state_{state_fips}.csv"
        df.to_csv(filename, index=False)
        print(f"✔️ Saved {filename} with {len(df)} records.")
    except requests.exceptions.RequestException as e:
        print(f"❌ Error fetching data for state {state_fips}: {e}")
    # Respectful pause to avoid hammering the API
    time.sleep(1)
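After the run finishes, you may want a single file instead of one per state. Here is a rough sketch, using only the standard library, that merges the state_*.csv files, builds a 12-digit block group GEOID (state + county + tract + block group), and blanks out unusable income values. As far as I know, the API reports estimates it cannot compute as large negative placeholders such as -666666666, so this sketch treats any negative or empty value as missing; that rule, the function name combine_state_files, the median_income column, and the output file name all_states.csv are my assumptions.

```python
import csv
import glob


def combine_state_files(pattern="state_*.csv", out_path="all_states.csv"):
    """Merge per-state CSVs into one file and return the number of rows written."""
    rows_written = 0
    writer = None
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        for path in sorted(glob.glob(pattern)):
            with open(path, newline="", encoding="utf-8") as src:
                for row in csv.DictReader(src):
                    # Values are read as text, so leading zeros in FIPS codes survive
                    row["GEOID"] = (
                        row["state"] + row["county"] + row["tract"] + row["block group"]
                    )
                    try:
                        income = int(row["B19013_001E"])
                    except (TypeError, ValueError):
                        income = None  # empty or non-numeric value
                    # Assumption: negative values are placeholders for missing estimates
                    usable = income is not None and income >= 0
                    row["median_income"] = income if usable else ""
                    if writer is None:
                        writer = csv.DictWriter(out, fieldnames=list(row.keys()))
                        writer.writeheader()
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

Calling combine_state_files() in the folder that holds the state files writes all_states.csv next to them. The original B19013_001E column is kept as-is, so nothing is lost if you prefer a different rule for missing values.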
We can use this approach to download other data points provided by census.gov, which can then be used for further data exploration.