Downloading data from census.gov's ACS

 


The census.gov website publishes American Community Survey (ACS) data, a rich source of information on demographics, median household income, and more. The data can be downloaded through an API. In this tutorial, I will explain how to download the data using API calls, and how to write a Python script that automates the download across multiple API calls.

Step 1: Get an API key. The API key is needed to make the API calls. Sign up and get the key at this URL: https://api.census.gov/data/key_signup.html
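Once you have the key, one option is to keep it out of your scripts by reading it from an environment variable. The sketch below assumes a variable named CENSUS_API_KEY; that name and the get_api_key helper are my own choices for illustration, not something the Census API requires.

```python
import os


def get_api_key(env_var="CENSUS_API_KEY"):
    """Return the Census API key stored in an environment variable.

    The variable name is a hypothetical convention, not a Census requirement.
    """
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Census API key."
        )
    return key


# Usage (after setting the variable in your shell):
# API_KEY = get_api_key()
```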

Step 2: Assemble the API call URL. Depending on the information needed, we assemble the URL for the API call.

For example: https://api.census.gov/data/2023/acs/acs5?get=NAME,B19013_001E&for=block%20group:*&in=state:36%20county:119&key=xxxxxxxxxxxxxxxxx

In the above API call URL:
B19013_001E is the variable for median household income.
in=state:36%20county:119 selects New York State (FIPS code 36) and Westchester County (FIPS code 119). The %20 is a URL-encoded space.
&for=block%20group:* requests data for all block groups in the specified state and county.
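The same URL can be assembled in Python from its parts. This is a minimal sketch using only the standard library; build_acs_url is a helper name I made up for illustration, and YOUR_API_KEY is a placeholder for your own key.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.census.gov/data/2023/acs/acs5"


def build_acs_url(variables, for_clause, in_clause, api_key):
    """Assemble a Census API query URL from its parts (hypothetical helper)."""
    params = {
        "get": ",".join(variables),
        "for": for_clause,
        "in": in_clause,
        "key": api_key,
    }
    # quote_via=quote encodes spaces as %20; ':', '*' and ',' are left
    # unescaped so the result reads like the hand-written URL.
    return BASE_URL + "?" + urlencode(params, quote_via=quote, safe=":*,")


url = build_acs_url(
    ["NAME", "B19013_001E"],
    "block group:*",
    "state:36 county:119",
    "YOUR_API_KEY",  # placeholder, not a real key
)
print(url)
```

Changing the in clause (for example to another state and county FIPS pair) is enough to point the same call at a different area.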

Pasting the API call URL into a browser returns the data as JSON: a list of rows, with the column names in the first row.
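To make that shape concrete, here is a small sketch that turns such a list of rows into one dictionary per block group. The payload is made up to mimic the layout (a header row followed by data rows, with values as strings); the names and numbers in it are placeholders, not real ACS figures, and rows_to_records is a helper name chosen for this example.

```python
import json

# Made-up payload in the header-row-plus-data-rows shape.
# The values are placeholders, not real ACS figures.
sample_body = """[
  ["NAME", "B19013_001E", "state", "county", "tract", "block group"],
  ["Example Block Group 1", "100000", "36", "119", "000100", "1"],
  ["Example Block Group 2", "50000", "36", "119", "000100", "2"]
]"""


def rows_to_records(data):
    """Pair each data row with the column names from the first row."""
    header, *rows = data
    return [dict(zip(header, row)) for row in rows]


records = rows_to_records(json.loads(sample_body))
for record in records:
    print(record["NAME"], record["B19013_001E"])
```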

Step 3: Automation

Let's say we want to download this data for all of the USA. We can make iterative calls using Python. The Python program below gets median household income for all census block groups in the USA (the 50 states and DC) and saves a separate file for each state.

#%%
import requests
import pandas as pd
import time
#%%
# Census API key
API_KEY = "xxxxxxxxxxxxxxxxxxxxxxxx"
#%%
# FIPS codes for the 50 states and DC (01 to 56, skipping the unused codes 03, 07, 14, 43, 52)
STATE_FIPS_CODES = [
'01', '02', '04', '05', '06', '08', '09', '10', '11', '12', '13', '15',
'16', '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27',
'28', '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39',
'40', '41', '42', '44', '45', '46', '47', '48', '49', '50', '51', '53',
'54', '55', '56'
]
#%%
# API endpoint and parameters
BASE_URL = "https://api.census.gov/data/2023/acs/acs5"
VARIABLES = ["NAME", "B19013_001E"] # Add more variables if needed
#%%
for state_fips in STATE_FIPS_CODES:
    print(f"Downloading data for state {state_fips}...")

    params = {
        "get": ",".join(VARIABLES),
        "for": "block group:*",
        # Geography levels are separated by spaces; requests URL-encodes them.
        # A literal "+" here would be sent as %2B and break the query.
        "in": f"state:{state_fips} county:* tract:*",
        "key": API_KEY
    }

    try:
        # The timeout keeps the loop from hanging on a stalled connection
        response = requests.get(BASE_URL, params=params, timeout=60)
        response.raise_for_status()

        # The first row holds the column names; the rest are data rows
        data = response.json()
        columns = data[0]
        rows = data[1:]

        # Convert to DataFrame
        df = pd.DataFrame(rows, columns=columns)

        # Save to CSV
        filename = f"state_{state_fips}.csv"
        df.to_csv(filename, index=False)
        print(f"✔️ Saved {filename} with {len(df)} records.")

    except requests.exceptions.RequestException as e:
        print(f"Error fetching data for state {state_fips}: {e}")

    # Respectful pause to avoid hammering the API
    time.sleep(1)

We can use this approach to download the various data points provided by census.gov, which can then be used for further data exploration.
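Before exploring the data, two cleanup steps are usually worth doing. The API returns values as strings, and the ACS uses large negative placeholder values such as -666666666 where an estimate could not be computed. The sketch below converts the income column to numbers, treats every negative value as missing (that rule is my assumption), and builds the 12-digit block group GEOID from the state, county, tract and block group columns. The clean_income_frame helper and the sample rows are made up for illustration.

```python
import pandas as pd


def clean_income_frame(df):
    """Return a copy with numeric income and a 12-digit block group GEOID."""
    out = df.copy()

    # Values arrive as strings; anything unparseable becomes NaN
    income = pd.to_numeric(out["B19013_001E"], errors="coerce")
    # Assumption: negative values are placeholders, so mark them missing
    out["B19013_001E"] = income.where(income >= 0)

    # GEOID = state (2) + county (3) + tract (6) + block group (1).
    # zfill restores leading zeros lost if a CSV was read back as integers.
    out["GEOID"] = (
        out["state"].astype(str).str.zfill(2)
        + out["county"].astype(str).str.zfill(3)
        + out["tract"].astype(str).str.zfill(6)
        + out["block group"].astype(str)
    )
    return out


# Made-up rows in the same column layout as the saved CSV files
sample = pd.DataFrame(
    [
        ["Example Block Group 1", "100000", "36", "119", "000100", "1"],
        ["Example Block Group 2", "-666666666", "36", "119", "000100", "2"],
    ],
    columns=["NAME", "B19013_001E", "state", "county", "tract", "block group"],
)
cleaned = clean_income_frame(sample)
print(cleaned[["GEOID", "B19013_001E"]])
```

When reading the saved files back, passing dtype=str to pd.read_csv keeps the FIPS columns as text, so the leading zeros survive without relying on zfill.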
