Feature Engineering and Exploratory Data Analysis
This week I carried out feature engineering and exploratory data analysis (EDA), using the Alabama data as a prototype.
In the feature engineering phase, I used a Python program to merge the block-group-level broadband speed data (summarized to the maximum speed per block group) with the median income data and the U.S. Census shapefiles, producing a single unified GeoDataFrame.
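A minimal sketch of that merge step is below. It is not my actual script: the column names (`GEOID`, `max_down_mbps`, `median_income`) and the sample rows are made up for illustration, and it uses plain pandas so it runs without any spatial libraries. Joining to the shapefile geometry works the same way, by calling `merge` on a GeoDataFrame with the same key.

```python
import pandas as pd

# Hypothetical speed records, several per block group.
# GEOIDs are kept as strings so Alabama's leading "01" is not lost.
speeds = pd.DataFrame({
    "GEOID": ["010010201001", "010010201001", "010010201002", "010010202001"],
    "max_down_mbps": [25.0, 100.0, 10.0, 940.0],
})

# Hypothetical median income table, one row per block group.
income = pd.DataFrame({
    "GEOID": ["010010201001", "010010201002", "010010202001"],
    "median_income": [41000, 28500, 67250],
})

# Summarize to one row per block group, keeping the maximum speed.
max_speed = speeds.groupby("GEOID", as_index=False)["max_down_mbps"].max()

# Join income onto the speed summary; validate raises if a key is duplicated.
merged = max_speed.merge(income, on="GEOID", how="left", validate="one_to_one")
print(merged)
```

The `validate="one_to_one"` check is a cheap guard: if either table still has more than one row per block group, the merge fails instead of silently duplicating rows.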
From the merged data I derived three features:
The Socioeconomic Vulnerability Index (SVI): Calculated by inverting the normalized median income, this score quantifies digital inequity risk. Values closer to 1 indicate high vulnerability (low income), and values closer to 0 indicate low vulnerability (high income).
Neighbor Average Speed: This spatial feature is the average broadband speed of all adjacent block groups, and it serves as a proxy for regional infrastructure investment.
Neighbor Average SVI: This feature indicates whether a block group sits within a larger cluster of vulnerable areas. Where it does, the digital divide is a community-wide issue rather than an isolated one.
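The three features can be sketched roughly as follows. This is an illustration under assumptions, not the exact code I ran: it assumes min-max normalization for the income (the normalization method could differ), and the four block groups, their values, and the `neighbors` adjacency mapping are all hypothetical. In practice the adjacency would be derived from the shapefile geometries.

```python
import pandas as pd

# Hypothetical block groups with made-up values.
bg = pd.DataFrame(
    {
        "median_income": [30000.0, 50000.0, 70000.0, 40000.0],
        "max_speed": [10.0, 100.0, 940.0, 25.0],
    },
    index=["A", "B", "C", "D"],
)

# Hypothetical adjacency: which block groups share a border with which.
neighbors = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B"], "D": ["A"]}

# SVI: min-max normalize income to [0, 1] (an assumption), then invert
# so that the lowest income gets a score of 1.
inc = bg["median_income"]
bg["svi"] = 1.0 - (inc - inc.min()) / (inc.max() - inc.min())


def neighbor_mean(column):
    """Average a column over each block group's adjacent block groups."""
    return pd.Series({k: bg.loc[v, column].mean() for k, v in neighbors.items()})


bg["nbr_avg_speed"] = neighbor_mean("max_speed")
bg["nbr_avg_svi"] = neighbor_mean("svi")
print(bg)
```

Note that a block group's own values are excluded from its neighbor averages, so the two spatial features describe the surrounding area only.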