county_matcher
matching.county_matcher
Module for improved energy community matching using both county and spatial approaches.
Functions
| Name | Description |
|---|---|
| analyze_discrepancies | Analyze discrepancies between spatial and county-based matching. |
| create_ffsa_lookup | Create a lookup dictionary from FFSA data that properly preserves state-county relationships. |
| integrated_matching | Integrated function that conducts both spatial and county-based matching and compares the two methods. |
| is_in_ffsa | Check if a system is in an energy community based on county and state information. |
| match_systems_by_county_with_eligibility | Match solar systems to energy communities based on county and state, using a direct join on both fields. |
| standardize_county_name | Standardize county name by removing suffixes and standardizing format. |
analyze_discrepancies
matching.county_matcher.analyze_discrepancies(results_df)
Analyze discrepancies between spatial and county-based matching.
This function helps identify why certain systems were matched by only one method, which can help improve future matching processes.
Args: results_df: DataFrame with both spatial and county matching results
Returns: DataFrame with discrepancy analysis
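As a rough illustration of the kind of comparison described here, the sketch below labels each system by which method matched it and collects the one-method-only cases. It is not the module's implementation: the helper `classify_match`, the field names `spatial_match` and `county_match`, and the use of plain dictionaries instead of a DataFrame are all assumptions made for the example.

```python
from collections import Counter


def classify_match(spatial_match, county_match):
    """Label one system by which matching method(s) flagged it (hypothetical helper)."""
    if spatial_match and county_match:
        return "both"
    if spatial_match:
        return "spatial_only"
    if county_match:
        return "county_only"
    return "neither"


# Hypothetical records standing in for rows of results_df.
results = [
    {"system_id": 1, "spatial_match": True, "county_match": True},
    {"system_id": 2, "spatial_match": True, "county_match": False},
    {"system_id": 3, "spatial_match": False, "county_match": True},
    {"system_id": 4, "spatial_match": False, "county_match": False},
]

labels = {
    r["system_id"]: classify_match(r["spatial_match"], r["county_match"])
    for r in results
}

# Count systems per category, then keep the ones matched by only one method.
summary = Counter(labels.values())
discrepancies = [
    sid for sid, label in labels.items()
    if label in ("spatial_only", "county_only")
]
```

The systems in `discrepancies` are the ones worth inspecting by hand, for example for a misspelled county name or a coordinate that falls just outside a boundary.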
create_ffsa_lookup
matching.county_matcher.create_ffsa_lookup(ffsa_data)
Create a lookup dictionary from FFSA data that properly preserves state-county relationships to prevent false positives from counties with the same name in different states.
Args: ffsa_data: DataFrame containing FFSA data with county_std and state_std columns
Returns: dict: Dictionary mapping (state, county) tuples to True for matching
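A minimal sketch of the lookup structure described above, keyed on `(state, county)` tuples. The helper name `build_lookup` is hypothetical, and a list of dictionaries stands in for the DataFrame; only the `state_std` and `county_std` field names come from the documentation.

```python
def build_lookup(records):
    """Map (state, county) tuples to True, skipping incomplete records."""
    lookup = {}
    for rec in records:
        state = rec.get("state_std")
        county = rec.get("county_std")
        if state and county:
            lookup[(state, county)] = True
    return lookup


# Hypothetical FFSA rows: two different states share the county name "washington".
ffsa_records = [
    {"state_std": "kentucky", "county_std": "washington"},
    {"state_std": "ohio", "county_std": "washington"},
    {"state_std": None, "county_std": "orphan"},  # no state, so it is skipped
]

lookup = build_lookup(ffsa_records)
```

Because the key includes the state, a Washington County in a state that is not listed does not match, which is the false positive a county-only key would produce.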
integrated_matching
matching.county_matcher.integrated_matching(
solar_systems_df,
CCEC_2023=None,
CCEC_2024=None,
FFSAEC_2023=None,
FFSAEC_2024=None,
ffsa_2023_appendix=None,
ffsa_2024_appendix=None,
debug=True,
)
Integrated function that conducts both spatial and county-based matching with comprehensive comparison between methods to identify discrepancies.
This function leverages both the spatial matching from analysis_utils.py and the county-based matching from county_matcher.py.
Args:
  solar_systems_df: DataFrame containing solar system data
  CCEC_2023: Coal Closure Energy Communities for 2023
  CCEC_2024: Coal Closure Energy Communities for 2024
  FFSAEC_2023: Fossil Fuel Statistical Areas for 2023
  FFSAEC_2024: Fossil Fuel Statistical Areas for 2024
  ffsa_2023_appendix: 2023 FFSA appendix DataFrame
  ffsa_2024_appendix: 2024 FFSA appendix DataFrame
  debug: Whether to enable debug logging
Returns: DataFrame with comprehensive matching results
is_in_ffsa
matching.county_matcher.is_in_ffsa(row, lookup_dict)
Check if a system is in an energy community based on county and state information.
This improved version prioritizes exact state+county matches and only falls back to county-only matches when necessary, preventing false positives from counties with the same name in different states.
Args:
  row: DataFrame row with standardized county and state information
  lookup_dict: Dictionary mapping (state, county) tuples to boolean values
Returns: bool: True if the system is in an energy community, False otherwise
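The prioritization described above might look roughly like the following sketch. The documentation does not say when the county-only fallback is "necessary"; this example assumes it applies only when the row has no state value. The function name `in_lookup` and the `state_std`/`county_std` row keys are likewise assumptions for illustration.

```python
def in_lookup(row, lookup_dict):
    """Return True if the row matches the lookup, preferring exact (state, county) keys."""
    county = row.get("county_std")
    state = row.get("state_std")
    if not county:
        return False
    if state:
        # Exact state+county match: a same-named county elsewhere cannot match.
        return bool(lookup_dict.get((state, county), False))
    # Assumed fallback: state is missing, so compare on county name alone.
    return any(
        flag and key_county == county
        for (_, key_county), flag in lookup_dict.items()
    )


lookup = {("ohio", "washington"): True}

exact = in_lookup({"state_std": "ohio", "county_std": "washington"}, lookup)
wrong_state = in_lookup({"state_std": "texas", "county_std": "washington"}, lookup)
no_state = in_lookup({"state_std": None, "county_std": "washington"}, lookup)
```

Here `exact` and `no_state` are `True`, while `wrong_state` is `False` because a known state that disagrees with the lookup never falls through to the county-only comparison.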
match_systems_by_county_with_eligibility
matching.county_matcher.match_systems_by_county_with_eligibility(
solar_systems_df,
ffsa_2023_appendix,
ffsa_2024_appendix,
eligibility_df=None,
debug=True,
)
Match solar systems to energy communities based on county and state, using a direct join approach to ensure both fields are matched simultaneously.
Args:
  solar_systems_df: DataFrame containing solar system data
  ffsa_2023_appendix: 2023 FFSA appendix DataFrame
  ffsa_2024_appendix: 2024 FFSA appendix DataFrame
  eligibility_df: Optional DataFrame with eligibility flags
  debug: Whether to enable debug logging
Returns: DataFrame with county-based matching results
standardize_county_name
matching.county_matcher.standardize_county_name(county, state=None)
Standardize county name by removing suffixes and standardizing format.
Args:
  county: County name to standardize
  state: State name or abbreviation (optional, used for special cases)
Returns: Standardized county name or None if input is None
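The documentation does not list which suffixes are removed or how the format is standardized, so the sketch below shows one plausible normalization under stated assumptions: lowercase the name, collapse whitespace, drop periods, and strip a trailing suffix from a hypothetical list. The name `normalize_county` and the suffix list are illustrative, and the `state` argument is accepted but unused because the state-specific special cases are not documented.

```python
import re

# Hypothetical suffix list; the real module may handle a different set.
_SUFFIXES = ("county", "parish", "borough", "census area", "municipality")


def normalize_county(county, state=None):
    """Return a normalized county name, or None if the input is None."""
    if county is None:
        return None
    # Lowercase, trim, and collapse runs of whitespace.
    name = re.sub(r"\s+", " ", str(county).strip().lower())
    # Drop periods so "St. Louis" and "St Louis" compare equal.
    name = name.replace(".", "")
    for suffix in _SUFFIXES:
        if name.endswith(" " + suffix):
            name = name[: -len(suffix) - 1]
            break
    return name


examples = [
    normalize_county("St. Louis County"),
    normalize_county("  Jefferson   Parish "),
    normalize_county("Washington"),
    normalize_county(None),
]
```

Applying the same normalization to both the solar system data and the FFSA data before building the lookup keeps the `(state, county)` keys comparable.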