nova_ec

export_utils

matching.export_utils

Utilities for cleaning and exporting energy community matching results.

Functions

| Name | Description |
|---|---|
| check_for_issues | Check for any remaining issues in the cleaned DataFrame. |
| clean_column | Clean a single column of text data. |
| clean_dataframe_parallel | Clean DataFrame columns in parallel with progress visualization. |
| diagnose_problematic_columns | Perform detailed diagnostics on potentially problematic columns. |
| export_results | Clean and export results to CSV file with proper formatting. |
| get_text_columns | Identify potential text columns that need cleaning. |

check_for_issues

matching.export_utils.check_for_issues(df)

Check for any remaining issues in the cleaned DataFrame.

Args: df: DataFrame to check

Returns: Dictionary of issues found by column and type
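
This page does not say which issues are checked or how the returned dictionary is laid out. As a rough illustration only, the sketch below uses a hypothetical stand-in, `sketch_check_for_issues`, that counts embedded line breaks and non-printable characters per column. The issue names and the nested `{column: {issue: count}}` shape are assumptions, not the library's behaviour.

```python
import pandas as pd


def sketch_check_for_issues(df: pd.DataFrame) -> dict:
    """Hypothetical stand-in: count suspicious values per column and issue type."""
    checks = {
        # Assumed issue types; the real function may look for different things.
        "embedded_newline": lambda v: "\n" in v or "\r" in v,
        "non_printable": lambda v: any(
            not ch.isprintable() and ch not in "\r\n" for ch in v
        ),
    }
    issues: dict = {}
    for column in df.columns:
        # Only inspect actual string values; skip numbers and missing data.
        strings = [v for v in df[column] if isinstance(v, str)]
        for name, predicate in checks.items():
            count = sum(1 for v in strings if predicate(v))
            if count:
                issues.setdefault(column, {})[name] = count
    return issues


df = pd.DataFrame({
    "name": ["Solar One", "Wind\nFarm", "Hydro\x00Plant"],
    "capacity_mw": [10.5, 20.0, 5.25],
})
print(sketch_check_for_issues(df))
```

Columns without findings are left out of the result in this sketch, so an empty dictionary means nothing was flagged.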

clean_column

matching.export_utils.clean_column(args)

Clean a single column of text data.

Args: args: Tuple containing (column_name, series)

Returns: Tuple of (column_name, cleaned_series)
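
Taking a single tuple rather than two arguments suits `map`-style APIs, which hand each worker exactly one item. The sketch below shows that calling convention with a hypothetical stand-in, `sketch_clean_column`. Its cleaning rule (collapse whitespace runs, then trim) is an assumption, since the actual rules are not documented here.

```python
import re

import pandas as pd

_WHITESPACE = re.compile(r"\s+")


def sketch_clean_column(args):
    """Hypothetical stand-in taking one (column_name, series) tuple."""
    column_name, series = args

    def clean_value(value):
        # Assumed rule: collapse whitespace runs (including line breaks) to one space.
        if isinstance(value, str):
            return _WHITESPACE.sub(" ", value).strip()
        return value  # leave numbers and missing values untouched

    return column_name, series.map(clean_value)


df = pd.DataFrame({"site": ["  Plant A ", "Plant\r\nB", None]})
name, cleaned = sketch_clean_column(("site", df["site"]))
print(name, cleaned.tolist())
```

Returning the column name alongside the cleaned series lets the caller reassemble a DataFrame without tracking which result belongs to which column.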

clean_dataframe_parallel

matching.export_utils.clean_dataframe_parallel(df, max_workers=None)

Clean DataFrame columns in parallel with progress visualization.

Args:
    df: DataFrame to clean
    max_workers: Maximum number of worker threads (defaults to CPU count)

Returns: Cleaned DataFrame
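
A minimal sketch of column-level parallel cleaning with `concurrent.futures.ThreadPoolExecutor`, assuming each column can be cleaned independently by a worker that takes a `(column_name, series)` tuple. The names `sketch_clean_dataframe_parallel` and `_strip_column` are hypothetical, the worker's trimming rule is an assumption, and the progress display is omitted.

```python
import os
from concurrent.futures import ThreadPoolExecutor

import pandas as pd


def _strip_column(args):
    """Hypothetical worker: trim surrounding whitespace from string values."""
    column_name, series = args
    return column_name, series.map(lambda v: v.strip() if isinstance(v, str) else v)


def sketch_clean_dataframe_parallel(df: pd.DataFrame, max_workers=None) -> pd.DataFrame:
    """Hypothetical stand-in: clean every column in a thread pool."""
    if max_workers is None:
        max_workers = os.cpu_count() or 1  # documented default: CPU count
    tasks = [(name, df[name]) for name in df.columns]
    cleaned = df.copy()  # keep the caller's DataFrame unchanged
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order the tasks were submitted.
        for name, series in pool.map(_strip_column, tasks):
            cleaned[name] = series
    return cleaned


df = pd.DataFrame({"county": [" Ada ", "Boise  "], "fips": [16001, 16015]})
result = sketch_clean_dataframe_parallel(df, max_workers=2)
print(result["county"].tolist())
```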

diagnose_problematic_columns

matching.export_utils.diagnose_problematic_columns(df, columns=None)

Perform detailed diagnostics on potentially problematic columns.

Args:
    df: DataFrame to analyze
    columns: List of columns to check (if None, will check all text columns)

Returns: Dictionary with detailed diagnostics per column
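
The diagnostics themselves are not listed on this page. The sketch below is one guess at what a per-column report could contain, using a hypothetical stand-in named `sketch_diagnose`. The metric names and the fallback used when `columns` is None are assumptions.

```python
import pandas as pd


def sketch_diagnose(df: pd.DataFrame, columns=None) -> dict:
    """Hypothetical stand-in: collect simple per-column text statistics."""
    if columns is None:
        # Assumed fallback: any column holding at least one string value.
        columns = [c for c in df.columns if any(isinstance(v, str) for v in df[c])]
    report = {}
    for column in columns:
        strings = [v for v in df[column] if isinstance(v, str)]
        report[column] = {
            # The metric names below are illustrative, not the library's.
            "string_values": len(strings),
            "max_length": max((len(v) for v in strings), default=0),
            "with_line_breaks": sum(1 for v in strings if "\n" in v or "\r" in v),
            "with_double_quotes": sum(1 for v in strings if '"' in v),
        }
    return report


df = pd.DataFrame({
    "notes": ['said "ok"', "line one\nline two", None],
    "year": [2021, 2022, 2023],
})
print(sketch_diagnose(df))
```

Line breaks and double quotes are counted here because they are the characters most likely to need quoting or escaping in CSV output.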

export_results

matching.export_utils.export_results(
    df,
    output_path,
    parallel_workers=None,
    verify=True,
)

Clean and export results to CSV file with proper formatting.

Args:
    df: DataFrame to export
    output_path: Path to save the CSV file
    parallel_workers: Number of threads for parallel processing
    verify: Whether to verify the exported file

Returns: True if export was successful, False otherwise
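
Because the function reports failure through its return value rather than an exception, callers are expected to check the boolean. The sketch below illustrates that write-then-verify pattern with a hypothetical stand-in, `sketch_export_results`. The formatting choices (UTF-8, no index column) and the verification step (re-read the file and compare shapes) are assumptions, and the cleaning step and `parallel_workers` are omitted.

```python
import tempfile
from pathlib import Path

import pandas as pd


def sketch_export_results(df: pd.DataFrame, output_path, verify: bool = True) -> bool:
    """Hypothetical stand-in: write a CSV and optionally read it back."""
    try:
        # Assumed formatting: UTF-8 encoding and no index column.
        df.to_csv(output_path, index=False, encoding="utf-8")
        if verify:
            # Assumed verification: the file parses and has the same shape.
            reloaded = pd.read_csv(output_path, encoding="utf-8")
            return reloaded.shape == df.shape
        return True
    except (OSError, ValueError):
        # Unwritable paths and unparsable files are reported as failure.
        return False


df = pd.DataFrame({"site": ["Plant A", 'Plant "B", East'], "eligible": [True, False]})
with tempfile.TemporaryDirectory() as tmp:
    ok = sketch_export_results(df, Path(tmp) / "results.csv")
    missing_dir_ok = sketch_export_results(df, Path(tmp) / "no_such_dir" / "results.csv")
print(ok, missing_dir_ok)
```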

get_text_columns

matching.export_utils.get_text_columns(df)

Identify potential text columns that need cleaning.

Args: df: DataFrame to analyze

Returns: List of column names that are likely to contain text
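
How a column is judged "likely to contain text" is not specified here. One plausible heuristic, sketched below with a hypothetical stand-in named `sketch_get_text_columns`, is to treat a column as text if any of its values is a Python string. This is an assumption and the library may rely on dtypes or other rules instead.

```python
import pandas as pd


def sketch_get_text_columns(df: pd.DataFrame) -> list:
    """Hypothetical stand-in: columns with at least one string value."""
    text_columns = []
    for column in df.columns:
        # Checking values rather than dtype also catches mixed-type columns.
        if any(isinstance(value, str) for value in df[column]):
            text_columns.append(column)
    return text_columns


df = pd.DataFrame({
    "county": ["Ada", "Boise"],
    "fips": [16001, 16015],
    "notes": [None, "check address"],
})
print(sketch_get_text_columns(df))  # ['county', 'notes']
```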
