

# export_utils

`matching.export_utils`

Utilities for cleaning and exporting energy community matching results.

## Functions

| Name | Description |
|----|----|
| [check_for_issues](#nova_ec.matching.export_utils.check_for_issues) | Check for any remaining issues in the cleaned DataFrame. |
| [clean_column](#nova_ec.matching.export_utils.clean_column) | Clean a single column of text data. |
| [clean_dataframe_parallel](#nova_ec.matching.export_utils.clean_dataframe_parallel) | Clean DataFrame columns in parallel with progress visualization. |
| [diagnose_problematic_columns](#nova_ec.matching.export_utils.diagnose_problematic_columns) | Perform detailed diagnostics on potentially problematic columns. |
| [export_results](#nova_ec.matching.export_utils.export_results) | Clean and export results to CSV file with proper formatting. |
| [get_text_columns](#nova_ec.matching.export_utils.get_text_columns) | Identify potential text columns that need cleaning. |

### check_for_issues

``` python
matching.export_utils.check_for_issues(df)
```

Check for any remaining issues in the cleaned DataFrame.

Args:

- `df`: DataFrame to check

Returns: Dictionary of issues found by column and type
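The shape of the returned dictionary is not spelled out above. As a rough illustration only, the sketch below builds a nested `{column: {issue_type: count}}` mapping with pandas. The function name `check_for_issues_sketch`, the two checks in `CHECKS`, and the nested-count layout are all assumptions made for this example, not the module's actual behavior.

```python
import pandas as pd

# Hypothetical checks; the real function may look for different problems.
CHECKS = {
    "line_break": lambda s: "\n" in s or "\r" in s,
    "outer_whitespace": lambda s: s != s.strip(),
}


def check_for_issues_sketch(df):
    """Illustrative stand-in: count issues per text column and issue type."""
    issues = {}
    for col in df.columns:
        # Only string-like columns are inspected in this sketch.
        if not pd.api.types.is_string_dtype(df[col].dtype):
            continue
        values = [v for v in df[col] if isinstance(v, str)]
        found = {}
        for kind, check in CHECKS.items():
            count = sum(check(v) for v in values)
            if count:
                found[kind] = count
        if found:
            issues[col] = found
    return issues


df = pd.DataFrame(
    {"name": ["Solar Coop", "Wind\nEC"], "note": [" ok", "ok"], "members": [12, 40]}
)
issues = check_for_issues_sketch(df)
print(issues)
```

An empty dictionary in this sketch would mean that none of the assumed checks fired.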

### clean_column

``` python
matching.export_utils.clean_column(args)
```

Clean a single column of text data.

Args:

- `args`: Tuple containing `(column_name, series)`

Returns: Tuple of (column_name, cleaned_series)
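Packing the inputs into a single tuple is a common convention for functions that are handed to `map`-style executors, which pass one item per task; whether that is the reason here is an assumption. The sketch below mirrors the tuple-in, tuple-out contract with a hypothetical `clean_column_sketch` that collapses whitespace. The cleaning rule itself is invented for illustration.

```python
import re

import pandas as pd


def clean_column_sketch(args):
    """Illustrative stand-in: collapse whitespace runs in one column."""
    column_name, series = args  # single tuple argument, as in the documented signature
    cleaned = series.map(
        # Non-string values (numbers, missing values) pass through untouched.
        lambda v: re.sub(r"\s+", " ", v).strip() if isinstance(v, str) else v
    )
    return column_name, cleaned


name, cleaned = clean_column_sketch(("name", pd.Series(["Solar\nCoop ", "Wind  EC"])))
print(name, cleaned.tolist())
```

Returning the column name alongside the cleaned series lets a caller reassemble results without relying on task order.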

### clean_dataframe_parallel

``` python
matching.export_utils.clean_dataframe_parallel(df, max_workers=None)
```

Clean DataFrame columns in parallel with progress visualization.

Args:

- `df`: DataFrame to clean
- `max_workers`: Maximum number of worker threads (defaults to CPU count)

Returns: Cleaned DataFrame
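One way such a per-column fan-out could be written is with `concurrent.futures.ThreadPoolExecutor`. The sketch below is an assumption about the general approach, not the module's code: `clean_dataframe_parallel_sketch` and `_clean` are hypothetical names, the whitespace rule is made up, and the progress visualization mentioned above is left out.

```python
import os
import re
from concurrent.futures import ThreadPoolExecutor

import pandas as pd


def _clean(args):
    """Hypothetical per-column worker taking a (column_name, series) tuple."""
    name, series = args
    return name, series.map(
        lambda v: re.sub(r"\s+", " ", v).strip() if isinstance(v, str) else v
    )


def clean_dataframe_parallel_sketch(df, max_workers=None):
    """Illustrative stand-in: clean string-like columns on a thread pool."""
    workers = max_workers or os.cpu_count() or 1  # fall back to the CPU count
    text_cols = [c for c in df.columns if pd.api.types.is_string_dtype(df[c].dtype)]
    cleaned = df.copy()  # leave the caller's DataFrame untouched
    with ThreadPoolExecutor(max_workers=workers) as pool:
        tasks = [(c, df[c]) for c in text_cols]
        for name, series in pool.map(_clean, tasks):
            cleaned[name] = series
    return cleaned


raw = pd.DataFrame({"name": ["Solar\nCoop ", "Wind  EC"], "members": [12, 40]})
result = clean_dataframe_parallel_sketch(raw, max_workers=2)
print(result["name"].tolist())
```

Because pure-Python string work holds the GIL, threads in a sketch like this mainly help with organization rather than raw speed; that trade-off may differ in the real implementation.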

### diagnose_problematic_columns

``` python
matching.export_utils.diagnose_problematic_columns(df, columns=None)
```

Perform detailed diagnostics on potentially problematic columns.

Args:

- `df`: DataFrame to analyze
- `columns`: List of columns to check (if `None`, all text columns are checked)

Returns: Dictionary with detailed diagnostics per column

### export_results

``` python
matching.export_utils.export_results(
    df,
    output_path,
    parallel_workers=None,
    verify=True,
)
```

Clean and export results to CSV file with proper formatting.

Args:

- `df`: DataFrame to export
- `output_path`: Path to save the CSV file
- `parallel_workers`: Number of threads for parallel processing
- `verify`: Whether to verify the exported file

Returns: True if export was successful, False otherwise
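What "verify" checks is not specified above. A minimal sketch of one plausible interpretation is a write followed by a read-back that compares shape and column names. Everything here is an assumption for illustration: the name `export_results_sketch`, the quoting and encoding choices, and the verification criteria. The cleaning step and `parallel_workers` are omitted.

```python
import csv
import os
import tempfile

import pandas as pd


def export_results_sketch(df, output_path, verify=True):
    """Illustrative stand-in: write a CSV and optionally read it back."""
    try:
        # Quote every non-numeric field so commas inside text stay in one cell.
        df.to_csv(
            output_path, index=False, encoding="utf-8", quoting=csv.QUOTE_NONNUMERIC
        )
        if verify:
            check = pd.read_csv(output_path, encoding="utf-8")
            return check.shape == df.shape and list(check.columns) == list(df.columns)
        return True
    except (OSError, ValueError):
        # Report failure through the return value, as the documented contract does.
        return False


results = pd.DataFrame({"name": ["Solar Coop, North", "Wind EC"], "members": [12, 40]})
with tempfile.TemporaryDirectory() as tmp:
    ok = export_results_sketch(results, os.path.join(tmp, "matches.csv"))
    bad = export_results_sketch(results, os.path.join(tmp, "missing_dir", "matches.csv"))
print(ok, bad)
```

Checking the boolean return value, rather than expecting an exception, matches the `True`/`False` contract described above.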

### get_text_columns

``` python
matching.export_utils.get_text_columns(df)
```

Identify potential text columns that need cleaning.

Args:

- `df`: DataFrame to analyze

Returns: List of column names that are likely to contain text
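How "likely to contain text" is decided is not documented here. A simple heuristic, shown below purely as an assumption, is to select columns whose dtype pandas reports as string-like (including the generic `object` dtype). The helper name `get_text_columns_sketch` is hypothetical, and unique column names are assumed.

```python
import pandas as pd


def get_text_columns_sketch(df):
    """Illustrative stand-in: pick columns with a string-like dtype."""
    return [
        col for col in df.columns if pd.api.types.is_string_dtype(df[col].dtype)
    ]


frame = pd.DataFrame(
    {"name": ["Solar Coop", "Wind EC"], "members": [12, 40], "city": ["Lisbon", "Porto"]}
)
text_columns = get_text_columns_sketch(frame)
print(text_columns)
```

A dtype-based check like this can also pick up `object` columns that hold non-text values, so the real function may apply stricter rules.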
