nova_ec

system_data

data.system_data

Module for loading and processing solar system data.

Classes

| Name | Description |
| --- | --- |
| DateValidationResult | Model for date validation results. |

DateValidationResult

data.system_data.DateValidationResult()

Model for date validation results.

Functions

| Name | Description |
| --- | --- |
| add_state_full_name_column | Add full state names to the DataFrame. |
| calculate_distances | Calculate distances between approximate and ArcGIS coordinates. |
| clean_data | Load and clean the data. |
| convert_and_validate_dates | Convert date columns to datetime and validate the conversion. |
| count_rows_with_missing_geographical_info | Count rows with missing geographical information. |
| filter_columns | Filter DataFrame to keep only specified columns. |
| get_unique_states | Get unique states in the dataset. |
| load_data | Load data from a CSV file. |
| print_validation_results | Print date validation results in a table format. |
| process_solar_systems | Process solar system data from a file. |
| systems_with_no_county | Identify systems with no county information. |
| systems_with_no_lat_lon | Identify systems with no latitude and longitude. |

add_state_full_name_column

data.system_data.add_state_full_name_column(df, state_dict)

Add full state names to the DataFrame.

Args:
    df: DataFrame to process
    state_dict: Dictionary mapping state abbreviations to full names

Returns: DataFrame with added StateFullName column
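The behavior described above can be illustrated with a minimal pandas sketch. This is not the package's implementation: the function name and the `State` input column are assumptions made for the example. Only the `StateFullName` output column comes from the reference.

```python
import pandas as pd


def add_state_full_name_sketch(df: pd.DataFrame, state_dict: dict) -> pd.DataFrame:
    """Return a copy of df with a StateFullName column (illustrative only)."""
    out = df.copy()
    # "State" is an assumed column name holding state abbreviations.
    out["StateFullName"] = out["State"].map(state_dict)
    return out


systems = pd.DataFrame({"State": ["CA", "TX", "ZZ"]})
result = add_state_full_name_sketch(systems, {"CA": "California", "TX": "Texas"})
```

In this sketch, abbreviations that are absent from the dictionary end up as missing values rather than raising an error.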

calculate_distances

data.system_data.calculate_distances(df, max_workers=8)

Calculate distances between approximate and ArcGIS coordinates.

Args:
    df: DataFrame containing coordinate columns
    max_workers: Maximum number of parallel workers

Returns: Tuple of (DataFrame with distance column, summary dict, list of skipped indices)
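One way to produce a result of this shape is sketched below. The reference does not say which distance formula or column names the package uses, so the haversine formula, the four coordinate column names, the `distance_km` column, and the summary keys are all assumptions for illustration.

```python
import math
from concurrent.futures import ThreadPoolExecutor

import pandas as pd

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def calculate_distances_sketch(df, max_workers=8):
    # Assumed column names; the real module may use different ones.
    coord_cols = ["approx_lat", "approx_lon", "arcgis_lat", "arcgis_lon"]
    complete = df[coord_cols].notna().all(axis=1)
    # Rows with any missing coordinate are skipped and reported by index.
    skipped = df.index[~complete].tolist()
    rows = df.loc[complete, coord_cols].itertuples(index=False, name=None)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        distances = list(pool.map(lambda r: haversine_km(*r), rows))
    out = df.copy()
    out["distance_km"] = float("nan")
    out.loc[complete, "distance_km"] = distances
    summary = {"computed": len(distances), "skipped": len(skipped)}
    return out, summary, skipped
```

The thread pool here only mirrors the `max_workers` parameter. For a pure arithmetic formula like this one, a vectorized NumPy computation would likely be faster.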

clean_data

data.system_data.clean_data(filepath)

Load and clean the data.

Args: filepath: Path to the data file

Returns: Cleaned DataFrame

convert_and_validate_dates

data.system_data.convert_and_validate_dates(df, date_columns)

Convert date columns to datetime and validate the conversion.

Args:
    df: DataFrame to process
    date_columns: List of date columns to convert

Returns: Tuple of (processed DataFrame, validation results)
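A common pandas pattern for converting and validating dates is to coerce unparseable values to `NaT` and count how many were lost. The sketch below assumes that approach. The function name and the per-column result keys (`total`, `missing`, `failed`) are hypothetical and are not taken from `DateValidationResult`.

```python
import pandas as pd


def convert_and_validate_dates_sketch(df, date_columns):
    out = df.copy()
    results = {}
    for col in date_columns:
        original = out[col]
        # Unparseable values become NaT instead of raising.
        converted = pd.to_datetime(original, errors="coerce")
        # Present before conversion but NaT afterwards: failed to parse.
        failed = int((original.notna() & converted.isna()).sum())
        results[col] = {
            "total": int(len(original)),
            "missing": int(original.isna().sum()),
            "failed": failed,
        }
        out[col] = converted
    return out, results
```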

count_rows_with_missing_geographical_info

data.system_data.count_rows_with_missing_geographical_info(df)

Count rows with missing geographical information.

Args: df: DataFrame to check

Returns: Number of rows with missing geographical information
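The reference does not list which columns count as geographical information. As a rough sketch, assuming hypothetical `County`, `Latitude`, and `Longitude` columns, the count could be computed like this:

```python
import pandas as pd

# Assumed column names, chosen for illustration only.
GEO_COLUMNS = ["County", "Latitude", "Longitude"]


def count_missing_geo_sketch(df, geo_columns=GEO_COLUMNS):
    """Count rows where at least one geographical column is missing."""
    return int(df[geo_columns].isna().any(axis=1).sum())
```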

filter_columns

data.system_data.filter_columns(df, columns_to_keep)

Filter DataFrame to keep only specified columns.

Args:
    df: DataFrame to filter
    columns_to_keep: List of columns to keep

Returns: Filtered DataFrame
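Column filtering of this kind is a one-line selection in pandas. The sketch below adds an explicit check for unknown column names. That check is a design choice for the example, and the reference does not say how the package handles missing columns.

```python
import pandas as pd


def filter_columns_sketch(df, columns_to_keep):
    missing = [c for c in columns_to_keep if c not in df.columns]
    if missing:
        # Fail early with a readable message instead of a bare KeyError.
        raise KeyError(f"Columns not found: {missing}")
    return df.loc[:, columns_to_keep].copy()
```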

get_unique_states

data.system_data.get_unique_states(df)

Get unique states in the dataset.

Args: df: DataFrame to check

Returns: Set of unique states
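A minimal sketch of collecting unique states follows, assuming the states live in a column named `State` (the reference does not name the column) and that missing values should be excluded from the set:

```python
import pandas as pd


def get_unique_states_sketch(df):
    # "State" is an assumed column name; NaN entries are dropped first.
    return set(df["State"].dropna().unique())
```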

load_data

data.system_data.load_data(filepath)

Load data from a CSV file.

Args: filepath: Path to the CSV file

Returns: DataFrame containing the loaded data

Raises: FileNotFoundError: If the file doesn't exist
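The documented contract (return a DataFrame, raise `FileNotFoundError` for a missing file) could be met with a sketch like the one below. The function name and error message are illustrative and not taken from the package.

```python
from pathlib import Path

import pandas as pd


def load_data_sketch(filepath):
    path = Path(filepath)
    if not path.is_file():
        # Check up front so the error message names the path clearly.
        raise FileNotFoundError(f"Data file not found: {path}")
    return pd.read_csv(path)
```

Callers can wrap the call in `try`/`except FileNotFoundError` to report a missing input file without a traceback.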

print_validation_results

data.system_data.print_validation_results(validation_results)

Print date validation results in a table format.

Args: validation_results: Dictionary of validation results by column
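The exact layout of the printed table is not documented. As an illustration only, a fixed-width table could be built from a dictionary keyed by column name. The `total` and `failed` keys used here are assumptions about what each entry contains.

```python
def format_validation_table(validation_results):
    """Build a fixed-width text table, one row per date column."""
    header = f"{'column':<20}{'total':>8}{'failed':>8}"
    lines = [header, "-" * len(header)]
    for column, stats in validation_results.items():
        lines.append(f"{column:<20}{stats['total']:>8}{stats['failed']:>8}")
    return "\n".join(lines)


print(format_validation_table({"install_date": {"total": 3, "failed": 1}}))
```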

process_solar_systems

data.system_data.process_solar_systems(systems_path, date_cols, systems_cols)

Process solar system data from a file.

Args:
    systems_path: Path to the systems data file
    date_cols: List of date columns to process
    systems_cols: List of columns to include

Returns: Processed DataFrame
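Judging from its parameters, this function appears to combine loading, date conversion, and column selection. The sketch below chains those three steps with plain pandas calls. The order of the steps and the function name are assumptions, and the real function may do additional cleaning.

```python
import pandas as pd


def process_solar_systems_sketch(systems_path, date_cols, systems_cols):
    # Step 1: load the raw CSV (a path or file-like object both work).
    df = pd.read_csv(systems_path)
    # Step 2: convert date columns, coercing bad values to NaT.
    for col in date_cols:
        df[col] = pd.to_datetime(df[col], errors="coerce")
    # Step 3: keep only the requested columns, in the requested order.
    return df[systems_cols]
```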

systems_with_no_county

data.system_data.systems_with_no_county(df)

Identify systems with no county information.

Args: df: DataFrame to check

Returns: Boolean Series indicating rows with no county
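A Boolean mask like this is typically used to select or count the affected rows. The sketch below assumes a `County` column and also treats blank strings as missing. Both choices are illustrative and not confirmed by the reference.

```python
import pandas as pd


def systems_with_no_county_sketch(df):
    county = df["County"]  # assumed column name
    # Treat NaN and blank or whitespace-only strings as "no county".
    return county.isna() | county.astype(str).str.strip().eq("")


systems = pd.DataFrame({"County": ["Kern", None, "  "]})
mask = systems_with_no_county_sketch(systems)
no_county_rows = systems[mask]
```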

systems_with_no_lat_lon

data.system_data.systems_with_no_lat_lon(df)

Identify systems with no latitude and longitude.

Args: df: DataFrame to check

Returns: Boolean Series indicating rows with no lat/lon
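The reference does not state whether a row is flagged when either coordinate is missing or only when both are. The sketch below assumes hypothetical `Latitude` and `Longitude` columns and flags a row when either value is missing. Replacing `any` with `all` would require both to be missing.

```python
import pandas as pd


def systems_with_no_lat_lon_sketch(df):
    # Assumed column names; True where either coordinate is missing.
    return df[["Latitude", "Longitude"]].isna().any(axis=1)
```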

  • Built with [Quarto](https://quarto.org/) and [quartodoc](https://machow.github.io/quartodoc/)