THEOREM Datatape¶
Interest analysis and loan data reports for Theorem portfolio using. Includes interest calculations, loan tape data, payment tracking, and pipeline analysis.
Overview¶
Generation Frequency: Monthly Development Status: Stable
Key Features: - Interest calculation engine - Loan tape data processing - Payment tracking integration
Key Stakeholders: - Treasury Reporting Team - Theorem
Output Columns¶
The following columns are included in the final datatape output (filtered from SQL queries based on configuration):
Column Analysis Summary¶
- Total columns found in SQL queries: 24
- Columns specified in output_columns config: 7
- Columns found in SQL and included in output: 4
- Missing columns (in config but not found in SQL): 3
Note: Only columns that exist in SQL queries are included in the final output.
| Column Name | Required | Source Query | Source Table | Source Field | Derived | Transformations |
|---|---|---|---|---|---|---|
| accrued_interest | Yes | interest_calc | lpt | accrued_interest | Yes | |
| calculation_date | Yes | interest_calc | calculation_date | Yes | ||
| last_payment_date | Yes | interest_calc | lpt | last_payment_date | No | |
| principal_balance | Yes | interest_calc | lpt | principal_balance | Yes |
Note: Columns marked as 'Derived' are calculated or transformed rather than directly selected from the source table.
Data Sources¶
The configuration pulls data from the following sources:
| Source Name | Type | Query/File | Description |
|---|---|---|---|
| thorem_pipeline | database | thorem_pipeline.sql | SQL query for thorem_pipeline data |
| loan_tape | database | loan_tape.sql | SQL query for loan_tape data |
| theorem_payments | database | theorem_payments.sql | SQL query for theorem_payments data |
| interest_calc | database | interest_calc.sql | SQL query for interest_calc data |
Configuration¶
Portfolio Groups¶
theorem_portfolios: - Theorem Portfolio A - Theorem Portfolio B
Database Configuration¶
- dbcreds_environment: fusionods
Paths¶
- sql_dir: ./sql
- cache_dir: ./query_cache
- output_dir: ./completed_output
- log_dir: ./logs
SQL Queries¶
-- Optimized query for interest calculations
-- Parameters: {{cut_off_date}} should be replaced with the desired cutoff date
SELECT
lpt."Loan Account",
ROUND(lpt."Balance"::numeric, 2) as principal_balance,
ROUND((lpt."Interest Rate" / 100.0)::numeric, 4) as annual_rate,
lpt."Transaction Date" as last_payment_date,
'{{cut_off_date}}'::date as calculation_date,
('{{cut_off_date}}'::date - lpt."Transaction Date"::date) as days_since_payment,
ROUND((lpt."Interest Rate" / 100.0 / 360)::numeric, 6) as daily_rate,
ROUND((lpt."Balance" * (lpt."Interest Rate" / 100.0 / 360))::numeric, 2) as daily_interest,
ROUND((lpt."Balance" * (lpt."Interest Rate" / 100.0 / 360) *
('{{cut_off_date}}'::date - lpt."Transaction Date"::date))::numeric, 2) as accrued_interest,
cl."Status" as loan_status,
cl."Loan Type",
s."Sunnova System ID",
s."Placed InService Date"
FROM (
SELECT DISTINCT ON ("Loan Account")
"Loan Account",
"Created Date",
"Transaction Amount",
"Transaction Date",
"Transaction Time",
"Balance",
"Interest Rate",
"Interest"
FROM sfdc_staging.vwx_loan__loan_payment_transaction__c
WHERE "Transaction Date" <= '{{cut_off_date}}'
AND NOT "Reversed"
ORDER BY "Loan Account", "Transaction Date" DESC, "Transaction Time" DESC
) lpt
JOIN sfdc_staging.vwx_loan__loan_account__c cl
ON lpt."Loan Account" = cl."Record ID"
JOIN sfdc_staging.vwx_system__c s
ON cl."System" = s."Record ID"
WHERE s."Financing Company" = 'Theorem'
AND s."Asset Portfolio - Customer" = 'Theorem'
AND cl."Loan Type" != 'Scheduled Prepayment';
Data Processing¶
Data processing is handled by: datatapes/theorem/run_theorem.py
Processing Steps¶
The data processing includes: 1. SQL query execution 2. Data transformation and cleaning 3. Column mapping and validation 4. Output file generation
Python Modules¶
This datatape includes additional Python modules with business logic:
theorem_interest_data¶
Theorem Interest Analysis Script This script analyzes interest calculations for Theorem loans based on user-provided month.
analyze_interest_changes(current_df: pd.DataFrame, previous_df: pd.DataFrame) -> pd.DataFrame
¶
Analyze month-over-month interest changes.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
current_df
|
DataFrame
|
Current month's interest data |
required |
previous_df
|
DataFrame
|
Previous month's interest data |
required |
Returns:
| Type | Description |
|---|---|
DataFrame
|
DataFrame containing comparison analysis |
Source code in datatapes/theorem/theorem_interest_data.py
create_db_connection(db_config: dict) -> object
¶
Create database connection using SQLAlchemy.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
db_config
|
dict
|
Dictionary containing database configuration |
required |
Returns:
| Type | Description |
|---|---|
object
|
SQLAlchemy engine object |
Source code in datatapes/theorem/theorem_interest_data.py
display_summary_statistics(comparison_df: pd.DataFrame) -> None
¶
Display summary statistics using Rich tables.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
comparison_df
|
DataFrame
|
DataFrame containing comparison data |
required |
Source code in datatapes/theorem/theorem_interest_data.py
execute_and_save_query(conn: Connection, query: str, cutoff_date: str, query_name: str, output_path: Path) -> None
¶
Execute a query and save its results.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
conn
|
Connection
|
Database connection |
required |
query
|
str
|
SQL query string |
required |
cutoff_date
|
str
|
Cutoff date string |
required |
query_name
|
str
|
Name of the query for the output file |
required |
output_path
|
Path
|
Path where the file should be saved |
required |
Source code in datatapes/theorem/theorem_interest_data.py
get_cutoff_date(default_date: date = None, prompt_text: str = 'Enter cutoff date') -> str
¶
Get cutoff date from user input with validation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
default_date
|
date
|
Default date to display |
None
|
prompt_text
|
str
|
Text to display in the prompt |
'Enter cutoff date'
|
Returns:
| Type | Description |
|---|---|
str
|
Validated cutoff date string |
Source code in datatapes/theorem/theorem_interest_data.py
load_additional_queries(sql_folder: Path) -> Dict[str, str]
¶
Load additional SQL queries from the sql folder.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
sql_folder
|
Path
|
Path to the folder containing SQL files |
required |
Returns:
| Type | Description |
|---|---|
Dict[str, str]
|
Dictionary of query names and their contents |
Source code in datatapes/theorem/theorem_interest_data.py
load_config(config_path: str) -> dict
¶
Load database configuration from INI file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config_path
|
str
|
Path to the configuration file |
required |
Returns:
| Type | Description |
|---|---|
dict
|
Dictionary containing database configuration |
Source code in datatapes/theorem/theorem_interest_data.py
load_sql_query(sql_path: str) -> str
¶
Load SQL query from file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
sql_path
|
str
|
Path to the SQL query file |
required |
Returns:
| Type | Description |
|---|---|
str
|
SQL query string |
Source code in datatapes/theorem/theorem_interest_data.py
main(config_path: Optional[str] = None, sql_folder: Optional[str] = None, output_path: Optional[str] = None)
¶
Main execution function.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
config_path
|
Optional[str]
|
Path to the configuration file (optional) |
None
|
sql_folder
|
Optional[str]
|
Path to the folder containing SQL files (optional) |
None
|
output_path
|
Optional[str]
|
Path where output files should be saved (optional) |
None
|
Source code in datatapes/theorem/theorem_interest_data.py
336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 | |
save_interest_data(df: pd.DataFrame, month: str, output_path: Path) -> None
¶
Save interest data to CSV file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
df
|
DataFrame
|
DataFrame containing interest data |
required |
month
|
str
|
Month identifier for filename |
required |
output_path
|
Path
|
Path where the file should be saved |
required |
Source code in datatapes/theorem/theorem_interest_data.py
validate_date(date_str: str) -> Optional[date]
¶
Validate date string format and return date object.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
date_str
|
str
|
Date string in YYYY-MM-DD format |
required |
Returns:
| Type | Description |
|---|---|
Optional[date]
|
datetime.date object if valid, None otherwise |