

# FinanceDataEngine

``` python
core.FinanceDataEngine(
    settings=None,
    console=None,
    logger=None,
    credential_path=None,
    use_keyring=False,
    keyring_service='nova_fde',
    keyring_username=None,
    interactive_auth=False,
    project_root=None,
)
```

Main orchestration engine for finance data processing.

## Methods

| Name | Description |
|----|----|
| [analyze_performance](#nova_fde.core.FinanceDataEngine.analyze_performance) | Generate performance metrics for processing runs. |
| [check_cache_status](#nova_fde.core.FinanceDataEngine.check_cache_status) | Check if queries exist in the cache and their expiration status. |
| [cleanup](#nova_fde.core.FinanceDataEngine.cleanup) | Clean up resources used by the engine. |
| [get_cached_queries](#nova_fde.core.FinanceDataEngine.get_cached_queries) | Retrieve query results from cache or execute queries if needed. |
| [get_query_sql](#nova_fde.core.FinanceDataEngine.get_query_sql) | Retrieve and optionally export the raw SQL behind query files. |
| [process_data](#nova_fde.core.FinanceDataEngine.process_data) | Process data using a generic approach, with enhanced return type handling. |
| [save_query_results](#nova_fde.core.FinanceDataEngine.save_query_results) | Save query results to CSV files without additional processing. |

### analyze_performance

``` python
core.FinanceDataEngine.analyze_performance()
```

Generate performance metrics for processing runs.

Collects query statistics from the query timer and compiles them into a performance metrics dictionary. Also generates a performance report through the analyzer component.

#### Returns

| Name | Type | Description |
|----|----|----|
|  | Dict | Dictionary of performance metrics with the following keys: `run_count`, the number of processing runs (currently always 1); `success_rate`, the percentage of successful queries (currently 100%); `average_duration`, the average duration of all queries in seconds; `performance_trend`, a dict with keys `duration_trend`, `success_rate`, and `query_count`, each set to ‘improving’, ‘stable’, or ‘declining’. |

#### Raises

| Name | Type | Description |
|----|----|----|
|  | Exception | Any exception that occurs during performance analysis will be logged and re-raised. |

#### Notes

Currently, the implementation is basic and does not track historical performance trends. Future enhancements could include comparing against previous runs to determine actual trends.
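As a minimal sketch of consuming the returned dictionary, the helper below formats a metrics dict of the documented shape into a one-line summary. `summarize_metrics` and the sample values are hypothetical and not part of the engine; only the key names come from the Returns table, and `success_rate` is assumed to be a number on a 0 to 100 scale.

``` python
def summarize_metrics(metrics):
    """Format a metrics dict (documented shape) as a short summary line."""
    trend = metrics.get("performance_trend", {})
    return (
        f"runs={metrics['run_count']} "
        f"success={metrics['success_rate']:.0f}% "
        f"avg={metrics['average_duration']:.2f}s "
        f"duration_trend={trend.get('duration_trend', 'unknown')}"
    )


# Sample values are made up; a real dict would come from
# engine.analyze_performance().
sample_metrics = {
    "run_count": 1,
    "success_rate": 100,
    "average_duration": 2.5,
    "performance_trend": {
        "duration_trend": "stable",
        "success_rate": "stable",
        "query_count": "stable",
    },
}
print(summarize_metrics(sample_metrics))
```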

### check_cache_status

``` python
core.FinanceDataEngine.check_cache_status(query_names, check_expiry=True)
```

Check if queries exist in the cache and their expiration status.

This method provides information about cache status without actually loading the cached data, which can be useful for making decisions about data processing workflows.

#### Parameters

| Name | Type | Description | Default |
|----|----|----|----|
| query_names | Union\[str, List\[str\]\] | Single query name or list of query names to check. | *required* |
| check_expiry | bool | Whether to check expiration status, by default True. | `True` |

#### Returns

| Name | Type | Description |
|----|----|----|
|  | Dict\[str, Dict\[str, Union\[bool, datetime, None\]\]\] | Dictionary mapping each query name to a status dictionary with the following keys: `exists` (bool), whether the cache exists; `created_date` (datetime or None), when the cache was created; `expired` (bool, or None if expiry is not checked), whether the cache is expired; `expiry_date` (datetime or None), when the cache will expire; `file_size_mb` (float), size of the cache file in MB. |

#### Notes

This method only checks cache existence and metadata; it doesn’t load or validate the cached data content.

#### Examples

``` python
>>> engine = FinanceDataEngine(use_keyring=True)
>>> cache_status = engine.check_cache_status(["payments", "systems"])
>>> for query, status in cache_status.items():
...     print(f"{query}: {'Available' if status['exists'] else 'Not in cache'}")
...     if status['exists']:
...         print(f"  Created: {status['created_date']}")
...         print(f"  Expired: {status['expired']}")
...         print(f"  Size: {status['file_size_mb']:.2f} MB")
```
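One way to act on the status dictionary is to pick out the queries that would need a refresh. The sketch below is illustrative only: `queries_needing_refresh` is a hypothetical helper, the sample dict is made up, and treating `expired=None` as "not expired" is an assumption.

``` python
def queries_needing_refresh(cache_status):
    """Return names of queries that are missing from the cache or expired."""
    stale = []
    for name, status in cache_status.items():
        # 'expired' may be None when check_expiry=False; None is treated
        # as "not expired" here (an assumption of this sketch).
        if not status["exists"] or status.get("expired") is True:
            stale.append(name)
    return stale


# Made-up status dict following the documented shape.
sample_status = {
    "payments": {"exists": True, "expired": False},
    "systems": {"exists": True, "expired": True},
    "vendors": {"exists": False, "expired": None},
}
print(queries_needing_refresh(sample_status))
```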

### cleanup

``` python
core.FinanceDataEngine.cleanup()
```

Clean up resources used by the engine.

This method performs the necessary cleanup operations:

1. Clears expired cache entries via the cache manager.
2. Closes database connections via the database component.

It should be called when the engine is no longer needed to ensure proper resource management and prevent resource leaks.

#### Returns

| Name | Type | Description |
|------|------|-------------|
|      | None |             |

#### Raises

| Name | Type | Description |
|----|----|----|
|  | Exception | Any exception that occurs during cleanup will be logged and re-raised. Common exceptions might include file system errors during cache clearance or database errors when closing connections. |

#### Notes

It’s recommended to use this method within a try-finally block or a context manager to ensure resources are always cleaned up, even if an exception occurs during processing.

#### Examples

``` python
>>> engine = FinanceDataEngine(use_keyring=True)
>>> try:
...     result = engine.process_data(...)
... finally:
...     engine.cleanup()
```

### get_cached_queries

``` python
core.FinanceDataEngine.get_cached_queries(
    queries,
    force_refresh=False,
    query_params=None,
    cache_expiry_days=None,
)
```

Retrieve query results from cache or execute queries if needed.

This method provides direct access to query results without additional processing. It checks the cache for each query and either returns the cached result or executes the query and caches the new result.

#### Parameters

| Name | Type | Description | Default |
|----|----|----|----|
| queries | Dict\[str, str\] | Dictionary mapping query names to SQL file names. | *required* |
| force_refresh | bool | Whether to force refresh of cached data, by default False. | `False` |
| query_params | Optional\[Dict\] | Parameters to pass to SQL queries, by default None. | `None` |
| cache_expiry_days | Optional\[int\] | Number of days before cache expires, by default None. | `None` |

#### Returns

| Name | Type | Description |
|----|----|----|
|  | Dict\[str, pd.DataFrame\] | Dictionary mapping query names to their respective DataFrames. |

#### Raises

| Name | Type            | Description                                           |
|------|-----------------|-------------------------------------------------------|
|      | ConnectionError | If a valid database connection cannot be established. |
|      | RuntimeError    | If any query execution fails.                         |

#### Notes

This method is useful when you need to access cached query results without going through the full data processing pipeline.

#### Examples

``` python
>>> engine = FinanceDataEngine(use_keyring=True)
>>> queries = {
...     "systems": "systems.sql",
...     "payments": "payments.sql"
... }
>>> data_frames = engine.get_cached_queries(queries)
>>> systems_df = data_frames["systems"]
>>> payments_df = data_frames["payments"]
```
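The cache-or-execute behaviour described above follows a common pattern, sketched here with an in-memory dict. This is not the engine's implementation: `get_or_execute`, `fake_query`, and the `(created, value)` cache layout are hypothetical, and treating `expiry_days=None` as "never expires" is an assumption.

``` python
from datetime import datetime, timedelta


def get_or_execute(name, cache, run_query, force_refresh=False,
                   expiry_days=None, now=None):
    """Return a cached result for `name`, or run the query and cache it."""
    now = now or datetime.now()
    entry = cache.get(name)
    if entry is not None and not force_refresh:
        created, value = entry
        # Assumption: with no expiry configured, an entry never goes stale.
        if expiry_days is None or now - created < timedelta(days=expiry_days):
            return value
    value = run_query(name)
    cache[name] = (now, value)  # store creation time alongside the result
    return value


calls = []


def fake_query(name):
    """Record the call and return a placeholder result."""
    calls.append(name)
    return f"rows for {name}"


cache = {}
first = get_or_execute("payments", cache, fake_query)   # executes the query
second = get_or_execute("payments", cache, fake_query)  # served from cache
third = get_or_execute("payments", cache, fake_query, force_refresh=True)
print(calls)
```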

### get_query_sql

``` python
core.FinanceDataEngine.get_query_sql(
    query_names,
    export_path=None,
    query_params=None,
    render_parameters=True,
)
```

Retrieve and optionally export the raw SQL behind query files.

#### Parameters

| Name | Type | Description | Default |
|----|----|----|----|
| query_names | Union\[str, List\[str\]\] | Single query name or list of query names to retrieve SQL for. The names should match the SQL file names without the .sql extension. | *required* |
| export_path | Optional\[Union\[str, Path\]\] | Path to export SQL files to, by default None (no export). | `None` |
| query_params | Optional\[Dict\] | Parameters to render in the SQL queries, by default None. | `None` |
| render_parameters | bool | Whether to render parameters in the SQL, by default True. | `True` |

#### Returns

| Name | Type | Description |
|----|----|----|
|  | Dict\[str, str\] | Dictionary mapping query names to their raw SQL content. |

#### Raises

| Name | Type              | Description                              |
|------|-------------------|------------------------------------------|
|      | FileNotFoundError | If any of the SQL files cannot be found. |

#### Notes

If render_parameters is True and query_params is provided, the SQL will have parameters rendered using the parameter values. Otherwise, the raw SQL with parameter placeholders will be returned.

#### Examples

``` python
>>> engine = FinanceDataEngine(use_keyring=True)
>>> sql_dict = engine.get_query_sql("payments", render_parameters=True,
...                                 query_params={"target_date": "2023-01-01"})
>>> print(sql_dict["payments"])
SELECT * FROM payments WHERE payment_date >= '2023-01-01'
```

### process_data

``` python
core.FinanceDataEngine.process_data(
    queries,
    process_func,
    output_name,
    force_refresh=False,
    override_folder=None,
    analyze=True,
    query_params=None,
    cache_expiry_days=None,
    save_raw_results=False,
    return_results_dict=False,
)
```

Process data using a generic approach, with enhanced return type handling.

#### Parameters

| Name | Type | Description | Default |
|----|----|----|----|
| queries | Dict\[str, str\] | Dictionary mapping query names to SQL file names. | *required* |
| process_func | Callable | Function that processes the data. It is called with the dict of DataFrames and the processor. | *required* |
| output_name | str | Base name for output files. | *required* |
| force_refresh | bool | Whether to force refresh of cached data, by default False. | `False` |
| override_folder | Optional\[str\] | Optional subfolder for output, by default None. | `None` |
| analyze | bool | Whether to run data analysis, by default True. | `True` |
| query_params | Optional\[Dict\] | Parameters to pass to SQL queries, by default None. | `None` |
| cache_expiry_days | Optional\[int\] | Number of days before cache expires, by default None. | `None` |
| save_raw_results | bool | Whether to save the raw query results before processing, by default False. | `False` |
| return_results_dict | bool | Whether to return a dictionary that includes the result_df, by default False. | `False` |

#### Returns

| Name | Type | Description |
|----|----|----|
|  | Union\[Dict, pd.DataFrame\] | Processing results including data frames and metadata. |

#### Notes

If return_results_dict is True, a dictionary is returned with the following keys:

- status: ‘success’ or ‘error’
- result_df: The processed data (DataFrame or dict of DataFrames)
- duration: Processing time in seconds
- rows_processed: Number of rows processed
- query_stats: Statistics from query execution

If return_results_dict is False, the processed data will be returned directly.
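A caller that requests the results dictionary will usually want to check `status` before touching `result_df`. The helper below is a hedged sketch of that check: `unwrap_results` is hypothetical, the sample dict is made up, and a plain list stands in for the DataFrame so the block has no pandas dependency.

``` python
def unwrap_results(results):
    """Return result_df from a results dict, raising if the run failed."""
    status = results.get("status")
    if status != "success":
        raise RuntimeError(f"processing failed with status {status!r}")
    print(f"{results['rows_processed']} rows in {results['duration']:.2f}s")
    return results["result_df"]


# Made-up results dict following the documented keys.
sample_results = {
    "status": "success",
    "result_df": [{"id": 1}, {"id": 2}],
    "duration": 0.42,
    "rows_processed": 2,
    "query_stats": {},
}
data = unwrap_results(sample_results)
```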

### save_query_results

``` python
core.FinanceDataEngine.save_query_results(
    data_frames,
    base_name,
    override_folder=None,
    include_date=True,
)
```

Save query results to CSV files without additional processing.

#### Parameters

| Name | Type | Description | Default |
|----|----|----|----|
| data_frames | Dict\[str, pd.DataFrame\] | Dictionary of DataFrames to save, typically from get_cached_queries(). | *required* |
| base_name | str | Base name for the output files. | *required* |
| override_folder | Optional\[str\] | Subfolder within the output directory to save files to, by default None. | `None` |
| include_date | bool | Whether to include date in filenames, by default True. | `True` |

#### Returns

| Name | Type | Description |
|----|----|----|
|  | Dict\[str, str\] | Dictionary mapping query names to their saved file paths. |

#### Raises

| Name | Type      | Description                                    |
|------|-----------|------------------------------------------------|
|      | TypeError | If any item in data_frames is not a DataFrame. |

#### Notes

This method creates files with names in the format `{base_name}_{query_name}[_{date}].csv`, where the date suffix is present only when include_date is True.
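The naming scheme can be illustrated with a small stand-alone function. `build_filename` is a hypothetical helper, and the `YYYYMMDD` date format is an assumption; the format the engine actually uses is not documented here.

``` python
from datetime import date


def build_filename(base_name, query_name, include_date=True, on=None):
    """Build a name of the form {base_name}_{query_name}[_{date}].csv."""
    parts = [base_name, query_name]
    if include_date:
        # Assumption: date rendered as YYYYMMDD; the real format may differ.
        parts.append((on or date.today()).strftime("%Y%m%d"))
    return "_".join(parts) + ".csv"


print(build_filename("raw_data", "payments", on=date(2023, 1, 1)))
print(build_filename("raw_data", "payments", include_date=False))
```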

#### Examples

``` python
>>> engine = FinanceDataEngine(use_keyring=True)
>>> queries = {
...     "systems": "systems.sql",
...     "payments": "payments.sql"
... }
>>> data_frames = engine.get_cached_queries(queries)
>>> saved_files = engine.save_query_results(data_frames, "raw_data")
>>> for query_name, file_path in saved_files.items():
...     print(f"{query_name} saved to {file_path}")
```
