DataLoader
- class Melodie.data_loader.DataFrameInfo(df_name: str, columns: Dict[str, sqlalchemy.types], file_name: str = '', engine: str = 'pandas')
DataFrameInfo provides a standard format for input tables used as parameters.
- Parameters:
df_name – Name of dataframe.
columns – A dict mapping each column name to its column data type.
file_name – File name to load this dataframe from, empty by default. If no file name is given, be sure to generate the dataframe in the DataLoader.
engine – The library used to load this table file. Valid values are "pandas" and "melodie-table". However, if DataFrameInfo.FORCE_PANDAS is True, Melodie will use "pandas" to load all dataframes.
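The engine rule described above can be illustrated with a small stand-in. This is only a sketch under stated assumptions: `DataFrameInfoSketch`, `effective_engine`, and the string type names are hypothetical and are not part of Melodie. Only the field names, the two valid engine values, and the `FORCE_PANDAS` switch come from the documentation.

```python
from dataclasses import dataclass
from typing import ClassVar, Dict

# The two engine values named in the documentation.
VALID_ENGINES = ("pandas", "melodie-table")


@dataclass
class DataFrameInfoSketch:
    """Hypothetical stand-in mirroring the documented DataFrameInfo fields."""

    df_name: str
    # Column name -> type name. Strings are placeholders here; the real class
    # is documented as taking sqlalchemy types.
    columns: Dict[str, str]
    file_name: str = ""
    engine: str = "pandas"

    # Class-level switch named after the documented DataFrameInfo.FORCE_PANDAS.
    FORCE_PANDAS: ClassVar[bool] = False

    def effective_engine(self) -> str:
        """Return the engine that would apply under the documented rule."""
        if self.engine not in VALID_ENGINES:
            raise ValueError(f"Unknown engine: {self.engine!r}")
        return "pandas" if type(self).FORCE_PANDAS else self.engine


info = DataFrameInfoSketch(
    df_name="agent_params",
    columns={"id": "Integer", "wealth": "Float"},
    file_name="agent_params.xlsx",
    engine="melodie-table",
)
engine_default = info.effective_engine()

# Flipping the class-level switch overrides the per-table engine choice.
DataFrameInfoSketch.FORCE_PANDAS = True
engine_forced = info.effective_engine()
DataFrameInfoSketch.FORCE_PANDAS = False
```

The switch is modelled as a class attribute because the documentation refers to it as `DataFrameInfo.FORCE_PANDAS`, which suggests it applies to all tables at once rather than to a single instance.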
- class Melodie.data_loader.MatrixInfo(mat_name: str, data_type: sqlalchemy.types | None = None, file_name: str | None = None)
MatrixInfo provides a standard format for input matrices used as parameters.
- Parameters:
mat_name – Name of the current matrix.
data_type – A type indicating the data type of the values in the matrix.
file_name – File name to load this matrix from, None by default. If None, be sure to generate the matrix in the DataLoader.
- class Melodie.data_loader.DataLoader(manager, config: Config, scenario_cls: Type[Scenario], as_sub_worker=False)
DataLoader loads dataframes or matrices.
The Simulator/Trainer/Calibrator holds a reference to the DataLoader to avoid defining tables multiple times.
- Parameters:
manager – The Simulator/Trainer/Calibrator this dataloader belongs to.
config – A Melodie.Config instance, the configuration of the current project.
scenario_cls – The class of scenario used in this project.
as_sub_worker – If True, DataLoader will be disabled to avoid unneeded database operations.
- load_dataframe(df_info: str | DataFrameInfo, df_name='')
Load a data frame from a table file.
- Parameters:
df_info – The name of the file containing the data frame, or a DataFrameInfo object.
- load_matrix(mat_info: str | MatrixInfo, mat_name='') → ndarray
Load a matrix from a table file.
- Parameters:
mat_info – The name of the file containing the matrix, or a MatrixInfo object.
- register_dataframe(table_name: str, data_frame: DataFrame, data_types: dict = None) → None
Register a pandas dataframe.
- Parameters:
table_name – Name of dataframe
data_frame – A pandas dataframe
data_types – A dictionary describing data types.
- Returns:
None
- clear_cache()
Clear all caches under the caching directory.
- dataframe_generator(df_info: str | DataFrameInfo, rows_in_scenario: int | Callable[[Scenario], int]) → DataFrameGenerator
Create a new generator for dataframes.
- Parameters:
df_info – Dataframe info indicating the dataframe to be generated.
rows_in_scenario – How many rows to generate for a specific scenario. This argument is either an integer, giving the same number of rows for every scenario, or a function that takes a Scenario and returns the number of rows to generate for that scenario.
- Returns:
A dataframe generator object
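The two accepted forms of `rows_in_scenario` can be sketched as follows. This is a minimal illustration, not Melodie's implementation: `ScenarioSketch`, its `agent_num` attribute, and the `resolve_rows` helper are hypothetical names introduced here to show how an integer and a callable can be reduced to a row count per scenario.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class ScenarioSketch:
    """Hypothetical stand-in for a Melodie Scenario."""

    id: int
    agent_num: int  # hypothetical attribute, used only for illustration


# Either a fixed row count, or a function computing it from the scenario.
RowsSpec = Union[int, Callable[[ScenarioSketch], int]]


def resolve_rows(rows_in_scenario: RowsSpec, scenario: ScenarioSketch) -> int:
    """Return the number of rows to generate for the given scenario."""
    if callable(rows_in_scenario):
        return rows_in_scenario(scenario)
    return rows_in_scenario


scenarios = [
    ScenarioSketch(id=0, agent_num=100),
    ScenarioSketch(id=1, agent_num=250),
]

# Integer form: every scenario gets the same number of rows.
fixed_rows = [resolve_rows(50, s) for s in scenarios]

# Callable form: the row count follows a property of each scenario.
per_scenario_rows = [resolve_rows(lambda sc: sc.agent_num, s) for s in scenarios]
```

The callable form is the natural choice when the generated table should have one row per agent and the number of agents differs between scenarios.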