DataLoader

class Melodie.data_loader.DataFrameInfo(df_name: str, columns: Dict[str, sqlalchemy.types], file_name: str = '', engine: str = 'pandas')

DataFrameInfo provides a standard format for input tables used as parameters.

Parameters:
  • df_name – Name of the dataframe.

  • columns – A dict mapping each column name to its column data type.

  • file_name – Name of the file from which this dataframe is loaded; empty by default. If no file name is given, be sure to generate the dataframe in the DataLoader.

  • engine – The library used to load this table file. Valid values are "pandas" and "melodie-table". However, if DataFrameInfo.FORCE_PANDAS is True, Melodie uses "pandas" to load all dataframes.
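
The interaction between engine and FORCE_PANDAS described above can be sketched as follows. This is a minimal illustration using only the standard library, not Melodie's implementation: TableSpec, VALID_ENGINES and effective_engine are hypothetical names introduced here, and the stand-in uses plain Python types where DataFrameInfo expects sqlalchemy types.

```python
from dataclasses import dataclass
from typing import Dict

# The two engine names listed in the parameter description.
VALID_ENGINES = ("pandas", "melodie-table")


@dataclass
class TableSpec:
    """Hypothetical stand-in that mirrors the documented DataFrameInfo fields."""

    df_name: str
    columns: Dict[str, type]
    file_name: str = ""
    engine: str = "pandas"


def effective_engine(spec: TableSpec, force_pandas: bool) -> str:
    """Return the engine that would be used, assuming the force flag wins."""
    if spec.engine not in VALID_ENGINES:
        raise ValueError(f"unknown engine: {spec.engine!r}")
    # A global "force pandas" switch overrides the per-table choice.
    return "pandas" if force_pandas else spec.engine


spec = TableSpec(
    df_name="agent_params",
    columns={"id": int, "health_state": int},
    file_name="AgentParams.xlsx",  # illustrative file name
    engine="melodie-table",
)
chosen_without_force = effective_engine(spec, force_pandas=False)
chosen_with_force = effective_engine(spec, force_pandas=True)
```

With the flag off the per-table engine is kept; with the flag on every table falls back to "pandas", which matches the behaviour the engine parameter describes.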

class Melodie.data_loader.MatrixInfo(mat_name: str, data_type: sqlalchemy.types | None = None, file_name: str | None = None)

MatrixInfo provides a standard format for input matrices used as parameters.

Parameters:
  • mat_name – Name of the current matrix.

  • data_type – A type indicating the data type of the values in the matrix.

  • file_name – Name of the file from which this matrix is loaded, None by default. If None, be sure to generate the matrix in the DataLoader.

class Melodie.data_loader.DataLoader(manager, config: Config, scenario_cls: Type[Scenario], as_sub_worker=False)

DataLoader loads dataframes or matrices.

The Simulator, Trainer and Calibrator each hold a reference to the DataLoader, so tables do not have to be defined multiple times.

Parameters:
  • manager – The Simulator/Trainer/Calibrator this dataloader belongs to.

  • config – A Melodie.Config instance, the configuration of the current project.

  • scenario_cls – The class of scenario used in this project.

  • as_sub_worker – If True, the DataLoader is disabled to avoid unnecessary database operations.

load_dataframe(df_info: str | DataFrameInfo, df_name='')

Load a dataframe from a table file.

Parameters:

df_info – The name of the file containing the dataframe, or a DataFrameInfo.

load_matrix(mat_info: str | MatrixInfo, mat_name='') → ndarray

Load a matrix from a table file.

Parameters:

mat_info – The name of the file containing the matrix, or a MatrixInfo.

register_dataframe(table_name: str, data_frame: DataFrame, data_types: dict = None) → None

Register a pandas dataframe.

Parameters:
  • table_name – Name of dataframe

  • data_frame – A pandas dataframe

  • data_types – A dictionary describing data types.

Returns:

None

clear_cache()

Clear all caches under the caching directory.

dataframe_generator(df_info: str | DataFrameInfo, rows_in_scenario: int | Callable[[Scenario], int]) → DataFrameGenerator

Create a new generator for dataframes.

Parameters:
  • df_info – Dataframe info indicating the dataframe to be generated.

  • rows_in_scenario – Number of rows to generate for each scenario. This is either an integer, giving the same number of rows for every scenario, or a function that takes a Scenario and returns the number of rows to generate for that scenario.

Returns:

A dataframe generator object
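
The two accepted forms of rows_in_scenario can be illustrated with a small sketch. It uses only the standard library and is not Melodie's implementation: the Scenario class below is a hypothetical stand-in with an assumed agent_num parameter, and resolve_rows is a helper name introduced here for illustration.

```python
from typing import Callable, Union


class Scenario:
    """Hypothetical stand-in for a scenario carrying one parameter."""

    def __init__(self, scenario_id: int, agent_num: int) -> None:
        self.id = scenario_id
        self.agent_num = agent_num  # assumed parameter name


RowsSpec = Union[int, Callable[[Scenario], int]]


def resolve_rows(rows_in_scenario: RowsSpec, scenario: Scenario) -> int:
    """Turn either form of the argument into a row count for one scenario."""
    if callable(rows_in_scenario):
        # Function form: ask the function how many rows this scenario needs.
        rows = rows_in_scenario(scenario)
    else:
        # Integer form: the same count applies to every scenario.
        rows = rows_in_scenario
    if not isinstance(rows, int) or rows < 0:
        raise ValueError(f"row count must be a non-negative int, got {rows!r}")
    return rows


scenarios = [Scenario(0, 10), Scenario(1, 25)]
fixed_counts = [resolve_rows(100, s) for s in scenarios]
per_scenario_counts = [resolve_rows(lambda s: s.agent_num, s) for s in scenarios]
```

The function form is the natural choice when the table holds one row per agent and the number of agents is itself a scenario parameter.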

generate_scenarios_from_dataframe(df_name: str) → List[Scenario]

Generate scenario objects from the parameters in the static table named df_name.

Parameters:

df_name – Name of the static table.

Returns:

A list of scenario objects.
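
As a rough picture of what building scenarios from a table means, the sketch below turns each row of a table into one object whose attributes are the row's columns. It is an assumption-laden illustration, not Melodie's code: the table is a plain list of dicts rather than a dataframe, and Scenario, scenarios_from_rows and the column names are hypothetical.

```python
from typing import Any, Dict, List


class Scenario:
    """Hypothetical stand-in; a real project would use its own scenario class."""

    def __init__(self) -> None:
        self.id = -1


def scenarios_from_rows(rows: List[Dict[str, Any]]) -> List[Scenario]:
    """Create one scenario per row, copying each column onto the object."""
    scenarios = []
    for row in rows:
        scenario = Scenario()
        for column, value in row.items():
            # Each column name becomes an attribute holding that row's value.
            setattr(scenario, column, value)
        scenarios.append(scenario)
    return scenarios


# Illustrative table: one row per scenario, one column per parameter.
table = [
    {"id": 0, "agent_num": 10, "infection_prob": 0.1},
    {"id": 1, "agent_num": 20, "infection_prob": 0.2},
]
generated = scenarios_from_rows(table)
```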

generate_scenarios(manager_type: str) → List[Scenario]

Generate scenario objects from the parameters in the static tables or in scenarios_dataframe.

Parameters:

manager_type – The type of scenario manager: one of "simulator", "trainer" or "calibrator".

Returns:

A list of scenarios.
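
Because manager_type is a plain string, a caller may want to check it before passing it on. The helper below is a minimal sketch of such a check; check_manager_type and VALID_MANAGER_TYPES are hypothetical names, and whether Melodie itself raises an error for other values is not stated here.

```python
# The three values named in the parameter description.
VALID_MANAGER_TYPES = ("simulator", "trainer", "calibrator")


def check_manager_type(manager_type: str) -> str:
    """Return manager_type unchanged if it is one of the documented values."""
    if manager_type not in VALID_MANAGER_TYPES:
        raise ValueError(
            f"manager_type must be one of {VALID_MANAGER_TYPES}, got {manager_type!r}"
        )
    return manager_type


checked = check_manager_type("trainer")
```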