Data Collector

class Melodie.DataCollector(target='sqlite')

Bases: object

The DataCollector is responsible for recording data from the model’s simulation.

The DataCollector is initialized by the Model at the beginning of a simulation run. It allows users to specify which properties of agents and the environment should be collected at each time step.

At the end of the simulation, the save() method is called to write the recorded data to the specified output format (e.g., CSV files or a SQLite database).

Parameters:

target – A string indicating the output format. Supported values are “sqlite” (default) and “csv”.

setup()

A hook for setting up the DataCollector.

This method should be overridden in a subclass to specify which data to collect using add_agent_property() and add_environment_property().
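A minimal sketch of such a subclass follows. So that it runs without Melodie installed, it uses a small stand-in base class (StubDataCollector, hypothetical) that only records what was registered; in a real project the subclass would derive from Melodie.DataCollector instead. The container name "agents" and the property names "wealth" and "total_wealth" are illustrative assumptions.

```python
from typing import Dict, List, Optional


class StubDataCollector:
    """Hypothetical stand-in for Melodie.DataCollector; it only records registrations."""

    def __init__(self) -> None:
        self.agent_properties: Dict[str, List[str]] = {}
        self.environment_properties: List[str] = []

    def add_agent_property(self, container_name: str, property_name: str,
                           as_type: Optional[type] = None) -> None:
        self.agent_properties.setdefault(container_name, []).append(property_name)

    def add_environment_property(self, property_name: str,
                                 as_type: Optional[type] = None) -> None:
        self.environment_properties.append(property_name)

    def setup(self) -> None:
        """Hook to be overridden by subclasses."""


class WealthDataCollector(StubDataCollector):
    def setup(self) -> None:
        # Collect `wealth` from every agent in the model's `agents` container.
        self.add_agent_property("agents", "wealth")
        # Collect `total_wealth` from the environment.
        self.add_environment_property("total_wealth")


collector = WealthDataCollector()
collector.setup()
```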

time_elapsed()

Get the time spent in data collection.

Returns:

The elapsed time as a float.

add_agent_property(container_name: str, property_name: str, as_type: Type = None)

Register an agent property to be collected from an agent container.

The data type for the corresponding database column can be explicitly defined using as_type.

Parameters:
  • container_name – The name of the agent container attribute on the Model object (e.g., ‘agents’).

  • property_name – The name of the property to be collected from each agent in the container.

  • as_type – The desired data type for the database column.

add_environment_property(property_name: str, as_type: Type = None)

Register an environment property to be collected.

Parameters:
  • property_name – The name of the property on the Environment object.

  • as_type – The desired data type for the database column.

add_custom_collector(table_name: str, row_collector: Callable[[M], Dict[str, Any] | List[Dict[str, Any]]], columns: List[str])

Add a custom data collector to generate a standalone data table.

Parameters:
  • table_name – The name of the table for storing the collected data.

  • row_collector – A callable that takes the Model instance as an argument and returns either a dictionary for a single row or a list of dictionaries for multiple rows.

  • columns – A list of column names for the custom table.
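The two return shapes accepted for row_collector can be sketched as follows. The model here is a stand-in built with SimpleNamespace, and the attribute names (agents, id, wealth), the threshold, and the function names are assumptions made for illustration only.

```python
from types import SimpleNamespace
from typing import Any, Dict, List


def collect_wealth_summary(model: Any) -> Dict[str, Any]:
    """Return a single row summarising all agents."""
    wealths = [agent.wealth for agent in model.agents]
    return {"agent_count": len(wealths), "mean_wealth": sum(wealths) / len(wealths)}


def collect_rich_agents(model: Any) -> List[Dict[str, Any]]:
    """Return one row per agent whose wealth exceeds an (arbitrary) threshold."""
    return [{"agent_id": a.id, "wealth": a.wealth} for a in model.agents if a.wealth > 10]


# Stand-in model used only to exercise the two collectors.
model = SimpleNamespace(
    agents=[SimpleNamespace(id=0, wealth=5.0), SimpleNamespace(id=1, wealth=15.0)]
)
summary_row = collect_wealth_summary(model)
rich_rows = collect_rich_agents(model)
```

Going by the signature above, such a function would be registered inside setup() with a call like self.add_custom_collector("wealth_summary", collect_wealth_summary, ["agent_count", "mean_wealth"]), where the column names match the keys of the returned dictionaries.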

env_property_names() -> List[str]

Get the names of all registered environment properties.

Returns:

A list of property names.

agent_property_names() -> Dict[str, List[str]]

Get the names of all registered agent properties, grouped by container.

Returns:

A dictionary mapping container names to lists of property names.

agent_containers() -> List[Tuple[str, BaseAgentContainer]]

Get all agent containers that have properties registered for collection.

Returns:

A list of tuples, where each tuple contains the container name and the container object itself.

collect_agent_properties(period: int)

(Internal) Collect properties for all registered agent containers.

Parameters:

period – The current simulation period.

collect_custom_properties(period: int)

(Internal) Collect data using all registered custom collectors.

Parameters:

period – The current simulation period.

collect_single_custom_property(collector_name: str, period: int)

append_agent_properties_by_records(container_name: str, prop_names: List[str], container: AgentList, period: int)

(Internal) Record properties for a list of agents for the current period.

append_environment_properties(period: int)

property status: bool

Check if the data collector is enabled.

The DataCollector is only enabled when running under the Simulator. The Trainer and Calibrator are typically concerned only with the final state of a simulation, so recording time-series data is disabled during their execution to improve performance.

Returns:

True if the collector is enabled, otherwise False.

collect(period: int) -> None

The main data collection method, called by the Simulator at each step.

Parameters:

period – The current simulation period.

static calc_time(method)

A decorator for data-collection methods.

When defining a custom data-collection method, decorate it with DataCollector.calc_time.
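The source does not say what the decorator does internally. Judging by its name and by time_elapsed(), it presumably adds the run time of the decorated method to the collector's elapsed time. The sketch below shows that general pattern on a hypothetical TimedCollector class; it is not Melodie's implementation, and the attribute _time_elapsed and the method collect_extra are assumptions.

```python
import functools
import time


class TimedCollector:
    """Hypothetical class illustrating a timing decorator; not part of Melodie."""

    def __init__(self) -> None:
        self._time_elapsed = 0.0

    @staticmethod
    def calc_time(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            start = time.perf_counter()
            try:
                return method(self, *args, **kwargs)
            finally:
                # Accumulate the duration even if the method raises.
                self._time_elapsed += time.perf_counter() - start
        return wrapper

    def time_elapsed(self) -> float:
        return self._time_elapsed


class MyCollector(TimedCollector):
    # Applied through the class name, as the text above recommends.
    @TimedCollector.calc_time
    def collect_extra(self, period: int) -> int:
        return period * 2


c = MyCollector()
result = c.collect_extra(3)
```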

get_single_agent_data(agent_container_name: str, agent_id: int)

Get the time-series data recorded for a single agent.

Parameters:
  • agent_container_name – The attribute name of the agent container on the Model object.

  • agent_id – The ID of the agent.

Returns:

save()

Save all collected data to the specified output (CSV files or database).

save_dataframe(df: DataFrame, df_name: str, if_exists: str = 'append')

A utility method to save a pandas DataFrame to a CSV file in the output directory.

Parameters:
  • df – The pandas DataFrame to save.

  • df_name – The desired name for the output file (without extension).

  • if_exists – What to do if the file already exists. Can be ‘append’, ‘replace’, or ‘fail’.
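The three if_exists modes can be illustrated with a small standard-library sketch. The helper write_rows is hypothetical and operates on lists of dictionaries rather than DataFrames; it only demonstrates the append, replace, and fail semantics for a CSV file and makes no claim about how save_dataframe is implemented.

```python
import csv
import os
import tempfile
from typing import Dict, List


def write_rows(path: str, rows: List[Dict[str, object]], if_exists: str = "append") -> None:
    """Write rows to a CSV file, honouring 'append', 'replace', or 'fail'."""
    if if_exists not in ("append", "replace", "fail"):
        raise ValueError(f"unsupported if_exists value: {if_exists!r}")
    if not rows:
        return  # nothing to write
    exists = os.path.exists(path)
    if exists and if_exists == "fail":
        raise FileExistsError(path)
    appending = exists and if_exists == "append"
    with open(path, "a" if appending else "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        if not appending:
            writer.writeheader()  # header only when starting a fresh file
        writer.writerows(rows)


out_dir = tempfile.mkdtemp()
path = os.path.join(out_dir, "result.csv")

write_rows(path, [{"period": 0, "value": 1}])             # creates the file
write_rows(path, [{"period": 1, "value": 2}])             # appends a second row
with open(path, newline="") as f:
    appended = list(csv.DictReader(f))

write_rows(path, [{"period": 2, "value": 3}], "replace")  # overwrites the file
with open(path, newline="") as f:
    replaced = list(csv.DictReader(f))
```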