pytorch_forecasting.data.data_module._encoder_decoder_data_module.EncoderDecoderTimeSeriesDataModule

class pytorch_forecasting.data.data_module._encoder_decoder_data_module.EncoderDecoderTimeSeriesDataModule(time_series_dataset: TimeSeries, max_encoder_length: int = 30, min_encoder_length: int | None = None, max_prediction_length: int = 1, min_prediction_length: int | None = None, min_prediction_idx: int | None = None, allow_missing_timesteps: bool = False, add_relative_time_idx: bool = False, add_target_scales: bool = False, add_encoder_length: bool | str = 'auto', target_normalizer: TorchNormalizer | EncoderNormalizer | NaNLabelEncoder | str | list[TorchNormalizer | EncoderNormalizer | NaNLabelEncoder] | tuple[TorchNormalizer | EncoderNormalizer | NaNLabelEncoder] | None = None, categorical_encoders: dict[str, NaNLabelEncoder] | None = None, scalers: dict[str, StandardScaler | RobustScaler | TorchNormalizer | EncoderNormalizer] | None = None, randomize_length: None | tuple[float, float] | bool = False, batch_size: int = 32, num_workers: int = 0, train_val_test_split: tuple = (0.7, 0.15, 0.15))

Lightning DataModule for processing time series data in an encoder-decoder format.

This module handles preprocessing, splitting, and batching of time series data for use in deep learning models. It supports categorical and continuous features, various scalers, and automatic target normalization.
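To make the encoder-decoder format concrete, the following is a minimal sketch of how one series could be cut into sliding windows, each with an encoder part followed by a decoder part. It is an illustration only, not the module's implementation: `make_windows` is a hypothetical helper, and it assumes fixed-length windows with a stride of one step, ignoring `min_encoder_length`, `min_prediction_length`, and `randomize_length`.

```python
from typing import List, Tuple


def make_windows(
    n_timesteps: int,
    max_encoder_length: int = 30,
    max_prediction_length: int = 1,
) -> List[Tuple[range, range]]:
    """Return (encoder_steps, decoder_steps) index ranges for one series.

    Hypothetical helper: fixed-length windows only, stride of one step.
    """
    window = max_encoder_length + max_prediction_length
    windows = []
    # A series shorter than one full window yields no windows at all.
    for start in range(n_timesteps - window + 1):
        split = start + max_encoder_length
        encoder_steps = range(start, split)
        decoder_steps = range(split, split + max_prediction_length)
        windows.append((encoder_steps, decoder_steps))
    return windows


windows = make_windows(n_timesteps=10, max_encoder_length=4, max_prediction_length=2)
first_encoder, first_decoder = windows[0]
print(len(windows), list(first_encoder), list(first_decoder))
```

With 10 time steps, an encoder length of 4, and a prediction length of 2, this sketch produces 5 windows, the first covering steps 0 to 3 for the encoder and steps 4 to 5 for the decoder.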

Parameters:

time_series_dataset (TimeSeries) – The dataset containing time series data.
max_encoder_length (int, default=30) – Maximum length of the encoder input sequence.
min_encoder_length (Optional[int], default=None) – Minimum length of the encoder input sequence. Defaults to max_encoder_length if not specified.
max_prediction_length (int, default=1) – Maximum length of the decoder output sequence.
min_prediction_length (Optional[int], default=None) – Minimum length of the decoder output sequence. Defaults to max_prediction_length if not specified.
min_prediction_idx (Optional[int], default=None) – Minimum index from which predictions start.
allow_missing_timesteps (bool, default=False) – Whether to allow missing timesteps in the dataset.
add_relative_time_idx (bool, default=False) – Whether to add a relative time index feature.
add_target_scales (bool, default=False) – Whether to add target scaling information.
add_encoder_length (Union[bool, str], default="auto") – Whether to include encoder length information.
target_normalizer (torch transformer, str, list, tuple, optional, default=None) – Transformer that takes group_ids, target and time_idx to normalize targets. You can choose from TorchNormalizer, GroupNormalizer, NaNLabelEncoder, EncoderNormalizer (on which overfitting tests will fail) or None for using no normalizer. For multiple targets, use a `MultiNormalizer` (pytorch_forecasting.data.encoders.MultiNormalizer). By default an appropriate normalizer is chosen automatically.
categorical_encoders (Optional[Dict[str, NaNLabelEncoder]], default=None) – Dictionary of categorical encoders.
scalers (optional, default=None) – Mapping of continuous feature names to their designated scaling instances. Defaults to None, an identity pass-through that leaves the raw feature values untouched.

Supported scaler options for individual feature keys include:
- PyTorch Forecasting normalizers (as listed in the constructor's type annotation):
  - TorchNormalizer
  - EncoderNormalizer
- Scikit-Learn scalers:
  - StandardScaler
  - RobustScaler
  - MinMaxScaler
  - MaxAbsScaler
randomize_length (Union[None, Tuple[float, float], bool], default=False) – Whether to randomize input sequence length.
batch_size (int, default=32) – Batch size for DataLoader.
num_workers (int, default=0) – Number of workers for DataLoader.
train_val_test_split (tuple, default=(0.7, 0.15, 0.15)) – Proportions for train, validation, and test dataset splits.
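As an illustration of how the `train_val_test_split` proportions could translate into index sets, here is a minimal sketch. It is not the module's implementation: `split_indices` is a hypothetical helper, and it assumes a sequential, non-shuffled split over series indices, with the test set taking whatever remains after the train and validation sizes are rounded.

```python
from typing import List, Tuple


def split_indices(
    n_series: int,
    train_val_test_split: Tuple[float, float, float] = (0.7, 0.15, 0.15),
) -> Tuple[List[int], List[int], List[int]]:
    """Partition series indices 0..n_series-1 into train/val/test lists.

    Hypothetical helper: sequential split without shuffling. Train and
    validation sizes are rounded; the test set takes the remainder.
    """
    train_frac, val_frac, _ = train_val_test_split
    n_train = round(train_frac * n_series)
    n_val = round(val_frac * n_series)
    indices = list(range(n_series))
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train, val, test = split_indices(20)
print(len(train), len(val), len(test))
```

For 20 series and the default proportions, the sketch assigns 14 series to training and 3 each to validation and testing. Whether the actual module shuffles before splitting is not stated in this reference.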

prepare_data_per_node: If True, the LOCAL_RANK=0 process on each node calls prepare_data. Otherwise only the NODE_RANK=0, LOCAL_RANK=0 process prepares data.

allow_zero_length_dataloader_with_multiple_devices: If True, a dataloader with zero length within a local rank is allowed. The default value is False.

Methods

`__delattr__`(name, /)	Implement delattr(self, name).
`__dir__`()	Default dir() implementation.
`__eq__`(value, /)	Return self==value.
`__format__`(format_spec, /)	Default object formatter.
`__ge__`(value, /)	Return self>=value.
`__getattribute__`(name, /)	Return getattr(self, name).
`__getstate__`()	Helper for pickle.
`__gt__`(value, /)	Return self>value.
`__hash__`()	Return hash(self).
`__init_subclass__`	This method is called when a class is subclassed.
`__le__`(value, /)	Return self<=value.
`__lt__`(value, /)	Return self<value.
`__ne__`(value, /)	Return self!=value.
`__new__`(*args, **kwargs)
`__reduce__`()	Helper for pickle.
`__reduce_ex__`(protocol, /)	Helper for pickle.
`__repr__`()	Return repr(self).
`__setattr__`(name, value, /)	Implement setattr(self, name, value).
`__sizeof__`()	Size of object in memory, in bytes.
`__str__`()	Return a string representation of the datasets that are set up.
`__subclasshook__`	Abstract classes can override this to customize issubclass().
`_build_cont_scalers`()	Pre-resolve continuous feature scalers to (position, adapter) pairs.
`_coerce_sample`(sample)	Convert raw sample arrays to float tensors and compute time mask.
`_compute_data_properties`(train_indices)	Scan training targets to determine per-target type, positivity, skewness.
`_create_windows`(indices)	Generate sliding windows for training, validation, and testing.
`_ensure_split`()	Compute train/val/test indices once and cache them.
`_fit_scalers`(train_indices)	Fit scalers on continuous features in the training data.
`_fit_target_normalizer`(train_indices)	Fit target normalizer on the target variable's training data.
`_get_auto_normalizer`(data_properties)	Select normalizer based on data properties and current module config.
`_get_group_dataframe`(series_idx, n_timesteps)	Build a DataFrame with group columns for a given series.
`_make_dataset`(indices)	Preprocess a set of series indices into a windowed Dataset.
`_normalize_features`(continuous, series_idx)	Apply global continuous feature scalers.
`_normalize_target`(target, series_idx)	Apply global target normalization.
`_prepare_metadata`()	Prepare metadata for model initialisation.
`_preprocess_data`(series_idx)	Preprocess one series into a cache dict.
`_resolve_target_normalizer`(train_indices)	Resolve the target normalizer.
`_set_hparams`(hp)
`_split_features`(features)	Split feature tensor into categorical and continuous subsets.
`_to_hparams_dict`(hp)
`collate_fn`(batch)
`from_datasets`([train_dataset, val_dataset, ...])	Create an instance from torch.utils.data.Dataset.
`load_from_checkpoint`(checkpoint_path[, ...])	Primary way of loading a datamodule from a checkpoint.
`load_state_dict`(state_dict)	Called when loading a checkpoint, implement to reload datamodule state given datamodule state_dict.
`on_after_batch_transfer`(batch, dataloader_idx)	Override to alter or apply batch augmentations to your batch after it is transferred to the device.
`on_before_batch_transfer`(batch, dataloader_idx)	Override to alter or apply batch augmentations to your batch before it is transferred to the device.
`on_exception`(exception)	Called when the trainer execution is interrupted by an exception.
`predict_dataloader`()	An iterable or collection of iterables specifying prediction samples.
`prepare_data`()	Use this to download and prepare data.
`remove_ignored_hparams`(ignore_list)	Remove ignored hyperparameters from the stored state.
`save_hyperparameters`(*args[, ignore, frame, ...])	Save arguments to `hparams` attribute.
`setup`([stage])	Prepare the datasets for training, validation, testing, or prediction.
`state_dict`()	Called when saving a checkpoint, implement to generate and save datamodule state.
`teardown`(stage)	Called at the end of fit (train + validate), validate, test, or predict.
`test_dataloader`()	An iterable or collection of iterables specifying test samples.
`train_dataloader`()	An iterable or collection of iterables specifying training samples.
`transfer_batch_to_device`(batch, device, ...)	Override this hook if your `DataLoader` returns tensors wrapped in a custom data structure.
`val_dataloader`()	An iterable or collection of iterables specifying validation samples.
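Several of the private methods above take `train_indices`, which reflects a common pattern: scaling statistics are fitted on the training series only and then applied to every split. The sketch below illustrates that pattern with plain Python standardization. It is an assumption-laden illustration, not the module's code: `fit_standard_scaler`, `transform`, and the toy `series` dictionary are hypothetical names introduced here.

```python
from statistics import fmean, pstdev
from typing import Dict, List, Sequence, Tuple


def fit_standard_scaler(values: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, std) of the given values; std falls back to 1.0 if zero."""
    mean = fmean(values)
    std = pstdev(values)
    return mean, (std if std > 0 else 1.0)


def transform(values: Sequence[float], mean: float, std: float) -> List[float]:
    """Standardize values with previously fitted statistics."""
    return [(v - mean) / std for v in values]


# Toy data: one continuous feature for three series, keyed by series index.
series: Dict[int, List[float]] = {
    0: [1.0, 2.0, 3.0],
    1: [3.0, 4.0, 5.0],
    2: [10.0, 20.0, 30.0],  # held-out series, never seen during fitting
}
train_indices = [0, 1]

# Fit on training series only, so held-out data cannot leak into the statistics.
train_values = [v for idx in train_indices for v in series[idx]]
mean, std = fit_standard_scaler(train_values)

# Apply the same fitted statistics to every series, including held-out ones.
scaled = {idx: transform(vals, mean, std) for idx, vals in series.items()}
print(mean, scaled[2][0])
```

Because the statistics come from the training series alone, the held-out series is scaled on the training scale and can land far from zero, which is the intended behavior when avoiding leakage.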

Attributes

`CHECKPOINT_HYPER_PARAMS_KEY`
`CHECKPOINT_HYPER_PARAMS_NAME`
`CHECKPOINT_HYPER_PARAMS_TYPE`
`__annotations__`
`__dict__`
`__doc__`
`__jit_unused_properties__`
`__module__`
`__weakref__`	list of weak references to the object
`hparams`	The collection of hyperparameters saved with `save_hyperparameters()`.
`hparams_initial`	The collection of hyperparameters saved with `save_hyperparameters()`.
`metadata`	Compute metadata for model initialization.
`name`