Example Notebook for a basic vignette for pytorch-forecasting v2 Model Training and Inference#
- warning:
The “Data Pipeline” showcased here is part of an experimental rework of the pytorch-forecasting data layer, planned for release in v2.0.0. The API is currently unstable and subject to change without prior notice. This notebook serves as a basic demonstration of the intended workflow and is not recommended for use in production environments. Feedback and suggestions are highly encouraged; please share them in issue 1736.
In this notebook, we demonstrate how to train and evaluate the Temporal Fusion Transformer (TFT) using the new TimeSeries and DataModule API from the v2 pipeline. We can do this in 2 ways:
- High-level package API: This approach handles data loading, dataloader creation, and model training internally. It provides a simple, scikit-learn-like fit → predict workflow. Users can still configure key training options (such as the trainer, callbacks, and training parameters) but cannot plug in fully custom trainer implementations or override internal pipeline logic.
- Low-level 3-stage pipeline: This involves explicitly constructing:
  - a TimeSeries object
  - a DataModule
  - the model (e.g., TFT)
This workflow is ideal if you need custom setups such as custom trainers, callbacks, or advanced data preprocessing. It requires a deeper understanding of how the three layers (TimeSeries, DataModule, and the model) interact, but offers maximum flexibility.
Create Synthetic data#
We generate a synthetic dataset using load_toydata, which creates a pandas DataFrame containing only numerical values, because the pipeline currently assumes that all data is numerical.
[2]:
from pytorch_forecasting.data.examples import load_toydata
[3]:
num_series = 100 # Number of individual time series to generate
seq_length = 50 # Length of each time series
data_df = load_toydata(num_series, seq_length)
data_df.head()
[3]:
| | series_id | time_idx | x | y | category | future_known_feature | static_feature | static_feature_cat |
|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | -0.030643 | 0.148280 | 0 | 1.000000 | 0.039213 | 0 |
| 1 | 0 | 1 | 0.148280 | 0.433029 | 0 | 0.995004 | 0.039213 | 0 |
| 2 | 0 | 2 | 0.433029 | 0.742511 | 0 | 0.980067 | 0.039213 | 0 |
| 3 | 0 | 3 | 0.742511 | 0.729270 | 0 | 0.955336 | 0.039213 | 0 |
| 4 | 0 | 4 | 0.729270 | 0.628604 | 0 | 0.921061 | 0.039213 | 0 |
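To make the long-format layout above concrete, here is a minimal standard-library sketch that builds rows with the same column names. It is not the implementation of load_toydata: the helper name make_toy_rows, the sine-plus-noise signal, and the category assignments are assumptions made for illustration. Two patterns are taken from the table itself: y at one step equals x at the next step, and cos(time_idx / 10) reproduces the five future_known_feature values shown.

```python
import math
import random


def make_toy_rows(num_series, seq_length, seed=0):
    """Build long-format rows: one dict per (series_id, time_idx) pair."""
    rng = random.Random(seed)
    rows = []
    for series_id in range(num_series):
        static_feature = rng.random()  # constant within one series
        # One extra value so that y at step t can equal x at step t + 1.
        values = [
            math.sin(t / 5.0) + rng.gauss(0.0, 0.1)  # assumed signal shape
            for t in range(seq_length + 1)
        ]
        for t in range(seq_length):
            rows.append(
                {
                    "series_id": series_id,
                    "time_idx": t,
                    "x": values[t],
                    "y": values[t + 1],
                    "category": series_id % 2,  # assumed encoding
                    "future_known_feature": math.cos(t / 10.0),
                    "static_feature": static_feature,
                    "static_feature_cat": series_id % 3,  # assumed encoding
                }
            )
    return rows


rows = make_toy_rows(num_series=3, seq_length=5)
print(len(rows), "rows with columns", list(rows[0]))
```

Each series contributes seq_length rows, so the full table has num_series * seq_length rows, and every series is identified by its series_id.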
High-level API#
Steps#
1. Create the TimeSeries object.
2. Create configs for the model, datamodule, trainer, etc.
3. Create the model_pkg object.
4. Perform pkg.fit and pkg.predict.
Create Dataset object#
TimeSeries returns the raw data as tensors.
Key arguments of the TimeSeries dataset:
- data: DataFrame with sequence data.
- time: Integer-typed column denoting the time index within data.
- target: Column(s) in data denoting the forecasting target.
- group: List of column names identifying a time series instance within data.
- num: List of numerical features.
- cat: List of categorical features.
- known: Features known in the future.
- unknown: Features not known in the future.
- static: List of variables that do not change over time.
[4]:
from pytorch_forecasting.data.timeseries import TimeSeries
[5]:
# create `TimeSeries` dataset that returns the raw data in terms of tensors
dataset = TimeSeries(
data=data_df,
time="time_idx",
target="y",
group=["series_id"],
num=["x", "future_known_feature", "static_feature"],
cat=["category", "static_feature_cat"],
known=["future_known_feature"],
unknown=["x", "category"],
static=["static_feature", "static_feature_cat"],
)
/content/pytorch-forecasting/pytorch_forecasting/data/timeseries/_timeseries_v2.py:105: UserWarning: TimeSeries is part of an experimental rework of the pytorch-forecasting data layer, scheduled for release with v2.0.0. The API is not stable and may change without prior warning. For beta testing, but not for stable production use. Feedback and suggestions are very welcome in pytorch-forecasting issue 1736, https://github.com/sktime/pytorch-forecasting/issues/1736
warn(
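The role lists passed to TimeSeries above overlap by design: every feature has a type (num or cat) and a role (known, unknown, or static). The following sketch checks that such lists are mutually consistent before constructing the dataset. The helper check_roles is hypothetical and not part of pytorch-forecasting, and the rules it enforces are assumptions about what a sensible configuration looks like, not documented library requirements.

```python
def check_roles(columns, time, target, group, num, cat, known, unknown, static):
    """Return a list of problems; an empty list means the roles are consistent."""
    problems = []
    features = set(num) | set(cat)

    # Every referenced column should exist in the frame.
    for name in [time, target, *group, *sorted(features)]:
        if name not in columns:
            problems.append(f"missing column: {name}")

    # A feature should have exactly one type.
    for name in sorted(set(num) & set(cat)):
        problems.append(f"{name} is listed as both num and cat")

    # A feature cannot be both known and unknown in the future.
    for name in sorted(set(known) & set(unknown)):
        problems.append(f"{name} is listed as both known and unknown")

    # Every feature with a role should also have a type.
    for name in [*known, *unknown, *static]:
        if name not in features:
            problems.append(f"{name} has a role but no type (num/cat)")
    return problems


columns = [
    "series_id", "time_idx", "x", "y", "category",
    "future_known_feature", "static_feature", "static_feature_cat",
]
problems = check_roles(
    columns,
    time="time_idx",
    target="y",
    group=["series_id"],
    num=["x", "future_known_feature", "static_feature"],
    cat=["category", "static_feature_cat"],
    known=["future_known_feature"],
    unknown=["x", "category"],
    static=["static_feature", "static_feature_cat"],
)
print(problems or "roles are consistent")
```

For the configuration used in this notebook the check reports no problems.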
Create the configs#
[13]:
from sklearn.preprocessing import StandardScaler
from pytorch_forecasting.data.encoders import (
NaNLabelEncoder,
TorchNormalizer,
)
from pytorch_forecasting.metrics import MAE, SMAPE
Here we use EncoderDecoderTimeSeriesDataModule.
Key arguments of EncoderDecoderTimeSeriesDataModule:
- time_series_dataset: TimeSeries dataset instance.
- max_encoder_length: Maximum length of the encoder input sequence.
- max_prediction_length: Maximum length of the decoder output sequence.
- batch_size: Batch size for DataLoader.
- categorical_encoders: Dictionary of categorical encoders.
- scalers: Dictionary of feature scalers.
- target_normalizer: Normalizer for the target variable.
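The two length arguments describe how each series is cut into an encoder part (the observed history) and a decoder part (the horizon to predict). The sketch below illustrates the general sliding-window idea with a hypothetical helper, sliding_windows. It only generates full-length windows, which is an assumption; the datamodule may also build shorter windows or split the data differently.

```python
def sliding_windows(series_length, max_encoder_length, max_prediction_length):
    """Yield (encoder_indices, decoder_indices) for every full-length window."""
    window = max_encoder_length + max_prediction_length
    for start in range(series_length - window + 1):
        split = start + max_encoder_length
        yield range(start, split), range(split, start + window)


# A tiny series of 6 time steps, 3 encoder steps, 1 prediction step.
windows = list(
    sliding_windows(series_length=6, max_encoder_length=3, max_prediction_length=1)
)
for encoder_idx, decoder_idx in windows:
    print(list(encoder_idx), "->", list(decoder_idx))
```

Each window pairs a block of consecutive history steps with the step or steps that immediately follow it.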
[14]:
datamodule_cfg = dict(
max_encoder_length=30,
max_prediction_length=1,
batch_size=32,
categorical_encoders={
"category": NaNLabelEncoder(add_nan=True),
"static_feature_cat": NaNLabelEncoder(add_nan=True),
},
scalers={
"x": StandardScaler(),
"future_known_feature": StandardScaler(),
"static_feature": StandardScaler(),
},
target_normalizer=TorchNormalizer(),
)
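The scalers entries above standardize each numerical feature. As a rough illustration of what standardization does, the sketch below fits a mean and a population standard deviation, applies the transform, and inverts it. The function names are hypothetical, and the code only mirrors the basic arithmetic of scikit-learn's StandardScaler; how the datamodule fits and applies the scalers internally is not shown here.

```python
from statistics import fmean, pstdev


def fit_standardizer(values):
    """Return (mean, std) using the population standard deviation."""
    mean = fmean(values)
    std = pstdev(values)
    if std == 0.0:
        std = 1.0  # avoid division by zero for constant features
    return mean, std


def standardize(values, mean, std):
    return [(v - mean) / std for v in values]


def unstandardize(scaled, mean, std):
    return [s * std + mean for s in scaled]


train = [1.0, 2.0, 3.0, 4.0]
mean, std = fit_standardizer(train)
scaled = standardize(train, mean, std)
restored = unstandardize(scaled, mean, std)
print(scaled)
print(restored)
```

After the transform the values have zero mean and unit variance, and the inverse transform recovers the original scale.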
We use the TFT model in this tutorial.
[15]:
model_cfg = dict(
loss=MAE(),
logging_metrics=[MAE(), SMAPE()],
optimizer="adam",
optimizer_params={"lr": 1e-3},
lr_scheduler="reduce_lr_on_plateau",
lr_scheduler_params={"mode": "min", "factor": 0.1, "patience": 10},
hidden_size=64,
num_layers=2,
attention_head_size=4,
dropout=0.1,
)
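The loss and logging_metrics entries above refer to two error measures, MAE and SMAPE. The sketch below spells out their common definitions in plain Python. The small eps term in the SMAPE denominator is an assumption added to avoid division by zero, and the library's own implementations may differ in details such as reduction or weighting.

```python
def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def smape(y_true, y_pred, eps=1e-8):
    """Symmetric mean absolute percentage error, as a fraction in [0, 2]."""
    terms = [
        2.0 * abs(t - p) / (abs(t) + abs(p) + eps)
        for t, p in zip(y_true, y_pred)
    ]
    return sum(terms) / len(terms)


y_true = [1.0, 2.0]
y_pred = [2.0, 4.0]
print("MAE:", mae(y_true, y_pred))
print("SMAPE:", smape(y_true, y_pred))
```

MAE is expressed in the units of the target, while SMAPE is scale-free, which makes it easier to compare across series with different magnitudes.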
[16]:
trainer_cfg = dict(
max_epochs=5,
accelerator="auto",
devices=1,
enable_progress_bar=True,
log_every_n_steps=10,
)
[17]:
from pytorch_forecasting.models.temporal_fusion_transformer._tft_pkg_v2 import (
TFT_pkg_v2,
)
Create the model_pkg object#
This pkg class wraps the whole ML pipeline in pytorch-forecasting: we define the pkg object once and then call pkg.fit and pkg.predict to train the model and generate predictions.
[18]:
model_pkg = TFT_pkg_v2(
model_cfg=model_cfg,
trainer_cfg=trainer_cfg,
datamodule_cfg=datamodule_cfg,
)
{'loss': MAE(), 'logging_metrics': [MAE(), SMAPE()], 'optimizer': 'adam', 'optimizer_params': {'lr': 0.001}, 'lr_scheduler': 'reduce_lr_on_plateau', 'lr_scheduler_params': {'mode': 'min', 'factor': 0.1, 'patience': 10}, 'hidden_size': 64, 'num_layers': 2, 'attention_head_size': 4, 'dropout': 0.1}
[ ]:
model_pkg.fit(dataset) # You can also pass in a DataModule here
/content/pytorch-forecasting/pytorch_forecasting/data/data_module.py:129: UserWarning: EncoderDecoderTimeSeriesDataModule is part of an experimental rework of the pytorch-forecasting data layer, scheduled for release with v2.0.0. The API is not stable and may change without prior warning. For beta testing, but not for stable production use. Feedback and suggestions are very welcome in pytorch-forecasting issue 1736, https://github.com/sktime/pytorch-forecasting/issues/1736
warn(
/content/pytorch-forecasting/pytorch_forecasting/models/base/_base_model_v2.py:64: UserWarning: The Model 'TFT' is part of an experimental reworkof the pytorch-forecasting model layer, scheduled for release with v2.0.0. The API is not stable and may change without prior warning. This class is intended for beta testing and as a basic skeleton, but not for stable production use. Feedback and suggestions are very welcome in pytorch-forecasting issue 1736, https://github.com/sktime/pytorch-forecasting/issues/1736
warn(
INFO: GPU available: True (cuda), used: True
INFO: TPU available: False, using: 0 TPU cores
INFO: LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
INFO:
| Name | Type | Params | Mode
---------------------------------------------------------------------
0 | loss | MAE | 0 | train
1 | encoder_var_selection | Sequential | 709 | train
2 | decoder_var_selection | Sequential | 193 | train
3 | static_context_linear | Linear | 192 | train
4 | lstm_encoder | LSTM | 51.5 K | train
5 | lstm_decoder | LSTM | 50.4 K | train
6 | self_attention | MultiheadAttention | 16.6 K | train
7 | pre_output | Linear | 4.2 K | train
8 | output_layer | Linear | 65 | train
---------------------------------------------------------------------
123 K Trainable params
0 Non-trainable params
123 K Total params
0.495 Total estimated model params size (MB)
18 Modules in train mode
0 Modules in eval mode
INFO: `Trainer.fit` stopped: `max_epochs=5` reached.
Artifacts saved in: /content/pytorch-forecasting/checkpoints
PosixPath('/content/pytorch-forecasting/checkpoints/best-epoch=3-step=168.ckpt')
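The checkpoint file name returned above encodes the epoch and the global step at which the best checkpoint was written. The sketch below parses those two numbers from the path. It assumes the name follows the pattern seen in this output, that the epoch counter is zero-based, and that the step counts optimizer steps accumulated through the end of that epoch; the derived steps-per-epoch figure is only as reliable as those assumptions.

```python
import re
from pathlib import PurePosixPath

ckpt = PurePosixPath(
    "/content/pytorch-forecasting/checkpoints/best-epoch=3-step=168.ckpt"
)

# The stem is the file name without the ".ckpt" suffix.
match = re.fullmatch(r"best-epoch=(\d+)-step=(\d+)", ckpt.stem)
if match is None:
    raise ValueError(f"unexpected checkpoint name: {ckpt.name}")

epoch, step = (int(group) for group in match.groups())
steps_per_epoch = step // (epoch + 1)  # assumes a zero-based epoch counter
print(epoch, step, steps_per_epoch)
```

Under these assumptions, the best checkpoint here was saved after the fourth of the five training epochs.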
Output#
The output of the TFT model is a dict with the key prediction:
y_pred["prediction"]: Tensor of shape (batch_size, prediction_length, output_size)
[ ]:
preds = model_pkg.predict(dataset, return_info=["index", "x", "y"])
# You can also pass in a DataModule or Dataloader here
/content/pytorch-forecasting/pytorch_forecasting/data/data_module.py:129: UserWarning: EncoderDecoderTimeSeriesDataModule is part of an experimental rework of the pytorch-forecasting data layer, scheduled for release with v2.0.0. The API is not stable and may change without prior warning. For beta testing, but not for stable production use. Feedback and suggestions are very welcome in pytorch-forecasting issue 1736, https://github.com/sktime/pytorch-forecasting/issues/1736
warn(
INFO: 💡 Tip: For seamless cloud uploads and versioning, try installing [litmodels](https://pypi.org/project/litmodels/) to enable LitModelCheckpoint, which syncs automatically with the Lightning model registry.
INFO: GPU available: True (cuda), used: True
INFO: TPU available: False, using: 0 TPU cores
INFO: LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
[21]:
print("First Predicted Value:")
print("Index:", preds["index"][0].item())
print("Prediction:", preds["prediction"][0].item())
print("Actual:", preds["y"][0].item())
First Predicted Value:
Index: -0.0801810473203659
Prediction: 0.11192154139280319
Actual: -0.1557866632938385
3-stage pipeline#
Steps#
1. Create the TimeSeries dataset object.
2. Create the DataModule object.
3. Initialize, train, and run inference with the model.
Create Dataset & DataModule#
- TimeSeries returns the raw data as tensors.
- DataModule wraps the dataset, handles splits, preprocessing, and batching, and exposes metadata for model initialisation.
Initialize the Model#
We initialize the TFT model using the metadata provided by the DataModule. This metadata includes all required dimensional info for the encoder, decoder, and static inputs.
Train the Model#
We use a Trainer from PyTorch Lightning to train the model.
Run Inference#
After training, we can make predictions using the trained model.
1. Create the dataset#
We create a TimeSeries dataset instance that returns the raw data as tensors. This raw data is then passed to the data_module, which internally handles the dataloaders and preprocessing.
Key arguments of the TimeSeries dataset:
- data: DataFrame with sequence data.
- time: Integer-typed column denoting the time index within data.
- target: Column(s) in data denoting the forecasting target.
- group: List of column names identifying a time series instance within data.
- num: List of numerical features.
- cat: List of categorical features.
- known: Features known in the future.
- unknown: Features not known in the future.
- static: List of variables that do not change over time.
[22]:
from pytorch_forecasting.data.timeseries import TimeSeries
[23]:
# create `TimeSeries` dataset that returns the raw data in terms of tensors
dataset = TimeSeries(
data=data_df,
time="time_idx",
target="y",
group=["series_id"],
num=["x", "future_known_feature", "static_feature"],
cat=["category", "static_feature_cat"],
known=["future_known_feature"],
unknown=["x", "category"],
static=["static_feature", "static_feature_cat"],
)
/content/pytorch-forecasting/pytorch_forecasting/data/timeseries/_timeseries_v2.py:105: UserWarning: TimeSeries is part of an experimental rework of the pytorch-forecasting data layer, scheduled for release with v2.0.0. The API is not stable and may change without prior warning. For beta testing, but not for stable production use. Feedback and suggestions are very welcome in pytorch-forecasting issue 1736, https://github.com/sktime/pytorch-forecasting/issues/1736
warn(
2. Create datamodule#
Key arguments of EncoderDecoderTimeSeriesDataModule:
- time_series_dataset: TimeSeries dataset instance.
- max_encoder_length: Maximum length of the encoder input sequence.
- max_prediction_length: Maximum length of the decoder output sequence.
- batch_size: Batch size for DataLoader.
- categorical_encoders: Dictionary of categorical encoders.
- scalers: Dictionary of feature scalers.
- target_normalizer: Normalizer for the target variable.
[ ]:
from sklearn.preprocessing import StandardScaler
from pytorch_forecasting.data.data_module import EncoderDecoderTimeSeriesDataModule
from pytorch_forecasting.data.encoders import (
NaNLabelEncoder,
TorchNormalizer,
)
[ ]:
# create the `data_module` that handles the dataloaders and preprocessing
data_module = EncoderDecoderTimeSeriesDataModule(
time_series_dataset=dataset,
max_encoder_length=30,
max_prediction_length=1,
batch_size=32,
categorical_encoders={
"category": NaNLabelEncoder(add_nan=True),
"static_feature_cat": NaNLabelEncoder(add_nan=True),
},
scalers={
"x": StandardScaler(),
"future_known_feature": StandardScaler(),
"static_feature": StandardScaler(),
},
target_normalizer=TorchNormalizer(),
)
/content/pytorch-forecasting/pytorch_forecasting/data/data_module.py:129: UserWarning: EncoderDecoderTimeSeriesDataModule is part of an experimental rework of the pytorch-forecasting data layer, scheduled for release with v2.0.0. The API is not stable and may change without prior warning. For beta testing, but not for stable production use. Feedback and suggestions are very welcome in pytorch-forecasting issue 1736, https://github.com/sktime/pytorch-forecasting/issues/1736
warn(
3. Initialise and train the model#
To initialise the model, you no longer have to pass arguments such as encoder_cont and decoder_cont, because they are computed internally from the metadata property of EncoderDecoderTimeSeriesDataModule. You still have to pass other parameters such as loss and optimizer.
model = TFT(
loss=nn.MSELoss(),
logging_metrics=[MAE(), SMAPE()],
metadata=data_module.metadata, # <-- crucial for model setup
...
)
The metadata includes:
- max_encoder_length, max_prediction_length
- number of continuous/categorical variables in encoder/decoder
- number of static features
These are used to configure internal layers like encoder_cont, decoder_cat, etc.
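To give a feel for the kind of dimensional information involved, the sketch below counts how many continuous and categorical features fall into each role for the configuration used in this notebook. It is only an illustration: the helper count_by_type is hypothetical, and the exact rule by which the datamodule turns these groups into values such as encoder_cont or decoder_cat is not shown in this notebook, so no such mapping is claimed here.

```python
num = ["x", "future_known_feature", "static_feature"]
cat = ["category", "static_feature_cat"]
roles = {
    "known": ["future_known_feature"],
    "unknown": ["x", "category"],
    "static": ["static_feature", "static_feature_cat"],
}


def count_by_type(features):
    """Count continuous and categorical features in one role group."""
    return {
        "cont": sum(name in num for name in features),
        "cat": sum(name in cat for name in features),
    }


summary = {role: count_by_type(features) for role, features in roles.items()}
for role, counts in summary.items():
    print(role, counts)
```

Counts like these are what allow a model's input layers to be sized without the user passing the dimensions by hand.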
[ ]:
from pytorch_forecasting.metrics import MAE, SMAPE
from pytorch_forecasting.models.temporal_fusion_transformer._tft_v2 import TFT
[60]:
# Initialise the Model
model = TFT(
loss=MAE(),
logging_metrics=[MAE(), SMAPE()],
optimizer="adam",
optimizer_params={"lr": 1e-3},
lr_scheduler="reduce_lr_on_plateau",
lr_scheduler_params={"mode": "min", "factor": 0.1, "patience": 10},
hidden_size=64,
num_layers=2,
attention_head_size=4,
dropout=0.1,
metadata=data_module.metadata, # pass the metadata from the datamodule to the model
# to initialise important params like `encoder_cont` etc
)
/content/pytorch-forecasting/pytorch_forecasting/models/base/_base_model_v2.py:64: UserWarning: The Model 'TFT' is part of an experimental reworkof the pytorch-forecasting model layer, scheduled for release with v2.0.0. The API is not stable and may change without prior warning. This class is intended for beta testing and as a basic skeleton, but not for stable production use. Feedback and suggestions are very welcome in pytorch-forecasting issue 1736, https://github.com/sktime/pytorch-forecasting/issues/1736
warn(
We use a Trainer from PyTorch Lightning to train the model:
trainer = Trainer(max_epochs=5, ...)
trainer.fit(model, data_module)
The Trainer:
- Pulls data from data_module
- Handles device placement
- Logs training progress and metrics
[ ]:
from lightning.pytorch import Trainer
[61]:
# Train the model
print("\nTraining model...")
trainer = Trainer(
max_epochs=5,
accelerator="auto",
devices=1,
enable_progress_bar=True,
log_every_n_steps=10,
)
trainer.fit(model, data_module)
INFO: 💡 Tip: For seamless cloud uploads and versioning, try installing [litmodels](https://pypi.org/project/litmodels/) to enable LitModelCheckpoint, which syncs automatically with the Lightning model registry.
INFO: GPU available: True (cuda), used: True
INFO: TPU available: False, using: 0 TPU cores
INFO: LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
INFO:
| Name | Type | Params | Mode
---------------------------------------------------------------------
0 | loss | MAE | 0 | train
1 | encoder_var_selection | Sequential | 709 | train
2 | decoder_var_selection | Sequential | 193 | train
3 | static_context_linear | Linear | 192 | train
4 | lstm_encoder | LSTM | 51.5 K | train
5 | lstm_decoder | LSTM | 50.4 K | train
6 | self_attention | MultiheadAttention | 16.6 K | train
7 | pre_output | Linear | 4.2 K | train
8 | output_layer | Linear | 65 | train
---------------------------------------------------------------------
123 K Trainable params
0 Non-trainable params
123 K Total params
0.495 Total estimated model params size (MB)
18 Modules in train mode
0 Modules in eval mode
Training model...
INFO: `Trainer.fit` stopped: `max_epochs=5` reached.
Output#
The output of the TFT model is a dict with the key prediction:
y_pred["prediction"]: Tensor of shape (batch_size, prediction_length, output_size)
[ ]:
data_module.setup(stage="test")
test_dataloader = data_module.test_dataloader()
[ ]:
preds = model.predict(test_dataloader, return_info=["index", "x", "y"])
INFO: 💡 Tip: For seamless cloud uploads and versioning, try installing [litmodels](https://pypi.org/project/litmodels/) to enable LitModelCheckpoint, which syncs automatically with the Lightning model registry.
INFO: GPU available: True (cuda), used: True
INFO: TPU available: False, using: 0 TPU cores
INFO: LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
[ ]:
print("First Predicted Value:")
print("Index:", preds["index"][0].item())
print("Prediction:", preds["prediction"][0].item())
print("Actual:", preds["y"][0].item())
First Predicted Value:
Index: 0.11104673147201538
Prediction: -0.001255139708518982
Actual: 0.07348770648241043