pytorch_forecasting.models.base_model.BaseModel
Bases: pytorch_lightning.core.lightning.LightningModule
Base class from which new timeseries models should inherit. The hparams of the created object will default to the parameters indicated in __init__().
The forward() method should return a dictionary with at least the entries prediction, which contains the network’s output, and target_scale. See the function’s documentation for more details.
The idea of the base model is that common methods do not have to be re-implemented for every new architecture. The class is a [LightningModule](https://pytorch-lightning.readthedocs.io/en/latest/lightning_module.html) and follows its conventions. However, there are important additions:
You need to specify a loss attribute that stores the function to calculate the MultiHorizonLoss for backpropagation.
The from_dataset() method can be used to initialize a network using the specifications of a dataset. Often, parameters such as the number of features can be easily deduced from the dataset. Further, the method will also store how to rescale normalized predictions into the unnormalized prediction space. Override it to pass additional arguments to the __init__ method of your network that depend on your dataset.
The transform_output() method rescales the network output using the target normalizer from the dataset.
The step() method takes care of calculating the loss, logging additional metrics defined in the logging_metrics attribute, and plotting sample predictions. You can override this method to add custom interpretations or pass extra arguments to the network’s forward method.
The epoch_end() method can be used to calculate summaries of each epoch, such as statistics on the encoder length.
The predict() method makes predictions using a dataloader or dataset. Override it if you need to pass additional arguments to forward by default.
To implement your own architecture, it is best to go through “Using custom data and implementing custom models” and to look at existing models to understand what might be a good approach.
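Example

```python
from pytorch_forecasting.metrics import SMAPE
from pytorch_forecasting.models.base_model import BaseModel


class Network(BaseModel):
    def __init__(self, my_first_parameter: int = 2, loss=SMAPE()):
        # store all __init__ arguments as hyperparameters
        self.save_hyperparameters()
        super().__init__()
        self.loss = loss

    def forward(self, x):
        encoding_target = x["encoder_target"]
        # compute the prediction from the inputs here
        return dict(prediction=..., target_scale=x["target_scale"])
```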
BaseModel for timeseries forecasting from which to inherit.
log_interval (Union[int, float], optional) – Batches after which predictions are logged. If < 1.0, will log multiple entries per batch. Defaults to -1.
log_val_interval (Union[int, float], optional) – Batches after which predictions for validation are logged. Defaults to None, in which case log_interval is used.
learning_rate (float, optional) – Learning rate. Defaults to 1e-3.
log_gradient_flow (bool) – Whether to log gradient flow. This takes time and should only be done to diagnose training failures. Defaults to False.
loss (Metric, optional) – metric to optimize. Defaults to SMAPE().
logging_metrics (nn.ModuleList[MultiHorizonMetric]) – list of metrics that are logged during training. Defaults to [].
reduce_on_plateau_patience (int) – patience after which learning rate is reduced by a factor of 10. Defaults to 1000.
reduce_on_plateau_min_lr (float) – minimum learning rate for the reduce-on-plateau learning rate scheduler. Defaults to 1e-5.
weight_decay (float) – weight decay. Defaults to 0.0.
optimizer_params (Dict[str, Any]) – additional parameters for the optimizer. Defaults to {}.
monotone_constaints (Dict[str, int]) – dictionary of monotonicity constraints for continuous decoder variables mapping position (e.g. "0" for first position) to constraint (-1 for negative and +1 for positive, larger numbers add more weight to the constraint vs. the loss but are usually not necessary). This constraint significantly slows down training. Defaults to {}.
output_transformer (Callable) – transformer that takes network output and transforms it to prediction space. Defaults to None which is equivalent to lambda out: out["prediction"].
optimizer (str) – Optimizer, “ranger”, “sgd”, “adam”, “adamw” or class name of optimizer in torch.optim. Defaults to “ranger”.
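The interplay of reduce_on_plateau_patience and reduce_on_plateau_min_lr can be illustrated with a minimal pure-Python sketch. This is not the library's scheduler: the function name reduce_on_plateau and its loop are assumptions made for illustration, modeled on the usual reduce-on-plateau rule in which the learning rate is divided by 10 once the monitored loss has failed to improve for more than patience epochs, and is never lowered below the minimum.

```python
def reduce_on_plateau(losses, lr=1e-3, patience=2, factor=0.1, min_lr=1e-5):
    """Return the learning rate in effect at each epoch (illustrative only)."""
    best = float("inf")
    bad_epochs = 0
    history = []
    for loss in losses:
        history.append(lr)
        if loss < best:
            # improvement: remember it and reset the counter
            best = loss
            bad_epochs = 0
        else:
            bad_epochs += 1
        if bad_epochs > patience:
            # reduce by a factor of 10, but never below min_lr
            lr = max(lr * factor, min_lr)
            bad_epochs = 0
    return history


history = reduce_on_plateau([1.0, 0.9, 0.9, 0.9, 0.9, 0.9], lr=1e-3, patience=2)
print(history)
```

With the default patience of 1000 listed above, such a reduction would only occur after a very long plateau, so a smaller value is needed if the schedule is meant to take effect during a short training run.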
Methods
| Method | Description |
| --- | --- |
| configure_optimizers() | Configure optimizers. |
| deduce_default_output_parameters(dataset, kwargs) | Deduce default parameters for output for from_dataset() method. |
| epoch_end(outputs) | Run at epoch end for training or validation. |
| forward(x) | Network forward pass. |
| from_dataset(dataset, **kwargs) | Create model from dataset, i.e. save dataset parameters in model. |
| log_gradient_flow(named_parameters) | Log distribution of gradients to identify exploding / vanishing gradients. |
| log_metrics(x, y, out) | Log metrics every training/validation step. |
| log_prediction(x, out, batch_idx) | |
| on_after_backward() | Log gradient flow for debugging. |
| on_load_checkpoint(checkpoint) | Do something with the checkpoint. |
| on_save_checkpoint(checkpoint) | Give the model a chance to add something to the checkpoint. |
| plot_prediction(x, out[, idx, …]) | Plot prediction vs. actuals. |
| predict(data[, mode, return_index, …]) | Run inference / prediction. |
| predict_dependency(data, variable, values[, …]) | Predict partial dependency. |
| size() | Get number of parameters in model. |
| step(x, y, batch_idx, **kwargs) | Run for each train/val step. |
| training_epoch_end(outputs) | Called at the end of the training epoch with the outputs of all training steps. |
| training_step(batch, batch_idx) | Train on batch. |
| transform_output(out) | Extract prediction from network output and rescale it to real space / de-normalize it. |
| validation_epoch_end(outputs) | Called at the end of the validation epoch with the outputs of all validation steps. |
| validation_step(batch, batch_idx) | Operates on a single batch of data from the validation set. |
Attributes
| Attribute | Description |
| --- | --- |
| log_interval | Log interval, depending on whether the model is training or validating. |
| n_targets | Number of targets to forecast. |
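How the log_interval attribute can depend on the training state is sketched below. The class IntervalDemo is a hypothetical stand-in, not the library class. It assumes that the attribute returns the log_interval hyperparameter while training and log_val_interval otherwise, and that log_val_interval falls back to log_interval when it is None, as the parameter description suggests.

```python
class IntervalDemo:
    """Hypothetical stand-in that mimics a mode-dependent log interval."""

    def __init__(self, log_interval=-1, log_val_interval=None):
        self.training = True  # flag in the style of torch.nn.Module.training
        self._log_interval = log_interval
        # fall back to the training interval when no validation interval is given
        self._log_val_interval = (
            log_interval if log_val_interval is None else log_val_interval
        )

    @property
    def log_interval(self):
        if self.training:
            return self._log_interval
        return self._log_val_interval


model = IntervalDemo(log_interval=10, log_val_interval=2)
print(model.log_interval)  # interval used while training
model.training = False
print(model.log_interval)  # interval used while validating
```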
Uses a single Ranger optimizer. Depending on whether the learning rate is a list or a single float, implements a dynamic learning rate scheduler or a deterministic version.
Returns: the first entry is a list of optimizers and the second is a list of schedulers
Return type: Tuple[List]
Determines output_size and loss parameters.
dataset (TimeSeriesDataSet) – timeseries dataset
kwargs (Dict[str, Any]) – current hyperparameters
default_loss (MultiHorizonMetric, optional) – default loss function. Defaults to MAE.
Returns: dictionary with output_size and loss.
Return type: Dict[str, Any]
Run at epoch end for training or validation. Can be overridden in models.
x (Dict[str, Union[torch.Tensor, List[torch.Tensor]]]) – network input (x as returned by the dataloader). See the to_dataloader() method, which returns a tuple of x and y. This function expects x.
Returns: dictionary of tensors. The minimal required entries in the dictionary are (shapes in brackets):
prediction (batch_size x n_decoder_time_steps x n_outputs or list thereof with each entry for a different target): unscaled predictions that can be fed to metric. List of tensors if multiple targets are predicted at the same time.
target_scale (batch_size x scale_size or list thereof with each entry for a different target): target scales that allow rescaling the predictions into the real space. The scale can usually be taken directly from x, i.e. target_scale=x["target_scale"]
Return type: Dict[str, Union[torch.Tensor, List[torch.Tensor]]]
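The minimal output contract described above can be checked with a small helper. The function check_forward_output is hypothetical and not part of the library. It only verifies that the two required entries are present, and it uses nested lists as stand-ins for tensors so that the sketch runs without any dependencies.

```python
REQUIRED_ENTRIES = ("prediction", "target_scale")


def check_forward_output(out):
    """Raise KeyError if a forward()-style output lacks a required entry."""
    missing = [key for key in REQUIRED_ENTRIES if key not in out]
    if missing:
        raise KeyError(f"forward() output is missing entries: {missing}")
    return out


# nested lists stand in for tensors:
# prediction is batch_size x n_decoder_time_steps x n_outputs,
# target_scale is batch_size x scale_size
x = {"target_scale": [[0.0, 1.0], [5.0, 2.0]]}
out = check_forward_output(
    dict(prediction=[[[0.1], [0.2]], [[0.3], [0.4]]], target_scale=x["target_scale"])
)
print(sorted(out))
```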
Create model from dataset, i.e. save dataset parameters in model.
This function should be called as super().from_dataset() in derived models that implement it.
Returns: Model that can be trained
x (Dict[str, torch.Tensor]) – x as passed to the network by the dataloader
y (torch.Tensor) – y as passed to the loss function by the dataloader
out (Dict[str, torch.Tensor]) – output of the network
batch_idx (int) – current batch index
Do something with the checkpoint. Gives model a chance to load something before state_dict is restored.
checkpoint – A dictionary with variables from the checkpoint.
Give the model a chance to add something to the checkpoint. state_dict is already there.
checkpoint – A dictionary in which you can save variables to save in a checkpoint. Contents need to be pickleable.
x – network input
out – network output
idx – index of prediction to plot
add_loss_to_title – whether to add the loss to the title, or the loss function to calculate. Can be either metrics, a bool indicating whether to use the loss metric, or a tensor which contains losses for all samples. Calculated losses are determined without weights. Defaults to False.
show_future_observed – whether to show actuals for the future. Defaults to True.
ax – matplotlib axes to plot on
Returns: matplotlib figure
data – dataloader, dataframe or dataset
mode – one of “prediction”, “quantiles” or “raw”, or tuple ("raw", output_name) where output_name is a name in the dictionary returned by forward()
return_index – whether to return the prediction index
return_decoder_lengths – whether to return decoder_lengths
batch_size – batch size for dataloader; only used if data is not a dataloader
num_workers – number of workers for dataloader; only used if data is not a dataloader
fast_dev_run – whether to only return results of the first batch
show_progress_bar – whether to show a progress bar. Defaults to False.
return_x – whether to return network inputs
**kwargs – additional arguments to network’s forward method
Returns: output, x, index, decoder_lengths; some elements might not be present, depending on what is configured to be returned
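The mode argument can be pictured as a selector over the dictionary returned by forward(). The sketch below is a simplified assumption about that behavior rather than the library's code. The helper select_output and its to_prediction argument are hypothetical names, and the “quantiles” mode is deliberately left out because it depends on the loss in use.

```python
def select_output(raw_output, mode, to_prediction=lambda out: out["prediction"]):
    """Pick what to return from a forward()-style output dictionary."""
    if mode == "raw":
        # the full dictionary, untouched
        return raw_output
    if isinstance(mode, tuple):
        # ("raw", output_name) selects a single named entry
        kind, output_name = mode
        if kind != "raw":
            raise ValueError(f"unsupported mode: {mode!r}")
        return raw_output[output_name]
    if mode == "prediction":
        return to_prediction(raw_output)
    raise ValueError(f"unsupported mode: {mode!r}")


raw = {"prediction": [1.0, 2.0], "attention": [0.3, 0.7]}
print(select_output(raw, ("raw", "attention")))
```

The entry name "attention" is only an example of an additional key that a network might return.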
data (Union[DataLoader, pd.DataFrame, TimeSeriesDataSet]) – data
variable (str) – variable which to modify
values (Iterable) – array of values to probe
mode (str, optional) – Output mode. Defaults to “dataframe”. One of:
“series”: values are the average prediction and the index are the probed values
“dataframe”: columns include prediction (which is the mean prediction over the time horizon), normalized_prediction (which are predictions divided by the prediction for the first probed value) and the variable name for the probed values
“raw”: outputs a tensor of shape len(values) x prediction_shape
target – Defines which values are overwritten for making a prediction. Same as in set_overwrite_values(). Defaults to “decoder”.
**kwargs – additional kwargs to predict() method
Returns: output
Return type: Union[np.ndarray, torch.Tensor, pd.Series, pd.DataFrame]
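The idea behind a partial dependency prediction can be sketched without the library: overwrite one variable with each probed value, predict, and average over the time horizon and over samples. Everything below is illustrative. The names partial_dependency and toy_predict are hypothetical, the rows are plain dictionaries rather than a TimeSeriesDataSet, and the returned records only loosely mirror the prediction and normalized_prediction columns described above.

```python
def partial_dependency(predict, rows, variable, values):
    """Average prediction per probed value, plus a normalized version."""
    averages = []
    for value in values:
        # overwrite the probed variable in a copy of every row
        probed_rows = [{**row, variable: value} for row in rows]
        predictions = predict(probed_rows)  # one list (time horizon) per row
        per_sample = [sum(horizon) / len(horizon) for horizon in predictions]
        averages.append(sum(per_sample) / len(per_sample))
    baseline = averages[0]  # prediction for the first probed value
    return [
        {variable: value, "prediction": avg, "normalized_prediction": avg / baseline}
        for value, avg in zip(values, averages)
    ]


def toy_predict(rows):
    # toy model: the forecast over 3 steps is twice the price
    return [[2.0 * row["price"]] * 3 for row in rows]


rows = [{"price": 1.0, "volume": 5.0}, {"price": 3.0, "volume": 7.0}]
for record in partial_dependency(toy_predict, rows, "price", [1.0, 2.0, 4.0]):
    print(record)
```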
y (Tuple[torch.Tensor, torch.Tensor]) – y as passed to the loss function by the dataloader
batch_idx (int) – batch number
**kwargs – additional arguments to pass to the network apart from x
Returns: tuple where the first entry is a dictionary to which additional logging results can be added for consumption in the epoch_end hook and the second entry is the model’s output.
Return type: Tuple[Dict[str, torch.Tensor], Dict[str, torch.Tensor]]
Called at the end of the training epoch with the outputs of all training steps. Use this in case you need to do something with all the outputs for every training_step.
```python
# the pseudocode for these calls
train_outs = []
for train_batch in train_data:
    out = training_step(train_batch)
    train_outs.append(out)
training_epoch_end(train_outs)
```

outputs – List of outputs you defined in training_step(), or if there are multiple dataloaders, a list containing a list of outputs for each dataloader.
Return type: None
Note
If this method is not overridden, this won’t be called.
Example:
```python
def training_epoch_end(self, training_step_outputs):
    # do something with all training_step outputs;
    # the hook itself is expected to return None
    ...
```
With multiple dataloaders, outputs will be a list of lists. The outer list contains one entry per dataloader, while the inner list contains the individual outputs of each training step for that dataloader.
```python
def training_epoch_end(self, training_step_outputs):
    for out in training_step_outputs:
        # do something here
        ...
```
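The list-of-lists layout for multiple dataloaders can be handled uniformly with a small aggregation helper. This is a minimal sketch under assumptions: mean_loss_per_dataloader is a hypothetical function, and each step output is assumed to be a dictionary with a "loss" entry holding a plain float.

```python
def mean_loss_per_dataloader(outputs):
    """Return one mean loss per dataloader from epoch-end outputs."""
    # a single dataloader yields a flat list of step outputs; wrap it so that
    # both layouts can be processed as a list of lists
    if outputs and isinstance(outputs[0], dict):
        outputs = [outputs]
    return [
        sum(step["loss"] for step in steps) / len(steps)
        for steps in outputs
    ]


single = [{"loss": 1.0}, {"loss": 2.0}]
multiple = [[{"loss": 1.0}, {"loss": 3.0}], [{"loss": 2.0}]]
print(mean_loss_per_dataloader(single))
print(mean_loss_per_dataloader(multiple))
```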
out (Dict[str, torch.Tensor]) – Network output with “prediction” and “target_scale” entries. The output will be one of:
The input, if the input is a tensor
out["prediction"], if no output_transformer (which is a TorchNormalizer) is defined or out["output_transformation"] is None
The transformed output: either the output_transformer applied to the input or, if the loss module is a DistributionLoss, the loss module used together with the output_transformer
Returns: rescaled prediction
Return type: torch.Tensor
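What rescaling into real space amounts to can be shown with a small sketch. It is not the TorchNormalizer implementation: the function rescale is hypothetical, nested lists stand in for tensors, and it assumes that each target_scale row holds a (center, scale) pair and that no further transformation (such as a logarithm) has to be inverted.

```python
def rescale(prediction, target_scale):
    """De-normalize predictions, assuming target_scale rows are (center, scale)."""
    return [
        [value * scale + center for value in series]
        for series, (center, scale) in zip(prediction, target_scale)
    ]


out = {
    "prediction": [[0.0, 1.0], [-1.0, 0.5]],          # batch_size x time steps
    "target_scale": [(10.0, 2.0), (100.0, 10.0)],     # batch_size x scale_size
}
print(rescale(out["prediction"], out["target_scale"]))
```

Each series is rescaled with its own row of target_scale, which is why forward() has to pass target_scale through alongside the prediction.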
```python
# the pseudocode for these calls
val_outs = []
for val_batch in val_data:
    out = validation_step(val_batch)
    val_outs.append(out)
validation_epoch_end(val_outs)
```

outputs – List of outputs you defined in validation_step(), or if there are multiple dataloaders, a list containing a list of outputs for each dataloader.
If you didn’t define a validation_step(), this won’t be called.
Examples
With a single dataloader:
```python
def validation_epoch_end(self, val_step_outputs):
    for out in val_step_outputs:
        # do something
        ...
```
With multiple dataloaders, outputs will be a list of lists. The outer list contains one entry per dataloader, while the inner list contains the individual outputs of each validation step for that dataloader.
```python
def validation_epoch_end(self, outputs):
    for dataloader_output_result in outputs:
        dataloader_outs = dataloader_output_result.dataloader_i_outputs

    # final_value is a placeholder for a metric computed from the outputs
    self.log('final_metric', final_value)
```
Operates on a single batch of data from the validation set. In this step you might generate examples or calculate anything of interest, such as accuracy.
batch (Tensor | (Tensor, …) | [Tensor, …]) – The output of your DataLoader. A tensor, tuple or list.
batch_idx (int) – The index of this batch
dataloader_idx (int) – The index of the dataloader that produced this batch (only if multiple val dataloaders used)
Returns: any of:
Any object or value
None - Validation will skip to the next batch
```python
# pseudocode of order
out = validation_step()
if defined('validation_step_end'):
    out = validation_step_end(out)
out = validation_epoch_end(out)
```

```python
# if you have one val dataloader:
def validation_step(self, batch, batch_idx): ...

# if you have multiple val dataloaders:
def validation_step(self, batch, batch_idx, dataloader_idx): ...
```
Examples:
```python
# CASE 1: A single validation dataset
def validation_step(self, batch, batch_idx):
    x, y = batch

    # implement your own
    out = self(x)
    loss = self.loss(out, y)

    # log 6 example images
    # or generated text... or whatever
    sample_imgs = x[:6]
    grid = torchvision.utils.make_grid(sample_imgs)
    self.logger.experiment.add_image('example_images', grid, 0)

    # calculate acc
    labels_hat = torch.argmax(out, dim=1)
    val_acc = torch.sum(y == labels_hat).item() / (len(y) * 1.0)

    # log the outputs!
    self.log_dict({'val_loss': loss, 'val_acc': val_acc})
```
If you pass in multiple val dataloaders, validation_step() will have an additional argument.
```python
# CASE 2: multiple validation dataloaders
def validation_step(self, batch, batch_idx, dataloader_idx):
    # dataloader_idx tells you which dataset this is.
    ...
```
If you don’t need to validate you don’t need to implement this method.
When the validation_step() is called, the model has been put in eval mode and PyTorch gradients have been disabled. At the end of validation, the model goes back to training mode and gradients are enabled.
Number of targets to forecast. Based on loss function.
Returns: number of targets
Return type: int