class pytorch_forecasting.models.basic_rnn.LSTMModel(cell_type: str = 'LSTM', hidden_size: int = 10, rnn_layers: int = 2, dropout: float = 0.1, static_categoricals: List[str] = [], static_reals: List[str] = [], time_varying_categoricals_encoder: List[str] = [], time_varying_categoricals_decoder: List[str] = [], categorical_groups: Dict[str, List[str]] = {}, time_varying_reals_encoder: List[str] = [], time_varying_reals_decoder: List[str] = [], embedding_sizes: Dict[str, Tuple[int, int]] = {}, embedding_paddings: List[str] = [], embedding_labels: Dict[str, numpy.ndarray] = {}, x_reals: List[str] = [], x_categoricals: List[str] = [], n_validation_samples: Optional[int] = None, n_plotting_samples: Optional[int] = None, target: Optional[Union[str, List[str]]] = None, loss: Optional[pytorch_forecasting.metrics.MultiHorizonMetric] = None, logging_metrics: Optional[torch.nn.modules.container.ModuleList] = None, **kwargs)[source]

Bases: pytorch_forecasting.models.base_model.AutoRegressiveBaseModelWithCovariates

Basic RNN network.

  • cell_type (str, optional) – Recurrent cell type [“LSTM”, “GRU”]. Defaults to “LSTM”.

  • hidden_size (int, optional) – hidden recurrent size - the most important hyperparameter along with rnn_layers. Defaults to 10.

  • rnn_layers (int, optional) – Number of RNN layers - important hyperparameter. Defaults to 2.

  • dropout (float, optional) – Dropout in RNN layers. Defaults to 0.1.

  • static_categoricals – integer of positions of static categorical variables

  • static_reals – integer of positions of static continuous variables

  • time_varying_categoricals_encoder – integer of positions of categorical variables for encoder

  • time_varying_categoricals_decoder – integer of positions of categorical variables for decoder

  • time_varying_reals_encoder – integer of positions of continuous variables for encoder

  • time_varying_reals_decoder – integer of positions of continuous variables for decoder

  • categorical_groups – dictionary where values are list of categorical variables that are forming together a new categorical variable which is the key in the dictionary

  • x_reals – order of continuous variables in tensor passed to forward function

  • x_categoricals – order of categorical variables in tensor passed to forward function

  • embedding_sizes – dictionary mapping (string) indices to tuple of number of categorical classes and embedding size

  • embedding_paddings – list of indices for embeddings which transform the zero’s embedding to a zero vector

  • embedding_labels – dictionary mapping (string) indices to list of categorical labels

  • n_validation_samples (int, optional) – Number of samples to use for calculating validation metrics. Defaults to None, i.e. no sampling at validation stage and using “mean” of distribution for logging metrics calculation.

  • n_plotting_samples (int, optional) – Number of samples to generate for plotting predictions during training. Defaults to n_validation_samples if not None or 100 otherwise.

  • target (str, optional) – Target variable or list of target variables. Defaults to None.

  • loss (DistributionLoss, optional) – Distribution loss function. Keep in mind that each distribution loss function might have specific requirements for target normalization. Defaults to NormalDistributionLoss.

  • logging_metrics (nn.ModuleList, optional) – Metrics to log during training. Defaults to nn.ModuleList([SMAPE(), MAE(), RMSE(), MAPE(), MASE()]).


decode(x, hidden_state)



Network forward pass.

forward(x: Dict[str, torch.Tensor]) Dict[str, torch.Tensor][source]

Network forward pass.


x (Dict[str, Union[torch.Tensor, List[torch.Tensor]]]) – network input (x as returned by the dataloader). See to_dataloader() method that returns a tuple of x and y. This function expects x.


network outputs / dictionary of tensors or list

of tensors. Create it using the to_network_output() method. The minimal required entries in the dictionary are (and shapes in brackets):

  • prediction (batch_size x n_decoder_time_steps x n_outputs or list thereof with each entry for a different target): re-scaled predictions that can be fed to metric. List of tensors if multiple targets are predicted at the same time.

Before passing outputting the predictions, you want to rescale them into real space. By default, you can use the transform_output() method to achieve this.

Return type

NamedTuple[Union[torch.Tensor, List[torch.Tensor]]]


def forward(self, x:
    # x is a batch generated based on the TimeSeriesDataset, here we just use the
    # continuous variables for the encoder
    network_input = x["encoder_cont"].squeeze(-1)
    prediction = self.linear(network_input)  #

    # rescale predictions into target space
    prediction = self.transform_output(prediction, target_scale=x["target_scale"])

    # We need to return a dictionary that at least contains the prediction
    # The parameter can be directly forwarded from the input.
    # The conversion to a named tuple can be directly achieved with the `to_network_output` function.
    return self.to_network_output(prediction=prediction)