class pytorch_forecasting.models.deepar.DeepAR(cell_type: str = 'LSTM', hidden_size: int = 10, rnn_layers: int = 2, dropout: float = 0.1, static_categoricals: List[str] = [], static_reals: List[str] = [], time_varying_categoricals_encoder: List[str] = [], time_varying_categoricals_decoder: List[str] = [], categorical_groups: Dict[str, List[str]] = {}, time_varying_reals_encoder: List[str] = [], time_varying_reals_decoder: List[str] = [], embedding_sizes: Dict[str, Tuple[int, int]] = {}, embedding_paddings: List[str] = [], embedding_labels: Dict[str, numpy.ndarray] = {}, x_reals: List[str] = [], x_categoricals: List[str] = [], n_validation_samples: Optional[int] = None, n_plotting_samples: Optional[int] = None, target: Optional[Union[str, List[str]]] = None, target_lags: Dict[str, List[int]] = {}, loss: Optional[pytorch_forecasting.metrics.DistributionLoss] = None, logging_metrics: Optional[torch.nn.modules.container.ModuleList] = None, **kwargs)

Bases: pytorch_forecasting.models.base_model.AutoRegressiveBaseModelWithCovariates

DeepAR Network.

The code is based on the article DeepAR: Probabilistic forecasting with autoregressive recurrent networks.

  • cell_type (str, optional) – Recurrent cell type [“LSTM”, “GRU”]. Defaults to “LSTM”.

  • hidden_size (int, optional) – Size of the recurrent hidden state. Together with rnn_layers, this is the most important hyperparameter. Defaults to 10.

  • rnn_layers (int, optional) – Number of RNN layers - important hyperparameter. Defaults to 2.

  • dropout (float, optional) – Dropout in RNN layers. Defaults to 0.1.

  • static_categoricals – names of static categorical variables

  • static_reals – names of static continuous variables

  • time_varying_categoricals_encoder – names of categorical variables for encoder

  • time_varying_categoricals_decoder – names of categorical variables for decoder

  • time_varying_reals_encoder – names of continuous variables for encoder

  • time_varying_reals_decoder – names of continuous variables for decoder

  • categorical_groups – dictionary mapping the name of a new categorical variable (the key) to the list of categorical variables that together form it (the value)

  • x_reals – order of continuous variables in tensor passed to forward function

  • x_categoricals – order of categorical variables in tensor passed to forward function

  • embedding_sizes – dictionary mapping (string) indices to tuple of number of categorical classes and embedding size

  • embedding_paddings – list of indices for embeddings which transform the zero’s embedding to a zero vector

  • embedding_labels – dictionary mapping (string) indices to list of categorical labels

  • n_validation_samples (int, optional) – Number of samples to use for calculating validation metrics. Defaults to None, i.e. no sampling at validation stage and using “mean” of distribution for logging metrics calculation.

  • n_plotting_samples (int, optional) – Number of samples to generate for plotting predictions during training. Defaults to n_validation_samples if not None or 100 otherwise.

  • target (str or List[str], optional) – Target variable or list of target variables. Defaults to None.

  • target_lags (Dict[str, List[int]]) – dictionary of target names mapped to list of time steps by which the variable should be lagged. Lags can be useful to indicate seasonality to the models. If you know the seasonalities of your data, add at least the target variables with the corresponding lags to improve performance. Defaults to no lags, i.e. an empty dictionary.

  • loss (DistributionLoss, optional) – Distribution loss function. Keep in mind that each distribution loss function might have specific requirements for target normalization. Defaults to NormalDistributionLoss.

  • logging_metrics (nn.ModuleList, optional) – Metrics to log during training. Defaults to nn.ModuleList([SMAPE(), MAE(), RMSE(), MAPE(), MASE()]).
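To make the meaning of target_lags concrete, here is a minimal sketch in plain Python. It is not the library's implementation: the helper add_lagged_columns and the "<name>_lagged_by_<lag>" naming are assumptions made for this example. It only shows that each lag adds a copy of the target shifted back by that many time steps.

```python
from typing import Dict, List, Optional


def add_lagged_columns(
    series: Dict[str, List[float]],
    target_lags: Dict[str, List[int]],
) -> Dict[str, List[Optional[float]]]:
    """Return a copy of `series` with one extra column per (target, lag) pair.

    Hypothetical helper: the "<name>_lagged_by_<lag>" naming is an assumption
    made for this example only.
    """
    out: Dict[str, List[Optional[float]]] = {k: list(v) for k, v in series.items()}
    for name, lags in target_lags.items():
        values = series[name]
        for lag in lags:
            if lag <= 0:
                raise ValueError("lags must be positive integers")
            # Prepend `lag` missing values, then cut back to the original length,
            # so that position t holds the value observed at t - lag.
            out[f"{name}_lagged_by_{lag}"] = ([None] * lag + values)[: len(values)]
    return out


weekly = {"sales": [10.0, 12.0, 9.0, 11.0, 13.0, 10.0, 12.0, 9.5]}
lagged = add_lagged_columns(weekly, {"sales": [1, 3]})
```

In this sketch, a lag that matches the seasonal period of the data puts the value from one season earlier next to each time step, which is the sense in which lags "indicate seasonality".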


construct_input_vector(x_cat, x_cont[, …])

Create input vector into RNN network

create_log(x, y, out, batch_idx)

Create the log used in the training and validation step.

decode(input_vector, target_scale, …[, …])

Decode hidden state of RNN into prediction.

decode_all(x, hidden_state[, lengths])

encode(x)

Encode sequence into hidden state

forward(x[, n_samples])

Forward network

from_dataset(dataset[, …])

Create model from dataset.

plot_prediction(x, out, idx[, …])

Plot predictions vs. actuals

predict(data[, mode, return_index, …])

Predict on a dataloader, dataframe or dataset

construct_input_vector(x_cat: torch.Tensor, x_cont: torch.Tensor, one_off_target: Optional[torch.Tensor] = None) → torch.Tensor

Create input vector into RNN network


one_off_target – tensor to insert into first position of target. If None (default), remove first time step.
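The role of one_off_target can be sketched with plain Python lists. The sketch assumes (this page does not state it) that the target column of the input is shifted by one step, so that the network at step t sees the target from step t - 1. The function shift_target and the data are hypothetical stand-ins for the tensor operations.

```python
from typing import List, Optional


def shift_target(
    rows: List[List[float]],
    target_pos: int,
    one_off_target: Optional[float] = None,
) -> List[List[float]]:
    """Shift the target column of a single series by one time step.

    rows[t] is the feature vector at time step t. After shifting, the value at
    `target_pos` in step t is the target observed at step t - 1.
    """
    shifted = [list(row) for row in rows]  # copy so the input is left unchanged
    for t in range(1, len(rows)):
        shifted[t][target_pos] = rows[t - 1][target_pos]
    if one_off_target is None:
        # No value is known for the step before the sequence: drop step 0.
        return shifted[1:]
    # Otherwise use the supplied value as the "previous target" of step 0.
    shifted[0][target_pos] = one_off_target
    return shifted


# Columns: [covariate, target]
rows = [[0.1, 5.0], [0.2, 6.0], [0.3, 7.0]]
without_first = shift_target(rows, target_pos=1)
with_first = shift_target(rows, target_pos=1, one_off_target=4.0)
```

Under this assumption, omitting one_off_target shortens the sequence by one step, while supplying it keeps the full length.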

create_log(x, y, out, batch_idx)

Create the log used in the training and validation step.

  • x (Dict[str, torch.Tensor]) – x as passed to the network by the dataloader

  • y (Tuple[torch.Tensor, torch.Tensor]) – y as passed to the loss function by the dataloader

  • out (Dict[str, torch.Tensor]) – output of the network

  • batch_idx (int) – batch number

  • prediction_kwargs (Dict[str, Any], optional) – arguments to pass to to_prediction(). Defaults to {}.

  • quantiles_kwargs (Dict[str, Any], optional) – arguments to pass to to_quantiles(). Defaults to {}.


Returns

log dictionary to be returned by training and validation steps

Return type

Dict[str, Any]

decode(input_vector: torch.Tensor, target_scale: torch.Tensor, decoder_lengths: torch.Tensor, hidden_state: Union[Tuple[torch.Tensor, torch.Tensor], torch.Tensor], n_samples: Optional[int] = None) → Tuple[torch.Tensor, bool]

Decode hidden state of RNN into prediction. If n_samples is given, decode not by using actual values but by iteratively sampling new targets from past predictions.
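The sampling branch described above can be pictured with the following toy sketch. It is not the library's code: toy_step is a hypothetical stand-in for the RNN cell and the distribution head, and a normal output distribution is assumed. The point is only the loop structure, in which each sampled value becomes the input of the next step.

```python
import math
import random
from typing import List, Tuple


def toy_step(hidden: float, prev_target: float) -> Tuple[float, float, float]:
    """Hypothetical stand-in for one RNN step plus a Gaussian output head."""
    hidden = math.tanh(0.5 * hidden + 0.3 * prev_target)
    mean = 2.0 * hidden
    std = 0.1 + 0.05 * abs(hidden)
    return hidden, mean, std


def sample_paths(
    hidden: float, last_target: float, horizon: int, n_samples: int, seed: int = 0
) -> List[List[float]]:
    """Draw `n_samples` trajectories of length `horizon` autoregressively."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_samples):
        h, y = hidden, last_target  # every path starts from the encoder state
        path = []
        for _ in range(horizon):
            h, mean, std = toy_step(h, y)
            y = rng.gauss(mean, std)  # the sample is fed back at the next step
            path.append(y)
        paths.append(path)
    return paths


paths = sample_paths(hidden=0.2, last_target=1.0, horizon=5, n_samples=100)
```

Because every path feeds its own samples back in, uncertainty compounds over the horizon, and the spread across paths gives an empirical predictive distribution per time step.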

encode(x: Dict[str, torch.Tensor]) → Union[Tuple[torch.Tensor, torch.Tensor], torch.Tensor]

Encode sequence into hidden state

forward(x: Dict[str, torch.Tensor], n_samples: Optional[int] = None) → Dict[str, torch.Tensor]

Forward network

classmethod from_dataset(dataset, allowed_encoder_known_variable_names: Optional[List[str]] = None, **kwargs)

Create model from dataset.

  • dataset – timeseries dataset

  • allowed_encoder_known_variable_names – List of known variables that are allowed in encoder, defaults to all

  • **kwargs – additional arguments such as hyperparameters for model (see __init__())


Returns

DeepAR network

plot_prediction(x: Dict[str, torch.Tensor], out: Dict[str, torch.Tensor], idx: int, add_loss_to_title: Union[pytorch_forecasting.metrics.Metric, torch.Tensor, bool] = False, show_future_observed: bool = True, ax=None, **kwargs) → matplotlib.figure.Figure

Plot predictions vs. actuals

  • x – network input

  • out – network output

  • idx – index of prediction to plot

  • add_loss_to_title – whether to add the loss to the title, or the loss function to calculate. Can be a metric, a bool indicating whether to use the loss metric, or a tensor which contains losses for all samples. Calculated losses are determined without weights. Defaults to False.

  • show_future_observed – whether to show actuals for the future. Defaults to True.

  • ax – matplotlib axes to plot on

  • quantiles_kwargs (Dict[str, Any]) – parameters for to_quantiles() of the loss metric.

  • prediction_kwargs (Dict[str, Any]) – parameters for to_prediction() of the loss metric.


Returns

matplotlib figure

predict(data: Union[, pandas.core.frame.DataFrame,], mode: Union[str, Tuple[str, str]] = 'prediction', return_index: bool = False, return_decoder_lengths: bool = False, batch_size: int = 64, num_workers: int = 0, fast_dev_run: bool = False, show_progress_bar: bool = False, return_x: bool = False, mode_kwargs: Optional[Dict[str, Any]] = None, n_samples: int = 100)

Predict on a dataloader, dataframe or dataset

  • data – dataloader, dataframe or dataset

  • mode – one of “prediction”, “quantiles”, “samples” or “raw”, or tuple ("raw", output_name) where output_name is a name in the dictionary returned by forward()

  • return_index – whether to return the prediction index (in the same order as the output, i.e. each row of the dataframe corresponds to the first dimension of the output and the given time index is the time index of the first prediction)

  • return_decoder_lengths – whether to return decoder_lengths (in the same order as the output)

  • batch_size – batch size for dataloader; only used if data is not a dataloader

  • num_workers – number of workers for dataloader; only used if data is not a dataloader

  • fast_dev_run – whether to return only the results of the first batch

  • show_progress_bar – whether to show a progress bar. Defaults to False.

  • return_x – whether to return network inputs (in the same order as prediction output)

  • mode_kwargs (Dict[str, Any]) – keyword arguments for to_prediction() or to_quantiles() for modes “prediction” and “quantiles”

  • n_samples – number of samples to draw. Defaults to 100.


Returns

output, x, index, decoder_lengths; some elements might not be present, depending on what is configured to be returned
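How the "samples", "prediction" and "quantiles" modes relate can be sketched with the standard library alone. The sketch assumes that the point prediction is the mean over samples and that quantiles are taken per time step across samples. The function summarize_samples and its nearest-rank quantile rule are hypothetical; the library's to_prediction() and to_quantiles() may differ in detail.

```python
from statistics import mean
from typing import Dict, List, Sequence


def summarize_samples(
    samples: Sequence[Sequence[float]],
    quantiles: Sequence[float] = (0.1, 0.5, 0.9),
) -> Dict[str, list]:
    """Reduce sample paths (n_samples x horizon) to a mean and per-step quantiles."""
    horizon = len(samples[0])
    prediction: List[float] = []
    quantile_rows: List[List[float]] = []
    for t in range(horizon):
        # All sampled values for time step t, in ascending order.
        column = sorted(path[t] for path in samples)
        prediction.append(mean(column))
        # Nearest-rank quantile: pick the value at the rounded fractional index.
        quantile_rows.append([column[round(q * (len(column) - 1))] for q in quantiles])
    return {"prediction": prediction, "quantiles": quantile_rows}


# Five sample paths over a two-step horizon.
samples = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0], [5.0, 10.0]]
summary = summarize_samples(samples)
```

In this picture, "samples" corresponds to the full samples array, while the other two modes are reductions of it over the sample dimension.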