class pytorch_forecasting.models.base_model.BaseModelWithCovariates(log_interval: int | float = -1, log_val_interval: float | int | None = None, learning_rate: float | List[float] = 0.001, log_gradient_flow: bool = False, loss: Metric = SMAPE(), logging_metrics: ModuleList = ModuleList(), reduce_on_plateau_patience: int = 1000, reduce_on_plateau_reduction: float = 2.0, reduce_on_plateau_min_lr: float = 1e-05, weight_decay: float = 0.0, optimizer_params: Dict[str, Any] | None = None, monotone_constaints: Dict[str, int] = {}, output_transformer: Callable | None = None, optimizer='Ranger')

Bases: BaseModel

Model with additional methods using covariates.

Assumes the following hyperparameters:

  • static_categoricals (List[str]) – names of static categorical variables

  • static_reals (List[str]) – names of static continuous variables

  • time_varying_categoricals_encoder (List[str]) – names of categorical variables for encoder

  • time_varying_categoricals_decoder (List[str]) – names of categorical variables for decoder

  • time_varying_reals_encoder (List[str]) – names of continuous variables for encoder

  • time_varying_reals_decoder (List[str]) – names of continuous variables for decoder

  • x_reals (List[str]) – order of continuous variables in tensor passed to forward function

  • x_categoricals (List[str]) – order of categorical variables in tensor passed to forward function

  • embedding_sizes (Dict[str, Tuple[int, int]]) – dictionary mapping categorical variables to tuple of integers where the first integer denotes the number of categorical classes and the second the embedding size

  • embedding_labels (Dict[str, List[str]]) – dictionary mapping (string) indices to list of categorical labels

  • embedding_paddings (List[str]) – names of categorical variables for which label 0 is always mapped to an embedding vector filled with zeros

  • categorical_groups (Dict[str, List[str]]) – dictionary of categorical variables that are grouped together and can also take multiple values simultaneously (e.g. a holiday during Oktoberfest). They should be implemented as a bag of embeddings

BaseModel for timeseries forecasting from which to inherit.

Parameters:

  • log_interval (Union[int, float], optional) – Batches after which predictions are logged. If < 1.0, will log multiple entries per batch. Defaults to -1.

  • log_val_interval (Union[int, float], optional) – batches after which predictions for validation are logged. Defaults to None, in which case log_interval is used.

  • learning_rate (float, optional) – Learning rate. Defaults to 1e-3.

  • log_gradient_flow (bool) – Whether to log gradient flow. This takes time and should only be done to diagnose training failures. Defaults to False.

  • loss (Metric, optional) – metric to optimize, can also be list of metrics. Defaults to SMAPE().

  • logging_metrics (nn.ModuleList[MultiHorizonMetric]) – list of metrics that are logged during training. Defaults to [].

  • reduce_on_plateau_patience (int) – patience after which the learning rate is reduced by the factor given in reduce_on_plateau_reduction. Defaults to 1000.

  • reduce_on_plateau_reduction (float) – factor by which the learning rate is divided when encountering a plateau. Defaults to 2.0.

  • reduce_on_plateau_min_lr (float) – minimum learning rate for the reduce-on-plateau learning rate scheduler. Defaults to 1e-5.

  • weight_decay (float) – weight decay. Defaults to 0.0.

  • optimizer_params (Dict[str, Any]) – additional parameters for the optimizer. Defaults to {}.

  • monotone_constaints (Dict[str, int]) – dictionary of monotonicity constraints for continuous decoder variables mapping position (e.g. "0" for first position) to constraint (-1 for negative and +1 for positive, larger numbers add more weight to the constraint vs. the loss but are usually not necessary). This constraint significantly slows down training. Defaults to {}.

  • output_transformer (Callable) – transformer that takes network output and transforms it to prediction space. Defaults to None which is equivalent to lambda out: out["prediction"].

  • optimizer (str) – Optimizer, “ranger”, “sgd”, “adam”, “adamw” or class name of optimizer in torch.optim or pytorch_optimizer. Alternatively, a class or function can be passed which takes parameters as first argument and a lr argument (optionally also weight_decay). Defaults to “ranger”.
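
The three reduce_on_plateau_* parameters interact with each other. The following is a minimal pure-Python sketch, assuming the learning rate is divided by reduce_on_plateau_reduction once the monitored loss has failed to improve for more than reduce_on_plateau_patience consecutive epochs, and is never lowered below reduce_on_plateau_min_lr. The function simulate_plateau_schedule is hypothetical and only mimics, in simplified form, the behaviour of a reduce-on-plateau scheduler; it is not the library's code.

```python
from typing import List, Sequence


def simulate_plateau_schedule(
    losses: Sequence[float],
    learning_rate: float = 1e-3,
    patience: int = 1000,
    reduction: float = 2.0,
    min_lr: float = 1e-5,
) -> List[float]:
    """Return the learning rate in effect after each epoch (illustrative only)."""
    best = float("inf")
    bad_epochs = 0
    lr = learning_rate
    history = []
    for loss in losses:
        if loss < best:
            # improvement: remember the new best loss and reset the counter
            best = loss
            bad_epochs = 0
        else:
            bad_epochs += 1
        if bad_epochs > patience:
            # plateau detected: divide the learning rate, but respect the floor
            lr = max(lr / reduction, min_lr)
            bad_epochs = 0
        history.append(lr)
    return history


if __name__ == "__main__":
    print(simulate_plateau_schedule([1.0, 0.9, 0.9, 0.9, 0.9, 0.9], patience=1, min_lr=4e-4))
```

With the default patience of 1000 the reduction effectively never triggers in short training runs, so a much smaller patience is needed for the scheduler to have any effect.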


calculate_prediction_actual_by_variable(x, ...)

Calculate predictions and actuals by variable, averaged over bins spanning from -std to +std

extract_features(x[, embeddings, period])

Extract features

from_dataset(dataset[, ...])

Create model from dataset and set parameters related to covariates.

plot_prediction_actual_by_variable(data[, ...])

Plot predictions and actual averages by variables

calculate_prediction_actual_by_variable(x: Dict[str, Tensor], y_pred: Tensor, normalize: bool = True, bins: int = 95, std: float = 2.0, log_scale: bool | None = None) → Dict[str, Dict[str, Tensor]]

Calculate predictions and actuals by variable, averaged over bins spanning from -std to +std

Parameters:

  • x – input as forward()

  • y_pred – predictions obtained by self(x, **kwargs)

  • normalize – whether to return normalized averages, i.e. the mean instead of the sum of y

  • bins – number of bins to calculate

  • std – number of standard deviations for standard scaled continuous variables

  • log_scale (bool, optional) – whether to plot in log space. If None, determined based on skew of values. Defaults to None.


Returns:

dictionary that can be used to plot averages with plot_prediction_actual_by_variable()

Return type:

Dict[str, Dict[str, torch.Tensor]]
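
The binning described above can be illustrated without tensors. The sketch below assumes a standard-scaled continuous variable, maps each value onto one of `bins` equally spaced positions between -std and +std (values outside that range fall into the outermost bins), and averages predictions and actuals per bin. The function average_by_bins and its flat return layout are hypothetical and simplified; they do not reproduce the dictionary returned by the library method.

```python
from typing import Dict, List, Optional, Sequence


def average_by_bins(
    values: Sequence[float],
    y_pred: Sequence[float],
    y_true: Sequence[float],
    bins: int = 95,
    std: float = 2.0,
    normalize: bool = True,
) -> Dict[str, List[Optional[float]]]:
    """Aggregate predictions and actuals over bins of a scaled variable."""
    sums_pred = [0.0] * bins
    sums_true = [0.0] * bins
    counts = [0] * bins
    for value, pred, actual in zip(values, y_pred, y_true):
        # map [-std, +std] linearly onto bin indices 0 .. bins - 1
        position = (value + std) / (2 * std) * (bins - 1)
        idx = min(max(int(round(position)), 0), bins - 1)
        sums_pred[idx] += pred
        sums_true[idx] += actual
        counts[idx] += 1
    if normalize:
        # mean per bin; empty bins are reported as None
        prediction = [s / c if c else None for s, c in zip(sums_pred, counts)]
        actual_avg = [s / c if c else None for s, c in zip(sums_true, counts)]
    else:
        # plain sums per bin
        prediction, actual_avg = sums_pred, sums_true
    return {"prediction": prediction, "actual": actual_avg, "support": counts}
```

The per-bin counts are returned alongside the averages because sparsely populated bins at the edges tend to produce noisy averages.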

extract_features(x, embeddings: MultiEmbedding | None = None, period: str = 'all') → Tensor

Extract features

Parameters:

  • x (Dict[str, torch.Tensor]) – input from the dataloader

  • embeddings (MultiEmbedding) – embeddings for categorical variables

  • period (str, optional) – One of “encoder”, “decoder” or “all”. Defaults to “all”.


Returns:

tensor with selected variables

Return type:

torch.Tensor
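
To illustrate what the period argument selects, here is a small sketch using nested lists in place of tensors. It assumes the dataloader dictionary holds the continuous inputs under the keys "encoder_cont" and "decoder_cont", each shaped batch x time x features, and that "all" concatenates encoder and decoder along the time axis. The helper select_period is hypothetical and leaves out embeddings and variable selection entirely.

```python
from typing import Dict, List

# batch x time x features, expressed as nested lists instead of tensors
Batch = List[List[List[float]]]


def select_period(x: Dict[str, Batch], period: str = "all", key: str = "cont") -> Batch:
    """Pick the encoder part, the decoder part, or both joined along time."""
    if period == "encoder":
        return x[f"encoder_{key}"]
    if period == "decoder":
        return x[f"decoder_{key}"]
    if period == "all":
        # concatenate the time steps of each sample: encoder first, then decoder
        return [enc + dec for enc, dec in zip(x[f"encoder_{key}"], x[f"decoder_{key}"])]
    raise ValueError(f"Unknown period: {period}")
```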


classmethod from_dataset(dataset: TimeSeriesDataSet, allowed_encoder_known_variable_names: List[str] | None = None, **kwargs) → LightningModule

Create model from dataset and set parameters related to covariates.

Parameters:

  • dataset – timeseries dataset

  • allowed_encoder_known_variable_names – List of known variables that are allowed in encoder, defaults to all

  • **kwargs – additional arguments such as hyperparameters for model (see __init__())
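
The role of allowed_encoder_known_variable_names can be sketched as follows. The example assumes that known variables always reach the decoder, that the encoder receives the allowed known variables plus all unknown ones, and that omitting the argument allows every known variable. ToyDataset is a hypothetical stand-in for TimeSeriesDataSet and covariate_kwargs is a hypothetical helper; neither is part of the library.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToyDataset:
    """Hypothetical stand-in holding only the variable name lists."""
    static_categoricals: List[str] = field(default_factory=list)
    static_reals: List[str] = field(default_factory=list)
    time_varying_known_categoricals: List[str] = field(default_factory=list)
    time_varying_known_reals: List[str] = field(default_factory=list)
    time_varying_unknown_categoricals: List[str] = field(default_factory=list)
    time_varying_unknown_reals: List[str] = field(default_factory=list)


def covariate_kwargs(
    dataset: ToyDataset,
    allowed_encoder_known_variable_names: Optional[List[str]] = None,
) -> Dict[str, List[str]]:
    """Derive covariate-related hyperparameters from a dataset (sketch)."""
    if allowed_encoder_known_variable_names is None:
        # default: every known variable may be used in the encoder
        allowed = dataset.time_varying_known_categoricals + dataset.time_varying_known_reals
    else:
        allowed = allowed_encoder_known_variable_names
    return {
        "static_categoricals": dataset.static_categoricals,
        "static_reals": dataset.static_reals,
        "time_varying_categoricals_encoder": [
            n for n in dataset.time_varying_known_categoricals if n in allowed
        ] + dataset.time_varying_unknown_categoricals,
        "time_varying_categoricals_decoder": dataset.time_varying_known_categoricals,
        "time_varying_reals_encoder": [
            n for n in dataset.time_varying_known_reals if n in allowed
        ] + dataset.time_varying_unknown_reals,
        "time_varying_reals_decoder": dataset.time_varying_known_reals,
    }
```

In this sketch, restricting the allowed names only changes the encoder lists; the decoder lists stay the same.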



plot_prediction_actual_by_variable(data: Dict[str, Dict[str, Tensor]], name: str | None = None, ax=None, log_scale: bool | None = None) → Dict[str, Figure] | Figure

Plot predictions and actual averages by variables

Parameters:

  • data (Dict[str, Dict[str, torch.Tensor]]) – data obtained from calculate_prediction_actual_by_variable()

  • name (str, optional) – name of variable for which to plot actuals vs predictions. Defaults to None which means returning a dictionary of plots for all variables.

  • log_scale (bool, optional) – whether to plot in log space. If None, determined based on skew of values. Defaults to None.


Raises:

ValueError – if the variable name is unknown


Returns:

matplotlib figure

Return type:

Union[Dict[str, plt.Figure], plt.Figure]
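
How a log_scale of None could be resolved from the skew of the values is sketched below. The example computes the (biased) sample skewness in pure Python and switches to a log axis when it exceeds a threshold. Both helper names are hypothetical, and the threshold of 1.6 is an assumption made for illustration, not a documented value.

```python
from typing import Optional, Sequence


def sample_skewness(values: Sequence[float]) -> float:
    """Biased sample skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        # constant data has no skew
        return 0.0
    return m3 / m2 ** 1.5


def choose_log_scale(
    values: Sequence[float],
    log_scale: Optional[bool] = None,
    skew_threshold: float = 1.6,  # assumed threshold, for illustration only
) -> bool:
    """Honour an explicit choice; otherwise decide from the skew."""
    if log_scale is not None:
        return log_scale
    return sample_skewness(values) > skew_threshold
```

Passing True or False explicitly bypasses the heuristic, which is useful when comparing plots across variables.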

property categorical_groups_mapping: Dict[str, str]

Mapping of categorical variables to categorical groups
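
One plausible way to derive such a mapping is to invert the categorical_groups hyperparameter, which maps a group name to its member variables. The standalone function below is a hypothetical sketch of that inversion and is not taken from the library source.

```python
from typing import Dict, List


def invert_categorical_groups(categorical_groups: Dict[str, List[str]]) -> Dict[str, str]:
    """Map each member variable to the name of the group that contains it."""
    return {
        name: group
        for group, names in categorical_groups.items()
        for name in names
    }
```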

property categoricals: List[str]

List of all categorical variables in model

property decoder_variables: List[str]

List of all decoder variables in model (excluding static variables)

property encoder_variables: List[str]

List of all encoder variables in model (excluding static variables)

property reals: List[str]

List of all continuous variables in model

property static_variables: List[str]

List of all static variables in model
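
The variable-list properties above can be read as simple combinations of the hyperparameters listed at the top of this page. The sketch below assumes that encoder and decoder lists join the categorical and continuous names for the respective period, and that categoricals and reals are the de-duplicated unions across static, encoder and decoder variables. CovariateNames is a hypothetical container, not a library class.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CovariateNames:
    """Hypothetical holder for the covariate-related hyperparameters."""
    static_categoricals: List[str] = field(default_factory=list)
    static_reals: List[str] = field(default_factory=list)
    time_varying_categoricals_encoder: List[str] = field(default_factory=list)
    time_varying_categoricals_decoder: List[str] = field(default_factory=list)
    time_varying_reals_encoder: List[str] = field(default_factory=list)
    time_varying_reals_decoder: List[str] = field(default_factory=list)

    @property
    def static_variables(self) -> List[str]:
        return self.static_categoricals + self.static_reals

    @property
    def encoder_variables(self) -> List[str]:
        # static variables are deliberately excluded
        return self.time_varying_categoricals_encoder + self.time_varying_reals_encoder

    @property
    def decoder_variables(self) -> List[str]:
        return self.time_varying_categoricals_decoder + self.time_varying_reals_decoder

    @property
    def categoricals(self) -> List[str]:
        # dict.fromkeys removes duplicates while keeping first-seen order
        return list(dict.fromkeys(
            self.static_categoricals
            + self.time_varying_categoricals_encoder
            + self.time_varying_categoricals_decoder
        ))

    @property
    def reals(self) -> List[str]:
        return list(dict.fromkeys(
            self.static_reals
            + self.time_varying_reals_encoder
            + self.time_varying_reals_decoder
        ))
```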

property target_positions: LongTensor

Positions of target variable(s) in covariates.


Returns:

tensor of positions.

Return type:

torch.LongTensor