pytorch_forecasting.models.nhits.NHiTS

class pytorch_forecasting.models.nhits.NHiTS(output_size: int | list[int] = 1, static_categoricals: list[str] | None = None, static_reals: list[str] | None = None, time_varying_categoricals_encoder: list[str] | None = None, time_varying_categoricals_decoder: list[str] | None = None, categorical_groups: dict[str, list[str]] | None = None, time_varying_reals_encoder: list[str] | None = None, time_varying_reals_decoder: list[str] | None = None, embedding_sizes: dict[str, tuple[int, int]] | None = None, embedding_paddings: list[str] | None = None, embedding_labels: list[str] | None = None, x_reals: list[str] | None = None, x_categoricals: list[str] | None = None, context_length: int = 1, prediction_length: int = 1, static_hidden_size: int | None = None, naive_level: bool = True, shared_weights: bool = True, activation: str = 'ReLU', initialization: str = 'lecun_normal', n_blocks: list[str] | None = None, n_layers: int | list[int] = 2, hidden_size: int = 512, pooling_sizes: list[int] | None = None, downsample_frequencies: list[int] | None = None, pooling_mode: str = 'max', interpolation_mode: str = 'linear', batch_normalization: bool = False, dropout: float = 0.0, learning_rate: float = 0.01, log_interval: int = -1, log_gradient_flow: bool = False, log_val_interval: int = None, weight_decay: float = 0.001, loss: MultiHorizonMetric = None, reduce_on_plateau_patience: int = 1000, backcast_loss_ratio: float = 0.0, logging_metrics: ModuleList = None, **kwargs)

Initialize N-HiTS Model - use its from_dataset() method if possible.

Based on the article N-HiTS: Neural Hierarchical Interpolation for Time Series Forecasting. The network has been shown to increase accuracy by ~25% compared with NBeats and also supports covariates.

Parameters:
  • hidden_size (int, default=512) – size of the hidden layers. Can range from 8 to 1024; use 32-128 if no covariates are employed.

  • static_hidden_size (int, optional) – size of hidden layers for static variables. Defaults to hidden_size.

  • loss (MultiHorizonMetric, default=MASE()) – loss to optimize. QuantileLoss is also supported.

  • shared_weights (bool, default=True) – if True, weights of blocks are shared in each stack.

  • naive_level (bool, default=True) – if True, a naive forecast of the last observation is added at the beginning.

  • initialization (str, default="lecun_normal") – Initialization method. One of [‘orthogonal’, ‘he_uniform’, ‘glorot_uniform’, ‘glorot_normal’, ‘lecun_normal’].

  • n_blocks (list of int, default=[1, 1, 1]) – list of blocks used in each stack (i.e. length of stacks).

  • n_layers (int or list of int, default=2) – Number of layers per block or list of number of layers used by blocks in each stack (i.e. length of stacks).

  • pooling_sizes (list of int, optional) – List of pooling sizes for input for each stack, i.e. higher means more smoothing of input. Using an ordering of higher to lower in the list improves results. Defaults to a heuristic.

  • pooling_mode (str, default="max") – Pooling mode for summarizing input. One of [‘max’, ‘average’].

  • downsample_frequencies (list of int, optional) – Downsample multiplier of output for each stack, i.e. higher means more interpolation at forecast time is required. Should be equal to or higher than pooling_sizes but smaller than or equal to prediction_length. Defaults to a heuristic to match pooling_sizes.

  • interpolation_mode (str, default="linear") – Interpolation mode for forecasting. One of [‘linear’, ‘nearest’, ‘cubic-x’] where ‘x’ is replaced by a batch size for the interpolation.

  • batch_normalization (bool, default=False) – Whether to carry out batch normalization.

  • dropout (float, default=0.0) – dropout rate for hidden layers.

  • activation (str, default="ReLU") – activation function. One of [‘ReLU’, ‘Softplus’, ‘Tanh’, ‘SELU’, ‘LeakyReLU’, ‘PReLU’, ‘Sigmoid’].

  • output_size (int or list of int, default=1) – number of outputs (typically number of quantiles for QuantileLoss and one target or list of output sizes but currently only point-forecasts allowed). Set automatically.

  • static_categoricals (list of str, optional) – names of static categorical variables

  • static_reals (list of str, optional) – names of static continuous variables

  • time_varying_categoricals_encoder (list of str, optional) – names of categorical variables for encoder

  • time_varying_categoricals_decoder (list of str, optional) – names of categorical variables for decoder

  • time_varying_reals_encoder (list of str, optional) – names of continuous variables for encoder

  • time_varying_reals_decoder (list of str, optional) – names of continuous variables for decoder

  • categorical_groups (Dict[str, list of str], optional) – dictionary where each value is a list of categorical variables that together form a new categorical variable, whose name is the key in the dictionary

  • x_reals (list of str, optional) – order of continuous variables in tensor passed to forward function

  • x_categoricals (list of str, optional) – order of categorical variables in tensor passed to forward function

  • hidden_continuous_size (int, optional) – default for hidden size for processing continuous variables (similar to categorical embedding size)

  • hidden_continuous_sizes (Dict[int, int], optional) – dictionary mapping continuous input indices to sizes for variable selection (fallback to hidden_continuous_size if index is not in dictionary)

  • embedding_sizes (Dict[str, tuple of (int, int)], optional) – dictionary mapping (string) indices to tuple of number of categorical classes and embedding size

  • embedding_paddings (list of str, optional) – list of indices for embeddings which transform the zero’s embedding to a zero vector

  • embedding_labels (Dict[str, list of str], optional) – dictionary mapping (string) indices to list of categorical labels

  • learning_rate (float, default=1e-2) – learning rate

  • log_interval (int, default=-1) – log predictions every x batches; do not log if 0 or less, log interpretation if > 0. If < 1.0, will log multiple entries per batch.

  • log_val_interval (int, optional) – frequency with which to log validation set metrics, defaults to log_interval

  • log_gradient_flow (bool, default=False) – whether to log gradient flow; this takes time and should only be done to diagnose training failures

  • prediction_length (int, default=1) – Length of the prediction. Also known as ‘horizon’.

  • context_length (int, default=1) – Number of time units that condition the predictions. Also known as ‘lookback period’. Should be between 1 and 10 times the prediction length.

  • backcast_loss_ratio (float, default=0.0) – weight of backcast in comparison to forecast when calculating the loss. A weight of 1.0 means that forecast and backcast losses are weighted the same (regardless of backcast and forecast lengths). Defaults to 0.0, i.e. no weight.

  • reduce_on_plateau_patience (int, default=1000) – patience after which learning rate is reduced by a factor of 10

  • logging_metrics (nn.ModuleList[MultiHorizonMetric], optional) – list of metrics that are logged during training. Defaults to nn.ModuleList([SMAPE(), MAE(), RMSE(), MAPE(), MASE()])

  • **kwargs – additional arguments to BaseModel.
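The interplay of pooling_sizes, downsample_frequencies and interpolation_mode can be illustrated with a small, library-free sketch. It is not the implementation used by NHiTS: the helper names (max_pool, linear_interpolate, check_stack_settings) are hypothetical, the constraint is assumed to apply stack by stack, and the endpoint-aligned interpolation is a simplification chosen for readability.

```python
def max_pool(series, kernel_size):
    """Summarize the input with the max over non-overlapping windows
    (a simplified stand-in for pooling_mode='max')."""
    return [max(series[i:i + kernel_size]) for i in range(0, len(series), kernel_size)]


def linear_interpolate(knots, length):
    """Stretch a coarse forecast (`knots`) to `length` points.

    Assumption: first and last knots map to the first and last output points.
    """
    if len(knots) == 1:
        return [float(knots[0])] * length
    if length == 1:
        return [float(knots[0])]
    out = []
    for i in range(length):
        pos = i * (len(knots) - 1) / (length - 1)  # position on the knot axis
        lo = min(int(pos), len(knots) - 2)         # index of the left knot
        frac = pos - lo
        out.append(knots[lo] * (1 - frac) + knots[lo + 1] * frac)
    return out


def check_stack_settings(pooling_sizes, downsample_frequencies, prediction_length):
    """Check the documented constraint, assumed here to hold per stack:
    pooling size <= downsample frequency <= prediction_length."""
    if len(pooling_sizes) != len(downsample_frequencies):
        raise ValueError("one pooling size and one downsample frequency per stack")
    for pool, freq in zip(pooling_sizes, downsample_frequencies):
        if not pool <= freq <= prediction_length:
            raise ValueError(f"invalid pair: pooling={pool}, downsample={freq}")


# Higher-to-lower ordering, as recommended for pooling_sizes.
check_stack_settings([4, 2, 1], [4, 2, 1], prediction_length=8)

pooled = max_pool([1, 3, 2, 5, 4, 6, 8, 7], kernel_size=2)  # [3, 5, 6, 8]
forecast = linear_interpolate([10.0, 20.0], length=5)       # [10.0, 12.5, 15.0, 17.5, 20.0]
```

A larger pooling size gives a stack a smoother, shorter view of the input, while a larger downsample frequency means the stack emits fewer forecast points that then have to be interpolated up to prediction_length.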

Methods

__call__(*args, **kwargs)

Call self as a function.

__delattr__(name)

Implement delattr(self, name).

__dir__()

Default dir() implementation.

__eq__(value, /)

Return self==value.

__format__(format_spec, /)

Default object formatter.

__ge__(value, /)

Return self>=value.

__getattr__(name)

__getattribute__(name, /)

Return getattr(self, name).

__getstate__()

Helper for pickle.

__gt__(value, /)

Return self>value.

__hash__()

Return hash(self).

__init_subclass__

This method is called when a class is subclassed.

__le__(value, /)

Return self<=value.

__lt__(value, /)

Return self<value.

__ne__(value, /)

Return self!=value.

__new__(*args, **kwargs)

__reduce__()

Helper for pickle.

__reduce_ex__(protocol, /)

Helper for pickle.

__repr__()

Return repr(self).

__setattr__(name, value)

Implement setattr(self, name, value).

__setstate__(state)

__sizeof__()

Size of object in memory, in bytes.

__str__()

Return str(self).

__subclasshook__

Abstract classes can override this to customize issubclass().

_apply(fn[, recurse])

_apply_batch_transfer_handler(batch[, ...])

_call_batch_hook(hook_name, *args)

_call_impl(*args, **kwargs)

_get_backward_hooks()

Return the backward hooks for use in the call function.

_get_backward_pre_hooks()

_get_name()

_load_from_state_dict(state_dict, prefix, ...)

Copy parameters and buffers from state_dict into only this module, but not its descendants.

_log_dict_through_fabric(dictionary[, logger])

_logger_supports(method)

Whether logger supports method.

_maybe_warn_non_full_backward_hook(inputs, ...)

_named_members(get_members_fn[, prefix, ...])

Help yield various names + members of modules.

_on_before_batch_transfer(batch[, ...])

_pkg()

Package for the model.

_register_load_state_dict_pre_hook(hook[, ...])

See register_load_state_dict_pre_hook() for details.

_register_state_dict_hook(hook)

Register a post-hook for the state_dict() method.

_replicate_for_data_parallel()

_save_to_state_dict(destination, prefix, ...)

Save module state to the destination dictionary.

_set_hparams(hp)

_slow_forward(*input, **kwargs)

_to_hparams_dict(hp)

_verify_is_manual_optimization(fn_name)

_wrapped_call_impl(*args, **kwargs)

add_module(name, module)

Add a child module to the current module.

all_gather(data[, group, sync_grads])

Gather tensors or collections of tensors from multiple processes.

apply(fn)

Apply fn recursively to every submodule (as returned by .children()) as well as self.

backward(loss, *args, **kwargs)

Called to perform backward on the loss returned in training_step().

bfloat16()

Casts all floating point parameters and buffers to bfloat16 datatype.

buffers([recurse])

Return an iterator over module buffers.

calculate_prediction_actual_by_variable(x, ...)

Calculate predictions and actuals by variable, averaged by bins spanning from -std to +std.

children()

Return an iterator over immediate children modules.

clip_gradients(optimizer[, ...])

Handles gradient clipping internally.

compile(*args, **kwargs)

Compile this Module's forward using torch.compile().

configure_callbacks()

Configure model-specific callbacks.

configure_gradient_clipping(optimizer[, ...])

Perform gradient clipping for the optimizer parameters.

configure_model()

Hook to create modules in a strategy and precision aware context.

configure_optimizers()

Configure optimizers.

configure_sharded_model()

Deprecated.

cpu()

See torch.nn.Module.cpu().

create_log(x, y, out, batch_idx[, ...])

Create the log used in the training and validation step.

cuda([device])

Moves all model parameters and buffers to the GPU.

deduce_default_output_parameters(dataset, kwargs)

Deduce default parameters for output for from_dataset() method.

double()

See torch.nn.Module.double().

eval()

Set the module in evaluation mode.

extra_repr()

Return extra information about parameters for representation/logging.

extract_features(x[, embeddings, period])

Extract features

float()

See torch.nn.Module.float().

forward(x)

Pass forward of network.

freeze()

Freeze all params for inference.

from_dataset(dataset, **kwargs)

Convenience function to create network from TimeSeriesDataSet.

get_buffer(target)

Return the buffer given by target if it exists, otherwise throw an error.

get_extra_state()

Return any extra state to include in the module's state_dict.

get_parameter(target)

Return the parameter given by target if it exists, otherwise throw an error.

get_submodule(target)

Return the submodule given by target if it exists, otherwise throw an error.

half()

See torch.nn.Module.half().

ipu([device])

Move all model parameters and buffers to the IPU.

load_from_checkpoint(checkpoint_path[, ...])

Primary way of loading a model from a checkpoint.

load_state_dict(state_dict[, strict, assign])

Copy parameters and buffers from state_dict into this module and its descendants.

log(*args, **kwargs)

See lightning.pytorch.core.lightning.LightningModule.log().

log_dict(dictionary[, prog_bar, logger, ...])

Log a dictionary of values at once.

log_gradient_flow(named_parameters)

Log distribution of gradients to identify exploding / vanishing gradients.

log_interpretation(x, out, batch_idx)

Log interpretation of network predictions in tensorboard.

log_metrics(x, y, out[, prediction_kwargs])

Log metrics every training/validation step.

log_prediction(x, out, batch_idx, **kwargs)

Log metrics every training/validation step.

lr_scheduler_step(scheduler, metric)

Override this method to adjust the default way the Trainer calls each scheduler.

lr_schedulers()

Returns the learning rate scheduler(s) that are being used during training.

manual_backward(loss, *args, **kwargs)

Call this directly from your training_step() when doing optimizations manually.

modules([remove_duplicate])

Return an iterator over all modules in the network.

mtia([device])

Move all model parameters and buffers to the MTIA.

named_buffers([prefix, recurse, ...])

Return an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.

named_children()

Return an iterator over immediate children modules, yielding both the name of the module as well as the module itself.

named_modules([memo, prefix, remove_duplicate])

Return an iterator over all modules in the network, yielding both the name of the module as well as the module itself.

named_parameters([prefix, recurse, ...])

Return an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.

on_after_backward()

Log gradient flow for debugging.

on_after_batch_transfer(batch, dataloader_idx)

Override to alter or apply batch augmentations to your batch after it is transferred to the device.

on_before_backward(loss)

Called before loss.backward().

on_before_batch_transfer(batch, dataloader_idx)

Override to alter or apply batch augmentations to your batch before it is transferred to the device.

on_before_optimizer_step(optimizer)

Called before optimizer.step().

on_before_zero_grad(optimizer)

Called after training_step() and before optimizer.zero_grad().

on_epoch_end(outputs)

Run at epoch end for training or validation.

on_fit_end()

Called at the very end of fit.

on_fit_start()

Called at the very beginning of fit.

on_load_checkpoint(checkpoint)

Called by Lightning to restore your model.

on_predict_batch_end(outputs, batch, batch_idx)

Called in the predict loop after the batch.

on_predict_batch_start(batch, batch_idx[, ...])

Called in the predict loop before anything happens for that batch.

on_predict_end()

Called at the end of predicting.

on_predict_epoch_end()

Called at the end of predicting.

on_predict_epoch_start()

Called at the beginning of predicting.

on_predict_model_eval()

Called when the predict loop starts.

on_predict_start()

Called at the beginning of predicting.

on_save_checkpoint(checkpoint)

Called by Lightning when saving a checkpoint to give you a chance to store anything else you might want to save.

on_test_batch_end(outputs, batch, batch_idx)

Called in the test loop after the batch.

on_test_batch_start(batch, batch_idx[, ...])

Called in the test loop before anything happens for that batch.

on_test_end()

Called at the end of testing.

on_test_epoch_end()

Called in the test loop at the very end of the epoch.

on_test_epoch_start()

Called in the test loop at the very beginning of the epoch.

on_test_model_eval()

Called when the test loop starts.

on_test_model_train()

Called when the test loop ends.

on_test_start()

Called at the beginning of testing.

on_train_batch_end(outputs, batch, batch_idx)

Called in the training loop after the batch.

on_train_batch_start(batch, batch_idx)

Called in the training loop before anything happens for that batch.

on_train_end()

Called at the end of training before logger experiment is closed.

on_train_epoch_end()

Called in the training loop at the very end of the epoch.

on_train_epoch_start()

Called in the training loop at the very beginning of the epoch.

on_train_start()

Called at the beginning of training after sanity check.

on_validation_batch_end(outputs, batch, ...)

Called in the validation loop after the batch.

on_validation_batch_start(batch, batch_idx)

Called in the validation loop before anything happens for that batch.

on_validation_end()

Called at the end of validation.

on_validation_epoch_end()

Called in the validation loop at the very end of the epoch.

on_validation_epoch_start()

Called in the validation loop at the very beginning of the epoch.

on_validation_model_eval()

Called when the validation loop starts.

on_validation_model_train()

Called when the validation loop ends.

on_validation_model_zero_grad()

Called by the training loop to release gradients before entering the validation loop.

on_validation_start()

Called at the beginning of validation.

optimizer_step(epoch, batch_idx, optimizer)

Override this method to adjust the default way the Trainer calls the optimizer.

optimizer_zero_grad(epoch, batch_idx, optimizer)

Override this method to change the default behaviour of optimizer.zero_grad().

optimizers([use_pl_optimizer])

Returns the optimizer(s) that are being used during training.

parameters([recurse])

Return an iterator over module parameters.

plot_interpretation(x, output, idx[, ax])

Plot interpretation.

plot_prediction(x, out[, idx, ...])

Plot predictions vs. actuals.

plot_prediction_actual_by_variable(data[, ...])

Plot predictions and actual averages by variables.

predict(data[, mode, return_index, ...])

Run inference / prediction.

predict_dataloader()

An iterable or collection of iterables specifying prediction samples.

predict_dependency(data, variable, values[, ...])

Predict partial dependency.

predict_step(batch, batch_idx)

Step function called during predict().

prepare_data()

Use this to download and prepare data.

print(*args, **kwargs)

Prints only from process 0.

register_backward_hook(hook)

Register a backward hook on the module.

register_buffer(name, tensor[, persistent])

Add a buffer to the module.

register_forward_hook(hook, *[, prepend, ...])

Register a forward hook on the module.

register_forward_pre_hook(hook, *[, ...])

Register a forward pre-hook on the module.

register_full_backward_hook(hook[, prepend])

Register a backward hook on the module.

register_full_backward_pre_hook(hook[, prepend])

Register a backward pre-hook on the module.

register_load_state_dict_post_hook(hook)

Register a post-hook to be run after module's load_state_dict() is called.

register_load_state_dict_pre_hook(hook)

Register a pre-hook to be run before module's load_state_dict() is called.

register_module(name, module)

Alias for add_module().

register_parameter(name, param)

Add a parameter to the module.

register_state_dict_post_hook(hook)

Register a post-hook for the state_dict() method.

register_state_dict_pre_hook(hook)

Register a pre-hook for the state_dict() method.

remove_ignored_hparams(ignore_list)

Remove ignored hyperparameters from the stored state.

requires_grad_([requires_grad])

Change if autograd should record operations on parameters in this module.

save_hyperparameters(*args[, ignore, frame, ...])

Save arguments to hparams attribute.

set_extra_state(state)

Set extra state contained in the loaded state_dict.

set_submodule(target, module[, strict])

Set the submodule given by target if it exists, otherwise throw an error.

setup(stage)

Called at the beginning of fit (train + validate), validate, test, or predict.

share_memory()

See torch.Tensor.share_memory_().

size()

Get the number of parameters in the model.

state_dict(*args[, destination, prefix, ...])

Return a dictionary containing references to the whole state of the module.

step(x, y, batch_idx)

Take training / validation step.

teardown(stage)

Called at the end of fit (train + validate), validate, test, or predict.

test_dataloader()

An iterable or collection of iterables specifying test samples.

test_step(batch, batch_idx)

Operates on a single batch of data from the test set.

to(*args, **kwargs)

See torch.nn.Module.to().

to_empty(*, device[, recurse])

Move the parameters and buffers to the specified device without copying storage.

to_network_output(**results)

Convert output into a named (and immutable) tuple.

to_onnx([file_path, input_sample])

Saves the model in ONNX format.

to_prediction(out[, use_metric])

Convert output to prediction using the loss metric.

to_quantiles(out[, use_metric])

Convert output to quantiles using the loss metric.

to_tensorrt([file_path, input_sample, ir, ...])

Export the model to ScriptModule or GraphModule using TensorRT compile backend.

to_torchscript([file_path, method, ...])

By default compiles the whole model to a torch.jit.ScriptModule.

toggle_optimizer(optimizer)

Makes sure only the gradients of the current optimizer's parameters are calculated in the training step to prevent dangling gradients in multiple-optimizer setup.

toggled_optimizer(optimizer)

Makes sure only the gradients of the current optimizer's parameters are calculated in the training step to prevent dangling gradients in multiple-optimizer setup.

train([mode])

Set the module in training mode.

train_dataloader()

An iterable or collection of iterables specifying training samples.

training_step(batch, batch_idx)

Train on batch.

transfer_batch_to_device(batch, device, ...)

Override this hook if your DataLoader returns tensors wrapped in a custom data structure.

transform_output(prediction, target_scale[, ...])

Extract prediction from network output and rescale it to real space / de-normalize it.

type(dst_type)

See torch.nn.Module.type().

unfreeze()

Unfreeze all parameters for training.

untoggle_optimizer(optimizer)

Resets the state of required gradients that were toggled with toggle_optimizer().

val_dataloader()

An iterable or collection of iterables specifying validation samples.

validation_step(batch, batch_idx)

Operates on a single batch of data from the validation set.

xpu([device])

Move all model parameters and buffers to the XPU.

zero_grad([set_to_none])

Reset gradients of all model parameters.

Attributes

CHECKPOINT_HYPER_PARAMS_KEY

CHECKPOINT_HYPER_PARAMS_NAME

CHECKPOINT_HYPER_PARAMS_SPECIAL_KEY

CHECKPOINT_HYPER_PARAMS_TYPE

T_destination

__annotations__

__dict__

__doc__

__jit_unused_properties__

__module__

__weakref__

list of weak references to the object

_compiled_call_impl

_jit_is_scripting

_version

This allows better BC support for load_state_dict().

automatic_optimization

If set to False you are responsible for calling .backward(), .step(), .zero_grad().

call_super_init

categorical_groups_mapping

Mapping of categorical variables to categorical groups

categoricals

List of all categorical variables in model

current_epoch

The current epoch in the Trainer, or 0 if not attached.

current_stage

Available inside lightning loops.

decoder_covariate_size

Decoder covariates size.

decoder_variables

List of all decoder variables in model (excluding static variables)

device

device_mesh

Strategies like ModelParallelStrategy will create a device mesh that can be accessed in the configure_model() hook to parallelize the LightningModule.

dtype

dump_patches

encoder_covariate_size

Encoder covariate size.

encoder_variables

List of all encoder variables in model (excluding static variables)

example_input_array

The example input array is a specification of what the module can consume in the forward() method.

fabric

global_rank

The index of the current process across all nodes and devices.

global_step

Total training batches seen across all epochs.

hparams

The collection of hyperparameters saved with save_hyperparameters().

hparams_initial

The collection of hyperparameters saved with save_hyperparameters().

local_rank

The index of the current process within a single node.

log_interval

Log interval depending if training or validating

logger

Reference to the logger object in the Trainer.

loggers

Reference to the list of loggers in the Trainer.

n_stacks

Number of stacks.

n_targets

Number of targets to forecast.

on_gpu

Returns True if this model is currently located on a GPU.

predicting

reals

List of all continuous variables in model

static_size

Static covariate size.

static_variables

List of all static variables in model

strict_loading

Determines how Lightning loads this model using .load_state_dict(..., strict=model.strict_loading).

target_names

List of targets that are predicted.

target_positions

Positions of target variable(s) in covariates.

trainer

training

_parameters

_buffers

_non_persistent_buffers_set

_backward_pre_hooks

_backward_hooks

_is_full_backward_hook

_forward_hooks

_forward_hooks_with_kwargs

_forward_hooks_always_called

_forward_pre_hooks

_forward_pre_hooks_with_kwargs

_state_dict_hooks

_load_state_dict_pre_hooks

_state_dict_pre_hooks

_load_state_dict_post_hooks

_modules