pytorch_forecasting.models.nhits.NHiTS#

class pytorch_forecasting.models.nhits.NHiTS(output_size: int | list[int] = 1, static_categoricals: list[str] | None = None, static_reals: list[str] | None = None, time_varying_categoricals_encoder: list[str] | None = None, time_varying_categoricals_decoder: list[str] | None = None, categorical_groups: dict[str, list[str]] | None = None, time_varying_reals_encoder: list[str] | None = None, time_varying_reals_decoder: list[str] | None = None, embedding_sizes: dict[str, tuple[int, int]] | None = None, embedding_paddings: list[str] | None = None, embedding_labels: list[str] | None = None, x_reals: list[str] | None = None, x_categoricals: list[str] | None = None, context_length: int = 1, prediction_length: int = 1, static_hidden_size: int | None = None, naive_level: bool = True, shared_weights: bool = True, activation: str = 'ReLU', initialization: str = 'lecun_normal', n_blocks: list[str] | None = None, n_layers: int | list[int] = 2, hidden_size: int = 512, pooling_sizes: list[int] | None = None, downsample_frequencies: list[int] | None = None, pooling_mode: str = 'max', interpolation_mode: str = 'linear', batch_normalization: bool = False, dropout: float = 0.0, learning_rate: float = 0.01, log_interval: int = -1, log_gradient_flow: bool = False, log_val_interval: int = None, weight_decay: float = 0.001, loss: MultiHorizonMetric = None, reduce_on_plateau_patience: int = 1000, backcast_loss_ratio: float = 0.0, logging_metrics: ModuleList = None, **kwargs)[source]#

Initialize N-HiTS Model - use its from_dataset() method if possible.

Based on the article N-HiTS: Neural Hierarchical Interpolation for Time Series Forecasting. The network has been shown to increase accuracy by ~25% against NBeats and also supports covariates.
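
The hierarchical interpolation idea named in the article title can be illustrated with a minimal, library-free sketch. It is not the implementation used by this class: the helper names `max_pool`, `n_knots` and `linear_interpolate` are hypothetical, the knot count formula is an assumption, and the interpolation is a simple endpoint-aligned scheme whose numerics may differ from the library's. It only shows the roles of `pooling_sizes` (smoothing the input) and `downsample_frequencies` (predicting a few values and interpolating them to the full horizon).

```python
def max_pool(values, size):
    """Non-overlapping max pooling; the last window may be shorter."""
    return [max(values[i:i + size]) for i in range(0, len(values), size)]


def n_knots(prediction_length, downsample_frequency):
    """Assumed number of values a stack predicts before interpolation."""
    return max(prediction_length // downsample_frequency, 1)


def linear_interpolate(knots, length):
    """Stretch `knots` to `length` points, keeping the first and last knot fixed."""
    if len(knots) == 1 or length == 1:
        return [knots[0]] * length
    out = []
    for i in range(length):
        # Fractional position of output point i on the knot axis.
        pos = i * (len(knots) - 1) / (length - 1)
        lo = int(pos)
        hi = min(lo + 1, len(knots) - 1)
        frac = pos - lo
        out.append(knots[lo] * (1 - frac) + knots[hi] * frac)
    return out


# A larger pooling size summarizes the lookback window more coarsely.
pooled = max_pool([1, 3, 2, 5, 4], 2)

# A larger downsample frequency means fewer predicted knots and more interpolation.
knots_needed = n_knots(prediction_length=12, downsample_frequency=4)

# Two knots stretched to a horizon of five steps.
forecast = linear_interpolate([0.0, 2.0], 5)
```

In this sketch a stack with a large pooling size and a large downsample frequency captures slow-moving structure, while a stack with both set to 1 works at full resolution.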

Parameters:

hidden_size (int, default=512) – size of hidden layers; can range from 8 to 1024. Use 32-128 if no covariates are employed.
static_hidden_size (int, optional) – size of hidden layers for static variables. Defaults to hidden_size.
loss (MultiHorizonMetric, default=MASE()) – loss to optimize. QuantileLoss is also supported.
shared_weights (bool, default=True) – if True, weights of blocks are shared in each stack.
naive_level (bool, default=True) – if True, naive forecast of last observation is added at the beginning.
initialization (str, default="lecun_normal") – Initialization method. One of [‘orthogonal’, ‘he_uniform’, ‘glorot_uniform’, ‘glorot_normal’, ‘lecun_normal’].
n_blocks (list of int, default=[1, 1, 1]) – number of blocks used in each stack (i.e. length of stacks).
n_layers (int or list of int, default=2) – number of layers per block, or a list with the number of layers used by the blocks in each stack (i.e. length of stacks).
pooling_sizes (list of int, optional) – List of pooling sizes for input for each stack, i.e. higher means more smoothing of input. Using an ordering of higher to lower in the list improves results. Defaults to a heuristic.
pooling_mode (str, default="max") – Pooling mode for summarizing input. One of [‘max’, ‘average’].
downsample_frequencies (list of int, optional) – Downsample multiplier of output for each stack, i.e. higher means more interpolation at forecast time is required. Should be equal to or higher than pooling_sizes but less than or equal to prediction_length. Defaults to a heuristic to match pooling_sizes.
interpolation_mode (str, default="linear") – Interpolation mode for forecasting. One of [‘linear’, ‘nearest’, ‘cubic-x’] where ‘x’ is replaced by a batch size for the interpolation.
batch_normalization (bool, default=False) – Whether to carry out batch normalization.
dropout (float, default=0.0) – dropout rate for hidden layers.
activation (str, default="ReLU") – activation function. One of [‘ReLU’, ‘Softplus’, ‘Tanh’, ‘SELU’, ‘LeakyReLU’, ‘PReLU’, ‘Sigmoid’].
output_size (int or list of int, default=1) – number of outputs (typically the number of quantiles for QuantileLoss with one target, or a list of output sizes; currently only point forecasts are allowed). Set automatically.
static_categoricals (list of str, optional) – names of static categorical variables
static_reals (list of str, optional) – names of static continuous variables
time_varying_categoricals_encoder (list of str, optional) – names of categorical variables for encoder
time_varying_categoricals_decoder (list of str, optional) – names of categorical variables for decoder
time_varying_reals_encoder (list of str, optional) – names of continuous variables for encoder
time_varying_reals_decoder (list of str, optional) – names of continuous variables for decoder
categorical_groups (Dict[str, list of str], optional) – dictionary whose values are lists of categorical variables that together form a new categorical variable, which is the key in the dictionary
x_reals (list of str, optional) – order of continuous variables in tensor passed to forward function
x_categoricals (list of str, optional) – order of categorical variables in tensor passed to forward function
hidden_continuous_size (int, optional) – default for hidden size for processing continuous variables (similar to categorical embedding size)
hidden_continuous_sizes (Dict[int, int], optional) – dictionary mapping continuous input indices to sizes for variable selection (fallback to hidden_continuous_size if index is not in dictionary)
embedding_sizes (Dict[str, tuple of (int, int)], optional) – dictionary mapping (string) indices to tuple of number of categorical classes and embedding size
embedding_paddings (list of str, optional) – list of indices for embeddings which transform the zero’s embedding to a zero vector
embedding_labels (Dict[str, list of str], optional) – dictionary mapping (string) indices to list of categorical labels
learning_rate (float, default=1e-2) – learning rate
log_interval (int, default=-1) – log predictions every x batches; do not log if 0 or less, log interpretation if > 0. If < 1.0, will log multiple entries per batch.
log_val_interval (int, optional) – frequency with which to log validation set metrics, defaults to log_interval
log_gradient_flow (bool, default=False) – whether to log gradient flow; this takes time and should only be done to diagnose training failures
prediction_length (int, default=1) – Length of the prediction. Also known as ‘horizon’.
context_length (int, default=1) – Number of time units that condition the predictions. Also known as ‘lookback period’. Should be between 1 and 10 times the prediction length.
backcast_loss_ratio (float, default=0.0) – weight of backcast in comparison to forecast when calculating the loss. A weight of 1.0 means that forecast and backcast loss are weighted the same (regardless of backcast and forecast lengths). Defaults to 0.0, i.e. no weight.
reduce_on_plateau_patience (int, default=1000) – patience after which learning rate is reduced by a factor of 10
logging_metrics (nn.ModuleList[MultiHorizonMetric], optional) – list of metrics that are logged during training. Defaults to nn.ModuleList([SMAPE(), MAE(), RMSE(), MAPE(), MASE()])
**kwargs – additional arguments to BaseModel.
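
Several of the parameters above come with sizing guidelines (`context_length` relative to `prediction_length`, `downsample_frequencies` relative to `pooling_sizes`, and the ordering of `pooling_sizes`). The following is a minimal sketch of how those guidelines could be checked before building a model. The function `check_nhits_lengths` is hypothetical and not part of pytorch_forecasting, and it assumes that the comparison between `downsample_frequencies` and `pooling_sizes` is meant per stack.

```python
def check_nhits_lengths(context_length, prediction_length,
                        pooling_sizes, downsample_frequencies):
    """Return a list of warnings; an empty list means all guidelines are met."""
    warnings = []

    # context_length should be between 1 and 10 times prediction_length.
    if not prediction_length <= context_length <= 10 * prediction_length:
        warnings.append("context_length should be 1-10 times prediction_length")

    # Both lists describe one value per stack.
    if len(pooling_sizes) != len(downsample_frequencies):
        warnings.append("pooling_sizes and downsample_frequencies differ in length")

    # An ordering of higher to lower pooling sizes is recommended.
    if pooling_sizes != sorted(pooling_sizes, reverse=True):
        warnings.append("pooling_sizes should be ordered from higher to lower")

    # Assumption: the bounds on downsample_frequencies apply stack by stack.
    for i, (pool, freq) in enumerate(zip(pooling_sizes, downsample_frequencies)):
        if freq < pool:
            warnings.append(f"stack {i}: downsample frequency {freq} < pooling size {pool}")
        if freq > prediction_length:
            warnings.append(f"stack {i}: downsample frequency {freq} > prediction_length")

    return warnings


ok = check_nhits_lengths(48, 12, [4, 2, 1], [6, 3, 1])
problems = check_nhits_lengths(200, 12, [1, 4], [1, 2])
```

Here `ok` is empty, while `problems` lists the overly long context, the ascending pooling order and the second stack's downsample frequency being smaller than its pooling size. Leaving `pooling_sizes` and `downsample_frequencies` unset lets the model fall back to its own heuristic.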

Methods

`__call__`(*args, **kwargs)	Call self as a function.
`__delattr__`(name)	Implement delattr(self, name).
`__dir__`()	Default dir() implementation.
`__eq__`(value, /)	Return self==value.
`__format__`(format_spec, /)	Default object formatter.
`__ge__`(value, /)	Return self>=value.
`__getattr__`(name)
`__getattribute__`(name, /)	Return getattr(self, name).
`__getstate__`()	Helper for pickle.
`__gt__`(value, /)	Return self>value.
`__hash__`()	Return hash(self).
`__init_subclass__`	This method is called when a class is subclassed.
`__le__`(value, /)	Return self<=value.
`__lt__`(value, /)	Return self<value.
`__ne__`(value, /)	Return self!=value.
`__new__`(*args, **kwargs)
`__reduce__`()	Helper for pickle.
`__reduce_ex__`(protocol, /)	Helper for pickle.
`__repr__`()	Return repr(self).
`__setattr__`(name, value)	Implement setattr(self, name, value).
`__setstate__`(state)
`__sizeof__`()	Size of object in memory, in bytes.
`__str__`()	Return str(self).
`__subclasshook__`	Abstract classes can override this to customize issubclass().
`_apply`(fn[, recurse])
`_apply_batch_transfer_handler`(batch[, ...])
`_call_batch_hook`(hook_name, *args)
`_call_impl`(*args, **kwargs)
`_get_backward_hooks`()	Return the backward hooks for use in the call function.
`_get_backward_pre_hooks`()
`_get_name`()
`_load_from_state_dict`(state_dict, prefix, ...)	Copy parameters and buffers from `state_dict` into only this module, but not its descendants.
`_log_dict_through_fabric`(dictionary[, logger])
`_logger_supports`(method)	Whether logger supports method.
`_maybe_warn_non_full_backward_hook`(inputs, ...)
`_named_members`(get_members_fn[, prefix, ...])	Help yield various names + members of modules.
`_on_before_batch_transfer`(batch[, ...])
`_pkg`()	Package for the model.
`_register_load_state_dict_pre_hook`(hook[, ...])	See `register_load_state_dict_pre_hook()` for details.
`_register_state_dict_hook`(hook)	Register a post-hook for the `state_dict()` method.
`_replicate_for_data_parallel`()
`_save_to_state_dict`(destination, prefix, ...)	Save module state to the destination dictionary.
`_set_hparams`(hp)
`_slow_forward`(*input, **kwargs)
`_to_hparams_dict`(hp)
`_verify_is_manual_optimization`(fn_name)
`_wrapped_call_impl`(*args, **kwargs)
`add_module`(name, module)	Add a child module to the current module.
`all_gather`(data[, group, sync_grads])	Gather tensors or collections of tensors from multiple processes.
`apply`(fn)	Apply `fn` recursively to every submodule (as returned by `.children()`) as well as self.
`backward`(loss, *args, **kwargs)	Called to perform backward on the loss returned in `training_step()`.
`bfloat16`()	Casts all floating point parameters and buffers to `bfloat16` datatype.
`buffers`([recurse])	Return an iterator over module buffers.
`calculate_prediction_actual_by_variable`(x, ...)	Calculate predictions and actuals by variable averaged by `bins` bins spanning from `-std` to `+std`
`children`()	Return an iterator over immediate children modules.
`clip_gradients`(optimizer[, ...])	Handles gradient clipping internally.
`compile`(*args, **kwargs)	Compile this Module's forward using `torch.compile()`.
`configure_callbacks`()	Configure model-specific callbacks.
`configure_gradient_clipping`(optimizer[, ...])	Perform gradient clipping for the optimizer parameters.
`configure_model`()	Hook to create modules in a strategy and precision aware context.
`configure_optimizers`()	Configure optimizers.
`configure_sharded_model`()	Deprecated.
`cpu`()	See `torch.nn.Module.cpu()`.
`create_log`(x, y, out, batch_idx[, ...])	Create the log used in the training and validation step.
`cuda`([device])	Moves all model parameters and buffers to the GPU.
`deduce_default_output_parameters`(dataset, kwargs)	Deduce default parameters for output for from_dataset() method.
`double`()	See `torch.nn.Module.double()`.
`eval`()	Set the module in evaluation mode.
`extra_repr`()	Return extra information about parameters for representation/logging.
`extract_features`(x[, embeddings, period])	Extract features
`float`()	See `torch.nn.Module.float()`.
`forward`(x)	Pass forward of network.
`freeze`()	Freeze all params for inference.
`from_dataset`(dataset, **kwargs)	Convenience function to create network from `TimeSeriesDataSet`.
`get_buffer`(target)	Return the buffer given by `target` if it exists, otherwise throw an error.
`get_extra_state`()	Return any extra state to include in the module's state_dict.
`get_parameter`(target)	Return the parameter given by `target` if it exists, otherwise throw an error.
`get_submodule`(target)	Return the submodule given by `target` if it exists, otherwise throw an error.
`half`()	See `torch.nn.Module.half()`.
`ipu`([device])	Move all model parameters and buffers to the IPU.
`load_from_checkpoint`(checkpoint_path[, ...])	Primary way of loading a model from a checkpoint.
`load_state_dict`(state_dict[, strict, assign])	Copy parameters and buffers from `state_dict` into this module and its descendants.
`log`(*args, **kwargs)	See `lightning.pytorch.core.lightning.LightningModule.log()`.
`log_dict`(dictionary[, prog_bar, logger, ...])	Log a dictionary of values at once.
`log_gradient_flow`(named_parameters)	log distribution of gradients to identify exploding / vanishing gradients
`log_interpretation`(x, out, batch_idx)	Log interpretation of network predictions in tensorboard.
`log_metrics`(x, y, out[, prediction_kwargs])	Log metrics every training/validation step.
`log_prediction`(x, out, batch_idx, **kwargs)	Log metrics every training/validation step.
`lr_scheduler_step`(scheduler, metric)	Override this method to adjust the default way the `Trainer` calls each scheduler.
`lr_schedulers`()	Returns the learning rate scheduler(s) that are being used during training.
`manual_backward`(loss, *args, **kwargs)	Call this directly from your `training_step()` when doing optimizations manually.
`modules`([remove_duplicate])	Return an iterator over all modules in the network.
`mtia`([device])	Move all model parameters and buffers to the MTIA.
`named_buffers`([prefix, recurse, ...])	Return an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.
`named_children`()	Return an iterator over immediate children modules, yielding both the name of the module as well as the module itself.
`named_modules`([memo, prefix, remove_duplicate])	Return an iterator over all modules in the network, yielding both the name of the module as well as the module itself.
`named_parameters`([prefix, recurse, ...])	Return an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.
`on_after_backward`()	Log gradient flow for debugging.
`on_after_batch_transfer`(batch, dataloader_idx)	Override to alter or apply batch augmentations to your batch after it is transferred to the device.
`on_before_backward`(loss)	Called before `loss.backward()`.
`on_before_batch_transfer`(batch, dataloader_idx)	Override to alter or apply batch augmentations to your batch before it is transferred to the device.
`on_before_optimizer_step`(optimizer)	Called before `optimizer.step()`.
`on_before_zero_grad`(optimizer)	Called after `training_step()` and before `optimizer.zero_grad()`.
`on_epoch_end`(outputs)	Run at epoch end for training or validation.
`on_fit_end`()	Called at the very end of fit.
`on_fit_start`()	Called at the very beginning of fit.
`on_load_checkpoint`(checkpoint)	Called by Lightning to restore your model.
`on_predict_batch_end`(outputs, batch, batch_idx)	Called in the predict loop after the batch.
`on_predict_batch_start`(batch, batch_idx[, ...])	Called in the predict loop before anything happens for that batch.
`on_predict_end`()	Called at the end of predicting.
`on_predict_epoch_end`()	Called at the end of predicting.
`on_predict_epoch_start`()	Called at the beginning of predicting.
`on_predict_model_eval`()	Called when the predict loop starts.
`on_predict_start`()	Called at the beginning of predicting.
`on_save_checkpoint`(checkpoint)	Called by Lightning when saving a checkpoint to give you a chance to store anything else you might want to save.
`on_test_batch_end`(outputs, batch, batch_idx)	Called in the test loop after the batch.
`on_test_batch_start`(batch, batch_idx[, ...])	Called in the test loop before anything happens for that batch.
`on_test_end`()	Called at the end of testing.
`on_test_epoch_end`()	Called in the test loop at the very end of the epoch.
`on_test_epoch_start`()	Called in the test loop at the very beginning of the epoch.
`on_test_model_eval`()	Called when the test loop starts.
`on_test_model_train`()	Called when the test loop ends.
`on_test_start`()	Called at the beginning of testing.
`on_train_batch_end`(outputs, batch, batch_idx)	Called in the training loop after the batch.
`on_train_batch_start`(batch, batch_idx)	Called in the training loop before anything happens for that batch.
`on_train_end`()	Called at the end of training before logger experiment is closed.
`on_train_epoch_end`()	Called in the training loop at the very end of the epoch.
`on_train_epoch_start`()	Called in the training loop at the very beginning of the epoch.
`on_train_start`()	Called at the beginning of training after sanity check.
`on_validation_batch_end`(outputs, batch, ...)	Called in the validation loop after the batch.
`on_validation_batch_start`(batch, batch_idx)	Called in the validation loop before anything happens for that batch.
`on_validation_end`()	Called at the end of validation.
`on_validation_epoch_end`()	Called in the validation loop at the very end of the epoch.
`on_validation_epoch_start`()	Called in the validation loop at the very beginning of the epoch.
`on_validation_model_eval`()	Called when the validation loop starts.
`on_validation_model_train`()	Called when the validation loop ends.
`on_validation_model_zero_grad`()	Called by the training loop to release gradients before entering the validation loop.
`on_validation_start`()	Called at the beginning of validation.
`optimizer_step`(epoch, batch_idx, optimizer)	Override this method to adjust the default way the `Trainer` calls the optimizer.
`optimizer_zero_grad`(epoch, batch_idx, optimizer)	Override this method to change the default behaviour of `optimizer.zero_grad()`.
`optimizers`([use_pl_optimizer])	Returns the optimizer(s) that are being used during training.
`parameters`([recurse])	Return an iterator over module parameters.
`plot_interpretation`(x, output, idx[, ax])	Plot interpretation.
`plot_prediction`(x, out[, idx, ...])	Plot prediction of prediction vs actuals
`plot_prediction_actual_by_variable`(data[, ...])	Plot predictions and actual averages by variables
`predict`(data[, mode, return_index, ...])	Run inference / prediction.
`predict_dataloader`()	An iterable or collection of iterables specifying prediction samples.
`predict_dependency`(data, variable, values[, ...])	Predict partial dependency.
`predict_step`(batch, batch_idx)	Step function called during `predict()`.
`prepare_data`()	Use this to download and prepare data.
`print`(*args, **kwargs)	Prints only from process 0.
`register_backward_hook`(hook)	Register a backward hook on the module.
`register_buffer`(name, tensor[, persistent])	Add a buffer to the module.
`register_forward_hook`(hook, *[, prepend, ...])	Register a forward hook on the module.
`register_forward_pre_hook`(hook, *[, ...])	Register a forward pre-hook on the module.
`register_full_backward_hook`(hook[, prepend])	Register a backward hook on the module.
`register_full_backward_pre_hook`(hook[, prepend])	Register a backward pre-hook on the module.
`register_load_state_dict_post_hook`(hook)	Register a post-hook to be run after module's `load_state_dict()` is called.
`register_load_state_dict_pre_hook`(hook)	Register a pre-hook to be run before module's `load_state_dict()` is called.
`register_module`(name, module)	Alias for `add_module()`.
`register_parameter`(name, param)	Add a parameter to the module.
`register_state_dict_post_hook`(hook)	Register a post-hook for the `state_dict()` method.
`register_state_dict_pre_hook`(hook)	Register a pre-hook for the `state_dict()` method.
`remove_ignored_hparams`(ignore_list)	Remove ignored hyperparameters from the stored state.
`requires_grad_`([requires_grad])	Change if autograd should record operations on parameters in this module.
`save_hyperparameters`(*args[, ignore, frame, ...])	Save arguments to `hparams` attribute.
`set_extra_state`(state)	Set extra state contained in the loaded state_dict.
`set_submodule`(target, module[, strict])	Set the submodule given by `target` if it exists, otherwise throw an error.
`setup`(stage)	Called at the beginning of fit (train + validate), validate, test, or predict.
`share_memory`()	See `torch.Tensor.share_memory_()`.
`size`()	get number of parameters in model
`state_dict`(*args[, destination, prefix, ...])	Return a dictionary containing references to the whole state of the module.
`step`(x, y, batch_idx)	Take training / validation step.
`teardown`(stage)	Called at the end of fit (train + validate), validate, test, or predict.
`test_dataloader`()	An iterable or collection of iterables specifying test samples.
`test_step`(batch, batch_idx)	Operates on a single batch of data from the test set.
`to`(*args, **kwargs)	See `torch.nn.Module.to()`.
`to_empty`(*, device[, recurse])	Move the parameters and buffers to the specified device without copying storage.
`to_network_output`(**results)	Convert output into a named (and immutable) tuple.
`to_onnx`([file_path, input_sample])	Saves the model in ONNX format.
`to_prediction`(out[, use_metric])	Convert output to prediction using the loss metric.
`to_quantiles`(out[, use_metric])	Convert output to quantiles using the loss metric.
`to_tensorrt`([file_path, input_sample, ir, ...])	Export the model to ScriptModule or GraphModule using TensorRT compile backend.
`to_torchscript`([file_path, method, ...])	By default compiles the whole model to a `torch.jit.ScriptModule`.
`toggle_optimizer`(optimizer)	Makes sure only the gradients of the current optimizer's parameters are calculated in the training step to prevent dangling gradients in multiple-optimizer setup.
`toggled_optimizer`(optimizer)	Makes sure only the gradients of the current optimizer's parameters are calculated in the training step to prevent dangling gradients in multiple-optimizer setup.
`train`([mode])	Set the module in training mode.
`train_dataloader`()	An iterable or collection of iterables specifying training samples.
`training_step`(batch, batch_idx)	Train on batch.
`transfer_batch_to_device`(batch, device, ...)	Override this hook if your `DataLoader` returns tensors wrapped in a custom data structure.
`transform_output`(prediction, target_scale[, ...])	Extract prediction from network output and rescale it to real space / de-normalize it.
`type`(dst_type)	See `torch.nn.Module.type()`.
`unfreeze`()	Unfreeze all parameters for training.
`untoggle_optimizer`(optimizer)	Resets the state of required gradients that were toggled with `toggle_optimizer()`.
`val_dataloader`()	An iterable or collection of iterables specifying validation samples.
`validation_step`(batch, batch_idx)	Operates on a single batch of data from the validation set.
`xpu`([device])	Move all model parameters and buffers to the XPU.
`zero_grad`([set_to_none])	Reset gradients of all model parameters.

Attributes

`CHECKPOINT_HYPER_PARAMS_KEY`
`CHECKPOINT_HYPER_PARAMS_NAME`
`CHECKPOINT_HYPER_PARAMS_SPECIAL_KEY`
`CHECKPOINT_HYPER_PARAMS_TYPE`
`T_destination`
`__annotations__`
`__dict__`
`__doc__`
`__jit_unused_properties__`
`__module__`
`__weakref__`	list of weak references to the object
`_compiled_call_impl`
`_jit_is_scripting`
`_version`	This allows better BC support for `load_state_dict()`.
`automatic_optimization`	If set to `False` you are responsible for calling `.backward()`, `.step()`, `.zero_grad()`.
`call_super_init`
`categorical_groups_mapping`	Mapping of categorical variables to categorical groups
`categoricals`	List of all categorical variables in model
`current_epoch`	The current epoch in the `Trainer`, or 0 if not attached.
`current_stage`	Available inside lightning loops.
`decoder_covariate_size`	Decoder covariates size.
`decoder_variables`	List of all decoder variables in model (excluding static variables)
`device`
`device_mesh`	Strategies like `ModelParallelStrategy` will create a device mesh that can be accessed in the `configure_model()` hook to parallelize the LightningModule.
`dtype`
`dump_patches`
`encoder_covariate_size`	Encoder covariate size.
`encoder_variables`	List of all encoder variables in model (excluding static variables)
`example_input_array`	The example input array is a specification of what the module can consume in the `forward()` method.
`fabric`
`global_rank`	The index of the current process across all nodes and devices.
`global_step`	Total training batches seen across all epochs.
`hparams`	The collection of hyperparameters saved with `save_hyperparameters()`.
`hparams_initial`	The collection of hyperparameters saved with `save_hyperparameters()`.
`local_rank`	The index of the current process within a single node.
`log_interval`	Log interval depending if training or validating
`logger`	Reference to the logger object in the Trainer.
`loggers`	Reference to the list of loggers in the Trainer.
`n_stacks`	Number of stacks.
`n_targets`	Number of targets to forecast.
`on_gpu`	Returns `True` if this model is currently located on a GPU.
`predicting`
`reals`	List of all continuous variables in model
`static_size`	Static covariate size.
`static_variables`	List of all static variables in model
`strict_loading`	Determines how Lightning loads this model using .load_state_dict(..., strict=model.strict_loading).
`target_names`	List of targets that are predicted.
`target_positions`	Positions of target variable(s) in covariates.
`trainer`
`training`
`_parameters`
`_buffers`
`_non_persistent_buffers_set`
`_backward_pre_hooks`
`_backward_hooks`
`_is_full_backward_hook`
`_forward_hooks`
`_forward_hooks_with_kwargs`
`_forward_hooks_always_called`
`_forward_pre_hooks`
`_forward_pre_hooks_with_kwargs`
`_state_dict_hooks`
`_load_state_dict_pre_hooks`
`_state_dict_pre_hooks`
`_load_state_dict_post_hooks`
`_modules`