Distributed Scalers for PyTorch Tensors

The distributed scalers for PyTorch tensors behave similarly to standard distributed scalers, but operate directly on torch.Tensor objects and support optional metadata.

Users may assign variable/channel names to a tensor by setting the variable_names attribute:

tensor.variable_names = ['ch1', 'ch2', 'ch3', 'ch4', 'ch5']

Requirements:

variable_names must be a list of unique strings
The number of names must match the size of the tensor’s channel dimension
The attribute should be assigned after moving the tensor to its target device, because Tensor.to() can return a new tensor object that does not carry custom Python attributes
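The first two requirements can be checked before fitting. The sketch below is only an illustration: check_variable_names is a hypothetical helper, not a bridgescaler function, and it assumes the caller already knows the size of the channel dimension.

```python
def check_variable_names(variable_names, n_channels):
    """Hypothetical helper (not part of bridgescaler).

    Raises ValueError if the names break one of the listed requirements.
    """
    if not isinstance(variable_names, list):
        raise ValueError("variable_names must be a list")
    if not all(isinstance(name, str) for name in variable_names):
        raise ValueError("every entry in variable_names must be a string")
    if len(set(variable_names)) != len(variable_names):
        raise ValueError("variable_names must be unique")
    if len(variable_names) != n_channels:
        raise ValueError(
            f"expected {n_channels} names, got {len(variable_names)}"
        )


# Channels-first shape (samples, channels, ...): the channel dimension is axis 1.
shape = (20, 5, 4, 8)
check_variable_names(["ch1", "ch2", "ch3", "ch4", "ch5"], n_channels=shape[1])
```

For a channels-last tensor, the same check would use the final axis (shape[-1]) as the channel count.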

If variable_names is not provided, channel alignment checks are limited to comparing the number of channels; the channel order cannot be validated.

Example:

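from bridgescaler.distributed_tensor import DStandardScalerTensor
import numpy as np
import torch

# x_1 is channels-first: 5 channels on axis 1
x_1 = torch.from_numpy(np.random.normal(0, 2.2, (20, 5, 4, 8))).to("cuda:0")
# x_2 is channels-last: 5 channels on the final axis
x_2 = torch.from_numpy(np.random.normal(1, 3.5, (25, 4, 8, 5))).to("cuda:0")

# Assign names after the tensors are on their target device
x_1.variable_names = ["ch1", "ch2", "ch3", "ch4", "ch5"]
x_2.variable_names = ["ch1", "ch2", "ch3", "ch4", "ch5"]

# Fit one scaler per tensor, matching each tensor's channel layout
dss_1 = DStandardScalerTensor(channels_last=False)
dss_2 = DStandardScalerTensor(channels_last=True)
dss_1.fit(x_1)
dss_2.fit(x_2)
# Combine the two fitted scalers into one
dss_combined = dss_1 + dss_2

# Apply the combined scaler to the channels-first tensor
dss_combined.transform(x_1, channels_last=False)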
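This page does not describe how dss_1 + dss_2 combines the fitted statistics. As a rough intuition only, a standard way to pool the mean and variance of two disjoint batches is the pairwise update sketched below for a single channel. merge_stats is a hypothetical name, and this is an assumption about the general technique, not a statement about bridgescaler's internals.

```python
from statistics import fmean, pvariance


def merge_stats(n_a, mean_a, var_a, n_b, mean_b, var_b):
    """Pool count, mean, and population variance of two disjoint batches.

    Hypothetical helper for illustration; not part of bridgescaler.
    """
    n = n_a + n_b
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    # Within-batch sums of squared deviations plus a between-batch correction
    m2 = var_a * n_a + var_b * n_b + delta ** 2 * n_a * n_b / n
    return n, mean, m2 / n


batch_a = [1.0, 2.0, 3.0]
batch_b = [4.0, 5.0, 6.0, 7.0]
n, mean, var = merge_stats(
    len(batch_a), fmean(batch_a), pvariance(batch_a),
    len(batch_b), fmean(batch_b), pvariance(batch_b),
)
# The pooled values should match statistics computed on batch_a + batch_b.
```

The point of this kind of update is that each worker only needs to share its count, mean, and variance per channel, not the underlying data.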

All functions in the bridgescaler.backend module support the distributed scalers for PyTorch tensors.