Model: utils

These classes define reusable convolutional blocks for both dense (regular) and sparse convolutional neural networks (CNNs). They wrap the repetitive convolution, normalization, and activation code that comes with building such networks directly in PyTorch. This page explains what the classes do, how they differ from a plain PyTorch implementation, and where their limitations lie.

Module differences and limitations

Differences from PyTorch Direct Implementation

Abstraction: These classes encapsulate common patterns (convolution + normalization + activation) into single modules, reducing repetitive code and making the network definitions more concise and easier to read.
Configuration: They provide a higher-level interface for configuring layers, automatically setting common parameters such as padding.
Sparse Convolution Support: The sparse convolution blocks use the spconv library, which is not part of standard PyTorch, to handle sparse input data more efficiently.
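For comparison, the convolution + normalization + activation pattern that these blocks wrap can be written by hand in plain PyTorch. The sketch below is an illustration only, not code from this module; the helper name `plain_conv_block` is hypothetical.

```python
import torch
from torch import nn


def plain_conv_block(inplanes: int, planes: int, kernel_size: int, stride: int = 1) -> nn.Sequential:
    """Hand-written conv + norm + activation stack (hypothetical helper, for comparison only)."""
    return nn.Sequential(
        nn.Conv2d(inplanes, planes, kernel_size=kernel_size, stride=stride,
                  padding=kernel_size // 2, bias=False),
        nn.BatchNorm2d(planes),
        nn.ReLU(),
    )


block = plain_conv_block(3, 16, kernel_size=3)
out = block(torch.randn(2, 3, 64, 64))
print(out.shape)  # torch.Size([2, 16, 64, 64])
```

Every layer and its arguments has to be spelled out each time this pattern is needed, which is the repetition the block classes remove.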

Parameters Abstracted from PyTorch Direct Implementation

Padding Calculation: When `padding` is not passed, it defaults to `kernel_size // 2`, which preserves the spatial size for odd kernel sizes at stride 1.
Layer Initialization: Automatically initializes convolutional, normalization, and activation layers within the block, so users don’t need to explicitly define each component.
Residual Connections: For the basic blocks, the residual connections (identity mappings) are integrated within the block, simplifying the addition of these connections.
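The effect of the default padding can be checked with the standard convolution output-size formula (dilation 1). The function below is a small sketch written for this page; `conv_output_size` is a hypothetical name and is not part of the module.

```python
def conv_output_size(size: int, kernel_size: int, stride: int = 1, padding=None) -> int:
    """Spatial output size of a convolution with dilation 1.

    When padding is None, the blocks' default of kernel_size // 2 is assumed.
    """
    if padding is None:
        padding = kernel_size // 2
    return (size + 2 * padding - kernel_size) // stride + 1


print(conv_output_size(64, kernel_size=3))             # 64: odd kernel, stride 1 keeps the size
print(conv_output_size(64, kernel_size=3, stride=2))   # 32: stride 2 halves it
print(conv_output_size(64, kernel_size=4))             # 65: even kernels grow the output by one
print(conv_output_size(64, kernel_size=3, padding=0))  # 62: explicit padding overrides the default
```

The even-kernel case shows that the default is only a "same size" padding for odd kernels.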

Limitations

Flexibility: While these classes simplify the creation of common patterns, they can be less flexible than directly using PyTorch when non-standard configurations or additional customizations are required.
Dependency on spconv: The sparse convolution blocks depend on the spconv library, a separate dependency that is less widely used and supported than PyTorch’s native functionality.
Debugging: Abstracting layers into higher-level blocks can make debugging harder because the internal operations are hidden. Users may need to read the class implementations to troubleshoot issues.
Performance Overhead: Each wrapper adds an extra Python-level `forward` call per block. This overhead is usually small next to the cost of the convolution itself, but it is not zero.

Overall, these classes provide a convenient and structured way to build CNNs, particularly when using common patterns and when working with sparse data. However, for highly customized or performance-critical applications, a more direct approach using PyTorch’s lower-level APIs might be preferable.

Conv

 Conv (inplanes:int, planes:int, kernel_size:int, stride:int,
       conv_layer:torch.nn.modules.module.Module=<class
       'torch.nn.modules.conv.Conv2d'>, bias:bool=False, **kwargs)

*A convolutional layer module for neural networks.

This class is a wrapper around the specified convolutional layer type, providing a convenient way to include convolutional layers in neural networks with customizable parameters such as input channels, output channels, kernel size, stride, and padding.*

	Type	Default	Details
inplanes	int		The number of input channels.
planes	int		The number of output channels.
kernel_size	int		The size of the convolving kernel.
stride	int		The stride of the convolution.
conv_layer	Module	Conv2d	The convolutional layer class to be used.
bias	bool	False	If `True`, adds a learnable bias to the output.
kwargs			Arbitrary keyword arguments. Currently supports 'padding'.

Exported source

class Conv(nn.Module):
    """
    A convolutional layer module for neural networks.

    This class is a wrapper around the specified convolutional layer type, 
    providing a convenient way to include convolutional layers in neural networks 
    with customizable parameters such as input channels, output channels, kernel size, 
    stride, and padding.
    """
    def __init__(self,
                 inplanes:int, # The number of input channels.
                 planes:int, # The number of output channels.
                 kernel_size:int, # The size of the convolving kernel.
                 stride:int, # The stride of the convolution.
                 conv_layer:nn.Module=nn.Conv2d, # The convolutional layer class to be used.
                 bias:bool=False, # If `True`, adds a learnable bias to the output.
                 **kwargs # Arbitrary keyword arguments. Currently supports 'padding'.
                 ):
        super(Conv, self).__init__()
        padding = kwargs.get('padding', kernel_size // 2)  # default keeps the spatial size for odd kernels at stride 1

        self.conv = conv_layer(inplanes, planes, kernel_size=kernel_size, stride=stride,
                               padding=padding, bias=bias)
                        
    def forward(self, x):
        return self.conv(x)

# Define input tensor with shape (batch_size, in_channels, height, width)
input_tensor = torch.randn(1, 3, 64, 64)  # Example with batch_size=1, in_channels=3, height=64, width=64

# Create an instance of the Conv class
conv_layer = Conv(inplanes=3, planes=16, kernel_size=3, stride=1)

# Pass the input tensor through the convolutional layer
output_tensor = conv_layer(input_tensor)

# Print the shape of the output tensor
print("Output tensor shape:", output_tensor.shape)

Output tensor shape: torch.Size([1, 16, 64, 64])
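`Conv` passes the same keyword arguments to whichever class it receives as `conv_layer`, so any layer with a matching constructor should work. The sketch below does not use `Conv` itself; it only repeats the constructor call from the source above with three standard PyTorch layers to show that the call shape is shared.

```python
import torch
from torch import nn

# Each entry pairs a convolution class with an input shape of matching dimensionality.
examples = [
    (nn.Conv1d, (1, 3, 64)),
    (nn.Conv2d, (1, 3, 64, 64)),
    (nn.Conv3d, (1, 3, 8, 16, 16)),
]

shapes = {}
for conv_cls, input_shape in examples:
    # Same call as in Conv.__init__: positional channels, keyword kernel/stride/padding/bias.
    layer = conv_cls(3, 16, kernel_size=3, stride=1, padding=3 // 2, bias=False)
    shapes[conv_cls.__name__] = tuple(layer(torch.randn(*input_shape)).shape)

print(shapes)
```

In each case only the channel dimension changes, from 3 to 16, while the spatial dimensions are preserved.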

ConvBlock

 ConvBlock (inplanes:int, planes:int, kernel_size:int, stride:int=1,
            conv_layer:torch.nn.modules.module.Module=<class
            'torch.nn.modules.conv.Conv2d'>,
            norm_layer:torch.nn.modules.module.Module=<class
            'torch.nn.modules.batchnorm.BatchNorm2d'>,
            act_layer:torch.nn.modules.module.Module=<class
            'torch.nn.modules.activation.ReLU'>, **kwargs)

*A convolutional block module combining a convolutional layer, a normalization layer, and an activation layer.

This class encapsulates a common pattern found in neural networks, where a convolution is followed by batch normalization and a non-linear activation function. It provides a convenient way to stack these operations into a single module.*

	Type	Default	Details
inplanes	int		The number of input channels.
planes	int		The number of output channels.
kernel_size	int		The size of the convolving kernel.
stride	int	1	The stride of the convolution.
conv_layer	Module	Conv2d	The convolutional layer class to be used.
norm_layer	Module	BatchNorm2d	The normalization layer class to be used.
act_layer	Module	ReLU	The activation function class to be used.
kwargs			Arbitrary keyword arguments. Currently supports 'padding'.

Exported source

class ConvBlock(nn.Module):
    """
    A convolutional block module combining a convolutional layer, a normalization layer, 
    and an activation layer.

    This class encapsulates a common pattern found in neural networks, where a convolution 
    is followed by batch normalization and a non-linear activation function. It provides 
    a convenient way to stack these operations into a single module.
    """
    def __init__(self,
                 inplanes: int, # The number of input channels.
                 planes: int, # The number of output channels.
                 kernel_size: int, # The size of the convolving kernel.
                 stride:int=1, # The stride of the convolution.
                 conv_layer:nn.Module=nn.Conv2d, # The convolutional layer class to be used.
                 norm_layer:nn.Module=nn.BatchNorm2d, # The normalization layer class to be used.
                 act_layer:nn.Module=nn.ReLU, # The activation function class to be used.
                 **kwargs # Arbitrary keyword arguments. Currently supports 'padding'.
                 ):
        super(ConvBlock, self).__init__()
        padding = kwargs.get('padding', kernel_size // 2)  # default keeps the spatial size for odd kernels at stride 1

        self.conv = Conv(inplanes, planes, kernel_size=kernel_size, stride=stride,
                         padding=padding, bias=False, conv_layer=conv_layer)

        self.norm = norm_layer(planes)
        self.act = act_layer()

    def forward(self, x):
        out = self.conv(x)
        out = self.norm(out)
        out = self.act(out)
        return out

# Define an instance of the ConvBlock
conv_block = ConvBlock(inplanes=3, planes=16, kernel_size=3, stride=1)

# Create a dummy input tensor with shape (batch_size, channels, height, width)
dummy_input = torch.randn(1, 3, 64, 64)  # Example: batch size of 1, 3 input channels, 64x64 image

# Pass the dummy input through the ConvBlock
output = conv_block(dummy_input)

# Print the shape of the output tensor
print("Output shape:", output.shape)

Output shape: torch.Size([1, 16, 64, 64])
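`ConvBlock` builds its layers as `norm_layer(planes)` and `act_layer()`, so a layer class that needs extra constructor arguments cannot be passed directly. One possible workaround, sketched here as a suggestion rather than a documented feature of the module, is to bind the extra arguments with `functools.partial`.

```python
from functools import partial

import torch
from torch import nn

# GroupNorm needs num_groups before num_channels; LeakyReLU takes a negative slope.
norm_layer = partial(nn.GroupNorm, 4)
act_layer = partial(nn.LeakyReLU, 0.1)

planes = 16
norm = norm_layer(planes)  # same call ConvBlock makes; yields nn.GroupNorm(4, 16)
act = act_layer()          # same call ConvBlock makes; yields nn.LeakyReLU(0.1)

x = torch.randn(2, planes, 8, 8)
y = act(norm(x))
print(type(norm).__name__, norm.num_groups, norm.num_channels, tuple(y.shape))
```

Under that assumption, the two partials could be passed as `norm_layer=` and `act_layer=` when constructing a `ConvBlock`.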

BasicBlock

 BasicBlock (inplanes:int, kernel_size:int=3)

*A basic residual block module for neural networks.

This class implements a basic version of the residual block, consisting of two convolutional blocks followed by an addition operation with the input (identity) and an activation function. It is a fundamental component in ResNet architectures, allowing for the training of very deep networks by addressing the vanishing gradient problem.*

	Type	Default	Details
inplanes	int		Number of input channels
kernel_size	int	3	Size of the convolving kernel

Exported source

class BasicBlock(nn.Module):
    """
    A basic residual block module for neural networks.

    This class implements a basic version of the residual block, consisting of two convolutional 
    blocks followed by an addition operation with the input (identity) and an activation function. 
    It is a fundamental component in ResNet architectures, allowing for the training of very deep 
    networks by addressing the vanishing gradient problem.
    """

    def __init__(self,
                 inplanes:int, # Number of input channels
                 kernel_size:int=3 # Size of the convolving kernel
                 ):
        super(BasicBlock, self).__init__()
        self.block1 = ConvBlock(inplanes, inplanes, kernel_size=kernel_size)
        self.block2 = ConvBlock(inplanes, inplanes, kernel_size=kernel_size)
        self.act = nn.ReLU()

    def forward(self, x):
        identity = x
        out = self.block1(x)
        out = self.block2(out)
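        # Out-of-place addition with the input tensor; an in-place `+=` would overwrite
        # the output of block2's ReLU, which autograd keeps for the backward pass.
        out = out + identity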
        out = self.act(out)  # Apply activation function

        return out

# Instantiate the BasicBlock
basic_block = BasicBlock(64)

# Print the structure of the basic_block to understand its components
print(basic_block)

# Create a random tensor with shape (batch_size, channels, height, width)
# Let's assume a batch size of 1, with 64 channels, and spatial dimensions 32x32
input_tensor = torch.randn(1, 64, 32, 32)

# Pass the input tensor through the BasicBlock
output_tensor = basic_block(input_tensor)

# Print the shape of the output tensor
print("Output shape:", output_tensor.shape)

BasicBlock(
  (block1): ConvBlock(
    (conv): Conv(
      (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    )
    (norm): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (act): ReLU()
  )
  (block2): ConvBlock(
    (conv): Conv(
      (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    )
    (norm): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (act): ReLU()
  )
  (act): ReLU()
)
Output shape: torch.Size([1, 64, 32, 32])
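The claim that the identity path helps gradients reach earlier layers can be illustrated with a deliberately simplified residual branch. This is not `BasicBlock` itself: the branch is a single convolution whose weights are zeroed, so it contributes nothing to either the output or the input gradient, and everything that arrives at the input must come through the skip connection.

```python
import torch
from torch import nn

conv = nn.Conv2d(4, 4, kernel_size=3, padding=1, bias=False)
nn.init.zeros_(conv.weight)  # the residual branch now outputs zeros

# Strictly positive inputs keep the final ReLU in its linear region.
x = (torch.rand(1, 4, 8, 8) + 0.1).requires_grad_()

y = torch.relu(conv(x) + x)  # residual form: activation(branch(x) + identity)
y.sum().backward()

# The branch passes back no gradient, yet the input still receives
# a gradient of exactly 1 everywhere through the identity path.
print(torch.allclose(x.grad, torch.ones_like(x)))  # True
```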

SparseConvBlock

 SparseConvBlock (in_channels:int, out_channels:int, kernel_size:int,
                  stride, use_subm:bool=True, bias:bool=False)

*A sparse convolutional block for 2D inputs.

This block uses SubMConv2d when the stride is 1 and `use_subm` is True, and SparseConv2d otherwise. The convolution is followed by a normalization layer (BatchNorm1d) and a ReLU activation.*

	Type	Default	Details
in_channels	int		Number of channels in the input tensor.
out_channels	int		Number of channels produced by the convolution.
kernel_size	int		Size of the convolving kernel.
stride			Stride of the convolution.
use_subm	bool	True	Whether to use SubMConv2d for stride 1.
bias	bool	False	If True, adds a learnable bias to the output.

Exported source

class SparseConvBlock(spconv.pytorch.SparseModule):
    '''
    A sparse convolutional block for 2D inputs.

    This block uses SubMConv2d when the stride is 1 and `use_subm` is True, and SparseConv2d otherwise.
    The convolution is followed by a normalization layer (BatchNorm1d) and a ReLU activation.
    '''

    def __init__(self,
                 in_channels: int, # Number of channels in the input tensor.
                 out_channels: int, # Number of channels produced by the convolution.
                 kernel_size: int, # Size of the convolving kernel.
                 stride, # Stride of the convolution.
                 use_subm:bool=True, # Whether to use SubMConv2d for stride 1.
                 bias:bool=False # If True, adds a learnable bias to the output.
                 ):
        super(SparseConvBlock, self).__init__()
        if stride == 1 and use_subm:
            self.conv = spconv.pytorch.SubMConv2d(in_channels, out_channels, kernel_size,
                                                  padding=kernel_size//2, stride=1, bias=bias,)
        else:
            self.conv = spconv.pytorch.SparseConv2d(in_channels, out_channels, kernel_size,
                                                    padding=kernel_size//2, stride=stride, bias=bias)

        self.norm = nn.BatchNorm1d(out_channels, eps=1e-3, momentum=0.01)
        self.act = nn.ReLU()

    def forward(self, x):
        out = self.conv(x)
        out = replace_feature(out, self.norm(out.features))
        out = replace_feature(out, self.act(out.features))

        return out

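# Example usage
# Each index row is (batch_idx, y, x); with batch_size=1 the batch index must be 0.
coords = torch.randint(0, 10, (5, 2), dtype=torch.int32)
indices = torch.cat([torch.zeros(5, 1, dtype=torch.int32), coords], dim=1)
input_tensor = spconv.pytorch.SparseConvTensor(features=torch.randn(5, 3).to(DEVICE),
                                               indices=indices.to(DEVICE),
                                               spatial_shape=[10, 10],
                                               batch_size=1)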
conv_block = SparseConvBlock(3, 16, 3, 1).to(DEVICE)
output_tensor = conv_block(input_tensor)
print(output_tensor)

SparseConvTensor[shape=torch.Size([5, 16])]
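The practical difference between the two convolution types is which output sites become active. The pure-Python sketch below models only that bookkeeping, based on the usual description of submanifold and regular sparse convolution; it does not call spconv and is not its implementation, and both function names are hypothetical.

```python
def subm_conv_sites(active):
    """Submanifold convolution: the output sites are exactly the input sites."""
    return set(active)


def sparse_conv_sites(active, shape, kernel_size=3, stride=1):
    """Regular sparse convolution: a site is active if its window touches any input site."""
    pad = kernel_size // 2
    out_shape = [(s + 2 * pad - kernel_size) // stride + 1 for s in shape]
    out = set()
    for y, x in active:
        for ky in range(kernel_size):
            for kx in range(kernel_size):
                # Output o reads input o * stride - pad + k, so o * stride = input + pad - k.
                ny, nx = y + pad - ky, x + pad - kx
                if ny % stride or nx % stride:
                    continue
                oy, ox = ny // stride, nx // stride
                if 0 <= oy < out_shape[0] and 0 <= ox < out_shape[1]:
                    out.add((oy, ox))
    return out


active = {(2, 2), (7, 5)}
print(len(subm_conv_sites(active)))                        # 2
print(len(sparse_conv_sites(active, (10, 10))))            # 18
print(len(sparse_conv_sites(active, (10, 10), stride=2)))  # 5
```

With stride 1 the regular convolution grows each active site into a 3x3 patch, while the submanifold variant keeps the sparsity pattern unchanged. That is why the block prefers it when no downsampling is needed.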

SparseBasicBlock

 SparseBasicBlock (channels:int, kernel_size)

*A basic block for sparse convolutional networks, specifically designed for 2D inputs.

This block consists of two convolutional layers. The first is followed by normalization and activation; the second is followed by normalization only. Its normalized output is then added to the input features (residual connection) before the final activation function is applied.*

	Type	Details
channels	int	Number of channels in the input tensor.
kernel_size		Size of the convolving kernel.

Exported source

class SparseBasicBlock(spconv.pytorch.SparseModule):
    '''
    A basic block for sparse convolutional networks, specifically designed for 2D inputs.

    This block consists of two convolutional layers. The first is followed by normalization and
    activation; the second is followed by normalization only. Its normalized output is then added
    to the input features (residual connection) before the final activation function is applied.
    '''

    def __init__(self,
                 channels:int, # Number of channels in the input tensor.
                 kernel_size # Size of the convolving kernel.
                 ):
        super(SparseBasicBlock, self).__init__()
        self.block1 = SparseConvBlock(channels, channels, kernel_size, 1)
        self.conv2 = spconv.pytorch.SubMConv2d(channels, channels, kernel_size, padding=kernel_size//2,
                                               stride=1, bias=False, algo=ConvAlgo.Native, )
        self.norm2 = nn.BatchNorm1d(channels, eps=1e-3, momentum=0.01)
        self.act2 = nn.ReLU()

    def forward(self, x):
        identity = x
        out = self.block1(x)
        out = self.conv2(out)
        out = replace_feature(out, self.norm2(out.features))
        out = replace_feature(out, out.features + identity.features)
        out = replace_feature(out, self.act2(out.features))

        return out

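# Example usage
# Each index row is (batch_idx, y, x); with batch_size=1 the batch index must be 0.
coords = torch.randint(0, 10, (5, 2), dtype=torch.int32)
indices = torch.cat([torch.zeros(5, 1, dtype=torch.int32), coords], dim=1)
input_tensor = spconv.pytorch.SparseConvTensor(features=torch.randn(5, 3).to(DEVICE),
                                               indices=indices.to(DEVICE),
                                               spatial_shape=[10, 10],
                                               batch_size=1)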
basic_block = SparseBasicBlock(3, 3).to(DEVICE)
output_tensor = basic_block(input_tensor)
print(output_tensor)

SparseConvTensor[shape=torch.Size([5, 3])]
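The feature-level arithmetic in `forward` can be followed without spconv, because a sparse tensor's features form a plain `(num_active_sites, channels)` matrix. The sketch below uses random stand-in tensors for the input features and for the output of the second convolution; only the normalize, add, activate order is taken from the source above.

```python
import torch
from torch import nn

torch.manual_seed(0)
num_active, channels = 5, 3

identity_features = torch.randn(num_active, channels)  # stand-in for x.features
branch_features = torch.randn(num_active, channels)    # stand-in for conv2 output features

# BatchNorm1d treats each active site as one sample and normalizes per channel.
norm2 = nn.BatchNorm1d(channels, eps=1e-3, momentum=0.01)

out = norm2(branch_features)   # normalize the branch
out = out + identity_features  # residual addition, row by row
out = torch.relu(out)          # final activation
print(out.shape)  # torch.Size([5, 3])
```

The row-wise addition is only meaningful when both matrices list the same sites in the same order. The block satisfies this by using submanifold convolutions at stride 1 throughout.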

SparseConv3dBlock

 SparseConv3dBlock (in_channels:int, out_channels:int, kernel_size,
                    stride, use_subm:bool=True)

*A sparse convolutional block for 3D inputs.

This block uses SubMConv3d when the stride is 1 and `use_subm` is True, and SparseConv3d otherwise. The convolution is followed by a normalization layer (BatchNorm1d) and a ReLU activation.*

	Type	Default	Details
in_channels	int		Number of channels in the input tensor.
out_channels	int		Number of channels produced by the convolution.
kernel_size			Size of the convolving kernel.
stride			Stride of the convolution.
use_subm	bool	True	Whether to use SubMConv3d for stride 1.

Exported source

class SparseConv3dBlock(spconv.pytorch.SparseModule):
    '''
    A sparse convolutional block for 3D inputs.

    This block uses SubMConv3d when the stride is 1 and `use_subm` is True, and SparseConv3d otherwise.
    The convolution is followed by a normalization layer (BatchNorm1d) and a ReLU activation.
    '''
    def __init__(self,
                in_channels: int, # Number of channels in the input tensor.
                out_channels: int, # Number of channels produced by the convolution.
                kernel_size, # Size of the convolving kernel.
                stride, # Stride of the convolution.
                use_subm:bool=True # Whether to use SubMConv3d for stride 1.
                ):
        super(SparseConv3dBlock, self).__init__()
        if stride == 1 and use_subm:
            self.conv = spconv.pytorch.SubMConv3d(in_channels, out_channels, kernel_size, padding=kernel_size//2,
                                                  stride=1, bias=False)
        else:
            self.conv = spconv.pytorch.SparseConv3d(in_channels, out_channels, kernel_size, padding=kernel_size//2,
                                                    stride=stride, bias=False)

        self.norm = nn.BatchNorm1d(out_channels, eps=1e-3, momentum=0.01)
        self.act = nn.ReLU()

    def forward(self, x):
        out = self.conv(x)
        out = replace_feature(out, self.norm(out.features))
        out = replace_feature(out, self.act(out.features))

        return out

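# Example usage
# Each index row is (batch_idx, z, y, x); with batch_size=1 the batch index must be 0.
coords = torch.randint(0, 10, (5, 3), dtype=torch.int32)
indices = torch.cat([torch.zeros(5, 1, dtype=torch.int32), coords], dim=1)
input_tensor = spconv.pytorch.SparseConvTensor(features=torch.randn(5, 3).to(DEVICE),
                                               indices=indices.to(DEVICE),
                                               spatial_shape=[10, 10, 10],
                                               batch_size=1)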
conv3d_block = SparseConv3dBlock(3, 16, 3, 1).to(DEVICE)
output_tensor = conv3d_block(input_tensor)
print(output_tensor)

SparseConvTensor[shape=torch.Size([5, 16])]
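Randomly drawn coordinates can repeat, and a sparse tensor is normally expected to list each site once. The helper below is a sketch of one way to draw unique index rows for such examples. It assumes the `(batch_idx, z, y, x)` column layout; `random_sparse_indices` is a hypothetical name, not part of this module or of spconv.

```python
import torch


def random_sparse_indices(num_points, spatial_shape, batch_size=1, seed=0):
    """Return unique (batch_idx, *spatial) index rows as an int32 tensor (hypothetical helper)."""
    total = batch_size
    for size in spatial_shape:
        total *= size
    generator = torch.Generator().manual_seed(seed)
    # Sample distinct flat positions, then unravel them into coordinates.
    flat = torch.randperm(total, generator=generator)[:num_points]
    columns = []
    for size in reversed(spatial_shape):
        columns.append(flat % size)
        flat = torch.div(flat, size, rounding_mode="floor")
    columns.append(flat)  # what remains is the batch index
    return torch.stack(columns[::-1], dim=1).to(torch.int32)


indices = random_sparse_indices(5, [10, 10, 10], batch_size=1)
print(indices.shape)  # torch.Size([5, 4])
```

Because the flat positions come from a permutation, no two rows can coincide, and the batch column always stays below `batch_size`.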

SparseBasicBlock3d

 SparseBasicBlock3d (channels:int, kernel_size)

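*A basic block for sparse convolutional networks, specifically designed for 3D inputs.

This block consists of two convolutional layers. The first is followed by normalization and activation; the second is followed by normalization only. Its normalized output is then added to the input features (residual connection) before the final activation function is applied.*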

	Type	Details
channels	int	Number of channels in the input tensor.
kernel_size		Size of the convolving kernel.

Exported source

class SparseBasicBlock3d(spconv.pytorch.SparseModule):
    '''
    A basic block for sparse convolutional networks, specifically designed for 3D inputs.

    This block consists of two convolutional layers. The first is followed by normalization and
    activation; the second is followed by normalization only. Its normalized output is then added
    to the input features (residual connection) before the final activation function is applied.
    '''
    def __init__(self,
                 channels:int, # Number of channels in the input tensor.
                 kernel_size # Size of the convolving kernel.
                 ):
        super(SparseBasicBlock3d, self).__init__()
        self.block1 = SparseConv3dBlock(channels, channels, kernel_size, 1)
        self.conv2 = spconv.pytorch.SubMConv3d(channels, channels, kernel_size, padding=kernel_size//2,
                                               stride=1, bias=False)
        self.norm2 = nn.BatchNorm1d(channels, eps=1e-3, momentum=0.01)
        self.act2 = nn.ReLU()

    def forward(self, x):
        identity = x
        out = self.block1(x)
        out = self.conv2(out)
        out = replace_feature(out, self.norm2(out.features))
        out = replace_feature(out, out.features + identity.features)
        out = replace_feature(out, self.act2(out.features))

        return out

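# Example usage
# Each index row is (batch_idx, z, y, x); with batch_size=1 the batch index must be 0.
coords = torch.randint(0, 10, (5, 3), dtype=torch.int32)
indices = torch.cat([torch.zeros(5, 1, dtype=torch.int32), coords], dim=1)
input_tensor = spconv.pytorch.SparseConvTensor(features=torch.randn(5, 3).to(DEVICE),
                                               indices=indices.to(DEVICE),
                                               spatial_shape=[10, 10, 10],
                                               batch_size=1)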
basic_block3d = SparseBasicBlock3d(3, 3).to(DEVICE)
output_tensor = basic_block3d(input_tensor)
print(output_tensor)

SparseConvTensor[shape=torch.Size([5, 3])]