Model: utils

This module defines convolutional building blocks for both dense (regular) and sparse convolutional neural networks (CNNs). The blocks wrap the boilerplate that recurs when such networks are written directly in PyTorch. The sections below describe how they differ from plain PyTorch code, which parameters they set for you, and where they fall short.

Module differences and limitations

Differences from PyTorch Direct Implementation

  • Abstraction: These classes encapsulate common patterns (convolution + normalization + activation) into single modules, reducing repetitive code and making the network definitions more concise and easier to read.
  • Configuration: They provide a higher-level interface for configuring layers, automatically setting common parameters such as padding.
  • Sparse Convolution Support: The sparse convolution blocks use the spconv library, which is not part of standard PyTorch, to handle sparse input data more efficiently.

Parameters Abstracted from PyTorch Direct Implementation

  • Padding Calculation: If padding is not passed, it defaults to kernel_size // 2, which preserves the spatial size for odd kernel sizes at stride 1.
  • Layer Initialization: Automatically initializes convolutional, normalization, and activation layers within the block, so users don’t need to explicitly define each component.
  • Residual Connections: For the basic blocks, the residual connection (identity mapping) is built into the block. The blocks keep the channel count and spatial size unchanged, so the input can be added to the output without a projection.
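
The padding rule above can be checked with a few lines of arithmetic. The sketch below assumes the standard convolution output-size formula with dilation 1; the helper names are illustrative and are not part of the module.

```python
from typing import Optional


def default_padding(kernel_size: int) -> int:
    """Padding the blocks fall back to when none is passed."""
    return kernel_size // 2


def conv_output_size(size: int, kernel_size: int, stride: int = 1,
                     padding: Optional[int] = None) -> int:
    """Output length along one spatial axis (dilation assumed to be 1)."""
    if padding is None:
        padding = default_padding(kernel_size)
    return (size + 2 * padding - kernel_size) // stride + 1


print(conv_output_size(64, kernel_size=3, stride=1))             # 64: size preserved
print(conv_output_size(64, kernel_size=3, stride=2))             # 32: halved
print(conv_output_size(64, kernel_size=4, stride=1))             # 65: even kernels grow the output by one
print(conv_output_size(64, kernel_size=3, stride=1, padding=0))  # 62: explicit padding overrides the default
```

The third line shows why the "same size" default only holds for odd kernel sizes.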

Limitations

  • Flexibility: While these classes simplify the creation of common patterns, they can be less flexible than directly using PyTorch when non-standard configurations or additional customizations are required.
  • Dependency on spconv: The sparse convolution blocks depend on the spconv library, a separate package that must be installed alongside PyTorch and that might not be as widely used or supported as PyTorch’s native functionality.
  • Debugging: Abstracting layers into higher-level blocks can make debugging more difficult, as the internal operations are hidden away. Users may need to dig into the class implementations to troubleshoot issues.
  • Performance Overhead: Each wrapper adds an extra Python-level module call per layer. This cost is usually small next to the convolution itself, but it might be noticeable in performance-critical code.

Overall, these classes provide a convenient and structured way to build CNNs, particularly when using common patterns and when working with sparse data. However, for highly customized or performance-critical applications, a more direct approach using PyTorch’s lower-level APIs might be preferable.


Conv

 Conv (inplanes:int, planes:int, kernel_size:int, stride:int,
       conv_layer:torch.nn.modules.module.Module=<class
       'torch.nn.modules.conv.Conv2d'>, bias:bool=False, **kwargs)

*A convolutional layer module for neural networks.

This class is a wrapper around the specified convolutional layer type, providing a convenient way to include convolutional layers in neural networks with customizable parameters such as input channels, output channels, kernel size, stride, and padding.*

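| Parameter | Type | Default | Details |
|---|---|---|---|
| inplanes | int | | The number of input channels. |
| planes | int | | The number of output channels. |
| kernel_size | int | | The size of the convolving kernel. |
| stride | int | | The stride of the convolution. |
| conv_layer | Module | Conv2d | The convolutional layer class to be used. |
| bias | bool | False | If True, adds a learnable bias to the output. |
| kwargs | | | Arbitrary keyword arguments. Currently supports 'padding'. |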
Exported source
import torch
import torch.nn as nn

class Conv(nn.Module):
    """
    A convolutional layer module for neural networks.

    This class is a wrapper around the specified convolutional layer type, 
    providing a convenient way to include convolutional layers in neural networks 
    with customizable parameters such as input channels, output channels, kernel size, 
    stride, and padding.
    """
    def __init__(self,
                 inplanes:int, # The number of input channels.
                 planes:int, # The number of output channels.
                 kernel_size:int, # The size of the convolving kernel.
                 stride:int, # The stride of the convolution.
                 conv_layer:nn.Module=nn.Conv2d, # The convolutional layer class to be used.
                 bias:bool=False, # If `True`, adds a learnable bias to the output.
                 **kwargs # Arbitrary keyword arguments. Currently supports 'padding'.
                 ):
        super(Conv, self).__init__()
        padding = kwargs.get('padding', kernel_size // 2)  # default: same size

        self.conv = conv_layer(inplanes, planes, kernel_size=kernel_size, stride=stride,
                               padding=padding, bias=bias)
                        
    def forward(self, x):
        return self.conv(x)
# Define input tensor with shape (batch_size, in_channels, height, width)
input_tensor = torch.randn(1, 3, 64, 64)  # Example with batch_size=1, in_channels=3, height=64, width=64

# Create an instance of the Conv class
conv_layer = Conv(inplanes=3, planes=16, kernel_size=3, stride=1)

# Pass the input tensor through the convolutional layer
output_tensor = conv_layer(input_tensor)

# Print the shape of the output tensor
print("Output tensor shape:", output_tensor.shape)
Output tensor shape: torch.Size([1, 16, 64, 64])

ConvBlock

 ConvBlock (inplanes:int, planes:int, kernel_size:int, stride:int=1,
            conv_layer:torch.nn.modules.module.Module=<class
            'torch.nn.modules.conv.Conv2d'>,
            norm_layer:torch.nn.modules.module.Module=<class
            'torch.nn.modules.batchnorm.BatchNorm2d'>,
            act_layer:torch.nn.modules.module.Module=<class
            'torch.nn.modules.activation.ReLU'>, **kwargs)

*A convolutional block module combining a convolutional layer, a normalization layer, and an activation layer.

This class encapsulates a common pattern found in neural networks, where a convolution is followed by batch normalization and a non-linear activation function. It provides a convenient way to stack these operations into a single module.*

| Parameter | Type | Default | Details |
|---|---|---|---|
| inplanes | int | | The number of input channels. |
| planes | int | | The number of output channels. |
| kernel_size | int | | The size of the convolving kernel. |
| stride | int | 1 | The stride of the convolution. |
| conv_layer | Module | Conv2d | The convolutional layer class to be used. |
| norm_layer | Module | BatchNorm2d | The normalization layer class to be used. |
| act_layer | Module | ReLU | The activation function class to be used. |
| kwargs | | | Arbitrary keyword arguments. Currently supports 'padding'. |
Exported source
class ConvBlock(nn.Module):
    """
    A convolutional block module combining a convolutional layer, a normalization layer, 
    and an activation layer.

    This class encapsulates a common pattern found in neural networks, where a convolution 
    is followed by batch normalization and a non-linear activation function. It provides 
    a convenient way to stack these operations into a single module.
    """
    def __init__(self,
                 inplanes: int, # The number of input channels.
                 planes: int, # The number of output channels.
                 kernel_size: int, # The size of the convolving kernel.
                 stride:int=1, # The stride of the convolution.
                 conv_layer:nn.Module=nn.Conv2d, # The convolutional layer class to be used.
                 norm_layer:nn.Module=nn.BatchNorm2d, # The normalization layer class to be used.
                 act_layer:nn.Module=nn.ReLU, # The activation function class to be used.
                 **kwargs # Arbitrary keyword arguments. Currently supports 'padding'.
                 ):
        super(ConvBlock, self).__init__()
        padding = kwargs.get('padding', kernel_size // 2)  # default: same size

        self.conv = Conv(inplanes, planes, kernel_size=kernel_size, stride=stride,
                               padding=padding, bias=False, conv_layer=conv_layer)

        self.norm = norm_layer(planes)
        self.act = act_layer()

    def forward(self, x):
        out = self.conv(x)
        out = self.norm(out)
        out = self.act(out)
        return out
# Define an instance of the ConvBlock
conv_block = ConvBlock(inplanes=3, planes=16, kernel_size=3, stride=1)

# Create a dummy input tensor with shape (batch_size, channels, height, width)
dummy_input = torch.randn(1, 3, 64, 64)  # Example: batch size of 1, 3 input channels, 64x64 image

# Pass the dummy input through the ConvBlock
output = conv_block(dummy_input)

# Print the shape of the output tensor
print("Output shape:", output.shape)
Output shape: torch.Size([1, 16, 64, 64])
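
ConvBlock always passes bias=False to Conv. The page does not say why; a common reason, assumed here, is that a normalization layer placed directly after the convolution subtracts the per-channel mean, so a constant bias would cancel out. The plain-Python sketch below illustrates this for a single channel. The batch_norm helper is a simplified stand-in (batch statistics only, no affine parameters) and is not part of the module.

```python
from statistics import fmean, pvariance


def batch_norm(values, eps=1e-5):
    """Normalize one channel with its own batch mean and (biased) variance."""
    mean = fmean(values)
    var = pvariance(values, mu=mean)
    return [(v - mean) / (var + eps) ** 0.5 for v in values]


conv_out = [0.5, -1.2, 3.0, 0.7]  # made-up convolution outputs for one channel
bias = 2.5                        # made-up constant bias

without_bias = batch_norm(conv_out)
with_bias = batch_norm([v + bias for v in conv_out])

# The two normalized results agree up to floating-point rounding.
max_diff = max(abs(a - b) for a, b in zip(without_bias, with_bias))
print(max_diff < 1e-9)
```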

BasicBlock

 BasicBlock (inplanes:int, kernel_size:int=3)

*A basic residual block module for neural networks.

This class implements a basic version of the residual block: two convolutional blocks (each a convolution, normalization, and activation), followed by element-wise addition of the input (identity) and a final activation function. The block keeps the channel count and spatial size unchanged, so the identity needs no projection. It is a fundamental component in ResNet architectures, allowing for the training of very deep networks by mitigating the vanishing gradient problem.*

| Parameter | Type | Default | Details |
|---|---|---|---|
| inplanes | int | | Number of input channels |
| kernel_size | int | 3 | Size of the convolving kernel |
Exported source
class BasicBlock(nn.Module):
    """
    A basic residual block module for neural networks.

    This class implements a basic version of the residual block, consisting of two convolutional 
    blocks followed by an addition operation with the input (identity) and an activation function. 
    It is a fundamental component in ResNet architectures, allowing for the training of very deep 
    networks by addressing the vanishing gradient problem.
    """

    def __init__(self,
                 inplanes:int, # Number of input channels
                 kernel_size:int=3 # Size of the convolving kernel
                 ):
        super(BasicBlock, self).__init__()
        self.block1 = ConvBlock(inplanes, inplanes, kernel_size=kernel_size)
        self.block2 = ConvBlock(inplanes, inplanes, kernel_size=kernel_size)
        self.act = nn.ReLU()

    def forward(self, x):
        identity = x
        out = self.block1(x)
        out = self.block2(out)
        out = out + identity  # Out-of-place add: an in-place += would overwrite the ReLU output that autograd needs for backward
        out = self.act(out)  # Apply activation function

        return out
# Instantiate the BasicBlock
basic_block = BasicBlock(64)

# Print the structure of the basic_block to understand its components
print(basic_block)

# Create a random tensor with shape (batch_size, channels, height, width)
# Let's assume a batch size of 1, with 64 channels, and spatial dimensions 32x32
input_tensor = torch.randn(1, 64, 32, 32)

# Pass the input tensor through the BasicBlock
output_tensor = basic_block(input_tensor)

# Print the shape of the output tensor
print("Output shape:", output_tensor.shape)
BasicBlock(
  (block1): ConvBlock(
    (conv): Conv(
      (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    )
    (norm): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (act): ReLU()
  )
  (block2): ConvBlock(
    (conv): Conv(
      (conv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    )
    (norm): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (act): ReLU()
  )
  (act): ReLU()
)
Output shape: torch.Size([1, 64, 32, 32])
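
The printed structure above can be turned into a quick parameter count. The sketch below assumes what the printout shows: square kernels, convolutions without bias, and BatchNorm2d with affine=True (one weight and one bias per channel). The helper names are illustrative and not part of the module.

```python
def conv_params(in_ch: int, out_ch: int, kernel_size: int, bias: bool = False) -> int:
    """Learnable parameters of a 2D convolution with a square kernel."""
    return in_ch * out_ch * kernel_size * kernel_size + (out_ch if bias else 0)


def batchnorm_params(channels: int) -> int:
    """Learnable parameters of an affine batch norm: weight and bias per channel."""
    return 2 * channels


def conv_block_params(in_ch: int, out_ch: int, kernel_size: int) -> int:
    """Convolution (no bias) + batch norm; ReLU has no parameters."""
    return conv_params(in_ch, out_ch, kernel_size) + batchnorm_params(out_ch)


def basic_block_params(channels: int, kernel_size: int = 3) -> int:
    """Two ConvBlocks with matching channel counts; the final ReLU adds nothing."""
    return 2 * conv_block_params(channels, channels, kernel_size)


print(conv_params(64, 64, 3))   # 36864
print(basic_block_params(64))   # 73984
```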

SparseConvBlock

 SparseConvBlock (in_channels:int, out_channels:int, kernel_size:int,
                  stride, use_subm:bool=True, bias:bool=False)

*Initializes a sparse convolutional block for 2D inputs.

This block uses SubMConv2d when stride is 1 and use_subm is True, and SparseConv2d otherwise. The convolution is followed by a normalization layer and an activation layer, both applied to the features of the sparse tensor.*

| Parameter | Type | Default | Details |
|---|---|---|---|
| in_channels | int | | Number of channels in the input tensor. |
| out_channels | int | | Number of channels produced by the convolution. |
| kernel_size | int | | Size of the convolving kernel. |
| stride | | | Stride of the convolution. |
| use_subm | bool | True | Whether to use SubMConv2d for stride 1. |
| bias | bool | False | If True, adds a learnable bias to the output. |
Exported source
class SparseConvBlock(spconv.pytorch.SparseModule):
    '''
    Initializes a sparse convolutional block for 2D inputs.

    This block uses SubMConv2d when stride is 1 and use_subm is True, and SparseConv2d otherwise.
    The convolution is followed by a normalization layer and an activation layer, both applied to the features.
    '''

    def __init__(self,
                 in_channels: int, # Number of channels in the input tensor.
                 out_channels: int, # Number of channels produced by the convolution.
                 kernel_size: int, # Size of the convolving kernel.
                 stride, # Stride of the convolution.
                 use_subm:bool=True, # Whether to use SubMConv2d for stride 1.
                 bias:bool=False # If True, adds a learnable bias to the output.
                 ):
        super(SparseConvBlock, self).__init__()
        if stride == 1 and use_subm:
            self.conv = spconv.pytorch.SubMConv2d(in_channels, out_channels, kernel_size,
                                                  padding=kernel_size//2, stride=1, bias=bias,)
        else:
            self.conv = spconv.pytorch.SparseConv2d(in_channels, out_channels, kernel_size,
                                                    padding=kernel_size//2, stride=stride, bias=bias)

        self.norm = nn.BatchNorm1d(out_channels, eps=1e-3, momentum=0.01)
        self.act = nn.ReLU()

    def forward(self, x):
        out = self.conv(x)
        out = replace_feature(out, self.norm(out.features))
        out = replace_feature(out, self.act(out.features))

        return out
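# Example usage. DEVICE is assumed to be defined earlier, e.g. torch.device('cuda').
# Each index row is [batch_idx, y, x]; batch_idx must be < batch_size and rows should be unique.
flat = torch.randperm(10 * 10)[:5]  # 5 distinct positions on the 10x10 grid
indices = torch.stack([torch.zeros_like(flat), flat // 10, flat % 10], dim=1).to(torch.int32)
input_tensor = spconv.pytorch.SparseConvTensor(features=torch.randn(5, 3).to(DEVICE),
                                               indices=indices.to(DEVICE),
                                               spatial_shape=[10, 10],
                                               batch_size=1)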
conv_block = SparseConvBlock(3, 16, 3, 1).to(DEVICE)
output_tensor = conv_block(input_tensor)
print(output_tensor)
SparseConvTensor[shape=torch.Size([5, 16])]
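
The output above has the same number of rows as the input because the block picked the submanifold convolution. The toy model below sketches the usual description of the two layer types: a submanifold convolution keeps exactly the input's active sites, while a regular sparse convolution activates every output site whose kernel window touches an active input. It is plain Python written for illustration, not spconv's implementation, and the function names are made up.

```python
from typing import Set, Tuple

Site = Tuple[int, int]


def submanifold_active_sites(active: Set[Site]) -> Set[Site]:
    """Submanifold convolution: outputs exist only where inputs exist."""
    return set(active)


def sparse_conv_active_sites(active: Set[Site], spatial_shape: Tuple[int, int],
                             kernel_size: int, stride: int, padding: int) -> Set[Site]:
    """Regular sparse convolution: an output site is active if its window covers any active input."""
    out_shape = tuple((n + 2 * padding - kernel_size) // stride + 1 for n in spatial_shape)
    out = set()
    for y, x in active:
        for dy in range(kernel_size):
            for dx in range(kernel_size):
                # Output o reads input i = o * stride - padding + d, so o = (i + padding - d) / stride.
                ny, nx = y + padding - dy, x + padding - dx
                if ny % stride or nx % stride:
                    continue  # no output site lines up with this kernel offset
                oy, ox = ny // stride, nx // stride
                if 0 <= oy < out_shape[0] and 0 <= ox < out_shape[1]:
                    out.add((oy, ox))
    return out


active = {(5, 5)}
print(len(submanifold_active_sites(active)))                     # 1
print(len(sparse_conv_active_sites(active, (10, 10), 3, 1, 1)))  # 9
print(len(sparse_conv_active_sites(active, (10, 10), 3, 2, 1)))  # 4
```

With stride 1, a single active site grows to a 3x3 patch under the regular sparse convolution, which is why stacking such layers makes the tensor progressively denser.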

SparseBasicBlock

 SparseBasicBlock (channels:int, kernel_size)

*A basic block for sparse convolutional networks, specifically designed for 2D inputs.

This block consists of two convolutional layers. The first is followed by normalization and activation; the second is followed by normalization only. The input features are then added to the result (residual connection) before the final activation function is applied.*

| Parameter | Type | Details |
|---|---|---|
| channels | int | Number of channels in the input tensor. |
| kernel_size | | Size of the convolving kernel. |
Exported source
class SparseBasicBlock(spconv.pytorch.SparseModule):
    '''
    A basic block for sparse convolutional networks, specifically designed for 2D inputs.

    This block consists of two convolutional layers. The first is followed by normalization and
    activation; the second is followed by normalization only. The input features are then added to
    the result (residual connection) before the final activation function is applied.
    '''

    def __init__(self,
                 channels:int, # Number of channels in the input tensor.
                 kernel_size # Size of the convolving kernel.
                 ):
        super(SparseBasicBlock, self).__init__()
        self.block1 = SparseConvBlock(channels, channels, kernel_size, 1)
        self.conv2 = spconv.pytorch.SubMConv2d(channels, channels, kernel_size, padding=kernel_size//2,
                                               stride=1, bias=False, algo=ConvAlgo.Native, )
        self.norm2 = nn.BatchNorm1d(channels, eps=1e-3, momentum=0.01)
        self.act2 = nn.ReLU()

    def forward(self, x):
        identity = x
        out = self.block1(x)
        out = self.conv2(out)
        out = replace_feature(out, self.norm2(out.features))
        out = replace_feature(out, out.features + identity.features)
        out = replace_feature(out, self.act2(out.features))

        return out
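# Example usage
# Each index row is [batch_idx, y, x]; batch_idx must be < batch_size and rows should be unique.
flat = torch.randperm(10 * 10)[:5]  # 5 distinct positions on the 10x10 grid
indices = torch.stack([torch.zeros_like(flat), flat // 10, flat % 10], dim=1).to(torch.int32)
input_tensor = spconv.pytorch.SparseConvTensor(features=torch.randn(5, 3).to(DEVICE),
                                               indices=indices.to(DEVICE),
                                               spatial_shape=[10, 10],
                                               batch_size=1)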
basic_block = SparseBasicBlock(3, 3).to(DEVICE)
output_tensor = basic_block(input_tensor)
print(output_tensor)
SparseConvTensor[shape=torch.Size([5, 3])]
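
The sparse blocks call a replace_feature helper whose definition is not shown on this page. A minimal sketch of what such a helper might look like follows, under two assumptions: the tensor object either offers its own replace_feature method or has a writable features attribute. ToyTensor is a stand-in invented here so the sketch runs without spconv.

```python
def replace_feature(out, new_features):
    """Return a sparse tensor like `out` that carries `new_features`.

    Hypothetical helper: the implementation used by the module is not shown.
    """
    if hasattr(out, "replace_feature"):
        # Prefer the tensor's own method when it has one.
        return out.replace_feature(new_features)
    # Otherwise fall back to overwriting the attribute in place.
    out.features = new_features
    return out


class ToyTensor:
    """Stand-in for a sparse tensor, used only to exercise the helper."""

    def __init__(self, features, indices):
        self.features = features
        self.indices = indices


t = ToyTensor(features=[1.0, -2.0], indices=[(0, 0, 0), (0, 1, 1)])
t = replace_feature(t, [max(f, 0.0) for f in t.features])  # ReLU on the features only
print(t.features, t.indices)  # [1.0, 0.0] [(0, 0, 0), (0, 1, 1)]
```

Only the features change; the indices stay as they were, which is what lets the blocks apply BatchNorm1d and ReLU to the feature matrix between sparse convolutions.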

SparseConv3dBlock

 SparseConv3dBlock (in_channels:int, out_channels:int, kernel_size,
                    stride, use_subm:bool=True)

*Initializes a sparse convolutional block for 3D inputs.

This block uses SubMConv3d when stride is 1 and use_subm is True, and SparseConv3d otherwise. The convolution is followed by a normalization layer and an activation layer, both applied to the features of the sparse tensor.*

| Parameter | Type | Default | Details |
|---|---|---|---|
| in_channels | int | | Number of channels in the input tensor. |
| out_channels | int | | Number of channels produced by the convolution. |
| kernel_size | | | Size of the convolving kernel. |
| stride | | | Stride of the convolution. |
| use_subm | bool | True | Whether to use SubMConv3d for stride 1. |
Exported source
class SparseConv3dBlock(spconv.pytorch.SparseModule):
    '''
    Initializes a sparse convolutional block for 3D inputs.

    This block uses SubMConv3d when stride is 1 and use_subm is True, and SparseConv3d otherwise.
    The convolution is followed by a normalization layer and an activation layer, both applied to the features.
    '''
    def __init__(self,
                in_channels: int, # Number of channels in the input tensor.
                out_channels: int, # Number of channels produced by the convolution.
                kernel_size, # Size of the convolving kernel.
                stride, # Stride of the convolution.
                use_subm:bool=True # Whether to use SubMConv3d for stride 1.
                ):
        super(SparseConv3dBlock, self).__init__()
        if stride == 1 and use_subm:
            self.conv = spconv.pytorch.SubMConv3d(in_channels, out_channels, kernel_size, padding=kernel_size//2,
                                                  stride=1, bias=False)
        else:
            self.conv = spconv.pytorch.SparseConv3d(in_channels, out_channels, kernel_size, padding=kernel_size//2,
                                                    stride=stride, bias=False)

        self.norm = nn.BatchNorm1d(out_channels, eps=1e-3, momentum=0.01)
        self.act = nn.ReLU()

    def forward(self, x):
        out = self.conv(x)
        out = replace_feature(out, self.norm(out.features))
        out = replace_feature(out, self.act(out.features))

        return out
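# Example usage
# Each index row is [batch_idx, z, y, x]; batch_idx must be < batch_size and rows should be unique.
flat = torch.randperm(10 * 10 * 10)[:5]  # 5 distinct positions in the 10x10x10 grid
indices = torch.stack([torch.zeros_like(flat), flat // 100, (flat // 10) % 10, flat % 10], dim=1).to(torch.int32)
input_tensor = spconv.pytorch.SparseConvTensor(features=torch.randn(5, 3).to(DEVICE),
                                               indices=indices.to(DEVICE),
                                               spatial_shape=[10, 10, 10],
                                               batch_size=1)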
conv3d_block = SparseConv3dBlock(3, 16, 3, 1).to(DEVICE)
output_tensor = conv3d_block(input_tensor)
print(output_tensor)
SparseConvTensor[shape=torch.Size([5, 16])]
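
The sparse blocks build BatchNorm1d with momentum=0.01 instead of PyTorch's default of 0.1. PyTorch documents the running-statistic update as new = (1 - momentum) * old + momentum * observed, so a smaller momentum makes the running mean and variance move more slowly. The sketch below reproduces only that update rule in plain Python; the function names are illustrative.

```python
def update_running_stat(running: float, observed: float, momentum: float) -> float:
    """One running-statistic update step, as documented for PyTorch batch norm."""
    return (1 - momentum) * running + momentum * observed


def running_after(steps: int, momentum: float, observed: float = 1.0, start: float = 0.0) -> float:
    """Running statistic after `steps` batches that all report the same observed value."""
    running = start
    for _ in range(steps):
        running = update_running_stat(running, observed, momentum)
    return running


# Starting from 0 with every batch reporting 1.0:
slow = running_after(100, momentum=0.01)  # roughly 0.63 of the way there
fast = running_after(100, momentum=0.1)   # essentially converged
print(round(slow, 3), round(fast, 3))
```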

SparseBasicBlock3d

 SparseBasicBlock3d (channels:int, kernel_size)

*A basic block for sparse convolutional networks, specifically designed for 3D inputs.

This block consists of two convolutional layers. The first is followed by normalization and activation; the second is followed by normalization only. The input features are then added to the result (residual connection) before the final activation function is applied.*

| Parameter | Type | Details |
|---|---|---|
| channels | int | Number of channels in the input tensor. |
| kernel_size | | Size of the convolving kernel. |
Exported source
class SparseBasicBlock3d(spconv.pytorch.SparseModule):
    '''
    A basic block for sparse convolutional networks, specifically designed for 3D inputs.

    This block consists of two convolutional layers. The first is followed by normalization and
    activation; the second is followed by normalization only. The input features are then added to
    the result (residual connection) before the final activation function is applied.
    '''
    def __init__(self,
                 channels:int, # Number of channels in the input tensor.
                 kernel_size # Size of the convolving kernel.
                 ):
        super(SparseBasicBlock3d, self).__init__()
        self.block1 = SparseConv3dBlock(channels, channels, kernel_size, 1)
        self.conv2 = spconv.pytorch.SubMConv3d(channels, channels, kernel_size, padding=kernel_size//2,
                                               stride=1, bias=False)
        self.norm2 = nn.BatchNorm1d(channels, eps=1e-3, momentum=0.01)
        self.act2 = nn.ReLU()

    def forward(self, x):
        identity = x
        out = self.block1(x)
        out = self.conv2(out)
        out = replace_feature(out, self.norm2(out.features))
        out = replace_feature(out, out.features + identity.features)
        out = replace_feature(out, self.act2(out.features))

        return out
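# Example usage
# Each index row is [batch_idx, z, y, x]; batch_idx must be < batch_size and rows should be unique.
flat = torch.randperm(10 * 10 * 10)[:5]  # 5 distinct positions in the 10x10x10 grid
indices = torch.stack([torch.zeros_like(flat), flat // 100, (flat // 10) % 10, flat % 10], dim=1).to(torch.int32)
input_tensor = spconv.pytorch.SparseConvTensor(features=torch.randn(5, 3).to(DEVICE),
                                               indices=indices.to(DEVICE),
                                               spatial_shape=[10, 10, 10],
                                               batch_size=1)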
basic_block3d = SparseBasicBlock3d(3, 3).to(DEVICE)
output_tensor = basic_block3d(input_tensor)
print(output_tensor)
SparseConvTensor[shape=torch.Size([5, 3])]
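
The example tensors on this page pass indices with one more column than there are spatial dimensions. The sketch below assumes the layout usually described for spconv, where the first column is the batch index and the remaining columns are spatial coordinates, and checks a list of index rows against it before a tensor is built. check_indices is a hypothetical helper written for illustration; it works on plain Python sequences and is not part of the module or of spconv.

```python
from typing import List, Sequence


def check_indices(indices: Sequence[Sequence[int]], spatial_shape: Sequence[int],
                  batch_size: int) -> List[str]:
    """Return a list of problems found in the index rows (empty if none)."""
    problems = []
    seen = set()
    expected_cols = len(spatial_shape) + 1  # batch index + one column per spatial axis
    for i, row in enumerate(indices):
        row = tuple(row)
        if len(row) != expected_cols:
            problems.append(f"row {i}: expected {expected_cols} columns, got {len(row)}")
            continue
        batch_idx, coords = row[0], row[1:]
        if not 0 <= batch_idx < batch_size:
            problems.append(f"row {i}: batch index {batch_idx} outside [0, {batch_size})")
        for axis, (c, n) in enumerate(zip(coords, spatial_shape)):
            if not 0 <= c < n:
                problems.append(f"row {i}: coordinate {c} on axis {axis} outside [0, {n})")
        if row in seen:
            problems.append(f"row {i}: duplicate of an earlier row")
        seen.add(row)
    return problems


good = [(0, 1, 2, 3), (0, 4, 5, 6)]
bad = [(3, 1, 2, 3), (0, 1, 2, 10), (0, 4, 5, 6), (0, 4, 5, 6)]
print(check_indices(good, spatial_shape=(10, 10, 10), batch_size=1))  # []
for problem in check_indices(bad, spatial_shape=(10, 10, 10), batch_size=1):
    print(problem)
```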