MVLidarNet

(UNDER CONSTRUCTION…)

An implementation of the architecture from Chen et al., "MVLidarNet: Real-Time Multi-Class LiDAR Point Cloud Segmentation and Object Detection for Autonomous Driving" (2020).


source

ConvBNReLU

 ConvBNReLU (in_channels, out_channels, kernel_size, stride, padding,
             has_ReLU=True)

Sequential composition of convolution, batch normalization, and an optional ReLU (controlled by `has_ReLU`).

import torch

bs, in_c, out_c, h, w = 1, 5, 64, 64, 2048
inp = torch.randn(bs, in_c, h, w)

b = ConvBNReLU(in_c, out_c, 3, 1, 1)
outp = b(inp)
assert outp.shape == (bs, out_c, h, w)
print(outp.shape, f'== ({bs}, {out_c}, {h}, {w})')
torch.Size([1, 64, 64, 2048]) == (1, 64, 64, 2048)
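A minimal sketch of what `ConvBNReLU` plausibly looks like. The `bias=False` convolution and the `inplace` ReLU are assumptions, not confirmed by the source:

```python
import torch
from torch import nn

class ConvBNReLU(nn.Sequential):
    """Conv2d -> BatchNorm2d -> optional ReLU (sketch)."""
    def __init__(self, in_channels, out_channels, kernel_size, stride, padding,
                 has_ReLU=True):
        layers = [
            # bias is redundant when the conv is followed by batch norm (assumption)
            nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding,
                      bias=False),
            nn.BatchNorm2d(out_channels),
        ]
        if has_ReLU:
            layers.append(nn.ReLU(inplace=True))
        super().__init__(*layers)
```

Subclassing `nn.Sequential` keeps the module a pure feed-forward chain, so `forward` needs no override.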

source

InceptionV2

 InceptionV2 (in_channels, out_channels)

InceptionV2 block from "Rethinking the Inception Architecture for Computer Vision" (Szegedy et al., 2016).

b = InceptionV2(in_c, out_c)
outp = b(inp)
assert outp.shape == (bs, out_c, h, w)
print(outp.shape, f'== ({bs}, {out_c}, {h}, {w})')
torch.Size([1, 64, 64, 2048]) == (1, 64, 64, 2048)
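One plausible reading of the block: four parallel branches concatenated along the channel dimension. The even split of `out_channels` across the branches and the exact branch composition are assumptions:

```python
import torch
from torch import nn

def conv_bn_relu(c_in, c_out, k, s=1, p=0):
    return nn.Sequential(nn.Conv2d(c_in, c_out, k, s, p, bias=False),
                         nn.BatchNorm2d(c_out), nn.ReLU(inplace=True))

class InceptionV2(nn.Module):
    """Four parallel branches, channel-concatenated (sketch).

    Assumes out_channels is divisible by 4.
    """
    def __init__(self, in_channels, out_channels):
        super().__init__()
        c = out_channels // 4  # assumed even split across branches
        self.b1 = conv_bn_relu(in_channels, c, 1)
        self.b2 = nn.Sequential(conv_bn_relu(in_channels, c, 1),
                                conv_bn_relu(c, c, 3, p=1))
        self.b3 = nn.Sequential(conv_bn_relu(in_channels, c, 1),
                                conv_bn_relu(c, c, 3, p=1),
                                conv_bn_relu(c, c, 3, p=1))
        self.b4 = nn.Sequential(nn.MaxPool2d(3, 1, 1),
                                conv_bn_relu(in_channels, c, 1))
    def forward(self, x):
        # all branches preserve H x W, so concatenation is well-defined
        return torch.cat([self.b1(x), self.b2(x), self.b3(x), self.b4(x)], dim=1)
```

Stacking two 3x3 convolutions in the third branch gives a 5x5 receptive field at lower cost, which is the factorization the Inception paper advocates.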

source

InceptionBlock

 InceptionBlock (in_channels, out_channels, n_modules, has_pool=False)

Sequential composition of `n_modules` InceptionV2 modules, with optional pooling (`has_pool=True`).

b = InceptionBlock(in_c, out_c, 2)
outp = b(inp)
assert outp.shape == (bs, out_c, h, w)
print(outp.shape, f'== ({bs}, {out_c}, {h}, {w})')
torch.Size([1, 64, 64, 2048]) == (1, 64, 64, 2048)
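A sketch of the composition logic only. To keep the example self-contained, a plain conv block stands in for `InceptionV2`, and placing the max-pool before the modules is an assumption:

```python
import torch
from torch import nn

def module(c_in, c_out):
    # Stand-in for InceptionV2(c_in, c_out); keeps the sketch self-contained.
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, 1, 1, bias=False),
                         nn.BatchNorm2d(c_out), nn.ReLU(inplace=True))

class InceptionBlock(nn.Sequential):
    """n_modules Inception-style modules back to back; optional 2x2
    max-pool first to halve the resolution (placement assumed)."""
    def __init__(self, in_channels, out_channels, n_modules, has_pool=False):
        layers = [nn.MaxPool2d(2, 2)] if has_pool else []
        # first module maps in_channels -> out_channels, the rest keep out_channels
        channels = [in_channels] + [out_channels] * n_modules
        for c_in, c_out in zip(channels[:-1], channels[1:]):
            layers.append(module(c_in, c_out))
        super().__init__(*layers)
```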

source

Encoder

 Encoder (in_channels=5)

MVLidarNet encoder architecture.

enc = Encoder()
outp = enc(inp)
[o.shape for o in outp]
[torch.Size([1, 5, 64, 2048]),
 torch.Size([1, 64, 32, 1024]),
 torch.Size([1, 64, 16, 512]),
 torch.Size([1, 128, 8, 256])]
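The printed shapes suggest the encoder returns the raw input plus three progressively halved feature maps (5 → 64 → 64 → 128 channels at /2, /4, /8 resolution). A sketch consistent with those shapes, with plain conv blocks standing in for the Inception blocks:

```python
import torch
from torch import nn

def stage(c_in, c_out):
    # 2x2 max-pool then a conv block; stands in for an Inception block.
    return nn.Sequential(nn.MaxPool2d(2, 2),
                         nn.Conv2d(c_in, c_out, 3, 1, 1, bias=False),
                         nn.BatchNorm2d(c_out), nn.ReLU(inplace=True))

class Encoder(nn.Module):
    """Returns the input plus three downsampled feature maps (sketch)."""
    def __init__(self, in_channels=5):
        super().__init__()
        self.stage1 = stage(in_channels, 64)  # /2,  64 channels
        self.stage2 = stage(64, 64)           # /4,  64 channels
        self.stage3 = stage(64, 128)          # /8, 128 channels
    def forward(self, x):
        f1 = self.stage1(x)
        f2 = self.stage2(f1)
        f3 = self.stage3(f2)
        return [x, f1, f2, f3]
```

Returning the intermediate maps (rather than only the deepest one) is what lets the decoder use them as skip connections.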

source

Decoder

 Decoder ()

MVLidarNet decoder architecture.

dec = Decoder()
fts = dec(outp)
assert fts.shape == (bs, out_c, h, w)
print(fts.shape, f'== ({bs}, {out_c}, {h}, {w})')
torch.Size([1, 64, 64, 2048]) == (1, 64, 64, 2048)
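Given the encoder's output shapes above, the decoder must map the /8, 128-channel map back to a 64-channel map at full resolution. One sketch that matches, using stride-2 transposed convolutions with channel-concatenated skips; the exact fusion scheme is an assumption:

```python
import torch
from torch import nn

def up(c_in, c_out):
    # stride-2 transposed conv doubles H and W
    return nn.Sequential(nn.ConvTranspose2d(c_in, c_out, 2, 2),
                         nn.BatchNorm2d(c_out), nn.ReLU(inplace=True))

class Decoder(nn.Module):
    """Upsample /8 -> /1 with skip concatenation at each scale (sketch)."""
    def __init__(self):
        super().__init__()
        self.up3 = up(128, 64)      # /8 -> /4
        self.up2 = up(64 + 64, 64)  # /4 -> /2, after concat with f2
        self.up1 = up(64 + 64, 64)  # /2 -> /1, after concat with f1
    def forward(self, feats):
        x0, f1, f2, f3 = feats  # the raw-input skip x0 is unused in this sketch
        y = self.up3(f3)
        y = self.up2(torch.cat([y, f2], dim=1))
        y = self.up1(torch.cat([y, f1], dim=1))
        return y
```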

source

MVLidarNet

 MVLidarNet (in_channels=5, n_classes=7)

MVLidarNet semantic segmentation architecture.

n_classes = 7
model = MVLidarNet()
logits = model(inp)
assert logits.shape == (bs, n_classes, h, w)
print(logits.shape, f'== ({bs}, {n_classes}, {h}, {w})')
torch.Size([1, 7, 64, 2048]) == (1, 7, 64, 2048)
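Putting the pieces together, an end-to-end sketch under the same assumptions as above (plain conv stand-ins for the Inception blocks, assumed skip fusion), finished with a 1x1 classification head that produces per-pixel logits:

```python
import torch
from torch import nn

def cbr(c_in, c_out):
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, 1, 1, bias=False),
                         nn.BatchNorm2d(c_out), nn.ReLU(inplace=True))

def down(c_in, c_out):
    return nn.Sequential(nn.MaxPool2d(2, 2), cbr(c_in, c_out))

def up(c_in, c_out):
    return nn.Sequential(nn.ConvTranspose2d(c_in, c_out, 2, 2),
                         nn.BatchNorm2d(c_out), nn.ReLU(inplace=True))

class MVLidarNet(nn.Module):
    """Encoder -> decoder -> 1x1 classification head (sketch)."""
    def __init__(self, in_channels=5, n_classes=7):
        super().__init__()
        self.d1, self.d2, self.d3 = down(in_channels, 64), down(64, 64), down(64, 128)
        self.u3, self.u2, self.u1 = up(128, 64), up(128, 64), up(128, 64)
        self.head = nn.Conv2d(64, n_classes, 1)  # per-pixel class logits
    def forward(self, x):
        f1 = self.d1(x)
        f2 = self.d2(f1)
        f3 = self.d3(f2)
        y = self.u3(f3)
        y = self.u2(torch.cat([y, f2], dim=1))
        y = self.u1(torch.cat([y, f1], dim=1))
        return self.head(y)
```

The logits carry no softmax; for training they would typically go straight into `nn.CrossEntropyLoss`, which applies log-softmax internally.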