
Spatial Pyramid Pooling Improvements: SPP / SPPF / SimSPPF / ASPP / RFB / SPPCSPC / SPPFCSPC




Contents

1 Principles
  1.1 SPP (Spatial Pyramid Pooling)
  1.2 SPPF (Spatial Pyramid Pooling - Fast)
  1.3 SimSPPF (Simplified SPPF)
  1.4 ASPP (Atrous Spatial Pyramid Pooling)
  1.5 RFB (Receptive Field Block)
  1.6 SPPCSPC
  1.7 SPPFCSPC
2 Parameter Comparison
3 How to Add the Modules
4 Issue


Changelog 2022-08-16 09:33: added receptive-field annotations to the figures

Changelog 2022-08-29 23:40: added the SimSPPF module and tested its speed

Changelog 2022-08-30: corrected the SPPCSPC structure diagram

Changelog 2022-08-30: added the SPPFCSPC structure


1 Principles

1.1 SPP (Spatial Pyramid Pooling)

The SPP module was proposed by Kaiming He et al. in the 2015 paper "Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition".

SPP stands for Spatial Pyramid Pooling. The structure was designed mainly to solve two problems:

1. It avoids the image distortion caused by cropping and scaling image regions to a fixed input size.
2. It avoids repeatedly extracting convolutional features for overlapping image regions, which greatly speeds up candidate-box generation and saves computation.
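The first point is the original motivation of SPP: by pooling a feature map onto a few fixed grids, the network produces a fixed-length descriptor for inputs of any resolution. A minimal self-contained sketch of that idea (my own illustration with arbitrary bin sizes, not the yolov5 variant shown below):

import torch
import torch.nn as nn

def spp_classic(x, bins=(4, 2, 1)):
    # x: feature map of shape (N, C, H, W) with arbitrary H and W
    feats = [nn.AdaptiveMaxPool2d(b)(x).flatten(1) for b in bins]
    return torch.cat(feats, dim=1)  # fixed length: C * (16 + 4 + 1)

print(spp_classic(torch.randn(1, 256, 13, 13)).shape)  # torch.Size([1, 5376])
print(spp_classic(torch.randn(1, 256, 24, 17)).shape)  # torch.Size([1, 5376])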


import warnings

import torch
import torch.nn as nn

# Conv below is yolov5's standard convolution block (Conv2d + BatchNorm2d + SiLU)
# from models/common.py; the later snippets assume the same imports and Conv.

class SPP(nn.Module):
    # Spatial Pyramid Pooling (SPP) layer https://arxiv.org/abs/1406.4729
    def __init__(self, c1, c2, k=(5, 9, 13)):
        super().__init__()
        c_ = c1 // 2  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c_ * (len(k) + 1), c2, 1, 1)
        self.m = nn.ModuleList([nn.MaxPool2d(kernel_size=x, stride=1, padding=x // 2) for x in k])

    def forward(self, x):
        x = self.cv1(x)
        with warnings.catch_warnings():
            warnings.simplefilter('ignore')  # suppress torch 1.9.0 max_pool2d() warning
            return self.cv2(torch.cat([x] + [m(x) for m in self.m], 1))

1.2 SPPF (Spatial Pyramid Pooling - Fast)

SPPF was proposed by YOLOv5 author Glenn Jocher as a reworking of SPP. It replaces SPP's parallel 5/9/13 max-pools with three cascaded 5x5 max-pools that cover the same receptive fields, and it is much faster than SPP, hence the name SPP-Fast.


class SPPF(nn.Module):
    # Spatial Pyramid Pooling - Fast (SPPF) layer for YOLOv5 by Glenn Jocher
    def __init__(self, c1, c2, k=5):  # equivalent to SPP(k=(5, 9, 13))
        super().__init__()
        c_ = c1 // 2  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c_ * 4, c2, 1, 1)
        self.m = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)

    def forward(self, x):
        x = self.cv1(x)
        with warnings.catch_warnings():
            warnings.simplefilter('ignore')  # suppress torch 1.9.0 max_pool2d() warning
            y1 = self.m(x)
            y2 = self.m(y1)
            return self.cv2(torch.cat((x, y1, y2, self.m(y2)), 1))
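The speed-up comes from reusing intermediate pooling results: a 5x5 stride-1 max-pool applied twice covers a 9x9 window, and applied three times covers 13x13. A minimal self-contained check of this equivalence (my own illustration; the tensor size is arbitrary):

import torch
import torch.nn as nn

x = torch.randn(1, 32, 40, 40)
p5 = nn.MaxPool2d(5, stride=1, padding=2)
p9 = nn.MaxPool2d(9, stride=1, padding=4)
p13 = nn.MaxPool2d(13, stride=1, padding=6)

y5 = p5(x)
assert torch.equal(p5(y5), p9(x))       # two 5x5 pools == one 9x9 pool
assert torch.equal(p5(p5(y5)), p13(x))  # three 5x5 pools == one 13x13 pool
print("SPPF(k=5) sees the same 5/9/13 receptive fields as SPP(k=(5, 9, 13))")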

1.3 SimSPPF (Simplified SPPF)

This module comes from Meituan's YOLOv6. It appears to differ from SPPF only in the activation function (ReLU instead of SiLU). In a quick test, a single ConvBNReLU block ran roughly 18% faster than a ConvBNSiLU block.

class SimConv(nn.Module):
    '''Normal Conv with ReLU activation'''
    def __init__(self, in_channels, out_channels, kernel_size, stride, groups=1, bias=False):
        super().__init__()
        padding = kernel_size // 2
        self.conv = nn.Conv2d(
            in_channels,
            out_channels,
            kernel_size=kernel_size,
            stride=stride,
            padding=padding,
            groups=groups,
            bias=bias,
        )
        self.bn = nn.BatchNorm2d(out_channels)
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

    def forward_fuse(self, x):
        return self.act(self.conv(x))


class SimSPPF(nn.Module):
    '''Simplified SPPF with ReLU activation'''
    def __init__(self, in_channels, out_channels, kernel_size=5):
        super().__init__()
        c_ = in_channels // 2  # hidden channels
        self.cv1 = SimConv(in_channels, c_, 1, 1)
        self.cv2 = SimConv(c_ * 4, out_channels, 1, 1)
        self.m = nn.MaxPool2d(kernel_size=kernel_size, stride=1, padding=kernel_size // 2)

    def forward(self, x):
        x = self.cv1(x)
        with warnings.catch_warnings():
            warnings.simplefilter('ignore')
            y1 = self.m(x)
            y2 = self.m(y1)
            return self.cv2(torch.cat([x, y1, y2, self.m(y2)], 1))
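The 18% figure above is from my own quick test. A rough micro-benchmark sketch along the same lines (channel count, input shape and iteration counts are arbitrary assumptions, and results vary a lot between devices and backends):

import time
import torch
import torch.nn as nn

def conv_bn_act(act):
    return nn.Sequential(nn.Conv2d(128, 128, 3, 1, 1, bias=False),
                         nn.BatchNorm2d(128), act)

x = torch.randn(1, 128, 80, 80)
for name, act in [("ConvBNReLU", nn.ReLU()), ("ConvBNSiLU", nn.SiLU())]:
    m = conv_bn_act(act).eval()
    with torch.no_grad():
        for _ in range(10):          # warm-up
            m(x)
        t0 = time.time()
        for _ in range(100):
            m(x)
    print(f"{name}: {(time.time() - t0) * 10:.2f} ms / forward")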

1.4 ASPP (Atrous Spatial Pyramid Pooling)

Inspired by SPP, the semantic segmentation model DeepLabv2 introduced the ASPP module (Atrous Spatial Pyramid Pooling), which applies several parallel atrous (dilated) convolutions with different sampling rates to the same input. The features extracted at each rate are processed in separate branches and then fused to produce the final result. By using different dilation rates, the module builds kernels with different receptive fields and captures multi-scale object information. The structure itself is fairly simple.


ASPP was introduced in DeepLab and refined in later DeepLab versions, for example by adding BN layers and depthwise-separable convolutions, but the basic idea has stayed the same.

# without-BN version
import torch
import torch.nn as nn
import torch.nn.functional as F

class ASPP(nn.Module):
    def __init__(self, in_channel=512, out_channel=256):
        super(ASPP, self).__init__()
        self.mean = nn.AdaptiveAvgPool2d((1, 1))  # (1, 1) output size: image-level pooling
        self.conv = nn.Conv2d(in_channel, out_channel, 1, 1)
        self.atrous_block1 = nn.Conv2d(in_channel, out_channel, 1, 1)
        self.atrous_block6 = nn.Conv2d(in_channel, out_channel, 3, 1, padding=6, dilation=6)
        self.atrous_block12 = nn.Conv2d(in_channel, out_channel, 3, 1, padding=12, dilation=12)
        self.atrous_block18 = nn.Conv2d(in_channel, out_channel, 3, 1, padding=18, dilation=18)
        self.conv_1x1_output = nn.Conv2d(out_channel * 5, out_channel, 1, 1)

    def forward(self, x):
        size = x.shape[2:]
        image_features = self.mean(x)
        image_features = self.conv(image_features)
        # F.interpolate replaces the deprecated F.upsample
        image_features = F.interpolate(image_features, size=size, mode='bilinear', align_corners=False)
        atrous_block1 = self.atrous_block1(x)
        atrous_block6 = self.atrous_block6(x)
        atrous_block12 = self.atrous_block12(x)
        atrous_block18 = self.atrous_block18(x)
        net = self.conv_1x1_output(torch.cat([image_features, atrous_block1, atrous_block6,
                                              atrous_block12, atrous_block18], dim=1))
        return net
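A quick usage sketch (the input size is an arbitrary assumption): ASPP preserves the spatial resolution while mapping 512 to 256 channels, and the effective kernel of a 3x3 convolution with dilation d is 3 + 2(d-1), i.e. 13/25/37 for d = 6/12/18.

import torch

aspp = ASPP(in_channel=512, out_channel=256)   # class defined above
x = torch.randn(1, 512, 32, 32)
print(aspp(x).shape)                           # torch.Size([1, 256, 32, 32])
for d in (6, 12, 18):
    k_eff = 3 + 2 * (d - 1)
    print(f"dilation={d}: effective kernel {k_eff}x{k_eff}")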

1.5 RFB (Receptive Field Block)

The RFB module was proposed in the ECCV 2018 paper "Receptive Field Block Net for Accurate and Fast Object Detection". Its starting point is to mimic the receptive fields of human vision in order to strengthen the network's feature extraction. Structurally, RFB borrows the multi-branch idea of Inception and adds dilated convolutions on top of it, which effectively enlarges the receptive field.

RFBRFB-s的架构。RFB-s用于在浅层人类视网膜主题图中模拟较小的pRF,使用具有较小内核的更多分支。

class BasicConv(nn.Module):
    def __init__(self, in_planes, out_planes, kernel_size, stride=1, padding=0, dilation=1, groups=1, relu=True, bn=True):
        super(BasicConv, self).__init__()
        self.out_channels = out_planes
        if bn:
            self.conv = nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=stride, padding=padding, dilation=dilation, groups=groups, bias=False)
            self.bn = nn.BatchNorm2d(out_planes, eps=1e-5, momentum=0.01, affine=True)
            self.relu = nn.ReLU(inplace=True) if relu else None
        else:
            self.conv = nn.Conv2d(in_planes, out_planes, kernel_size=kernel_size, stride=stride, padding=padding, dilation=dilation, groups=groups, bias=True)
            self.bn = None
            self.relu = nn.ReLU(inplace=True) if relu else None

    def forward(self, x):
        x = self.conv(x)
        if self.bn is not None:
            x = self.bn(x)
        if self.relu is not None:
            x = self.relu(x)
        return x


class BasicRFB(nn.Module):
    def __init__(self, in_planes, out_planes, stride=1, scale=0.1, map_reduce=8, vision=1, groups=1):
        super(BasicRFB, self).__init__()
        self.scale = scale
        self.out_channels = out_planes
        inter_planes = in_planes // map_reduce
        self.branch0 = nn.Sequential(
            BasicConv(in_planes, inter_planes, kernel_size=1, stride=1, groups=groups, relu=False),
            BasicConv(inter_planes, 2 * inter_planes, kernel_size=(3, 3), stride=stride, padding=(1, 1), groups=groups),
            BasicConv(2 * inter_planes, 2 * inter_planes, kernel_size=3, stride=1, padding=vision + 1, dilation=vision + 1, relu=False, groups=groups)
        )
        self.branch1 = nn.Sequential(
            BasicConv(in_planes, inter_planes, kernel_size=1, stride=1, groups=groups, relu=False),
            BasicConv(inter_planes, 2 * inter_planes, kernel_size=(3, 3), stride=stride, padding=(1, 1), groups=groups),
            BasicConv(2 * inter_planes, 2 * inter_planes, kernel_size=3, stride=1, padding=vision + 2, dilation=vision + 2, relu=False, groups=groups)
        )
        self.branch2 = nn.Sequential(
            BasicConv(in_planes, inter_planes, kernel_size=1, stride=1, groups=groups, relu=False),
            BasicConv(inter_planes, (inter_planes // 2) * 3, kernel_size=3, stride=1, padding=1, groups=groups),
            BasicConv((inter_planes // 2) * 3, 2 * inter_planes, kernel_size=3, stride=stride, padding=1, groups=groups),
            BasicConv(2 * inter_planes, 2 * inter_planes, kernel_size=3, stride=1, padding=vision + 4, dilation=vision + 4, relu=False, groups=groups)
        )
        self.ConvLinear = BasicConv(6 * inter_planes, out_planes, kernel_size=1, stride=1, relu=False)
        self.shortcut = BasicConv(in_planes, out_planes, kernel_size=1, stride=stride, relu=False)
        self.relu = nn.ReLU(inplace=False)

    def forward(self, x):
        x0 = self.branch0(x)
        x1 = self.branch1(x)
        x2 = self.branch2(x)
        out = torch.cat((x0, x1, x2), 1)
        out = self.ConvLinear(out)
        short = self.shortcut(x)
        out = out * self.scale + short
        out = self.relu(out)
        return out
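A minimal usage sketch (channel count and feature-map size are arbitrary assumptions): with the default vision=1 the three branches use dilations 2, 3 and 5, and with stride=1 the block preserves the spatial resolution.

import torch

rfb = BasicRFB(in_planes=512, out_planes=512)   # classes defined above
x = torch.randn(1, 512, 20, 20)
print(rfb(x).shape)                             # torch.Size([1, 512, 20, 20])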

1.6 SPPCSPC

This is the SPP structure used in YOLOv7. It performs better than SPPF, but at a considerably higher parameter count and computational cost.


class SPPCSPC(nn.Module):
    # CSP https://github.com/WongKinYiu/CrossStagePartialNetworks
    def __init__(self, c1, c2, n=1, shortcut=False, g=1, e=0.5, k=(5, 9, 13)):
        super(SPPCSPC, self).__init__()
        c_ = int(2 * c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c1, c_, 1, 1)
        self.cv3 = Conv(c_, c_, 3, 1)
        self.cv4 = Conv(c_, c_, 1, 1)
        self.m = nn.ModuleList([nn.MaxPool2d(kernel_size=x, stride=1, padding=x // 2) for x in k])
        self.cv5 = Conv(4 * c_, c_, 1, 1)
        self.cv6 = Conv(c_, c_, 3, 1)
        self.cv7 = Conv(2 * c_, c2, 1, 1)

    def forward(self, x):
        x1 = self.cv4(self.cv3(self.cv1(x)))
        y1 = self.cv6(self.cv5(torch.cat([x1] + [m(x1) for m in self.m], 1)))
        y2 = self.cv2(x)
        return self.cv7(torch.cat((y1, y2), dim=1))
# Grouped SPPCSPC: with grouped convolutions the parameter count and FLOPs come
# back close to the baseline (see section 2); I have not verified how well it performs.
class SPPCSPC_group(nn.Module):
    def __init__(self, c1, c2, n=1, shortcut=False, g=1, e=0.5, k=(5, 9, 13)):
        super(SPPCSPC_group, self).__init__()
        c_ = int(2 * c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1, g=4)
        self.cv2 = Conv(c1, c_, 1, 1, g=4)
        self.cv3 = Conv(c_, c_, 3, 1, g=4)
        self.cv4 = Conv(c_, c_, 1, 1, g=4)
        self.m = nn.ModuleList([nn.MaxPool2d(kernel_size=x, stride=1, padding=x // 2) for x in k])
        self.cv5 = Conv(4 * c_, c_, 1, 1, g=4)
        self.cv6 = Conv(c_, c_, 3, 1, g=4)
        self.cv7 = Conv(2 * c_, c2, 1, 1, g=4)

    def forward(self, x):
        x1 = self.cv4(self.cv3(self.cv1(x)))
        y1 = self.cv6(self.cv5(torch.cat([x1] + [m(x1) for m in self.m], 1)))
        y2 = self.cv2(x)
        return self.cv7(torch.cat((y1, y2), dim=1))
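A quick way to compare the sizes of the two variants in isolation (the channel count is an arbitrary assumption; the classes above and yolov5's Conv must be in scope):

def n_params(module):
    return sum(p.numel() for p in module.parameters())

for cls in (SPPCSPC, SPPCSPC_group):
    print(f"{cls.__name__}(512, 512): {n_params(cls(512, 512)):,} parameters")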

1.7 SPPFCSPC

Borrowing the idea behind SPPF, I optimized SPPCSPC into SPPFCSPC, which gains speed while keeping the receptive field unchanged. I showed the module to the YOLOv7 author and it was not rejected; the full reply is in section 4 Issue.

class SPPFCSPC(nn.Module):
    # Contributed by 心动 from the YOLOAir object detection discussion group
    def __init__(self, c1, c2, n=1, shortcut=False, g=1, e=0.5, k=5):
        super(SPPFCSPC, self).__init__()
        c_ = int(2 * c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c1, c_, 1, 1)
        self.cv3 = Conv(c_, c_, 3, 1)
        self.cv4 = Conv(c_, c_, 1, 1)
        self.m = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)
        self.cv5 = Conv(4 * c_, c_, 1, 1)
        self.cv6 = Conv(c_, c_, 3, 1)
        self.cv7 = Conv(2 * c_, c2, 1, 1)

    def forward(self, x):
        x1 = self.cv4(self.cv3(self.cv1(x)))
        x2 = self.m(x1)
        x3 = self.m(x2)
        y1 = self.cv6(self.cv5(torch.cat((x1, x2, x3, self.m(x3)), 1)))
        y2 = self.cv2(x)
        return self.cv7(torch.cat((y1, y2), dim=1))
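Because the layer shapes match and the cascaded 5x5 pools cover the same 5/9/13 windows, SPPFCSPC(k=5) loaded with SPPCSPC's weights should reproduce SPPCSPC(k=(5, 9, 13)) exactly. A hedged check of that claim (my own sketch; it assumes both classes above plus yolov5's Conv are in scope, and the channel count is arbitrary):

import torch

a = SPPCSPC(256, 256).eval()
b = SPPFCSPC(256, 256).eval()
b.load_state_dict(a.state_dict())   # identical parameter names and shapes

x = torch.randn(1, 256, 20, 20)
with torch.no_grad():
    print(torch.allclose(a(x), b(x)))   # expected: True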

2 Parameter Comparison

Here each module replaces the SPP module in yolov5s.yaml in turn; the whole-model numbers are:

Module                             Parameters    GFLOPs
SPP                                 7,225,885      16.5
SPPF                                7,235,389      16.5
SimSPPF                             7,235,389      16.5
ASPP                               15,485,725      23.1
BasicRFB                            7,895,421      17.1
SPPCSPC                            13,663,549      21.7
SPPFCSPC                           13,663,549      21.7
SPPCSPC_group (grouped SPPCSPC)     8,355,133      17.4
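The numbers above are whole-model figures obtained by building the edited yolov5s.yaml. To profile a single module in isolation, something like the following works (it assumes the thop package is installed via pip install thop; yolov5 itself multiplies thop's multiply-accumulate count by 2 when it reports GFLOPs):

import torch
from thop import profile

m = SPPF(512, 512)                      # swap in any module from section 1
x = torch.randn(1, 512, 20, 20)
macs, params = profile(m, inputs=(x,), verbose=False)
print(f"params: {params:,.0f}, GFLOPs: {macs * 2 / 1e9:.2f}")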

3 How to Add the Modules

Step 1: add the code of the chosen module to common.py.
Step 2: register the class name in yolo.py (a sketch of this edit follows the steps).
Step 3: modify the model configuration file.
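For step 2, a sketch of the usual edit, assuming the parse_model() function in yolov5 v6.x's models/yolo.py (the exact list of built-in modules varies between releases); the new class names are simply appended to the list that infers input/output channels:

# inside parse_model() in models/yolo.py -- fragment, not a standalone script
if m in [Conv, GhostConv, Bottleneck, GhostBottleneck, SPP, SPPF, DWConv, MixConv2d, Focus,
         CrossConv, BottleneckCSP, C3, C3TR, C3SPP, C3Ghost,
         SimSPPF, ASPP, BasicRFB, SPPCSPC, SPPFCSPC]:  # <-- new modules appended here
    c1, c2 = ch[f], args[0]
    if c2 != no:  # if not output
        c2 = make_divisible(c2 * gw, 8)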
The yolov5 configuration file then looks like this:

# YOLOv5 🚀 by Ultralytics, GPL-3.0 license
# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [64, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 6, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 3, C3, [1024]],
   # layer 9: keep exactly one of the following pooling modules
   [-1, 1, SPPF, [1024, 5]],  # 9
   # [-1, 1, SPP, [1024]],
   # [-1, 1, SimSPPF, [1024, 5]],
   # [-1, 1, ASPP, [1024]],
   # [-1, 1, BasicRFB, [1024]],
   # [-1, 1, SPPCSPC, [1024]],
   # [-1, 1, SPPFCSPC, [1024, 5]],
  ]

4 Issue

Q: Why use SPPCSPC instead of SPPFCSPC? yolov5's SPPF is much faster than SPP, so why not replace SPPCSPC with SPPFCSPC?

A: Max pooling uses very little computation; if you program it well, the parallel version (SPPCSPC) can run its three max-pool layers in parallel, while the sequential version (SPPFCSPC) must process its three max-pool layers one after another. By the way, you could replace SPPCSPC with SPPFCSPC at inference time if your hardware is friendly to SPPFCSPC.

Feel free to try it yourself; a rough timing sketch follows.
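A rough latency sketch for that experiment (my own code; it assumes the SPPCSPC and SPPFCSPC classes above plus yolov5's Conv are in scope; tensor size and iteration counts are arbitrary, and results depend heavily on hardware and on the inference backend):

import time
import torch

x = torch.randn(1, 256, 20, 20)
for cls, kw in [(SPPCSPC, dict(k=(5, 9, 13))), (SPPFCSPC, dict(k=5))]:
    m = cls(256, 256, **kw).eval()
    with torch.no_grad():
        for _ in range(10):   # warm-up
            m(x)
        t0 = time.time()
        for _ in range(100):
            m(x)
    print(f"{cls.__name__}: {(time.time() - t0) * 10:.2f} ms / forward")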







