Hi,

I had a look at the `Focus` layer and it seems to me that it is equivalent to a single 2D convolutional layer, without the need for the space-to-depth operation. For example, a `Focus` layer with kernel size 3 can be expressed as a `Conv` layer with kernel size 6 and stride 2. I wrote some code to verify this (a fully self-contained version without the YOLOv5 imports is included at the end):
```python
import torch
from models.common import Focus, Conv
from utils.torch_utils import profile

focus = Focus(3, 64, k=3).eval()
conv = Conv(3, 64, k=6, s=2, p=2).eval()

# Express focus layer as conv layer
conv.bn = focus.conv.bn
conv.conv.weight.data[:, :, ::2, ::2] = focus.conv.conv.weight.data[:, :3]
conv.conv.weight.data[:, :, 1::2, ::2] = focus.conv.conv.weight.data[:, 3:6]
conv.conv.weight.data[:, :, ::2, 1::2] = focus.conv.conv.weight.data[:, 6:9]
conv.conv.weight.data[:, :, 1::2, 1::2] = focus.conv.conv.weight.data[:, 9:12]

# Compare
x = torch.randn(16, 3, 640, 640)
with torch.no_grad():
    # Results are not perfectly identical, errors up to about 1e-7 occur (probably numerical)
    assert torch.allclose(focus(x), conv(x), atol=1e-6)

# Profile
results = profile(input=torch.randn(16, 3, 640, 640), ops=[focus, conv, focus, conv], n=10, device=0)
```
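For context on the channel indices above: the slice-to-channel mapping follows the concatenation order in the `Focus` forward pass, which, as far as I can tell, looks roughly like this (paraphrased from `models/common.py`; please check the exact source for the version you are using):

```python
# Paraphrased sketch of Focus.forward from models/common.py (verify against your checkout).
# The four pixel groups are concatenated in this order before the k=3 conv,
# which is why weight channels 0:3, 3:6, 6:9, 9:12 map to the four kernel offsets above.
def focus_forward(self, x):  # x: (b, c, h, w) -> conv applied on (b, 4c, h/2, w/2)
    return self.conv(torch.cat([x[..., ::2, ::2],     # even rows, even cols -> channels 0:3
                                x[..., 1::2, ::2],    # odd rows,  even cols -> channels 3:6
                                x[..., ::2, 1::2],    # even rows, odd cols  -> channels 6:9
                                x[..., 1::2, 1::2]],  # odd rows,  odd cols  -> channels 9:12
                               1))
```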
The output of the profile call is as follows:
```
YOLOv5 🚀 v5.0-434-g0dc725e torch 1.9.0+cu111 CUDA:0 (A100-SXM4-40GB, 40536.1875MB)

     Params      GFLOPs  GPU_mem (GB)  forward (ms)  backward (ms)              input              output
       7040       23.07         2.682         4.055          13.78  (16, 3, 640, 640)  (16, 64, 320, 320)
       7040       23.07         2.368         3.474          9.989  (16, 3, 640, 640)  (16, 64, 320, 320)
       7040       23.07         2.343         3.556          11.57  (16, 3, 640, 640)  (16, 64, 320, 320)
       7040       23.07         2.368         3.456          9.961  (16, 3, 640, 640)  (16, 64, 320, 320)
```
I did have to slightly loosen the tolerance in `torch.allclose` for the assertion to succeed, but looking at the errors they seem to be purely numerical. So am I missing something, or could the `Focus` layer simply be replaced by a `Conv` layer, which would lead to a slight increase in speed?
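For anyone who wants to reproduce the equivalence without the YOLOv5 repo, here is a minimal self-contained sketch using plain `nn.Conv2d` (no BatchNorm or activation). The names `conv_small`/`conv_big` and the tensor sizes are arbitrary illustration choices, and the slice order assumes the `Focus` concatenation shown above:

```python
# Minimal, self-contained sketch of the Focus -> Conv2d(k=6, s=2) equivalence.
# No BatchNorm/activation and no YOLOv5 imports; names (conv_small, conv_big)
# and tensor sizes are arbitrary choices for illustration.
import torch
import torch.nn as nn

torch.manual_seed(0)
c1, c2, k = 3, 64, 3

# 3x3 conv applied after space-to-depth (what sits inside Focus, minus BN/activation)
conv_small = nn.Conv2d(4 * c1, c2, k, stride=1, padding=k // 2, bias=False)
# Equivalent big conv applied directly to the full-resolution image
conv_big = nn.Conv2d(c1, c2, 2 * k, stride=2, padding=2 * (k // 2), bias=False)

# Scatter each pixel group's 3x3 weights into the matching offsets of the 6x6 kernel
w = conv_small.weight.data  # shape (c2, 4*c1, k, k)
conv_big.weight.data[:, :, ::2, ::2] = w[:, 0 * c1:1 * c1]    # group x[..., ::2, ::2]
conv_big.weight.data[:, :, 1::2, ::2] = w[:, 1 * c1:2 * c1]   # group x[..., 1::2, ::2]
conv_big.weight.data[:, :, ::2, 1::2] = w[:, 2 * c1:3 * c1]   # group x[..., ::2, 1::2]
conv_big.weight.data[:, :, 1::2, 1::2] = w[:, 3 * c1:4 * c1]  # group x[..., 1::2, 1::2]

x = torch.randn(2, c1, 64, 64)
with torch.no_grad():
    # Space-to-depth followed by the small conv (what Focus does)
    y_focus = conv_small(torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2],
                                    x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1))
    # Single strided conv on the full-resolution input
    y_conv = conv_big(x)

print((y_focus - y_conv).abs().max())  # tiny difference, floating-point summation order only
```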