Noisy layer #2103
torch/nn/modules/linear.py
Outdated
@@ -12,16 +13,14 @@ class Linear(Module):
     Args:
         in_features: size of each input sample
         out_features: size of each output sample
-        bias: If set to False, the layer will not learn an additive bias.
-            Default: True
+        bias: If set to False, the layer will not learn an additive bias. Default: True
torch/nn/modules/linear.py
Outdated
factorised: whether or not to use factorised noise.
    Default: True
std_init: initialization constant for standard deviation component of
    weights. If None, defaults to 0.017 for independent and 0.4 for
torch/nn/modules/linear.py
Outdated
    self.std_init = 0.017
else:
    self.std_init = std_init
self.reset_parameters(bias)
torch/nn/modules/linear.py
Outdated
    self.bias_sigma.data.fill_(self.std_init)

def scale_noise(self, size):
    x = torch.Tensor(size).normal_()
torch/nn/modules/linear.py
Outdated
    self.weight_epsilon = Variable(epsilon_out.ger(epsilon_in))
    self.bias_epsilon = Variable(self.scale_noise(self.out_features))
else:
    self.weight_epsilon = Variable(torch.Tensor((self.out_features, self.in_features)).normal_())
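For readers following the hunk above, here is a minimal sketch of the two noise schemes it switches between, written against current PyTorch rather than the `Variable` API used in the PR. The function names are hypothetical, and the scaling function f(x) = sign(x) * sqrt(|x|) is an assumption about what `scale_noise` computes, since its body is not fully visible here. As far as I know, `torch.Tensor((a, b))` in recent releases treats the tuple as data rather than as a shape, so the sketch uses `torch.randn` with explicit sizes instead.

```python
import torch


def scale_noise(size):
    # Assumed scaling: f(x) = sign(x) * sqrt(|x|) applied to standard normal samples.
    x = torch.randn(size)
    return x.sign() * x.abs().sqrt()


def factorised_noise(in_features, out_features):
    # One noise vector per input unit and one per output unit;
    # the weight noise is their outer product, shape (out_features, in_features).
    eps_in = scale_noise(in_features)
    eps_out = scale_noise(out_features)
    weight_eps = torch.outer(eps_out, eps_in)
    # Mirrors the hunk above, which draws a fresh scaled sample for the bias.
    bias_eps = scale_noise(out_features)
    return weight_eps, bias_eps


def independent_noise(in_features, out_features):
    # One independent standard normal sample per weight and per bias entry.
    weight_eps = torch.randn(out_features, in_features)
    bias_eps = torch.randn(out_features)
    return weight_eps, bias_eps
```

Factorised noise needs only `in_features + out_features` random draws for the weights, whereas independent noise needs `in_features * out_features`.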
torch/nn/modules/linear.py
Outdated
class NoisyLinear(Module):
    """Applies a noisy linear transformation to the incoming data:
    :math:`y = (mu_w + sigma_w \cdot epsilon_w)x
    + mu_b + sigma_b \cdot epsilon_b`
torch/nn/modules/linear.py
Outdated
def __repr__(self):
    return self.__class__.__name__ + ' (' \
        + str(self.in_features) + ' -> ' \
        + str(self.out_features) + ')'
torch/nn/modules/linear.py
Outdated
weight: the learnable weights of the module of shape
    (out_features x in_features)
bias: the learnable bias of the module of shape (out_features)
Examples::
torch/nn/modules/linear.py
Outdated
    self.bias_mu.data.uniform_(-mu_range, mu_range)
    self.bias_sigma.data.fill_(self.std_init)

def scale_noise(self, size):
@soumith is the method documentation at lines 145/6 in a satisfactory format? Can't find a module with a similar example.
Everything looks good to me now 👍 I'll leave Soumith to help with that bit of documentation and how to sort out testing.
@jvmancuso I think you just need to add some test cases to the list here: https://github.com/pytorch/pytorch/blob/master/test/test_nn.py#L3114 You can follow the style of Linear here: https://github.com/pytorch/pytorch/blob/master/test/common_nn.py#L28
@jvmancuso as for the docstring, you should prefix it with r:
r"""
this is my docstring, which is now treated as a raw string.
This means escape characters like \ won't be converted and will instead
be treated as a literal \
"""
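To make the raw-string advice concrete, here is a small standalone illustration. The two function names are hypothetical, and `\theta` is used instead of the PR's `\cdot` because `\t` is a recognised escape (a tab), which makes the difference visible. Unrecognised escapes such as `\c` are currently left in place by Python but trigger a warning in recent versions, which is another reason to use the `r` prefix for docstrings that contain LaTeX.

```python
def plain_doc():
    """:math:`y = \theta x`"""


def raw_doc():
    r""":math:`y = \theta x`"""


# In the plain docstring the "\t" of "\theta" has been converted to a tab character;
# in the raw docstring the backslash is kept literally, as Sphinx math expects.
plain_has_tab = "\t" in plain_doc.__doc__
raw_keeps_backslash = "\\theta" in raw_doc.__doc__
```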
torch/nn/modules/linear.py
Outdated
class NoisyLinear(Module):
    """Applies a noisy linear transformation to the incoming data.
    During training:
    :math:`y = (mu_w + sigma_w \cdot epsilon_w)x
torch/nn/modules/linear.py
Outdated
>>> print(output)
>>> print(output_new)
"""
def __init__(self, in_features, out_features, bias=True, factorised=True, std_init=None):
@alykhantejani pretty sure I fixed the math line in the docs at L123. I'm not sure how a test case like Linear's will work in this instance, since the output of the layer will be different for each forward pass by definition. Specifically, there is no stable
@jvmancuso You don't have to add the
@alykhantejani Apart from the dict tests I think 2 custom tests would be good:
It is pretty critical that the module shows these behaviours beyond simply passing the automatic differentiation checks.
@Kaixhin agreed, these can just be added as regular test functions in
Dug into these most recent failures. Question: what do I do about test_noncontig? It appears to be checking that using deepcopy to copy the layer doesn't change the gradient of the parameters. I think deepcopy might be resampling the noise somewhere, which would definitely trigger assertEqual to fail on parameter grads if that's the case. Can somebody else please take a look at this? Also, no idea why test_Conv2d_backward_twice is failing. Nothing I've done changes the Conv2d module or that test case, and I don't see how any changes I've made would cause it to fail.
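One way to probe the deepcopy hypothesis in isolation is sketched below. `TinyNoisy` is a hypothetical stand-in written for this check, not the PR's `NoisyLinear`, and it assumes the noise is stored as a registered buffer and only resampled by an explicit `reset_noise` call. Under those assumptions a deep copy carries the same noise, so outputs match until one side is resampled. It says nothing about why `test_noncontig` fails in the PR itself.

```python
import copy

import torch
import torch.nn as nn


class TinyNoisy(nn.Module):
    """Hypothetical minimal noisy layer, used only for this check."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.mu = nn.Parameter(torch.randn(out_features, in_features))
        self.sigma = nn.Parameter(torch.full((out_features, in_features), 0.017))
        # Noise is stored on the module and changes only in reset_noise().
        self.register_buffer("epsilon", torch.zeros(out_features, in_features))
        self.reset_noise()

    def reset_noise(self):
        self.epsilon.normal_()

    def forward(self, x):
        return x @ (self.mu + self.sigma * self.epsilon).t()


layer = TinyNoisy(4, 3)
clone = copy.deepcopy(layer)
x = torch.randn(2, 4)

# deepcopy copies the buffer values, so both modules see the same noise.
same_noise = torch.equal(layer.epsilon, clone.epsilon)
same_output = torch.equal(layer(x), clone(x))

# Resampling one module leaves the copy untouched.
layer.reset_noise()
noise_diverged = not torch.equal(layer.epsilon, clone.epsilon)
```

If the real layer resamples inside `forward` or inside a copy hook, the first two checks would fail, which would point to the behaviour suspected above.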
@jvmancuso sorry I've been away on vacation. Will try to take a look at this this week.
@@ -2344,62 +2274,6 @@ def test_bce_with_logits_gives_same_result_as_sigmoid_and_bce_loss(self):
    weight = torch.rand(4)
    self.assertEqual(nn.BCEWithLogitsLoss(weight)(output, target), nn.BCELoss(weight)(sigmoid(output), target))

    target = Variable(torch.FloatTensor(4, 1).fill_(0))
@jvmancuso The issue with I'm not quite sure why
@jvmancuso in terms of the other test failing, try to pull in changes from upstream/master, as you currently have merge conflicts anyway.
Hi all, it's been a while, and I haven't forgotten about this. I'm doing an analysis of DeepMind's Noisy Nets versus OpenAI's Parameter Space Noise, and wanted to revisit this when that's done. I'll be implementing the OpenAI version shortly, and wanted to investigate including that in this PR. I'll close this for now and reopen when I've thought through that a bit more.
@jvmancuso for future reference, this doesn't work with CUDA. I think the best solution is to register
@Kaixhin I had made a few changes to the code to accommodate that but hadn't committed them. Your solution is more elegant though, I'll integrate it into my work. Thanks!
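The CUDA suggestion above is cut off in this capture, so the following is only a guess at the kind of fix under discussion, not the reviewer's actual proposal. It illustrates one common way to keep non-learnable tensors on the same device as a module: `register_buffer`. Both class names are hypothetical. A dtype conversion is used instead of `.cuda()` so the snippet runs on CPU-only machines; both go through the same module-wide conversion, which skips plain tensor attributes.

```python
import torch
import torch.nn as nn


class WithAttribute(nn.Module):
    def __init__(self):
        super().__init__()
        # Plain tensor attribute: not tracked, so .to()/.cuda()/.double() ignore it.
        self.epsilon = torch.randn(3)


class WithBuffer(nn.Module):
    def __init__(self):
        super().__init__()
        # Registered buffer: converted and moved together with the parameters.
        self.register_buffer("epsilon", torch.randn(3))


attr_dtype = WithAttribute().double().epsilon.dtype  # stays at the default float dtype
buf_dtype = WithBuffer().double().epsilon.dtype      # follows the module conversion
buffer_in_state_dict = "epsilon" in WithBuffer().state_dict()
```

A side effect of this design is that buffers are saved in `state_dict()` by default, so a checkpoint would also restore the last sampled noise.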
Implementation of Noisy Networks per #2024. The gist link from that issue is now obsolete; the forward pass no longer resamples the noise tensors each time, and I've added a method reset_noise to resample the noise tensors. Also, I used self.training to differentiate between train and eval passes. I ran some basic tests to make sure the methods were functioning, but I still need to do more testing. Also, I'm not sure how to edit the docs. If someone can point me in the right direction for expectations on writing docs, I'd appreciate it.
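The design described above (noise held between forward passes, an explicit `reset_noise`, and `self.training` selecting the noisy or noise-free path) can be sketched as follows. This is an illustrative reconstruction against current PyTorch, not the code from this PR. Several details are assumptions: the class name, the `mu_range` value, filling both sigma tensors with `std_init`, reading 0.4 as the factorised default (the docstring is truncated at that point), using only the `mu` parameters in eval mode, and storing noise in buffers. The `bias` flag from the PR's signature is omitted for brevity.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class NoisyLinearSketch(nn.Module):
    """Illustrative noisy linear layer; names and defaults are assumptions."""

    def __init__(self, in_features, out_features, factorised=True, std_init=None):
        super().__init__()
        self.in_features = in_features
        self.out_features = out_features
        self.factorised = factorised
        if std_init is None:
            # Assumed mapping of the defaults quoted in the PR docstring.
            std_init = 0.4 if factorised else 0.017
        self.std_init = std_init
        self.weight_mu = nn.Parameter(torch.empty(out_features, in_features))
        self.weight_sigma = nn.Parameter(torch.empty(out_features, in_features))
        self.bias_mu = nn.Parameter(torch.empty(out_features))
        self.bias_sigma = nn.Parameter(torch.empty(out_features))
        # Noise lives in buffers so it moves with the module (assumption).
        self.register_buffer("weight_epsilon", torch.zeros(out_features, in_features))
        self.register_buffer("bias_epsilon", torch.zeros(out_features))
        self.reset_parameters()
        self.reset_noise()

    def reset_parameters(self):
        mu_range = 1.0 / math.sqrt(self.in_features)  # assumed initialisation range
        with torch.no_grad():
            self.weight_mu.uniform_(-mu_range, mu_range)
            self.bias_mu.uniform_(-mu_range, mu_range)
            self.weight_sigma.fill_(self.std_init)
            self.bias_sigma.fill_(self.std_init)

    @staticmethod
    def _scale_noise(size):
        # Assumed scaling: f(x) = sign(x) * sqrt(|x|).
        x = torch.randn(size)
        return x.sign() * x.abs().sqrt()

    def reset_noise(self):
        # Resample the stored noise; forward() never does this on its own.
        if self.factorised:
            eps_in = self._scale_noise(self.in_features)
            eps_out = self._scale_noise(self.out_features)
            self.weight_epsilon.copy_(torch.outer(eps_out, eps_in))
            self.bias_epsilon.copy_(self._scale_noise(self.out_features))
        else:
            self.weight_epsilon.normal_()
            self.bias_epsilon.normal_()

    def forward(self, x):
        if self.training:
            weight = self.weight_mu + self.weight_sigma * self.weight_epsilon
            bias = self.bias_mu + self.bias_sigma * self.bias_epsilon
        else:
            # Assumed eval behaviour: use the mean parameters only.
            weight, bias = self.weight_mu, self.bias_mu
        return F.linear(x, weight, bias)
```

With this layout, two forward passes in training mode give identical outputs until `reset_noise()` is called, which is what makes the layer testable with fixed expectations.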