Memory leak in C++ when running module in separate thread

## 🐛 Bug

When the forward function of a Module is called from a separate thread, some memory is allocated that is not freed when the thread exits.

## To Reproduce

Steps to reproduce the behavior:

TorchScript module created from Python as in the tutorial:
```
import torchvision
import torch

model = torchvision.models.resnet18()
example = torch.rand(1,3,224,224)
my_torchscript_module = torch.jit.trace(model, example)
torch.jit.save(my_torchscript_module, "sciptedModule.pt")
```
Loaded and run in C++ in a separate thread:

```
#include <thread>

#include "torch/script.h"
#include "torch/torch.h"

void runModel(at::Tensor, torch::jit::script::Module);

int main()
{
	torch::NoGradGuard no_guard;
	torch::jit::script::Module m_module = torch::jit::load("./sciptedModule.pt");
	m_module.eval();
	at::Tensor testTensor = torch::rand({ 1,3,224,224}, at::kFloat);
	testTensor = testTensor.div(testTensor.norm());
	// Spawn and join a fresh thread for every inference call.
	for (int i = 0; i < 10000; i++) {
		std::thread newThread(&runModel, testTensor, m_module);
		newThread.join();
	}
}

// Runs a single forward pass on the calling thread and discards the result.
void runModel(at::Tensor testTensor, torch::jit::script::Module m_module) {
	torch::NoGradGuard no_guard;
	at::Tensor out = m_module.forward({ testTensor }).toTensor().detach();
}
```
![MemoryIncrease](https://user-images.githubusercontent.com/31984573/62935760-3f065800-bdc8-11e9-9e59-f296b64240da.PNG)


## Expected behavior

Inference runs in a separate thread with no increase in memory usage.

## Environment

PyTorch version: 1.2.0
Is debug build: No
CUDA used to build PyTorch: None

OS: Microsoft Windows 10 Home
GCC version: Could not collect
CMake version: version 3.12.2

Python version: 3.6
Is CUDA available: No
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA

Versions of relevant libraries:
[pip] numpy==1.16.2
[pip] numpydoc==0.8.0
[pip] torch==1.2.0
[pip] torchvision==0.4.0
[conda] _tflow_1100_select        0.0.3                       mkl  
[conda] _tflow_select             2.3.0                       mkl  
[conda] blas                      1.0                         mkl  
[conda] cpuonly                   1.0                           0    pytorch
[conda] libmklml                  2019.0.3                      0  
[conda] mkl                       2019.1                      144  
[conda] mkl-include               2019.1                      144  
[conda] mkl-service               1.1.2            py36hb782905_5  
[conda] mkl_fft                   1.0.10           py36h14836fe_0  
[conda] mkl_random                1.0.2            py36h343c172_0  
[conda] pytorch                   1.2.0               py3.6_cpu_1  [cpuonly]  pytorch
[conda] tensorflow-base           1.10.0          mkl_py36h81393da_0  
[conda] torchvision               0.4.0                  py36_cpu  [cpuonly]  pytorch

## Additional context

When running on the main thread, the memory seems to be allocated once on the first call and then reused.
Python threading doesn't have this problem.
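
Since memory appears to be allocated once per thread and then reused, one possible workaround is to keep a single long-lived worker thread and submit every inference call to it, instead of spawning a new thread per call. The sketch below only illustrates that pattern. `Worker` and `runOnSingleWorker` are hypothetical names, not part of libtorch, and the job body is a placeholder where `m_module.forward({ testTensor })` would go. I have not verified that this avoids the growth; it assumes a long-lived worker thread behaves like the main thread does.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical helper (not a libtorch API): one long-lived thread that
// executes submitted jobs in submission order.
class Worker {
public:
	Worker() : thread_([this] { loop(); }) {}

	// Lets the thread finish all queued jobs, then joins it.
	~Worker() {
		{
			std::lock_guard<std::mutex> lock(mutex_);
			done_ = true;
		}
		cv_.notify_one();
		thread_.join();
	}

	void submit(std::function<void()> job) {
		{
			std::lock_guard<std::mutex> lock(mutex_);
			jobs_.push(std::move(job));
		}
		cv_.notify_one();
	}

private:
	void loop() {
		for (;;) {
			std::function<void()> job;
			{
				std::unique_lock<std::mutex> lock(mutex_);
				cv_.wait(lock, [this] { return done_ || !jobs_.empty(); });
				if (jobs_.empty()) return; // shutdown requested and queue drained
				job = std::move(jobs_.front());
				jobs_.pop();
			}
			job(); // run outside the lock so submit() is never blocked by a job
		}
	}

	std::mutex mutex_;
	std::condition_variable cv_;
	std::queue<std::function<void()>> jobs_;
	bool done_ = false;
	std::thread thread_; // declared last so the members above exist before it starts
};

// Runs `iterations` jobs on one worker thread and returns how many completed.
// The lambda body stands in for the real call, e.g. m_module.forward(...).
int runOnSingleWorker(int iterations) {
	int completed = 0;
	{
		Worker worker;
		for (int i = 0; i < iterations; i++) {
			worker.submit([&completed] { ++completed; });
		}
	} // Worker destructor drains the queue and joins the thread
	return completed;
}
```

With this shape, the loop in `main` would call `worker.submit(...)` 10000 times against one thread rather than constructing 10000 `std::thread` objects.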


Memory leak in C++ when running module in separate thread #24237
