
Hetero neighbor sampler multithreading #215

Merged: 4 commits merged into pyg-team:master from hetero-sampler-mt on May 4, 2023

Conversation

kgajdamo (Contributor)

The main roadblocks in the context of sampler parallelization are the mapper and the sampler. However, in the case of hetero sampling, there is a separate mapper for each dst node type and a separate sampler for each edge type. Therefore, to avoid race conditions we can parallelize per dst node type, i.e. each thread is assigned the edge types that share one unique dst node type and works only on those; a minimal sketch of this scheme is shown below.
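The sketch below only illustrates this scheme; it is not the actual pyg-lib kernel code, and the helpers `edge_types_by_dst` and `sample_one_edge_type` are hypothetical. The key property is that all edge types sharing the same dst node type run on the same thread, so the mapper for that dst type and the samplers for those edge types are never written to concurrently.

```cpp
// Minimal sketch of per-dst-node-type parallelization (NOT the actual
// pyg-lib kernel; `edge_types_by_dst` and `sample_one_edge_type` are
// hypothetical helpers used only for illustration).
#include <ATen/Parallel.h>

#include <string>
#include <unordered_map>
#include <vector>

using EdgeType = std::string;  // e.g. "author__writes__paper"

void sample_one_edge_type(const EdgeType& edge_type) {
  // Placeholder for neighbor sampling along a single edge type.
}

void sample_hetero(
    const std::vector<std::string>& dst_node_types,
    const std::unordered_map<std::string, std::vector<EdgeType>>&
        edge_types_by_dst) {
  // One unit of work per dst node type: the mapper of that dst type and the
  // samplers of all edge types ending in it are touched by a single thread,
  // so no locking is required.
  at::parallel_for(0, static_cast<int64_t>(dst_node_types.size()),
                   /*grain_size=*/1, [&](int64_t begin, int64_t end) {
    for (int64_t i = begin; i < end; ++i) {
      const auto& dst = dst_node_types[i];
      for (const auto& edge_type : edge_types_by_dst.at(dst))
        sample_one_edge_type(edge_type);
    }
  });
}
```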

Example sampling times obtained using hetero_neighbor.py benchmark:

[benchmark chart: hetero-sampler-icx]

Example end-to-end times obtained using inference_benchmark.py:

[benchmark chart: e2e-icx]

codecov-commenter commented Mar 16, 2023

Codecov Report

Merging #215 (819af72) into master (72d2647) will not change coverage.
The diff coverage is n/a.

❗ Current head 819af72 differs from pull request most recent head b6f7ae9. Consider uploading reports for the commit b6f7ae9 to get more accurate results

@@           Coverage Diff           @@
##           master     #215   +/-   ##
=======================================
  Coverage   83.49%   83.49%           
=======================================
  Files          26       26           
  Lines         848      848           
=======================================
  Hits          708      708           
  Misses        140      140           
Impacted Files                                 Coverage Δ
pyg_lib/csrc/sampler/cpu/neighbor_kernel.cpp   88.72% <ø> (ø)


kgajdamo force-pushed the hetero-sampler-mt branch from 78b60f8 to 8dd411b on March 18, 2023, 12:05
mingfeima

The speedup does not seem to be very large; do we have any analysis of what the bottleneck is?

kgajdamo (Contributor, Author) commented Apr 2, 2023

> The speedup does not seem to be very large; do we have any analysis of what the bottleneck is?

I think the reason is that the number of threads is limited to the number of distinct dst node types: there can only be as many threads as there are dst node types. For the ogbn-mag dataset, this is 4.
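As a hedged illustration of that bound (not code from this PR), the usable degree of parallelism under the per-dst-node-type scheme is simply the number of distinct dst node types among the sampled edge types, independent of how many edge types or CPU cores there are:

```cpp
// Hedged illustration, not part of pyg-lib: the parallelism bound equals the
// number of distinct dst node types among the edge types being sampled.
#include <cstdint>
#include <set>
#include <string>
#include <tuple>
#include <vector>

// (src, relation, dst), the usual way PyG spells an edge type.
using EdgeType = std::tuple<std::string, std::string, std::string>;

int64_t max_sampler_threads(const std::vector<EdgeType>& edge_types) {
  std::set<std::string> dst_types;
  for (const auto& et : edge_types)
    dst_types.insert(std::get<2>(et));  // collect distinct dst node types
  return static_cast<int64_t>(dst_types.size());
}
```

ogbn-mag has the node types paper, author, institution, and field_of_study; once reverse edge types are added (an assumption about the benchmark setup), all four appear as dst types, which matches the limit of 4 threads mentioned above.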

kgajdamo force-pushed the hetero-sampler-mt branch from 8dd411b to 930dc1a on April 7, 2023, 10:37
mingfeima

> The speedup does not seem to be very large; do we have any analysis of what the bottleneck is?
>
> I think the reason is that the number of threads is limited to the number of distinct dst node types: there can only be as many threads as there are dst node types. For the ogbn-mag dataset, this is 4.

The code generally LGTM, but I am still curious about the limited speedup. You could look into two things:

  • Can we switch to another dataset with more dst node types?
  • Check the VTune log for the ogbn-mag dataset.

kgajdamo (Contributor, Author)

> The speedup does not seem to be very large; do we have any analysis of what the bottleneck is?
>
> I think the reason is that the number of threads is limited to the number of distinct dst node types: there can only be as many threads as there are dst node types. For the ogbn-mag dataset, this is 4.
>
> The code generally LGTM, but I am still curious about the limited speedup. You could look into two things:
>
>   • Can we switch to another dataset with more dst node types?
>   • Check the VTune log for the ogbn-mag dataset.

Sure, I'll check that in VTune.

rusty1s (Member) left a comment

Looks good. A few minor comments.

}
at::parallel_for(0, node_types.size(), 1, [&](size_t _s, size_t _e) {
rusty1s (Member)

Does this bring any gain? We are not doing heavy work here, so the overhead of threading might not be worth it?

kgajdamo (Contributor, Author)

To check this, I made two measurements. The first, in red, is with the current code; the second, in green, is after removing at::parallel_for (i.e., what was there before). The results show that the parallelization at this point gives a small but still noticeable speedup (I took the measurements on a different machine than the one mentioned in the PR description, which is why the results differ):
[chart: compare — sampling times with vs. without at::parallel_for]

kgajdamo (Contributor, Author)

But just to be sure, I'll take 10 measurements and average the results.

kgajdamo (Contributor, Author)

The average of 10 measurements is as follows:
[screenshot: averaged timings over 10 runs, taken May 3, 2023]

Looks like the results are almost the same. Should I remove it then? WDYT?

rusty1s (Member)

I would be in favor of removing it, as it doesn't look that useful.
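For reference, the removal discussed here would amount to replacing the `at::parallel_for` call quoted above with a plain sequential loop, roughly as in the sketch below (assuming the body only does light per-node-type setup; `init_output_for` is a hypothetical stand-in for that work, not the actual pyg-lib code):

```cpp
// Sketch of the serial alternative: for cheap per-node-type bookkeeping,
// a plain loop avoids the thread-dispatch overhead of at::parallel_for.
for (size_t i = 0; i < node_types.size(); ++i) {
  const auto& node_type = node_types[i];
  init_output_for(node_type);  // hypothetical light-weight setup work
}
```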

kgajdamo (Contributor, Author)

Sure

kgajdamo force-pushed the hetero-sampler-mt branch from 930dc1a to 6449aca on May 3, 2023, 14:33
rusty1s enabled auto-merge (squash) on May 4, 2023, 08:55
rusty1s merged commit f1b10bb into pyg-team:master on May 4, 2023
OlhaBabicheva added two commits to OlhaBabicheva/pyg-lib that referenced this pull request on May 5, 2023