Skip to content

Conversation

cyrilbouvier
Copy link
Contributor

As noted in issue #37642, enumerating the neighbors of a vertex for a graph that use the sparse backend (the default in Sage) is not linear on the number of neighbors (there is an extra log factor).
The goal of this PR is to fix this.

The problem is that to enumerate the neighbors of a vertex, the current code calls iteratively _next_neighbor_unsafe (or one of its out or in variant). But this method return an int corresponding to a neighbor, so to go to the next neighbor, it was first necessary to retrieve this vertex* in the structure storing all the neighbors. As it is stored in a sorted binary tree, the cost of retrieving is O(log(#neighbors)).

*(or a close one if it was deleted in the meantime)

To obtain a linear complexity, I wrote a new method that do a simple tree traversal which is linear in the number of nodes in the tree (i.e., the number of neighbors). As this new method does not have a similar interface as the previous _next_neighbor_unsafe, some more rewriting was necessary to use it in other parts of the code.

  1. I wrote new methods for the SparseGraph class: out_neighbors_unsafe and in_neighbors_unsafe. They overwrite the ones from the base class and rely on the new methods _neighbors_unsafe and _neighbors_BTNode_unsafe.
    The _neighbors_BTNode_unsafe is a low-level method enumerating the neighbors in linear time.

  2. I rewrote the two methods out_neighbors_BTNode_unsafe and in_neighbors_BTNode_unsafe to use the new _neighbors_BTNode_unsafe.

  3. I wrote a new method _iterator_edges method for SparseGraphBackend to overwrite the one from the base class
    CGraphBackend, in order to expose the low-level code to the Graph class.

Question:
During the writing of this PR, I notice that the methods out_neighbors_BTNode_unsafe and in_neighbors_BTNode_unsafe are only defined for the sparse backend and are not used anywhere in the code.
Can I remove them ? Or should they be kept for compatibility ?

📝 Checklist

  • The title is concise and informative.
  • The description explains in detail what this PR is about.
  • I have linked a relevant issue or discussion.
  • [] I have created tests covering the changes.
  • [] I have updated the documentation and checked the documentation preview.

⌛ Dependencies

@cyrilbouvier
Copy link
Contributor Author

This PR is necessary to fix #37642 but is not sufficient: PR #37662 is also necessary.

Copy link
Contributor

@dcoudert dcoudert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question:
During the writing of this PR, I notice that the methods out_neighbors_BTNode_unsafe and in_neighbors_BTNode_unsafe are only defined for the sparse backend and are not used anywhere in the code.
Can I remove them ? Or should they be kept for compatibility ?

These are old functions that are apparently no longer used. I think you can remove them.

"""
if modus == 0:
return False
if modus == 1 or modus == 2:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could reorder the tests to check if modus == 3 and otherwise return True

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does 51933f0 correspond to what you had in mind ?

@cyrilbouvier
Copy link
Contributor Author

These are old functions that are apparently no longer used. I think you can remove them.

done in f04678c

@dcoudert
Copy link
Contributor

These are old functions that are apparently no longer used. I think you can remove them.

done in f04678c

yes.

@dcoudert
Copy link
Contributor

something goes wrong with the CI. Before, this PR was inducing a segfault...

@cyrilbouvier
Copy link
Contributor Author

something goes wrong with the CI. Before, this PR was inducing a segfault...

I fixed it: I wrote two line of code to compute the max degree of the graph but I used num_verts instead of the size of active_vertices (+ test if the bit is active in the bitset) to iterate over the vertices of the graph. It created out of bound memory access in the rest of the code. My bad. (see e5f4299)

There is still some failing tests with labels. I am working on it.

Copy link

github-actions bot commented Jun 10, 2024

Documentation preview for this PR (built with commit b7e644e; changes) is ready! 🎉
This preview will update shortly after each push to this PR.

Copy link
Contributor

@dcoudert dcoudert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@dcoudert
Copy link
Contributor

This is a nice improvement. Thank you.

vbraun pushed a commit to vbraun/sage that referenced this pull request Jun 16, 2024
sagemathgh-38167: New methods for sparse graphs backend to enumerate neighbors in linear time
    
<!-- ^ Please provide a concise and informative title. -->
<!-- ^ Don't put issue numbers in the title, do this in the PR
description below. -->
<!-- ^ For example, instead of "Fixes sagemath#12345" use "Introduce new method
to calculate 1 + 2". -->
<!-- v Describe your changes below in detail. -->
<!-- v Why is this change required? What problem does it solve? -->
<!-- v If this PR resolves an open issue, please link to it here. For
example, "Fixes sagemath#12345". -->

As noted in issue sagemath#37642, enumerating the neighbors of a vertex for a
graph that use the sparse backend (the default in Sage) is **not**
linear on the number of neighbors (there is an extra log factor).
The goal of this PR is to fix this.

The problem is that to enumerate the neighbors of a vertex, the current
code calls iteratively _next_neighbor_unsafe (or one of its out or in
variant). But this method return an int corresponding to a neighbor, so
to go to the next neighbor, it was first necessary to retrieve this
vertex* in the structure storing all the neighbors. As it is stored in a
sorted binary tree, the cost of retrieving is O(log(#neighbors)).

*(or a close one if it was deleted in the meantime)

To obtain a linear complexity, I wrote a new method that do a simple
tree traversal which is linear in the number of nodes in the tree (i.e.,
the number of neighbors). As this new method does not have a similar
interface as the previous _next_neighbor_unsafe, some more rewriting was
necessary to use it in other parts of the code.

1. I wrote new methods for the SparseGraph class: out_neighbors_unsafe
and in_neighbors_unsafe. They overwrite the ones from the base class and
rely on the new methods _neighbors_unsafe and _neighbors_BTNode_unsafe.
The _neighbors_BTNode_unsafe is a low-level method enumerating the
neighbors in linear time.

2. I rewrote the two methods out_neighbors_BTNode_unsafe and
in_neighbors_BTNode_unsafe to use the new _neighbors_BTNode_unsafe.

3. I wrote a new method _iterator_edges method for SparseGraphBackend to
overwrite the one from the base class
  CGraphBackend, in order to expose the low-level code to the Graph
class.



Question:
During the writing of this PR, I notice that the methods
out_neighbors_BTNode_unsafe and in_neighbors_BTNode_unsafe are only
defined for the sparse backend and are not used anywhere in the code.
Can I remove them ? Or should they be kept for compatibility ?

### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [] I have created tests covering the changes.
- [] I have updated the documentation and checked the documentation
preview.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->
    
URL: sagemath#38167
Reported by: cyrilbouvier
Reviewer(s): cyrilbouvier, David Coudert
vbraun pushed a commit to vbraun/sage that referenced this pull request Jun 16, 2024
sagemathgh-38167: New methods for sparse graphs backend to enumerate neighbors in linear time
    
<!-- ^ Please provide a concise and informative title. -->
<!-- ^ Don't put issue numbers in the title, do this in the PR
description below. -->
<!-- ^ For example, instead of "Fixes sagemath#12345" use "Introduce new method
to calculate 1 + 2". -->
<!-- v Describe your changes below in detail. -->
<!-- v Why is this change required? What problem does it solve? -->
<!-- v If this PR resolves an open issue, please link to it here. For
example, "Fixes sagemath#12345". -->

As noted in issue sagemath#37642, enumerating the neighbors of a vertex for a
graph that use the sparse backend (the default in Sage) is **not**
linear on the number of neighbors (there is an extra log factor).
The goal of this PR is to fix this.

The problem is that to enumerate the neighbors of a vertex, the current
code calls iteratively _next_neighbor_unsafe (or one of its out or in
variant). But this method return an int corresponding to a neighbor, so
to go to the next neighbor, it was first necessary to retrieve this
vertex* in the structure storing all the neighbors. As it is stored in a
sorted binary tree, the cost of retrieving is O(log(#neighbors)).

*(or a close one if it was deleted in the meantime)

To obtain a linear complexity, I wrote a new method that do a simple
tree traversal which is linear in the number of nodes in the tree (i.e.,
the number of neighbors). As this new method does not have a similar
interface as the previous _next_neighbor_unsafe, some more rewriting was
necessary to use it in other parts of the code.

1. I wrote new methods for the SparseGraph class: out_neighbors_unsafe
and in_neighbors_unsafe. They overwrite the ones from the base class and
rely on the new methods _neighbors_unsafe and _neighbors_BTNode_unsafe.
The _neighbors_BTNode_unsafe is a low-level method enumerating the
neighbors in linear time.

2. I rewrote the two methods out_neighbors_BTNode_unsafe and
in_neighbors_BTNode_unsafe to use the new _neighbors_BTNode_unsafe.

3. I wrote a new method _iterator_edges method for SparseGraphBackend to
overwrite the one from the base class
  CGraphBackend, in order to expose the low-level code to the Graph
class.



Question:
During the writing of this PR, I notice that the methods
out_neighbors_BTNode_unsafe and in_neighbors_BTNode_unsafe are only
defined for the sparse backend and are not used anywhere in the code.
Can I remove them ? Or should they be kept for compatibility ?

### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [] I have created tests covering the changes.
- [] I have updated the documentation and checked the documentation
preview.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on. For example,
-->
<!-- - sagemath#12345: short description why this is a dependency -->
<!-- - sagemath#34567: ... -->
    
URL: sagemath#38167
Reported by: cyrilbouvier
Reviewer(s): cyrilbouvier, David Coudert
@vbraun vbraun merged commit cee5890 into sagemath:develop Jun 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants