Skip to content

python BFS and DFS search documentation improvements #1361

@barakatzir

Description

@barakatzir

What is the expected enhancement?

I find the documentations for the python visit functions (dfs_search etc.) confusing, and I think all users can benefit from some additions to them.
It might be that my sparse prior knowledge in graph algorithms, but I found the documentation confusing with all the colors (WHITE, GRAY, BLACK) mentioned without context to the BFSVisitor events, or the term "forward or cross edge" which I didn't know.
In the end, I read the rust implementation to understand the algorithm. The code is clearly well thought out and it would be an injustice to keep not let it shine in the documentation as well.

I suggest the following changes:

  1. In docstring, mentioning the visitor events in the pseudo-code in the docs, so it's clear which event occurs when. See suggestions below.
  2. Fix rustworkx.dfs_search docstring: mention that the graph can be a PyDiGraph as well.
  3. Changing the docstring's reported type-annotation for source from List[int] to Sequence[int] | None with default None.
  4. Changing the type annotations (in rustworkx.pyi) for the different visit functions so that visitor=None is not accepted by static type checkers. This can be achieved with overloads:
    @overload
    def digraph_bfs_search(
        graph: PyDiGraph,
        *,
        visitor: _BFSVisitor,
    ) -> None: ...
    @overload
    def digraph_bfs_search(
        graph: PyDiGraph,
        source: Sequence[int] | None,
        visitor: _BFSVisitor,
    ) -> None: ...
  5. In docstring, replace the "Graph can not be mutated while traversing." to "Trying to mutate the graph while traversing raises a RuntimeError.". This way the user can know which exception type he should catch in such cases.
  6. Documenting that raising PruneSearch in a finish_vertex method raises an exception. I checked and this behavior is documented in rustworkx-core where the function call panics in such cases.
  7. (This is the only actual suggested change to runtime behavior) I suggest not raising a PanicException if finish_vertex raises PrunceSearch. This exception isn't caught by except Exception and user cannot easily narrow it with except PanicException since PanicException is in pyo3_runtime module, which is not site-packages usually and cannot be easily imported.

I only looked at the bfs_search and dfs_search which I use, but I suspect dijkstra_search also could benefit from similar changes.
If you agree, I can open a PR with the agreed upon changes and also look into the Dijkstra case.

Regrading the pseudo-code elaborations. Here is an option: replacing the current pseudo-code with python alternative (python is already kinda pseudo-codish). I wrote python alternatives while trying to understand the rust code. Here they are after removing the control flow for the PruneSearch and StopSearch (I can reproduce the control flow if desired):

from typing import Literal
from collections import deque

def bfs_search(graph, source=None, visitor=None):
    color: dict[int, Literal['WHITE', 'GRAY', 'BLACK']] = {}
    for u in graph.node_indices():
        color[u] = 'WHITE'  # set all vertices as unvisited
    for s in source:
        _bfs_visit(graph, s, visitor, color)
        
def _bfs_visit(graph, start, visitor, color):
    if color[start] != 'WHITE':
        return
    color[start] = 'GRAY'
    visitor.discover_vertex(start)  # discover vertex start
    stack = deque((start,))
    while True:
        try:
            u = stack.pop()
        except IndexError:
            break  # break if stack is empty
        for _, v, weight in graph.incident_edge_index_map(u).values():  # examine edge (u,v)
            if color[v] == 'WHITE':
                visitor.tree_edge((u, v, weight))
                color[v] = 'GRAY'
                visitor.discover_vertex(v)
                stack.appendleft(v)
            else:
                visitor.non_tree_edge((u, v, weight))
                if color[v] == 'GRAY':
                    visitor.gray_target_edge((u, v, weight))
                else:  # color[v] == 'BLACK'
                    visitor.black_target_edge((u, v, weight))
        color[u] = 'BLACK'
        visitor.finish_vertex(u)

and

from typing import Literal

def dfs_search(graph, source=None, visitor=None):
    color: dict[int, Literal['WHITE', 'GRAY', 'BLACK']] = {}
    for u in graph.node_indices():
        color[u] = 'WHITE'  # set all vertices as unvisited
    time = -1  # time counter
    for s in source:
        time = _dfs_visit(graph, s, visitor, color, time)
        

def _dfs_visit(graph, start, visitor, color, time):
    if color[start] != 'WHITE':
        return time
    color[start] = 'GRAY'
    time += 1
    visitor.discover_vertex(start, time)  # discover vertex start
    stack = [(start, graph.incident_edge_index_map(start).values())]
    while True:
        try:
            u, edges = stack[-1]
        except IndexError:
            break  # break if stack is empty
        next_ = None
        for _, v, weight in edges:  # examine edge (u, v)
            if color[v] == 'WHITE':
                visitor.tree_edge((u, v, weight))
                color[v] = 'GRAY'
                time += 1
                visitor.discover_vertex(v, time)
                next_ = v
                break
            elif color[v] == 'GRAY':
                visitor.back_edge((u, v, weight))
            else:  # color[v] == 'BLACK'
                visitor.forward_or_cross_edge((u, v, weight))
        if next_ is not None:
            stack.append((v, graph.incident_edge_index_map(v).values()))
            continue
        color[u] = 'BLACK'
        time += 1
        visitor.finish_vertex(u, time)
        stack.pop()
    return time

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions