Skip to content

Add Lah numbers and clean up combinatorics section #39379

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 24 commits into from
Jul 25, 2025

Conversation

kwankyu
Copy link
Collaborator

@kwankyu kwankyu commented Jan 25, 2025

According to the Wikipedia article https://en.wikipedia.org/wiki/Lah_number, a formula that computes the Lah number $L(n,k)$, also called the Stirling numbers of the third kind, is

$$\binom{n-1}{k-1}\frac{n!}{k!}$$

but the most efficient form of the formula seems to be

$$\binom{n}{k}\frac{k}{n}\binom{n}{k}(n-k)!$$

contributed by @user202729.

Long time ago, #17161 proposed yet another formula that gives $L(n,k)$ for all integers $n,k$. But that formula is extremely slow when implemented in Sage. Like existing functions for Stirling numbers, we only support the main case where $n$ and $k$ are nonnegative.

After this PR, we may close #17161, #17159, #17123 since we do not intend to support the negative $k$ case for combinatorial functions, which is I think only of minor interest and in controversy.

I made some cosmetic changes to the file src/sage/combinat/combinat.py, mostly for user friendliness.

While at it, I ended up making extensive changes to index pages of the combinatorics section, for tidiness. In particular,

  • removed some of "forever" TODOs.
  • removed excessive cross references.
  • uniformized capitalization.

and also fixed #17421.

📝 Checklist

  • The title is concise and informative.
  • The description explains in detail what this PR is about.
  • I have linked a relevant issue or discussion.
  • I have created tests covering the changes.
  • I have updated the documentation and checked the documentation preview.

⌛ Dependencies

Copy link

github-actions bot commented Jan 25, 2025

Documentation preview for this PR (built with commit 188e64e; changes) is ready! 🎉
This preview will update shortly after each push to this PR.

@user202729
Copy link
Contributor

While we're at it, maybe binomial(n, k)*(n-k).factorial() would be faster than n.factorial()//k.factorial()?

@kwankyu
Copy link
Collaborator Author

kwankyu commented Jan 26, 2025

Not really, at least on my machine:

sage: def lah_numbers4(n, k):
....:     if k > n:
....:         return 0
....:     if k == 0:
....:         return 0 if n else 1
....:     n = Integer(n)
....:     k = Integer(k)
....:     a = n.factorial() // k.factorial()
....:     return a**2*k // (n*(n-k).factorial())
....: 
....: def lah_numbers5(n, k):
....:     if k > n:
....:         return 0
....:     if k == 0:
....:         return 0 if n else 1
....:     n = Integer(n)
....:     k = Integer(k)
....:     a = binomial(n, k)*(n-k).factorial()
....:     return a**2*k // (n*(n-k).factorial())
....: 
sage: timeit('T1 = matrix([[lah_numbers4(n,k) for n in range(100)] for k in rang
....: e(100)])')
25 loops, best of 3: 9.74 ms per loop
sage: timeit('T2 = matrix([[lah_numbers5(n,k) for n in range(100)] for k in rang
....: e(100)])')
5 loops, best of 3: 116 ms per loop
sage: T1 = matrix([[lah_numbers4(n,k) for k in range(10)] for n in range(10)])
sage: T2 = matrix([[lah_numbers5(n,k) for k in range(10)] for n in range(10)])
sage: T1 == T2
True

@user202729
Copy link
Contributor

Apparently there's massive overhead in binomial(n, k) (symbolic version) over n.binomial(k) for small input. I only tested for large input.

Try this one

    a = n.binomial(k)
    return a * k // n * a * (n-k).factorial() 

(note that n.binomial(k) * k / n == (n-1).binomial(k-1) so the division is exact)

In my experiment, it has minimal overhead for small n and k (as in your matrix) and a few times faster for e.g. lah_numbers(10^6, 10^5) .

It looks impossible to have a version that calls binomial()/factorial() only twice and avoid division though. Best division-free version I got is (n-1).binomial(k-1) * n.binomial(k) * (n-k).factorial() but it takes 3 calls (slower).

@kwankyu
Copy link
Collaborator Author

kwankyu commented Jan 26, 2025

Voilà!

sage: def lah_numbers4(n, k):
....:     if k > n:
....:         return 0
....:     if k == 0:
....:         return 0 if n else 1
....:     n = Integer(n)
....:     k = Integer(k)
....:     a = n.factorial() // k.factorial()
....:     return a**2*k // (n*(n-k).factorial())
....: 
sage: def lah_numbers6(n, k):
....:     if k > n:
....:         return 0
....:     if k == 0:
....:         return 0 if n else 1
....:     n = Integer(n)
....:     k = Integer(k)
....:     a = n.binomial(k)
....:     return a * k // n * a * (n-k).factorial()
....: 
sage: timeit('T1 = matrix([[lah_numbers4(n,k) for n in range(100)] for k in range(100)])')
25 loops, best of 3: 9.36 ms per loop
sage: timeit('T1 = matrix([[lah_numbers6(n,k) for n in range(100)] for k in range(100)])')
25 loops, best of 3: 11.1 ms per loop
sage: timeit('T1 = matrix([[lah_numbers4(10000+n,10000+k) for n in range(100)] for k in range(100)])')
5 loops, best of 3: 1.91 s per loop
sage: timeit('T2 = matrix([[lah_numbers6(10000+n,10000+k) for n in range(100)] for k in range(100)])')
25 loops, best of 3: 20.7 ms per loop
sage: timeit('T1 = matrix([[lah_numbers4(10000+n,k) for n in range(100)] for k in range(100)])')
5 loops, best of 3: 10.7 s per loop
sage: timeit('T2 = matrix([[lah_numbers6(10000+n,k) for n in range(100)] for k in range(100)])')
5 loops, best of 3: 2.19 s per loop

Nice!

@kwankyu kwankyu force-pushed the p/add-lah-numbers branch 2 times, most recently from 260df5f to 42ef17b Compare January 26, 2025 22:23
@kwankyu kwankyu marked this pull request as ready for review January 26, 2025 23:41
@kwankyu kwankyu changed the title Add Lah numbers Add Lah numbers and clean up combinatorics section Jan 26, 2025
@kwankyu kwankyu force-pushed the p/add-lah-numbers branch 2 times, most recently from b5d05e2 to 44c3dc1 Compare January 31, 2025 13:54
@kwankyu kwankyu force-pushed the p/add-lah-numbers branch from 5c38935 to 1ee62d8 Compare June 9, 2025 04:32
@kwankyu
Copy link
Collaborator Author

kwankyu commented Jun 11, 2025

@fchapoton As you currently seem to be in the "combinat" section, would you review this PR? This PR touches many files in the directory src/sage/combinat to improve the documentation, but most of the changes are quite small.

The new code for Lah numbers was, I think, already reviewed by @user202729.

@fchapoton fchapoton self-assigned this Jun 11, 2025
@kwankyu
Copy link
Collaborator Author

kwankyu commented Jun 28, 2025

Resolved merge conflicts.

@fchapoton
Copy link
Contributor

I must say that I am not convinced at all by the necessity to have all titles in lower cap.

@kwankyu
Copy link
Collaborator Author

kwankyu commented Jun 28, 2025

Currently we have different styles mixed, even in the same section. It looks untidy to me. For example, see

https://doc-release--sagemath.netlify.app/html/en/reference/structure/

I am making efforts to make the style consistent, in a long term.

@kwankyu
Copy link
Collaborator Author

kwankyu commented Jul 6, 2025

@fchapoton Still not convinced at all? What is your opinion on the problem of "different styles mixed"?

@fchapoton
Copy link
Contributor

Sorry, but there is only epsilon chance that I will ever review something that changes 174 files.

@kwankyu
Copy link
Collaborator Author

kwankyu commented Jul 11, 2025

As you perhaps know, most of them are trivial "lower caps" changes for titles or minor stylistic edits.

@kwankyu
Copy link
Collaborator Author

kwankyu commented Jul 11, 2025

Sorry, but there is only epsilon chance that I will ever review something that changes 174 files.

OK. Thanks for your previous review.

@user202729
Copy link
Contributor

  • I can't tell whether title case is more consistent than sentence case though. Outside combinat there certainly are cases where things are title case e.g.

    image

    but if e.g. a half of them are already sentence case, I can't see any harm either.

  • I'm also neutral to the changes e.g. "Introductory material → Introduction" or "Thematic indexes → Topics"

  • There are some other improvements such as indentation fixes, or deletion of reference to nonexistent fibonacci_number() function,
    which would be an improvement.

    image
  • Collated credits from individual functions to module index. Neutral.

     -- David Joyner (2006-07): initial implementation.
     -
     +- David Joyner (2006-07): initial implementation
      - William Stein (2006-07): editing of docs and code; many
        optimizations, refinements, and bug fixes in corner cases
     -
     +- Jon Hanke (2006-08): added ``tuples``
      - David Joyner (2006-09): bug fix for combinations, added
        permutations_iterator, combinations_iterator from Python Cookbook,
        edited docs.
     -
      - David Joyner (2007-11): changed permutations, added hadamard_matrix
     -
     +- Blair Sutton (2009-01-26): added ``bell_polynomial``
      - Florent Hivert (2009-02): combinatorial class cleanup
     -
     +- Bobby Moretti (2009-02): added ``fibonacci_sequence`` and ``fibonacci_xrange``
      - Fredrik Johansson (2010-07): fast implementation of ``stirling_number2``
     -
     +- Robert Gerbicz (2010-10): added Bell numbers
      - Punarbasu Purkayastha (2012-12): deprecate arrangements, combinations,
        combinations_iterator, and clean up very old deprecated methods.
     +- Jeroen Demeyer (2014-10): improved implementation of Dobinski formula for Bell numbers
     +  with more accurate error estimates (:issue:`17157`)
     +- Thierry Monteil (2015-09-29): the result of ``bell_polynomial`` must
     +  always be a polynomial
     +- Kei Beauduin (2024-04-06): when univariate, the Bell polynomial is in
     +  variable ``x0``; extended to complete exponential, partial ordinary and
     +  complete ordinary Bell polynomials.
     +- Kwankyu Lee (2025-01): added Lah numbers
  • Some TODO are removed. Reasonable.

@kwankyu
Copy link
Collaborator Author

kwankyu commented Jul 20, 2025

Thanks for the review and for your fast formula for Lah numbers!

@vbraun vbraun merged commit 50399d0 into sagemath:develop Jul 25, 2025
22 of 23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Automatic generation of the module list in the Sphinx documentation for sage.combinat Implement the unsigned Lah numbers
4 participants