Releases: mcthouacbb/Sirius
Sirius v9.0
Release notes
After months of work, issues with buggy OpenBench workers, and over 100 commits, Sirius is back with its next major release. It brings many improvements to the search and evaluation, along with some QoL improvements, including support for the Stockfish WDL model. Major changes include 3 SPSA tunes at LTC, a much improved king safety evaluation, many tweaks and additions to passed pawn and pawn structure evaluation, quiet check evasion generation in quiescence search, tweaks and additions to correction histories, and more. Sirius v9.0 is nearly 130 Elo stronger than Sirius v8.0 on a balanced book.
Additionally, Sirius made its TCEC (Top Chess Engine Championship) debut these past few months. Sirius performed well against other HCE (handcrafted evaluation) engines, but was not very competitive against most NNUE engines due to their much stronger evaluation.
Changelog
- Lots of refactors and cleanups
- Added clang-format for more consistent code formatting
- Fixed promotions printing uppercase letters
- Support for FRC and DFRC (Chess960 and Double Chess960)
- Some stability and speed improvements for multithreading and TT clearing
- Tested up to TCEC conditions (512 threads and 256GB hash)
- Stockfish WDL Model Support
- Normalized Eval
- Estimated win, draw, and loss probabilities
- Elo estimates for all major search features
- Speedups
- Staged move generation for killer moves
- Many evaluation changes
- Fixed the evaluation to be fully vertically symmetric
- King safety
- A quadratic adjustment formula applied after summing all safety terms
- Penalty for weak squares around the king
- Penalty for an attack without a queen
- Bonus/penalty for king flank attacks and defenses
- Improved safe checks calculation
- Pawn structure and passed pawn evaluation
- Bonus for candidate passed pawns
- Penalty for backwards and doubled pawns
- Tweaked definition of isolated pawns
- Better king-passer proximity calculation
- Exclude the king and blocked pawns from mobility
- Bonus for restricting the enemy's mobility
- Bonus for being able to move a piece to attack the enemy queen
- Bonus for a bishop on a long center diagonal
- Specialized evaluation and scaling functions for certain endgames
- Many search tweaks
- Quiescence search quiet check evasion generation and pruning
- Correction history
- Added continuation correction history
- Tuned correction history weights
- Improved correction history updates
- LMR adjustments based on correction history
- Move loop pruning
- Added noisy futility pruning
- Increased max pruning depth
- Adjust futility margin by history
- Better capture move ordering
- Improved internal iterative reductions implementation
- More accurate handling of pinned pieces in static exchange evaluation
- Added probcut
- Improved singular extensions and negative extensions
- Many other changes
- 3 LTC SPSA tunes of the entire search
Self-play against Sirius v8.0
Elo | 127.02 +- 6.45 (95%)
Conf | 8.0+0.08s Threads=1 Hash=16MB
Games | N: 5004 W: 2087 L: 335 D: 2582
Penta | [12, 151, 814, 1123, 402]
Elo | 128.92 +- 7.78 (95%)
Conf | 40.0+0.40s Threads=1 Hash=64MB
Games | N: 2502 W: 965 L: 77 D: 1460
Penta | [1, 28, 451, 624, 147]
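The Elo point estimates in these result blocks can be roughly reproduced from the win/loss/draw counts with the standard logistic Elo model. The following is a minimal sketch under that assumption; the function name `elo_from_wld` is hypothetical and not part of Sirius or OpenBench, and the +- margin (presumably derived from the game-pair "Penta" counts) is not reproduced here.

```python
import math


def elo_from_wld(wins: int, losses: int, draws: int) -> float:
    """Estimate the Elo difference implied by a match result.

    Uses the logistic model: expected score s = 1 / (1 + 10^(-elo / 400)),
    solved for elo. Assumes 0 < s < 1 (at least one non-win and one non-loss).
    """
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games  # fraction of points scored
    return 400.0 * math.log10(score / (1.0 - score))


# Counts taken from the two self-play runs against Sirius v8.0 above.
stc = elo_from_wld(wins=2087, losses=335, draws=2582)
ltc = elo_from_wld(wins=965, losses=77, draws=1460)
print(f"STC: {stc:.2f}  LTC: {ltc:.2f}")
```

Both values should land close to the reported 127.02 and 128.92, which suggests the point estimate depends only on the overall score fraction.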
Elo Estimates
Results of sirius-dev-a2c68f9 vs komodo-14.1 (40+0.4, 1t, 32MB, UHO_Lichess_4852_v1.epd):
Elo: 72.08 +/- 11.80, nElo: 117.91 +/- 18.74
LOS: 100.00 %, DrawRatio: 38.33 %, PairsRatio: 3.33
Games: 1320, Wins: 498, Losses: 228, Draws: 594, Points: 795.0 (60.23 %)
Ptnml(0-2): [5, 89, 253, 257, 56], WL/DD Ratio: 1.04
Results of sirius-dev-a2c68f9 vs weiss-dev-95b0951 (20+0.2, 1t, 32MB, UHO_Lichess_4852_v1.epd):
Elo: 32.16 +/- 8.23, nElo: 52.51 +/- 13.35
LOS: 100.00 %, DrawRatio: 44.62 %, PairsRatio: 1.62
Games: 2600, Wins: 868, Losses: 628, Draws: 1104, Points: 1420.0 (54.62 %)
Ptnml(0-2): [14, 261, 580, 361, 84], WL/DD Ratio: 1.41
At medium-long time controls, Sirius is significantly stronger than the last HCE release of Komodo and quite a bit stronger than Weiss dev. This should cement Sirius as the current second-strongest HCE engine. It is still quite far from Stockfish classical, the current strongest HCE engine.
Results of sirius-dev-a2c68f9 vs sf-classical (40+0.4, 1t, 32MB, UHO_Lichess_4852_v1.epd):
Elo: -40.18 +/- 10.26, nElo: -69.01 +/- 17.47
LOS: 0.00 %, DrawRatio: 42.50 %, PairsRatio: 0.46
Games: 1520, Wins: 326, Losses: 501, Draws: 693, Points: 672.5 (44.24 %)
Ptnml(0-2): [25, 275, 323, 124, 13], WL/DD Ratio: 1.20
Estimated rating: 3530 (CCRL Blitz)
Estimated rating: 3700 (CCRL FRC)
Credits
Thanks to the members of the engine dev community who continue to support me and discuss topics in computer chess with me.
Additionally, much credit goes to Stockfish, Ethereal, Weiss, Stash, and Perseus, along with their developers.
Many ideas and features in Sirius were inspired by or taken from these engines, and their knowledge has been invaluable to the development of Sirius.
Selecting a binary
v1 is significantly slower than v2, v3, and v4, and should only be used when absolutely necessary. In general, higher levels are faster.
If you download a binary and it crashes (closes immediately or doesn't respond to commands), it likely doesn't work on your CPU, and you should download a different binary.
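If the v1 to v4 suffixes correspond to the x86-64 microarchitecture levels (x86-64-v1 through x86-64-v4), which these notes do not state explicitly, the highest level a CPU supports can be estimated from its feature flags. The sketch below is an approximation under that assumption: the helper names are hypothetical, the flag names are the ones Linux reports in `/proc/cpuinfo`, and the flag sets are a simplified version of the level definitions.

```python
# Feature flags (Linux /proc/cpuinfo spelling) required on top of the previous level.
# "pni" is SSE3 and "abm" covers LZCNT in Linux's naming.
V2_FLAGS = {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}
V3_FLAGS = {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"}
V4_FLAGS = {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"}


def x86_64_level(flags: set) -> int:
    """Return the highest level (1 to 4) whose required flags are all present."""
    level = 1
    for required in (V2_FLAGS, V3_FLAGS, V4_FLAGS):
        if not required <= flags:  # a required flag is missing: stop here
            break
        level += 1
    return level


def read_cpu_flags(path: str = "/proc/cpuinfo") -> set:
    """Read the flag set of the first CPU listed (Linux only)."""
    with open(path) as cpuinfo:
        for line in cpuinfo:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()
```

On Linux, `x86_64_level(read_cpu_flags())` gives a starting guess for which binary to try first; if that binary still crashes, falling back to a lower level as described above remains the practical test.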
Linux binaries courtesy of @JonathanHallstrom
Full Changelog: v8.0...v9.0
Sirius v8.0
Release notes
Sirius v8.0 is much stronger than the previous version. It brings many improvements to the search and evaluation, along with some QoL improvements, including pretty printing. Major changes include singular extensions, a much better correction history implementation, a rework of the threats evaluation, and staged move generation. Sirius v8.0 is 200+ Elo stronger than Sirius v7.0 on a balanced book.
Changelog
- Lots of refactors
- UCI changes
- Pretty printing
- perft and eval command
- Speedups
- Staged move generation
- Incrementally updated evaluation terms
- A few new evaluation features
- Horizontally mirrored piece-square tables
- Threats
- Complete rework
- Pawn push and king threats
- Defended/weak threats
- King safety
- Pawn shield/storm rework
- Passed pawns
- Score based on whether push square is blocked/enemy attacked
- Minors behind pawns
- Bishop same color pawns
- Many search tweaks
- Best move stability time management
- Correction history
- Qsearch correction
- Many new tables besides pawn structure: minor/major piece, non pawn, material, and threats
- Singular extensions
- Multicut, double extensions, and negative extensions
- Razoring
- Many other search changes
Self-play against Sirius v7.0
Elo | 212.25 +- 7.84 (95%)
Conf | 8.0+0.08s Threads=1 Hash=16MB
Games | N: 5004 W: 2946 L: 220 D: 1838
Penta | [2, 66, 489, 1094, 851]
Elo | 207.46 +- 16.39 (95%)
Conf | 40.0+0.40s Threads=1 Hash=64MB
Games | N: 1000 W: 564 L: 29 D: 407
Penta | [1, 5, 103, 240, 151]
Estimated 3450 rating
Credits
Thanks to the members of the engine dev community who continue to support me and discuss topics in computer chess with me.
Selecting a binary
v1 is significantly slower than v2, v3, and v4 (about 15x), and should only be used when absolutely necessary. In general, higher levels are faster.
If you download a binary and it crashes (closes immediately or doesn't respond to commands), it likely doesn't work on your CPU, and you should download a different binary.
Full Changelog: v7.0...v8.0
v7.0
Release notes
This version of Sirius is, unfortunately, no longer using just piece-square tables for the evaluation function. It is, however, significantly stronger than Sirius v6.0 in self-play. Major new features include a more sophisticated handcrafted evaluation function tuned with my own tuner, and a few major search features.
Changelog
- A few minor refactors
- Many new evaluation features
- Mobility
- Threats
- Pawn structure
- Passed pawns, isolated pawns, etc.
- King safety
- Pawn shield/storm
- Safe checks from enemy pieces
- Attacks to squares near kings
- Rooks on open/semi-open files
- Knight outposts
- Bishop pair
- Tempo bonus
- A few search changes
- Static evaluation correction history
- Capture history
- Quiescence search futility pruning
- Threat history
- Some other tweaks
- Complete LTC SPSA tune of the entire search
Self-play against Sirius v6.0
Elo | 269.79 +- 7.38 (95%)
Conf | 8.0+0.08s Threads=1 Hash=16MB
Games | N: 10000 W: 7189 L: 682 D: 2129
Penta | [29, 163, 708, 1472, 2628]
Elo | 306.84 +- 10.27 (95%)
Conf | 40.0+0.40s Threads=1 Hash=64MB
Games | N: 5000 W: 3715 L: 175 D: 1110
Penta | [0, 41, 251, 835, 1373]
Estimated 3200 rating
Credits
Thanks to the members of the engine dev community who continue to support me.
Special thanks to Andy Grant for creating OpenBench.
Selecting a binary
v1 is significantly slower than v2, v3, and v4 (about 15x), and should only be used when absolutely necessary. In general, higher levels are faster.
If you download a binary and it crashes (closes immediately or doesn't respond to commands), it likely doesn't work on your CPU, and you should download a different binary.
Full Changelog: v6.0...v7.0
Sirius-v6.0 output bugfix
Thanks to dio from CEGT for helping to find the bug. Sirius had a bug where it would output 2 spaces between "score" and "cp" in its UCI output; Arena is unable to parse this and displays a score of 0. This release contains fixed binaries without the double space. It is functionally equivalent to the normal 6.0 release in every other way.
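For GUI or tooling authors, the failure mode described here can be avoided by splitting UCI `info` lines on arbitrary runs of whitespace rather than on single spaces. The following is a minimal sketch of such a tolerant parser; `parse_score_cp` is a hypothetical helper, not code from Sirius or Arena.

```python
from typing import Optional


def parse_score_cp(info_line: str) -> Optional[int]:
    """Return the centipawn score from a UCI "info" line, or None if absent.

    str.split() with no argument splits on any run of whitespace, so
    "score  cp 35" (two spaces) parses the same as "score cp 35".
    """
    tokens = info_line.split()
    for i in range(len(tokens) - 2):
        if tokens[i] == "score" and tokens[i + 1] == "cp":
            return int(tokens[i + 2])
    return None


# The double space between "score" and "cp" mimics the bug described above.
buggy = "info depth 10 score  cp 35 nodes 1000 pv e2e4"
fixed = "info depth 10 score cp 35 nodes 1000 pv e2e4"
print(parse_score_cp(buggy), parse_score_cp(fixed))
```

Mate scores (`score mate N`) are deliberately not handled in this sketch and return `None`.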
v6.0
Changelog
- Lots of refactors
- Lots of fixes
- Much better time management
- Soft time management
- Node count time management
- A new set of piece square tables
- Improvements to various pruning techniques
- The improving heuristic
- Reverse futility pruning
- Late move pruning
- Late move reductions
- Better history implementation
- Continuation history
- 3-fold PVS/LMR
- Lots of performance improvements
- Pseudo-legal movegen
- Better TT lookup
- Added history pruning
- Added internal iterative reductions
- Adjust static evaluation using the tt score
- Lazy SMP
- Other tweaks and improvements
Full changelog: v5.0...v6.0
Release Notes
This is the final version where Sirius is piece-square table only. This release brings many improvements to the search, a completely new set of piece-square tables, and tons of bug fixes and performance improvements. Sirius v6.0 is around 370 Elo stronger than the previous version in self-play, and is rated at an estimated 3050 Elo.
Self-play against sirius-v5.0
Score of sirius-6.0 vs sirius-5.0: 3293 - 101 - 641 [0.896] 4035
... sirius-6.0 playing White: 1693 - 39 - 286 [0.910] 2018
... sirius-6.0 playing Black: 1600 - 62 - 355 [0.881] 2017
... White vs Black: 1755 - 1639 - 641 [0.514] 4035
Elo difference: 373.3 +/- 13.3, LOS: 100.0 %, DrawRatio: 15.9 %
tc: 8+0.08
Score of sirius-6.0 vs sirius-5.0: 1397 - 28 - 325 [0.891] 1750
... sirius-6.0 playing White: 746 - 8 - 121 [0.922] 875
... sirius-6.0 playing Black: 651 - 20 - 204 [0.861] 875
... White vs Black: 766 - 659 - 325 [0.531] 1750
Elo difference: 365.2 +/- 18.9, LOS: 100.0 %, DrawRatio: 18.6 %
tc: 60+0.6s
Against PeSTO
Comparison against PeSTO, the current (previous?) strongest PST-only engine
Score of sirius-6.0 vs PeSTO: 784 - 629 - 1347 [0.528] 2760
... sirius-6.0 playing White: 410 - 290 - 680 [0.543] 1380
... sirius-6.0 playing Black: 374 - 339 - 667 [0.513] 1380
... White vs Black: 749 - 664 - 1347 [0.515] 2760
Elo difference: 19.5 +/- 9.3, LOS: 100.0 %, DrawRatio: 48.8 %
tc: 20+0.2
Rating estimates
Rank Name Elo +/- Games Wins Losses Draws Points Score Draw White Black
0 sirius-6.0 -9 9 4500 1600 1720 1180 2190.0 48.7% 26.2% 50.4% 46.9%
1 altair-5.0.0 176 28 500 300 66 134 367.0 73.4% 26.8% 74.0% 72.8%
2 stash-27.0 15 25 500 183 162 155 260.5 52.1% 31.0% 53.2% 51.0%
3 4ku-4.0 12 25 500 183 166 151 258.5 51.7% 30.2% 52.4% 51.0%
4 midnight-6 11 26 500 188 172 140 258.0 51.6% 28.0% 52.0% 51.2%
5 polaris-1.8.1 0 26 500 176 176 148 250.0 50.0% 29.6% 51.2% 48.8%
6 akimbo-0.5.0 -7 27 500 188 198 114 245.0 49.0% 22.8% 53.2% 44.8%
7 peacekeeper-1.60 -12 26 500 179 196 125 241.5 48.3% 25.0% 49.8% 46.8%
8 pedantic-0.6.0 -38 26 500 159 214 127 222.5 44.5% 25.4% 50.2% 38.8%
9 princhess-0.15.1 -60 28 500 164 250 86 207.0 41.4% 17.2% 41.8% 41.0%
tc: 8+0.08
All tests done on 8moves_v3.epd
Credits
Huge thanks to Ciekce, JW, Alex2262, and Cj5716, along with many others in the engine dev community, who have continued to provide valuable support and insight on the development of Sirius.
Selecting a binary
v1 is significantly slower than v2, v3, and v4 (about 3x), and should only be used when absolutely necessary. In general, higher levels are faster.
If you download a binary and it crashes (closes immediately or doesn't respond to commands), it likely doesn't work on your CPU, and you should download a different binary.
Sirius 5.0
Release notes
This release brings a large strength gain from various search changes, bringing Sirius to around 2700 Elo. In self-play, Sirius v5.0 is roughly 400 Elo stronger than Sirius v4.0 with a balanced book. This release also fixes many bugs and crashes related to the UCI implementation in the previous version. The evaluation has not changed significantly, but has been tuned more effectively.
Self Play against v4.0
8+0.08 STC with 8moves_v3.epd
Score of Sirius-5.0 vs Sirius-4.0: 642 - 32 - 77 [0.906] 751
... Sirius-5.0 playing White: 328 - 16 - 32 [0.915] 376
... Sirius-5.0 playing Black: 314 - 16 - 45 [0.897] 375
... White vs Black: 344 - 330 - 77 [0.509] 751
Elo difference: 393.9 +/- 35.9, LOS: 100.0 %, DrawRatio: 10.3 %
60+0.6 LTC with 8moves_v3.epd
Score of Sirius-5.0 vs Sirius-4.0: 618 - 14 - 68 [0.931] 700
... Sirius-5.0 playing White: 309 - 8 - 33 [0.930] 350
... Sirius-5.0 playing Black: 309 - 6 - 35 [0.933] 350
... White vs Black: 315 - 317 - 68 [0.499] 700
Elo difference: 453.2 +/- 40.6, LOS: 100.0 %, DrawRatio: 9.7 %
Changelog
- Tons of bug fixes, improvements, and tweaks
- Completely reworked UCI implementation
- Large refactor
- Switched to asymmetric piece square tables
- Switched to fail-soft alpha-beta pruning
- Fixed PVS/LMR implementation
- Improved futility pruning implementation
- Added reverse futility pruning
- Improved late move reductions
- Added history malus
- Added bench
- Added SEE pruning
- Added history gravity
- Added SEE move ordering
- Improved TT replacement scheme
- Added TT cutoffs in quiescence search
- Added late move pruning
- Added packed evaluation
Full changelog: v4.0...v5.0
Credits
Huge thanks to Ciekce, JW, and Alex2262, along with many others in the engine dev community. Their insight and experience with engine dev have been invaluable for the development of Sirius.
Selecting a binary
v1 is significantly slower than v2, v3, and v4 (about 3x), so it should only be used on very old CPUs. In general, higher levels are faster.
If you download a binary and it crashes (closes immediately or doesn't respond to commands), it likely doesn't work on your CPU, and you should download a different binary.