Skip to content

Conversation

ilayn
Copy link
Member

@ilayn ilayn commented Nov 24, 2023

Reference issue

Part of #18566

Relevant issues will be added

What does this implement/fix?

The rewrite of the AMOS library.

Additional information

Not as big as cdflib, should be tested meticulously for the edge cases.

Total gotos left: ~40 (out of too many)

  • Add missing documentation from Fortran sources
  • Add tests
  • Get some benchmarks

@ilayn ilayn added enhancement A new feature or improvement scipy.special maintenance Items related to regular maintenance tasks Fortran Items related to the internal Fortran code base labels Nov 24, 2023
@ilayn ilayn added this to the 1.13.0 milestone Nov 24, 2023
@ilayn ilayn marked this pull request as draft November 24, 2023 16:24
@ilayn ilayn force-pushed the remove_amos branch 2 times, most recently from 07223f2 to 4817951 Compare November 25, 2023 09:34
@ilayn ilayn mentioned this pull request Nov 25, 2023
37 tasks
@ilayn
Copy link
Member Author

ilayn commented Dec 21, 2023

I managed to make some progress in a quite unexpected direction.

Because the code is so unorthodox, doing real arithmetic with complex ops, jumping from one for loop into the middle of another and feeling no shame and so on, Cythonization didn't cut it so I switched to C as another goto infested lang. Anyways, I quickly realized why I stopped coding in these langs because it's been roughly 30 years and still nobody managed to agree on what a complex number type should be. So probably at some point #include <complex.h> will be a headache for somewhere some system.

In fact quoting Amos himself ;

Mathematical constants are carried to 18 decimal digits to accommodate UNIVAC double precision should double precision complex arithmetic be implemented. The fact that FORTRAN does not have a type called DOUBLE PRECISION COMPLEX makes this part awkward, but conversion is straightforward. Complex numbers can be carried as a pair. New routines for complex multiply, divide, absolute value, square root, logarithm, and exponential will have to be coded in double precision arithmetic.

If only he knew... I unrolled as many GOTOs as I could. I'll clean the cython code when I'm done 😢

If you have objections to style or strategy please tell me as I hardly know what I am doing.

@ilayn ilayn changed the title [WIP]ENH:MAINT:special:Cythonize Amos lib and remove F77 code [WIP]ENH:MAINT:special:Rewrite amos F77 code Dec 22, 2023
@lucascolley lucascolley added the C/C++ Items related to the internal C/C++ code base label Dec 22, 2023
@ilayn
Copy link
Member Author

ilayn commented Dec 24, 2023

165e716 finished the rewrite. I am pretty confident about airy and biry code in C as I tested in many regions and compared with Fortran code and WolframAlpha.

I'll test the the remaining Bessel ones and start dealing with the wrappers.

@ilayn ilayn mentioned this pull request Dec 24, 2023
5 tasks
@ilayn ilayn changed the title [WIP]ENH:MAINT:special:Rewrite amos F77 code ENH:MAINT:special:Rewrite amos F77 code Dec 24, 2023
@ilayn ilayn marked this pull request as ready for review December 30, 2023 17:36
@ilayn
Copy link
Member Author

ilayn commented Dec 30, 2023

Tested on Win10 and Linux Mint. Only one test is barely failing rel err 1.09e-13 instead of 1.0e-13. Let's see what happens in the CI.

I would like to know what we should do to define missing macros in different platforms like CMPLX and I.

Happy new year all 🎉 🎆

@ilayn
Copy link
Member Author

ilayn commented Jan 1, 2024

Status update;

  • Py3.12 job does not understand the .real and .imag attributes of the NumPy Complex doubles (See the failing job). Is this anything related to NumPy 2.0?

  • I tried to find the source of the mismatch in the test
    image

    but whatever is happening is happening in the wrappers probably at the lowest registers between C and Fortran. Unfortunately this ~1e-58 number is then multiplied with a very large number and the result is not possible to hit that same relative accuracy. Hence I made the test relative tolerance to be 2e-13 instead of 1e-13.

If the first one is resolved and second item is acceptable, then this is ready to be reviewed.

@lucascolley
Copy link
Member

Py3.12 job does not understand the .real and .imag attributes of the NumPy Complex doubles (See the failing job). Is this anything related to NumPy 2.0?

I'm just guessing but is numpy/numpy#24085 related?

@ilayn
Copy link
Member Author

ilayn commented Jan 1, 2024

Ah yes, probably. Let me use the NPY_CREAL and NPY_CIMAG just to be on the safe side.

@ilayn
Copy link
Member Author

ilayn commented Jan 1, 2024

Ok seems like we are game. All feedback welcome.

@rgommers
Copy link
Member

rgommers commented Jan 2, 2024

Impressive effort @ilayn!

This isn't quite reviewable line by line, so I guess we should look at test coverage, build warnings, etc. It maybe also be useful to run clang-format over the new code now.

Copy link
Contributor

@steppi steppi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was quite a feat to untangle and translate all of this, the Fortran is not at all easy to understand. Bessel functions can be tricky to compute and despite its tortured code style, AMOS is actually pretty solid, so I wanted to give this a very close look.

What I decided to do is pick one of the most difficult functions and study the code in depth, comparing to the original Fortran and Donald Amos' paper. I chose amos_binu which computes the modified Bessel function of the first kind in the right half plane. If this part had been translated perfectly, I would have had more confidence in everything, since it is among the most difficult parts to get right.

I found an incorrect condition which causes degraded performance in some cases. Just the kind of little mistake I could have made too if I was the one who translated this. It makes me think I'll need to look over everything with a similar level of care. I don't quite have bandwidth for that at the moment. It would probably be several days of work.

@ilayn, I don't know what your bandwidth looks like, but you might want to trace through all of the control flow again to check if you made any other little mistakes like that. Also, I found a comment in the wrong place, it'll be helpful if you make sure all of those are where they belong.

I found it helpful to look through Amos' paper I linked to above. Despite all of the spaghetti, most of what's happening here is that different methods are being used in different regions of $\mathbb{R}\times \mathbb{Z}$. The unstructured Fortran code makes the logic seem much more complicated than it is.

@ilayn
Copy link
Member Author

ilayn commented Jan 13, 2024

Ha, it's not less feat to go through this so thanks @steppi

Funny that you picked binu because I think I wrote that one at least 3 times. And the part you stumbled upon is exactly where I didn't get right apparently, the conditions are so weirdly tangled that I made a true false diagram for it. I'm surprised that it is still broken.

So we are looking at this flow

IF (AZ.LT.RL) GO TO 40
IF (DFNU.LE.1.0D0) GO TO 30
IF (AZ+AZ.LT.DFNU*DFNU) GO TO 50
C-----------------------------------------------------------------------
C ASYMPTOTIC EXPANSION FOR LARGE Z
C-----------------------------------------------------------------------
30 CONTINUE
CALL ZASYI(ZR, ZI, FNU, KODE, NN, CYR, CYI, NW, RL, TOL, ELIM,
* ALIM)
IF (NW.LT.0) GO TO 130
GO TO 120
40 CONTINUE
IF (DFNU.LE.1.0D0) GO TO 70
50 CONTINUE

So first jumps, then falls through then jumps further and if not then falls through. The reason I have those extra conditions are these if else sillyness. If it falls through we finish off with 120 or 130. But if not further things happen. So the idea was whether it is coming from the 50 jump or coming from 40 with false IF condition.

Fortunately we don't have another occurence of this so I am pretty confident about the remaining parts since I tested quite a number of them.

I'll check what the difference is between your version and what I wrote

@ilayn
Copy link
Member Author

ilayn commented Jan 13, 2024

OK apparently I fixed this in my local work but did not pass it over to the PR

image

This binu is such an annoying piece of code.

@steppi
Copy link
Contributor

steppi commented Jan 13, 2024

This binu is such an annoying piece of code.

Yeah, it's the trickiest one in my opinion. If that had been perfect, I think I would have just gave everything else a spot check and merged. I'm feeling better about things again after your explanations. I should be able to find time to finish reviewing within the coming week.

@ilayn
Copy link
Member Author

ilayn commented Jan 13, 2024

No worries, that's really good catch. Let's hope I didn't make more labor intensive mistakes.

@ilayn
Copy link
Member Author

ilayn commented Jan 13, 2024

@steppi apologies for getting back to this again but I think you found a bug in the fortran code itself. Because when I check it with the fortran code itself this is what I get; this is the fortran code annotated with lots of print statements and gives 1e-13 difference

BESI 225: BESK CALLING BINU
 BINU  35: GOING TO 40?   21.213203435596427        21.784271729432426     
 BINU  55: DID NOT GO TO 70 FROM 40
 BINU  59: CALLING UOIK   15.000000000000000        15.000000000000000
 UNIK 135: SR    1.0002612190046860        2.2858395497944661E-002
 UNIK 198: CWRK[15]  0.10036965266287727       -1.1466953394589764E-003
 BINU  59: RETURNED UOIK WITH NW           0
 BINU  63: CY[0]   0.0000000000000000        0.0000000000000000
 BINU 113: CALLING BUNI
 BUNI 158: CALLING UNI1
 UNIK 135: SR    1.0002612190046860        2.2858395497944661E-002
 UNIK 198: CWRK[15]  0.10036965266287727       -1.1466953394589764E-003
 UNIK 135: SR    1.0002612190046860        2.2858395497944661E-002
 UNIK 198: CWRK[15]  0.10036965266287727       -1.1466953394589764E-003
 UNIK 178: K AFTER LOOP           9
 UNIK 181: K AFTER LOOP           9
 UNIK 198: SUM  0.99916734689966913        1.1361412899521058E-004
 UNIK 198: PHI   4.0041698116427993E-002  -4.5746525364945902E-004
 UNIK 198: CWRK[15]  0.10036965266287727       -1.1466953394589764E-003
 UNIK 198: CON[0]  0.39894228040143270
 UNI1 125: CSR[           1 ] =   1.0000000000000000
 UNI1 125: S2  -2.0645908843310795E-055  -1.1127773731040390E-055
 UNI1 125: Y[           0 ] =   -2.0645908843310795E-055  -1.1127773731040390E-055
 BINU 113: RETURNED BUNI WITH NW           0
 BINU 113: CY[0]  -2.0645908843310795E-055  -1.1127773731040390E-055
 BINU 113: CY[1]   1.6198761276743044E-312  -2.0645908843310795E-055
 BESI 228: BESK RETURNED FROM BINU WITH NZ =           0
(-2.0645908843310795e-55, -1.112777373104039e-55, 0, 0)

This is what you get from the wrong SciPy result above. The reason I am convinced that this is a Fortran bug is because my code also goes into the same branches and not into UOIK as you mentioned it should go immediately. They do go indeed but return the same result.

_amos.c:890: BESI CALLING BINU
_amos.c:1334: BINU AZ < RL check 21.2132034355964265 <? 21.7842717294324260
_amos.c:1334: BINU dfnu < RL check 99.2000000000000028 <? 1.0
_amos.c:1334: BINU az+az < dfnu*dfnu check 42.4264068711928530 <? 198.4000000000000057
_amos.c:1373: BINU CALLING UOIK
_amos.c:1373: BINU with z 15.0000000000000000 15.0000000000000000
_amos.c:3665: UNIK sr = 1.000261219004686 0.02285839549794466
_amos.c:3709: UNIK: cwrk[15]  0.1003696526628773 -0.001146695339458976
_amos.c:1373: BINU RETURNED FROM UOIK NW = 0
_amos.c:1373: BINU cy[0] 0 0
_amos.c:1394: BINU CALLING BUNI
_amos.c:2156: BUNI CALLING UNI1
_amos.c:3665: UNIK sr = 1.000261219004686 0.02285839549794466
_amos.c:3709: UNIK: cwrk[15]  0.1003696526628773 -0.001146695339458976
_amos.c:3665: UNIK sr = 1.000261219004686 0.02285839549794466
_amos.c:3709: UNIK: cwrk[15]  0.1003696526628773 -0.001146695339458976
_amos.c:3726 UNIK: K AFTER LOOP 9
_amos.c:3749: UNIK AFTER INIT=0 case INIT = 9
_amos.c:3709: UNIK: sum  0.9991673468996691 0.0001136141289952106
_amos.c:3709: UNIK: phi  0.04004169811642799 -0.000457465253649459
_amos.c:3709: UNIK: cwrk[15]  0.1003696526628773 -0.001146695339458976
_amos.c:3709: UNIK: con[0]  0.3989422804014327
_amos.c:3342: UNI1 csr[1] 1
_amos.c:3342: UNI1 s2 -2.064590884330946e-55 -1.112777373104005e-55
_amos.c:3342: UNI1 y[0] -2.064590884330946e-55 -1.112777373104005e-55
_amos.c:2156: BUNI RETURNED FROM UNI1 NW = 0
_amos.c:1394: BUNI y[0] -2.064590884330946e-55 -1.112777373104005e-55
_amos.c:1394: BINU RETURNED FROM BUNI NW = 0, NLAST 0
_amos.c:1394: BINU cy[0] -2.064590884330946e-55 -1.112777373104005e-55
_amos.c:892: BESI RETURNED FROM BINU WITH NZ = 0
0 0 -2.0645908843309458552e-55 -1.1127773731040049375e-55i

I guess copied the Fortran code a bit too good.

@ilayn
Copy link
Member Author

ilayn commented Jan 13, 2024

Hmm I am more confused now because then my fixed version is diverged from the Fortran code itself. And if I add your fix then we have many tests failing.

@ilayn
Copy link
Member Author

ilayn commented Jan 13, 2024

OK the current SciPy which uses Fortran also does not pass 1e-15 rtol level too

C:\Users\ilhan> ipython
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug  1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.19.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: import scipy

In [2]: scipy.__version__
Out[2]: '1.12.0rc1'

In [3]: import scipy.special as ssp

In [4]: ssp.iv(99.2, 15+15j)
Out[4]: (-2.0645908843310788e-55-1.1127773731040388e-55j)

In [11]: assert_allclose(ssp.iv(99.2, 15+15j), mpmath,rtol=1-15)

[..snip..]

AssertionError:
Not equal to tolerance rtol=-14, atol=0

Mismatched elements: 1 / 1 (100%)
Max absolute difference: 1.28411573e-68
Max relative difference: 5.47508521e-14
 x: array(-2.064591e-55-1.112777e-55j)
 y: array(-2.064591e-55-1.112777e-55j)

So I think I'll just take a break for today and tomorrow I'll do another round.

ilayn and others added 5 commits January 14, 2024 00:50
* MAINT: special: remove goto in acai

* MAINT: special: remove some gotos from bknu

* MAINT: simplify if statements

* MAINT: special: remove a few gotos unk1

* MAINT: remove duplicate code in acai

* STY:special:amos:Basic fixups for do loops

---------

Co-authored-by: Ilhan Polat <ilhanpolat@gmail.com>
MAINT:special:amos: Add const specifier for array data

STY:special:amos: Cleanup and organize data and comments

MAINT:special:amos: Silence compiler for fabs(integer) use
@ilayn
Copy link
Member Author

ilayn commented Jan 13, 2024

I am not sure where I confused myself but I used my working code and swapped places following your lead and indeed tests pass. And moreover, I reverted the modified test since that also passes too. Let's see how it goes. Overall, I think this indeed was a very good catch. And the test suite is quite comprehensive it seems.

And just for the ease of testing, you can choose to test only the public functions as they go through multiple hoops and different domains should catch any remaining discrepencies.

Copy link
Contributor

@steppi steppi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ilayn. I've looked over everything else with a level of attention somewhere between spot check and exhaustive study while referring to the original paper. The rest looks good to me, and none of it has such twisted logic in the original as binu. It's possible I missed something still, but I feel confident now, especially with the level of testing we have.

@steppi steppi merged commit 8e9d438 into scipy:main Jan 14, 2024
@ilayn
Copy link
Member Author

ilayn commented Jan 14, 2024

Thanks @steppi much appreciated.

@j-bowhay
Copy link
Member

@steppi do you think we should update the docstrings as currently they state along the lines of:
Wrapper for a C translation of the AMOS [1]_ routines `zairy` and `zbiry`.
which I guess now is not 100% correct.
I suggested in ilayn#7 to change to
Wrapper for a C translation of the AMOS [1]_ routines `zairy` and `zbiry`.
however, perhaps this is too much detail for the end user. Do you have a view here?

@steppi
Copy link
Contributor

steppi commented Jan 14, 2024

@steppi do you think we should update the docstrings as currently they state along the lines of: Wrapper for a C translation of the AMOS [1]_ routines `zairy` and `zbiry`. which I guess now is not 100% correct. I suggested in ilayn#7 to change to Wrapper for a C translation of the AMOS [1]_ routines `zairy` and `zbiry`. however, perhaps this is too much detail for the end user. Do you have a view here?

Sorry. I didn't find any sign of ilayn#7 mentioned in the comment chain above. I would have waited to merge this if I'd known you had this concern.

Honestly, I thought the docstrings were fine as is. I was thinking of the AMOS routines in the abstract, not tied to any particular implementation. Now that you mention it though, letting users know these are translations might save people from some confusion around our functions giving slightly different results from AMOS in some cases.

For the future. I expect these functions to diverge further and further from AMOS over time as we make improvements. If so, I think the docstrings should be updated again to say our implementations are based on those of Amos, citing D.E. Amos' Computation of Bessel functions of a complex argument, which discusses the math behind the implementations.

@ilayn
Copy link
Member Author

ilayn commented Jan 14, 2024

Sorry. I didn't find any sign of ilayn#7 mentioned in the comment chain above. I would have waited to merge this if I'd known you had this concern.

No problem at all. There is just too much to look at with these Fortran stuff so we are always forgetting stuff.

But indeed I would also using the abstract terminology of citing Amos' paper. The interested readers would discover where we got the implementations at the top of the license comments anyways.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C/C++ Items related to the internal C/C++ code base enhancement A new feature or improvement Fortran Items related to the internal Fortran code base maintenance Items related to regular maintenance tasks scipy.special
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants