MAINT: constants: reorganize codata constants data #20055

jakobjakobson13 · 2024-02-08T21:52:22Z

I'll try to break down my draft #17577 into smaller bits, starting with this pull request.

Reference issue

gh-17577

What does this implement/fix?

It slims down the _codata.py file by moving the constants data into their own files. This makes the data easier to edit.

Additional information

This pull request should not break anything as the changes are only internal.

jakobjakobson13 · 2024-02-09T07:27:56Z

Could anyone give me a hint why the file scipy/constants/codata_constants_2002.txt can't be found?

ev-br · 2024-02-09T07:59:19Z

Try adding it to the list in meson.build, https://github.com/scipy/scipy/blob/main/scipy/constants/meson.build

lucascolley · 2024-02-09T17:01:30Z

relevant CI error seems to be

scipy/constants/_codata.py:106: in parse_constants_2002to2014
    uncert = float(line[77:99].replace(' ', '').replace('(exact)', '0'))
E   ValueError: could not convert string to float: '0.00000029e'
        constants  = {'alpha particle-electron mass ratio': (7294.2995361, '', 2.9e-06), '{220} lattice spacing of silicon': (1.920155714e-10, '2          m', 3.2e-07)}
        d          = '\n             2010 Fundamental Physical Constants --- Complete Listing\n\n\n  From:  http://physics.nist.gov/constan...1\nWien wavelength displacement law constant                   2.897 7721 e-3           0.000 0026 e-3           m K\n'
        line       = 'alpha particle mass                                         6.644 656 75 e-27        0.000 000 29 e-27        kg'
        name       = 'alpha particle mass'
        uncert     = 2.9e-06
        units      = ''
        val        = 6.64465675e-27
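
The failure can be reproduced from the `line` value in the traceback: the slice `line[77:99]` ends right after the `e`, so the exponent digits are lost. Below is a minimal sketch of an offset-independent alternative. It assumes that columns are separated by at least two spaces while digit groups inside a number use single spaces, and it rebuilds the line with 60- and 25-character columns, which is an assumption that happens to reproduce the reported error. `to_float` and `parse_line` are hypothetical helpers, not part of scipy.

```python
import re

# Reconstructed data line (column widths are an assumption, see above).
line = ("alpha particle mass".ljust(60)
        + "6.644 656 75 e-27".ljust(25)
        + "0.000 000 29 e-27".ljust(25)
        + "kg")

# The fixed-width slice from the traceback stops right after the 'e'.
print(repr(line[77:99].replace(" ", "")))  # prints '0.00000029e'


def to_float(text: str) -> float:
    """Convert a CODATA-style number such as '6.644 656 75 e-27'."""
    text = text.replace(" ", "").replace("...", "")
    return 0.0 if text == "(exact)" else float(text)


def parse_line(line: str) -> tuple[str, float, str, float]:
    """Split on runs of two or more spaces instead of fixed offsets."""
    fields = re.split(r"\s{2,}", line.strip())
    name, value, uncert = fields[:3]
    unit = fields[3] if len(fields) > 3 else ""  # dimensionless constants
    return name, to_float(value), unit, to_float(uncert)


name, val, units, uncert = parse_line(line)
```

Splitting on whitespace runs trades the dependence on exact column positions for the assumption that no constant name or unit contains two consecutive spaces.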

jakobjakobson13 · 2024-02-09T23:31:44Z

For the parsing of the 2002 file I came up with the following parsing function but it contains loads of regular expressions:

import re

def parse_constants_2002to2006(d: str) -> dict[str, tuple[float, str, float]]:
    constants = {}
    for line in d.split('\n'):
        # Skip blank lines, indented header lines and the dashed separator.
        if len(line) < 2 or line[1] in (" ", "-"):
            continue
        name = line[1:61].rstrip()
        field = line[62:95].replace(" ", "")
        # Mantissa, e.g. '6.6446565' from '6.6446565(11)e-27'.
        significant = re.search(r'-?\d+\.?\d*', field).group()
        # The exponent is optional; fall back to an empty string.
        exp_match = re.search(r'e[-+]?\d+', field)
        exponent = exp_match.group() if exp_match else ''
        val = float(significant + exponent)
        # Uncertainty digits in parentheses, e.g. '11'; none for exact values.
        unc_match = re.search(r'\((\d+)\)', field)
        if unc_match:
            uncert_non_zeros = unc_match.group(1)
            # Zero out the mantissa, then overwrite its trailing digits.
            uncert_zeros = re.sub(r'\d', '0', significant.lstrip('-'))
            uncert = float(uncert_zeros[:-len(uncert_non_zeros)] + uncert_non_zeros + exponent)
        else:
            uncert = 0.0
        units = line[96:].rstrip()
        constants[name] = (val, units, uncert)
    return constants

I have to look at it again another time; perhaps it's way too complicated.

rgommers · 2024-02-27T18:32:16Z

This sounds like a good idea. While we're at it - what do you think about saving the data into a single .npz file at build time and installing that instead of the .txt files? That will make the wheel and on-disk installed sizes smaller; the text files are almost 200 kb; the binary equivalent is going to be a lot smaller.

We're doing this already in scipy.special. It should be pretty straightforward, the only thing to think about is to write the file to the build dir rather than in-tree. This can be done with a simplified version of https://github.com/scipy/scipy/blob/main/scipy/special/utils/makenpz.py
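
As a rough sketch of that idea (not the actual makenpz.py logic), the parsed constants could be stored as parallel arrays in one compressed archive written to an output directory supplied by the build. The function names `write_constants_npz` and `read_constants_npz`, the array keys, and the `outdir` argument are assumptions made for illustration.

```python
import os
import tempfile

import numpy as np


def write_constants_npz(constants, outdir, filename="codata.npz"):
    """Write {name: (value, unit, uncertainty)} to a compressed .npz file."""
    names = sorted(constants)
    path = os.path.join(outdir, filename)
    np.savez_compressed(
        path,
        names=np.array(names),  # fixed-width unicode strings
        values=np.array([constants[n][0] for n in names]),
        units=np.array([constants[n][1] for n in names]),
        uncertainties=np.array([constants[n][2] for n in names]),
    )
    return path


def read_constants_npz(path):
    """Rebuild the dictionary from the arrays stored in the archive."""
    with np.load(path) as data:  # string arrays load without pickling
        return {
            str(n): (float(v), str(u), float(e))
            for n, v, u, e in zip(data["names"], data["values"],
                                  data["units"], data["uncertainties"])
        }


example = {
    "alpha particle mass": (6.64465675e-27, "kg", 2.9e-34),
    "alpha particle-electron mass ratio": (7294.2995361, "", 2.9e-06),
}

# A temporary directory stands in for the build directory here.
with tempfile.TemporaryDirectory() as build_dir:
    npz_path = write_constants_npz(example, build_dir)
    restored = read_constants_npz(npz_path)
```

Keeping names and units as plain string arrays avoids object arrays, so the archive can be read back with the default `allow_pickle=False`.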

jakobjakobson13 · 2025-02-26T20:32:00Z

I finally got the time to look into it again, but the problem seems more complicated than I initially thought:
The current proposal, as far as it stands, should work fine, but building the npz files is more involved.
makenpz.py

scipy/scipy/special/utils/makenpz.py

Line 82 in 1687c1c

data[key] = np.loadtxt(fn)

uses np.loadtxt to create the archives, but that function is intended for files that contain plain numeric arrays.
So if you want np.loadtxt to read a file with a header like


             2010 Fundamental Physical Constants --- Complete Listing


  From:  http://physics.nist.gov/constants



  Quantity                                                       Value                 Uncertainty           Unit
-----------------------------------------------------------------------------------------------------------------------------
{220} lattice spacing of silicon                            192.015 5714 e-12        0.000 0032 e-12          m
alpha particle-electron mass ratio                          7294.299 5361            0.000 0029               
alpha particle mass                                         6.644 656 75 e-27        0.000 000 29 e-27        kg

you have to consider several things: the first lines have to be skipped until you reach the data; you'll need converters for the quantity names and the units, as they are strings; and you'll need converters for the values and uncertainties, as the scientific notation with embedded spaces mixes characters and digits. Additionally, the formatting of the CODATA files has changed over the years.
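
The header-skipping step on its own does not need np.loadtxt. Here is a small sketch, assuming the data always start after the first line that consists only of dashes, as in the excerpt above; files from other years may need a different rule. `iter_data_lines` is a hypothetical helper.

```python
from typing import Iterable, Iterator


def iter_data_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield the non-empty lines that follow the dashed separator."""
    in_data = False
    for line in lines:
        stripped = line.strip()
        if not in_data:
            # The separator is a line made up solely of '-' characters.
            in_data = bool(stripped) and set(stripped) == {"-"}
            continue
        if stripped:
            yield line.rstrip("\n")


sample = """
             2010 Fundamental Physical Constants --- Complete Listing

  From:  http://physics.nist.gov/constants

  Quantity                                                       Value                 Uncertainty           Unit
-----------------------------------------------------------------------------------------------------------------------------
{220} lattice spacing of silicon                            192.015 5714 e-12        0.000 0032 e-12          m
alpha particle-electron mass ratio                          7294.299 5361            0.000 0029
alpha particle mass                                         6.644 656 75 e-27        0.000 000 29 e-27        kg
"""

data_lines = list(iter_data_lines(sample.splitlines()))
print(len(data_lines))  # prints 3
```

The title line also contains `---`, which is why the check requires the whole line to be dashes instead of looking for a dash anywhere in it.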

So, long story short: for the moment, it seems like a bad trade to put much more effort into it. Splitting out the "codata raw files" from the main file seems quite easy, but creating npz files needs more work than just using makenpz.py. The regular expressions I came up with could be useful for this, but I'll leave that to another volunteer, if there is one.

jakobjakobson13 added 2 commits February 8, 2024 22:39

Move codata constants into their own files

d30e1f3

Adapt import functions to new folder structure

6789a53

github-actions bot added the scipy.constants label Feb 8, 2024

Set relative path for tests

3706223

lucascolley added the maintenance Items related to regular maintenance tasks label Feb 9, 2024

jakobjakobson13 added 2 commits February 9, 2024 07:47

use absolute path for open statement

69219e0

Correct absolute path of codata_constants files

168299a

Add constant files to meson build file

c935790

jakobjakobson13 requested a review from rgommers as a code owner February 9, 2024 14:42

add missing quotation marks

58107fc

jakobjakobson13 added 5 commits February 9, 2024 19:16

Fix parsing functions

6cdd55c

skip old codata values for the moment

fc66177

Fix parsing function

de45c4b

skip test_2002_vs_2006

1dcd04c

Update codata.py

a2a6b37

Change data dir

aba2e46

jakobjakobson13 added 6 commits March 4, 2024 23:22

Fix path and parsing function

851a6c3

Delete 2002vs2006 test

ae9a2ae

Add import and decomment lines

fec15ed

Decomment lines

d976cea

Change import paths

a5f1e03

add import

6a26d4c

lucascolley changed the title ~~Reorganize codata constants data~~ MAINT: constants: reorganize codata constants data Mar 14, 2024

jakobjakobson13 added 2 commits April 4, 2024 23:55

Refactor import functions

5219cd8

Use underscore

966e5fe

jakobjakobson13 added 9 commits April 5, 2024 08:32

Use underscore

6aff38f

Use underscore

35ac041

Use underscore

ab5f22b

Use underscore

e949611

Use underscore for files

e883531

Reintroduce imports

130360b

Use underscores in filenames

1a0d57c

Remove imports of deleted variables

a11ef07

Update codata.py

e103044

Louwrensth mentioned this pull request Sep 20, 2024

MAINT: Update constants to CODATA 2022 recommendation #21596

Closed

mdhaber mentioned this pull request Oct 4, 2024

MAINT: constants: revise way 'exact' values are recomputed #11345

Merged
