
Conversation

asmeurer
Member

Fixes #12015 (there isn't an issue for nsimplify). Previously, both Poly and nsimplify would lose precision unless mpmath.mp.dps was manually set.

This also addresses #12045. I have added a flag base10 to nsimplify, which, when set to False, causes it to use the exact binary representation of floats (when rational=True). I personally believe this should be the default, but I haven't changed it (it would be a breaking change). I can see the point (using the base-10 representation is more user-friendly), but it's also inexact and a surprise for anyone who expects nsimplify(rational=True).evalf() to round-trip. What do others think here?

CC @scopatz
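
For concreteness, here is roughly what the flag does on 0.1 (a sketch of the intended behavior; the keyword is renamed later in this PR, and the second output is just the exact binary value of the float, i.e. Fraction(0.1)):

In [1]: nsimplify(0.1, rational=True)
Out[1]: 1/10

In [2]: nsimplify(0.1, rational=True, base10=False)
Out[2]: 3602879701896397/36028797018963968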

float is the name of a builtin, so it's better to use a different name.
The default behavior is to convert floats to rationals using the string
representation (roughly). Setting this flag to False will cause nsimplify to
use the exact binary representation (i.e., it round-trips with evalf(), given
the correct precision).

I personally believe the default for this flag should be False, but I've left
that alone for now. See sympy#12045.
This was causing Poly() and nsimplify() to lose precision unless mpmath.mp.dps
was set.

I saw this at sympy#11862 but was unsure how to hit the code (hence the XXX
comment).

Fixes sympy#12015.
@asmeurer asmeurer requested review from smichr and jksuom January 24, 2017 05:04
@jksuom
Member

jksuom commented Jan 24, 2017

I think that the conversion from floating point to rational numbers could be improved. (That might also affect the choice of default flags.)

When a rational number is represented by a floating point number, the result is necessarily approximate (except if the denominator is a power of 2, or, in case of decimal numbers, a product of powers of 2 and 5). The mechanical inverse conversion of a floating point number to a rational number will then produce a result whose denominator is a power of 2 (or a product of powers of 2 and 5), and both numerator and denominator are often disproportionately large.

The theory of continued fractions shows how to find best rational approximations to any number (even a transcendental number such as pi). In case of a floating point number representing a rational number, that number will appear among the convergents of the continued fraction expansion as the last one with reasonably sized coefficients.

In [2]: list(continued_fraction_iterator(Rational(0.333333333333333)))
Out[2]: [0, 3, 316042079113718, 1, 2, 6]

In [3]: it = continued_fraction_convergents(continued_fraction_iterator(Rational(0.333333333333333)))

In [4]: print(list(it))
[0, 1/3, 316042079113718/948126237341155, 316042079113719/948126237341158, 948126237341156/2844378712023471, 6004799503160655/18014398509481984]

It seems to me that Rational(0.333333333333333) should return 1/3, and this could be implemented by means of continued fractions.
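
A minimal sketch of what such an implementation could look like (choose_convergent is a hypothetical helper, not part of this PR): walk the convergents and stop at the first one within a tolerance.

from sympy import Rational
from sympy.ntheory.continued_fraction import (
    continued_fraction_convergents, continued_fraction_iterator)

def choose_convergent(x, tol=1e-10):
    # Exact binary value of the float; the continued fraction of a Rational
    # is finite, so the loop always terminates.
    exact = Rational(x)
    for c in continued_fraction_convergents(continued_fraction_iterator(exact)):
        if abs(c - exact) <= tol:
            return c
    return exact

choose_convergent(0.333333333333333)  # -> 1/3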

@asmeurer
Member Author

That's an interesting idea, perhaps outside the scope of this PR. I wonder if that isn't just equivalent to the current algorithm, which limits the denominator with the tolerance.

This is the current behavior by the way:

In [2]: nsimplify(0.3, rational=True)
Out[2]: 3/10

In [3]: nsimplify(0.33, rational=True)
Out[3]:
 33
───
100

In [4]: nsimplify(0.333, rational=True)
Out[4]:
333
────
1000

In [5]: nsimplify(0.3333, rational=True)
Out[5]:
 3333
─────
10000

In [6]: nsimplify(0.33333, rational=True)
Out[6]:
33333
──────
100000

In [7]: nsimplify(0.333333, rational=True)
Out[7]:
 333333
───────
1000000

In [8]: nsimplify(0.3333333, rational=True)
Out[8]:
3333333
────────
10000000

In [9]: nsimplify(0.33333333, rational=True)
Out[9]: 1/3

It starts giving 1/3 with 8 3s after the decimal, whereas the closest float approximation to 1/3 is 0.3333333333333333 (16 3s).

I believe (@smichr will need to confirm) that the idea behind the current algorithm is that it lets you write fractions in a lazy way without concern for Python's evaluation of division, like 1/3*x, and then use nsimplify to convert the resulting floats back to the fractions that you (probably) wrote.
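
That is, something like this (a sketch of the intended workflow under the current defaults, assuming an isympy session where x is defined):

In [1]: nsimplify(1/3*x + 0.1, rational=True)
Out[1]: x/3 + 1/10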

@asmeurer
Member Author

No idea what's causing the test failure with matplotlib.

@asmeurer
Member Author

Oh I guess the test failure is #11691. So unrelated to this PR.

@jksuom
Member

jksuom commented Jan 24, 2017

Maybe we should return to matplotlib 1.5.3 for the time being.

About continued fractions: The convergents have the very special property that the error of n/d is less than 1/d^2, while for most approximating fractions the error is of order 1/d.

@asmeurer
Member Author

So, looking forward, if we implement the continued fractions idea, does the flag name I've added here for nsimplify make sense? Or should it be named something else (maybe rational_approximation_method=<'continued_fraction', 'exact', ...>)?

@jksuom
Member

jksuom commented Jan 25, 2017

Maybe it is not necessary to change the keyword name or the way it works after all. Continued fractions could also be used in limit_denominator after a 'full precision' rational has been constructed. Or, preferably, there could be a Float.to_rational method, with a 'tolerance' argument, that would choose the first continued fraction convergent that comes within the tolerance.
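
For reference, the standard library's Fraction.limit_denominator already works this way, picking the best rational approximation via continued fraction convergents; a Float.to_rational could plausibly follow the same pattern:

from fractions import Fraction

f = Fraction(0.333333333333333)   # exact binary value of the float
print(f)                          # 6004799503160655/18014398509481984
print(f.limit_denominator(1000))  # 1/3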

@jksuom
Member

jksuom commented Jan 25, 2017

I think this could be merged as it is (once the plotting issue has been fixed). I also think that base10 should be False by default, but that can be changed in another PR. Also the conversion of floats in general is a different matter.

@asmeurer
Member Author

I'm still concerned that base10 is not the best kwarg name.

As noted at sympy#12088, we may want to apply
other algorithms here in the future, such as one using continued fractions.
@asmeurer
Member Author

I went ahead and changed the keyword argument to something better. Please review.

@scopatz
Contributor

scopatz commented Feb 13, 2017

LGTM

@asmeurer
Member Author

Looks like Poly still loses precision in multivariate cases (even setting mpmath.mp.dps doesn't fix it).

In [40]: Poly(pi.evalf(100)*x*y, x)
Out[40]: Poly(3.14159265358979*y*x, x, domain='RR[y]')

In [41]: Poly(pi.evalf(100)*x, x)
Out[41]: Poly(3.141592653589793238462643383279502884197169399375105820974944592307816406286208998628034825342117068*x, x, domain='RR')

@asmeurer
Member Author

Problem is here.

The fix is easy (just copy what the simple constructor does), but there's another problem in the tests, because PolyRings compare using is. Is the correct == comparison for PolyRing to compare the symbols and domain?

@asmeurer
Member Author

For some reason the dtype is set to a dynamically created subclass of PolyElement:

obj.dtype = type("PolyElement", (PolyElement,), {"ring": obj})

But PolyElement doesn't have a special metaclass, so its __eq__ is the default (different instances are unequal). Because of this, two separately created PolyRings are always unequal. Why does the PolynomialRing __eq__ check dtype if dtypes are created dynamically like this? Something needs to change in the structure here, but I'm unsure exactly what. @mattpap?
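
A minimal standalone reproduction of the problem (toy names): classes created with type() have no custom equality, so two structurally identical dtypes never compare equal, and isinstance across them fails.

class PolyElement(dict):
    pass

# Two dtypes built the same way, as PolyRing does for two "equal" rings:
dtype1 = type("PolyElement", (PolyElement,), {"ring": "same ring data"})
dtype2 = type("PolyElement", (PolyElement,), {"ring": "same ring data"})

print(dtype1 == dtype2)              # False: default class equality is identity
print(isinstance(dtype1(), dtype2))  # False: sibling classes fail isinstance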

The added test does not pass because the dtype of a PolyElement is a dynamically
created class, which doesn't have any metaclass (hence, it uses the default
__eq__). But that is set to PolynomialRing.dtype, and is used in the
PolynomialRing.__eq__ test. I am not sure how to properly fix this. It seems
odd that the dtype should be created dynamically at each ring creation, and
also odd that PolynomialRing.__eq__ is doing an equality check on a class.
The construct_domain function was not using a RR with the correct precision.

The tests do not pass here yet (see discussion on the previous commit), but
the behavior itself does work.
@asmeurer
Member Author

OK, I've pushed up a fix, which currently fails the tests because of the broken __eq__ checks, but the multivariate Poly behavior is correct, so I can at least work against this branch until I figure out how to make the tests work.

asmeurer added a commit to ergs/transmutagen that referenced this pull request Feb 14, 2017
This requires sympy/sympy#12088 to work. Otherwise,
the horner option will lose precision.
@jksuom
Member

jksuom commented Feb 14, 2017

Is the correct == comparison for PolyRing to compare the symbols and domain?

Currently, monomial order is also taken into account. Two rings having the same symbols and domain are considered different if their orders are different. This does not seem practical for most purposes as the order only matters when computing Gröbner bases.

@asmeurer
Member Author

Turns out __hash__ was already defined, so I ended up just basing it on that (using symbols, domain, ngens, and order).
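
In outline, the pattern is this (a simplified, hypothetical sketch, not the actual PolyRing code): derive __eq__ and __hash__ from the same key tuple, and never compare the dynamically created dtype classes directly.

class Ring:
    def __init__(self, symbols, domain, ngens, order):
        self.symbols = symbols
        self.domain = domain
        self.ngens = ngens
        self.order = order

    def _key(self):
        # The single source of truth for both hashing and equality.
        return (self.symbols, self.domain, self.ngens, self.order)

    def __hash__(self):
        return hash(self._key())

    def __eq__(self, other):
        return isinstance(other, Ring) and self._key() == other._key()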

- Make __eq__ always match __hash__
- Don't compare dtype. Since dtype is a dynamically created class on
  PolyRing/FracField, which doesn't have any kind of special metaclass that
  allows comparisons, we can't do isinstance or equality checks on it. Rather,
  compare dtype.ring/dtype.field, since that is the only differing attribute.

  I did not do a thorough check of the use of this. I just fixed enough to
  make the failing tests pass.

  A better way of handling this is needed, preferably one that doesn't make
  use of dynamically created classes.

- Fix typo in the tests (instantiate a RealField with RealField(), not RR()).
@asmeurer
Member Author

OK, I've pushed up a fix for this. @jksuom I'd appreciate it if you could look over the __eq__ methods I've defined in the polys and see if they make sense, based on your knowledge of the classes. See also my commit messages.

The main thing here is:

  • __eq__ now is the same as __hash__ (I implemented __eq__ based on what was being compared in __hash__).
  • Never compare the dtype attribute, since it is a dynamically created class. Rather, compare dtype.ring/dtype.field, which is the only thing that differs. Really, we should get rid of dynamically created classes (dtype should be an instance, not a class), but I wasn't up to the task of doing that.

A review of the rest would be nice too (esp. the new flag I've added to nsimplify), although you already looked at most of the changes from before.

@jksuom
Member

jksuom commented Feb 18, 2017

The current implementation of PolyRing (and FracField) is based on the idea that only one copy is ever constructed for each combination of symbols, domain and order, and that copy is kept in a cache. Then it should be true that == and is indeed have the same meaning. However, caching is based on hash values, and there is a theoretical possibility of collision. Hence it may be advisable to redefine __eq__ (and caching).

It is not so easy to say what should be done with PolynomialRing and FractionField. They can be thought of as some kind of 'wrappers' for PolyRing and FracField hiding their low level methods and introducing new high level methods. Basically, it should be possible to delegate comparisons of PolynomialRings to their underlying PolyRings, so == could be just self.ring == other.ring. I don't see why even dtype should be considered.



-def _real_to_rational(expr, tolerance=None):
+def _real_to_rational(expr, tolerance=None, rational_alg='base10'):
Member


Maybe yet another name could be found for the keyword. To me, it suggests both algebra and algorithm. Perhaps conversion could be used?

@jksuom
Member

jksuom commented Feb 20, 2017

I only now realized that it is RealField that behaves differently from other domains, as new instances are created on each call. A simple fix could be using a cache as is done with PolyRing and FracField. There could be several ways to define __eq__ and __hash__. I think that Python defaults should suffice, in principle, but they cannot be used as RealField derives from Domain, which has both methods defined. In the following example I have redefined them with simple definitions.

+_real_cache = {}
+
 @public
 class RealField(Field, CharacteristicZero, SimpleDomain):
     """Real numbers up to the given precision. """
@@ -26,6 +28,7 @@ class RealField(Field, CharacteristicZero, SimpleDomain):
     has_assoc_Field = True
 
     _default_precision = 53
+    _default_tolerance = 2.22044604925031e-14
 
     @property
     def has_default_precision(self):
@@ -43,7 +46,12 @@ def dps(self):
     def tolerance(self):
         return self._context.tolerance
 
-    def __init__(self, prec=_default_precision, dps=None, tol=None):
+    def __new__(cls, prec=_default_precision, dps=None, tol=_default_tolerance):
+        key = (prec, tol)
+        if key in _real_cache:
+            return _real_cache[key]
+
+        self = object.__new__(cls)
         context = MPContext(prec, dps, tol)
         context._parent = self
         self._context = context
@@ -52,13 +60,19 @@ def __init__(self, prec=_default_precision, dps=None, tol=None):
         self.zero = self.dtype(0)
         self.one = self.dtype(1)
 
+        self._hash = hash(key)  # or something else
+        _real_cache[key] = self
+
+        return self
+
+    def __init__(self, *args, **kwargs):
+        pass

     def __eq__(self, other):
-        return (isinstance(other, RealField)
-           and self.precision == other.precision
-           and self.tolerance == other.tolerance)
+        return self is other
 
     def __hash__(self):
-        return hash((self.__class__.__name__, self.dtype, self.precision, self.tolerance))
+        return self._hash

@asmeurer
Member Author

The caching is fine if it helps performance, but I think it's brittle to rely on it for equality comparison. We could add an is check at the front of __eq__ if it helps performance (although I'm doubtful it would make a difference). We could also change all the and chains to compare tuples, the advantage being readability, and also the obscure fact that tuple comparison checks is before __eq__ on its elements automatically.
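
A quick demonstration of that last point (tuple comparison compares elements with an identity shortcut, so identical elements never reach __eq__):

class NeverEqual:
    def __eq__(self, other):
        raise AssertionError("__eq__ was called")

n = NeverEqual()
print((n, 1) == (n, 1))  # True: identical elements are skipped, __eq__ never runs

try:
    n == n  # a direct == does call __eq__
except AssertionError:
    print("direct == called __eq__")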

It isn't being mutated in the code, but this is better practice.
This is not only more readable, but potentially a little faster as tuples use
'is' comparison on their elements before trying ==.
@asmeurer
Member Author

From the test failures it looks like there are other isinstance dtype checks that need to be fixed.

@jksuom
Member

jksuom commented Feb 20, 2017

Is it necessary to consider dtype at all in these checks? It seems to me that .dtype.ring should be the same as .ring for a polynomial ring. Moreover, self.ring == other.ring should always imply self.symbols == other.symbols.

@asmeurer
Member Author

You are probably right.

But there are more complications, as can be seen from the test failures. For instance, at sympy/sympy/polys/fields.py, lines 456 to 458 (at 62b1996):

elif isinstance(g, field.dtype):
    return f.new(f.numer*g.numer, f.denom*g.denom)
elif isinstance(g, field.ring.dtype):

I'm thinking we might need to either remove the dynamic class creation or use a metaclass to make isinstance work.

@asmeurer
Member Author

Just realized that even if we use a metaclass, pickling is going to be a nightmare and probably won't work. PolyElement currently has __getnewargs__, but it doesn't seem to work, and I don't see how it could work, since it tries to pass in the ring as an argument.

@jksuom
Member

jksuom commented Feb 24, 2017

I did investigate the test failures and thought that I had already commented on them, but now I cannot find those comments. Anyway, the reason is that RealField creates new instances each time it is called, even with default parameters. Then PolyRing will also create new rings with the same symbols. At the point of failure there are three polynomial rings with domain RR and the same symbols. The product fails since f[j] and g[i - j] belong to different rings. The error should go away if only one RR is created.

@asmeurer
Member Author

I also noticed that the caching is keyed on the hash value, not on the things that generated the hash, meaning that if there is a hash collision, it will completely break. The hash is based in part on a string (cls.__name__), meaning that hash randomization affects it, so this could randomly happen at any time (with pretty low probability, but it would be a pretty annoying error).

I'm playing around with using a metaclass to make the dynamically created classes compare equal to each other correctly (including making isinstance work as you'd want). I'm currently running into some issues because everything assumes that the PolyElement class has the ring attribute, but it is only actually defined on the dynamically created subclasses from PolyRing. I'm still figuring out if this can work.

The other option would be to make a metaclass that singletonizes the PolyElement classes based on ring, so that the caching is enforced at the metaclass level (and then remove the manual caching from PolyRing).

(and ditto everywhere for FracElement)

@asmeurer
Member Author

asmeurer commented Feb 24, 2017

I wonder if it would work to not store the ring on the dtype (class), but rather require it to be passed in whenever an instance is created (store it on the instance). That is, instead of PolyElement({(1,): 2}) it would be PolyElement(ring, {(1,): 2}). So basically, how PolyElement.__getnewargs__ claims that the class works.

Effects:

  • whenever a PolyElement is created, the ring needs to be known. This is what I'm unsure about.
  • PolyElement no longer subclasses from dict. Some methods may need to be defined to make it look like a dict again, but FracElement already doesn't subclass from dict, so it shouldn't be strictly necessary.
  • no more dynamically created subclasses. The dtype will always be just PolyElement or FracElement, and to create an element you have to pass in the ring. Or maybe PolyRing.dtype should be
def dtype(self, expr):
    return PolyElement(self.ring, expr)
  • isinstance checks against self.dtype will still need to go away (and it's technically an API break)

I'll have to try this out to see if it can work.

@jksuom
Member

jksuom commented Feb 25, 2017

I think the current implementation is an optimization.
It is clear that polynomials have to know the parent ring. In other words, ring has to be an attribute, either an instance attribute or a class attribute. If it is an instance attribute, then all arithmetic operations (which are quite frequent) must begin with something like if isinstance(other, PolyElement) and other.ring == self.ring, but if the PolyElement classes are disjoint, it is enough to just compare dtypes.

I don't know how much difference there is in efficiency, but the idea of having different classes for elements of different rings pleases me as mathematically natural. There should be one ring for each combination of domain, symbols and order, and their elements should be easily distinguishable.

From the technical point of view it seems that making ring an instance attribute would imply a lot of changes making the code more complicated. I would rather avoid that if at all possible. I think that the first thing to do should be to arrange that RealField will not create multiple objects. The remaining problems would then become much easier to handle.
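
A toy, runnable version of that trade-off (hypothetical names, not SymPy's actual classes): with a per-ring element class, dispatch in arithmetic is a single isinstance check against the ring's dtype.

class Element:
    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # One cheap class check; elements of other rings have a different dtype.
        if isinstance(other, self.ring.dtype):
            return self.ring.dtype(self.value + other.value)
        return NotImplemented

class Ring:
    def __init__(self, name):
        self.name = name
        # Dynamically created per-ring element class, as in the current design.
        self.dtype = type("Element", (Element,), {"ring": self})

R = Ring("R")
a, b = R.dtype(1), R.dtype(2)
print((a + b).value)  # 3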

asmeurer added a commit to ergs/transmutagen that referenced this pull request Mar 7, 2017
@asmeurer asmeurer added this to the SymPy 1.1 milestone Jun 22, 2017
@@ -1156,7 +1157,8 @@ def nthroot(expr, n, max_len=4, prec=15):
return expr


-def nsimplify(expr, constants=[], tolerance=None, full=False, rational=None):
+def nsimplify(expr, constants=(), tolerance=None, full=False, rational=None,
+              rational_conversion='base10'):
Member


What about using base=10 or base=2?

Member Author


Those are the only bases for "base-rounding" that would be implemented. But from the discussion above, other methods may be implemented, like 'continued_fraction'.

asmeurer added 3 commits June 27, 2017 16:40
This causes issues because RealField instantiates this class every time it is
created. But the classes that are created do not have extra attributes from
the base classes, so there is no need for a subclass.
Using the hash gives a low-probability chance of collision (it includes a
string, which is affected by hash randomization).
@asmeurer
Member Author

I pushed up some fixes, which should fix the test failure. I gave up on the metaclass fixes, so there may still be issues lurking in this.

Are there any more tests for this PR that should be added, or changes otherwise?

@scopatz
Contributor

scopatz commented Jun 27, 2017

Seems reasonable to me @asmeurer
