Fix Poly and nsimplify to not lose precision #12088
Conversation
float is the name of a builtin, so it's better to use a different name.
The default behavior is to convert floats to rationals using the string representation (roughly). Setting this flag to false will cause nsimplify to use the exact binary representation (i.e., it round-trips with evalf(), given the correct precision). I personally believe the default for this flag should be False, but I've left that alone for now. See sympy#12045.
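For illustration, the two conversions the flag distinguishes can be seen with the standard library alone (a sketch of the distinction, not SymPy's actual code path):

```python
from fractions import Fraction

# Base-10 (string) conversion -- the current default behavior:
print(Fraction("0.1"))   # 1/10

# Exact binary value of the float -- what the new flag exposes:
print(Fraction(0.1))     # 3602879701896397/36028797018963968
```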
This was causing Poly() and nsimplify() to lose precision unless mpmath.mp.dps was set. I saw this at sympy#11862 but was unsure how to hit the code (hence the XXX comment). Fixes sympy#12015.
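A rough sketch of the reported scenario (the 30-digit value is made up, and the exact output depends on the SymPy version):

```python
import mpmath
from sympy import Float, nsimplify

print(mpmath.mp.dps)   # 15 by default

# Illustrative Float carrying 30 significant digits
f = Float('0.123456789012345678901234567891', 30)

# Before this fix, the rational conversion effectively worked at
# mpmath.mp.dps precision, silently dropping the extra digits unless
# one manually raised mpmath.mp.dps first.
print(nsimplify(f, rational=True))
```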
I think that the conversion from floating point to rational numbers could be improved. (That might also affect the choice of default flags.) When a rational number is represented by a floating point number, the result is necessarily approximate (except if the denominator is a power of 2, or, in case of decimal numbers, a product of powers of 2 and 5). The mechanical inverse conversion of a floating point number to a rational number will then produce a result whose denominator is a power of 2 (or a product of powers of 2 and 5), and both numerator and denominator are often disproportionately large. The theory of continued fractions shows how to find best rational approximations to any number (even a transcendental number such as pi). In case of a floating point number representing a rational number, that number will appear among the convergents of the continued fraction expansion as the last one with reasonably sized coefficients.
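A small plain-Python sketch of that point: the float closest to 1/3 is an unwieldy dyadic rational, but 1/3 itself shows up as an early continued-fraction convergent (the helper below is just for illustration):

```python
from fractions import Fraction

def convergents(x, n=5):
    """Yield the first n continued-fraction convergents of the float x."""
    h_prev, h = 1, int(x)    # numerators  h_{-1}, h_0
    k_prev, k = 0, 1         # denominators k_{-1}, k_0
    yield Fraction(h, k)
    frac = x - int(x)
    for _ in range(n - 1):
        if frac == 0:
            break
        x = 1 / frac
        a = int(x)
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
        yield Fraction(h, k)
        frac = x - a

print(Fraction(1/3))            # 6004799503160661/18014398509481984
print(list(convergents(1/3)))   # 1/3 appears as an early convergent;
                                # later ones have huge denominators
```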
It seems to me that …
That's an interesting idea, perhaps outside the scope of this PR. I wonder if that isn't just equivalent to the current algorithm, which limits the denominator with the tolerance. This is the current behavior by the way:
It starts giving 1/3 with 8 3s after the decimal, whereas the closest float approximation to 1/3 is 0.3333333333333333 (16 3s). I believe (@smichr will need to confirm) that the idea behind the current algorithm is that it lets you write fractions in a lazy way without concern for Python's evaluation of division, like …
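For instance (as I understand the current behavior, a quick check):

```python
from sympy import nsimplify, Rational

# Python evaluates 1/3 to a float, but nsimplify recovers the intended fraction:
print(nsimplify(1/3))    # 1/3
# whereas converting the float exactly keeps the binary artifact:
print(Rational(1/3))     # 6004799503160661/18014398509481984
```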
No idea what's causing the test failure with matplotlib.
Oh I guess the test failure is #11691. So unrelated to this PR.
Maybe we should return to matplotlib 1.5.3 for the time being. About continued fractions: the convergents have the very special property that the error of n/d is less than 1/d^2, while for most approximating fractions the error is only on the order of 1/d.
So, looking forward, if we implement the continued fractions idea, does the flag name I've added here for nsimplify make sense? Or should it be named something else (maybe …)?
Maybe it is not necessary to change the keyword name or the way it is working after all. Continued fractions could also be used in limit_denominator after a 'full precision' rational has been constructed. Or, preferably, there could be a Float.to_rational method, with an argument 'tolerance', that would choose the first continued fraction convergent that comes within the range.
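For reference, the standard library's limit_denominator already does something in this spirit (this is only an analogy, not the proposed Float.to_rational):

```python
from fractions import Fraction

x = 1/3   # a float, exactly 6004799503160661/18014398509481984
# Bounding the denominator picks the continued-fraction-style best match:
print(Fraction(x).limit_denominator(1000))     # 1/3
print(Fraction(x).limit_denominator(10**9))    # 1/3
```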
I think this could be merged as it is (once the plotting issue has been fixed). I also think that base10 should be False by default, but that can be changed in another PR. Also the conversion of floats in general is a different matter.
I'm still concerned that …
As noted at sympy#12088, we may want to apply other algorithms here in the future, such as one using continued fractions.
I went ahead and changed the keyword argument to something better. Please review.
LGTM |
Looks like Poly still loses precision in multivariate cases (even setting mpmath.mp.dps doesn't fix it).
For some reason the dtype is set to a dynamically created subclass of PolyElement (see line 206 at d29d13a), whose __eq__ is the default (different instances are unequal). So because of this, two separately created PolyRings are always unequal. Why does the PolynomialRing __eq__ check dtype if dtypes are created dynamically like this? Something needs to change in the structure here, but I'm unsure exactly what. @mattpap?
The added test does not pass because the dtype of a PolyElement is a dynamically created class, which doesn't have any metaclass (hence, it uses the default __eq__). But that is set to PolynomialRing.dtype, and is used in the PolynomialRing.__eq__ test. I am not sure how to properly fix this. It seems odd that the dtype should be created dynamically at each class creation, and also odd that PolynomialRing.__eq__ is doing an equality check on a class.
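A minimal illustration of the underlying problem, using stand-in names rather than SymPy's actual classes:

```python
# Dynamically created classes compare by identity by default, so two
# "equivalent" dtypes end up unequal even if they describe the same ring.
def make_dtype(ring):
    return type("PolyElement", (object,), {"ring": ring})

d1 = make_dtype("ZZ[x]")
d2 = make_dtype("ZZ[x]")
print(d1 == d2)             # False: default __eq__ is identity-based
print(d1.ring == d2.ring)   # True: comparing the attribute works
```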
The construct_domain function was not using a RR with the correct precision. The tests do not pass here yet (see discussion on the previous commit), but the behavior itself does work.
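A hedged sketch of what this targets (the 30-digit coefficient is illustrative):

```python
from sympy import Float
from sympy.polys.constructor import construct_domain

# Illustrative 30-digit coefficient
f = Float('1.23456789012345678901234567891', 30)

dom, coeffs = construct_domain([f])
# After this commit, dom should be a RealField carrying the Float's
# precision rather than the default 53-bit one.
print(dom, coeffs)
```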
OK, I've pushed up a fix, which currently fails the tests because of the broken …
This requires sympy/sympy#12088 to work. Otherwise, the horner option will lose precision.
Currently, monomial order is also taken into account. Two rings having the same symbols and domain are considered different if their orders are different. This does not seem practical for most purposes as the order only matters when computing Gröbner bases.
Turns out …
- Make __eq__ always match __hash__.
- Don't compare dtype. Since dtype is a dynamically created class on PolyRing/FracField, which doesn't have any kind of special metaclass that allows comparisons, we can't do isinstance or equality checks on it. Rather, compare dtype.ring/dtype.field, since that is the only differing attribute. I did not do a thorough check of the use of this. I just fixed enough to make the failing tests pass. A better way of handling this is needed, preferably one that doesn't make use of dynamically created classes.
- Fix typo in the tests (instantiate a RealField with RealField(), not RR()).
OK, I've pushed up a fix for this. @jksuom I'd appreciate it if you could look over the … The main thing here is: …

A review of the rest would be nice too (esp. the new flag I've added to nsimplify), although you already looked at most of the changes from before.
The current implementation of PolyRing (and FracField) is based on the idea that only one copy is ever constructed for each combination of symbols, domain and order, and that copy is kept in a cache. Then it should be true that …

It is not so easy to say what should be done with PolynomialRing and FractionField. They can be thought of as some kind of 'wrappers' for PolyRing and FracField, hiding their low level methods and introducing new high level methods. Basically, it should be possible to delegate comparisons of PolynomialRings to their underlying PolyRings, so …
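Under that cache-based design, a quick sanity check would look like this (hedged: assuming the cache behaves as described above):

```python
from sympy import ZZ
from sympy.polys.rings import ring

R1, x1 = ring("x", ZZ)
R2, x2 = ring("x", ZZ)
# With one cached copy per (symbols, domain, order), identity should hold:
print(R1 is R2)   # expected True under the caching scheme described
```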
sympy/simplify/simplify.py
-def _real_to_rational(expr, tolerance=None):
+def _real_to_rational(expr, tolerance=None, rational_alg='base10'):
Maybe yet another name could be found for the keyword. To me, it suggests both algebra and algorithm. Perhaps conversion could be used?
I only now realized that it is RealField that behaves differently from other domains as new instances are created on each call. A simple fix could be using a cache as is done with PolyRing and FracField. There could be several ways to define …
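A minimal sketch of that cache idea, not SymPy's code (the helper name and structure are hypothetical):

```python
from sympy.polys.domains import RealField

_real_field_cache = {}

def cached_real_field(prec=53):
    """Hypothetical helper: reuse one RealField instance per precision."""
    if prec not in _real_field_cache:
        _real_field_cache[prec] = RealField(prec=prec)
    return _real_field_cache[prec]

# Repeated calls with the same precision return the same object, so
# identity-based (and hence equality) checks become cheap and reliable.
assert cached_real_field(53) is cached_real_field(53)
```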
The caching is fine if it helps performance, but I think it's brittle to rely on it for equality comparison. We could add an is check at the front of …
It isn't being mutated in the code, but this is better practice.
This is not only more readable, but potentially a little faster as tuples use 'is' comparison on their elements before trying ==.
From the test failures it looks like there are other isinstance dtype checks that need to be fixed.
Is it necessary to consider dtype at all in these checks? It seems to me that .dtype.ring should be the same as .ring for a polynomial ring. Moreover, self.ring == other.ring should always imply self.symbols == other.symbols.
You are probably right. But there are more complications, as can be seen from the test failures. For instance, see lines 456 to 458 at 62b1996.

I'm thinking we might need to either remove the dynamic class creation or use a metaclass to make isinstance work.
Just realized that even if we use a metaclass, pickling is going to be a nightmare and probably won't work.
I did investigate the test failures and thought that I had already commented on it, but now I cannot find those comments. Anyway, the reason is that RealField creates new versions each time it is called, even with default parameters. Then PolyRing will also create new rings with the same symbols. At the point of failure there are three polynomial rings with domain RR and the same symbols. The product fails since f[j] and g[i - j] belong to different rings. The error should go away if only one RR is created.
I also noticed that the equality comparisons are based on the hash value, not the things that generated the hash, meaning that if there is a hash collision, it will completely break. The hash is based in part on a string (…).

I'm playing around with using a metaclass to make the dynamically created classes compare equal to each other correctly (including making …). The other option would be to make a metaclass that singletonizes the PolyElement classes based on ring, so that the caching is enforced at the metaclass level (and then remove the manual caching from …, and ditto everywhere for …).
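A small stand-alone sketch of the "singletonize per ring" idea (with made-up names, not SymPy's classes): creating the element class for a given ring returns a cached class, so the classes compare equal because they are the same object.

```python
_element_class_cache = {}

def element_class_for(ring_key):
    """Hypothetical factory: one dynamically created element class per ring."""
    if ring_key not in _element_class_cache:
        _element_class_cache[ring_key] = type(
            "PolyElement_" + ring_key, (dict,), {"ring_key": ring_key}
        )
    return _element_class_cache[ring_key]

# The same ring key always yields the very same class object:
assert element_class_for("ZZ[x,y]") is element_class_for("ZZ[x,y]")
assert element_class_for("ZZ[x,y]") is not element_class_for("QQ[x]")
```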
I wonder if it would work to not store the ring on the dtype (class), but rather require it to be passed in whenever an instance is created (store it on the instance). That is, instead of …, something like the snippet below. Effects: …
def dtype(self, expr):
return PolyElement(self.ring, expr)
I'll have to try this out to see if it can work.
I think the current implementation is an optimization. I don't know how much difference there is in efficiency, but the idea of having different classes for elements of different rings pleases me as mathematically natural. There should be one ring for each combination of domain, symbols and order, and their elements should be easily distinguishable. From the technical point of view it seems that making …
@@ -1156,7 +1157,8 @@ def nthroot(expr, n, max_len=4, prec=15):
     return expr


-def nsimplify(expr, constants=[], tolerance=None, full=False, rational=None):
+def nsimplify(expr, constants=(), tolerance=None, full=False, rational=None,
+              rational_conversion='base10'):
What about using base=10 or base=2?
Those are the only bases for "base-rounding" that would be implemented. But from discussion above other methods may be implemented, like 'continued_fraction'.
This causes issues because RealField instantiates this class every time it is created. But the classes that are created do not have extra attributes from the base classes, so there is no need for a subclass.
Using the hash gives a low probability of collision (it includes a string, which is affected by hash randomization).
I pushed up some fixes, which should fix the test failure. I gave up on the metaclass fixes, so there may still be issues lurking in this. Are there any more tests for this PR that should be added, or changes otherwise?
Seems reasonable to me @asmeurer
Fixes #12015 (there isn't an issue for nsimplify). Previously, both Poly and nsimplify would lose precision unless mpmath.mp.dps was manually set.

This also addresses #12045. I have added a flag base10 to nsimplify, which when set to False causes it to use the exact binary representation of floats (when rational=True). I personally believe this should be the default value, but I haven't changed it (it would be a breaking change). I can see the point (using the base-10 representation is more user friendly), but it's also inexact and a surprise for anyone who expects nsimplify(rational=True).evalf() to round-trip. What do others think here?

CC @scopatz