fix(polys): improve numerical evaluation in minpoly #21370

oscarbenjamin · 2021-04-22T23:42:22Z

References to other Issues or PRs

Fixes #19732

Brief description of what is fixed or changed

The minpoly algorithm for choosing a factor is made more robust by using better numerical evaluation.

Other comments

Release Notes

polys
- The accuracy of the minpoly algorithm was improved to handle complicated algebraic expressions more reliably.

sympy-bot · 2021-04-22T23:42:30Z

✅

Hi, I am the SymPy bot (v161). I'm here to help you write a release notes entry. Please read the guide on how to write release notes.

Your release notes are in good order.

Here is what the release notes will look like:

polys
- The accuracy of the minpoly algorithm was improved to handle complicated algebraic expressions more reliably. (#21370 by @oscarbenjamin)

This will be added to https://github.com/sympy/sympy/wiki/Release-Notes-for-1.9.

Click here to see the pull request description that was parsed.

<!-- Your title above should be a short description of what
was changed. Do not include the issue number in the title. -->

#### References to other Issues or PRs
<!-- If this pull request fixes an issue, write "Fixes #NNNN" in that exact
format, e.g. "Fixes #1234" (see
https://tinyurl.com/auto-closing for more information). Also, please
write a comment on that issue linking back to this pull request once it is
open. -->

Fixes #19732

#### Brief description of what is fixed or changed

The minpoly algorithm for choosing a factor is made more robust by using better numerical evaluation.

#### Other comments


#### Release Notes

<!-- Write the release notes for this release below between the BEGIN and END
statements. The basic format is a bulleted list with the name of the subpackage
and the release note for this PR. For example:

* solvers
  * Added a new solver for logarithmic equations.

* functions
  * Fixed a bug with log of integers.

or if no release note(s) should be included use:

NO ENTRY

See https://github.com/sympy/sympy/wiki/Writing-Release-Notes for more
information on how to write release notes. The bot will check your release
notes automatically to see if they are formatted correctly. -->

<!-- BEGIN RELEASE NOTES -->
* polys
    * The accuracy of the `minpoly` algorithm was improved to handle complicated algebraic expressions more reliably.
<!-- END RELEASE NOTES -->

Update

The release notes on the wiki have been updated.

oscarbenjamin · 2021-04-23T08:48:04Z

This has failed:

In [1]: minpoly(residue(1/(x**4 + 1), x, exp(I*pi/4)) - (-(Rational(1, 4) + I/4)/sqrt(2)))
---------------------------------------------------------------------------
NotImplementedError

jksuom · 2021-04-23T11:11:45Z

The test n10._prec > 1 and n10 > 0 is not reliable because the precision is not properly computed.

In [13]: from sympy.core.evalf import pure_complex

In [16]: f = 1 + 32*(sqrt(2)/8 - (-1)**(S(1)/4)/4)**2                        

In [17]: a = f.evalf(10)                                                        

In [18]: r, i = pure_complex(a)                                                 

In [19]: r, i                                                                   
Out[19]: (0.e-127, -4.731953965e-132)

In [20]: r._prec, i._prec                                                       
Out[20]: (1, 37)

In [21]: abs(a)._prec                                                           
Out[21]: 37

The wrong precision 37 of the imaginary part comes from mpmath. Computation of abs (involving __add__) could probably be fixed to return a more realistic final precision.

However, I think that there is an easier way: _choose_factor is always called with a correct factor in the factor list so it should suffice to pick the one the factors f for which abs(f.evalf(10, points)) has the minimum value.

oscarbenjamin · 2021-04-23T11:13:32Z

The problem is basically this:

In [2]: f = 1 + 32*(sqrt(2)/8 - (-1)**(S(1)/4)/4)**2

In [3]: f
Out[3]: 
                    2
       ⎛     4 ____⎞ 
       ⎜√2   ╲╱ -1 ⎟ 
1 + 32⋅⎜── - ──────⎟ 
       ⎝8      4   ⎠ 

In [4]: f.n()
Out[4]: 0.e-138 - 9.06039381185871e-144⋅ⅈ

In [5]: abs(f.n())._prec
Out[5]: 53

There should be no precision on the imaginary part because it is exactly zero.

oscarbenjamin · 2021-04-23T12:44:25Z

The wrong precision 37 of the imaginary part comes from mpmath

There isn't a _prec in mpmath. It is sympy's evalf code that computes the _prec of any result. Floats with different precision still have the same _mpf_ representation as far as mpmath is concerned.

jksuom · 2021-04-23T13:25:22Z

There isn't a _prec in mpmath.

That is true. The mpf objects have no precision. An operation on two objects treats then as having the same precision, usually given as a parameter.

In this example, sqrt(2)/8 - (-1)**(S(1)/4)/4 is pure imaginary but mpmath sees it as re + im*I where re is very small and has low precision.

(Pdb) p re, im
((0, mpz(27), -76, 5), (1, mpz(13356869453140768985445), -76, 74))

It then computes its square using the standard formula:

(Pdb) p p, prec
(2, 52)
(Pdb) p libmp.mpc_pow_int((re, im), p, prec)
((1, mpz(4503599627370495), -57, 52), (1, mpz(1343471837173405), -123, 51))

For the imaginary part 2*re*im, mpmath gives 51 bits but most of them are spurious as re has only five significant bits.

smichr · 2021-04-23T14:41:19Z

sqrt(2)/8 - (-1)**(S(1)/4)/4

This is perhaps the reason for using as_real_imag() on the expression before evaluating instead of vice versa:

>>> (sqrt(2)/8 - (-1)**(S(1)/4)/4).as_real_imag()
(0, -sqrt(2)/8)

smichr · 2021-04-23T14:45:26Z

The problem is basically this:

>>> f = 1 + 32*(sqrt(2)/8 - (-1)**(S(1)/4)/4)**2
>>> r,i=f.as_real_imag()
>>> abs(r.n(10) + i.n(10)*I)._prec
-1
>>> r,i
(0, 0)
>>> f
1 + 32*(sqrt(2)/8 - (-1)**(1/4)/4)**2

smichr · 2021-04-23T14:51:25Z

Do you think it would be worth having a safe_nsimplify step at the outset of minpoly?

jksuom · 2021-04-23T17:10:32Z

It looks like as_real_imag could be used:

---
 a/sympy/polys/numberfields.py
+++ b/sympy/polys/numberfields.py
@@ -65,8 +65,8 @@ def _choose_factor(factors, x, v, dom=QQ, prec=200, bound=5):
             n_temp = n_temp // bound
 
         def nonzero(f):
-            n10 = abs(f.evalf(10, points))
-            return n10._prec > 1 and n10 > 0
+            r, i = [abs(c.evalf(10, points)) for c in f.as_real_imag()]
+            return r._prec > 1 and r > 0 or i._prec > 1 and i > 0
 
         candidates = {f:fe for f, fe in factors.items() if not nonzero(fe)}
         if candidates:

smichr · 2021-04-23T18:25:02Z

It looks like as_real_imag could be used:

Would it be worth starting the whole minpoly process with expression taming?

# tame expression if possible, e.g. (sqrt(2)/8 - (-1)**(S(1)/4)/4) -> -sqrt(2)*I/8
r, i = expr.as_real_imag()
expr = r + i*I

oscarbenjamin · 2021-04-23T20:03:55Z

It looks like as_real_imag could be used:

That leads to a big slowdown in

sympy/polys/tests/test_numberfields.py::test_minpoly_fraction_field_slow

which goes from 5 seconds to 250 seconds on this machine.

Also the time taken to run the whole non-slow sympy/polys tests goes from 96 seconds for master to 103 seconds for the PR to 110 seconds with as_real_imag (that might not be a reliable difference though). Maybe the slowdown in the fraction field test is a separate performance problem.

Do you think it would be worth having a safe_nsimplify step at the outset of minpoly?

There are lots of heuristics that could be used here but this is potentially a slow function and the intention is that it should be able to work just by depending on robust numerical evaluation. The best fix would be to fix the actual bug in the numerical evaluation I think.

smichr · 2021-04-23T22:34:44Z

That leads to a big slowdown in

then maybe use as a fallback. But if the evalf trouble can be fixed, great. But it is my feeling that the calculation of a "hidden zero" is fraught with difficulties.

>>> abs((cos(1)**2+sin(1)**2-1).n(10))>0
True
>>> abs((cos(1)**2+sin(1)**2-1).n(10))._prec
1

My rule of thumb has been that if prec is 1 you don't know anything about the number via evaluation. If you get a prec of 1 for the real or imaginary part of the number it is perhaps a good indicator that you should look for a better form of the expression that you are trying to compute.

smichr · 2021-04-24T01:34:01Z

What about this (in light of @jksuom comment about there being only one true root):

diff --git a/sympy/polys/numberfields.py b/sympy/polys/numberfields.py
index 073e5f3cb0..2cc8468183 100644
--- a/sympy/polys/numberfields.py
+++ b/sympy/polys/numberfields.py
@@ -1,6 +1,7 @@
 """Computational algebraic field theory. """
 
 from functools import reduce
+from collections import defaultdict
 
 from sympy import (
     S, Rational, AlgebraicNumber, GoldenRatio, TribonacciConstant,
@@ -10,6 +11,7 @@
 
 from sympy.core.exprtools import Factors
 from sympy.core.function import _mexpand
+from sympy.core.symbol import Symbol
 from sympy.functions.elementary.exponential import exp
 from sympy.functions.elementary.trigonometric import cos, sin
 from sympy.ntheory import sieve
@@ -64,16 +66,14 @@ def _choose_factor(factors, x, v, dom=QQ, prec=200, bound=5):
             points[s] = n_temp % bound
             n_temp = n_temp // bound
 
-        def nonzero(f):
-            n10 = abs(f.evalf(10, points))
-            return n10._prec > 1 and n10 > 0
-
-        candidates = {f:fe for f, fe in factors.items() if not nonzero(fe)}
-        if candidates:
-            factors = candidates
-        if len(factors) == 1:
-            [f] = factors
-            return f
+        if factors:
+            mag = lambda x: abs(x.n(10, points))
+            candidates = defaultdict(list)
+            for f, fe in factors.items():
+                candidates[mag(fe)].append(f)
+            c = min(candidates)
+            if len(candidates[c]) == 1:
+                return candidates[c][0]
 
     raise NotImplementedError("multiple candidates for the minimal polynomial of %s" % v)

oscarbenjamin · 2021-04-24T11:27:45Z

What about this (in light of @jksuom comment about there being only one true root):

That also leads to slowdowns.

I've tried various different ways of computing this using evalf and Abs and _prec and so on. Theoretically any of these methods should work although it's understandable that some are faster than others. The ones that aren't slow seem to hit up against some or other bug in evalf/mpmath though.

On a basic level Float takes the higher precision in a product which seems wrong:

In [35]: re = Float(1e-100, precision=3)

In [36]: im = Float(0.1, precision=53)

In [37]: re
Out[37]: 0.e-100

In [38]: im
Out[38]: 0.100000000000000

In [39]: re*im
Out[39]: 1.00011396737199e-101

In [40]: _._prec
Out[40]: 53

I think also that the way that evalf handles this is not right because it assumes a single precision for both real and imaginary parts:

sympy/sympy/core/evalf.py

Lines 710 to 713 in 624549a

    
           # General complex number to arbitrary integer power 
        
           re, im = libmp.mpc_pow_int((re, im), p, prec) 
        
           # Assumes full accuracy in input 
        
           return finalize_complex(re, im, target_prec)

jksuom · 2021-04-24T12:23:20Z

Float takes the higher precision in a product which seems wrong:

I agree. The lower precision should be taken. That should be easy to fix.

The precision of a sum is harder to find. If one of the summands is much bigger than the other onw, then its precision should be taken. When the summands overlap, more work is needed to compute the proper precision.

smichr · 2021-04-24T12:48:00Z

sympy/polys/numberfields.py

@@ -53,30 +53,27 @@ def _choose_factor(factors, x, v, dom=QQ, prec=200, bound=5):
    if len(factors) == 1:
        return factors[0]

-    points = {x:v}
+    factors = {f:f.as_expr().subs({x:v}) for f in factors}


If v.is_number (and not v.is_Number) it should be put in with the points for best evaluation.

Bearing in mind that these are always exact algebraic polynomials is there a benefit in passing them to evalf rather than just using subs?

oscarbenjamin · 2021-04-24T13:19:37Z

Float takes the higher precision in a product which seems wrong:

I agree. The lower precision should be taken. That should be easy to fix.

The precision of a sum is harder to find. If one of the summands is much bigger than the other onw, then its precision should be taken. When the summands overlap, more work is needed to compute the proper precision.

I think that the whole approach of having a "precision" as an error bound is broken. I note that @fredrik-johansson has moved on from mpmath to Arb which uses ball arithmetic:
https://arblib.org/
https://www.texmacs.org/joris/ball/ball.html

I think that sympy's evalf should do the same even if it is still built on top of mpmath for now (it would be good to use Arb one day). It should be possible for sympy's evalf to be reorganised along these lines at least for algebraic functions (as needed here). For transcendental functions that would have to come from a separate library (e.g. Arb).

With ball arithmetic instead of thinking about precision you keep track of an error bound. Then the precision is effectively the input to a calculation and the error bound is part of the output. When you want to make the error smaller you put in more precision but the error bound that comes back is always strictly correct and depends only indirectly on the precision that was used. Rather than checking if the result "has precision" you can check if zero is contained in the bounds.

fredrik-johansson · 2021-04-24T13:49:17Z

Yes, ball arithmetic is the way to go for this kind of thing (not for everything, but for most things evalf-esque).

It is possible to do ball arithmetic in pure Python with reasonable performance (about as fast as pure-Python mpmath). I think I even have buggy proof of concept code of this somewhere, but unsure if I'd have time to develop it.

smichr · 2021-04-25T06:44:37Z

The following diff passes the tests you have added and runs about 10% faster on master's 'minpoly' tests.

diff --git a/sympy/polys/numberfields.py b/sympy/polys/numberfields.py
index 073e5f3cb0..786cd32613 100644
--- a/sympy/polys/numberfields.py
+++ b/sympy/polys/numberfields.py
@@ -14,6 +14,8 @@
 from sympy.functions.elementary.trigonometric import cos, sin
 from sympy.ntheory import sieve
 from sympy.ntheory.factor_ import divisors
+from sympy.utilities.iterables import subsets
+
 from sympy.polys.densetools import dup_eval
 from sympy.polys.domains import ZZ, QQ
 from sympy.polys.orthopolys import dup_chebyshevt
@@ -48,36 +50,50 @@ def _choose_factor(factors, x, v, dom=QQ, prec=200, bound=5):
     Return a factor having root ``v``
     It is assumed that one of the factors has root ``v``.
     """
+    from sympy.polys.polyutils import illegal
+
     if isinstance(factors[0], tuple):
         factors = [f[0] for f in factors]
     if len(factors) == 1:
         return factors[0]
 
-    factors = {f:f.as_expr().subs({x:v}) for f in factors}
-
+    prec1 = 10
+    points = {}
     symbols = dom.symbols if hasattr(dom, 'symbols') else []
-
-    for n in range(bound**len(symbols)):
-        n_temp = n
-        points = {}
-        for s in symbols:
-            points[s] = n_temp % bound
-            n_temp = n_temp // bound
-
-        def nonzero(f):
-            n10 = abs(f.evalf(10, points))
-            return n10._prec > 1 and n10 > 0
-
-        candidates = {f:fe for f, fe in factors.items() if not nonzero(fe)}
-        if candidates:
-            factors = candidates
-        if len(factors) == 1:
-            [f] = factors
-            return f
+    while True:
+        # when dealing with non-Rational numbers we usually evaluate
+        # with `subs` argument but we only need a ballpark evaluation
+        xv = {x:v if not v.is_number else v.n(prec1)}
+        fe = [f.as_expr().xreplace(xv) for f in factors]
+
+        # assign integers [0, n) to symbols (if any)
+        for n in subsets(range(bound), k=len(symbols), repetition=True):
+            for s, i in zip(symbols, n):
+                points[s] = i
+
+            # evaluate the expression at these points
+            candidates = [(abs(f.subs(points).n(prec1)), i)
+                for i,f in enumerate(fe)]
+
+            # if we get invalid numbers (e.g. from division by zero)
+            # we try again
+            if any(i in illegal for i, _ in candidates):
+                continue
+
+            # find the smallest two -- if they differ significantly
+            # then we assume we have found the factor that becomes
+            # 0 when v is substituted into it
+            can = sorted(candidates)
+            (a, ix), (b, _) = can[:2]
+            if b != 0 and a/b < 10**6:  # XXX what to use?
+                return factors[ix]
+
+        prec1 *= 2
 
     raise NotImplementedError("multiple candidates for the minimal polynomial of %s" % v)
 
 
+
 def _separate_sq(p):
     """
     helper function for ``_minimal_polynomial_sq``

jksuom · 2021-04-25T07:20:59Z

It looks like subsets only returns non-decreasing sequences (though that will probably suffice). product would give all sequences.

jksuom · 2021-04-25T08:55:33Z

Maybe the last condition could be of the type if b > 10**6*a.

oscarbenjamin · 2021-04-25T10:26:15Z

Thanks, I've pushed that.

jksuom · 2021-04-25T10:45:01Z

sympy/polys/numberfields.py

+            # 0 when v is substituted into it
+            can = sorted(candidates)
+            (a, ix), (b, _) = can[:2]
+            if a < 10**6*b:  # XXX what to use?


a is the smaller one:

> /home/jks/test/sympy/sympy/polys/numberfields.py(88)_choose_factor() -> if a < 10**6*b: # XXX what to use? (Pdb) p a, b (5.801108341e-11, 44.13886734)

Oh yes. Surprising that it passes all the tests...

It looks like that confirms my expectation that the minimum could always be chosen.

If there are no symbols then I can see that the minimum could be chosen.

If we are substituting random integers for the symbols though then could it not be the case that multiple factors end up being exactly zero?

That is possible though may not occur in our tests. So some condition like this is necessary.

The expression is:

In [62]: yv = nsimplify(sqrt(x)/sqrt(1 + x**(-3)) - sqrt(x**3 + 1)/x + 1/(x**(5/2)*(1 + x**(-3))**(3/2)) + 1/(x**(11/2)*(1 + x**(-3))**(3/2))) In [63]: yv Out[63]: ________ ╱ 3 √x ╲╱ x + 1 1 1 ───────────── - ─────────── + ──────────────── + ───────────────── ________ x 3/2 3/2 ╱ 1 5/2 ⎛ 1 ⎞ 11/2 ⎛ 1 ⎞ ╱ 1 + ── x ⋅⎜1 + ──⎟ x ⋅⎜1 + ──⎟ ╱ 3 ⎜ 3⎟ ⎜ 3⎟ ╲╱ x ⎝ x ⎠ ⎝ x ⎠

SymPy computes this expression as zero for any real x with x > -1. For x < -1 nonzero values are obtained:

In [64]: yv.subs(x, -2) Out[64]: √7⋅ⅈ

We can split the expression up into two terms taking the one with the sqrt in the numerator apart from the others:

In [70]: y1 = yv.args[3] In [71]: y2 = sum(yv.args[:3]) In [72]: y1 Out[72]: ________ ╱ 3 -╲╱ x + 1 ───────────── x In [73]: y2 Out[73]: √x 1 1 ───────────── + ──────────────── + ───────────────── ________ 3/2 3/2 ╱ 1 5/2 ⎛ 1 ⎞ 11/2 ⎛ 1 ⎞ ╱ 1 + ── x ⋅⎜1 + ──⎟ x ⋅⎜1 + ──⎟ ╱ 3 ⎜ 3⎟ ⎜ 3⎟ ╲╱ x ⎝ x ⎠ ⎝ x ⎠

For x > -1 the two parts y1 and y2 are nonzero but sum to zero. For x < -1 both are imaginary but have the same sign so they combine rather than cancel. This depends on our choice of branch for the square roots I guess though so if we allowed y1 to take the other branch then the terms would still cancel.

From purely algebraic considerations we can find a larger polynomial that this expression would need to be a root of e.g.:

In [12]: from sympy.solvers.solvers import unrad In [13]: factor(unrad(yv - y)[0]) Out[13]: 14 2 14 ⎛ 2 ⎞ ⎛ 3 2 2 ⎞ x⋅y ⋅(x + 1) ⋅⎝x - x + 1⎠ ⋅⎝4⋅x - x ⋅y + 4⎠

This leaves 3 possible choices for y:

In [14]: r1, r2, r3 = roots(_, y) In [15]: r1.subs(x, I) Out[15]: ____________ 2⋅ⅈ⋅╲╱ -ⅈ⋅(1 + ⅈ) In [16]: r1.subs(x, I).n() Out[16]: 0.910179721124455 + 2.19736822693562⋅ⅈ In [17]: r2.subs(x, I).n() Out[17]: -0.910179721124455 - 2.19736822693562⋅ⅈ In [18]: r3.subs(x, I).n() Out[18]: 0

Exactly which one we have depends on our interpretation of the square root and other non-integer powers. The minpoly function substitutes only nonnegative integers for x and uses evalf/mpmath so all radicals take the principal root. That's why it chooses the zero root in this case. That's not the root it would find if it chose to substitute negative values for x though.

I'm actually not really sure how well-defined this is. The correct choice of minimal polynomial for this case depends on the value of x but x here is just considered to be an arbitrary symbol.

The minpoly function substitutes only nonnegative integers for x and uses evalf/mpmath so all radicals take the principal root.

Is there some assurance that arguments to radicals are non-negative for non-negative, real values of x?

Is there some assurance that arguments to radicals are non-negative for non-negative, real values of x?

No, not really. I think that this depends essentially on the interpretation that we give for what x is supposed to mean in this expression and that isn't really specified anywhere.

y = sqrt(x)/sqrt(1 + x**(-3)) - sqrt(x**3 + 1)/x + 1/(x**(5/2)*(1 + x**(-3))**(3/2)) + 1/(x**(11/2)*(1 + x**(-3))**(3/2)) is an interesting example. It is the sum y1 + y2 (#21370 (comment)) where y1 is -sqrt(x**3 + 1)/x and y2 can be simplified (manually) to sqrt(x)*sqrt(1 + 1/x**3). The sum will be 0, if sqrt(x**3 + 1) is written in the form x*sqrt(x)*sqrt(1 + 1/x**3). That cannot be done for all values of x as sqrt(a*b) is different from sqrt(a)*sqrt(b) in general.

If y were to denote an element of an algebraic function field (in risch, for example), then the field should be first constructed by giving suitable generators, u = sqrt(x) and v = sqrt(1 + 1/x**3), say. Then y would be rewritten in terms of the generators. At that point, a choice would have to be made for sqrt(x**3 + 1): either x*u*v or -x*u*v. Then the element would be unambiguous and its minimal polynomial well defined, either y or x**2*y**2 - 4*x**3 - 4 (in monic form).

The algorithms in numberfields have been implement under the assumption that the elements lie in a well defined field. That is obviously true for numbers but not for expressions involving symbols. The minimal polynomial of a field element (over some base field like QQ or QQ(x)) is always irreducible. It might be possible to extend the definition of minimal polynomial to expressions like y above by taking it as the product of several irreducible factors but I'm not sure that it would be worth while.

smichr · 2021-04-25T20:43:06Z

product would give all sequences.

Yes, for n in product(*[list(range(bound))]*len(symbols)):

oscarbenjamin · 2021-04-26T14:22:34Z

product would give all sequences.

Yes, for n in product(*[list(range(bound))]*len(symbols)):

Is that needed?

I'm just not sure how principled this algorithm really is given the discussion above.

smichr · 2021-04-26T19:26:47Z

I'm just not sure how principled this algorithm really is

This adds some increased security to the method and still passes numberfield tests:

diff --git a/sympy/polys/numberfields.py b/sympy/polys/numberfields.py
index b01f2cd71b..4b0daba41d 100644
--- a/sympy/polys/numberfields.py
+++ b/sympy/polys/numberfields.py
@@ -49,6 +49,17 @@ def _choose_factor(factors, x, v, dom=QQ, prec=200, bound=5):
     """
     Return a factor having root ``v``
     It is assumed that one of the factors has root ``v``.
+
+    When `v` is numeric, the evaluation of a factor at that value
+    may not compute with precision to 0, but the other factors
+    will and will be larger, so we will take the one with the
+    minimum value.
+
+    When `v` is symbolic, we will substitute in arbitrary values
+    for the symbols making sure that no denominator is 0 and that
+    all arguments of radicals are positive (so we get the principle
+    value). This should yield a single expression that is 0 provided
+    that enough precision is used.
     """
     from sympy import Pow
     from sympy.solvers.solvers import denoms
@@ -61,45 +72,68 @@ def _choose_factor(factors, x, v, dom=QQ, prec=200, bound=5):
     prec1 = 10
     points = {}
     symbols = dom.symbols if hasattr(dom, 'symbols') else []
-    while True:
-        # when dealing with non-Rational numbers we usually evaluate
-        # with `subs` argument but we only need a ballpark evaluation
-        xv = {x:v if not v.is_number else v.n(prec1)}
-        fe = [f.as_expr().xreplace(xv) for f in factors]
-
-        concerns = set([i.base for f in factors for i in f.atoms(Pow)
-            if not (i.exp.is_Integer and i.exp > 0 or i.base.is_number)])
-        ex = set.union(*[set(denoms(i)) for i in fe])
-        assert not ex - concerns
-        rbase = concerns
-        p = {i:Dummy(positive=True) for i in symbols}
-        assert all(i.xreplace(p).is_nonnegative for i in rbase)
-
-        # assign integers [0, n) to symbols (if any)
-        from random import randint
-        for do in range(bound):
-            n = [randint(1,100) for i in symbols]
+
+    # when dealing with non-Rational numbers we usually evaluate
+    # with `subs` argument but we only need a ballpark evaluation
+    xv = {x: v if not v.is_number else v.n(prec1)}
+    fe = [f.as_expr().xreplace(xv) for f in factors]
+
+    if not all(i.is_number for i in fe):
+        def _base(i):
+            """return denominator or base of non-integer Pow,
+            recursively
+
+            Examples
+            ========
+
+            >>> from sympy import sqrt
+            >>> from sympy.abc import x
+            >>> _base(x**2)
+            >>> _base(sqrt(x))
+            x
+            >>> _base(x**-2)
+            x
+            >>> _base(1/sqrt(x))
+            x
+            """
+            if not isinstance(i, Pow):
+                return i
+            if i.exp.is_Integer and i.exp < 0:
+                return _base(i.base)
+            if not i.exp.is_Integer:
+                return _base(i.base)
+
+        concerns = set(filter(None,
+            [_base(i) for f in fe for i in f.atoms(Pow)]))
+        p = {i: Dummy(positive=True) for i in symbols}
+        if not all(i.xreplace(p).is_nonnegative for i in concerns):
+            prec1 = prec + 1  # to flag failure
+            # TODO figure out a good way to pick values for
+            # symbols so all denominators/bases are positive
+        else:
+            from random import randint
+            n = [randint(1, 100*len(symbols)) for i in symbols]
             for s, i in zip(symbols, n):
                 points[s] = i
 
-            # evaluate the expression at these points
-            candidates = [(abs(f.subs(points).n(prec1)), i)
-                for i,f in enumerate(fe)]
+    while prec1 <= prec:
+        # evaluate the expression at these points
+        candidates = [(abs(f.xreplace(points).n(prec1)), i)
+            for i, f in enumerate(fe)]
 
-            # find the smallest two -- if they differ significantly
-            # then we assume we have found the factor that becomes
-            # 0 when v is substituted into it
-            can = sorted(candidates)
-            (a, ix), (b, _) = can[:2]
-            if b > a * 10**6:  # XXX what to use?
-                return factors[ix]
+        # find the smallest two -- if they differ significantly
+        # then we assume we have found the factor that becomes
+        # 0 when v is substituted into it
+        can = sorted(candidates)
+        (a, ix), (b, _) = can[:2]
+        if b > a * 10**6:  # XXX what to use?
+            return factors[ix]
 
         prec1 *= 2
 
     raise NotImplementedError("multiple candidates for the minimal polynomial of %s" % v)
 
 
-
 def _separate_sq(p):
     """
     helper function for ``_minimal_polynomial_sq``

oscarbenjamin · 2021-04-26T20:18:14Z

It's not generally possible to make everything inside the radicals nonnegative:

In [15]: minpoly(sqrt(x)+sqrt(-x)+sqrt(x-1), t)
Out[15]: 
 8    6              4 ⎛    2           ⎞    2 ⎛    3       2           ⎞       4       3       2          
t  + t ⋅(4 - 4⋅x) + t ⋅⎝14⋅x  - 12⋅x + 6⎠ + t ⋅⎝44⋅x  - 36⋅x  - 12⋅x + 4⎠ + 25⋅x  - 20⋅x  + 14⋅x  - 4⋅x + 1t  + t ⋅(-4⋅x - 4) + t ⋅⎝14⋅x  + 12⋅x + 6⎠ + t ⋅⎝44⋅x  + 36⋅x  - 12⋅x - 4⎠ + 25⋅x  + 20⋅x  + 14⋅x  + 4⋅x + 1

smichr · 2021-04-26T21:16:46Z

It's not generally possible to make everything inside the radicals nonnegative:

The modifications will raise an error if there is ever more than one factor and this is the case. Will the arguments of the radicals always be polynomials?

oscarbenjamin · 2021-04-26T21:27:15Z

Will the arguments of the radicals always be polynomials?

If someone calls minpoly then they can pass anything in.

I think that the key point is that this is well defined for algebraic expressions that do not involve symbols and will work for those provided that numerical evaluation is robust. However this can not work in general for arbitrary radical expressions involving symbols.

I'm not aware of minpoly actually being used anywhere for expressions involving symbols. Certainly constructing a polynomial domain with extension=True will compute the minimal polynomials of any algebraic expressions and return an algebraic number field but that fails for expressions with symbols:

In [2]: QQ.algebraic_field(sqrt(2), sqrt(3))
Out[2]: QQ<sqrt(2) + sqrt(3)>

In [3]: QQ.algebraic_field(sqrt(2), sqrt(x))
---------------------------------------------------------------------------
TypeError 

In [5]: construct_domain([sqrt(2), sqrt(3)], extension=True)
Out[5]: 
(QQ<sqrt(2) + sqrt(3)>,
 [ANP([mpq(1,2), mpq(0,1), mpq(-9,2), mpq(0,1)], [mpq(1,1), mpq(0,1), mpq(-10,1), mpq(0,1), mpq(1,1)], QQ),
  ANP([mpq(-1,2), mpq(0,1), mpq(11,2), mpq(0,1)], [mpq(1,1), mpq(0,1), mpq(-10,1), mpq(0,1), mpq(1,1)], QQ)])

In [4]: construct_domain([sqrt(2), sqrt(x)], extension=True)
Out[4]: (EX, [EX(sqrt(2)), EX(sqrt(x))])

smichr · 2021-04-26T22:32:57Z

Something like this (that passes in master) would fail with the diff I proposed:

>>> minpoly(I + sqrt(2 - I*y))
_x**4 - 2*_x**2 - 4*_x*y + y**2 + 9

smichr · 2021-04-27T20:53:32Z

This diff over the current PR removes doubt about numerics -- an exact 0 is sought for a factor into which the value is substituted after making all symbols positive. If it is not found, an error is raised. I am not sure if it would be better to use posify which would allow any negative symbols to remain unchanged.

diff --git a/sympy/polys/numberfields.py b/sympy/polys/numberfields.py
index 27247c7790..01ce657c79 100644
--- a/sympy/polys/numberfields.py
+++ b/sympy/polys/numberfields.py
@@ -49,51 +49,59 @@ def _choose_factor(factors, x, v, dom=QQ, prec=200, bound=5):
     """
     Return a factor having root ``v``
     It is assumed that one of the factors has root ``v``.
-    """
-    from sympy.polys.polyutils import illegal
 
+    When `v` is numeric, the evaluation of a factor at that value
+    may not compute with precision to 0, but the other factors
+    will and will be larger, so we will take the one with the
+    minimum value.
+
+    When `v` is symbolic, the factor which simplifies to 0 when
+    values are assumed to be positive will be selected as the desired
+    root. If none factors to 0 then a NotImplementedError will be
+    raised.
+    """
     if isinstance(factors[0], tuple):
         factors = [f[0] for f in factors]
     if len(factors) == 1:
         return factors[0]
 
     prec1 = 10
-    points = {}
     symbols = dom.symbols if hasattr(dom, 'symbols') else []
-    while True:
-        # when dealing with non-Rational numbers we usually evaluate
-        # with `subs` argument but we only need a ballpark evaluation
-        xv = {x:v if not v.is_number else v.n(prec1)}
-        fe = [f.as_expr().xreplace(xv) for f in factors]
-
-        # assign integers [0, n) to symbols (if any)
-        for n in subsets(range(bound), k=len(symbols), repetition=True):
-            for s, i in zip(symbols, n):
-                points[s] = i
-
-            # evaluate the expression at these points
-            candidates = [(abs(f.subs(points).n(prec1)), i)
-                for i,f in enumerate(fe)]
-
-            # if we get invalid numbers (e.g. from division by zero)
-            # we try again
-            if any(i in illegal for i, _ in candidates):
-                continue
-
-            # find the smallest two -- if they differ significantly
-            # then we assume we have found the factor that becomes
-            # 0 when v is substituted into it
-            can = sorted(candidates)
-            (a, ix), (b, _) = can[:2]
-            if b > a * 10**6:  # XXX what to use?
-                return factors[ix]
+
+    # when dealing with non-Rational numbers we usually evaluate
+    # with `subs` argument but we only need a ballpark evaluation
+    xv = {x: v if not v.is_number else v.n(prec1)}
+    fe = [f.as_expr().xreplace(xv) for f in factors]
+
+    # handle symbolic expressions by using simplification to
+    # give an expression that is 0
+    if not all(i.is_number for i in fe):
+        p = {i: Dummy('p', positive=True) for i in symbols}
+        fe = [f.xreplace(p).factor() for f in fe]
+        if S.Zero in fe:
+            return factors[fe.index(S.Zero)]
+        # for some reason the expression did not simplify
+        # to zero -- find a new way to select the factor that
+        # becomes 0 when x is replaced with non-numeric v
+        prec1 = prec + 1
+
+    while prec1 <= prec:
+        # evaluate the expression at this precision
+        candidates = [(abs(f.n(prec1)), i) for i, f in enumerate(fe)]
+
+        # find the smallest two -- if they differ significantly
+        # then we assume we have found the factor that becomes
+        # 0 when v is substituted into it
+        can = sorted(candidates)
+        (a, ix), (b, _) = can[:2]
+        if b > a * 10**6:  # XXX what to use?
+            return factors[ix]
 
         prec1 *= 2
 
     raise NotImplementedError("multiple candidates for the minimal polynomial of %s" % v)
 
 
-
 def _separate_sq(p):
     """
     helper function for ``_minimal_polynomial_sq``
@@ -120,7 +128,6 @@ def _separate_sq(p):
     -x**8 + 48*x**6 - 536*x**4 + 1728*x**2 - 400
 
     """
-    from sympy.utilities.iterables import sift
     def is_sqrt(expr):
         return expr.is_Pow and expr.exp is S.Half
     # p = c1*sqrt(q1) + ... + cn*sqrt(qn) -> a = [(c1, q1), .., (cn, qn)]
diff --git a/sympy/polys/tests/test_numberfields.py b/sympy/polys/tests/test_numberfields.py
index 52a259a6ae..16350403ca 100644
--- a/sympy/polys/tests/test_numberfields.py
+++ b/sympy/polys/tests/test_numberfields.py
@@ -182,6 +182,9 @@ def test_minimal_polynomial_issue_19732():
             + 211295968822207088328287206509522887719741955693091053353263782924470627623790749534705683380138972642560898936171035770539616881000369889020398551821767092685775598633794696371561234818461806577723412581353857653829324364446419444210520602157621008010129702779407422072249192199762604318993590841636967747488049176548615614290254356975376588506729604345612047361483789518445332415765213187893207704958013682516462853001964919444736320672860140355089)
     assert minimal_polynomial(expr, x) == poly
 
+    assert minimal_polynomial(I + sqrt(2 - I*y), x) == \
+        x**4 - 2*x**2 - 4*x*y + y**2 + 9
+
 
 def test_minimal_polynomial_hi_prec():
     p = 1/sqrt(1 - 9*sqrt(2) + 7*sqrt(3) + Rational(1, 10)**30)
@@ -324,6 +327,7 @@ def test_primitive_element():
     a, b = I*sqrt(2*sqrt(2) + 3), I*sqrt(-2*sqrt(2) + 3)
     assert primitive_element([a, b, I], x) == (x**4 + 6*x**2 + 1, [1, 0, 0])
 
+
 def test_field_isomorphism_pslq():
     a = AlgebraicNumber(I)
     b = AlgebraicNumber(I*sqrt(3))
@@ -732,6 +736,7 @@ def test_isolate():
 
     raises(NotImplementedError, lambda: isolate(I))
 
+
 def test_minpoly_fraction_field():
     assert minimal_polynomial(1/x, y) == -x*y + 1
     assert minimal_polynomial(1 / (x + 1), y) == (x + 1)*y - 1
@@ -768,11 +773,13 @@ def test_minpoly_fraction_field():
     raises(GeneratorsError, lambda: minimal_polynomial(sqrt(x) - y, x))
     raises(NotImplementedError, lambda: minimal_polynomial(sqrt(x), y, compose=False))
 
+
 @slow
 def test_minpoly_fraction_field_slow():
     assert minimal_polynomial(minimal_polynomial(sqrt(x**Rational(1,5) - 1),
         y).subs(y, sqrt(x**Rational(1,5) - 1)), z) == z
 
+
 def test_minpoly_domain():
     assert minimal_polynomial(sqrt(2), x, domain=QQ.algebraic_field(sqrt(2))) == \
         x - sqrt(2)
@@ -806,6 +813,7 @@ def test_issue_13230():
     + sqrt(196*sqrt(35) + 1941)/29), Point2D(-1 + (-sqrt(7) + sqrt(5))*(-sqrt(196*sqrt(35)
     + 1941)/29 - 2*sqrt(7)/29 + 9*sqrt(5)/29), -sqrt(196*sqrt(35) + 1941)/29 - 2*sqrt(7)/29 + 9*sqrt(5)/29)]
 
+
 def test_issue_19760():
     e = 1/(sqrt(1 + sqrt(2)) - sqrt(2)*sqrt(1 + sqrt(2))) + 1
     mp_expected = x**4 - 4*x**3 + 4*x**2 - 2

oscarbenjamin · 2021-04-27T21:17:28Z

an exact 0 is sought for a factor into which the value is substituted after making all symbols positive.

I don't think this really resolves the problem that the whole premise of the function is not well defined in the case of arbitrary symbols in the "algebraic" expression. I'm not sure that any diff can resolve that problem...

To me the main goal is just making sure that the function works reliably and efficiently in the case where there are no symbols.

smichr · 2021-04-28T01:33:45Z

function is not well defined in the case of arbitrary symbols in the "algebraic" expression.

Let @jksuom's statement be the epitaph: "algorithms in numberfields have been implement under the assumption that the elements lie in a well defined field. That is obviously true for numbers but not for expressions involving symbols" and let's just raise a TypeError on non-rational symbolic input.

jksuom · 2021-04-28T07:51:45Z

I think that this would work fine also with symbolic expressions that are well defined such as those that would be handled by risch. It might fail with NonImplementedError in some (rare) cases but then we could consider lowering the constant 10**6 or adding more substitutions for the symbols. I would be willing to merge this as it is.

oscarbenjamin · 2021-04-28T10:44:39Z

Yes, I suppose there are cases that are well defined like if the symbol is not in a radical. In the cases where the minimal polynomial does depend on the value of the symbol the current implementation should return an answer that is valid for some choice of that value.

If we wanted to detect those cases where it depends on the value of the symbol then for a radical expression like yv we could consider all possible branches of the radicals. If every factor is nonzero for some branch of the radicals and some value of the symbol then a unique minimal polynomial is not well defined.

I was about to say that we should just merge this but then I found a problem from the symbol answer above. Substituting I/5 for the symbol should make it well defined but then minpoly seems to hang in this case:

In [1]: yv = nsimplify(sqrt(x)/sqrt(1 + x**(-3)) - sqrt(x**3 + 1)/x + 1/(x**(5/2)*(1 + x**(-3))**(3/2)) + 1/(x**(11/2)*(1 + x**(-3))**(3/2)))

In [3]: e = yv.subs(x, I/5)

In [4]: minpoly(e)

In this case it seems that the two factors are:

In [17]: f1
Out[17]: Poly(x**8 + 63000*x**6 + 7630843843752*x**4 + 232711422859188000*x**2 + 14553778052345276611359376, x, domain='QQ')

In [18]: f2
Out[18]: Poly(x**8 + 63000*x**6 + 7630890718752*x**4 + 232711915046688000*x**2 + 14553718443946594955109376, x, domain='QQ')

Numerical evaluation suggests that e is not a root of either:

In [19]: f1.subs(x, e).n()
Out[19]: 1.45537548575065e+25 + 1.84948204781118e+17⋅ⅈ

In [20]: f2.subs(x, e).n()
Out[20]: 1.4553695249059e+25 + 1.84948591031e+17⋅ⅈ

Possibly this is a bug somewhere else but the algorithm in this PR has no upper limit on the precision so it goes into an infinite loop in this case.

jksuom · 2021-04-28T12:04:25Z

I had forgotten prec1. Obviously something will be needed to limit that even though it seems that the two polynomials f1 and f2 result from a bug elsewhere.

jksuom · 2021-04-28T16:34:25Z

The minimal polynomial of some simple expressions obtained by substituting I/5 is wrong. It looks like the error is coming from Factors:

In [5]: e = 125*I*I**(S(5)/2)*sqrt(1 + 125*I)                                   

In [6]: e                                                                       
Out[6]: 
       5/2   ___________
125⋅ⅈ⋅ⅈ   ⋅╲╱ 1 + 125⋅ⅈ 

In [7]: Factors(e)                                                              
Out[7]: Factors({125: 1, I: 1, 1 + 125*I: 1/2})

oscarbenjamin · 2021-04-28T17:34:07Z

A simpler example:

In [6]: from sympy.core.exprtools import Factors

In [7]: Factors(I*sqrt(I))
Out[7]: Factors({I: 1})

The fix seems to be

diff --git a/sympy/core/exprtools.py b/sympy/core/exprtools.py
index 58914ba83b..1392b06ab7 100644
--- a/sympy/core/exprtools.py
+++ b/sympy/core/exprtools.py
@@ -363,7 +363,7 @@ def __init__(self, factors=None):  # Factors
                     factors[q] = (factors[q] if q in factors else S.Zero) - factors[f]
                     factors.pop(f)
             if i:
-                factors[I] = S.One*i
+                factors[I] = factors.get(I, S.Zero) + S.One*i
             if nc:
                 factors[Mul(*nc, evaluate=False)] = S.One
         else:

That gives:

In [22]: yv = nsimplify(sqrt(x)/sqrt(1 + x**(-3)) - sqrt(x**3 + 1)/x + 1/(x**(5/2)*(1 + x**(-3))**(3/2)) + 1/(x**(11/2)*(1 + x**(-3))**(3/2)))

In [23]: e = yv.subs(x, I/5)

In [24]: minpoly(e)
Out[24]: 
    4         2         
25⋅x  + 5000⋅x  + 250016

which seems to be correct.

It seems that Factors has special handling of I to work around the special handling that I has in as_powers_dict():

In [3]: x.as_powers_dict()
Out[3]: defaultdict(int, {x: 1})

In [4]: I.as_powers_dict()
Out[4]: defaultdict(int, {-1: 1/2})

I also wonder if I.as_powers_dict should be changed.

smichr · 2021-04-28T18:11:59Z

I also wonder if I.as_powers_dict should be changed.

I.as_powers_dict is ok, but Pow.as_powers_dict can be changed to recognize base I and then with that single change -- none to Factors -- the same result is obtained:

>>> from sympy import *
>>> from sympy.core.exprtools import Factors
>>> e = 125*I*I**(S(5)/2)*sqrt(1 + 125*I)
>>> Factors(e).as_expr().equals(e)
True
>>> from sympy.abc import x
>>> minpoly(e, x)
x**4 - 3906250*x**2 + 3814941406250
>>> [RootOf(x**4 - 3906250*x**2 + 3814941406250, i).n(2) -e.n(2) for i in range(4)]
[-2.8e+3, -2.8e+3 + 11.0*I, 0, 11.0*I]
>>> minpoly(yv.subs(x, I/5), x)
25*x**4 + 5000*x**2 + 250016

diff is

diff --git a/sympy/core/power.py b/sympy/core/power.py
index 4c8aabc702..da88025525 100644
--- a/sympy/core/power.py
+++ b/sympy/core/power.py
@@ -932,6 +932,8 @@ def as_base_exp(self):
         b, e = self.args
         if b.is_Rational and b.p == 1 and b.q != 1:
             return Integer(b.q), -e
+        if b is S.ImaginaryUnit:
+            return S.NegativeOne, e/2
         return b, e

     def _eval_adjoint(self):
diff --git a/sympy/core/tests/test_power.py b/sympy/core/tests/test_power.py
index 7ee6791e4d..2f5af49d0d 100644
--- a/sympy/core/tests/test_power.py
+++ b/sympy/core/tests/test_power.py
@@ -283,6 +283,7 @@ def test_pow_as_base_exp():
     assert p.base, p.exp == p.as_base_exp() == (S(2), -x)
     # issue 8344:
     assert Pow(1, 2, evaluate=False).as_base_exp() == (S.One, S(2))
+    assert Pow(I, S.Half).as_base_exp() == (-1, S(1)/4)

smichr · 2021-04-28T18:16:34Z

Here is a diff over current that limits symbolic input to known rational expressions only and has a limit on the precision:

diff --git a/sympy/core/power.py b/sympy/core/power.py
index 4c8aabc702..da88025525 100644
--- a/sympy/core/power.py
+++ b/sympy/core/power.py
@@ -932,6 +932,8 @@ def as_base_exp(self):
         b, e = self.args
         if b.is_Rational and b.p == 1 and b.q != 1:
             return Integer(b.q), -e
+        if b is S.ImaginaryUnit:
+            return S.NegativeOne, e/2
         return b, e
 
     def _eval_adjoint(self):
diff --git a/sympy/core/tests/test_power.py b/sympy/core/tests/test_power.py
index 7ee6791e4d..2f5af49d0d 100644
--- a/sympy/core/tests/test_power.py
+++ b/sympy/core/tests/test_power.py
@@ -283,6 +283,7 @@ def test_pow_as_base_exp():
     assert p.base, p.exp == p.as_base_exp() == (S(2), -x)
     # issue 8344:
     assert Pow(1, 2, evaluate=False).as_base_exp() == (S.One, S(2))
+    assert Pow(I, S.Half).as_base_exp() == (-1, S(1)/4)
 
 
 def test_nseries():
diff --git a/sympy/polys/numberfields.py b/sympy/polys/numberfields.py
index 27247c7790..22d997a428 100644
--- a/sympy/polys/numberfields.py
+++ b/sympy/polys/numberfields.py
@@ -48,9 +48,14 @@
 def _choose_factor(factors, x, v, dom=QQ, prec=200, bound=5):
     """
     Return a factor having root ``v``
-    It is assumed that one of the factors has root ``v``.
+    It is assumed that one of the factors has root ``v``
+    and that ``v`` is numeric.
+
+    The evaluation of a factor at ``v`` may not compute with
+    precision to 0, but the other factors will and will be
+    larger, so we will take the one with the minimum value.
     """
-    from sympy.polys.polyutils import illegal
+    # assert v.is_number is the assumption for what follows
 
     if isinstance(factors[0], tuple):
         factors = [f[0] for f in factors]
@@ -58,42 +63,31 @@ def _choose_factor(factors, x, v, dom=QQ, prec=200, bound=5):
         return factors[0]
 
     prec1 = 10
-    points = {}
     symbols = dom.symbols if hasattr(dom, 'symbols') else []
-    while True:
-        # when dealing with non-Rational numbers we usually evaluate
-        # with `subs` argument but we only need a ballpark evaluation
-        xv = {x:v if not v.is_number else v.n(prec1)}
-        fe = [f.as_expr().xreplace(xv) for f in factors]
-
-        # assign integers [0, n) to symbols (if any)
-        for n in subsets(range(bound), k=len(symbols), repetition=True):
-            for s, i in zip(symbols, n):
-                points[s] = i
-
-            # evaluate the expression at these points
-            candidates = [(abs(f.subs(points).n(prec1)), i)
-                for i,f in enumerate(fe)]
-
-            # if we get invalid numbers (e.g. from division by zero)
-            # we try again
-            if any(i in illegal for i, _ in candidates):
-                continue
 
-            # find the smallest two -- if they differ significantly
-            # then we assume we have found the factor that becomes
-            # 0 when v is substituted into it
-            can = sorted(candidates)
-            (a, ix), (b, _) = can[:2]
-            if b > a * 10**6:  # XXX what to use?
-                return factors[ix]
+    # when dealing with non-Rational numbers we usually evaluate
+    # with `subs` argument but we only need a ballpark evaluation
+    xv = {x: v if not v.is_number else v.n(prec1)}
+    fe = [f.as_expr().xreplace(xv) for f in factors]
 
+    while prec1 <= prec:
+        # evaluate the expression at this precision
+        candidates = [(abs(f.n(prec1)), i) for i, f in enumerate(fe)]
+
+        # find the smallest two -- if they differ significantly
+        # then we assume we have found the factor that becomes
+        # 0 when v is substituted into it
+        can = sorted(candidates)
+        (a, ix), (b, _) = can[:2]
+        if b > a * 10**6:  # XXX what to use?
+            return factors[ix]
+
+        # else we raise the precision and try again
         prec1 *= 2
 
     raise NotImplementedError("multiple candidates for the minimal polynomial of %s" % v)
 
 
-
 def _separate_sq(p):
     """
     helper function for ``_minimal_polynomial_sq``
@@ -120,7 +114,6 @@ def _separate_sq(p):
     -x**8 + 48*x**6 - 536*x**4 + 1728*x**2 - 400
 
     """
-    from sympy.utilities.iterables import sift
     def is_sqrt(expr):
         return expr.is_Pow and expr.exp is S.Half
     # p = c1*sqrt(q1) + ... + cn*sqrt(qn) -> a = [(c1, q1), .., (cn, qn)]
@@ -657,11 +650,28 @@ def minimal_polynomial(ex, x=None, compose=True, polys=False, domain=None):
     from sympy.polys.polytools import degree
     from sympy.polys.domains import FractionField
     from sympy.core.basic import preorder_traversal
+    from sympy.core.power import Pow
 
     ex = sympify(ex)
+
     if ex.is_number:
-        # not sure if it's always needed but try it for numbers (issue 8354)
+        # not sure if it's always needed but (see issue 8354)
         ex = _mexpand(ex, recursive=True)
+
+    free = ex.free_symbols
+
+    if not ex.is_number:
+        for i in ex.atoms(Pow):
+            if i.is_number:
+                continue
+            if not i.exp.is_Integer:
+                if i.exp.is_number:
+                    raise TypeError('exponents of symbolic bases must be literal integers')
+                else:
+                    raise TypeError('symbolic exponents are not supported')
+            elif i.exp < 0:
+                if i.base.is_zero is not False:
+                    raise TypeError('denominators must be non-zero')
     for expr in preorder_traversal(ex):
         if expr.is_AlgebraicNumber:
             compose = False
@@ -673,8 +683,8 @@ def minimal_polynomial(ex, x=None, compose=True, polys=False, domain=None):
         x, cls = Dummy('x'), PurePoly
 
     if not domain:
-        if ex.free_symbols:
-            domain = FractionField(QQ, list(ex.free_symbols))
+        if free:
+            domain = FractionField(QQ, list(free))
         else:
             domain = QQ
     if hasattr(domain, 'symbols') and x in domain.symbols:
diff --git a/sympy/polys/tests/test_numberfields.py b/sympy/polys/tests/test_numberfields.py
index 52a259a6ae..6544d5d040 100644
--- a/sympy/polys/tests/test_numberfields.py
+++ b/sympy/polys/tests/test_numberfields.py
@@ -33,6 +33,10 @@
 from sympy.abc import x, y, z
 
 Q = Rational
+i = Symbol('i', integer=True)
+r = Symbol('r', rational=True)
+pr = Symbol('pr', rational=True, positive=True)
+nzr = Symbol('nzr', rational=True, zero=False)
 
 
 def test_minimal_polynomial():
@@ -74,8 +78,10 @@ def test_minimal_polynomial():
     assert minimal_polynomial(
         1/sqrt(a), x) == 392*x**8 - 1232*x**6 + 612*x**4 + 4*x**2 - 1
 
+    raises(TypeError, lambda: minimal_polynomial(1/r, x))
+    raises(TypeError, lambda: minimal_polynomial(2**y, x))
+    raises(TypeError, lambda: minimal_polynomial(2**i, x))
     raises(NotAlgebraic, lambda: minimal_polynomial(oo, x))
-    raises(NotAlgebraic, lambda: minimal_polynomial(2**y, x))
     raises(NotAlgebraic, lambda: minimal_polynomial(sin(1), x))
 
     assert minimal_polynomial(sqrt(2)).dummy_eq(x**2 - 2)
@@ -324,6 +330,7 @@ def test_primitive_element():
     a, b = I*sqrt(2*sqrt(2) + 3), I*sqrt(-2*sqrt(2) + 3)
     assert primitive_element([a, b, I], x) == (x**4 + 6*x**2 + 1, [1, 0, 0])
 
+
 def test_field_isomorphism_pslq():
     a = AlgebraicNumber(I)
     b = AlgebraicNumber(I*sqrt(3))
@@ -732,47 +739,39 @@ def test_isolate():
 
     raises(NotImplementedError, lambda: isolate(I))
 
+
 def test_minpoly_fraction_field():
-    assert minimal_polynomial(1/x, y) == -x*y + 1
-    assert minimal_polynomial(1 / (x + 1), y) == (x + 1)*y - 1
-
-    assert minimal_polynomial(sqrt(x), y) == y**2 - x
-    assert minimal_polynomial(sqrt(x + 1), y) == y**2 - x - 1
-    assert minimal_polynomial(sqrt(x) / x, y) == x*y**2 - 1
-    assert minimal_polynomial(sqrt(2) * sqrt(x), y) == y**2 - 2 * x
-    assert minimal_polynomial(sqrt(2) + sqrt(x), y) == \
-        y**4 + (-2*x - 4)*y**2 + x**2 - 4*x + 4
-
-    assert minimal_polynomial(x**Rational(1,3), y) == y**3 - x
-    assert minimal_polynomial(x**Rational(1,3) + sqrt(x), y) == \
-        y**6 - 3*x*y**4 - 2*x*y**3 + 3*x**2*y**2 - 6*x**2*y - x**3 + x**2
-
-    assert minimal_polynomial(sqrt(x) / z, y) == z**2*y**2 - x
-    assert minimal_polynomial(sqrt(x) / (z + 1), y) == (z**2 + 2*z + 1)*y**2 - x
-
-    assert minimal_polynomial(1/x, y, polys=True) == Poly(-x*y + 1, y, domain='ZZ(x)')
-    assert minimal_polynomial(1 / (x + 1), y, polys=True) == \
-        Poly((x + 1)*y - 1, y, domain='ZZ(x)')
-    assert minimal_polynomial(sqrt(x), y, polys=True) == Poly(y**2 - x, y, domain='ZZ(x)')
-    assert minimal_polynomial(sqrt(x) / z, y, polys=True) == \
-        Poly(z**2*y**2 - x, y, domain='ZZ(x, z)')
+    raises(TypeError, lambda: minimal_polynomial(1/x, y))
+    assert minimal_polynomial(1/nzr, x) == 1 - nzr*x
+    assert str(minimal_polynomial(1/nzr, x, polys=True)) == \
+        "Poly(-nzr*x + 1, x, domain='ZZ(nzr)')"
+    raises(TypeError, lambda: minimal_polynomial(1/(r + 1), y))
+    assert minimal_polynomial(1/(pr + 1), y) == (pr + 1)*y - 1
+    assert str(minimal_polynomial(1/(pr + 1), y, polys=True)) == \
+        "Poly((pr + 1)*y - 1, y, domain='ZZ(pr)')"
 
     # this is (sqrt(1 + x**3)/x).integrate(x).diff(x) - sqrt(1 + x**3)/x
+    # and is only 0 when x is real and greater than -1; the answer
+    # cannot be an unqualified y (see issue 21395)
     a = sqrt(x)/sqrt(1 + x**(-3)) - sqrt(x**3 + 1)/x + 1/(x**Rational(5, 2)* \
         (1 + x**(-3))**Rational(3, 2)) + 1/(x**Rational(11, 2)*(1 + x**(-3))**Rational(3, 2))
+    raises(TypeError, lambda: minimal_polynomial(a, y))
 
-    assert minimal_polynomial(a, y) == y
+    # issue 8354
+    e=S('''-2**(1/3)*(3*sqrt(93) + 29)**2 - 4*(3*sqrt(93) + 29)**(4/3) +
+        12*sqrt(93)*(3*sqrt(93) + 29)**(1/3) + 116*(3*sqrt(93) + 29)**(1/3) +
+        174*2**(1/3)*sqrt(93) + 1678*2**(1/3)''')
+    assert minimal_polynomial(e, x) == x
 
     raises(NotAlgebraic, lambda: minimal_polynomial(exp(x), y))
-    raises(GeneratorsError, lambda: minimal_polynomial(sqrt(x), x))
-    raises(GeneratorsError, lambda: minimal_polynomial(sqrt(x) - y, x))
-    raises(NotImplementedError, lambda: minimal_polynomial(sqrt(x), y, compose=False))
+
 
 @slow
 def test_minpoly_fraction_field_slow():
     assert minimal_polynomial(minimal_polynomial(sqrt(x**Rational(1,5) - 1),
         y).subs(y, sqrt(x**Rational(1,5) - 1)), z) == z
 
+
 def test_minpoly_domain():
     assert minimal_polynomial(sqrt(2), x, domain=QQ.algebraic_field(sqrt(2))) == \
         x - sqrt(2)
@@ -806,6 +805,7 @@ def test_issue_13230():
     + sqrt(196*sqrt(35) + 1941)/29), Point2D(-1 + (-sqrt(7) + sqrt(5))*(-sqrt(196*sqrt(35)
     + 1941)/29 - 2*sqrt(7)/29 + 9*sqrt(5)/29), -sqrt(196*sqrt(35) + 1941)/29 - 2*sqrt(7)/29 + 9*sqrt(5)/29)]
 
+
 def test_issue_19760():
     e = 1/(sqrt(1 + sqrt(2)) - sqrt(2)*sqrt(1 + sqrt(2))) + 1
     mp_expected = x**4 - 4*x**3 + 4*x**2 - 2

oscarbenjamin · 2021-04-28T18:23:22Z

I also wonder if I.as_powers_dict should be changed.

I.as_powers_dict is ok, but Pow.as_powers_dict can be changed to recognize base I and then with that single change -- none to Factors -- the same result is obtained:

With that change we get:

In [1]: I**(S(3)/2)
Out[1]: 
    3/4
(-1)

Whereas master gives:

In [3]: I**(S(3)/2)
Out[3]: 
 3/2
ⅈ

smichr · 2021-04-28T18:40:02Z

Whereas master gives:

I guess I like retaining the power of I rather than changing it to -1. But it seems like powers should be reduced so I**(e + r) -> I**e*I**r where e is even, e.g. I**(5/S(2)) -> -sqrt(I)

oscarbenjamin · 2021-04-28T19:00:51Z

Well actually it's just the printing:

In [1]: I**(S(3)/2)
Out[1]: 
    3/4
(-1)   

In [2]: _.args
Out[2]: (ⅈ, 3/2)

I think that shows though that messing with as_base_exp is going to break expectations elsewhere. I generally think of as_base_exp as being a safe way to retrieve the args of a Pow without assuming that an expression actually is a Pow because it might have evaluated to something else.

To me the issue seemed to be the other way around (although I'm not sure now):

In [3]: x.as_powers_dict()
Out[3]: defaultdict(int, {x: 1})

In [4]: I.as_powers_dict()
Out[4]: defaultdict(int, {-1: 1/2})

Shouldn't I be treated as an atom? That in turn comes from

In [2]: I.as_base_exp()
Out[2]: (-1, 1/2)

but that is potentially not so easy/desirable to change.

There are only a handful of places around the codebase where as_powers_dict is being used with Factors being one of them. At least for Factors this behaviour is apparently unwanted and there seems to be code to try and compensate for this.

I'm just not quite sure I can get my head round the reasoning for each of these different behaviours though. It seems like Factors wants to treat I as being something different from Pow(-1, S.Half) but maybe that's the mistake and it should be Factors that should just accept that I is really a singleton Pow:

In [10]: Factors(2*sqrt(2))
Out[10]: Factors({2: 3/2})

In [11]: Factors((-1)*sqrt(-1))
Out[11]: Factors({-1: 1, I: 1})

smichr · 2021-04-28T19:25:36Z

it should be Factors that should just accept that I is really a singleton Pow

I suspect that it can handle keeping the two separate -- it wasn't expecting powers of I, however.

jksuom · 2021-04-29T08:54:50Z

I think this is ready to go. Thanks!

oscarbenjamin added the polys label Apr 22, 2021

smichr reviewed Apr 24, 2021

View reviewed changes

jksuom reviewed Apr 25, 2021

View reviewed changes

fix(polys): improve numerical evaluation in minpoly

9337bfe

oscarbenjamin force-pushed the pr_minpoly branch from 58ce858 to 9337bfe Compare April 25, 2021 10:54

smichr mentioned this pull request Apr 28, 2021

cancel must be careful with assumptions #21395

Closed

fix(core): fix handling of I in Factors

f9fc873

fix(polys): reintroduce precision limit in minpoly

609e9b1

jksuom merged commit d43e9dc into sympy:master Apr 29, 2021

oscarbenjamin deleted the pr_minpoly branch August 16, 2021 00:57

oscarbenjamin mentioned this pull request Jun 24, 2022

minimal_polynomial fails for very complicated algebraic number #23677

Open

Uh oh!

fix(polys): improve numerical evaluation in minpoly #21370

fix(polys): improve numerical evaluation in minpoly #21370

Uh oh!

Conversation

oscarbenjamin commented Apr 22, 2021

References to other Issues or PRs

Brief description of what is fixed or changed

Other comments

Release Notes

Uh oh!

sympy-bot commented Apr 22, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

oscarbenjamin commented Apr 23, 2021

Uh oh!

jksuom commented Apr 23, 2021

Uh oh!

oscarbenjamin commented Apr 23, 2021

Uh oh!

oscarbenjamin commented Apr 23, 2021

Uh oh!

jksuom commented Apr 23, 2021

Uh oh!

smichr commented Apr 23, 2021

Uh oh!

smichr commented Apr 23, 2021

Uh oh!

smichr commented Apr 23, 2021

Uh oh!

jksuom commented Apr 23, 2021

Uh oh!

smichr commented Apr 23, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

oscarbenjamin commented Apr 23, 2021

Uh oh!

smichr commented Apr 23, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

smichr commented Apr 24, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

oscarbenjamin commented Apr 24, 2021

Uh oh!

jksuom commented Apr 24, 2021

Uh oh!

smichr Apr 24, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

oscarbenjamin commented Apr 24, 2021

Uh oh!

fredrik-johansson commented Apr 24, 2021

Uh oh!

smichr commented Apr 25, 2021

Uh oh!

jksuom commented Apr 25, 2021

Uh oh!

jksuom commented Apr 25, 2021

Uh oh!

oscarbenjamin commented Apr 25, 2021

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

sympy-bot commented Apr 22, 2021 •

edited

Loading

smichr commented Apr 23, 2021 •

edited

Loading

smichr commented Apr 23, 2021 •

edited

Loading

smichr commented Apr 24, 2021 •

edited

Loading

smichr Apr 24, 2021 •

edited

Loading

oscarbenjamin commented Apr 26, 2021 •

edited

Loading

smichr commented Apr 26, 2021 •

edited

Loading

smichr commented Apr 27, 2021 •

edited

Loading

smichr commented Apr 28, 2021 •

edited

Loading

smichr commented Apr 28, 2021 •

edited

Loading