Skip to content

Conversation

oscarbenjamin
Copy link
Collaborator

References to other Issues or PRs

Fixes #19732

Brief description of what is fixed or changed

The minpoly algorithm for choosing a factor is made more robust by using better numerical evaluation.

Other comments

Release Notes

  • polys
    • The accuracy of the minpoly algorithm was improved to handle complicated algebraic expressions more reliably.

@sympy-bot
Copy link

sympy-bot commented Apr 22, 2021

Hi, I am the SymPy bot (v161). I'm here to help you write a release notes entry. Please read the guide on how to write release notes.

Your release notes are in good order.

Here is what the release notes will look like:

  • polys
    • The accuracy of the minpoly algorithm was improved to handle complicated algebraic expressions more reliably. (#21370 by @oscarbenjamin)

This will be added to https://github.com/sympy/sympy/wiki/Release-Notes-for-1.9.

Click here to see the pull request description that was parsed.
<!-- Your title above should be a short description of what
was changed. Do not include the issue number in the title. -->

#### References to other Issues or PRs
<!-- If this pull request fixes an issue, write "Fixes #NNNN" in that exact
format, e.g. "Fixes #1234" (see
https://tinyurl.com/auto-closing for more information). Also, please
write a comment on that issue linking back to this pull request once it is
open. -->

Fixes #19732

#### Brief description of what is fixed or changed

The minpoly algorithm for choosing a factor is made more robust by using better numerical evaluation.

#### Other comments


#### Release Notes

<!-- Write the release notes for this release below between the BEGIN and END
statements. The basic format is a bulleted list with the name of the subpackage
and the release note for this PR. For example:

* solvers
  * Added a new solver for logarithmic equations.

* functions
  * Fixed a bug with log of integers.

or if no release note(s) should be included use:

NO ENTRY

See https://github.com/sympy/sympy/wiki/Writing-Release-Notes for more
information on how to write release notes. The bot will check your release
notes automatically to see if they are formatted correctly. -->

<!-- BEGIN RELEASE NOTES -->
* polys
    * The accuracy of the `minpoly` algorithm was improved to handle complicated algebraic expressions more reliably.
<!-- END RELEASE NOTES -->

Update

The release notes on the wiki have been updated.

@oscarbenjamin
Copy link
Collaborator Author

This has failed:

In [1]: minpoly(residue(1/(x**4 + 1), x, exp(I*pi/4)) - (-(Rational(1, 4) + I/4)/sqrt(2)))
---------------------------------------------------------------------------
NotImplementedError 

@jksuom
Copy link
Member

jksuom commented Apr 23, 2021

The test n10._prec > 1 and n10 > 0 is not reliable because the precision is not properly computed.

In [13]: from sympy.core.evalf import pure_complex

In [16]: f = 1 + 32*(sqrt(2)/8 - (-1)**(S(1)/4)/4)**2                        

In [17]: a = f.evalf(10)                                                        

In [18]: r, i = pure_complex(a)                                                 

In [19]: r, i                                                                   
Out[19]: (0.e-127, -4.731953965e-132)

In [20]: r._prec, i._prec                                                       
Out[20]: (1, 37)

In [21]: abs(a)._prec                                                           
Out[21]: 37

The wrong precision 37 of the imaginary part comes from mpmath. Computation of abs (involving __add__) could probably be fixed to return a more realistic final precision.

However, I think that there is an easier way: _choose_factor is always called with a correct factor in the factor list so it should suffice to pick the one the factors f for which abs(f.evalf(10, points)) has the minimum value.

@oscarbenjamin
Copy link
Collaborator Author

The problem is basically this:

In [2]: f = 1 + 32*(sqrt(2)/8 - (-1)**(S(1)/4)/4)**2

In [3]: f
Out[3]: 
                    24 ____⎞ 
       ⎜√2   ╲╱ -11 + 32⋅⎜── - ──────⎟ 
       ⎝8      4In [4]: f.n()
Out[4]: 0.e-138 - 9.06039381185871e-144

In [5]: abs(f.n())._prec
Out[5]: 53

There should be no precision on the imaginary part because it is exactly zero.

@oscarbenjamin
Copy link
Collaborator Author

The wrong precision 37 of the imaginary part comes from mpmath

There isn't a _prec in mpmath. It is sympy's evalf code that computes the _prec of any result. Floats with different precision still have the same _mpf_ representation as far as mpmath is concerned.

@jksuom
Copy link
Member

jksuom commented Apr 23, 2021

There isn't a _prec in mpmath.

That is true. The mpf objects have no precision. An operation on two objects treats then as having the same precision, usually given as a parameter.

In this example, sqrt(2)/8 - (-1)**(S(1)/4)/4 is pure imaginary but mpmath sees it as re + im*I where re is very small and has low precision.

(Pdb) p re, im
((0, mpz(27), -76, 5), (1, mpz(13356869453140768985445), -76, 74))

It then computes its square using the standard formula:

(Pdb) p p, prec
(2, 52)
(Pdb) p libmp.mpc_pow_int((re, im), p, prec)
((1, mpz(4503599627370495), -57, 52), (1, mpz(1343471837173405), -123, 51))

For the imaginary part 2*re*im, mpmath gives 51 bits but most of them are spurious as re has only five significant bits.

@smichr
Copy link
Member

smichr commented Apr 23, 2021

sqrt(2)/8 - (-1)**(S(1)/4)/4

This is perhaps the reason for using as_real_imag() on the expression before evaluating instead of vice versa:

>>> (sqrt(2)/8 - (-1)**(S(1)/4)/4).as_real_imag()
(0, -sqrt(2)/8)

@smichr
Copy link
Member

smichr commented Apr 23, 2021

The problem is basically this:

>>> f = 1 + 32*(sqrt(2)/8 - (-1)**(S(1)/4)/4)**2
>>> r,i=f.as_real_imag()
>>> abs(r.n(10) + i.n(10)*I)._prec
-1
>>> r,i
(0, 0)
>>> f
1 + 32*(sqrt(2)/8 - (-1)**(1/4)/4)**2

@smichr
Copy link
Member

smichr commented Apr 23, 2021

Do you think it would be worth having a safe_nsimplify step at the outset of minpoly?

@jksuom
Copy link
Member

jksuom commented Apr 23, 2021

It looks like as_real_imag could be used:

---
 a/sympy/polys/numberfields.py
+++ b/sympy/polys/numberfields.py
@@ -65,8 +65,8 @@ def _choose_factor(factors, x, v, dom=QQ, prec=200, bound=5):
             n_temp = n_temp // bound
 
         def nonzero(f):
-            n10 = abs(f.evalf(10, points))
-            return n10._prec > 1 and n10 > 0
+            r, i = [abs(c.evalf(10, points)) for c in f.as_real_imag()]
+            return r._prec > 1 and r > 0 or i._prec > 1 and i > 0
 
         candidates = {f:fe for f, fe in factors.items() if not nonzero(fe)}
         if candidates:

@smichr
Copy link
Member

smichr commented Apr 23, 2021

It looks like as_real_imag could be used:

Would it be worth starting the whole minpoly process with expression taming?

# tame expression if possible, e.g. (sqrt(2)/8 - (-1)**(S(1)/4)/4) -> -sqrt(2)*I/8
r, i = expr.as_real_imag()
expr = r + i*I

@oscarbenjamin
Copy link
Collaborator Author

It looks like as_real_imag could be used:

That leads to a big slowdown in

sympy/polys/tests/test_numberfields.py::test_minpoly_fraction_field_slow

which goes from 5 seconds to 250 seconds on this machine.

Also the time taken to run the whole non-slow sympy/polys tests goes from 96 seconds for master to 103 seconds for the PR to 110 seconds with as_real_imag (that might not be a reliable difference though). Maybe the slowdown in the fraction field test is a separate performance problem.

Do you think it would be worth having a safe_nsimplify step at the outset of minpoly?

There are lots of heuristics that could be used here but this is potentially a slow function and the intention is that it should be able to work just by depending on robust numerical evaluation. The best fix would be to fix the actual bug in the numerical evaluation I think.

@smichr
Copy link
Member

smichr commented Apr 23, 2021

That leads to a big slowdown in

then maybe use as a fallback. But if the evalf trouble can be fixed, great. But it is my feeling that the calculation of a "hidden zero" is fraught with difficulties.

>>> abs((cos(1)**2+sin(1)**2-1).n(10))>0
True
>>> abs((cos(1)**2+sin(1)**2-1).n(10))._prec
1

My rule of thumb has been that if prec is 1 you don't know anything about the number via evaluation. If you get a prec of 1 for the real or imaginary part of the number it is perhaps a good indicator that you should look for a better form of the expression that you are trying to compute.

@smichr
Copy link
Member

smichr commented Apr 24, 2021

What about this (in light of @jksuom comment about there being only one true root):

diff --git a/sympy/polys/numberfields.py b/sympy/polys/numberfields.py
index 073e5f3cb0..2cc8468183 100644
--- a/sympy/polys/numberfields.py
+++ b/sympy/polys/numberfields.py
@@ -1,6 +1,7 @@
 """Computational algebraic field theory. """
 
 from functools import reduce
+from collections import defaultdict
 
 from sympy import (
     S, Rational, AlgebraicNumber, GoldenRatio, TribonacciConstant,
@@ -10,6 +11,7 @@
 
 from sympy.core.exprtools import Factors
 from sympy.core.function import _mexpand
+from sympy.core.symbol import Symbol
 from sympy.functions.elementary.exponential import exp
 from sympy.functions.elementary.trigonometric import cos, sin
 from sympy.ntheory import sieve
@@ -64,16 +66,14 @@ def _choose_factor(factors, x, v, dom=QQ, prec=200, bound=5):
             points[s] = n_temp % bound
             n_temp = n_temp // bound
 
-        def nonzero(f):
-            n10 = abs(f.evalf(10, points))
-            return n10._prec > 1 and n10 > 0
-
-        candidates = {f:fe for f, fe in factors.items() if not nonzero(fe)}
-        if candidates:
-            factors = candidates
-        if len(factors) == 1:
-            [f] = factors
-            return f
+        if factors:
+            mag = lambda x: abs(x.n(10, points))
+            candidates = defaultdict(list)
+            for f, fe in factors.items():
+                candidates[mag(fe)].append(f)
+            c = min(candidates)
+            if len(candidates[c]) == 1:
+                return candidates[c][0]
 
     raise NotImplementedError("multiple candidates for the minimal polynomial of %s" % v)
 

@oscarbenjamin
Copy link
Collaborator Author

What about this (in light of @jksuom comment about there being only one true root):

That also leads to slowdowns.

I've tried various different ways of computing this using evalf and Abs and _prec and so on. Theoretically any of these methods should work although it's understandable that some are faster than others. The ones that aren't slow seem to hit up against some or other bug in evalf/mpmath though.

On a basic level Float takes the higher precision in a product which seems wrong:

In [35]: re = Float(1e-100, precision=3)

In [36]: im = Float(0.1, precision=53)

In [37]: re
Out[37]: 0.e-100

In [38]: im
Out[38]: 0.100000000000000

In [39]: re*im
Out[39]: 1.00011396737199e-101

In [40]: _._prec
Out[40]: 53

I think also that the way that evalf handles this is not right because it assumes a single precision for both real and imaginary parts:

sympy/sympy/core/evalf.py

Lines 710 to 713 in 624549a

# General complex number to arbitrary integer power
re, im = libmp.mpc_pow_int((re, im), p, prec)
# Assumes full accuracy in input
return finalize_complex(re, im, target_prec)

@jksuom
Copy link
Member

jksuom commented Apr 24, 2021

Float takes the higher precision in a product which seems wrong:

I agree. The lower precision should be taken. That should be easy to fix.

The precision of a sum is harder to find. If one of the summands is much bigger than the other onw, then its precision should be taken. When the summands overlap, more work is needed to compute the proper precision.

@@ -53,30 +53,27 @@ def _choose_factor(factors, x, v, dom=QQ, prec=200, bound=5):
if len(factors) == 1:
return factors[0]

points = {x:v}
factors = {f:f.as_expr().subs({x:v}) for f in factors}
Copy link
Member

@smichr smichr Apr 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If v.is_number (and not v.is_Number) it should be put in with the points for best evaluation.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bearing in mind that these are always exact algebraic polynomials is there a benefit in passing them to evalf rather than just using subs?

@oscarbenjamin
Copy link
Collaborator Author

Float takes the higher precision in a product which seems wrong:

I agree. The lower precision should be taken. That should be easy to fix.

The precision of a sum is harder to find. If one of the summands is much bigger than the other onw, then its precision should be taken. When the summands overlap, more work is needed to compute the proper precision.

I think that the whole approach of having a "precision" as an error bound is broken. I note that @fredrik-johansson has moved on from mpmath to Arb which uses ball arithmetic:
https://arblib.org/
https://www.texmacs.org/joris/ball/ball.html

I think that sympy's evalf should do the same even if it is still built on top of mpmath for now (it would be good to use Arb one day). It should be possible for sympy's evalf to be reorganised along these lines at least for algebraic functions (as needed here). For transcendental functions that would have to come from a separate library (e.g. Arb).

With ball arithmetic instead of thinking about precision you keep track of an error bound. Then the precision is effectively the input to a calculation and the error bound is part of the output. When you want to make the error smaller you put in more precision but the error bound that comes back is always strictly correct and depends only indirectly on the precision that was used. Rather than checking if the result "has precision" you can check if zero is contained in the bounds.

@fredrik-johansson
Copy link
Contributor

Yes, ball arithmetic is the way to go for this kind of thing (not for everything, but for most things evalf-esque).

It is possible to do ball arithmetic in pure Python with reasonable performance (about as fast as pure-Python mpmath). I think I even have buggy proof of concept code of this somewhere, but unsure if I'd have time to develop it.

@smichr
Copy link
Member

smichr commented Apr 25, 2021

The following diff passes the tests you have added and runs about 10% faster on master's 'minpoly' tests.

diff --git a/sympy/polys/numberfields.py b/sympy/polys/numberfields.py
index 073e5f3cb0..786cd32613 100644
--- a/sympy/polys/numberfields.py
+++ b/sympy/polys/numberfields.py
@@ -14,6 +14,8 @@
 from sympy.functions.elementary.trigonometric import cos, sin
 from sympy.ntheory import sieve
 from sympy.ntheory.factor_ import divisors
+from sympy.utilities.iterables import subsets
+
 from sympy.polys.densetools import dup_eval
 from sympy.polys.domains import ZZ, QQ
 from sympy.polys.orthopolys import dup_chebyshevt
@@ -48,36 +50,50 @@ def _choose_factor(factors, x, v, dom=QQ, prec=200, bound=5):
     Return a factor having root ``v``
     It is assumed that one of the factors has root ``v``.
     """
+    from sympy.polys.polyutils import illegal
+
     if isinstance(factors[0], tuple):
         factors = [f[0] for f in factors]
     if len(factors) == 1:
         return factors[0]
 
-    factors = {f:f.as_expr().subs({x:v}) for f in factors}
-
+    prec1 = 10
+    points = {}
     symbols = dom.symbols if hasattr(dom, 'symbols') else []
-
-    for n in range(bound**len(symbols)):
-        n_temp = n
-        points = {}
-        for s in symbols:
-            points[s] = n_temp % bound
-            n_temp = n_temp // bound
-
-        def nonzero(f):
-            n10 = abs(f.evalf(10, points))
-            return n10._prec > 1 and n10 > 0
-
-        candidates = {f:fe for f, fe in factors.items() if not nonzero(fe)}
-        if candidates:
-            factors = candidates
-        if len(factors) == 1:
-            [f] = factors
-            return f
+    while True:
+        # when dealing with non-Rational numbers we usually evaluate
+        # with `subs` argument but we only need a ballpark evaluation
+        xv = {x:v if not v.is_number else v.n(prec1)}
+        fe = [f.as_expr().xreplace(xv) for f in factors]
+
+        # assign integers [0, n) to symbols (if any)
+        for n in subsets(range(bound), k=len(symbols), repetition=True):
+            for s, i in zip(symbols, n):
+                points[s] = i
+
+            # evaluate the expression at these points
+            candidates = [(abs(f.subs(points).n(prec1)), i)
+                for i,f in enumerate(fe)]
+
+            # if we get invalid numbers (e.g. from division by zero)
+            # we try again
+            if any(i in illegal for i, _ in candidates):
+                continue
+
+            # find the smallest two -- if they differ significantly
+            # then we assume we have found the factor that becomes
+            # 0 when v is substituted into it
+            can = sorted(candidates)
+            (a, ix), (b, _) = can[:2]
+            if b != 0 and a/b < 10**6:  # XXX what to use?
+                return factors[ix]
+
+        prec1 *= 2
 
     raise NotImplementedError("multiple candidates for the minimal polynomial of %s" % v)
 
 
+
 def _separate_sq(p):
     """
     helper function for ``_minimal_polynomial_sq``

@jksuom
Copy link
Member

jksuom commented Apr 25, 2021

It looks like subsets only returns non-decreasing sequences (though that will probably suffice). product would give all sequences.

@jksuom
Copy link
Member

jksuom commented Apr 25, 2021

Maybe the last condition could be of the type if b > 10**6*a.

@oscarbenjamin
Copy link
Collaborator Author

Thanks, I've pushed that.

# 0 when v is substituted into it
can = sorted(candidates)
(a, ix), (b, _) = can[:2]
if a < 10**6*b: # XXX what to use?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a is the smaller one:

> /home/jks/test/sympy/sympy/polys/numberfields.py(88)_choose_factor()
-> if a < 10**6*b:  # XXX what to use?
(Pdb) p a, b
(5.801108341e-11, 44.13886734)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes. Surprising that it passes all the tests...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like that confirms my expectation that the minimum could always be chosen.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are no symbols then I can see that the minimum could be chosen.

If we are substituting random integers for the symbols though then could it not be the case that multiple factors end up being exactly zero?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is possible though may not occur in our tests. So some condition like this is necessary.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The expression is:

In [62]: yv = nsimplify(sqrt(x)/sqrt(1 + x**(-3)) - sqrt(x**3 + 1)/x + 1/(x**(5/2)*(1 + x**(-3))**(3/2)) + 1/(x**(11/2)*(1 + x**(-3))**(3/2)))

In [63]: yv
Out[63]: 
                   ________3x        ╲╱  x  + 1           1                   1        
───────────── - ─────────── + ──────────────── + ─────────────────
     ________        x                     3/2                 3/21                    5/2111/21 ⎞   
   ╱  1 + ──                  x   ⋅⎜1 + ──⎟      x    ⋅⎜1 + ──⎟   
  ╱        33⎟            ⎜     3⎟   
╲╱        xx ⎠            ⎝    x

SymPy computes this expression as zero for any real x with x > -1. For x < -1 nonzero values are obtained:

In [64]: yv.subs(x, -2)
Out[64]: √7

We can split the expression up into two terms taking the one with the sqrt in the numerator apart from the others:

In [70]: y1 = yv.args[3]

In [71]: y2 = sum(yv.args[:3])

In [72]: y1
Out[72]: 
    ________3      
-╲╱  x  + 1  
─────────────
      x      

In [73]: y2
Out[73]: 
      √x               1                   1        
───────────── + ──────────────── + ─────────────────
     ________                3/2                 3/21      5/2111/21 ⎞   
   ╱  1 + ──    x   ⋅⎜1 + ──⎟      x    ⋅⎜1 + ──⎟   
  ╱        33⎟            ⎜     3⎟   
╲╱        xx ⎠            ⎝    x

For x > -1 the two parts y1 and y2 are nonzero but sum to zero. For x < -1 both are imaginary but have the same sign so they combine rather than cancel. This depends on our choice of branch for the square roots I guess though so if we allowed y1 to take the other branch then the terms would still cancel.

From purely algebraic considerations we can find a larger polynomial that this expression would need to be a root of e.g.:

In [12]: from sympy.solvers.solvers import unrad

In [13]: factor(unrad(yv - y)[0])
Out[13]: 
                           14                   
   2        142        ⎞   ⎛   3    2  2xy ⋅(x + 1)  ⋅⎝x  - x + 1⎠  ⋅⎝4x  - xy  + 4

This leaves 3 possible choices for y:

In [14]: r1, r2, r3 = roots(_, y)

In [15]: r1.subs(x, I)
Out[15]: 
      ____________
2⋅╲╱ -⋅(1 + ) 

In [16]: r1.subs(x, I).n()
Out[16]: 0.910179721124455 + 2.19736822693562

In [17]: r2.subs(x, I).n()
Out[17]: -0.910179721124455 - 2.19736822693562

In [18]: r3.subs(x, I).n()
Out[18]: 0

Exactly which one we have depends on our interpretation of the square root and other non-integer powers. The minpoly function substitutes only nonnegative integers for x and uses evalf/mpmath so all radicals take the principal root. That's why it chooses the zero root in this case. That's not the root it would find if it chose to substitute negative values for x though.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually not really sure how well-defined this is. The correct choice of minimal polynomial for this case depends on the value of x but x here is just considered to be an arbitrary symbol.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The minpoly function substitutes only nonnegative integers for x and uses evalf/mpmath so all radicals take the principal root.

Is there some assurance that arguments to radicals are non-negative for non-negative, real values of x?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there some assurance that arguments to radicals are non-negative for non-negative, real values of x?

No, not really. I think that this depends essentially on the interpretation that we give for what x is supposed to mean in this expression and that isn't really specified anywhere.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

y = sqrt(x)/sqrt(1 + x**(-3)) - sqrt(x**3 + 1)/x + 1/(x**(5/2)*(1 + x**(-3))**(3/2)) + 1/(x**(11/2)*(1 + x**(-3))**(3/2)) is an interesting example. It is the sum y1 + y2 (#21370 (comment)) where y1 is -sqrt(x**3 + 1)/x and y2 can be simplified (manually) to sqrt(x)*sqrt(1 + 1/x**3). The sum will be 0, if sqrt(x**3 + 1) is written in the form x*sqrt(x)*sqrt(1 + 1/x**3). That cannot be done for all values of x as sqrt(a*b) is different from sqrt(a)*sqrt(b) in general.

If y were to denote an element of an algebraic function field (in risch, for example), then the field should be first constructed by giving suitable generators, u = sqrt(x) and v = sqrt(1 + 1/x**3), say. Then y would be rewritten in terms of the generators. At that point, a choice would have to be made for sqrt(x**3 + 1): either x*u*v or -x*u*v. Then the element would be unambiguous and its minimal polynomial well defined, either y or x**2*y**2 - 4*x**3 - 4 (in monic form).

The algorithms in numberfields have been implement under the assumption that the elements lie in a well defined field. That is obviously true for numbers but not for expressions involving symbols. The minimal polynomial of a field element (over some base field like QQ or QQ(x)) is always irreducible. It might be possible to extend the definition of minimal polynomial to expressions like y above by taking it as the product of several irreducible factors but I'm not sure that it would be worth while.

@smichr
Copy link
Member

smichr commented Apr 25, 2021

product would give all sequences.

Yes, for n in product(*[list(range(bound))]*len(symbols)):

@oscarbenjamin
Copy link
Collaborator Author

product would give all sequences.

Yes, for n in product(*[list(range(bound))]*len(symbols)):

Is that needed?

I'm just not sure how principled this algorithm really is given the discussion above.

@smichr
Copy link
Member

smichr commented Apr 26, 2021

I'm just not sure how principled this algorithm really is

This adds some increased security to the method and still passes numberfield tests:

diff --git a/sympy/polys/numberfields.py b/sympy/polys/numberfields.py
index b01f2cd71b..4b0daba41d 100644
--- a/sympy/polys/numberfields.py
+++ b/sympy/polys/numberfields.py
@@ -49,6 +49,17 @@ def _choose_factor(factors, x, v, dom=QQ, prec=200, bound=5):
     """
     Return a factor having root ``v``
     It is assumed that one of the factors has root ``v``.
+
+    When `v` is numeric, the evaluation of a factor at that value
+    may not compute with precision to 0, but the other factors
+    will and will be larger, so we will take the one with the
+    minimum value.
+
+    When `v` is symbolic, we will substitute in arbitrary values
+    for the symbols making sure that no denominator is 0 and that
+    all arguments of radicals are positive (so we get the principle
+    value). This should yield a single expression that is 0 provided
+    that enough precision is used.
     """
     from sympy import Pow
     from sympy.solvers.solvers import denoms
@@ -61,45 +72,68 @@ def _choose_factor(factors, x, v, dom=QQ, prec=200, bound=5):
     prec1 = 10
     points = {}
     symbols = dom.symbols if hasattr(dom, 'symbols') else []
-    while True:
-        # when dealing with non-Rational numbers we usually evaluate
-        # with `subs` argument but we only need a ballpark evaluation
-        xv = {x:v if not v.is_number else v.n(prec1)}
-        fe = [f.as_expr().xreplace(xv) for f in factors]
-
-        concerns = set([i.base for f in factors for i in f.atoms(Pow)
-            if not (i.exp.is_Integer and i.exp > 0 or i.base.is_number)])
-        ex = set.union(*[set(denoms(i)) for i in fe])
-        assert not ex - concerns
-        rbase = concerns
-        p = {i:Dummy(positive=True) for i in symbols}
-        assert all(i.xreplace(p).is_nonnegative for i in rbase)
-
-        # assign integers [0, n) to symbols (if any)
-        from random import randint
-        for do in range(bound):
-            n = [randint(1,100) for i in symbols]
+
+    # when dealing with non-Rational numbers we usually evaluate
+    # with `subs` argument but we only need a ballpark evaluation
+    xv = {x: v if not v.is_number else v.n(prec1)}
+    fe = [f.as_expr().xreplace(xv) for f in factors]
+
+    if not all(i.is_number for i in fe):
+        def _base(i):
+            """return denominator or base of non-integer Pow,
+            recursively
+
+            Examples
+            ========
+
+            >>> from sympy import sqrt
+            >>> from sympy.abc import x
+            >>> _base(x**2)
+            >>> _base(sqrt(x))
+            x
+            >>> _base(x**-2)
+            x
+            >>> _base(1/sqrt(x))
+            x
+            """
+            if not isinstance(i, Pow):
+                return i
+            if i.exp.is_Integer and i.exp < 0:
+                return _base(i.base)
+            if not i.exp.is_Integer:
+                return _base(i.base)
+
+        concerns = set(filter(None,
+            [_base(i) for f in fe for i in f.atoms(Pow)]))
+        p = {i: Dummy(positive=True) for i in symbols}
+        if not all(i.xreplace(p).is_nonnegative for i in concerns):
+            prec1 = prec + 1  # to flag failure
+            # TODO figure out a good way to pick values for
+            # symbols so all denominators/bases are positive
+        else:
+            from random import randint
+            n = [randint(1, 100*len(symbols)) for i in symbols]
             for s, i in zip(symbols, n):
                 points[s] = i
 
-            # evaluate the expression at these points
-            candidates = [(abs(f.subs(points).n(prec1)), i)
-                for i,f in enumerate(fe)]
+    while prec1 <= prec:
+        # evaluate the expression at these points
+        candidates = [(abs(f.xreplace(points).n(prec1)), i)
+            for i, f in enumerate(fe)]
 
-            # find the smallest two -- if they differ significantly
-            # then we assume we have found the factor that becomes
-            # 0 when v is substituted into it
-            can = sorted(candidates)
-            (a, ix), (b, _) = can[:2]
-            if b > a * 10**6:  # XXX what to use?
-                return factors[ix]
+        # find the smallest two -- if they differ significantly
+        # then we assume we have found the factor that becomes
+        # 0 when v is substituted into it
+        can = sorted(candidates)
+        (a, ix), (b, _) = can[:2]
+        if b > a * 10**6:  # XXX what to use?
+            return factors[ix]
 
         prec1 *= 2
 
     raise NotImplementedError("multiple candidates for the minimal polynomial of %s" % v)
 
 
-
 def _separate_sq(p):
     """
     helper function for ``_minimal_polynomial_sq``

@oscarbenjamin
Copy link
Collaborator Author

oscarbenjamin commented Apr 26, 2021

It's not generally possible to make everything inside the radicals nonnegative:

In [15]: minpoly(sqrt(x)+sqrt(-x)+sqrt(x-1), t)
Out[15]: 
 8    6              4223       24       3       2          
t  + t ⋅(4 - 4x) + t ⋅⎝14x  - 12x + 6+ t ⋅⎝44x  - 36x  - 12x + 4+ 25x  - 20x  + 14x  - 4x + 1t  + t ⋅(-4x - 4) + t ⋅⎝14x  + 12x + 6+ t ⋅⎝44x  + 36x  - 12x - 4+ 25x  + 20x  + 14x  + 4x + 1

@smichr
Copy link
Member

smichr commented Apr 26, 2021

It's not generally possible to make everything inside the radicals nonnegative:

The modifications will raise an error if there is ever more than one factor and this is the case. Will the arguments of the radicals always be polynomials?

@oscarbenjamin
Copy link
Collaborator Author

Will the arguments of the radicals always be polynomials?

If someone calls minpoly then they can pass anything in.

I think that the key point is that this is well defined for algebraic expressions that do not involve symbols and will work for those provided that numerical evaluation is robust. However this can not work in general for arbitrary radical expressions involving symbols.

I'm not aware of minpoly actually being used anywhere for expressions involving symbols. Certainly constructing a polynomial domain with extension=True will compute the minimal polynomials of any algebraic expressions and return an algebraic number field but that fails for expressions with symbols:

In [2]: QQ.algebraic_field(sqrt(2), sqrt(3))
Out[2]: QQ<sqrt(2) + sqrt(3)>

In [3]: QQ.algebraic_field(sqrt(2), sqrt(x))
---------------------------------------------------------------------------
TypeError 

In [5]: construct_domain([sqrt(2), sqrt(3)], extension=True)
Out[5]: 
(QQ<sqrt(2) + sqrt(3)>,
 [ANP([mpq(1,2), mpq(0,1), mpq(-9,2), mpq(0,1)], [mpq(1,1), mpq(0,1), mpq(-10,1), mpq(0,1), mpq(1,1)], QQ),
  ANP([mpq(-1,2), mpq(0,1), mpq(11,2), mpq(0,1)], [mpq(1,1), mpq(0,1), mpq(-10,1), mpq(0,1), mpq(1,1)], QQ)])

In [4]: construct_domain([sqrt(2), sqrt(x)], extension=True)
Out[4]: (EX, [EX(sqrt(2)), EX(sqrt(x))])

@smichr
Copy link
Member

smichr commented Apr 26, 2021

Something like this (that passes in master) would fail with the diff I proposed:

>>> minpoly(I + sqrt(2 - I*y))
_x**4 - 2*_x**2 - 4*_x*y + y**2 + 9

@smichr
Copy link
Member

smichr commented Apr 27, 2021

This diff over the current PR removes doubt about numerics -- an exact 0 is sought for a factor into which the value is substituted after making all symbols positive. If it is not found, an error is raised. I am not sure if it would be better to use posify which would allow any negative symbols to remain unchanged.

diff --git a/sympy/polys/numberfields.py b/sympy/polys/numberfields.py
index 27247c7790..01ce657c79 100644
--- a/sympy/polys/numberfields.py
+++ b/sympy/polys/numberfields.py
@@ -49,51 +49,59 @@ def _choose_factor(factors, x, v, dom=QQ, prec=200, bound=5):
     """
     Return a factor having root ``v``
     It is assumed that one of the factors has root ``v``.
-    """
-    from sympy.polys.polyutils import illegal
 
+    When `v` is numeric, the evaluation of a factor at that value
+    may not compute with precision to 0, but the other factors
+    will and will be larger, so we will take the one with the
+    minimum value.
+
+    When `v` is symbolic, the factor which simplifies to 0 when
+    values are assumed to be positive will be selected as the desired
+    root. If none factors to 0 then a NotImplementedError will be
+    raised.
+    """
     if isinstance(factors[0], tuple):
         factors = [f[0] for f in factors]
     if len(factors) == 1:
         return factors[0]
 
     prec1 = 10
-    points = {}
     symbols = dom.symbols if hasattr(dom, 'symbols') else []
-    while True:
-        # when dealing with non-Rational numbers we usually evaluate
-        # with `subs` argument but we only need a ballpark evaluation
-        xv = {x:v if not v.is_number else v.n(prec1)}
-        fe = [f.as_expr().xreplace(xv) for f in factors]
-
-        # assign integers [0, n) to symbols (if any)
-        for n in subsets(range(bound), k=len(symbols), repetition=True):
-            for s, i in zip(symbols, n):
-                points[s] = i
-
-            # evaluate the expression at these points
-            candidates = [(abs(f.subs(points).n(prec1)), i)
-                for i,f in enumerate(fe)]
-
-            # if we get invalid numbers (e.g. from division by zero)
-            # we try again
-            if any(i in illegal for i, _ in candidates):
-                continue
-
-            # find the smallest two -- if they differ significantly
-            # then we assume we have found the factor that becomes
-            # 0 when v is substituted into it
-            can = sorted(candidates)
-            (a, ix), (b, _) = can[:2]
-            if b > a * 10**6:  # XXX what to use?
-                return factors[ix]
+
+    # when dealing with non-Rational numbers we usually evaluate
+    # with `subs` argument but we only need a ballpark evaluation
+    xv = {x: v if not v.is_number else v.n(prec1)}
+    fe = [f.as_expr().xreplace(xv) for f in factors]
+
+    # handle symbolic expressions by using simplification to
+    # give an expression that is 0
+    if not all(i.is_number for i in fe):
+        p = {i: Dummy('p', positive=True) for i in symbols}
+        fe = [f.xreplace(p).factor() for f in fe]
+        if S.Zero in fe:
+            return factors[fe.index(S.Zero)]
+        # for some reason the expression did not simplify
+        # to zero -- find a new way to select the factor that
+        # becomes 0 when x is replaced with non-numeric v
+        prec1 = prec + 1
+
+    while prec1 <= prec:
+        # evaluate the expression at this precision
+        candidates = [(abs(f.n(prec1)), i) for i, f in enumerate(fe)]
+
+        # find the smallest two -- if they differ significantly
+        # then we assume we have found the factor that becomes
+        # 0 when v is substituted into it
+        can = sorted(candidates)
+        (a, ix), (b, _) = can[:2]
+        if b > a * 10**6:  # XXX what to use?
+            return factors[ix]
 
         prec1 *= 2
 
     raise NotImplementedError("multiple candidates for the minimal polynomial of %s" % v)
 
 
-
 def _separate_sq(p):
     """
     helper function for ``_minimal_polynomial_sq``
@@ -120,7 +128,6 @@ def _separate_sq(p):
     -x**8 + 48*x**6 - 536*x**4 + 1728*x**2 - 400
 
     """
-    from sympy.utilities.iterables import sift
     def is_sqrt(expr):
         return expr.is_Pow and expr.exp is S.Half
     # p = c1*sqrt(q1) + ... + cn*sqrt(qn) -> a = [(c1, q1), .., (cn, qn)]
diff --git a/sympy/polys/tests/test_numberfields.py b/sympy/polys/tests/test_numberfields.py
index 52a259a6ae..16350403ca 100644
--- a/sympy/polys/tests/test_numberfields.py
+++ b/sympy/polys/tests/test_numberfields.py
@@ -182,6 +182,9 @@ def test_minimal_polynomial_issue_19732():
             + 211295968822207088328287206509522887719741955693091053353263782924470627623790749534705683380138972642560898936171035770539616881000369889020398551821767092685775598633794696371561234818461806577723412581353857653829324364446419444210520602157621008010129702779407422072249192199762604318993590841636967747488049176548615614290254356975376588506729604345612047361483789518445332415765213187893207704958013682516462853001964919444736320672860140355089)
     assert minimal_polynomial(expr, x) == poly
 
+    assert minimal_polynomial(I + sqrt(2 - I*y), x) == \
+        x**4 - 2*x**2 - 4*x*y + y**2 + 9
+
 
 def test_minimal_polynomial_hi_prec():
     p = 1/sqrt(1 - 9*sqrt(2) + 7*sqrt(3) + Rational(1, 10)**30)
@@ -324,6 +327,7 @@ def test_primitive_element():
     a, b = I*sqrt(2*sqrt(2) + 3), I*sqrt(-2*sqrt(2) + 3)
     assert primitive_element([a, b, I], x) == (x**4 + 6*x**2 + 1, [1, 0, 0])
 
+
 def test_field_isomorphism_pslq():
     a = AlgebraicNumber(I)
     b = AlgebraicNumber(I*sqrt(3))
@@ -732,6 +736,7 @@ def test_isolate():
 
     raises(NotImplementedError, lambda: isolate(I))
 
+
 def test_minpoly_fraction_field():
     assert minimal_polynomial(1/x, y) == -x*y + 1
     assert minimal_polynomial(1 / (x + 1), y) == (x + 1)*y - 1
@@ -768,11 +773,13 @@ def test_minpoly_fraction_field():
     raises(GeneratorsError, lambda: minimal_polynomial(sqrt(x) - y, x))
     raises(NotImplementedError, lambda: minimal_polynomial(sqrt(x), y, compose=False))
 
+
 @slow
 def test_minpoly_fraction_field_slow():
     assert minimal_polynomial(minimal_polynomial(sqrt(x**Rational(1,5) - 1),
         y).subs(y, sqrt(x**Rational(1,5) - 1)), z) == z
 
+
 def test_minpoly_domain():
     assert minimal_polynomial(sqrt(2), x, domain=QQ.algebraic_field(sqrt(2))) == \
         x - sqrt(2)
@@ -806,6 +813,7 @@ def test_issue_13230():
     + sqrt(196*sqrt(35) + 1941)/29), Point2D(-1 + (-sqrt(7) + sqrt(5))*(-sqrt(196*sqrt(35)
     + 1941)/29 - 2*sqrt(7)/29 + 9*sqrt(5)/29), -sqrt(196*sqrt(35) + 1941)/29 - 2*sqrt(7)/29 + 9*sqrt(5)/29)]
 
+
 def test_issue_19760():
     e = 1/(sqrt(1 + sqrt(2)) - sqrt(2)*sqrt(1 + sqrt(2))) + 1
     mp_expected = x**4 - 4*x**3 + 4*x**2 - 2

@oscarbenjamin
Copy link
Collaborator Author

an exact 0 is sought for a factor into which the value is substituted after making all symbols positive.

I don't think this really resolves the problem that the whole premise of the function is not well defined in the case of arbitrary symbols in the "algebraic" expression. I'm not sure that any diff can resolve that problem...

To me the main goal is just making sure that the function works reliably and efficiently in the case where there are no symbols.

@smichr
Copy link
Member

smichr commented Apr 28, 2021

function is not well defined in the case of arbitrary symbols in the "algebraic" expression.

Let @jksuom's statement be the epitaph: "algorithms in numberfields have been implement under the assumption that the elements lie in a well defined field. That is obviously true for numbers but not for expressions involving symbols" and let's just raise a TypeError on non-rational symbolic input.

@jksuom
Copy link
Member

jksuom commented Apr 28, 2021

I think that this would work fine also with symbolic expressions that are well defined such as those that would be handled by risch. It might fail with NonImplementedError in some (rare) cases but then we could consider lowering the constant 10**6 or adding more substitutions for the symbols. I would be willing to merge this as it is.

@oscarbenjamin
Copy link
Collaborator Author

Yes, I suppose there are cases that are well defined like if the symbol is not in a radical. In the cases where the minimal polynomial does depend on the value of the symbol the current implementation should return an answer that is valid for some choice of that value.

If we wanted to detect those cases where it depends on the value of the symbol then for a radical expression like yv we could consider all possible branches of the radicals. If every factor is nonzero for some branch of the radicals and some value of the symbol then a unique minimal polynomial is not well defined.

I was about to say that we should just merge this but then I found a problem from the symbol answer above. Substituting I/5 for the symbol should make it well defined but then minpoly seems to hang in this case:

In [1]: yv = nsimplify(sqrt(x)/sqrt(1 + x**(-3)) - sqrt(x**3 + 1)/x + 1/(x**(5/2)*(1 + x**(-3))**(3/2)) + 1/(x**(11/2)*(1 + x**(-3))**(3/2)))

In [3]: e = yv.subs(x, I/5)

In [4]: minpoly(e)

In this case it seems that the two factors are:

In [17]: f1
Out[17]: Poly(x**8 + 63000*x**6 + 7630843843752*x**4 + 232711422859188000*x**2 + 14553778052345276611359376, x, domain='QQ')

In [18]: f2
Out[18]: Poly(x**8 + 63000*x**6 + 7630890718752*x**4 + 232711915046688000*x**2 + 14553718443946594955109376, x, domain='QQ')

Numerical evaluation suggests that e is not a root of either:

In [19]: f1.subs(x, e).n()
Out[19]: 1.45537548575065e+25 + 1.84948204781118e+17

In [20]: f2.subs(x, e).n()
Out[20]: 1.4553695249059e+25 + 1.84948591031e+17

Possibly this is a bug somewhere else but the algorithm in this PR has no upper limit on the precision so it goes into an infinite loop in this case.

@jksuom
Copy link
Member

jksuom commented Apr 28, 2021

I had forgotten prec1. Obviously something will be needed to limit that even though it seems that the two polynomials f1 and f2 result from a bug elsewhere.

@jksuom
Copy link
Member

jksuom commented Apr 28, 2021

The minimal polynomial of some simple expressions obtained by substituting I/5 is wrong. It looks like the error is coming from Factors:

In [5]: e = 125*I*I**(S(5)/2)*sqrt(1 + 125*I)                                   

In [6]: e                                                                       
Out[6]: 
       5/2   ___________
125⋅ⅈ⋅ⅈ   ⋅╲╱ 1 + 125⋅ⅈ 

In [7]: Factors(e)                                                              
Out[7]: Factors({125: 1, I: 1, 1 + 125*I: 1/2})

@oscarbenjamin
Copy link
Collaborator Author

A simpler example:

In [6]: from sympy.core.exprtools import Factors

In [7]: Factors(I*sqrt(I))
Out[7]: Factors({I: 1})

The fix seems to be

diff --git a/sympy/core/exprtools.py b/sympy/core/exprtools.py
index 58914ba83b..1392b06ab7 100644
--- a/sympy/core/exprtools.py
+++ b/sympy/core/exprtools.py
@@ -363,7 +363,7 @@ def __init__(self, factors=None):  # Factors
                     factors[q] = (factors[q] if q in factors else S.Zero) - factors[f]
                     factors.pop(f)
             if i:
-                factors[I] = S.One*i
+                factors[I] = factors.get(I, S.Zero) + S.One*i
             if nc:
                 factors[Mul(*nc, evaluate=False)] = S.One
         else:

That gives:

In [22]: yv = nsimplify(sqrt(x)/sqrt(1 + x**(-3)) - sqrt(x**3 + 1)/x + 1/(x**(5/2)*(1 + x**(-3))**(3/2)) + 1/(x**(11/2)*(1 + x**(-3))**(3/2)))

In [23]: e = yv.subs(x, I/5)

In [24]: minpoly(e)
Out[24]: 
    4         2         
25x  + 5000x  + 250016

which seems to be correct.

It seems that Factors has special handling of I to work around the special handling that I has in as_powers_dict():

In [3]: x.as_powers_dict()
Out[3]: defaultdict(int, {x: 1})

In [4]: I.as_powers_dict()
Out[4]: defaultdict(int, {-1: 1/2})

I also wonder if I.as_powers_dict should be changed.

@smichr
Copy link
Member

smichr commented Apr 28, 2021

I also wonder if I.as_powers_dict should be changed.

I.as_powers_dict is ok, but Pow.as_powers_dict can be changed to recognize base I and then with that single change -- none to Factors -- the same result is obtained:

>>> from sympy import *
>>> from sympy.core.exprtools import Factors
>>> e = 125*I*I**(S(5)/2)*sqrt(1 + 125*I)
>>> Factors(e).as_expr().equals(e)
True
>>> from sympy.abc import x
>>> minpoly(e, x)
x**4 - 3906250*x**2 + 3814941406250
>>> [RootOf(x**4 - 3906250*x**2 + 3814941406250, i).n(2) -e.n(2) for i in range(4)]
[-2.8e+3, -2.8e+3 + 11.0*I, 0, 11.0*I]
>>> minpoly(yv.subs(x, I/5), x)
25*x**4 + 5000*x**2 + 250016

diff is

diff --git a/sympy/core/power.py b/sympy/core/power.py
index 4c8aabc702..da88025525 100644
--- a/sympy/core/power.py
+++ b/sympy/core/power.py
@@ -932,6 +932,8 @@ def as_base_exp(self):
         b, e = self.args
         if b.is_Rational and b.p == 1 and b.q != 1:
             return Integer(b.q), -e
+        if b is S.ImaginaryUnit:
+            return S.NegativeOne, e/2
         return b, e

     def _eval_adjoint(self):
diff --git a/sympy/core/tests/test_power.py b/sympy/core/tests/test_power.py
index 7ee6791e4d..2f5af49d0d 100644
--- a/sympy/core/tests/test_power.py
+++ b/sympy/core/tests/test_power.py
@@ -283,6 +283,7 @@ def test_pow_as_base_exp():
     assert p.base, p.exp == p.as_base_exp() == (S(2), -x)
     # issue 8344:
     assert Pow(1, 2, evaluate=False).as_base_exp() == (S.One, S(2))
+    assert Pow(I, S.Half).as_base_exp() == (-1, S(1)/4)

@smichr
Copy link
Member

smichr commented Apr 28, 2021

Here is a diff over current that limits symbolic input to known rational expressions only and has a limit on the precision:

diff --git a/sympy/core/power.py b/sympy/core/power.py
index 4c8aabc702..da88025525 100644
--- a/sympy/core/power.py
+++ b/sympy/core/power.py
@@ -932,6 +932,8 @@ def as_base_exp(self):
         b, e = self.args
         if b.is_Rational and b.p == 1 and b.q != 1:
             return Integer(b.q), -e
+        if b is S.ImaginaryUnit:
+            return S.NegativeOne, e/2
         return b, e
 
     def _eval_adjoint(self):
diff --git a/sympy/core/tests/test_power.py b/sympy/core/tests/test_power.py
index 7ee6791e4d..2f5af49d0d 100644
--- a/sympy/core/tests/test_power.py
+++ b/sympy/core/tests/test_power.py
@@ -283,6 +283,7 @@ def test_pow_as_base_exp():
     assert p.base, p.exp == p.as_base_exp() == (S(2), -x)
     # issue 8344:
     assert Pow(1, 2, evaluate=False).as_base_exp() == (S.One, S(2))
+    assert Pow(I, S.Half).as_base_exp() == (-1, S(1)/4)
 
 
 def test_nseries():
diff --git a/sympy/polys/numberfields.py b/sympy/polys/numberfields.py
index 27247c7790..22d997a428 100644
--- a/sympy/polys/numberfields.py
+++ b/sympy/polys/numberfields.py
@@ -48,9 +48,14 @@
 def _choose_factor(factors, x, v, dom=QQ, prec=200, bound=5):
     """
     Return a factor having root ``v``
-    It is assumed that one of the factors has root ``v``.
+    It is assumed that one of the factors has root ``v``
+    and that ``v`` is numeric.
+
+    The evaluation of a factor at ``v`` may not compute with
+    precision to 0, but the other factors will and will be
+    larger, so we will take the one with the minimum value.
     """
-    from sympy.polys.polyutils import illegal
+    # assert v.is_number is the assumption for what follows
 
     if isinstance(factors[0], tuple):
         factors = [f[0] for f in factors]
@@ -58,42 +63,31 @@ def _choose_factor(factors, x, v, dom=QQ, prec=200, bound=5):
         return factors[0]
 
     prec1 = 10
-    points = {}
     symbols = dom.symbols if hasattr(dom, 'symbols') else []
-    while True:
-        # when dealing with non-Rational numbers we usually evaluate
-        # with `subs` argument but we only need a ballpark evaluation
-        xv = {x:v if not v.is_number else v.n(prec1)}
-        fe = [f.as_expr().xreplace(xv) for f in factors]
-
-        # assign integers [0, n) to symbols (if any)
-        for n in subsets(range(bound), k=len(symbols), repetition=True):
-            for s, i in zip(symbols, n):
-                points[s] = i
-
-            # evaluate the expression at these points
-            candidates = [(abs(f.subs(points).n(prec1)), i)
-                for i,f in enumerate(fe)]
-
-            # if we get invalid numbers (e.g. from division by zero)
-            # we try again
-            if any(i in illegal for i, _ in candidates):
-                continue
 
-            # find the smallest two -- if they differ significantly
-            # then we assume we have found the factor that becomes
-            # 0 when v is substituted into it
-            can = sorted(candidates)
-            (a, ix), (b, _) = can[:2]
-            if b > a * 10**6:  # XXX what to use?
-                return factors[ix]
+    # when dealing with non-Rational numbers we usually evaluate
+    # with `subs` argument but we only need a ballpark evaluation
+    xv = {x: v if not v.is_number else v.n(prec1)}
+    fe = [f.as_expr().xreplace(xv) for f in factors]
 
+    while prec1 <= prec:
+        # evaluate the expression at this precision
+        candidates = [(abs(f.n(prec1)), i) for i, f in enumerate(fe)]
+
+        # find the smallest two -- if they differ significantly
+        # then we assume we have found the factor that becomes
+        # 0 when v is substituted into it
+        can = sorted(candidates)
+        (a, ix), (b, _) = can[:2]
+        if b > a * 10**6:  # XXX what to use?
+            return factors[ix]
+
+        # else we raise the precision and try again
         prec1 *= 2
 
     raise NotImplementedError("multiple candidates for the minimal polynomial of %s" % v)
 
 
-
 def _separate_sq(p):
     """
     helper function for ``_minimal_polynomial_sq``
@@ -120,7 +114,6 @@ def _separate_sq(p):
     -x**8 + 48*x**6 - 536*x**4 + 1728*x**2 - 400
 
     """
-    from sympy.utilities.iterables import sift
     def is_sqrt(expr):
         return expr.is_Pow and expr.exp is S.Half
     # p = c1*sqrt(q1) + ... + cn*sqrt(qn) -> a = [(c1, q1), .., (cn, qn)]
@@ -657,11 +650,28 @@ def minimal_polynomial(ex, x=None, compose=True, polys=False, domain=None):
     from sympy.polys.polytools import degree
     from sympy.polys.domains import FractionField
     from sympy.core.basic import preorder_traversal
+    from sympy.core.power import Pow
 
     ex = sympify(ex)
+
     if ex.is_number:
-        # not sure if it's always needed but try it for numbers (issue 8354)
+        # not sure if it's always needed but (see issue 8354)
         ex = _mexpand(ex, recursive=True)
+
+    free = ex.free_symbols
+
+    if not ex.is_number:
+        for i in ex.atoms(Pow):
+            if i.is_number:
+                continue
+            if not i.exp.is_Integer:
+                if i.exp.is_number:
+                    raise TypeError('exponents of symbolic bases must be literal integers')
+                else:
+                    raise TypeError('symbolic exponents are not supported')
+            elif i.exp < 0:
+                if i.base.is_zero is not False:
+                    raise TypeError('denominators must be non-zero')
     for expr in preorder_traversal(ex):
         if expr.is_AlgebraicNumber:
             compose = False
@@ -673,8 +683,8 @@ def minimal_polynomial(ex, x=None, compose=True, polys=False, domain=None):
         x, cls = Dummy('x'), PurePoly
 
     if not domain:
-        if ex.free_symbols:
-            domain = FractionField(QQ, list(ex.free_symbols))
+        if free:
+            domain = FractionField(QQ, list(free))
         else:
             domain = QQ
     if hasattr(domain, 'symbols') and x in domain.symbols:
diff --git a/sympy/polys/tests/test_numberfields.py b/sympy/polys/tests/test_numberfields.py
index 52a259a6ae..6544d5d040 100644
--- a/sympy/polys/tests/test_numberfields.py
+++ b/sympy/polys/tests/test_numberfields.py
@@ -33,6 +33,10 @@
 from sympy.abc import x, y, z
 
 Q = Rational
+i = Symbol('i', integer=True)
+r = Symbol('r', rational=True)
+pr = Symbol('pr', rational=True, positive=True)
+nzr = Symbol('nzr', rational=True, zero=False)
 
 
 def test_minimal_polynomial():
@@ -74,8 +78,10 @@ def test_minimal_polynomial():
     assert minimal_polynomial(
         1/sqrt(a), x) == 392*x**8 - 1232*x**6 + 612*x**4 + 4*x**2 - 1
 
+    raises(TypeError, lambda: minimal_polynomial(1/r, x))
+    raises(TypeError, lambda: minimal_polynomial(2**y, x))
+    raises(TypeError, lambda: minimal_polynomial(2**i, x))
     raises(NotAlgebraic, lambda: minimal_polynomial(oo, x))
-    raises(NotAlgebraic, lambda: minimal_polynomial(2**y, x))
     raises(NotAlgebraic, lambda: minimal_polynomial(sin(1), x))
 
     assert minimal_polynomial(sqrt(2)).dummy_eq(x**2 - 2)
@@ -324,6 +330,7 @@ def test_primitive_element():
     a, b = I*sqrt(2*sqrt(2) + 3), I*sqrt(-2*sqrt(2) + 3)
     assert primitive_element([a, b, I], x) == (x**4 + 6*x**2 + 1, [1, 0, 0])
 
+
 def test_field_isomorphism_pslq():
     a = AlgebraicNumber(I)
     b = AlgebraicNumber(I*sqrt(3))
@@ -732,47 +739,39 @@ def test_isolate():
 
     raises(NotImplementedError, lambda: isolate(I))
 
+
 def test_minpoly_fraction_field():
-    assert minimal_polynomial(1/x, y) == -x*y + 1
-    assert minimal_polynomial(1 / (x + 1), y) == (x + 1)*y - 1
-
-    assert minimal_polynomial(sqrt(x), y) == y**2 - x
-    assert minimal_polynomial(sqrt(x + 1), y) == y**2 - x - 1
-    assert minimal_polynomial(sqrt(x) / x, y) == x*y**2 - 1
-    assert minimal_polynomial(sqrt(2) * sqrt(x), y) == y**2 - 2 * x
-    assert minimal_polynomial(sqrt(2) + sqrt(x), y) == \
-        y**4 + (-2*x - 4)*y**2 + x**2 - 4*x + 4
-
-    assert minimal_polynomial(x**Rational(1,3), y) == y**3 - x
-    assert minimal_polynomial(x**Rational(1,3) + sqrt(x), y) == \
-        y**6 - 3*x*y**4 - 2*x*y**3 + 3*x**2*y**2 - 6*x**2*y - x**3 + x**2
-
-    assert minimal_polynomial(sqrt(x) / z, y) == z**2*y**2 - x
-    assert minimal_polynomial(sqrt(x) / (z + 1), y) == (z**2 + 2*z + 1)*y**2 - x
-
-    assert minimal_polynomial(1/x, y, polys=True) == Poly(-x*y + 1, y, domain='ZZ(x)')
-    assert minimal_polynomial(1 / (x + 1), y, polys=True) == \
-        Poly((x + 1)*y - 1, y, domain='ZZ(x)')
-    assert minimal_polynomial(sqrt(x), y, polys=True) == Poly(y**2 - x, y, domain='ZZ(x)')
-    assert minimal_polynomial(sqrt(x) / z, y, polys=True) == \
-        Poly(z**2*y**2 - x, y, domain='ZZ(x, z)')
+    raises(TypeError, lambda: minimal_polynomial(1/x, y))
+    assert minimal_polynomial(1/nzr, x) == 1 - nzr*x
+    assert str(minimal_polynomial(1/nzr, x, polys=True)) == \
+        "Poly(-nzr*x + 1, x, domain='ZZ(nzr)')"
+    raises(TypeError, lambda: minimal_polynomial(1/(r + 1), y))
+    assert minimal_polynomial(1/(pr + 1), y) == (pr + 1)*y - 1
+    assert str(minimal_polynomial(1/(pr + 1), y, polys=True)) == \
+        "Poly((pr + 1)*y - 1, y, domain='ZZ(pr)')"
 
     # this is (sqrt(1 + x**3)/x).integrate(x).diff(x) - sqrt(1 + x**3)/x
+    # and is only 0 when x is real and greater than -1; the answer
+    # cannot be an unqualified y (see issue 21395)
     a = sqrt(x)/sqrt(1 + x**(-3)) - sqrt(x**3 + 1)/x + 1/(x**Rational(5, 2)* \
         (1 + x**(-3))**Rational(3, 2)) + 1/(x**Rational(11, 2)*(1 + x**(-3))**Rational(3, 2))
+    raises(TypeError, lambda: minimal_polynomial(a, y))
 
-    assert minimal_polynomial(a, y) == y
+    # issue 8354
+    e=S('''-2**(1/3)*(3*sqrt(93) + 29)**2 - 4*(3*sqrt(93) + 29)**(4/3) +
+        12*sqrt(93)*(3*sqrt(93) + 29)**(1/3) + 116*(3*sqrt(93) + 29)**(1/3) +
+        174*2**(1/3)*sqrt(93) + 1678*2**(1/3)''')
+    assert minimal_polynomial(e, x) == x
 
     raises(NotAlgebraic, lambda: minimal_polynomial(exp(x), y))
-    raises(GeneratorsError, lambda: minimal_polynomial(sqrt(x), x))
-    raises(GeneratorsError, lambda: minimal_polynomial(sqrt(x) - y, x))
-    raises(NotImplementedError, lambda: minimal_polynomial(sqrt(x), y, compose=False))
+
 
 @slow
 def test_minpoly_fraction_field_slow():
     assert minimal_polynomial(minimal_polynomial(sqrt(x**Rational(1,5) - 1),
         y).subs(y, sqrt(x**Rational(1,5) - 1)), z) == z
 
+
 def test_minpoly_domain():
     assert minimal_polynomial(sqrt(2), x, domain=QQ.algebraic_field(sqrt(2))) == \
         x - sqrt(2)
@@ -806,6 +805,7 @@ def test_issue_13230():
     + sqrt(196*sqrt(35) + 1941)/29), Point2D(-1 + (-sqrt(7) + sqrt(5))*(-sqrt(196*sqrt(35)
     + 1941)/29 - 2*sqrt(7)/29 + 9*sqrt(5)/29), -sqrt(196*sqrt(35) + 1941)/29 - 2*sqrt(7)/29 + 9*sqrt(5)/29)]
 
+
 def test_issue_19760():
     e = 1/(sqrt(1 + sqrt(2)) - sqrt(2)*sqrt(1 + sqrt(2))) + 1
     mp_expected = x**4 - 4*x**3 + 4*x**2 - 2

@oscarbenjamin
Copy link
Collaborator Author

I also wonder if I.as_powers_dict should be changed.

I.as_powers_dict is ok, but Pow.as_powers_dict can be changed to recognize base I and then with that single change -- none to Factors -- the same result is obtained:

With that change we get:

In [1]: I**(S(3)/2)
Out[1]: 
    3/4
(-1) 

Whereas master gives:

In [3]: I**(S(3)/2)
Out[3]: 
 3/2
 

@smichr
Copy link
Member

smichr commented Apr 28, 2021

Whereas master gives:

I guess I like retaining the power of I rather than changing it to -1. But it seems like powers should be reduced so I**(e + r) -> I**e*I**r where e is even, e.g. I**(5/S(2)) -> -sqrt(I)

@oscarbenjamin
Copy link
Collaborator Author

Well actually it's just the printing:

In [1]: I**(S(3)/2)
Out[1]: 
    3/4
(-1)   

In [2]: _.args
Out[2]: (, 3/2)

I think that shows though that messing with as_base_exp is going to break expectations elsewhere. I generally think of as_base_exp as being a safe way to retrieve the args of a Pow without assuming that an expression actually is a Pow because it might have evaluated to something else.

To me the issue seemed to be the other way around (although I'm not sure now):

In [3]: x.as_powers_dict()
Out[3]: defaultdict(int, {x: 1})

In [4]: I.as_powers_dict()
Out[4]: defaultdict(int, {-1: 1/2})

Shouldn't I be treated as an atom? That in turn comes from

In [2]: I.as_base_exp()
Out[2]: (-1, 1/2)

but that is potentially not so easy/desirable to change.

There are only a handful of places around the codebase where as_powers_dict is being used with Factors being one of them. At least for Factors this behaviour is apparently unwanted and there seems to be code to try and compensate for this.

I'm just not quite sure I can get my head round the reasoning for each of these different behaviours though. It seems like Factors wants to treat I as being something different from Pow(-1, S.Half) but maybe that's the mistake and it should be Factors that should just accept that I is really a singleton Pow:

In [10]: Factors(2*sqrt(2))
Out[10]: Factors({2: 3/2})

In [11]: Factors((-1)*sqrt(-1))
Out[11]: Factors({-1: 1, I: 1})

@smichr
Copy link
Member

smichr commented Apr 28, 2021

it should be Factors that should just accept that I is really a singleton Pow

I suspect that it can handle keeping the two separate -- it wasn't expecting powers of I, however.

@jksuom
Copy link
Member

jksuom commented Apr 29, 2021

I think this is ready to go. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

minpoly fails for complicated algebraic number
5 participants