## Tuesday, July 1, 2008

### GMPY makes mpmath more speedy

This post will mostly be about recent mpmath development, but first a short update about my GSoC progress. Over the last week, I have improved the support for complex numbers (addition and multiplication of complex numbers works and detects cancellation in the real and imaginary parts), implemented some more functions, and added support for integrals. I still need to clean up and test the code some more. A longer update will be posted later today or tomorrow.

Besides working directly with the GSoC project, I've been busy playing with the results of a patch for mpmath that was submitted a few days ago by casevh. This brilliant patch allows mpmath to use GMPY mpz's instead of Python's built-in long integers.

GMPY, of course, is a Python wrapper for the highly optimized GMP bignum library. The beauty of GMPY is that you can just import it anywhere in your Python code, change x = 1 to x = mpz(1), and then almost any code written under the assumption of x being a Python int will continue to work, the only difference being that it will be much faster if x grows large.

(Mpmath attempts to accomplish something similar, being an intended drop-in replacement for the math and cmath modules and even parts of SciPy. However, mpmath's advantage is only that it gives increased precision; for low-precision number crunching, it is obviously much slower than ordinary floats.)

To try out mpmath with GMPY support, you need to check out the current development version from the mpmath SVN repository. There will be a new release soon, but not in the next few days. Casevh discovered a few bugs in GMPY while implementing this feature, so the patched version 1.03 of GMPY is needed. Mpmath will still work if an older version of GMPY is present; in that case, it will simply ignore it and use Python integers as usual. Just to be clear, mpmath is still a pure-Python library and will continue to work fine with GMPY unavailable (it will just not be as fast).

The improvement is quite significant. At precisions between 1,000 to 10,000 digits, most mpmath functions are of the order of 10 times faster with GMPY enabled. I have on occasion tried out some computations at 100,000 digits with mpmath; although they work fine, I have had to do something else while waiting for the results. With GMPY enabled, computing something like exp(sin(sqrt(2))) to 100,000 digits now only takes a couple of seconds.

SymPy will of course benefit from this feature. N/evalf will be able to handle ill-conditioned input much more efficiently, and the maximum working precision can be set higher, giving improved power and reliability for the same computing time.

The GMPY mode is slightly slower than the Python mode at low precision (10 percent slower below 100 digits of precision, or thereabout). That's something I can live with.

A particularly dramatic improvement can be seen by running the pidigits.py demo script. After I fixed an inefficiency in the number-to-string conversion, computing 1 million decimals of π with mpmath in GMPY-mode takes less than 30 seconds on my laptop:

C:\Source\mp\trunk\demo>pidigits.pyCompute digits of pi with mpmathWhich base? (2-36, 10 for decimal)> 10How many digits? (enter a big number, say, 10000)> 1000000Output to file? (enter a filename, or just press enterto print directly to the screen)> pi.txtStep 1 of 2: calculating binary value...  iteration 1 (accuracy ~= 0 base-10 digits)  iteration 2 (accuracy ~= 2 base-10 digits)  iteration 3 (accuracy ~= 4 base-10 digits)  iteration 4 (accuracy ~= 10 base-10 digits)  iteration 5 (accuracy ~= 21 base-10 digits)  iteration 6 (accuracy ~= 43 base-10 digits)  iteration 7 (accuracy ~= 86 base-10 digits)  iteration 8 (accuracy ~= 173 base-10 digits)  iteration 9 (accuracy ~= 348 base-10 digits)  iteration 10 (accuracy ~= 697 base-10 digits)  iteration 11 (accuracy ~= 1396 base-10 digits)  iteration 12 (accuracy ~= 2793 base-10 digits)  iteration 13 (accuracy ~= 5587 base-10 digits)  iteration 14 (accuracy ~= 11176 base-10 digits)  iteration 15 (accuracy ~= 22353 base-10 digits)  iteration 16 (accuracy ~= 44707 base-10 digits)  iteration 17 (accuracy ~= 89414 base-10 digits)  iteration 18 (accuracy ~= 178830 base-10 digits)  iteration 19 (accuracy ~= 357662 base-10 digits)  iteration 20 (accuracy ~= 715325 base-10 digits)  iteration 21 (accuracy ~= 1000017 base-10 digits)  final divisionStep 2 of 2: converting to specified base...Writing output...Finished in 28.540354 seconds (26.498292 calc, 2.042062 convert)

The same operation takes around 30 minutes in Python mode.

The speed of computing π with mpmath+GMPY is entirely respectable; some of the specialized π programs out there are actually slower (see Stu's pi page for an overview). The really fast ones run in 2-3 seconds; most require between 30 seconds and a minute.

One reason for mpmath+GMPY still being a bit slower for calculating π than the fastest programs is that it uses an arithmetic-geometric mean (AGM) based algorithm. The fastest algorithm in practice is the Chudnovsky series (with binary splitting), but its implementation is more complex, and making it really fast would require eliminating the Python and GMPY overhead as well. (A highly optimized implementation for GMP has been written by Hanhong Xue.)

The presence of GMPY does change the relative performance of various algorithms, particularly by favoring algorithms that take advantage of asymptotically fast multiplication. To try this out, I implemented binary splitting for computing e (see commit) in mpmath. Computing 1 million digits of e with mpmath now takes 3 seconds (converting to a base-10 string takes an additional 2 seconds). It actually turns out that binary splitting method for e is faster than the old direct summation method in Python mode as well, but only slightly.

Two of the most important functions are exp and log. Mpmath computes exp via Taylor series, using Brent's trick of replacing x by x/22k to accelerate convergence. Log is computed by inverting exp with Newton's method (in fact Halley's method is used now, but the principle is the same). This makes log roughly half as fast as exp at low precision, but asymptotically equal to exp. With GMPY, it becomes much faster to compute log at high precision using an AGM-based algorithm. This also implies that it becomes faster to compute exp at extremely high precision using Newton inversion of log, in an interesting reversal of roles. I have written some preliminary code for this, which I should be able to commit soon.

Several others have made great contributions to mpmath recently. Mario Pernici has improved the implementations of various elementary functions, Vinzent Steinberg has submitted some very interesting new root-finding code, and Mike Taschuk sent in a file implementing Jacobi elliptic functions for mpmath.

Felix Richter, who works for the semiconductor theory group at the physics department of the University of Rostock, shared some code for solving linear equation systems in a private correspondence. (I promised I would tidy it up and add it to mpmath, but it appears that I've been too lazy to do that yet.) He had some very interesting things to say about his research and why he wrote the code (I hope he doesn't mind that I quote a part of his mail here):

The range of problems we currently attack can be described as follows: Imagine a light beam incident on some solid body, e.g., a semiconductor crystal of some 50 micrometers length. When the light propagates into the sample it gets damped, and its strength will sooner or later become incredibly small, i.e., it can no longer be held in a double precision float variable.

However, a lot of interesting things can be calculated with the help of these electromagnetic field strength, such as predictions for the output of semiconductor lasers or signs for the much-sought-after Bose-Einstein condensation of excitons (electron-hole pairs in a semiconductor). I sure shouldn't go much more into detail here, as I suppose you're not a physicist.

Just one example: As a consistency check, the equation

1 - |r|^2 - |t|^2 = \int dx dx' A(x') chi(x,x') A(x)

should be fullfilled for any light frequency. r, t, A(x) are complex electromagnetic field strengths and just cannot be calculated with machine precision alone, not to mention further calculations based on them. (Chi is the susceptibility and describes the electromagnetic properties of the matter).

However, due to the magic of math (and mpmath ;-) ), the interplay of these incredibly small numbers yields two perfectly equal numbers between 0 and 1, as it should.

I'm really happy having chosen Python for my work. Mpmath fits in well, its so easy to use. I especially like the operator overloading, so that I may add some complex multiprecision numbers with a simple "+". And mpmath's mathematical and numerical function library is so extensive that it can take you really a long way.

It is always fun to hear from people who use software you're working on, but it's especially rewarding when you learn that the software is being used to solve real problems.

Computing a million digits of π, on the other hand, could hardly be called a real-world problem. But it is the sort of thing any competent arbitrary-precision library should be able to do without effort. Sometimes math is done to construct semiconductor lasers; sometimes math is done just because it's fun (and often there is some overlap :-).

#### 3 comments:

Alex said...

Hi Fredrik -- as you say, "It is always fun to hear from people who use software you're working on" -- as gmpy's author that's exactly how I felt when reading this;-). (And Casevh's work to make 1.03 is clearly proving just as great as it looked, too;-).

I'm particularly happy to hear about my software being used in Göteborg, a delightful city to which I've traveled dozens of times when I was a freelance a few years ago...!-)

Alex

Fredrik Johansson said...

Hi Alex, thanks for the comment (and for gmpy :-)

generic cialis said...

Interesting article, added his blog to Favorites