Wednesday, September 22, 2010

Again, mpmath in Sage is about to get faster

My summer project on special functions in mpmath and Sage, generously supported by William Stein with funds from NSF grant DMS-0757627, is nearing completion. I will soon release mpmath-0.16, which contains lots of new special functions and bugfixes. Sage users will also benefit from ~1500 lines of new Cython code (preliminary patch here) that speeds up various basic operations. Executing mpmath.runtests() in Sage on my laptop now takes 10.47 seconds (8.60 from a warm cache), compared to 14.21 (11.84) seconds with the new extensions disabled -- a global speedup of 30%.

For comparison, pure-Python mpmath with gmpy as the backend takes 21.46 (18.72) seconds to execute the unit tests and pure-Python mpmath with the pure-Python backend takes 52.33 (45.92) seconds.

Specifically, the new extension code implements exp for real and complex arguments, cos, sin and ln for real arguments, complex exponentiation in some cases, and summation of hypergeometric series, entirely in Cython.

Timings before (new extensions disabled):

sage: import mpmath
sage: x = mpmath.mpf(0.37)
sage: y = mpmath.mpf(0.49)
sage: %timeit mpmath.exp(x)
625 loops, best of 3: 14.5 µs per loop
sage: %timeit mpmath.ln(x)
625 loops, best of 3: 23.2 µs per loop
sage: %timeit mpmath.cos(x)
625 loops, best of 3: 17.2 µs per loop
sage: %timeit x ^ y
625 loops, best of 3: 39.9 µs per loop
sage: %timeit mpmath.hyp1f1(2r,3r,4r)
625 loops, best of 3: 90.3 µs per loop
sage: %timeit mpmath.hyp1f1(x,y,x)
625 loops, best of 3: 83.6 µs per loop
sage: %timeit mpmath.hyp1f1(x,y,mpmath.mpc(x,y))
625 loops, best of 3: 136 µs per loop


Timings after (new extensions enabled):

sage: import mpmath
sage: x = mpmath.mpf(0.37)
sage: y = mpmath.mpf(0.49)
sage: %timeit mpmath.exp(x)
625 loops, best of 3: 2.72 µs per loop
sage: %timeit mpmath.ln(x)
625 loops, best of 3: 7.25 µs per loop
sage: %timeit mpmath.cos(x)
625 loops, best of 3: 4.13 µs per loop
sage: %timeit x ^ y
625 loops, best of 3: 10.5 µs per loop
sage: %timeit mpmath.hyp1f1(2r,3r,4r)
625 loops, best of 3: 47.1 µs per loop
sage: %timeit mpmath.hyp1f1(x,y,x)
625 loops, best of 3: 59.4 µs per loop
sage: %timeit mpmath.hyp1f1(x,y,mpmath.mpc(x,y))
625 loops, best of 3: 83.1 µs per loop


The new elementary functions use a combination of custom algorithms and straightforward MPFR wrappers. Why not just wrap MPFR for everything? There are two primary reasons:

Firstly, because MPFR numbers have a limited range, custom code still needs to be used in the overflowing cases, and this is almost as much work as an implementation-from-scratch. (There are also some more minor incompatibilities, like lack of round-away-from-zero in MPFR, that result in a lot of extra work.)

Secondly, MPFR is not always fast (or as fast as it could be), so it pays off to write custom code. In fact, some of the ordinary Python implementations of functions in mpmath are faster than their MPFR counterparts in various cases, although that is rather exceptional (atan is an example). But generally, at low-mid precisions, it is possible to be perhaps 2-4x faster than MPFR with carefully optimized C code (see fastfunlib). This is a longer-term goal.

Already now, with the new extension code, the mpmath exponential function becomes faster than the Sage RealNumber version (based on MPFR) at low precision:

sage: %timeit mpmath.exp(x)
625 loops, best of 3: 2.75 µs per loop
sage: w = RealField(53)(x)
sage: %timeit w.exp()
625 loops, best of 3: 5.57 µs per loop


As the timings above indicate, hypergeometric series have gotten up to 2x faster. The speedup of the actual summation is much larger, but much of that gain is lost in various Python overheads (more work can be done on this). There should be a noticeable speedup for some hypergeometric function computations, while others will not benefit as much, for the moment.

Another benchmark is the extratest_zeta.py script in mpmath, which exercises the mpmath implementation of the Riemann-Siegel formula for evaluation of ζ(s) for complex s with large imaginary part. Such computations largely depend on elementary function performance (cos, sin, exp, log).

Here are the new timings for mpmath in Sage:

fredrik@scv:~/sage$ ./sage /home/fredrik/mp/mpmath/tests/extratest_zeta.py
399999999 156762524.675 ok = True (time = 1.144)
241389216 97490234.2277 ok = True (time = 9.271)
526196239 202950727.691 ok = True (time = 1.671)
542964976 209039046.579 ok = True (time = 1.189)
1048449112 388858885.231 ok = True (time = 1.774)
1048449113 388858885.384 ok = True (time = 1.604)
1048449114 388858886.002 ok = True (time = 2.096)
1048449115 388858886.002 ok = True (time = 2.587)
1048449116 388858886.691 ok = True (time = 1.546)

This is mpmath in Sage with the new extension code disabled:

fredrik@scv:~/sage$ ./sage /home/fredrik/mp/mpmath/tests/extratest_zeta.py
399999999 156762524.675 ok = True (time = 2.352)
241389216 97490234.2277 ok = True (time = 14.088)
526196239 202950727.691 ok = True (time = 3.036)
542964976 209039046.579 ok = True (time = 2.104)
1048449112 388858885.231 ok = True (time = 3.707)
1048449113 388858885.384 ok = True (time = 3.283)
1048449114 388858886.002 ok = True (time = 4.444)
1048449115 388858886.002 ok = True (time = 5.592)
1048449116 388858886.691 ok = True (time = 3.101)

This is mpmath in ordinary Python mode, using gmpy:

fredrik@scv:~/sage$ python /home/fredrik/mp/mpmath/tests/extratest_zeta.py
399999999 156762524.675 ok = True (time = 2.741)
241389216 97490234.2277 ok = True (time = 13.842)
526196239 202950727.691 ok = True (time = 3.124)
542964976 209039046.579 ok = True (time = 2.143)
1048449112 388858885.231 ok = True (time = 3.257)
1048449113 388858885.384 ok = True (time = 2.912)
1048449114 388858886.002 ok = True (time = 3.953)
1048449115 388858886.002 ok = True (time = 4.964)
1048449116 388858886.691 ok = True (time = 2.762)

With the new extension code, it appears that zeta computations are up to about twice as fast. This speedup could be made much larger as there still is a significant amount of Python overhead left to remove -- also a project for the future.

No comments: