Wednesday, March 17, 2010

Hypergeneralization

The generalized hypergeometric function can itself be generalized endlessly. I have recently implemented three such extensions in mpmath: bilateral series, two-dimensional hypergeometric series, and q-analog (or "basic") hypergeometric series.

Bilateral series

The bilateral hypergeometric series AHB is the simplest extension: one takes the usual hypergeometric series and extends the range of summation from [0,∞) to (-∞,∞), so the Pochhammer symbols (a)_n in the summand run over all integers n. This series only converges when |z| = 1 and A = B, but interpreted as a sum of two ordinary hypergeometric series, it can be assigned a value for arbitrary z through the analytic continuation (or Borel regularization) of the ordinary hypergeometric series, which mpmath implements. Anyway, the convergent case is the interesting one. Here one obtains, for instance, Dougall's identity for 2H2:

>>> from mpmath import *
>>> mp.dps = 25; mp.pretty = True
>>> a,b,c,d = 0.5, 1.5, 2.25, 3.25
>>> bihyper([a,b],[c,d],1)
-14.49118026212345786148847
>>> gammaprod([c,d,1-a,1-b,c+d-a-b-1],[c-a,d-a,c-b,d-b])
-14.49118026212345786148847

As an example of regularization, the divergent 1H0 series can be expressed as the sum of one 2F0 function and one 1F1 function:

>>> a = mpf(0.25)
>>> z = mpf(0.75)
>>> bihyper([a], [], z)
(0.2454393389657273841385582 + 0.2454393389657273841385582j)
>>> hyper([a,1],[],z) + (hyper([],[1-a],-1/z)-1)
(0.2454393389657273841385582 + 0.2454393389657273841385582j)
>>> hyper([a,1],[],z) + hyper([],[2-a],-1/z)/z/(a-1)
(0.2454393389657273841385582 + 0.2454393389657273841385582j)

Two-dimensional series

The most common hypergeometric series of two variables, i.e. a two-dimensional series whose summand is a hypergeometric expression with respect to both indices separately, is the Appell F1 function, previously available in mpmath as appellf1. However, much more general functions are possible. There are three other Appell functions: F2, F3 and F4. The Horn functions are 34 distinct functions of order two, containing the Appell functions as special cases. The Kampé de Fériet function generalizes the Appell functions to arbitrary orders.

The new hyper2d function in mpmath can evaluate all these named functions, and more general functions still. The trick for speed is to write the series as a nested series, where the inner series is a generalized hypergeometric series that can be evaluated efficiently with hyper, and where the outer series has a rational recurrence formula. This rewriting also permits evaluating the analytic continuation with respect to the inner variable (as implemented by hyper).
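
To illustrate the nesting idea (a sketch of my own, not mpmath's actual implementation), Appell F1 can be evaluated as an outer series over m whose inner series over n collapses to an ordinary 2F1, using (a)_{m+n} = (a)_m (a+m)_n:

```python
from mpmath import mp, mpf, rf, fac, hyp2f1, appellf1, nsum, inf

mp.dps = 25

def f1_nested(a, b1, b2, c, x, y):
    # Outer series over m; the inner series over n is an ordinary 2F1
    # in y, which hyp2f1 evaluates efficiently (with analytic
    # continuation in y where needed)
    return nsum(lambda m: rf(a,m)*rf(b1,m)/rf(c,m)*x**m/fac(m)
                * hyp2f1(a+m, b2, c+m, y), [0, inf])

a, b1, b2, c = mpf('0.5'), mpf('0.75'), mpf('1.25'), mpf('2.5')
x, y = mpf('0.25'), mpf('0.125')
print(f1_nested(a, b1, b2, c, x, y))
print(appellf1(a, b1, b2, c, x, y))
```

The outer terms decay geometrically like x^m, so the outer sum is cheap; all the heavy lifting sits in the efficient inner hyp2f1 calls.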

The user specifies the format of the series in quasi-symbolic form, and the rewriting to nested form is done automatically by mpmath. For example, the Appell F1 function can be computed as

hyper2d({'m+n':[a], 'm':[b1], 'n':[b2]}, {'m+n':[c]}, x, y)

and indeed, this is essentially what appellf1 now does internally. The Appell F2-F4 functions have also been added explicitly as appellf2, appellf3, appellf4.

Hypergeometric functions of two (or more) variables have numerous applications, such as solving high-order algebraic equations, expressing various derivatives and integrals in closed form, and solving differential equations, but I have not yet found any simple examples that make good demonstrations except for F1. (I have mostly found examples that take half a page to write down.) Any such examples for the documentation would be a welcome contribution!

Some trivial examples from the documentation are:

>>> x, y = mpf(0.25), mpf(0.5)
>>> hyper2d({'m':1,'n':1}, {}, x,y)
2.666666666666666666666667
>>> 1/(1-x)/(1-y)
2.666666666666666666666667
>>> hyper2d({'m':[1,2],'n':[3,4]}, {'m':[5],'n':[6]}, x,y)
4.164358531238938319669856
>>> hyp2f1(1,2,5,x)*hyp2f1(3,4,6,y)
4.164358531238938319669856
>>> hyper2d({'m':1,'n':1},{'m+n':1},x,y)
2.013417124712514809623881
>>> (exp(x)*x-exp(y)*y)/(x-y)
2.013417124712514809623881

An example of a Horn function, H3:

>>> x, y = 0.0625, 0.125
>>> a,b,c = 0.5,0.75,0.625
>>> hyper2d({'2m+n':a,'n':b},{'m+n':c},x,y)
1.190003093972956004227425
>>> nsum(lambda m,n: rf(a,2*m+n)*rf(b,n)/rf(c,m+n)*\
... x**m*y**n/fac(m)/fac(n), [0,inf], [0,inf])
1.190003093972956004227425

This also demonstrates the recently added generic support for multidimensional infinite series in mpmath. But of course, nsum is much slower than hyper2d.

Hypergeometric q-series

Before introducing the q-analog of the hypergeometric series, I should introduce the q-Pochhammer symbol (a; q)_n = (1-a)(1-aq)(1-aq^2)···(1-aq^(n-1)). This is itself a new function in mpmath, implemented as qp(a,q,n) (with the two- and one-argument forms qp(a,q) and qp(q) also permitted), and it is the basis for more general computation involving q-analogs. The q-factorial and q-gamma function have also been added (as qfac and qgamma), but are not yet documented.
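
As a quick check of my own (not from the documentation), the three-argument form agrees with the defining finite product:

```python
from mpmath import mp, mpf, qp, fprod

mp.dps = 25
a, q, n = mpf('0.5'), mpf('0.25'), 8
# (a; q)_n = (1-a)(1-a*q)(1-a*q**2)...(1-a*q**(n-1))
manual = fprod(1 - a*q**k for k in range(n))
print(qp(a, q, n))
print(manual)
```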

The q-analogs have important applications in number theory. As a very neat example, numerically computing the Taylor series of 1/(q; q)∞ (computed as 1/qp(q)) with mpmath gives

>>> taylor(lambda q: 1/qp(q), 0, 10)
[1.0, 1.0, 2.0, 3.0, 5.0, 7.0, 11.0, 15.0, 22.0, 30.0, 42.0]

These are the values of the partition function P(n) for n = 0,1,2,..., i.e. the number of ways of writing n as a sum of positive integers.
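
These coefficients can be cross-checked against an elementary dynamic-programming count of partitions (a standard algorithm, not part of mpmath):

```python
from mpmath import mp, qp, taylor

mp.dps = 25
coeffs = taylor(lambda q: 1/qp(q), 0, 10)

# Count partitions of 0..10 with a classic coin-change style recurrence:
# P[total] accumulates the number of partitions using parts <= part
N = 10
P = [1] + [0]*N
for part in range(1, N+1):
    for total in range(part, N+1):
        P[total] += P[total-part]

print(P)
print([float(c) for c in coeffs])
```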

Replacing the rising factorials (Pochhammer symbols) in the generalized hypergeometric series with their q-analogs gives the hypergeometric q-series, or basic hypergeometric series. This function is implemented as qhyper. Like hyper, it supports arbitrary combinations of real and complex arguments (assuming |q| < 1). Some examples from the documentation:

>>> qhyper([0.5], [2.25], 0.25, 4)
-0.1975849091263356009534385
>>> qhyper([0.5], [2.25], 0.25-0.25j, 4)
(2.806330244925716649839237 + 3.568997623337943121769938j)
>>> qhyper([1+j], [2,3+0.5j], 0.25, 3+4j)
(9.112885171773400017270226 - 1.272756997166375050700388j)
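
As a sketch of the definition (my own check, not from the post): for a 2φ1 series, where r = s+1 so the extra (-1)^n q^(n(n-1)/2) factor in the general definition drops out, the sum can be reproduced term by term with qp:

```python
from mpmath import mp, mpf, qp, qhyper, fsum

mp.dps = 25
a1, a2, b1 = mpf('0.25'), mpf('0.5'), mpf('2.25')
q, z = mpf('0.25'), mpf('0.75')

# Partial sum of the 2phi1 series: terms decay like z**n, so a few
# hundred terms suffice at 25 digits
partial = fsum(qp(a1,q,n)*qp(a2,q,n)/(qp(b1,q,n)*qp(q,q,n)) * z**n
               for n in range(250))
print(partial)
print(qhyper([a1, a2], [b1], q, z))
```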

Like hyper, it automatically ensures accurate evaluation for alternating series:

>>> q = 0.998046875
>>> mp.dps=5; qhyper([],[0.5], q, -0.5)
6.6738e-69
>>> mp.dps=15; qhyper([],[0.5], q, -0.5)
6.67376764851253e-69
>>> mp.dps=25; qhyper([],[0.5], q, -0.5)
6.673767648512527695718826e-69
>>> mp.dps=100; qhyper([],[0.5], q, -0.5)
6.673767648512527695718826106778799352769798151218768443717704076836963752188876
561888441662933081804e-69

With the q-analog of the generalized hypergeometric function implemented, it becomes possible to compute q-exponentials, q-sines, q-orthogonal polynomials, q-Bessel functions, and pretty much anything else. If there is interest, such functions could be added explicitly to mpmath.
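
For instance (an illustration of my own, using the q-binomial theorem), the small q-exponential e_q(z) = Σ z^n/(q;q)_n is just the 1φ0 series with upper parameter 0, and equals 1/(z;q)∞ for |z| < 1:

```python
from mpmath import mp, mpf, qhyper, qp

mp.dps = 25
q, z = mpf('0.5'), mpf('0.25')
# e_q(z) = 1phi0(0; -; q, z), since (0;q)_n = 1; the q-binomial theorem
# gives the closed form 1/(z; q)_inf for |z| < 1
lhs = qhyper([0], [], q, z)
rhs = 1/qp(z, q)
print(lhs)
print(rhs)
```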

For more information and examples of the functions discussed in this post, see the sections on hypergeometric functions and q-functions (a little terse at the moment) in the mpmath documentation.

Friday, March 12, 2010

Computing large zeta zeros with mpmath

Juan Arias de Reyna, who wrote the code used in mpmath for evaluating the Riemann zeta function near the critical strip, has contributed a new implementation of the zetazero function which computes the nth zero on the critical line.

The old version (written by myself) used a lookup table for initial approximations, so it was limited to computing the first few zeros. Juan's code calculates the position of arbitrary zeros using Gram's law, with a sophisticated algorithm (about 300 lines of code) to find the correct zero when Gram's law (which is actually just a heuristic) fails.
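
As a quick sanity check (my example, not from the post), a computed zero lies exactly on the critical line and zeta indeed vanishes there:

```python
from mpmath import mp, zetazero, zeta

mp.dps = 20
rho = zetazero(100)      # the 100th nontrivial zero
print(rho)
print(abs(zeta(rho)))    # tiny residual at working precision
```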

A bunch of zeros, from small to large:

from mpmath import *
from timeit import default_timer as clock
mp.dps = 20
for n in range(14):
    t1 = clock()
    v1 = zetazero(10**n)
    t2 = clock()
    v2 = zetazero(10**n+1)
    t3 = clock()
    print "10^%i  " % n, v1, "%.2f" % (t2-t1)
    print "10^%i+1" % n, v2, "%.2f" % (t3-t2)

n        value                                 time (s)
10^0     (0.5 + 14.13472514173469379j)         0.12
10^0+1   (0.5 + 21.022039638771554993j)        0.05
10^1     (0.5 + 49.773832477672302182j)        0.05
10^1+1   (0.5 + 52.970321477714460644j)        0.04
10^2     (0.5 + 236.5242296658162058j)         0.32
10^2+1   (0.5 + 237.769820480925204j)          0.30
10^3     (0.5 + 1419.4224809459956865j)        0.93
10^3+1   (0.5 + 1420.416526323751136j)         0.84
10^4     (0.5 + 9877.7826540055011428j)        3.75
10^4+1   (0.5 + 9878.6547723856922882j)        3.71
10^5     (0.5 + 74920.827498994186794j)        2.53
10^5+1   (0.5 + 74921.929793958414308j)        2.58
10^6     (0.5 + 600269.67701244495552j)        2.43
10^6+1   (0.5 + 600270.30109071169866j)        2.89
10^7     (0.5 + 4992381.014003178666j)         1.63
10^7+1   (0.5 + 4992381.2627112065366j)        2.30
10^8     (0.5 + 42653549.760951553903j)        2.36
10^8+1   (0.5 + 42653550.046758478876j)        2.93
10^9     (0.5 + 371870203.83702805273j)        6.98
10^9+1   (0.5 + 371870204.36631304458j)        6.72
10^10    (0.5 + 3293531632.3971367042j)        11.48
10^10+1  (0.5 + 3293531632.6869557853j)        15.92
10^11    (0.5 + 29538618431.613072811j)        53.67
10^11+1  (0.5 + 29538618432.07777426j)         43.00
10^12    (0.5 + 267653395648.62594824j)        162.01
10^12+1  (0.5 + 267653395648.84752313j)        174.67
10^13    (0.5 + 2445999556030.2468814j)        1262.55
10^13+1  (0.5 + 2445999556030.6222451j)        1456.38

The function correctly separates close zeros, and can optionally return information about the separation. For example:

>>> mp.dps = 20; mp.pretty = True
>>> zetazero(542964976,info=True)
((0.5 + 209039046.578535272j), [542964969, 542964978], 6, '(013111110)')

The extra information is explained in the docstring for zetazero:

[This means that the] zero is between Gram points 542964969 and 542964978; it is the 6th zero between them. Finally, '(013111110)' is the pattern of zeros in this interval: the numbers indicate how many zeros lie in each Gram interval, with Rosser blocks between parentheses. In this case there is only one Rosser block, of length nine.

Juan reports having computed the 10^15-th zero, which is

208514052006405.46942460229754774510611

and verifying that it satisfies the Riemann hypothesis. And fortunately, the values of large zeros computed with mpmath seem to agree with those reported by Andrew Odlyzko and Xavier Gourdon.

The new zetazero function in mpmath is indeed able to separate the closest known pair of zeros, found in Xavier Gourdon's exhaustive search up to 10^13:

>>> mp.dps = 25
>>> v1 = zetazero(8637740722917, info=True)
>>> v2 = zetazero(8637740722918, info=True)
>>> v1
((0.5 + 2124447368584.392964661515j), [8637740722909L, 8637740722925L], 7L,
'(1)(1)(1)(1)(1)(1)(02)(1)(02)(1)(1)(20)(1)')
>>> v2
((0.5 + 2124447368584.392981706039j), [8637740722909L, 8637740722925L], 8L,
'(1)(1)(1)(1)(1)(1)(02)(1)(02)(1)(1)(20)(1)')
>>> nprint(abs(v1[0]-v2[0]))
1.70445e-5

This computation takes over 2 hours on my laptop.

Of course, zeros can be computed to any desired precision:

>>> mp.dps = 1000
>>> print zetazero(1)
(0.5 + 14.1347251417346937904572519835624702707842571156992431756855674601499634
29809256764949010393171561012779202971548797436766142691469882254582505363239447
13778041338123720597054962195586586020055556672583601077370020541098266150754278
05174425913062544819786510723049387256297383215774203952157256748093321400349904
68034346267314420920377385487141378317356396995365428113079680531491688529067820
82298049264338666734623320078758761792005604868054356801444424651065597568665903
22868651054485944432062407272703209427452221304874872092412385141835146054279015
24478338354254533440044879368067616973008190007313938549837362150130451672696838
92003917628512321285422052396913342583227533516406016976352756375896953767492033
61272092599917304270756830879511844534891800863008264831251691127106829105237596
17977431815170713545316775495153828937849036474709727019948485532209253574357909
22612524773659551801697523346121397731600535412592674745572587780147260983080897
860071253208750939599796666067537838121489190886j)

The only real limitation of the code, as far as I can tell, is that it's impractical for computing a large number of consecutive zeros. For that one needs to use something like the Odlyzko–Schönhage algorithm for multi-evaluation of the zeta function.

A nice exercise would be to parallelize the zeta code using multiprocessing. The speedup for large zeros should be essentially linear with the number of processors.

Thursday, March 11, 2010

Speedups of elementary functions

I have a fairly large backlog of changes in mpmath to blog about. For today, I'll just briefly mention the most recent one, which is a set of optimizations to elementary functions. The commit is here. I intend to provide much faster Cython versions of the elementary functions some time in the future, but since there was room for optimizing the Python versions, I decided to invest a little time in that.

Most importantly, the asymptotic performance of the trigonometric functions has been improved greatly, due to a more optimized reduction strategy. The faster strategy was previously used only for exp, but cos/sin are virtually identical in theory, so fixing this was mostly a coding problem. In fact, the high-precision code for exp/cosh/sinh/cos/sin is now largely shared, which is nice because only one implementation needs to be optimized.

The following image shows performance for computing cos(3.7). The red graph is the old implementation; blue is new. The top subplot shows evaluations/second at low precision, i.e. higher is better, and the bottom subplot shows seconds/evaluation at higher precision, i.e. lower is better. Already at a few thousand bits, the new code is more than twice as fast, and then it just gets better and better. There is also a little less overhead at low precision now, so the new code is uniformly faster.

Also exp, cosh and sinh have been optimized slightly. Most importantly, there is a little less overhead at low precision, but asymptotic speed has also improved a bit.

Similar plots of the performance for computing cosh(3.7) and exp(3.7) show the same pattern. I should note that the performance depends on some tuning parameters, which naturally are system-specific. The results above are from a 32-bit system with gmpy as the backend. I hope to later implement optimized tuning parameters for other systems as well.

Elementary function performance is of course important globally, but it's especially important for some specific functions. For example, the speed of the Riemann zeta function depends almost proportionally on the total speed of evaluating one exp, one log, one sine, and one cosine, since the main part of the work consists of summing the truncated L-series Σ n^(-s) over n = 1, 2, ..., N. With the improved elementary functions, and some overhead removals in the zetasum code in the same commit, the Riemann zeta function is up to 2x faster now, and about 50% faster on the critical line.
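
To make the cost concrete (a sketch of my own, not the actual zetasum internals), each term n^(-s) with s = σ + it can be assembled from exactly one log, one exp, one cos and one sin:

```python
from mpmath import mp, mpc, exp, log, sin, cos, fsum, power

mp.dps = 25
s = mpc(0.5, 14)   # sample point on the critical line
N = 50             # illustrative truncation point only

def lterm(n):
    # n**(-s) = exp(-sigma*ln n) * (cos(t*ln n) - i*sin(t*ln n)):
    # one log, one exp, one cos, one sin per term
    ln = log(n)
    r = exp(-s.real*ln)
    return mpc(r*cos(s.imag*ln), -r*sin(s.imag*ln))

direct = fsum(lterm(n) for n in range(1, N+1))
check  = fsum(power(n, -s) for n in range(1, N+1))
print(direct)
print(check)
```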