## Wednesday, March 17, 2010

### Hypergeneralization

The generalized hypergeometric function can itself be generalized endlessly. I have recently implemented three such extensions in mpmath: bilateral series, two-dimensional hypergeometric series, and q-analog (or "basic") hypergeometric series.

### Bilateral series

The bilateral hypergeometric series is the simplest extension, and consists of taking the usual hypergeometric series and extending the range of summation from [0,∞) to (-∞,∞):

This series only converges when |z| = 1 and A = B, but interpreted as a sum of two ordinary hypergeometric series, it can be assigned a value for arbitrary z through the analytic continuation (or Borel regularization) of the ordinary hypergeometric series, which mpmath implements. Anyway, the convergent case is the interesting one. Here one obtains, for instance, Dougall's identity for 2H2:
`>>> from mpmath import *>>> mp.dps = 25; mp.pretty = True>>> a,b,c,d = 0.5, 1.5, 2.25, 3.25>>> bihyper([a,b],[c,d],1)-14.49118026212345786148847>>> gammaprod([c,d,1-a,1-b,c+d-a-b-1],[c-a,d-a,c-b,d-b])-14.49118026212345786148847`

As an example of regularization, the divergent 1H0 series can be expressed as the sum of one 2F0 function and one 1F1 function:
`>>> a = mpf(0.25)>>> z = mpf(0.75)>>> bihyper([a], [], z)(0.2454393389657273841385582 + 0.2454393389657273841385582j)>>> hyper([a,1],[],z) + (hyper([1],[1-a],-1/z)-1)(0.2454393389657273841385582 + 0.2454393389657273841385582j)>>> hyper([a,1],[],z) + hyper([1],[2-a],-1/z)/z/(a-1)(0.2454393389657273841385582 + 0.2454393389657273841385582j)`

### Two-dimensional series

The most common hypergeometric series of two variables, i.e. a twodimensional series whose summand is a hypergeometric expression with respect to both indices separately, is the Appell F1 function, previously available in mpmath as appellf1:

However, much more general functions are possible. There are three other Appell functions: F2, F3, F4. The Horn functions are 34 distinct functions of order two, containing the Appell functions as special cases. The Kampé de Fériet function provides a generalization of Appell functions to arbitrary orders.

The new hyper2d function in mpmath can evaluate all these named functions, and more general functions still. The trick for speed is to write the series as a nested series, where the inner series is a generalized hypergeometric series that can be evaluated efficiently with hyper, and where the outer series has a rational recurrence formula. This rewriting also permits evaluating the analytic continuation with respect to the inner variable (as implemented by hyper).

The user specifies the format of the series in quasi-symbolic form, and the rewriting to nested form is done automatically by mpmath. For example, the Appell F1 function can be computed as
`hyper2d({'m+n':[a], 'm':[b1], 'n':[b2]}, {'m+n':[c]}, x, y)`

and indeed, this is essentially what appellf1 now does internally. The Appell F2-F4 functions have also been added explicitly as appellf2, appellf3, appellf4.

Hypergeometric functions of two (or more) variables have numerous applications, such as solving high-order algebraic equations, expressing various derivatives and integrals in closed form, and solving differential equations, but I have not yet found any simple examples that make good demonstrations except for F1. (I have mostly found examples of that take half a page to write down.) Any such examples for the documentation would be a welcome contribution!

Some trivial examples from the documentation are:
`>>> x, y = mpf(0.25), mpf(0.5)>>> hyper2d({'m':1,'n':1}, {}, x,y)2.666666666666666666666667>>> 1/(1-x)/(1-y)2.666666666666666666666667>>> hyper2d({'m':[1,2],'n':[3,4]}, {'m':[5],'n':[6]}, x,y)4.164358531238938319669856>>> hyp2f1(1,2,5,x)*hyp2f1(3,4,6,y)4.164358531238938319669856>>> hyper2d({'m':1,'n':1},{'m+n':1},x,y)2.013417124712514809623881>>> (exp(x)*x-exp(y)*y)/(x-y)2.013417124712514809623881`

An example of a Horn function, H3:
`>>> x, y = 0.0625, 0.125>>> a,b,c = 0.5,0.75,0.625>>> hyper2d({'2m+n':a,'n':b},{'m+n':c},x,y)1.190003093972956004227425>>> nsum(lambda m,n: rf(a,2*m+n)*rf(b,n)/rf(c,m+n)*\...     x**m*y**n/fac(m)/fac(n), [0,inf], [0,inf])1.190003093972956004227425`

This also demonstrates the recently added generic support for multidimensional infinite series in mpmath. But of course, nsum is much slower than hyper2d.

### Hypergeometric q-series

Before introducing the q-analog of the hypergeometric series, I should introduce the q-Pochhammer symbol,

This itself is a new function in mpmath, implemented as qp(a,q,n) (with two- and one-argument forms qp(a,q) and qp(q) also permitted) and is the basis for more general computation involving q-analogs. The q-factorial and q-gamma function have also been added (as qfac and qgamma), but are not yet documented.

The q-analogs have important applications in number theory. As a very neat example, numerically computing the Taylor series of 1/(q, q) with mpmath gives
`>>> taylor(lambda q: 1/qp(q), 0, 10)[1.0, 1.0, 2.0, 3.0, 5.0, 7.0, 11.0, 15.0, 22.0, 30.0, 42.0]`

These are the values of the partition function P(n) for n = 0,1,2,..., i.e. the number of ways of writing n as a sum of positive integers.

Replacing the rising factorials (Pochhammer symbols) in the generalized hypergeometric series with their q-analogs gives the hypergeometric q-series or basic hypergeometric series

This function is implemented as qhyper. Like hyper, it supports arbitrary combinations of real and complex arguments (assuming |q| < 1). Some examples from the documentation:
`>>> qhyper([0.5], [2.25], 0.25, 4)-0.1975849091263356009534385>>> qhyper([0.5], [2.25], 0.25-0.25j, 4)(2.806330244925716649839237 + 3.568997623337943121769938j)>>> qhyper([1+j], [2,3+0.5j], 0.25, 3+4j)(9.112885171773400017270226 - 1.272756997166375050700388j)`

Like hyper, it automatically ensures accurate evaluation for alternating series:
`>>> q = 0.998046875>>> mp.dps=5; qhyper([2],[0.5], q, -0.5)6.6738e-69>>> mp.dps=15; qhyper([2],[0.5], q, -0.5)6.67376764851253e-69>>> mp.dps=25; qhyper([2],[0.5], q, -0.5)6.673767648512527695718826e-69>>> mp.dps=100; qhyper([2],[0.5], q, -0.5)6.673767648512527695718826106778799352769798151218768443717704076836963752188876561888441662933081804e-69`

With the q-analog of the generalized hypergeometric function implemented, it becomes possible to compute q-exponentials, q-sines, q-orthogonal polynomials, q-Bessel functions, and pretty much anything else. If there is interest, such function could be added explicitly to mpmath.

For more information and examples of the functions discussed in this post, see the sections on hypergeometric functions and q-functions (a little terse at the moment) in the mpmath documentation.

## Friday, March 12, 2010

### Computing large zeta zeros with mpmath

Juan Arias de Reyna, who wrote the code used in mpmath for evaluating the Riemann zeta function near the critical strip, has contributed a new implementation of the zetazero function which computes the nth zero on the critical line.

The old version (written by myself) used a lookup table for initial approximations, so it was limited to computing the first few zeros. Juan's code calculates the position of arbitrary zeros using Gram's law, with a sophisticated algorithm (about 300 lines of code) to find the correct zero when Gram's law (which is actually just a heuristic) fails.

A bunch of zeros, from small to large:

`from mpmath import *from timeit import default_timer as clockmp.dps = 20for n in range(14):    t1 = clock()    v1 = zetazero(10**n)    t2 = clock()    v2 = zetazero(10**n+1)    t3 = clock()    print "10<sup>%i</sup>  "%n, v1, "%.2f" % (t2-t1)    print "10<sup>%i</sup>+1"%n, v2, "%.2f" % (t3-t2)n      value                          time (s)100    (0.5 + 14.13472514173469379j)  0.12100+1  (0.5 + 21.022039638771554993j) 0.05101    (0.5 + 49.773832477672302182j) 0.05101+1  (0.5 + 52.970321477714460644j) 0.04102    (0.5 + 236.5242296658162058j)  0.32102+1  (0.5 + 237.769820480925204j)   0.30103    (0.5 + 1419.4224809459956865j) 0.93103+1  (0.5 + 1420.416526323751136j)  0.84104    (0.5 + 9877.7826540055011428j) 3.75104+1  (0.5 + 9878.6547723856922882j) 3.71105    (0.5 + 74920.827498994186794j) 2.53105+1  (0.5 + 74921.929793958414308j) 2.58106    (0.5 + 600269.67701244495552j) 2.43106+1  (0.5 + 600270.30109071169866j) 2.89107    (0.5 + 4992381.014003178666j)  1.63107+1  (0.5 + 4992381.2627112065366j) 2.30108    (0.5 + 42653549.760951553903j) 2.36108+1  (0.5 + 42653550.046758478876j) 2.93109    (0.5 + 371870203.83702805273j) 6.98109+1  (0.5 + 371870204.36631304458j) 6.721010   (0.5 + 3293531632.3971367042j) 11.481010+1 (0.5 + 3293531632.6869557853j) 15.921011   (0.5 + 29538618431.613072811j) 53.671011+1 (0.5 + 29538618432.07777426j)  43.001012   (0.5 + 267653395648.62594824j) 162.011012+1 (0.5 + 267653395648.84752313j) 174.671013   (0.5 + 2445999556030.2468814j) 1262.551013+1 (0.5 + 2445999556030.6222451j) 1456.38`

The function correctly separates close zeros, and can optionally return information about the separation. For example:
`>>> mp.dps = 20; mp.pretty = True>>> zetazero(542964976,info=True)((0.5 + 209039046.578535272j), [542964969, 542964978], 6, '(013111110)')`

The extra information is explained in the docstring for zetazero:

[This means that the] zero is between Gram points 542964969 and 542964978, it is the 6-th zero between them. Finally (01311110) is the pattern of zeros in this interval. The numbers indicate the number of zeros in each Gram interval, (Rosser Blocks between parenthesis). In this case only one Rosser Block of length nine.

Juan reports having computed the 1015th zero, which is
`208514052006405.46942460229754774510611`

and verifying that it satisfies the Riemann hypothesis. And fortunately, the values of large zeros computed with mpmath seem to agree with those reported by Andrew Odlyzko and Xavier Gourdon.

The new zetazero function in mpmath is indeed able to separate the closest known pair of zeros, found in Xavier Gourdon's exhaustive search up to 1013:

`>>> mp.dps = 25>>> v1 = zetazero(8637740722917, info=True)>>> v2 = zetazero(8637740722918, info=True)>>> v1((0.5 + 2124447368584.392964661515j), [8637740722909L, 8637740722925L], 7L,'(1)(1)(1)(1)(1)(1)(02)(1)(02)(1)(1)(20)(1)')>>> v2((0.5 + 2124447368584.392981706039j), [8637740722909L, 8637740722925L], 8L,'(1)(1)(1)(1)(1)(1)(02)(1)(02)(1)(1)(20)(1)')>>> nprint(abs(v1[0]-v2[0]))1.70445e-5`

This computation takes over 2 hours on my laptop.

Of course, zeros can be computed to any desired precision:
`>>> mp.dps = 1000>>> print zetazero(1)(0.5 + 14.13472514173469379045725198356247027078425711569924317568556746014996342980925676494901039317156101277920297154879743676614269146988225458250536323944713778041338123720597054962195586586020055556672583601077370020541098266150754278051744259130625448197865107230493872562973832157742039521572567480933214003499046803434626731442092037738548714137831735639699536542811307968053149168852906782082298049264338666734623320078758761792005604868054356801444424651065597568665903228686510544859444320624072727032094274522213048748720924123851418351460542790152447833835425453344004487936806761697300819000731393854983736215013045167269683892003917628512321285422052396913342583227533516406016976352756375896953767492033612720925999173042707568308795118445348918008630082648312516911271068291052375961797743181517071354531677549515382893784903647470972701994848553220925357435790922612524773659551801697523346121397731600535412592674745572587780147260983080897860071253208750939599796666067537838121489190886j)`

The only real limitation of the code, as far as I can tell, is that it's impractical for computing a large number of consecutive zeros. For that one needs to use something like the Odlyzko–Schönhage algorithm for multi-evaluation of the zeta function.

A nice exercise would be to parallelize the zeta code using multiprocessing. The speedup for large zeros should be essentially linear with the number of processors.

## Thursday, March 11, 2010

### Speedups of elementary functions

I have a fairly large backlog of changes in mpmath to blog about. For today, I'll just briefly mention the most recent one, which is a set of optimizations to elementary functions. The commit is here. I intend to provide much faster Cython versions of the elementary functions some time in the future, but since there was room for optimizing the Python versions, I decided to invest a little time in that.

Most importantly, the asymptotic performance of trigonometric function has been improved greatly, due to a more optimized complexity reduction strategy. The faster strategy was previously used only for exp, but cos/sin are virtually identical in theory so fixing this was mostly a coding problem. In fact, the high-precision code for exp/cosh/sinh/cos/sin is largely shared now, which is nice because only one implementation needs to be optimized.

The following image shows performance for computing cos(3.7). The red graph is the old implementation; blue is new. The top subplot shows evaluations/second at low precision, i.e. higher is better, and the bottom subplot shows seconds/evaluation at higher precision, i.e. lower is better.

Already at a few thousand bits, the new code is more than twice as fast, and then it just gets better and better. There is also a little less overhead at low precision now, so the new code is uniformly faster.

Also exp, cosh and sinh have been optimized slightly. Most importantly, there is a little less overhead at low precision, but asymptotic speed has also improved a bit.

Performance for computing cosh(3.7):

Performance for computing exp(3.7):

I should note that the performance depends on some tuning parameters which naturally are system-specific. The results above are on a 32-bit system with gmpy as the backend. I hope to later implement optimized tuning parameters for other systems as well.

Elementary function performance is of course important globally, but it's especially important for some specific functions. For example, the speed of the Riemann zeta function depends almost proportionally on the total speed of evaluating one exp, one log, one sine, and one cosine, since the main part of the work consists of summing the truncated L-series

With the improved elementary functions, and some overhead removals to the zetasum code in the same commit, the Riemann zeta function is up to 2x faster now, and about 50% faster on the critical line.