**TL;DR**

The current state of my code is available at

https://github.com/lllamnyp/Faddeeva/blob/master/Faddeeva.m

Minor updates (03.08.17):

- I replaced the `Total[Divide[..., ...]]` in the definition of `Faddeeva` for `Abs[z] < 10` with a chain of two `Dot` products; it is now slightly faster.
- When passing `CompilationTarget -> "C"` to `Faddeeva` I kept getting a bunch of `::cmperr` warnings, and as a result a `LibraryFunction` was not being generated. After removing `RuntimeOptions -> "Speed"` it now works properly and can be inlined in further compiled functions.
- I disabled symbolic evaluation. Now things like `Plot[Faddeeva[x] // ReIm, {x, -5, 5}, Evaluated -> True]` work properly.

Like the authors of the aforementioned Faddeeva package, I begin by implementing the scaled real and imaginary error functions of real arguments:

$$ \mathrm{ErfcxRe}[x] = \mathrm{Exp}[x^2]\mathrm{Erfc}[x],$$

$$ \mathrm{ErfcxIm}[x] = \mathrm{Exp}[-x^2]\mathrm{Erfi}[x].$$
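As a quick cross-check of these definitions, here is a Python sketch (my own, not part of the package); `erfcx_im` evaluates `Erfi` by plain trapezoidal quadrature, which is only practical for small arguments:

```python
import math

def erfcx_re(x):
    """ErfcxRe[x] = Exp[x^2] Erfc[x]; erfc is available in the stdlib."""
    return math.exp(x * x) * math.erfc(x)

def erfcx_im(x, n=2000):
    """ErfcxIm[x] = Exp[-x^2] Erfi[x], with
    Erfi[x] = (2/Sqrt[Pi]) Integrate[Exp[t^2], {t, 0, x}]
    evaluated by the trapezoidal rule (adequate only for small x)."""
    h = x / n
    integral = h * (0.5 + sum(math.exp((k * h) ** 2) for k in range(1, n))
                    + 0.5 * math.exp(x * x))
    return math.exp(-x * x) * (2 / math.sqrt(math.pi)) * integral

assert erfcx_re(0.0) == 1.0                    # Erfc[0] == 1
assert abs(erfcx_im(1.0) - 0.6071576) < 1e-4   # Exp[-1] Erfi[1]
```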

Later on I realized that the former is compilable and seems to be faster anyhow, so I ended up not using it in the implementation of `Faddeeva`, but it is nonetheless present in the package. It is implemented in a manner quite similar to the implementation of `ErfcxIm`, as I detail below.

```
ErfcxIm = With[{lookupErfcxIm = lookupErfcxIm},
   Compile[{{x, _Real}},
    Block[{res = 0., xAbs = Abs[x]},
     If[xAbs >= 5.*^7, Return[Divide[0.5641895835477563`, x]]];
     If[xAbs > 48.9,
      With[{x2 = x*x},
       Return[Divide[
         0.5641895835477563` x (-558 + x2 (740 + x2 (-216 + 16 x2))),
         105 + x2 (-840 + x2 (840 + x2 (-224 + 16 x2)))]]]];
     If[x == 0., Return[res]];
     res = With[{lookupTable = lookupErfcxIm, y = Divide[1, 1 + xAbs]},
       With[{n = Subtract[Floor[100 y], 1]},
        With[{yOff = y - 0.015 - 0.01 n},
         Fold[# yOff + #2 &, Reverse[lookupTable[[n]]]]
         ]]];
     res * Sign[x]
     ],
    RuntimeAttributes -> {Listable}, RuntimeOptions -> "Speed",
    Parallelization -> True, CompilationTarget -> "C"]
   ];
```

`0.5641895835477563` is `1/Sqrt[Pi]`. For very large arguments (`> 5.*^7`) a `1/x` dependence is within machine precision; for smaller ones the continued-fraction approximation works well. For yet smaller arguments the function is reparametrized as `y -> 1/(1 + x)` (the same approach as in the linked Faddeeva C package); the `y` range is subdivided into 50 intervals, and each is fitted with a 7th-degree polynomial of `y` such that adjacent intervals have matching derivatives up to 3rd order. The polynomials are taken in `HornerForm` and their coefficients are stored in the `lookupTable`, a `{50, 8}` matrix. The polynomial is reconstructed by the `Fold` statement. Using Mathematica's arbitrary-precision arithmetic I have verified that this gives results very close to machine precision.
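The large-argument branches can also be sanity-checked outside Mathematica. Below is a Python sketch (my own cross-check, not part of the package): `erfcx_im_tail` transcribes the rational form from the code above, and `erfcx_im_asym` is the standard asymptotic series for $e^{-x^2}\,\mathrm{erfi}(x)$, used here as an independent reference.

```python
import math

SQRT_PI_INV = 0.5641895835477563  # 1/Sqrt[Pi]

def erfcx_im_tail(x):
    """Rational (continued-fraction) form used for x > 48.9, copied from the package."""
    x2 = x * x
    num = SQRT_PI_INV * x * (-558 + x2 * (740 + x2 * (-216 + 16 * x2)))
    den = 105 + x2 * (-840 + x2 * (840 + x2 * (-224 + 16 * x2)))
    return num / den

def erfcx_im_asym(x, terms=8):
    """Reference: Exp[-x^2] Erfi[x] ~ (1/(Sqrt[Pi] x)) Sum (2k-1)!!/(2 x^2)^k."""
    s, t = 1.0, 1.0
    for k in range(1, terms):
        t *= (2 * k - 1) / (2 * x * x)
        s += t
    return SQRT_PI_INV * s / x

# Both agree essentially to machine precision in the tail region
for x in (60.0, 100.0, 1000.0):
    assert abs(erfcx_im_tail(x) / erfcx_im_asym(x) - 1) < 1e-13
```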

This function is useful for one special case. It is used for calculating the Faddeeva function of purely real arguments (in optical spectroscopy this corresponds to a Gaussian absorption line).

The general form of the Faddeeva function for smallish complex arguments, as given in the linked pre-print, is the following (I have factored $e^{-x^2}$ out of the sums):

$$ \mathrm{Re}[w(z)] = e^{-x^2} \left\{\mathrm{ErfcxRe}(y) \cos(2xy) + 2 a \left[ x \sin(xy)\, \mathrm{sinc}(xy) - y \left(\cos(2xy)\,\Sigma_1 - \Sigma_2/2 - \Sigma_3/2\right)\right]/\pi\right\} $$

$$ \mathrm{Im}[w(z)] = e^{-x^2} \left\{-\mathrm{ErfcxRe}(y) \sin(2xy) + 2 a \left[ x\, \mathrm{sinc}(2xy) + y \sin(2xy)\,\Sigma_1 - \Sigma_4/2 + \Sigma_5/2\right]/\pi\right\} $$

where the $\Sigma_i$ are

$$ \left\{\Sigma_1, \Sigma_2, \Sigma_3, \Sigma_4, \Sigma_5\right\} = \\
\sum_{n=1}^\infty \frac{\exp(-a^2n^2)}{a^2n^2+y^2}\left\{1, e^{-2anx}, e^{2anx}, an e^{-2anx}, an e^{2anx} \right\}$$

I slightly simplify the sums by redefining:

$$ 2\Sigma_1 = \sum_{n=-\infty}^\infty \frac{\exp(-a^2n^2)}{a^2n^2+y^2} = \Sigma$$

$$ \Sigma_2 + \Sigma_3 = \sum_{n=-\infty}^\infty \frac{\exp(2anx-a^2n^2)}{a^2n^2+y^2} = \Sigma_{23}$$

$$ \Sigma_5 - \Sigma_4 = \sum_{n=-\infty}^\infty \frac{an \exp(2anx-a^2n^2)}{a^2n^2+y^2} = \Sigma_{45}$$

In all the above sums the `n == 0` term is excluded.
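These identities are elementary (the $n \to -n$ halves of the symmetric sums reproduce the one-sided $\Sigma_i$), but they are cheap to verify numerically. A Python sketch, using the same `a == 0.25` and `|n| <= 106` truncation as later in the package:

```python
import math

a, x, y, N = 0.25, 1.3, 0.7, 106  # sample point; a and truncation as in the package

def base(n):
    """exp(-a^2 n^2) / (a^2 n^2 + y^2), the common factor of all the sums."""
    return math.exp(-a * a * n * n) / (a * a * n * n + y * y)

# One-sided sums as defined in the pre-print (n >= 1)
S2 = sum(base(n) * math.exp(-2 * a * n * x) for n in range(1, N + 1))
S3 = sum(base(n) * math.exp( 2 * a * n * x) for n in range(1, N + 1))
S4 = sum(a * n * base(n) * math.exp(-2 * a * n * x) for n in range(1, N + 1))
S5 = sum(a * n * base(n) * math.exp( 2 * a * n * x) for n in range(1, N + 1))

# Two-sided (n != 0) sums from the simplification
S23 = sum(base(n) * math.exp(2 * a * n * x) for n in range(-N, N + 1) if n != 0)
S45 = sum(a * n * base(n) * math.exp(2 * a * n * x) for n in range(-N, N + 1) if n != 0)

assert math.isclose(S2 + S3, S23, rel_tol=1e-12)
assert math.isclose(S5 - S4, S45, rel_tol=1e-12)
```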

There are several corresponding terms in the real and imaginary parts which can be nicely converted to exponential form:

$$ w(z) = e^{-x^2} \left\{\mathrm{ErfcxRe}(y)e^{-2ixy} +
a\left[2 ix\, \mathrm{sinc}(xy)e^{-ixy} - y e^{-2ixy} \Sigma + y \Sigma_{23} + i\Sigma_{45}\right]/\pi\right\}$$

$a$ is a parameter used in an approximation of $e^{-t^2}$ that makes the integral defining $w(z)$ analytically solvable. The pre-print shows nicely that `a < .5` is more than sufficient to achieve machine precision. In my implementation I use `a == .25`.
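As a cross-check of the combined formula, here is a direct Python transcription (my own sketch, stdlib only, valid for $\mathrm{Im}\,z > 0$ and smallish $|z|$), compared against the defining integral $w(z) = \frac{i}{\pi}\int_{-\infty}^{\infty} \frac{e^{-t^2}}{z-t}\,dt$; the trapezoidal rule evaluates this smooth, rapidly decaying integrand essentially to machine precision:

```python
import math, cmath

A, N = 0.25, 106  # same a and table size as in the package

def sinc(t):
    return 1.0 if t == 0.0 else math.sin(t) / t

def w_series(z):
    """Combined-formula approximation of w(z); assumes Im z > 0 and smallish |z|."""
    x, y = z.real, z.imag
    erfcx_y = math.exp(y * y) * math.erfc(y)  # ErfcxRe(y)
    S = S23 = S45 = 0.0
    for n in range(-N, N + 1):
        if n == 0:
            continue
        t = math.exp(-A * A * n * n) / (A * A * n * n + y * y)
        e = math.exp(2 * A * n * x)
        S += t
        S23 += t * e
        S45 += A * n * t * e
    bracket = (2j * x * sinc(x * y) * cmath.exp(-1j * x * y)
               - y * cmath.exp(-2j * x * y) * S + y * S23 + 1j * S45)
    return math.exp(-x * x) * (erfcx_y * cmath.exp(-2j * x * y) + A * bracket / math.pi)

def w_integral(z, h=0.005, T=12.0):
    """Reference value from w(z) = (i/pi) Integrate[Exp[-t^2]/(z - t), {t, -Inf, Inf}]."""
    n = int(2 * T / h)
    s = 0j
    for k in range(n + 1):
        t = -T + k * h
        weight = 0.5 if k in (0, n) else 1.0
        s += weight * cmath.exp(-t * t) / (z - t)
    return 1j / math.pi * s * h

z = 1.3 + 0.8j
assert abs(w_series(z) - w_integral(z)) < 1e-10
assert abs(w_series(1j) - math.exp(1) * math.erfc(1)) < 1e-12  # w(I y) == ErfcxRe(y)
```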

In order to speed up computation, I precompute tables of $\exp(-a^2n^2)$, $an$, and $a^2n^2$, which are injected into the body of the compiled function with a `With`. I take `-106 <= n <= 106`; at the ends of this range `Exp[-a^2 n^2]` almost reaches `$MinMachineNumber`.
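That claim is easy to check in any IEEE-754 environment; a Python sketch, with `sys.float_info.min` standing in for `$MinMachineNumber`:

```python
import math, sys

a = 0.25
# At the edge of the table, Exp[-a^2 n^2] is still a normal double,
# a few orders of magnitude above the smallest normal machine number.
edge = math.exp(-(a * 106) ** 2)  # roughly 1e-305
assert sys.float_info.min < edge < 1e-300
# One step further already drops into the subnormal range.
assert 0 <= math.exp(-(a * 107) ** 2) < sys.float_info.min
```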

With all these definitions out of the way, the definition of the `Faddeeva` function is as follows:

```
Faddeeva =
  Block[{lookupEmA2N2 =
      Table[Reverse@#~Join~#&[Table[Exp[-n^2/16.] + 0. I, {n, 106}]], {3}] //
        Developer`ToPackedArray,
     lookupAN = Delete[Table[n/4., {n, -106, 106}], {107}], lookupA2N2},
   lookupEmA2N2[[3]] *= I lookupAN;
   lookupA2N2 = lookupAN^2 + 0. I;
   With[{lookup = lookupEmA2N2 // Developer`ToPackedArray,
     lAN = lookupAN // Developer`ToPackedArray,
     lA2N2 = lookupA2N2 // Developer`ToPackedArray},
    Compile[{{z, _Complex}},
     With[{x = Re[z], y = Im[z]},
      With[{ere = Exp[y*y] Erfc[y], mxx = Minus[x*x], mxy = Minus[x*y]},
       If[z == 0. + 0. I, Return[1. + 0. I]];
       If[x == 0., Return[0. I + ere]];
       If[y == 0., Return[Exp[mxx] + I ErfcxIm[x]]];
       If[Abs[z] < 10.,
        Block[{sums = lookup, e2ANx = Exp[2 lAN x]},
         sums[[2]] *= e2ANx; sums[[3]] *= e2ANx;
         Return[Exp[mxx] *
           (ere Exp[2 I mxy] + 0.07957747154594767` *
             (2 I x Sinc[mxy] Exp[I mxy] +
               {Minus[y Exp[2 I mxy]], y, 1.}.sums.Divide[1, (lA2N2 + y*y)]))
          ]
         ]
        ];
       With[{zz = z*z},
        0.5641895835477563` I Divide[
          z (-558 + zz (740 + zz (-216 + 16 zz))),
          105 + zz (-840 + zz (840 + zz (-224 + 16 zz)))]]]],
     RuntimeAttributes -> {Listable},
     RuntimeOptions -> {"EvaluateSymbolically" -> False},
     Parallelization -> True,
     CompilationOptions -> {"InlineExternalDefinitions" -> True,
       "InlineCompiledFunctions" -> False},
     CompilationTarget -> "C"]
    ]
   ]
```

Within the body of the last `Return` statement for `Abs[z] < 10`, `sums` evaluates to a table of the form

$$ \mathrm{sums} = \left\{ \exp(-a^2n^2),\ \exp(2anx-a^2n^2),\ ian \exp(2anx-a^2n^2) \right\}_{n=-106..106} $$

so

$$ \left\{-ye^{-2ixy},\, y,\, 1\right\}.\mathrm{sums}.\left( 1/\left\{a^2n^2+y^2\right\}\right) $$

gives the last three addends of $w(z)$ in the square brackets.

A quick performance check:

```
test = RandomComplex[.3 {-15 - 15 I, 15 + 15 I}, 10^4];
Faddeeva[test] // AbsoluteTiming // First
(* 0.149745 *)
Exp[-test^2] Erfc[-I test] // AbsoluteTiming // First
(* 2.88898 *)
```

A 20x speed increase. Could this be better? After all, ~15 µs per complex number in the difficult range is not *that* great.

A precision test:

```
N[Table[Exp[-(x + I y)^2] Erfc[-I (x + I y)], {x, 0, 5, 1/20}, {y, 0, 5, 1/20}], 30] /
  Table[Faddeeva[x + I y], {x, 0, 5, 1/20}, {y, 0, 5, 1/20}] // Log // Abs // Max
(* 6.63025*10^-13 *)
```

Roughly 12-13 digits of precision at worst. Certainly good enough for experimental data.

For small absolute values of `z`, you might be able to use a Padé expansion to obtain a ratio of polynomials. Have a look at the documentation page of `PadeApproximant`. – Henrik Schumacher – 2017-07-31T12:53:35.340

Another option is to write a wrapper for the C++ code with the help of LTemplate. This is not as complicated as it seems. – Henrik Schumacher – 2017-07-31T13:14:10.717

I have a routine, but it needs to be cleaned up a bit. I hope you don't mind a wait of a few days. These two are related questions. – J. M.'s ennui – 2017-07-31T13:56:51.833

@Henrik, in the particular case of the error/Voigt/Faddeeva function, one must first factor out the exponential behavior before constructing the Padé approximant; otherwise the approximant is numerically unstable. – J. M.'s ennui – 2017-07-31T13:58:43.160

@J.M. I'm certainly interested. I've just managed to write up my self-answer before I have to leave work; take a look. – LLlAMnYP – 2017-07-31T14:22:25.563

@LLlAMnYP Sorry for joining the party so late, but these days I want to be sure that someone is really dedicated to a topic before I give an answer that needs more than 10 minutes of my time. I'm providing a fast C implementation in a moment. – halirutan – 2017-08-18T01:04:27.517

@halirutan Thanks, I appreciate any contribution. I'm on vacation till next month, but I'll try to give the topic as much attention as possible. – LLlAMnYP – 2017-08-18T17:55:30.463

I'm looking into my stuff in more detail. Do you need a solution that works to arbitrary precision (slightly hard), or are you fine with a machine-precision approximation (easier)? – J. M.'s ennui – 2017-08-20T01:30:44.057

@J.M. Arbitrary precision would be extremely interesting from an educational point of view, although for practical purposes it is not necessary. – LLlAMnYP – 2017-08-21T16:32:13.187