`vignettes/differential-operators.Rmd`

`differential-operators.Rmd`

Orthogonal coordinates are a special but extremely common case of
curvilinear coordinates where the coordinate surfaces all meet at right
angles. The chief advantage of non-Cartesian coordinates is that they
can be chosen to match the symmetry of the problem. For example,
spherical coordinates are the most common curvilinear coordinate systems
and are used in Earth sciences, cartography, quantum mechanics,
relativity, and engineering.^{1} These coordinates may be derived from a set
of Cartesian coordinates by using a transformation that is locally
invertible (a one-to-one map) at each point. This means that one can
convert a point given in a Cartesian coordinate system to its
curvilinear coordinates and back. Differential operators such as the
gradient, divergence, curl, and Laplacian can be transformed from one
coordinate system to another via the usage of scale factors.^{2} The
package implements these operators in Cartesian, polar, spherical,
cylindrical, parabolic coordinates, and supports arbitrary orthogonal
coordinates systems defined by custom scale factors.

Curvilinear coordinates \((q_1, q_2, q_3)\) | Transformation from cartesian \((x, y, z)\) | Scale factors |
---|---|---|

Spherical polar coordinates \((r,\theta ,\phi )\) | \({\begin{aligned}x&=r\sin \theta \cos \phi \\y&=r\sin \theta \sin \phi \\z&=r\cos \theta \end{aligned}}\) | \({\begin{aligned}h_{1}&=1\\h_{2}&=r\\h_{3}&=r\sin \theta \end{aligned}}\) |

Cylindrical polar coordinates \((r,\phi ,z)\) | \({\begin{aligned}x&=r\cos \phi \\y&=r\sin \phi \\z&=z\end{aligned}}\) | \({\begin{aligned}h_{1}&=h_{3}=1\\h_{2}&=r\end{aligned}}\) |

Parabolic coordinates \((u,v,\phi )\) | \({\begin{aligned}x&=uv\cos \phi \\y&=uv\sin \phi \\z&={\frac {1}{2}}(u^{2}-v^{2})\end{aligned}}\) | \({\begin{aligned}h_{1}&=h_{2}={\sqrt {u^{2}+v^{2}}}\\h_{3}&=uv\end{aligned}}\) |

Parabolic cylindrical coordinates \((u,v,z)\) | \({\begin{aligned}x&={\frac {1}{2}}(u^{2}-v^{2})\\y&=uv\\z&=z\end{aligned}}\) | \({\begin{aligned}h_{1}&=h_{2}={\sqrt {u^{2}+v^{2}}}\\h_{3}&=1\end{aligned}}\) |

The gradient of a scalar-valued function \(F\) is the vector \((\nabla F)_i\) whose components are the partial derivatives of \(F\) with respect to each variable \(i\). In arbitrary orthogonal coordinate systems, the gradient is expressed in terms of the scale factors \(h_i\) as follows:

\[(\nabla F)_i = \frac{1}{h_i}\partial_iF\]

The function `gradient`

implements the symbolic and numeric gradient of `functions`

,
`expressions`

and `characters`

. In Cartesian
coordinates:

and in spherical coordinates:

```
gradient("x*y*z", var = c("x","y","z"), coordinates = "spherical")
#> [1] "1/1 * (y * z)" "1/x * (x * z)" "1/(x*sin(y)) * (x * y)"
```

To support arbitrary orthogonal coordinate systems, it is possible to
pass custom scale factors to the argument `coordinates`

. For
instance, the following call is equivalent to the previous example in
spherical coordinates where the scale factors are now explicitly
specified:

```
gradient("x*y*z", var = c("x","y","z"), coordinates = c(1,"x","x*sin(y)"))
#> [1] "1/(1) * (y * z)" "1/(x) * (x * z)" "1/(x*sin(y)) * (x * y)"
```

Numerical methods are applied when working with
`functions`

with the same sintax introduced for derivatives.

```
f <- function(x, y, z) x*y*z
gradient(f, var = c(x = 1, y = pi/2, z = 0), coordinates = "spherical")
#> [1] 0.000000 0.000000 1.570796
```

or in vectorized form:

```
f <- function(x) x[1]*x[2]*x[3]
gradient(f, var = c(1, pi/2, 0), coordinates = "spherical")
#> [1] 0.000000 0.000000 1.570796
```

When the function \(F\) is a tensor-valued function \(F_{d_1,\dots,d_n}\), the gradient is computed for each scalar component.

\[(\nabla F_{d_1,\dots,d_n})_i = \frac{1}{h_i}\partial_iF_{d_1,\dots,d_n}\]

In particular, this reduces to the Jacobian matrix for vector-valued functions \(F_{d_1}\):

```
f <- function(x) c(prod(x), sum(x))
gradient(f, var = c(3, 2, 1))
#> [,1] [,2] [,3]
#> [1,] 2 3 6
#> [2,] 1 1 1
```

that may be expressed in arbitrary orthogonal coordinate systems.

```
f <- function(x) c(prod(x), sum(x))
gradient(f, var = c(3, 2, 1), coordinates = "cylindrical")
#> [,1] [,2] [,3]
#> [1,] 2 1.0000000 6
#> [2,] 1 0.3333333 1
```

The function `jacobian`

is a wrapper for `gradient`

that always returns the Jacobian as a matrix, even in the case of
unidimensional scalar-valued functions.

In Cartesian coordinates, the Hessian of a scalar-valued function \(F\) is the square matrix of second-order partial derivatives:

\[(H(F))_{ij} = \partial_{ij}F\]

It might be tempting to apply the definition of the Hessian as the Jacobian of the gradient to write it in terms of the scale factors. However, this results in a Hessian matrix that is not symmetric and ignores the distinction between vector and covectors in tensor analysis. The generalization to arbitrary coordinate system is not implemented and only Cartesian coordinates are supported:

```
f <- function(x, y, z) x*y*z
hessian(f, var = c(x = 3, y = 2, z = 1))
#> [,1] [,2] [,3]
#> [1,] 1.222284e-11 1.00000e+00 2.000000e+00
#> [2,] 1.000000e+00 2.75014e-11 3.000000e+00
#> [3,] 2.000000e+00 3.00000e+00 1.100056e-10
```

When the function \(F\) is a
tensor-valued function \(F_{d_1,\dots,d_n}\), the `hessian`

is computed for each scalar component.

\[(H(F_{d_1,\dots,d_n}))_{ij} = \partial_{ij}F_{d_1,\dots,d_n}\]

In this case, the function returns an `array`

of Hessian
matrices:

that can be extracted with the corresponding indices.

```
h[1,,]
#> [,1] [,2] [,3]
#> [1,] 1.222284e-11 1.00000e+00 2.000000e+00
#> [2,] 1.000000e+00 2.75014e-11 3.000000e+00
#> [3,] 2.000000e+00 3.00000e+00 1.100056e-10
h[2,,]
#> [,1] [,2] [,3]
#> [1,] -1.833426e-11 7.883472e-12 -3.627321e-12
#> [2,] 7.883472e-12 -6.416993e-11 1.090683e-11
#> [3,] -3.627321e-12 1.090683e-11 9.167132e-11
```

The divergence of a vector-valued function \(F_i\) produces a scalar value \(\nabla \cdot F\) representing the volume
density of the outward flux of the vector field from an infinitesimal
volume around a given point.^{3} In terms of scale factors, it is expressed
as follows:

\[\nabla \cdot F = \frac{1}{J}\sum_i\partial_i\Biggl(\frac{J}{h_i}F_i\Biggl)\]

where \(J=\prod_ih_i\). When \(F\) is an `array`

of
vector-valued functions \(F_{d_1,\dots,d_n,i}\), the `divergence`

is computed for each vector:

\[(\nabla \cdot F)_{d_1,\dots,d_n} = \frac{1}{J}\sum_i\partial_i\Biggl(\frac{J}{h_i}F_{d_1,\dots,d_n,i}\Biggl)=\frac{1}{J}\sum_i\partial_i(Jh_i^{-1})F_{d_1,\dots,d_n,i}+Jh_i^{-1}\partial_i(F_{d_1,\dots,d_n,i})\]

where the last equality is preferable in practice as the derivatives of the scale factor can be computed symbolically and the computation of the derivatives of \(F\) is more efficient than the direct computation of \(\partial_i\bigl(\frac{J}{h_i}F_{d_1,\dots,d_n,i}\bigl)\) via finite differences. In Cartesian coordinates:

```
f <- c("x^2", "y^2", "z^2")
divergence(f, var = c("x","y","z"))
#> [1] "2 * x + 2 * y + 2 * z"
```

In polar coordinates:

```
f <- c("sqrt(r)/10", "sqrt(r)")
divergence(f, var = c("r","phi"), coordinates = "polar")
#> [1] "(0.5 * r^-0.5/10 * r + (sqrt(r)/10)) / (1*r)"
```

And for tensors of vector-valued functions:

```
f <- matrix(c("x^2","y^2","z^2","x","y","z"), nrow = 2, byrow = TRUE)
divergence(f, var = c("x","y","z"))
#> [1] "2 * x + 2 * y + 2 * z" "1 + 1 + 1"
```

The same syntax holds for `functions`

where numerical
methods are automatically applied:

```
f <- function(x,y,z) matrix(c(x^2,y^2,z^2,x,y,z), nrow = 2, byrow = TRUE)
divergence(f, var = c(x = 0, y = 0, z = 0))
#> [1] 0 3
```

The curl of a vector-valued function \(F_i\) at a point is represented by a vector
whose length and direction denote the magnitude and axis of the maximum
circulation.^{4} In 2 dimensions, the curl is written in
terms of the scale factors \(h\) and
the Levi-Civita
symbol \(\epsilon\) as follows:

\[\nabla \times F = \frac{1}{h_1h_2}\sum_{ij}\epsilon_{ij}\partial_i\Bigl(h_jF_j\Bigl)= \frac{1}{h_1h_2}\Biggl(\partial_1\Bigl(h_2F_2\Bigl)-\partial_2\Bigl(h_1F_1\Bigl)\Biggl)\]

In 3 dimensions:

\[(\nabla \times F)_k = \frac{h_k}{J}\sum_{ij}\epsilon_{ijk}\partial_i\Bigl(h_jF_j\Bigl)\]

where \(J=\prod_i h_i\). This
suggests to implement the `curl`

in \(m+2\) dimensions in such a way
that the formula reduces correctly to the previous cases:

\[(\nabla \times F)_{k_1\dots k_m} = \frac{h_{k_1}\cdots h_{k_m}}{J}\sum_{ij}\epsilon_{ijk_1\dots k_m}\partial_i\Bigl(h_jF_j\Bigl)\]

And in particular, when \(F\) is an
`array`

of vector-valued functions \(F_{d_1,\dots,d_n,i}\) the `curl`

is computed for each vector:

\[ \begin{split} (\nabla \times F)_{d_1\dots d_n,k_1\dots k_m} & =\\ &=\frac{h_{k_1}\cdots h_{k_m}}{J}\sum_{ij}\epsilon_{ijk_1\dots k_m}\partial_i\Bigl(h_jF_{d_1\dots d_n,j}\Bigl) \\ &=\sum_{ij}\frac{1}{h_ih_j}\epsilon_{ijk_1\dots k_m}\partial_i\Bigl(h_jF_{d_1\dots d_n,j}\Bigl) \\ &=\sum_{ij}\frac{1}{h_ih_j}\epsilon_{ijk_1\dots k_m}\Bigl(\partial_i(h_j)F_{d_1\dots d_n,j}+h_j\partial_i(F_{d_1\dots d_n,j})\Bigl) \end{split} \]

where the last equality is preferable in practice as the derivatives of the scale factor can be computed symbolically and the computation of the derivatives of \(F\) is more efficient than the direct computation of \(\partial_i\bigl(h_jF_{d_1\dots d_n,j}\bigl)\) via finite differences. In 2-dimensional Cartesian coordinates:

In 3 dimensions, for an irrotational vector field:

And for tensors of vector-valued functions:

```
f <- matrix(c("x","-y","z","x^3*y^2","x","0"), nrow = 2, byrow = TRUE)
curl(f, var = c("x","y","z"))
#> [,1] [,2] [,3]
#> [1,] "0" "0" "0"
#> [2,] "0" "0" "(1) * 1 + (x^3 * (2 * y)) * -1"
```

The same syntax holds for `functions`

where numerical
methods are automatically applied and for arbitrary orthogonal
coordinate systems as shown in the previous sections.

The Laplacian is a differential operator given by the divergence of
the gradient of a scalar-valued function \(F\), resulting in a scalar value giving the
flux density of the gradient flow of a function. The Laplacian occurs in
differential equations that describe many physical phenomena, such as
electric and gravitational potentials, the diffusion equation for heat
and fluid flow, wave propagation, and quantum mechanics.^{5} In terms of the scale
factor, the operator is written as:

\[\nabla^2F = \frac{1}{J}\sum_i\partial_i\Biggl(\frac{J}{h_i^2}\partial_iF\Biggl)\]

where \(J=\prod_ih_i\). When the
function \(F\) is a tensor-valued
function \(F_{d_1,\dots,d_n}\), the `laplacian`

is computed for each scalar component:

\[(\nabla^2F)_{d_1\dots d_n} = \frac{1}{J}\sum_i\partial_i\Biggl(\frac{J}{h_i^2}\partial_iF_{d_1\dots d_n}\Biggl)=\frac{1}{J}\sum_i\partial_i\Bigl(Jh_i^{-2}\Bigl)\partial_iF_{d_1\dots d_n}+Jh_i^{-2}\partial^2_iF_{d_1\dots d_n}\]

where the last equality is preferable in practice as the derivatives of the scale factor can be computed symbolically and the computation of the derivatives of \(F\) is more efficient than the direct computation of \(\partial_i\bigl(\frac{J}{h_i^2}\partial_iF\bigl)\) via finite differences. In Cartesian coordinates:

```
f <- "x^3+y^3+z^3"
laplacian(f, var = c("x","y","z"))
#> [1] "3 * (2 * x) + 3 * (2 * y) + 3 * (2 * z)"
```

And for tensors of scalar-valued functions:

```
f <- array(c("x^3+y^3+z^3", "x^2+y^2+z^2", "y^2", "z*x^2"), dim = c(2,2))
laplacian(f, var = c("x","y","z"))
#> [,1] [,2]
#> [1,] "3 * (2 * x) + 3 * (2 * y) + 3 * (2 * z)" "2"
#> [2,] "2 + 2 + 2" "z * 2"
```

The same syntax holds for `functions`

where numerical
methods are automatically applied and for arbitrary orthogonal
coordinate systems as shown in the previous sections.

Guidotti E (2022). “calculus: High-Dimensional Numerical and Symbolic
Calculus in R.” *Journal of Statistical Software*,
*104*(5), 1-37. doi:10.18637/jss.v104.i05

A BibTeX entry for LaTeX users is

```
@Article{calculus,
title = {{calculus}: High-Dimensional Numerical and Symbolic Calculus in {R}},
author = {Emanuele Guidotti},
journal = {Journal of Statistical Software},
year = {2022},
volume = {104},
number = {5},
pages = {1--37},
doi = {10.18637/jss.v104.i05},
}
```