
Commit b789fae

fix: move underflow discussion to dropdown admonition block
1 parent 49bc031 commit b789fae

File tree: 3 files changed, +471 −541 lines
78.3 KB

examples/gaussian_processes/HSGP-Basic.ipynb

Lines changed: 343 additions & 422 deletions
Large diffs are not rendered by default.

examples/gaussian_processes/HSGP-Basic.myst.md

Lines changed: 128 additions & 119 deletions
@@ -317,7 +317,7 @@ m52_m, m52_c = pm.gp.hsgp_approx.approx_hsgp_hyperparams(
 )

 print("Recommended smallest number of basis vectors for Matern 5/2 (m):", m52_m)
-print("Recommended smallest scaling factor for Matern 5/2(c):", np.round(m52_c, 1))
+print("Recommended smallest scaling factor for Matern 5/2 (c):", np.round(m52_c, 1))
 ```

 ### The HSGP approximate Gram matrix
@@ -429,135 +429,144 @@ For your particular situation, **you will need to experiment across your range o

 Be aware that it's also possible to encounter scenarios where a low fidelity HSGP approximation gives a more parsimonious fit than a high fidelity HSGP approximation. A low fidelity HSGP approximation is still a valid prior for some unknown function, if somewhat contrived. Whether that matters will depend on your context.

-+++
-
-## Avoiding underflow issues
-As noted above, the diagonal matrix $\Delta$ used in the calculation of the approximate Gram matrix contains information on the power spectral density, $\mathcal{S}$, of a given kernel. Thus, for the Gram matrix to be defined, $\mathcal{S} > 0$. Consequently, when picking HSGP hyperparameters $m$ and $L$ it is important to check $\mathcal{S} > 0$ for the suggested $m$ and $L$ values. The code in the next few cell compares the suitability of the suggested hyperparameters $m$ and $L$ for `matern52` to that of `ExpQuad` for the data spanning $x=-5$ to $x=95$, with the lengthscale prior between $\ell=1$ and $\ell=50$. As we shall see, the suggested hyperparameters for `ExpQuad` are for not suitable for $\ell=50$.
-
-### Matern $\nu=5/2$, `matern52`
-
-```{code-cell} ipython3
-m52_L = m52_c * 50 # c * s
-print(
-    f"""m52_m = {m52_m:.1f},
-m52_c = {m52_c:.1f},
-m52_s = {50:.1f}
-and m52_L = {m52_L:.1f}"""
-)
-
-m52_eigvals = pm.gp.hsgp_approx.calc_eigenvalues(m52_L, [m52_m])
-m52_omega = pt.sqrt(m52_eigvals)
-matern52_cov_ell_1 = pm.gp.cov.Matern52(1, ls=1)
-matern52_cov_ell_50 = pm.gp.cov.Matern52(1, ls=50)
-
-# check non have underflowed to zero.
-assert np.all(matern52_cov_ell_1.power_spectral_density(m52_omega).eval() > 0)
-assert np.all(matern52_cov_ell_50.power_spectral_density(m52_omega).eval() > 0)
-```
-
-### Squared exponential, `ExpQuad`
-
-```{code-cell} ipython3
-# get ExpQuad suggested hyperparams.
-
-epq_m, epq_c = pm.gp.hsgp_approx.approx_hsgp_hyperparams(
-    x_range=[-5, 95], lengthscale_range=[1, 50], cov_func="ExpQuad"
-)
-
-print("Recommended smallest number of basis vectors for ExpQuad (m):", epq_m)
-print("Recommended smallest scaling factor for ExpQuad (c):", np.round(epq_c, 1))
-```
-
-```{code-cell} ipython3
-epq_L = epq_c * 50 # c * s
-print(
-    f"""epq_m = {epq_m:.1f},
-epq_c = {epq_c:.1f},
-epq_s = {50:.1f}
-and epq_L = {epq_L:.1f}"""
-)
-
-epq_eigvals = pm.gp.hsgp_approx.calc_eigenvalues(epq_L, [epq_m])
-epq_omega = pt.sqrt(epq_eigvals)
-epq_cov_ell_1 = pm.gp.cov.ExpQuad(1, ls=1)
-epq_cov_ell_50 = pm.gp.cov.ExpQuad(1, ls=50)
-
-# repeat check as in the Matern52.
-assert np.all(epq_cov_ell_1.power_spectral_density(epq_omega).eval() > 0)
-assert np.all(
-    epq_cov_ell_50.power_spectral_density(epq_omega).eval() > 0
-) # this will not pass assertion.
++++ {"editable": true, "slideshow": {"slide_type": ""}}
+
+:::{dropdown} Avoiding underflow issues
+:icon: alert-fill
+:color: info
+As noted above, the diagonal matrix $\Delta$ used in the calculation of the approximate Gram matrix contains information on the power spectral density, $\mathcal{S}$, of a given kernel. Thus, for the Gram matrix to be defined, $\mathcal{S} > 0$. Consequently, when picking HSGP hyperparameters $m$ and $L$ it is important to check $\mathcal{S} > 0$ for the suggested $m$ and $L$ values. The code in the next few cells compares the suitability of the suggested hyperparameters $m$ and $L$ for `matern52` to that of `ExpQuad` for the data spanning $x=-5$ to $x=95$, with the lengthscale prior between $\ell=1$ and $\ell=50$. As we shall see, the suggested hyperparameters for `ExpQuad` are not suitable for $\ell=50$. <br> <br>
+**Matern $\mathbf{\nu = 5/2}$**
+```pycon
+>>> m52_L = m52_c * 50  # c * s, where s is the half-range of the data, i.e. 0.5 * (95 - (-5)).
+>>> print(
+...     f"""m52_m = {m52_m:.1f},
+... m52_c = {m52_c:.1f},
+... m52_s = {50:.1f}
+... and m52_L = {m52_L:.1f}"""
+... )
+m52_m = 543.0,
+m52_c = 4.1,
+m52_s = 50.0
+and m52_L = 205.0
+
+>>> m52_eigvals = pm.gp.hsgp_approx.calc_eigenvalues(m52_L, [m52_m])
+>>> m52_omega = pt.sqrt(m52_eigvals)
+>>> matern52_cov_ell_1 = pm.gp.cov.Matern52(1, ls=1)
+>>> matern52_cov_ell_50 = pm.gp.cov.Matern52(1, ls=50)
+
+>>> # check none have underflowed to zero.
+>>> assert np.all(matern52_cov_ell_1.power_spectral_density(m52_omega).eval() > 0)
+>>> assert np.all(matern52_cov_ell_50.power_spectral_density(m52_omega).eval() > 0)
+```
+
+**Squared exponential**
+```pycon
+>>> # get ExpQuad suggested hyperparams.
+>>> epq_m, epq_c = pm.gp.hsgp_approx.approx_hsgp_hyperparams(
+...     x_range=[-5, 95], lengthscale_range=[1, 50], cov_func="ExpQuad"
+... )
+
+>>> print("Recommended smallest number of basis vectors for ExpQuad (m):", epq_m)
+Recommended smallest number of basis vectors for ExpQuad (m): 280
+>>> print("Recommended smallest scaling factor for ExpQuad (c):", np.round(epq_c, 1))
+Recommended smallest scaling factor for ExpQuad (c): 3.2
+
+>>> epq_L = epq_c * 50  # c * s
+>>> print(
+...     f"""epq_m = {epq_m:.1f},
+... epq_c = {epq_c:.1f},
+... epq_s = {50:.1f},
+... and epq_L = {epq_L:.1f}"""
+... )
+epq_m = 280.0,
+epq_c = 3.2,
+epq_s = 50.0,
+and epq_L = 160.0
+
+>>> epq_eigvals = pm.gp.hsgp_approx.calc_eigenvalues(epq_L, [epq_m])
+>>> epq_omega = pt.sqrt(epq_eigvals)
+>>> epq_cov_ell_1 = pm.gp.cov.ExpQuad(1, ls=1)
+>>> epq_cov_ell_50 = pm.gp.cov.ExpQuad(1, ls=50)
+
+>>> # repeat check as in the Matern52.
+>>> assert np.all(epq_cov_ell_1.power_spectral_density(epq_omega).eval() > 0)
+>>> assert np.all(
+...     epq_cov_ell_50.power_spectral_density(epq_omega).eval() > 0
+... ), "Power spectral density underflows when ls = 50."  # this will not pass.
 ```

 We see that not all values of $\mathcal{S}$ are defined for the squared exponential kernel when $\ell=50$.

 To see why, the covariance of the kernels considered are plotted below along with their power spectral densities in log space. The covariance plot shows that for a set $\ell$, the tails of `matern52` are heavier than `ExpQuad`, while a higher $\ell$ for a given kernel type gives rise to higher covariance. The power spectral density is inversely proportional to the covariance - essentially the flatter the shape of the covariance function, the narrower the bandwidth and the lower the power spectral density at higher values of $\omega$. As a result, we see that for `ExpQuad` with $\ell = 50$, $\mathcal{S}\left(\omega\right)$ rapidly decreases towards $0$ before the domain of $\omega$ is exhausted, and hence we reach values at which we underflow to $0$.

-```{code-cell} ipython3
-x = np.linspace(0, 10, 101)[:, None]
-fig, ax = plt.subplots(2, layout="tight", figsize=(10, 6))
-
-ax[0].set_title(f"Covariance")
-ax[0].plot(x, epq_cov_ell_1(x).eval()[0], label=r"ExpQuad, $\ell = 1$")
-ax[0].plot(x, epq_cov_ell_50(x).eval()[0], label=r"ExpQuad, $\ell = 50$")
-ax[0].plot(x, matern52_cov_ell_1(x).eval()[0], label=r"Matern 5/2, $\ell = 1$", linestyle="--")
-ax[0].plot(x, matern52_cov_ell_50(x).eval()[0], label=r"Matern 5/2, $\ell = 50$", linestyle="--")
-ax[0].set_xlabel(r"$x_\mathrm{p}-x_\mathrm{q}$")
-ax[0].set_ylabel(r"$k\left(x_\mathrm{p}-x_\mathrm{q}\right)$")
-ax[0].set_yscale("log")
-ax[0].set_ylim(1e-10, 1e1)
-ax[0].legend(frameon=False, loc="lower left")
-
-
-ax[1].plot(epq_omega.eval(), epq_cov_ell_1.power_spectral_density(epq_omega).eval())
-ax[1].plot(epq_omega.eval(), epq_cov_ell_50.power_spectral_density(epq_omega).eval())
-ax[1].plot(
-    m52_omega.eval(), matern52_cov_ell_1.power_spectral_density(m52_omega).eval(), linestyle="--"
-)
-ax[1].plot(
-    m52_omega.eval(), matern52_cov_ell_50.power_spectral_density(m52_omega).eval(), linestyle="--"
-)
-ax[1].set_title("Power Spectral Density")
-ax[1].set_xlabel(r"$\omega$")
-ax[1].set_ylabel(r"$\mathcal{S}\left(\omega\right)$")
-ax[1].set_yscale("log")
-ax[1].set_ylim(1e-10, 3e2)
-plt.show()
-```
-
-These underflow issues can arise when using a broad prior on $\ell$ as you need a $m$ large to cover small lengthscales, but these may cause underflow in $\mathcal{S}$ when $\ell$ is large. As the graphs above suggest, one can **consider a different kernel with heavier tails such as `matern52` or `matern32`**.
-
-Alternatively, if you are certain you need a specific kernel, **you can use the linear form of HSGPs (see below) with a boolean mask**. In doing so, the sinusoids with vanishingly small coefficients in the linear combination are effectively screened out. E.g:
 ```python
-import pymc as pm
-import numpy as np
-
-x = np.sort(np.random.uniform(-1, 1, 10))
+>>> x = np.linspace(0, 10, 101)[:, None]
+>>> fig, ax = plt.subplots(2, layout="tight", figsize=(10, 6))
+
+>>> ax[0].set_title(f"Covariance")
+>>> ax[0].plot(x, epq_cov_ell_1(x).eval()[0], label=r"ExpQuad, $\ell = 1$")
+>>> ax[0].plot(x, epq_cov_ell_50(x).eval()[0], label=r"ExpQuad, $\ell = 50$")
+>>> ax[0].plot(x, matern52_cov_ell_1(x).eval()[0], label=r"Matern 5/2, $\ell = 1$", linestyle="--")
+>>> ax[0].plot(x, matern52_cov_ell_50(x).eval()[0], label=r"Matern 5/2, $\ell = 50$", linestyle="--")
+>>> ax[0].set_xlabel(r"$x_\mathrm{p}-x_\mathrm{q}$")
+>>> ax[0].set_ylabel(r"$k\left(x_\mathrm{p}-x_\mathrm{q}\right)$")
+>>> ax[0].set_yscale("log")
+>>> ax[0].set_ylim(1e-10, 1e1)
+>>> ax[0].legend(frameon=False, loc="lower left")
+
+
+>>> ax[1].plot(epq_omega.eval(), epq_cov_ell_1.power_spectral_density(epq_omega).eval())
+>>> ax[1].plot(epq_omega.eval(), epq_cov_ell_50.power_spectral_density(epq_omega).eval())
+>>> ax[1].plot(
+...     m52_omega.eval(), matern52_cov_ell_1.power_spectral_density(m52_omega).eval(), linestyle="--"
+... )
+>>> ax[1].plot(
+...     m52_omega.eval(), matern52_cov_ell_50.power_spectral_density(m52_omega).eval(), linestyle="--"
+... )
+>>> ax[1].set_title("Power Spectral Density")
+>>> ax[1].set_xlabel(r"$\omega$")
+>>> ax[1].set_ylabel(r"$\mathcal{S}\left(\omega\right)$")
+>>> ax[1].set_yscale("log")
+>>> ax[1].set_ylim(1e-10, 3e2)
+>>> plt.show()
+```
+![alt text](ExpQuad_vs_Matern52_psd.png)
+These underflow issues can arise when using a broad prior on $\ell$ as you need an $m$ large enough to cover small lengthscales, but these may cause underflow in $\mathcal{S}$ when $\ell$ is large. As the graphs above suggest, one can **consider a different kernel with heavier tails such as `matern52` or `matern32`**.

-large_m, large_l = pm.gp.hsgp_approx.approx_hsgp_hyperparams(
-    x_range=[-1, 1], lengthscale_range=[1E-2, 4], cov_func="ExpQuad"
-)
-
-print(large_m, large_l)
-# (2240, 12.8)
+Alternatively, if you are certain you need a specific kernel, **you can use the linear form of HSGPs (see below) with a boolean mask**. In doing so, the sinusoids with vanishingly small coefficients in the linear combination are effectively screened out. E.g.:

-with pm.Model() as model:
-    # some broad prior on the lengthscale.
-    ell = pm.HalfNormal('ell', sigma=1)
-    cov_func = pm.gp.cov.ExpQuad(input_dim=1, ls=ell)
-    # setup HSGP.
-    gp = pm.gp.HSGP(m=[large_m], L=[large_l], parametrization="noncentered", cov_func=cov_func)
-    phi, sqrt_psd = gp.prior_linearized(x[:, None])
-    basis_coeffs = pm.Normal("basis_coeffs", size=gp.n_basis_vectors)
-    # create mask that screens out frequencies with underflowing power spectral densities.
-    mask = sqrt_psd > 0
-    # now apply the mask over the m dimension & calculate HSGP function.
-    f = pm.Deterministic("f", phi[:, mask] @ (basis_coeffs[mask] * sqrt_psd[mask]))
-    # setup your observation model
-    ...
+```pycon
+>>> import pymc as pm
+>>> import numpy as np
+
+>>> x = np.sort(np.random.uniform(-1, 1, 10))
+
+>>> large_m, large_l = pm.gp.hsgp_approx.approx_hsgp_hyperparams(
+...     x_range=[-1, 1], lengthscale_range=[1E-2, 4], cov_func="ExpQuad"
+... )
+
+>>> print(large_m, large_l)
+2240 12.8
+
+>>> with pm.Model() as model:
+...     # some broad prior on the lengthscale.
+...     ell = pm.HalfNormal('ell', sigma=1)
+...     cov_func = pm.gp.cov.ExpQuad(input_dim=1, ls=ell)
+...     # setup HSGP.
+...     gp = pm.gp.HSGP(m=[large_m], L=[large_l], parametrization="noncentered", cov_func=cov_func)
+...     phi, sqrt_psd = gp.prior_linearized(x[:, None])
+...     basis_coeffs = pm.Normal("basis_coeffs", size=gp.n_basis_vectors)
+...     # create mask that screens out frequencies with underflowing power spectral densities.
+...     mask = sqrt_psd > 0
+...     # now apply the mask over the m dimension & calculate HSGP function.
+...     f = pm.Deterministic("f", phi[:, mask] @ (basis_coeffs[mask] * sqrt_psd[mask]))
+...     # setup your observation model
+...     ...
 ```
+:::

-+++
++++ {"editable": true, "slideshow": {"slide_type": ""}}

 ## Example 2: Working with HSGPs as a parametric, linear model

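As a standalone illustration of the underflow behaviour this commit documents, the check can be reproduced with plain NumPy, independently of PyMC. This is a minimal sketch, assuming the standard 1-D spectral density of the squared-exponential kernel, $\mathcal{S}(\omega) = \sigma^2 \sqrt{2\pi}\,\ell\,e^{-\ell^2\omega^2/2}$, and the usual HSGP eigenfrequency spacing $\omega_j = j\pi/(2L)$; the helper name `expquad_psd` is made up here, and `m = 280`, `L = c * s = 3.2 * 50 = 160` echo the values printed in the diff:

```python
import numpy as np


def expquad_psd(omega, ell, sigma=1.0):
    # 1-D power spectral density of the squared-exponential kernel:
    # S(w) = sigma^2 * sqrt(2*pi) * ell * exp(-(ell * w)^2 / 2)
    return sigma**2 * np.sqrt(2.0 * np.pi) * ell * np.exp(-0.5 * (ell * omega) ** 2)


# HSGP eigenfrequencies on [-L, L]: omega_j = j * pi / (2 * L), j = 1..m.
m, L = 280, 160.0
omega = np.arange(1, m + 1) * np.pi / (2.0 * L)

psd_short = expquad_psd(omega, ell=1.0)
psd_long = expquad_psd(omega, ell=50.0)

print(np.all(psd_short > 0))  # True: no underflow for a short lengthscale
print(np.all(psd_long > 0))   # False: the tail underflows to 0.0 in float64
```

The exponent $-\ell^2\omega^2/2$ drops below float64's representable range (roughly $e^{-745}$) once $\omega \gtrsim 0.77$ for $\ell = 50$, which is well inside the frequency grid needed to also cover $\ell = 1$.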
0 commit comments
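The boolean-mask workaround from the commit can likewise be sketched outside PyMC with NumPy alone. All names and shapes below are hypothetical stand-ins for the `phi`, `sqrt_psd`, and `basis_coeffs` returned or created around `prior_linearized`; the synthetic spectrum is chosen so that its tail genuinely underflows to zero:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 10, 280

phi = rng.standard_normal((n, m))            # toy basis-function matrix (n x m)
coeffs = rng.standard_normal(m)              # toy basis coefficients
sqrt_psd = np.exp(-np.linspace(0.0, 1000.0, m))  # tail underflows to exactly 0.0

# Screen out the basis functions whose spectral weight has underflowed,
# then form the HSGP function value as a masked linear combination.
mask = sqrt_psd > 0
f = phi[:, mask] @ (coeffs[mask] * sqrt_psd[mask])

print(mask.sum() < m)  # True: some frequencies were screened out
print(f.shape)         # (10,)
```

Masking changes nothing numerically (the dropped terms contribute exactly zero); its benefit is avoiding wasted computation and keeping downstream gradients clean.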