either equally spaced (left plot) or using Chebyshev nodes (right plot). Additionally, for both cases we consider an exact evaluation, i.e. points $(x_i, y_i) = (x_i, f(x_i))$, as well as noisy evaluations $(x_i, \tilde{y}_i)$ with
```math
\tilde{y}_i = f(x_i) + ε_i
```
where $ε_i$ is a random number of magnitude $|ε_i| ≤ \texttt{ε\_poly}$.
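
To make this concrete, the following is a minimal sketch of how such noisy evaluations could be generated; the function `f`, the node placement and the value of `ε_poly` below are illustrative assumptions, not the notebook's actual setup.

```julia
# Sketch: exact and noisy evaluations of a function f at equally spaced nodes.
f(x) = sin(2π * x)                         # example function (assumption)
n_nodes = 10
x = range(0, 1; length=n_nodes)            # equally spaced nodes (assumption)
ε_poly = 1e-3                              # noise amplitude (assumption)

y       = f.(x)                            # exact evaluations  y_i = f(x_i)
y_noisy = y .+ ε_poly .* (2 .* rand(n_nodes) .- 1)   # noisy, |ε_i| ≤ ε_poly
```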
"""
# ╔═╡ d5de3b11-7781-4100-8a63-d26426685bbc
md"""
- Number of nodes `n_nodes_poly = ` $(@bind n_nodes_poly Slider(2:20; show_value=true, default=10))
where $\| v \|_2 = \sqrt{ \sum_{i=1}^n v_i^2 }$ is the Euclidean norm.
We again recognise $\mathbf{V}$ to be a Vandermonde matrix, similar to the polynomial interpolation case. However, since for regression problems we usually have $n > m + 1$, the matrix is rectangular in this case:
```math
\mathbf{V} = \left(\begin{array}{cccc}
1 & x_1 & \ldots & x_1^m \\
\vdots & \vdots & & \vdots \\
1 & x_n & \ldots & x_n^m \\
\end{array}\right) \in \mathbb{R}^{n\times (m+1)}
```
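
As a concrete illustration, such a rectangular Vandermonde matrix could be assembled as in the sketch below; the nodes and the degree `m` are placeholders.

```julia
# Sketch: assemble the rectangular Vandermonde matrix V of size n × (m+1)
# for nodes x_1, …, x_n and polynomial degree m with n > m + 1.
x = range(0, 1; length=20)         # n = 20 sample points (placeholder)
m = 5                              # polynomial degree (placeholder)
V = [xi^j for xi in x, j in 0:m]   # V[i, j+1] = x_i^j
size(V)                            # (20, 6), i.e. n × (m + 1)
```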
"""
# ╔═╡ 20832145-54ba-4503-8099-e49b45f5024f
md"""
In polynomial regression our job is now to minimise expression (16),
which means that we want to find the coefficient vector $\mathbf{c}$,
which minimises $\|\mathbf{y} - \mathbf{V} \mathbf{c} \|_2^2$.
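
In Julia such a least-squares problem can be solved with the backslash operator, which for a rectangular matrix returns the coefficient vector minimising the residual norm. A minimal sketch with placeholder data:

```julia
# Sketch: least-squares polynomial regression via the backslash operator.
# For a rectangular V the expression V \ y returns the c minimising ‖y - V c‖₂.
x = range(0, 1; length=30)                          # n = 30 samples (placeholder)
y = sin.(2π .* x) .+ 0.01 .* (2 .* rand(30) .- 1)   # noisy data (placeholder)
m = 5
V = [xi^j for xi in x, j in 0:m]                    # rectangular Vandermonde matrix
c = V \ y                                           # least-squares coefficients
```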
Foldable("Some interesting experiments with the visualisation below.",
md"""
**Polynomial interpolation regime: Sanity check with our results from before:**
- `n = 20` and noise `ε` small (e.g. `ε = 0.001`): Increase the degree `m` slowly up to `20`: you should observe Runge's phenomenon for large `m` as before.
- Keep `n = 20` and `m = 20` and increase the noise `ε`: The error increases drastically (well beyond the original noise level `ε`); the fit is very unstable as we are essentially doing polynomial interpolation with equispaced points.
- In general, for `m = n` (polynomial interpolation) the situation is similar; try for example `n=15` and `m=15` with varying noise.
**The regime $n \gg m$ of least-squares regression:**
- Set `n=15`, `m=15` and `ε = 0.1`, i.e. polynomial interpolation with large noise. The errors are fairly large. Now increase `n` slowly to `50`: all of a sudden the errors are under control and essentially never exceed `ε`. Play with `ε` by making it even larger: you should see that the error remains well below `ε` in all cases.
- Set `n=40` and `ε = 0.316` and slide the degree `m` between `20` and `10`. At `m=10` the error is noticeably smaller than at `m=20`. This suggests that the ratio $\frac{m}{n}$ has some influence on the extent to which the measurement noise `ε` translates into an error in the outcome of polynomial regression.
- Set `n=20` and `ε = 0.0316` and slide the polynomial degree. We realise that there is a sweet spot around `m=10` where the error is overall smallest. At too low polynomial degrees $m$ the model is not rich enough to approximate the sine, while at too large degrees $m$ the ratio $\frac{m}{n}$ gets large and we get into the regime of polynomial interpolation. The noise therefore begins to be amplified, resulting in larger and larger errors as we keep increasing $m$ (a non-interactive sketch of this experiment is given below this box).
""")
# ╔═╡ 4562b2f4-3fcb-4c60-83b6-cafd6e4b3144
md"""
- Number of samples `n = ` $(@bind n Slider(10:5:50; show_value=true, default=20))
- The **other solution** is to use **non-equally spaced points**:
  * The typical approach is to use **Chebyshev nodes** (see the sketch below)
* These lead to **exponential convergence**
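
For reference, here is a small sketch of how Chebyshev nodes on an interval `[a, b]` can be computed; the helper name `chebyshev_nodes` is illustrative, not part of the notebook.

```julia
# Sketch: the n Chebyshev nodes on [a, b], i.e. the roots of the degree-n
# Chebyshev polynomial mapped from [-1, 1] to [a, b].
chebyshev_nodes(n, a=-1.0, b=1.0) =
    [(a + b)/2 + (b - a)/2 * cos((2k - 1) * π / (2n)) for k in 1:n]

chebyshev_nodes(5)          # 5 nodes on [-1, 1]
chebyshev_nodes(10, 0, 1)   # 10 nodes on [0, 1]
```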
Notice that all of these problems lead to linear systems $\textbf{A} \textbf{x} = \textbf{b}$ that we need to solve. How this can be done numerically is the topic of [Direct methods for linear systems](https://teaching.matmat.org/numerical-analysis/06_Direct_methods.html).