In the article "Regression Depth", Peter J. Rousseeuw and Mia Hubert introduced a new statistical concept called regression depth. In this section, we define this concept and explain why it is interesting from a geometrical point of view.

In bivariate statistics one often needs to estimate the relationship between two statistical variables. A usual tool for this is simple linear regression. The simple regression line is the line that minimizes the sum of the squared residuals of a bivariate dataset.

Simple linear regression

Given a dataset $Z_n = \{(x_i, y_i), i = 1,...,n \} \subset \rm I\!R^2$, the simple regression line $f$ is given by
\[ f \equiv y = \alpha x + \beta \]
where $\alpha$ and $\beta$ are chosen such that they minimize $LS(\alpha, \beta)$
\[ LS(\alpha, \beta) = \sum^n_{i=1} (y_i - \beta - \alpha x_i)^2 \]
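The minimizer of $LS(\alpha, \beta)$ has a classical closed form: $\alpha$ is the covariance of $x$ and $y$ divided by the variance of $x$, and $\beta$ follows from the sample means. A minimal Python sketch (the function name is ours), assuming the $x_i$ are not all identical:

```python
def least_squares(points):
    """Return (alpha, beta) minimizing LS(alpha, beta) = sum (y_i - beta - alpha*x_i)^2."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # alpha = covariance(x, y) / variance(x); beta follows from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var = sum((x - mean_x) ** 2 for x, _ in points)
    alpha = cov / var
    beta = mean_y - alpha * mean_x
    return alpha, beta
```

For points lying exactly on a line, the sketch recovers that line's slope and intercept.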

We present in the following chapter an alternative to simple regression introduced by Rousseeuw and Hubert.

Our goal is to find a line that fits a given dataset $Z_n = \{(x_i, y_i), i = 1,...,n \} \subset \rm I\!R^2$.

Given a candidate fit $l$ such that $l \equiv y = mx + b$, we define (as in the previous section) its residuals by $r_i(l) = y_i - m x_i - b$ for every point $(x_i, y_i) \in Z_n$.

Definition: non-fit

A candidate fit $l = (m, b)$ to $Z_n$ is called a *non-fit* iff there exists a real number $v$ that does not coincide with any $x_i$ and such that $\phi(l, v)$ is satisfied.
\[ \phi(l, v) = ((\forall i: x_i < v, r_i(l) < 0) \wedge (\forall i: x_i > v, r_i(l) > 0)) \vee ((\forall i: x_i < v, r_i(l) > 0) \wedge (\forall i: x_i > v, r_i(l) < 0)) \]

**Note:** We use $l = (m, b)$ as a notation for a line $l \equiv y = mx + b$.

The following plot shows two *non-fits* $\Theta$ and $\Psi$ and the corresponding v-values $v_{\Theta}$ and $v_{\Psi}$. From this plot we can interpret a *non-fit* as a line for which there exists a tilting point (red cross) around which the line can be rotated to verticality without crossing any point of $Z_n$.
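The definition can be checked directly: since only the position of $v$ relative to the $x_i$ matters, it suffices to try one candidate $v$ between each pair of consecutive distinct $x$-coordinates, plus one candidate on each side of the data. A hedged sketch (function names are ours):

```python
def residual(l, p):
    """Residual of point p = (x, y) with respect to the line l = (m, b)."""
    m, b = l
    x, y = p
    return y - m * x - b

def is_nonfit(l, points):
    """Check whether l = (m, b) is a non-fit for the given dataset."""
    xs = sorted({x for x, _ in points})
    # Candidate v-values: midpoints between distinct x's, plus one on each side.
    candidates = [xs[0] - 1] + [(a + b) / 2 for a, b in zip(xs, xs[1:])] + [xs[-1] + 1]
    for v in candidates:
        left = [residual(l, p) for p in points if p[0] < v]
        right = [residual(l, p) for p in points if p[0] > v]
        # The residuals must switch sign strictly at v (in either direction).
        if (all(r < 0 for r in left) and all(r > 0 for r in right)) or \
           (all(r > 0 for r in left) and all(r < 0 for r in right)):
            return True
    return False
```

Note that a line lying strictly below (or above) all points is a non-fit: taking $v$ outside the data range leaves one side empty and the other with residuals of a single strict sign.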

Definition: regression depth

The regression depth $rdepth(l, Z_n)$ of any candidate fit $l$ relative to a dataset $Z_n$ is the minimal number of observations from $Z_n$ that should be removed so that $l$ becomes a *non-fit*.

To illustrate this principle, the following plot is similar to the previous one except a point has been added (in purple). In this example, $\Theta$ is not a non-fit anymore and $rdepth(\Theta, Z_n) = 1$ since the removal of the purple point would make $\Theta$ a non-fit again.

In this section we present an algorithm that computes the regression depth of a given fit. First, we need to define some notation.

Definition: $L^-(a), L^+(a), R^-(a), R^+(a)$

Given a line $l = (m, b)$ and a dataset $Z_n$, we define for any real number $a$:

- $L^-(a) = \#\{j \mid x_j \le a$ and $r_j(l) \le 0\}$
- $L^+(a) = \#\{j \mid x_j \le a$ and $r_j(l) \ge 0\}$
- $R^-(a) = \#\{j \mid x_j > a$ and $r_j(l) \le 0\}$
- $R^+(a) = \#\{j \mid x_j > a$ and $r_j(l) \ge 0\}$

Now that we have defined these regions, we describe the algorithm as follows.

Algorithm: Regression depth (1)

- $l = (m, b)$ a fit
- $Z_n = \{(x_i, y_i), i = 1,...,n \} \subset \rm I\!R^2$

- Sort all elements of $Z_n$ by their $x$ coordinate such that $x_1 < x_2 < ... < x_n$
- Compute $rdepth(l, Z_n) = \min_{1 \le i \le n} (\min \{L^+(x_i) + R^-(x_i), L^-(x_i) + R^+(x_i)\})$

Given the previous algorithm, one can compute the regression depth of a fit in $\mathcal{O}(n \log(n))$ time.

- Step one consists of sorting, which can be done by an efficient sorting algorithm (e.g. Quicksort) in $\mathcal{O}(n \log(n))$ time.
- Step two can be performed in linear time ($\mathcal{O}(n)$) by incrementally updating $L^-, L^+, R^-, R^+$ for each value of $i$.
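The two steps above can be sketched as follows (a minimal Python version, function name ours, assuming distinct $x$-coordinates as in the algorithm): after sorting, all points start in the right-hand region, and the sweep moves one point at a time to the left-hand region, so the four counts are updated in $\mathcal{O}(1)$ per point.

```python
def rdepth(l, points):
    """Regression depth of the fit l = (m, b) relative to the dataset."""
    m, b = l
    pts = sorted(points)  # step one: sort by x-coordinate
    n = len(pts)
    residuals = [y - m * x - b for x, y in pts]
    # Initially every point lies in the right region (a left of all x_j).
    R_minus = sum(1 for r in residuals if r <= 0)
    R_plus = sum(1 for r in residuals if r >= 0)
    L_minus = L_plus = 0
    depth = n
    for i in range(n):
        r = residuals[i]
        # Move point i from the right region to the left region (x_j <= x_i).
        if r <= 0:
            L_minus += 1
            R_minus -= 1
        if r >= 0:
            L_plus += 1
            R_plus -= 1
        depth = min(depth, L_plus + R_minus, L_minus + R_plus)
    return depth
```

For example, a line lying strictly below every point is already a non-fit, and the sweep indeed reaches $L^-(x_n) + R^+(x_n) = 0$, giving depth 0.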

Here is an application implementing the previous algorithm. Points and lines can be added, and the regression depth of any line can be obtained by selecting the option and hovering the mouse over the line.