The least squares method minimizes the sum of squared residuals.

It is the most commonly used method.

It is easy to calculate.

Its disadvantage is that it is greatly affected by outliers.
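As a minimal sketch of this sensitivity, the snippet below fits a line by least squares to made-up data lying exactly on y = 2x + 1, then refits after multiplying one value by ten. The data and variable names are illustrative assumptions, not taken from these notes.

```python
import numpy as np

x = np.arange(10, dtype=float)
y_clean = 2.0 * x + 1.0            # hypothetical points exactly on y = 2x + 1

# np.polyfit with degree 1 minimizes the sum of squared residuals.
slope_clean, intercept_clean = np.polyfit(x, y_clean, 1)

y_outlier = y_clean.copy()
y_outlier[-1] *= 10.0              # a single point is off by an order of magnitude
slope_outlier, intercept_outlier = np.polyfit(x, y_outlier, 1)

print("slope without outlier:", slope_clean)
print("slope with one outlier:", slope_outlier)
```

One bad point out of ten is enough to move the fitted slope far away from 2.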

The minimum absolute value method minimizes the sum of the absolute values of the residuals.

Outliers have less influence than in the least squares method, but the fitting ability is weak.

For example, consider two points (x,y)=(0,0),(0,10).

Intuitively, the line y=5 is the best.

However, the sum of the absolute values of the residuals is 10 for each of y=0, y=5, and y=10, so y=5 is not uniquely determined.

In other words, this method has no ability to place the line midway between the points above and the points below.
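The tie can be checked directly. This small sketch evaluates the sum of absolute residuals for the horizontal lines y = 0, y = 5, and y = 10 on the two points above; the helper name `sum_abs_residuals` is made up for illustration.

```python
import numpy as np

y = np.array([0.0, 10.0])          # the two points, both at x = 0

def sum_abs_residuals(c):
    """Sum of |residual| for the horizontal line y = c."""
    return float(np.abs(y - c).sum())

costs = {c: sum_abs_residuals(c) for c in (0.0, 5.0, 10.0)}
print(costs)
```

Every candidate line between the two points has the same cost, so the criterion alone cannot prefer y = 5.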

The equal upper and lower method minimizes the absolute value of the sum of the signed residuals.

That is, it equates the sum of the absolute residuals of the points above the line with the sum of the absolute residuals of the points below the line.

It has the ability to place the line midway between the points above and the points below.
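For the two-point example (0,0), (0,10), the balance condition can be sketched as follows. The helper `above_below` is a hypothetical name, and only horizontal candidate lines y = c are considered here as a simplifying assumption.

```python
import numpy as np

y = np.array([0.0, 10.0])          # the two points, both at x = 0

def above_below(c):
    """Return (sum of |residuals| above the line, sum below) for y = c."""
    r = y - c
    above = float(r[r > 0].sum())
    below = float(np.abs(r[r < 0]).sum())
    return above, below

balance = {c: above_below(c) for c in (0.0, 5.0, 10.0)}
print(balance)
```

Only y = 5 makes the two sums equal, which matches the intuitive choice.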

It is possible to make up for the shortcomings of the minimum absolute value method while maintaining the advantage of being less affected by outliers.

If a human were to look at the plot and try to draw a regression line, this would be the method.

First, linearly interpolate all points.

Instead of fitting a line to points, fit a line to the linearly interpolated line.

Integrate the squared difference (or the absolute difference) between the two lines, and minimize it in the same way as for points.

The absolute value method corresponds to minimizing the area between lines.
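A rough sketch of the area criterion, assuming the integral is approximated with the trapezoidal rule on a dense uniform grid. The function name `area_between`, the grid size, and the three sample points are illustrative assumptions.

```python
import numpy as np

def area_between(xs, ys, slope, intercept, n_grid=2001):
    """Approximate area between the linear interpolation of (xs, ys)
    and the line y = slope * x + intercept. xs must be increasing."""
    grid = np.linspace(xs[0], xs[-1], n_grid)
    interpolated = np.interp(grid, xs, ys)
    gap = np.abs(interpolated - (slope * grid + intercept))
    # Trapezoidal rule written out by hand.
    return float(np.sum((gap[1:] + gap[:-1]) / 2.0 * np.diff(grid)))

# Hypothetical data: a triangle with its peak at x = 1.
xs = np.array([0.0, 1.0, 2.0])
ys = np.array([0.0, 1.0, 0.0])

area_mid = area_between(xs, ys, 0.0, 0.5)   # candidate line y = 0.5
area_low = area_between(xs, ys, 0.0, 0.0)   # candidate line y = 0
print(area_mid, area_low)
```

A line-line fit under the absolute value criterion would search over slope and intercept for the smallest such area; here y = 0.5 leaves less area than y = 0.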

For line-line regression, the equal upper and lower method gives the same result as the absolute value method.

Consider the proper use of "point-line regression" and "line-line regression".

To predict stock prices, consider drawing a regression line through stock prices over the past year.

Suppose the data is basically daily, but for April alone it is recorded every minute.

If you draw a regression line in the usual way, the April data will dominate the result.

That is a good result if you want to interpret the stock price in April.

It is not suitable if you want to make predictions for months other than April.

Given that the sample was randomly selected, it is highly probable that the next random sample will also be April data.

For the purpose of predicting what the next random sample will be, this is a good result.

But if you know the month of the data you want to predict, a month-weighted regression line is better.

If you know the month you want to predict, it's good to do a local regression.

Suppose we want to draw a regression line that is unbiased across all months, rather than one tied to a particular month.

In that case, if the points are linearly interpolated first, the result is no longer biased by how many data points each month contains.
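The effect can be sketched with synthetic data. Everything below is an assumption made for illustration: the price curve `price`, the day range 90 to 120 standing in for April, and the approximation of the squared-difference integral by resampling the interpolated line on a uniform grid.

```python
import numpy as np

def price(t):
    # Hypothetical price curve, nearly flat around mid-April (day 105).
    return 0.001 * (t - 105.0) ** 2

daily_t = np.arange(365, dtype=float)            # one sample per day
april_t = np.arange(90.0, 120.0, 1.0 / 1440.0)   # one sample per minute in "April"
t = np.union1d(daily_t, april_t)                 # sorted, duplicates removed
p = price(t)

# Reference: fit on evenly spaced daily data only.
slope_daily, _ = np.polyfit(daily_t, price(daily_t), 1)

# Point-line regression: every sample counts once, so April dominates.
slope_points, _ = np.polyfit(t, p, 1)

# Line-line regression (approximation): resample the linear interpolation
# uniformly in time, then fit, so every part of the year counts equally.
grid = np.linspace(t[0], t[-1], 3650)
slope_lines, _ = np.polyfit(grid, np.interp(grid, t, p), 1)

print(slope_daily, slope_points, slope_lines)
```

In this sketch the point-line slope is pulled toward the flat April segment, while the resampled fit stays close to the evenly sampled reference.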

"Line-Line Regression" aims to describe the population.

Random sampling does not mean that the sample is evenly distributed from the population.

Sampling can be biased by accident.

For example, if you randomly draw 365 daily prices, with replacement, from a year of stock data, it is unlikely that all of the sampled days will be distinct.
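A quick simulation illustrates the point, assuming the 365 draws are made uniformly and with replacement from 365 days. The seed is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)               # arbitrary seed for repeatability
days = rng.integers(0, 365, size=365)        # 365 draws with replacement
n_unique = np.unique(days).size
n_repeated = 365 - n_unique

print("distinct days:", n_unique)
print("draws that landed on an already-sampled day:", n_repeated)
```

Under this assumption a substantial share of the days is never sampled, while others are sampled several times.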

The result of random sampling is almost always uneven in this way, and that unevenness should be removed.

Think about the amount of information.

When you look at the result of rolling a die, you gain information.

At that moment, you gain information about which face came up and information about what was written on that face.

The same is true for random sampling from the population.

You get information about which day was randomly selected and what the stock price was on that day.

The latter information is included in the population, but the former information is not.

The former information should be discarded when making population inferences.

Therefore, the points should be linearly interpolated.

In ordinary local regression, a weighted average is taken, with larger weights for points closer to the point being estimated.

However, if the points are concentrated on either the left or the right side, the estimate will be biased.

On the other hand, if local regression is applied to the linearly interpolated line, the left/right bias has already been eliminated.
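A minimal sketch of this difference, assuming a Gaussian-weighted average as the local estimator. The data, the bandwidth `h`, and the function name `local_average` are illustrative assumptions. The points lie on y = x and are crowded to the left of the query point x0 = 0.5.

```python
import numpy as np

def local_average(x, y, x0, h):
    """Gaussian-weighted average of y around x0 with bandwidth h."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)
    return float(np.sum(w * y) / np.sum(w))

# Hypothetical points on y = x, dense on the left, sparse on the right.
x_pts = np.array([0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 1.0])
y_pts = x_pts.copy()
x0, h = 0.5, 0.3

# Local regression on the raw points: the left-hand crowd pulls it down.
est_points = local_average(x_pts, y_pts, x0, h)

# Local regression on the interpolated line, resampled uniformly.
grid = np.linspace(0.0, 1.0, 101)
est_line = local_average(grid, np.interp(grid, x_pts, y_pts), x0, h)

print(est_points, est_line)
```

The true value at x0 is 0.5. The estimate from the raw points falls below it, while the estimate from the interpolated line recovers it.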

However, linear interpolation cannot extrapolate.

Robust regression is a way of eliminating the effect of outliers by ignoring them.

It is a problem when a single outlier that is off by an order of magnitude pulls the regression curve far away from the rest of the data.
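One simple way to "ignore" an outlier is to fit, drop the points with unusually large residuals, and refit. The sketch below uses a cutoff of three times the median absolute residual. That rule, the data, and the variable names are assumptions chosen for illustration, not a method prescribed by these notes.

```python
import numpy as np

x = np.arange(10, dtype=float)
y = 2.0 * x + 1.0                  # hypothetical points on y = 2x + 1
y[-1] *= 10.0                      # one value is off by an order of magnitude

# First pass: ordinary least squares, distorted by the outlier.
slope0, intercept0 = np.polyfit(x, y, 1)

# Keep only points whose residual is not unusually large.
abs_res = np.abs(y - (slope0 * x + intercept0))
keep = abs_res <= 3.0 * np.median(abs_res)

# Second pass: refit on the retained points.
slope1, intercept1 = np.polyfit(x[keep], y[keep], 1)

print("first fit slope:", slope0)
print("points kept:", int(keep.sum()))
print("refit slope:", slope1)
```

The refit recovers the slope of the nine consistent points, at the cost of discarding the tenth entirely.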

However, outliers can be meaningful, and sometimes they must not be ignored.

If there are so many outliers that they rival the data you thought were correct, you can no longer tell which set is correct.

Let us regard the majority as correct.

But as data is gradually added, the minority and the majority may swap places at some point.

Therefore, rather than ignoring outliers, treat them as another group of valid data.

There are multiple regression lines, and each data point belongs to the group of its closest regression line.

For example, consider the case where each point is plotted on either "y=x" or "y=2x" at random, with a probability of 50% each.

Such data can never be expressed by a single regression curve.
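The idea of several regression lines with each point assigned to its closest line can be sketched as an alternating procedure, similar in spirit to k-means: assign points to the nearest line, refit each line, repeat. The function `fit_lines`, the initial guesses, and the noise-free data are assumptions for illustration, and a procedure like this can depend heavily on the initial guesses.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(1.0, 10.0, size=40)
true_group = rng.integers(0, 2, size=40)        # 50% chance for each line
y = np.where(true_group == 0, x, 2.0 * x)       # on y = x or on y = 2x

def fit_lines(x, y, init, n_iter=20):
    """Alternate between assigning points to the closest line
    (vertical distance) and refitting each line by least squares."""
    lines = np.array(init, dtype=float)         # rows of (slope, intercept)
    labels = np.zeros(x.size, dtype=int)
    for _ in range(n_iter):
        pred = lines[:, :1] * x + lines[:, 1:]  # shape (n_lines, n_points)
        labels = np.argmin(np.abs(y - pred), axis=0)
        for k in range(len(lines)):
            mask = labels == k
            if mask.sum() >= 2:                 # need two points to fit a line
                lines[k] = np.polyfit(x[mask], y[mask], 1)
    return lines, labels

# Hypothetical starting guesses: y = 0.5x and y = 3x.
lines, labels = fit_lines(x, y, init=[(0.5, 0.0), (3.0, 0.0)])
print(lines)
```

With these starting guesses the two fitted lines settle on slopes 1 and 2, and every point is assigned to the line that generated it.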

There need not be exactly two groups.

How to divide the data into groups then becomes a clustering problem.