All biases fall into three categories:

(1) [Necessity bias] Bias between the extracted samples and the prediction target

(2) [Selectivity bias] Bias that arises when samples are not selected according to the sampling criteria

(3) [Chance bias] Bias due to chance or unknown explanatory variables

As an example, consider the case of predicting the sweetness of strawberries.

(1) is the state in which the average size of the extracted strawberries is larger than that of the target strawberry.

(2) is the state in which some strawberries were extracted and others were not, even though they look exactly the same.

(3) is the state in which, although they look exactly the same, the strawberries that were extracted happen to have a biased sweetness.

(1) alone does not take into account the information of samples that were not extracted.

If we consider both (1) and (2), all known information is taken into account.

(1) and (2) together still cover only the contribution of the known explanatory variables to what we are trying to predict.

(3) considers the contribution of chance and unknown explanatory variables.

Taken together, (1), (2), and (3) cover everything necessary to explain the prediction target.

It is necessary to check not only that no bias has been omitted but also that no bias has been counted twice.

Suppose in (2) that some strawberries were not extracted even though the extraction criterion was "must be a strawberry."

This can be interpreted as selection having been performed by some other explanatory variable.

In other words, the extraction criterion has changed from the original "must be a strawberry".

Sampling was not performed correctly with respect to the "planned" extraction criterion.

It can be said, however, that extraction was performed correctly with respect to the "actual" extraction criterion.

If we consider the "actual" extraction criterion, the bias of (2) disappears and is absorbed into (1).

A "weight" is given to each sample, and "certain explanatory variables" should be able to explain that weight together with the objective variable.

When verifying the credibility of someone else's inference, it is necessary to check whether the "weight" was in fact assigned according to "certain explanatory variables".

Identify a "certain explanatory variable" that can explain the "weight".

If the known explanatory variables cannot explain the weight, a new explanatory variable is assumed.

The new explanatory variable may be freely assigned whatever values explain the weights well.

The one exception is the sample to be predicted: its value for the new explanatory variable must not be chosen freely, and is treated as "unknown".

The "unknown" value may be assigned a probability distribution estimated by inductive reasoning.

The weights of the samples to be predicted are also recomputed with the assumed weighting criteria.

For example, if the weight of the sample to be predicted is 0.9, the result of inductive inference is 90% and "unknown" is 10%.
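The 90% / 10% split above can be sketched in a few lines. This is only an illustration: the function name `blend_with_unknown`, the dictionary representation, and the sweetness classes are assumptions introduced here, not part of the original method.

```python
def blend_with_unknown(induced, target_weight):
    """Scale an induced distribution by the target weight; the rest is 'unknown'.

    induced: dict mapping outcome -> probability from inductive inference
             (assumed to sum to 1).
    target_weight: weight of the sample to be predicted, between 0 and 1.
    """
    if not 0.0 <= target_weight <= 1.0:
        raise ValueError("target_weight must be between 0 and 1")
    blended = {outcome: p * target_weight for outcome, p in induced.items()}
    # Whatever the weight does not cover is left as "unknown".
    blended["unknown"] = 1.0 - target_weight
    return blended


# Hypothetical induced distribution over strawberry sweetness classes.
induced = {"sweet": 0.7, "sour": 0.3}
result = blend_with_unknown(induced, 0.9)
print(result)
```

With a weight of 0.9, the induced probabilities together account for 90% of the result and the "unknown" entry for the remaining 10%.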

Suitability (bias) as a member of the sample set to be predicted is expressed as a weight.

Even when multiple hypotheses are combined, the hypotheses are represented only by the "weight distribution".

There are multiple ways to calculate the "prediction target weight" by back-calculating the extraction criteria from the "weight distribution".

It suffices to select the one that maximizes the "prediction target weight".

"Selectivity bias" is expressed as the "prediction target weight".

The bias of the explanatory variables between the prediction target sample and the extracted sample set is the necessity bias.

This applies to all explanatory variables, not just those used as extraction criteria.

For example, even if "time" is sampled at random, the result will be biased toward the past, because future times cannot be sampled.

However, even if an explanatory variable is biased, if it has no correlation with the objective variable it introduces no bias and can be ignored.

The size of the residual indicates the strength of the relationship when all explanatory variables are considered.

First, we hypothesize a formula that predicts the objective variable from the other variables.

For each sample, the difference between the prediction and the actual measurement is the sample bias.

The weighted average of the sample biases gives the bias of the entire distribution.

To handle "unknown", compute this for everything except "unknown", then multiply the result by (sum of weights) ÷ (sum of weights - weight of "unknown").

The necessity bias is in the same units as the objective variable.
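The calculation above might look like the following sketch. It rests on several assumptions that the text does not spell out: the sample bias is taken as prediction minus measurement, the "sum of weights" is read as including the weight of "unknown", and the name `necessity_bias` and the example numbers are hypothetical.

```python
def necessity_bias(predictions, actuals, weights, unknown_weight=0.0):
    """Weighted average of per-sample biases, rescaled to exclude 'unknown'.

    Assumption: sample bias = prediction - actual measurement.
    Assumption: the total weight includes the weight of 'unknown'.
    """
    total = sum(weights) + unknown_weight
    known = total - unknown_weight
    if known <= 0:
        raise ValueError("the known samples must carry positive weight")
    sample_biases = [p - a for p, a in zip(predictions, actuals)]
    # Weighted average over the total weight, with 'unknown' contributing nothing.
    partial = sum(w * b for w, b in zip(weights, sample_biases)) / total
    # Rescale by (sum of weights) / (sum of weights - weight of 'unknown').
    return partial * total / known


# Hypothetical sweetness predictions and measurements for three strawberries.
bias = necessity_bias(
    predictions=[10.0, 12.0, 11.0],
    actuals=[9.0, 12.0, 13.0],
    weights=[1.0, 2.0, 1.0],
    unknown_weight=1.0,
)
print(bias)
```

Under this reading, the rescaling step is equivalent to taking the weighted average over the known samples only, so the result stays in the units of the objective variable.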

Even if a certain explanatory variable explains the objective variable well, that explanatory variable may be biased by chance.

If you choose the best one out of 1000 explanatory variables, then even if all of them are random variables, the chosen one will look as if it follows a regular pattern.
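A small simulation illustrates the point. It is a sketch under assumed settings (20 samples, standard normal noise, absolute Pearson correlation as the measure of "explains well"); none of these choices come from the original text.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n_samples, n_variables = 20, 1000

# The objective variable and all candidate explanatory variables are pure noise.
target = [rng.gauss(0, 1) for _ in range(n_samples)]
abs_corrs = sorted(
    abs(pearson([rng.gauss(0, 1) for _ in range(n_samples)], target))
    for _ in range(n_variables)
)

best = abs_corrs[-1]                    # the variable we would have "chosen"
median = abs_corrs[n_variables // 2]    # a typical random variable
print(f"best |r| = {best:.2f}, median |r| = {median:.2f}")
```

The best of the 1000 candidates shows a far stronger correlation than a typical one, even though none of them has any real relationship with the objective variable.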

From only the "weight distribution" information given, one has to infer how much random influence is involved.

You can compare it with the case where the "weight distribution" is assigned at random.

Taking the uniform distribution as the baseline, a "weight distribution" that gives a larger Wasserstein distance for the objective variable is considered to be farther from random.

Compute this Wasserstein distance for all combinations of the "weight distribution".

The position of the observed distance among these, expressed as a ratio, shows how much chance is involved.

Chance bias has a value between 0 and 1.
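One possible reading of this procedure is sketched below. The assumptions are significant: "all combinations" is taken to mean every permutation of the given weights over the samples, the distance is the one-dimensional Wasserstein-1 distance between the weighted and the uniformly weighted objective values, and the ratio is the share of permutations at least as far from uniform as the observed one. The function names are hypothetical, and full enumeration is only feasible for a handful of samples.

```python
from itertools import permutations


def wasserstein_from_uniform(values, weights):
    """1-D Wasserstein-1 distance between weighted and uniformly weighted values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    xs = [values[i] for i in order]
    total = sum(weights)
    ps = [weights[i] / total for i in order]
    q = 1.0 / len(xs)
    dist, cum_p, cum_q = 0.0, 0.0, 0.0
    # Integrate |CDF_weighted - CDF_uniform| over the gaps between sorted values.
    for k in range(len(xs) - 1):
        cum_p += ps[k]
        cum_q += q
        dist += abs(cum_p - cum_q) * (xs[k + 1] - xs[k])
    return dist


def chance_ratio(values, weights):
    """Share of weight permutations at least as far from uniform as the observed one."""
    observed = wasserstein_from_uniform(values, weights)
    distances = [
        wasserstein_from_uniform(values, perm) for perm in permutations(weights)
    ]
    # Small tolerance so that ties are not lost to floating-point rounding.
    at_least = sum(d >= observed - 1e-12 for d in distances)
    return at_least / len(distances)


# Hypothetical sweetness values; all weight sits on the sweetest strawberry.
ratio = chance_ratio([1.0, 2.0, 3.0, 4.0, 5.0], [0.0, 0.0, 0.0, 0.0, 1.0])
print(ratio)
```

The returned ratio always lies between 0 and 1. Whether a small or a large ratio should count as the larger chance bias is a choice this sketch makes, not something the text fixes. For realistic sample sizes, drawing random permutations instead of enumerating all of them would be the practical substitute.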

The "chance bias" can be treated as "unknown" when the "necessity bias" is calculated.

The prediction target weight (selectivity bias) can also be shifted onto the samples.

The sum of all biases is in the same units as the objective variable.

The sum of all biases does not include "unknown".

The optimal inference is the one with the smallest sum of all biases.