
[4.3.4] Dive into Deep Learning : exercise answers

 

Reference: 4.3. The Base Classification Model — Dive into Deep Learning 1.0.3 documentation (d2l.ai)

[1]

 

$L_v$ denotes the 'averaged total validation loss' and $L^q_v$ denotes the 'averaged validation loss of a minibatch'. The question asks us to find the relationship between $L_v$ and $L^q_v$.

 

Let

sample size $= N$ (the total number of examples in the dataset),

minibatch size $= M$ (the number of examples in one minibatch).

Thus $\frac{N}{M} =$ number of minibatches $= \alpha$.

 

$$L_v = \frac{1}{\alpha} \sum_{i=1}^{\alpha} L^q_{v_i}$$

That is, $L_v$ is obtained by averaging the validation losses $L^q_{v_i}$ of all $\alpha$ minibatches.
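As a sanity check, here is a minimal NumPy sketch (the array names, sizes, and the random stand-in losses are my own assumptions, not part of the exercise) showing that when every minibatch holds exactly $M$ examples, averaging the per-minibatch mean losses reproduces the full validation loss:

```python
import numpy as np

np.random.seed(0)

N, M = 1000, 50        # sample size and minibatch size (assumed values)
alpha = N // M         # number of minibatches

# Stand-in per-example validation losses, as any loss function would produce.
losses = np.random.rand(N)

# L_v: loss averaged over the whole validation set.
L_v = losses.mean()

# L^q_{v_i}: mean loss of each of the alpha minibatches, then averaged.
minibatch_means = losses.reshape(alpha, M).mean(axis=1)
L_v_from_minibatches = minibatch_means.mean()

# Identical when all minibatches have exactly M examples.
assert np.isclose(L_v, L_v_from_minibatches)
print(L_v, L_v_from_minibatches)
```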


[2]

 

In our case, there is no biased probability: every minibatch has the same probability of occurring, namely $1/\alpha$. This means the expected value matches the averaged value.

 

Since $L_v$ is the average of the $L^q_v$ values, their expected values coincide: $E[L_v] = E[L^q_v]$.

 

Reasons why we should still use $L_v$ even though $L^q_v$ is unbiased:

 

1. There is no guarantee that the data are well distributed across the minibatches.

 

2. In statistics, it is known that an unbiased estimator becomes reliable only when the batch size and the number of batches are large enough, as the simulation below illustrates.
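A small simulation can illustrate both points above. This is a hedged sketch of my own (the Bernoulli 0/1 losses and the sample sizes are assumptions, not from the exercise) comparing the full-set loss $L_v$ with a single-minibatch estimate $L^q_v$ over many random draws: both center on the same expected loss, but the minibatch estimate fluctuates far more.

```python
import numpy as np

rng = np.random.default_rng(0)

N, M = 1000, 50                  # validation-set and minibatch sizes (assumed)
trials = 10_000

full_set_losses = []
one_batch_losses = []
for _ in range(trials):
    # Hypothetical 0/1 per-example losses (e.g. classification errors).
    losses = rng.binomial(1, 0.3, size=N).astype(float)
    full_set_losses.append(losses.mean())        # L_v over all N examples
    one_batch_losses.append(losses[:M].mean())   # L^q_v from a single minibatch

# Both estimators center on the true expected loss (0.3): unbiased.
print(np.mean(full_set_losses), np.mean(one_batch_losses))
# But the single-minibatch estimate has much higher variance.
print(np.std(full_set_losses), np.std(one_batch_losses))
```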


[3]

 

Let $L(\hat{y} \mid x)$ be the expected loss. For an optimal $\hat{y}$, we have to find the $\hat{y}$ that minimizes $L$. Thus $\hat{y}_{\text{optimal}} = \operatorname*{argmin}_{\hat{y}} L(\hat{y} \mid x)$.

 

Just as an expected value is a probability-weighted average over outcomes, the expected loss is the probability-weighted average of the losses over all possible labels:

 

$$L(\hat{y} \mid x) = \sum_{y} l(y, \hat{y}) \, P(y \mid x)$$

 

Here, $P(y \mid x)$ is the probability of label $y$ given input $x$, and $l(y, \hat{y})$ is the penalty for predicting $\hat{y}$ when the true label is $y$.

 

Thus the optimal selection of $\hat{y}$ is

$$\hat{y}_{\text{optimal}} = \operatorname*{argmin}_{\hat{y}} \sum_{y} l(y, \hat{y}) \, P(y \mid x)$$
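To make the rule concrete, here is a minimal sketch for a 3-class problem (the loss matrix and the conditional probabilities are illustrative assumptions, not from the exercise) that picks the $\hat{y}$ minimizing the expected loss:

```python
import numpy as np

# loss[y, y_hat]: penalty for predicting y_hat when the true label is y.
# A hypothetical asymmetric loss; a 0-1 loss would recover argmax P(y|x).
loss = np.array([[0.0, 1.0, 4.0],
                 [1.0, 0.0, 1.0],
                 [4.0, 1.0, 0.0]])

# Hypothetical conditional label distribution P(y|x) for one input x.
p_y_given_x = np.array([0.2, 0.5, 0.3])

# Expected loss of each candidate: L(y_hat|x) = sum_y l(y, y_hat) P(y|x).
expected_loss = p_y_given_x @ loss

y_hat_optimal = np.argmin(expected_loss)
print(expected_loss)   # [1.7, 0.5, 1.3]
print(y_hat_optimal)   # 1: the prediction that minimizes the expected loss
```

Note that under the asymmetric loss above the optimal choice can differ from the most probable label, which is exactly why the rule is stated as an argmin over the expected loss rather than an argmax over $P(y \mid x)$.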
