Problem with the entropy #2
The entropy of a Gaussian distribution is
k/2 * log(2 * pi * e) + 1/2 * log(|Sigma|) according to Wikipedia, where k is the dimension of the distribution.
However, in the code the entropy is calculated as -1/2 * (log(2*pi + |sigma|) + 1).
Why?
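For reference, a minimal sketch of the Wikipedia formula under a spherical covariance sigma^2 * I, where log(|Sigma|) = k * log(sigma^2), so the total entropy is k times the per-dimension value. The function name and shapes are illustrative, not taken from the repo:

```python
import math
import torch

# Differential entropy of a 1-D Gaussian: 1/2 * log(2*pi*e*sigma^2),
# which equals 1/2 * (log(2*pi*sigma^2) + 1).
# For a k-dim Gaussian with spherical covariance sigma^2 * I,
# log|Sigma| = k * log(sigma^2), so the total is k times the 1-D value.
def gaussian_entropy(sigma_sq: torch.Tensor) -> torch.Tensor:
    """sigma_sq: per-dimension variances, shape [k]."""
    return (0.5 * (torch.log(2 * math.pi * sigma_sq) + 1)).sum()

print(gaussian_entropy(torch.tensor([0.25, 0.25])))  # ~1.45 nats for k=2
```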
It's a multidimensional normal distribution with a spherical covariance. Then you can divide by k to reduce the weight of the entropy loss. The -1 is to make the quantity positive, so that gradient descent will push it close to 0. But you are right, there is still a typo: it should be -1/2 * (log(2*pi * |sigma|) + 1).
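If the typo is the '+' where a '*' was intended, a quick numeric check (with a hypothetical variance of 0.25) shows how much the two expressions differ:

```python
import math

sigma_sq = 0.25  # hypothetical variance
buggy = -0.5 * (math.log(2 * math.pi + sigma_sq) + 1)  # '+' as in the code
fixed = -0.5 * (math.log(2 * math.pi * sigma_sq) + 1)  # '*' after the typo fix
print(buggy)  # ~ -1.44
print(fixed)  # ~ -0.73
```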
@alexis-jacq @xuehy Have you tried the modified entropy? I also found that the original entropy calculation seems wrong, and changed it to @alexis-jacq's version. But the original one seems to perform better, though I'm testing on a different environment (not MuJoCo). I want to know how the modified entropy changes learning in the MuJoCo environment. Unfortunately, I couldn't run MuJoCo because of my Python version...
I have a doubt about using the entropy as well. If we use as the loss the probability density function f of the Gaussian, with mu and sigma^2 estimated by the net, evaluated at the point a corresponding to the executed action, we find as its derivative with respect to sigma^2: f(a) * ((a - mu)^2 / (2*sigma^4) - 1/(2*sigma^2)). If we also add the entropy to the loss (with a minus sign), following the formula mentioned by @alexis-jacq, its derivative with respect to sigma^2 gains the extra term -1/(2*sigma^2), scaled by the constant factor. Since it is suggested to multiply the entropy by a constant factor (1e-4 in Mnih's paper), it seems to me that the contribution of the entropy would be very marginal. Am I missing something?
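To make the magnitude comparison concrete, here is a small autograd sketch under assumed values (sigma^2 = 0.25, a - mu = 1, and the log-density as the likelihood term, which is what A3C-style losses typically use; none of these numbers come from the repo):

```python
import math
import torch

# Hypothetical numbers to compare gradient magnitudes w.r.t. sigma^2:
# the log-density term versus the entropy term scaled by 1e-4.
sigma_sq = torch.tensor(0.25, requires_grad=True)
mu, action = torch.tensor(0.0), torch.tensor(1.0)

log_prob = -0.5 * torch.log(2 * math.pi * sigma_sq) \
           - (action - mu) ** 2 / (2 * sigma_sq)
entropy = 0.5 * (torch.log(2 * math.pi * sigma_sq) + 1)

(g_logp,) = torch.autograd.grad(log_prob, sigma_sq)
(g_ent,) = torch.autograd.grad(entropy, sigma_sq)
print(g_logp.item())        # 6.0  = (a-mu)^2/(2*sigma^4) - 1/(2*sigma^2)
print(1e-4 * g_ent.item())  # 2e-4 = 1e-4 * 1/(2*sigma^2)
```

With these numbers the likelihood gradient is 6.0 while the scaled entropy gradient is 2e-4, consistent with the intuition that a 1e-4 coefficient makes the entropy contribution nearly negligible for moderate variances.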