# Proof check of CPC paper

### Note

With high probability, I’m missing some important point. If you find a mistake in this post, I would appreciate you letting me know.

### Intro

CPC with infoNCE^{1} has been one of the most powerful unsupervised representation learning algorithms of the last few years.
While reading the paper carefully, I noticed a few minor points, so let me write them down here.

### Eq. 10

Appendix A.1 proves that the optimal infoNCE loss $\mathcal{L}_N^{\mathrm{opt}}$ is an upper bound of the negative mutual information shifted by $\log N$, i.e., $\mathcal{L}_N^{\mathrm{opt}} \ge -I(x_{t+k}, c_t) + \log N$, where $N$ is the number of samples used in the loss. However, Eq. 10 in the paper does not always hold.

Let’s start from Eq. 9 in the paper:

$$\mathcal{L}_N^{\mathrm{opt}} = -\mathbb{E}_X \log \left[ \frac{\frac{p(x_{t+k} \mid c_t)}{p(x_{t+k})}}{\frac{p(x_{t+k} \mid c_t)}{p(x_{t+k})} + \sum_{x_j \in X_{\mathrm{neg}}} \frac{p(x_j \mid c_t)}{p(x_j)}} \right] = \mathbb{E}_X \log \left[ 1 + \frac{p(x_{t+k})}{p(x_{t+k} \mid c_t)} \sum_{x_j \in X_{\mathrm{neg}}} \frac{p(x_j \mid c_t)}{p(x_j)} \right],$$

where $X = \{x_1, \dots, x_N\}$ is a set of one positive sample drawn from $p(x_{t+k} \mid c_t)$ and $N - 1$ negative samples drawn from $p(x_{t+k})$.

As you know, $\log$ is a monotonically increasing function, so if

$$1 + \frac{p(x_{t+k})}{p(x_{t+k} \mid c_t)} (N - 1) \ge \frac{p(x_{t+k})}{p(x_{t+k} \mid c_t)} N$$

held, then Eq. 10 in the paper would be derived. But this condition is equivalent to $\frac{p(x_{t+k})}{p(x_{t+k} \mid c_t)} \le 1$, and $\frac{p(x_{t+k})}{p(x_{t+k} \mid c_t)}$ is a density ratio that can be bigger than $1$. Thus we cannot derive Eq. 10 from Eq. 9.
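To make the failure concrete, here is a small numerical check. The ratio $r = p(x_{t+k})/p(x_{t+k} \mid c_t)$ and batch size $N$ below are hypothetical values I picked, not from the paper; any $r > 1$ produces a counterexample to the inequality step.

```python
import math

# r = p(x_{t+k}) / p(x_{t+k} | c_t): the density ratio appearing in Eqs. 9-10.
# Nothing forces r <= 1; r > 1 whenever the context c_t makes x_{t+k} LESS likely.
r = 2.0  # hypothetical ratio bigger than 1
N = 8    # one positive sample plus N - 1 negatives

lhs = math.log(1 + r * (N - 1))  # the quantity inside E_X in Eq. 10's first line
rhs = math.log(r * N)            # the claimed lower bound inside E_X

print(lhs, rhs)
assert lhs < rhs  # the ">=" step of Eq. 10 fails pointwise for this r
```

Here `lhs` is $\log 15 \approx 2.71$ while `rhs` is $\log 16 \approx 2.77$, so the claimed inequality is violated even before taking the expectation.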

Fortunately, we can still obtain almost the same bound, because $1 + t \ge t$ holds for any $t \ge 0$:

$$\mathcal{L}_N^{\mathrm{opt}} \approx \mathbb{E}_X \log \left[ 1 + \frac{p(x_{t+k})}{p(x_{t+k} \mid c_t)} (N - 1) \right] \ge \mathbb{E}_X \log \left[ \frac{p(x_{t+k})}{p(x_{t+k} \mid c_t)} (N - 1) \right] = -I(x_{t+k}, c_t) + \log (N - 1),$$

which gives $I(x_{t+k}, c_t) \ge \log (N - 1) - \mathcal{L}_N^{\mathrm{opt}}$.

### Eq. 15

Eq. 15 states that infoNCE is a lower bound of MINE^{2}, which is itself a lower bound of mutual information. But infoNCE may not be a lower bound of MINE. In Definition 3.1 of the MINE paper, MINE is defined by:

$$I_\Theta(X; Z) = \sup_{\theta \in \Theta} \mathbb{E}_{\mathbb{P}_{XZ}}[T_\theta] - \log \left( \mathbb{E}_{\mathbb{P}_X \otimes \mathbb{P}_Z} \left[ e^{T_\theta} \right] \right),$$

where the $\log$ of the second term is applied outside the expectation.

But in the second term of Eq. 15 in the CPC paper, the $\log$ sits between two expectations, as in $\mathbb{E}_X \left[ \log \sum_{x_j \in X} e^{F(x_j, c_t)} \right]$. Even if we use Jensen’s inequality, $\mathbb{E}[\log(\cdot)] \le \log \mathbb{E}[\cdot]$, the inequality goes in the wrong direction for the claim, so the result is not equivalent to MINE.
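The direction of the Jensen gap is easy to check numerically. The sketch below uses made-up Gaussian critic scores $F(x_j, c_t)$ (the critic and its distribution are my assumptions, not the paper's model); $S = \sum_j e^{F(x_j, c_t)}$ plays the role of the sum inside the second term of Eq. 15.

```python
import math
import random

random.seed(0)

N, trials = 8, 20000  # N samples per batch, many batches for the outer expectation
log_S_values = []
S_values = []
for _ in range(trials):
    F = [random.gauss(0.0, 1.0) for _ in range(N)]  # hypothetical critic scores
    S = sum(math.exp(f) for f in F)
    S_values.append(S)
    log_S_values.append(math.log(S))

e_log_s = sum(log_S_values) / trials        # E[log S]: the form in CPC's Eq. 15
log_e_s = math.log(sum(S_values) / trials)  # log E[S]: the form MINE's term uses

print(e_log_s, log_e_s)
assert e_log_s <= log_e_s  # Jensen: E[log S] <= log E[S]
```

Since the term enters the objective with a minus sign, $-\mathbb{E}[\log S] \ge -\log \mathbb{E}[S]$: the infoNCE-style expression upper-bounds the MINE-style one, which is the opposite of what "infoNCE lower-bounds MINE" would require.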