Bregman divergence is a general class of functions that measure the "difference" between two data points. For instance, squared Euclidean distance and Kullback–Leibler (KL) divergence are instances of Bregman divergence. In this post, I derive the KL divergence from the Bregman divergence formulation (as a note to myself).

The Bregman divergence associated with a strictly convex, differentiable function $F$ is defined by the equation below:

$$
D_F(p, q) = F(p) - F(q) - \langle \nabla F(q),\, p - q \rangle,
$$

where $\langle \cdot, \cdot \rangle$ means the inner product. When $F(p) = \sum_i p_i \log p_i$ (the negative entropy), this Bregman divergence is equivalent to the KL divergence.
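As a minimal sketch of the definition (the helper names below are mine, not from any library), the divergence can be computed directly from a convex generator $F$ and its gradient. Choosing $F(x) = \|x\|^2$ recovers the squared Euclidean distance:

```python
import numpy as np

def bregman_divergence(F, grad_F, p, q):
    """Bregman divergence D_F(p, q) = F(p) - F(q) - <grad F(q), p - q>."""
    return F(p) - F(q) - np.dot(grad_F(q), p - q)

# Example generator: F(x) = ||x||^2, whose Bregman divergence is ||p - q||^2.
sq_norm = lambda x: np.dot(x, x)
grad_sq_norm = lambda x: 2.0 * x

p = np.array([1.0, 2.0])
q = np.array([0.5, 1.5])
print(bregman_divergence(sq_norm, grad_sq_norm, p, q))  # 0.5
print(np.sum((p - q) ** 2))                             # 0.5, matches
```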

Plugging $F(p) = \sum_i p_i \log p_i$ and $\nabla F(q)_i = \log q_i + 1$ into the definition gives

$$
\begin{aligned}
D_F(p, q) &= \sum_i p_i \log p_i - \sum_i q_i \log q_i - \sum_i (\log q_i + 1)(p_i - q_i) \\
&= \sum_i p_i \log p_i - \sum_i q_i \log q_i - \sum_i p_i \log q_i + \sum_i q_i \log q_i - \sum_i p_i + \sum_i q_i \\
&= \sum_i p_i \log p_i - \sum_i p_i \log q_i \\
&= \sum_i p_i \log \frac{p_i}{q_i} = \mathrm{KL}(p \,\|\, q).
\end{aligned}
$$

From Line 2 to Line 3, $\sum_i p_i = \sum_i q_i = 1$ since $p$ and $q$ are probability distributions (and the $\sum_i q_i \log q_i$ terms cancel).
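The result is easy to check numerically. Below is a small sanity check (again with my own helper names) that the Bregman divergence induced by the negative-entropy generator coincides with the KL divergence computed directly:

```python
import numpy as np

neg_entropy = lambda x: np.sum(x * np.log(x))       # F(p) = sum_i p_i log p_i
grad_neg_entropy = lambda x: np.log(x) + 1.0        # grad F(q)_i = log q_i + 1

def bregman_divergence(F, grad_F, p, q):
    return F(p) - F(q) - np.dot(grad_F(q), p - q)

rng = np.random.default_rng(0)
p = rng.random(4); p /= p.sum()                     # random probability distribution
q = rng.random(4); q /= q.sum()

kl = np.sum(p * np.log(p / q))                      # KL(p || q) computed directly
print(bregman_divergence(neg_entropy, grad_neg_entropy, p, q), kl)  # agree up to float error
```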

Reference

This blog post explains Bregman divergence in detail and gives useful links.