Simple Notes on Jensen's Inequality

1 What is Jensen's Inequality?

In mathematics, Jensen's Inequality, named after the Danish mathematician Johan Jensen, relates the value of a convex function of an integral to the integral of the convex function.

Jensen proved the inequality in 1906, building on Otto Hölder's earlier 1889 proof of the special case for doubly differentiable functions.

1.1 Convex and Concave Functions

To understand the inequality itself, we first need to distinguish between convex and concave functions:

  • Convex Function: A function is convex if, for any two points on its graph, the line segment connecting them lies on or above the graph. In mathematical terms, for a convex function \(\varphi\), if \(x_1\) and \(x_2\) are in its domain and \(0 \leq \theta \leq 1\), then:

    \[\varphi(\theta x_1 + (1-\theta) x_2) \leq \theta \varphi(x_1) + (1-\theta) \varphi(x_2) \]

  • Concave Function: A function is concave if the opposite holds true; that is, the line segment connecting any two points on its graph lies on or below the graph.
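The chord definition above can be checked numerically. The sketch below (the function name `is_convex_on_sample` is illustrative, not from the source) samples many values of \(\theta\) between two points and tests the inequality for the known convex function \(\varphi(x) = x^2\):

```python
# Check the chord definition of convexity on one segment [x1, x2]:
# phi(theta*x1 + (1-theta)*x2) <= theta*phi(x1) + (1-theta)*phi(x2)
# should hold for every theta in [0, 1] if phi is convex.
def is_convex_on_sample(phi, x1, x2, steps=100):
    for k in range(steps + 1):
        theta = k / steps
        lhs = phi(theta * x1 + (1 - theta) * x2)
        rhs = theta * phi(x1) + (1 - theta) * phi(x2)
        if lhs > rhs + 1e-12:  # small tolerance for float rounding
            return False
    return True

phi = lambda x: x ** 2
print(is_convex_on_sample(phi, -3.0, 5.0))  # True: x^2 is convex
```

Running the same check on \(-x^2\) (a concave function) returns `False`, since the chord dips below the graph at interior points.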

2 Finite form of Jensen's Inequality

Now, let's state Jensen's Inequality:

  • For a real convex function \(\varphi\) and real numbers \(x_1, x_2, \ldots, x_n\) in its domain, if \(\theta_1, \theta_2, \ldots, \theta_n\) are non-negative numbers that sum to 1, then:

    \[\varphi(\theta_1 x_1 + \theta_2 x_2 + \ldots + \theta_n x_n) \leq \theta_1 \varphi(x_1) + \theta_2 \varphi(x_2) + \ldots + \theta_n \varphi(x_n) \]

    • In other words, a convex function \(\varphi\) applied to the weighted average of its domain points is always less than or equal to the weighted average of the function values at those points.
    • Equality holds if and only if \(x_1=x_2=\cdots=x_n\) or \(\varphi\) is linear on a domain containing \(x_1, x_2, \cdots, x_n\).
  • For a concave function \(\varphi\), the inequality reverses:

    \[ \varphi(\theta_1 x_1 + \theta_2 x_2 + \ldots + \theta_n x_n) \geq \theta_1 \varphi(x_1) + \theta_2 \varphi(x_2) + \ldots + \theta_n \varphi(x_n) \]
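As a quick numerical sketch of the finite form (the helper `jensen_gap` is an illustrative name, not from the source), we can compare \(\varphi\) of the weighted mean against the weighted mean of \(\varphi\) for the convex function \(\varphi(x) = e^x\):

```python
import math

# jensen_gap returns phi(weighted mean) - weighted mean of phi.
# By the finite form of Jensen's inequality this is <= 0 when phi is
# convex (and >= 0 when phi is concave).
def jensen_gap(phi, xs, thetas):
    mean = sum(t * x for t, x in zip(thetas, xs))
    return phi(mean) - sum(t * phi(x) for t, x in zip(thetas, xs))

xs = [0.0, 1.0, 2.0, 4.0]
thetas = [0.1, 0.2, 0.3, 0.4]  # non-negative weights summing to 1
print(jensen_gap(math.exp, xs, thetas) <= 0)  # True: exp is convex
```

Swapping in the concave function `math.log` (with positive \(x_i\)) flips the sign of the gap, matching the reversed inequality for concave functions.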

2.1 Examples

  • When \(\theta_i = \frac{1}{n}\)

    For a real convex function \(\varphi\) and numbers \(x_{1}, x_{2}, \ldots, x_{n}\) in its domain,

    \[\varphi\left(\frac{\sum x_{i}}{n}\right) \leq \frac{\sum \varphi\left(x_{i}\right)}{n} \]

  • When \(\theta_i = \frac{1}{n}\) and \(\varphi(x)=x^2\)

    Let \(x_{1}, \ldots, x_{n}\) be real numbers. Then

    \[A_{n}=\frac{x_{1}+\ldots+x_{n}}{n} \leq Q_{n}=\sqrt{\frac{x_{1}^{2}+\ldots+x_{n}^{2}}{n}} \]

    which gives

    \[ x_{1}+\ldots+x_{n} \leq \sqrt{n(x_{1}^{2}+\ldots+x_{n}^{2})} \]
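This arithmetic-mean vs. quadratic-mean inequality is easy to verify on a concrete sample (the variable names below are illustrative):

```python
import math

# Check A_n <= Q_n (arithmetic mean vs. quadratic mean) on a sample
# list of reals; this is Jensen's inequality with phi(x) = x**2 and
# equal weights theta_i = 1/n.
xs = [1.0, -2.0, 3.5, 0.0, 4.0]
n = len(xs)
arithmetic_mean = sum(xs) / n                          # A_n
quadratic_mean = math.sqrt(sum(x * x for x in xs) / n)  # Q_n
print(arithmetic_mean <= quadratic_mean)  # True
```

Equality would hold only if all the \(x_i\) were equal, matching the equality condition stated above.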

3 Measure-theoretic form of Jensen's Inequality

Let

  • \((\Omega, A, \mu)\) be a measure space with \(\mu(\Omega) = 1\)
  • \(f: \Omega \rightarrow \mathbb{R}\) be a \(\mu\)-integrable function
  • \(\varphi: \mathbb{R} \rightarrow \mathbb{R}\) be convex

Then

\[ \varphi\left(\int_{\Omega} f \mathrm{~d} \mu\right) \leq \int_{\Omega} \varphi \circ f \mathrm{~d} \mu \]

3.1 Example in probability space

Let

  • \((\Omega, \mathfrak{F}, \mathrm{P})\) be a probability space
  • \(X\) be an integrable real-valued random variable
  • \(\varphi\) be a convex function

Then

\[ \varphi(\mathrm{E}[X]) \leq \mathrm{E}[\varphi(X)] \]

This is the measure-theoretic form with the function \(f\) taken to be the random variable \(X\) and the integral with respect to \(\mu\) read as the expected value \(\mathrm{E}\).
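The probabilistic form can be illustrated by Monte Carlo simulation. The sketch below (distribution and function choices are illustrative assumptions) draws samples of \(X \sim \mathrm{Uniform}(0, 1)\) and compares the sample analogues of \(\varphi(\mathrm{E}[X])\) and \(\mathrm{E}[\varphi(X)]\) for \(\varphi(x) = x^2\):

```python
import random
import statistics

# For X ~ Uniform(0, 1) and phi(x) = x**2:
#   phi(E[X]) = (1/2)**2 = 0.25  and  E[phi(X)] = E[X^2] = 1/3,
# so the sample versions should satisfy lhs <= rhs.
random.seed(0)
samples = [random.uniform(0.0, 1.0) for _ in range(100_000)]
phi = lambda x: x ** 2

lhs = phi(statistics.fmean(samples))              # phi of the sample mean
rhs = statistics.fmean(phi(x) for x in samples)   # sample mean of phi(X)
print(lhs <= rhs)  # True
```

The gap \(\mathrm{E}[X^2] - \mathrm{E}[X]^2\) here is exactly the variance of \(X\), which is one common way Jensen's inequality shows up in probability.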
