Envelope and maximum theorems#

ECON2125/6012 Lecture 8 Fedor Iskhakov

Announcements & Reminders

  • Test 3 results and discussion

  • Thank you for submitting comments to the lecture notes!

  • Exam is scheduled for Monday, 6 November, 9:00 to 12:15, multiple rooms

Plan for this lecture

  1. Correspondences and continuity

  2. The maximum theorem

  3. Envelope theorem

Supplementary reading:

  • Simon & Blume: 19.1, 19.2

  • Sundaram: chapter 9, 5.2.3

Parametric continuity#

Value function and parameters of optimization problems#

Let’s start by recalling the definition of a general optimization problem

Definition

The general form of the optimization problem is

\[\begin{split} V(\theta) = \max_{x} f(x,\theta) \\ \text {subject to} \\ g_i(x,\theta) = 0, \; i\in\{1,\dots,I\}\\ h_j(x,\theta) \le 0, \; j\in\{1,\dots,J\} \end{split}\]

where:

  • \(f(x,\theta) \colon \mathbb{R}^N \times \mathbb{R}^K \to \mathbb{R}\) is an objective function

  • \(x \in \mathbb{R}^N\) are decision/choice variables

  • \(\theta \in \mathbb{R}^K\) are parameters

  • \(g_i(x,\theta) = 0, \; i\in\{1,\dots,I\}\) where \(g_i \colon \mathbb{R}^N \times \mathbb{R}^K \to \mathbb{R}\), are equality constraints

  • \(h_j(x,\theta) \le 0, \; j\in\{1,\dots,J\}\) where \(h_j \colon \mathbb{R}^N \times \mathbb{R}^K \to \mathbb{R}\), are inequality constraints

  • \(V(\theta) \colon \mathbb{R}^K \to \mathbb{R}\) is a value function

This lecture focuses on the value function \(V(\theta)\) of the optimization problem and how it depends on the parameters \(\theta\).

In economics we are interested in how optimal behavior changes when the circumstances of the decision-making process change

  • income/budget/wealth changes

  • intertemporal effects of changes in other time periods

We would like to establish the properties of the value function \(V(\theta)\):

  • continuity \(\rightarrow\) The maximum theorem

  • changes/derivative (if differentiable) \(\rightarrow\) Envelope theorem

  • monotonicity \(\rightarrow\) Supermodularity and increasing differences (not covered here, see Sundaram ch.10)

Main idea for the maximum theorem

When the components of the optimization problem \(f(x,\theta)\), \(g_i(x,\theta)\) and \(h_j(x,\theta)\) are continuous, the value function \(V(\theta)\) is also continuous, in a certain sense

We need to accurately define the notion of continuity for all components of the optimization problem

  • objective function

  • constraints \(\leftrightarrow\) admissible set

Denote the admissible set \(\mathcal{D}(\theta)\)

\[ \mathcal{D}(\theta) = \left\{ x \in \mathbb{R}^N \colon g_i(x,\theta) = 0, \; i\in\{1,\dots,I\}, \; h_j(x,\theta) \le 0, \; j\in\{1,\dots,J\} \right\} \]

In solving the optimization problem we are not only interested in the attainable optimal value \(V(\theta)\), but also in the set of maximizers/minimizers \(\mathcal{D}^\star(\theta)\) corresponding to each parameter value \(\theta\)

Definition

We will refer to the pair

\[\begin{split} V(\theta) = \max_{x \in \mathcal{D}(\theta)} f(x,\theta) \\ \mathcal{D}^\star(\theta) = \mathrm{arg}\max_{x \in \mathcal{D}(\theta)} f(x,\theta) \end{split}\]

as the solution of the optimization problem

\[\begin{split} f(x,\theta) \to \max_{x} \\ \text{subject to} \; x \in \mathcal{D}(\theta), \end{split}\]

where

\[\begin{split} \begin{array}{l} f(x,\theta) \colon \mathbb{R}^N \times \mathbb{R}^K \to \mathbb{R},\\ \mathcal{D}(\theta) \subset \mathbb{R}^N \text{ for all } \theta,\\ \theta \in \Theta \subset \mathbb{R}^K \end{array} \end{split}\]

Correspondences#

Note that the mappings of \(\theta\) to \(\mathcal{D}(\theta)\) or \(\mathcal{D}^\star(\theta)\) are not functions, because \(\mathcal{D}(\theta)\), and often \(\mathcal{D}^\star(\theta)\), may contain multiple elements for a given \(\theta\)

Definition

A correspondence (set-valued function) is a map that associates elements of its domain to sets of elements in its range, i.e.

\[ f \colon \Theta \subset \mathbb{R}^K \to P(\mathbb{R}^N) \]

where \(P(\mathbb{R}^N)\) denotes the power set of \(\mathbb{R}^N\), i.e. the set of all subsets of \(\mathbb{R}^N\). It can also be denoted as \(2^{\mathbb{R}^N}\).

Example

Let correspondence \(\Phi\) be defined as

\[ \Phi \colon \theta \in \Theta \subset \mathbb{R}^K \mapsto \mathcal{D}(\theta) \in P(\mathbb{R}^N) \]

Example

\[ \phi \colon x \in [0,1] \mapsto \{0,1\} \]
\[ \phi \colon x \in [0,1] \mapsto (0,1) \]
\[ \phi \colon x \in [0,1] \mapsto [0,1] \]
\[ \phi \colon x \in [0,1] \mapsto [x,1] \]
\[ \phi \colon x \in [0,1] \mapsto (0,1/x] \]
\[ \phi \colon x \in (0,1] \mapsto (0,1/x] \]
_images/uhc_lhc.png

Fig. 106 Examples of correspondences (for labels see below)#
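
Some of these examples are easy to visualize directly. Below is a minimal sketch (illustrative code, not the source of the figure above) that shades two of the correspondences, \(\phi(x)=[x,1]\) and \(\phi(x)=(0,1/x]\), as regions in the plane.

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 1, 200)
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3), dpi=160)

# phi(x) = [x, 1]: the shaded area between the lines y = x and y = 1
ax1.fill_between(x, x, 1, alpha=0.5)
ax1.set_title(r'$\phi(x)=[x,1]$')

# phi(x) = (0, 1/x]: unbounded near x = 0, so start the grid slightly above zero
xp = np.linspace(0.05, 1, 200)
ax2.fill_between(xp, 0, 1/xp, alpha=0.5)
ax2.set_ylim(0, 10)
ax2.set_title(r'$\phi(x)=(0,1/x]$')

plt.show()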

Correspondences are classified by the properties of the sets they output:

  • open-valued correspondences

  • closed-valued correspondences

  • non-empty-valued correspondences

  • bounded-valued correspondences

  • convex-valued correspondences

  • compact-valued correspondences

  • finite-valued correspondences

  • singleton-valued correspondences (functions?)

Continuity of correspondences#

Recall the definition of a continuous function

Definition

Function \(f \colon A \subset \mathbb{R}^n \to \mathbb{R}^m\) is called continuous at \({\bf x} \in A\) if for every sequence \(\{{\bf x}_k\}\) in \(A\), as \(k \to \infty\),

\[ {\bf x}_k \to {\bf x} \quad \implies \quad f({\bf x}_k) \to f({\bf x}) \]

The equivalent definition of continuity relies on open epsilon-balls

Fact

A function \(f \colon A \subset \mathbb{R}^n \to \mathbb{R}^m\) is continuous at \({\bf x} \in A\) if and only if for every \(\epsilon >0\) there is a \(\delta>0\) such that

\[ {\bf x}' \in B_\delta({\bf x}) \implies f({\bf x}') \in B_\epsilon(f({\bf x})) \]

Thinking of the definition of the limit in terms of open epsilon-balls, it is not hard to see that the two definitions are equivalent

Generalization to correspondences is not straightforward because the \(\in\) relation does not carry over to the set-valued case immediately:

  • can be replaced by set inclusion \(\subset\)

  • can be represented by non-empty intersection \(\bar{\cap}\), i.e. \(A \bar{\cap} B \iff A \cap B \ne \emptyset \)

Namely, the condition in the definition above can be replaced with either

  • \({\bf x}' \in B_\delta({\bf x}) \implies f({\bf x}') \subset B_\epsilon(f({\bf x}))\), or

  • \({\bf x}' \in B_\delta({\bf x}) \implies f({\bf x}') \cap B_\epsilon(f({\bf x})) \ne \emptyset\)

Definition

Correspondence \(\gamma \colon X \to 2^Y\) is called upper hemi-continuous (uhc) at \({\bf x} \in X\) if for every open set \(V\) containing \(\gamma({\bf x})\), i.e. \(\gamma({\bf x}) \subset V\), there is an open set \(U\) such that

\[ {\bf x} \in U \text{ and } {\bf x}' \in U \implies \gamma({\bf x}') \subset V \]

Correspondence \(\gamma \colon X \to 2^Y\) is called lower hemi-continuous (lhc) at \({\bf x} \in X\) if for every open set \(V\) intersecting \(\gamma({\bf x})\), i.e. \(\gamma({\bf x}) \cap V \ne \emptyset\), there is an open set \(U\) such that

\[ {\bf x} \in U \text{ and } {\bf x}' \in U \implies \gamma({\bf x}') \cap V \ne \emptyset \]

Definition

A correspondence is called continuous if it is both upper and lower hemi-continuous

Note

Semi-continuity is a distinct notion of continuity for functions; for correspondences, the term is sometimes used interchangeably with hemi-continuity

Examples, examples, examples (whiteboard)

Fact

Constant correspondences are both uhc and lhc

\[ \gamma \colon x \in X \mapsto \mathcal{D} \subset Y \text { for every } x \implies \gamma \text{ is continuous (uhc and lhc)} \]

Note

For closed-valued correspondences, a good rule of thumb for determining hemi-continuity at \({\bf x}\) is:

  • if, moving “a little” away from \({\bf x}\), no new points “suddenly” appear outside of \(\gamma({\bf x})\), then \(\gamma\) is uhc

  • if, moving “a little” away from \({\bf x}\), no points “suddenly” disappear from \(\gamma({\bf x})\), then \(\gamma\) is lhc

The statement of the maximum theorem#

Extremely useful in many fields of economics:

  • demand (consumer) theory

  • theory of the firm: supply of products, demand for inputs

  • theory of economic growth

  • game theory and industrial relations

The maximum theorem

Let \(f(x,\theta) \colon \mathbb{R}^N \times \mathbb{R}^K \to \mathbb{R}\) be a continuous function, and \(\mathcal{D}(\theta)\) be a non-empty compact-valued continuous correspondence. Then the value function \(V(\theta)\) is continuous on \(\Theta\), and \(\mathcal{D}^\star(\theta)\) is a compact-valued upper hemi-continuous correspondence on \(\Theta\)

In other words, continuity of the fundamentals of the optimization problem is inherited by its solution, but not to the full extent (continuity of \(\mathcal{D}\) \(\to\) upper hemi-continuity of \(\mathcal{D}^\star\))

\[\begin{split} \begin{array}{l} \text{$f$ function continuous}\\ \text{$\theta \mapsto \mathcal{D}(\theta)$ compact-valued}\\ \text{$\theta \mapsto \mathcal{D}(\theta)$ continuous (uhc + lhc)} \end{array} \longrightarrow \begin{array}{l} V(\theta) \text{ continuous}\\ \mathcal{D}^\star(\theta) \text{ upper hemi-continuous} \end{array} \end{split}\]
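
A quick numerical illustration of these conclusions (a made-up objective, chosen for illustration): for \(f(x,\theta) = \theta x\) on the fixed constraint set \(\mathcal{D} = [-1,1]\), the value function \(V(\theta) = |\theta|\) is continuous, while the maximizer correspondence equals \(\{-1\}\) for \(\theta<0\), \(\{+1\}\) for \(\theta>0\), and the whole interval \([-1,1]\) at \(\theta=0\): it is upper hemi-continuous but not lower hemi-continuous, and not a continuous function.

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-1, 1, 401)          # grid over the constraint set D = [-1, 1]
theta = np.linspace(-1, 1, 201)

V = np.array([np.max(t*x) for t in theta])              # value function V(theta) = |theta|
x_star = np.array([x[np.argmax(t*x)] for t in theta])   # one maximizer for each theta

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3), dpi=160)
ax1.plot(theta, V)
ax1.set_title(r'$V(\theta)$ is continuous')
ax2.plot(theta, x_star, '.')
ax2.set_title(r'a selection from $\mathcal{D}^\star(\theta)$')
plt.show()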

Maximum theorem under convexity

Let \(f(x,\theta) \colon \mathbb{R}^N \times \mathbb{R}^K \to \mathbb{R}\) be a continuous function, and \(\mathcal{D}(\theta)\) be a non-empty compact-valued continuous correspondence. Then:

  1. The value function \(V(\theta)\) is continuous on \(\Theta\) and \(\mathcal{D}^\star(\theta)\) is a compact-valued upper hemi-continuous correspondence on \(\Theta\) (as before)

  2. If \(f(x,\theta)\) is concave in \(x\) and \(\mathcal{D}(\theta)\) is convex-valued for every \(\theta\), then \(\mathcal{D}^\star(\theta)\) is convex-valued.
    If the concavity of \(f(x,\theta)\) in \(x\) is strict, then \(\mathcal{D}^\star(\theta)\) is a singleton-valued upper hemi-continuous correspondence, and hence a continuous function

  3. If \(f\) is concave in \((x,\theta)\) and the graph of \(\mathcal{D}(\theta)\) is convex, then in addition to the above the value function \(V(\theta)\) is concave; under strict concavity, \(V(\theta)\) is strictly concave.

Definition

The graph of a correspondence \(\gamma \colon X \to 2^Y\) is defined as the set \(\{(x,y) \in X \times Y \colon y \in \gamma(x)\}\)

Consider the special case when the optimizer is unique for each \(\theta\)

Fact

A single-valued correspondence that is hemi-continuous (either uhc or lhc) is continuous when viewed as a function. Conversely, every continuous function, when viewed as a single-valued correspondence, is both uhc and lhc.

In this case upper and lower hemi-continuity coincide with the “usual” continuity

Example

Budget correspondence in the two-good consumer optimization problem. Assuming \(p_1>0,\, p_2>0,\, m>0\), the budget correspondence can be defined as

\[ \beta \colon (p_1,p_2,m) \mapsto \{(x_1,x_2) \in \mathbb{R}^2 \colon \forall i\; x_i \ge 0, p_1 x_1 + p_2 x_2 \leq m\} \]
_images/desmos-budget.png

See online animation

Fact

Budget correspondence \(\beta\) defined above is both uhc and lhc, and therefore continuous
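
To build intuition for this fact, the sketch below (illustrative code with arbitrary prices) shades the budget set for several values of \(m\): the set deforms gradually, with no points appearing or disappearing abruptly as the parameter changes.

import numpy as np
import matplotlib.pyplot as plt

p1, p2 = 1.0, 1.2
fig, ax = plt.subplots(dpi=160)
for m in (2, 4, 6):
    x1 = np.linspace(0, m/p1, 100)
    # budget set: all bundles on or below the budget line p1*x1 + p2*x2 = m
    ax.fill_between(x1, 0, (m - p1*x1)/p2, alpha=0.3, label=fr'$m={m}$')
ax.set_xlabel(r'$x_1$')
ax.set_ylabel(r'$x_2$')
ax.legend(frameon=False)
plt.show()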

Example

Maximization of log utility subject to budget constraint

\[\begin{split} u(x_1, x_2) = \alpha \log(x_1) + \beta \log(x_2) \to \max_{x_1, x_2} \\ \text{ subject to} \\ p_1 x_1 + p_2 x_2 \leq m \end{split}\]
  • \(p_i\) is the price of good \(i\), \(p_i>0\)

  • \(m\) is the budget, assumed positive

  • \(\alpha>0\), \(\beta>0\)

  • \(x_1 \geq 0, \; x_2 \geq 0\); one can show that these constraints never bind

_images/log_util.png

Fig. 107 Log utility with \(\alpha=0.4\), \(\beta=0.5\)#

_images/budget_set_3.png

Fig. 108 Utility max for \(p_1=1\), \(p_2 = 1.2\), \(m=4\), \(\alpha=0.4\), \(\beta=0.5\)#

The maximizer, found from the first order conditions and verified with the second order conditions (see lecture 8), is

\[\begin{split} x_1^\star = \frac{\alpha}{\alpha + \beta} \cdot \frac{m}{p_1} \\ x_2^\star = \frac{\beta}{\alpha+\beta} \cdot \frac{m}{p_2} \end{split}\]

Applying the maximum theorem:

  1. The objective function is continuous in all arguments

  2. The constraint set is compact for all parameter values satisfying the assumptions

  3. The budget correspondence is continuous

Therefore the theorem applies and we have: the value function is continuous in all parameters, and the set of maximizers is upper hemi-continuous.

Moreover, the utility function is strictly concave for all admissible parameter values, so clause 2 of the maximum theorem under convexity applies, and the set of maximizers is a singleton-valued correspondence, i.e. a function. We have already found it above.

We can verify that the value function is indeed continuous by plugging the maximizer back into the objective function

\[ V(p_1,p_2,m) = \alpha \log\left( \frac{\alpha}{\alpha + \beta} \cdot \frac{m}{p_1} \right) + \beta \log\left( \frac{\beta}{\alpha+\beta} \cdot \frac{m}{p_2} \right) \]
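
As a sanity check, the sketch below (assuming scipy is available; the parameter values are arbitrary) compares the closed-form maximizer and value with a direct numerical solution of the constrained problem.

import numpy as np
from scipy.optimize import minimize

alpha, beta = 0.4, 0.5
p1, p2, m = 1.0, 1.2, 4.0

# numerical solution: minimize the negative utility subject to the budget constraint
obj = lambda x: -(alpha*np.log(x[0]) + beta*np.log(x[1]))
budget = {'type': 'ineq', 'fun': lambda x: m - p1*x[0] - p2*x[1]}
res = minimize(obj, x0=[1.0, 1.0], bounds=[(1e-10, None)]*2,
               constraints=[budget], method='SLSQP')

# closed-form maximizer and value from the FOCs
x_star = [alpha/(alpha+beta)*m/p1, beta/(alpha+beta)*m/p2]
V = alpha*np.log(x_star[0]) + beta*np.log(x_star[1])
print(res.x, x_star)   # the two maximizers should agree
print(-res.fun, V)     # and so should the attained values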

Question

Can \(\alpha\) and \(\beta\) also be considered as parameters in the previous analysis?

For the numerical example, set \(\alpha = 2\), \(\beta = 1\), \(m = 10\), and let \(p_1\) and \(p_2\) vary

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm

# parameters and the closed-form value function V(p1, p2) at fixed m
alpha = 2
beta = 1
m = 10
f = lambda x: alpha * np.log( alpha * m / ( (alpha+beta)*x[0] )) + beta * np.log( beta * m / ( (alpha+beta)*x[1] ))

lb,ub = .05,5

x = y = np.linspace(lb,ub, 100)
X, Y = np.meshgrid(x, y)
zs = np.array([f((x,y)) for x,y in zip(np.ravel(X), np.ravel(Y))])
Z = zs.reshape(X.shape)

fig = plt.figure(dpi=160)
ax2 = fig.add_subplot(111)
ax2.set_aspect('equal', 'box')
ax2.contour(X, Y, Z, 50,
            cmap=cm.jet)
plt.setp(ax2, xticks=[],yticks=[])

fig = plt.figure(dpi=160)
ax3 = fig.add_subplot(111, projection='3d')
ax3.plot_surface(X, Y, Z, 
            rstride=2, 
            cstride=2,
            alpha=0.7,
            linewidth=0.25)

plt.show()
_images/2ac49993b0197aa3381eed8343988c5176b0ba120abd6a42a4ec3b2eaa376779.png

Fig. 109 Value function in the space of prices \(p_1,p_2\) (contour plot)#

_images/dacc68412b52b5159acb8565c9236aa11a0297160d18e274c7d498bc98e42507.png

Fig. 110 Value function in the space of prices \(p_1,p_2\) (surface plot)#

# demand functions x1*(p1) and x2*(p2) at fixed m
alpha = 2
beta = 1
m = 10
f1 = lambda x: alpha * m / ( (alpha+beta)*x)
f2 = lambda x: beta * m / ( (alpha+beta)*x)

lb,ub = .5,5
x = np.linspace(lb,ub, 100)

fig = plt.figure(dpi=160)
ax = fig.add_subplot(111)
ax.plot(x,f1(x),label=r"$x_1^\star(p_1)$")
ax.plot(x,f2(x),label=r"$x_2^\star(p_2)$")
plt.legend(loc='upper right', frameon=False)
plt.show()
_images/bcb00d219e6213b9355805aa6c0c90dbe16a22d819123d0bd6a860bcb9a91991.png

Fig. 111 Optimal choice of \(x_1\) and \(x_2\) as functions of own prices \(p_1\) and \(p_2\) (aka demand curves)#

Example

Maximization of log-linear utility subject to budget constraint

\[\begin{split} u(x_1, x_2) = \alpha x_1 + \beta \log(x_2) \to \max_{x_1, x_2} \\ \text{ subject to} \\ p_1 x_1 + p_2 x_2 \leq m \\ x_1 \geq 0, \; x_2 \geq 0 \end{split}\]
  • \(p_i\) is the price of good \(i\), assumed positive

  • \(m\) is the budget, assumed non-negative

  • \(\alpha>0\), \(\beta>0\)

Form the Lagrangian with 3 inequality constraints (the signs of the non-negativity constraints have to be flipped to stay within the general formulation \(h_j \le 0\))

\[\begin{split} \mathcal{L}(x_1,x_2,\lambda_1,\lambda_2,\lambda_3) = \\ = \alpha x_1 + \beta\log(x_2) - \lambda_1 (-x_1) - \lambda_2 (-x_2) - \lambda_3 (p_1 x_1 + p_2 x_2 -m) = \\ = \alpha x_1 + \beta\log(x_2) + \lambda_1 x_1+ \lambda_2 x_2 - \lambda_3 (p_1 x_1 + p_2 x_2 -m) \end{split}\]

The necessary KKT conditions are given by the following system of equations

\[\begin{split} \begin{cases} \frac{\partial \mathcal{L}}{\partial x_1} = 0 \implies \alpha + \lambda_1 - \lambda_3 p_1 = 0 \\ \frac{\partial \mathcal{L}}{\partial x_2} = 0 \implies \frac{\beta}{x_2} + \lambda_2 - \lambda_3 p_2 = 0 \\ x_1 \ge 0 \\ x_2 \ge 0 \\ x_1 p_1 + x_2 p_2 \le m \\ \lambda_1 \ge 0 \text { and } \lambda_1 x_1 = 0 \\ \lambda_2 \ge 0 \text { and } \lambda_2 x_2 = 0 \\ \lambda_3 \ge 0 \text { and } \lambda_3 (x_1 p_1 + x_2 p_2 -m) = 0 \\ \end{cases} \end{split}\]

The KKT conditions can be solved systematically by considering all combinations of binding and slack constraints (equivalently, of positive and zero multipliers). The two cases in which the system is consistent give the solution

\[\begin{split} \begin{cases} x_1^\star = \frac{m}{p_1} - \frac{\beta}{\alpha}, \; x_2^\star = \frac{\beta p_1}{\alpha p_2}, & \text{ if } p_1/m \le \alpha/\beta, \\ x_1^\star = 0, \; x_2^\star = \frac{m}{p_2}, & \text{ if } p_1/m > \alpha/\beta \\ \end{cases} \end{split}\]
_images/corner_sol_2.png

Fig. 112 Corner solution#
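
The case analysis above can be cross-checked numerically. The sketch below (assuming scipy is available; the helper names and parameter values are chosen for illustration) solves the problem directly and compares the result with the closed-form KKT solution on both sides of the threshold \(p_1/m = \alpha/\beta\).

import numpy as np
from scipy.optimize import minimize

alpha, beta = 1.0, 3.0

def solve_numerically(p1, p2, m):
    # maximize alpha*x1 + beta*log(x2) subject to the budget and non-negativity
    obj = lambda x: -(alpha*x[0] + beta*np.log(x[1]))
    budget = {'type': 'ineq', 'fun': lambda x: m - p1*x[0] - p2*x[1]}
    res = minimize(obj, x0=[m/(2*p1), m/(2*p2)],
                   bounds=[(0, None), (1e-10, None)],
                   constraints=[budget], method='SLSQP')
    return res.x

def solve_analytically(p1, p2, m):
    # closed form derived from the KKT conditions above
    if p1/m <= alpha/beta:                        # interior solution
        return np.array([m/p1 - beta/alpha, beta*p1/(alpha*p2)])
    return np.array([0.0, m/p2])                  # corner solution with x1 = 0

for p1 in (0.5, 2.0, 5.0):                        # interior, interior, corner
    print(solve_numerically(p1, 1.2, 10.0), solve_analytically(p1, 1.2, 10.0))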

Applying the maximum theorem:

  1. The objective function is continuous in all arguments

  2. The constraint set is compact for all parameter values satisfying the assumptions

  3. The budget correspondence is continuous

Therefore the theorem applies and we have: the value function is continuous in all parameters, and the set of maximizers is upper hemi-continuous.

Moreover, the utility function is concave, but only strictly concave in \(x_2\) (it is linear in \(x_1\)), so clause 2 of the maximum theorem under convexity only guarantees that the set of maximizers is convex-valued. However, the explicit solution above shows that the maximizer is in fact unique, so the set of maximizers is a singleton-valued correspondence, i.e. a function.

We can verify that the value function is indeed continuous by plugging the maximizer back into the objective function

\[\begin{split} V(p_1,p_2,m) = \begin{cases} \frac{\alpha m}{p_1} - \beta + \beta \log\left( \frac{\beta p_1}{\alpha p_2} \right), & \text{ if } p_1/m \le \alpha/\beta, \\ \beta \log\left( \frac{m}{p_2} \right), & \text{ if } p_1/m > \alpha/\beta \\ \end{cases} \end{split}\]
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm

# piecewise value function for the log-linear utility
alpha = 1
beta = 3
m = 10
cond = lambda p1: p1/m <= alpha/beta   # True when the solution is interior
f = lambda p: cond(p[0])*( alpha * m / p[0] - beta + beta * np.log( beta * p[0] / ( alpha * p[1] )) ) + (1-cond(p[0])) * (beta * np.log( m / p[1] ))

lb,ub = .1,5

x = y = np.linspace(lb,ub, 100)
X, Y = np.meshgrid(x, y)
zs = np.array([f((x,y)) for x,y in zip(np.ravel(X), np.ravel(Y))])
Z = zs.reshape(X.shape)

fig = plt.figure(dpi=160)
ax2 = fig.add_subplot(111)
ax2.set_aspect('equal', 'box')
ax2.contour(X, Y, Z, 50,
            cmap=cm.jet)
plt.setp(ax2, xticks=[],yticks=[])

fig = plt.figure(dpi=160)
ax3 = fig.add_subplot(111, projection='3d')
ax3.plot_surface(X, Y, Z, 
            rstride=2, 
            cstride=2,
            alpha=0.7,
            linewidth=0.25)
ax3.view_init(elev=5, azim=-34)

plt.show()
_images/6d35d765cbf114828b22aca5cabad2e37c059a5c65a2632598410e7888bd084b.png

Fig. 113 Value function in the space of prices \(p_1,p_2\) (contour plot)#

_images/d2bc18c65c71f84501e653493a835f8b4270db3cd55a27831b33492d60715c8a.png

Fig. 114 Value function in the space of prices \(p_1,p_2\) (surface plot)#

# demand for x1 as a function of own price p1 (zero at the corner)
alpha = 1
beta = 3
m = 10
cond = lambda p1: p1/m <= alpha/beta   # True when the solution is interior
f1 = lambda x: cond(x)*(m/x - beta/alpha) + (1-cond(x))*0
lb,ub = .5,10
x = np.linspace(lb,ub, 100)
fig = plt.figure(dpi=160)
ax = fig.add_subplot(111)
ax.plot(x,f1(x),label=r"$x_1^\star(p_1)$")
plt.legend(loc='upper right', frameon=False)
plt.show()
_images/465b0211c675f71cf63290208fdff1a4f19c5e2ff7a72e2d41b2fd9434d0279b.png

Fig. 115 Optimal choice of \(x_1\) as a function of \(p_1\) (aka demand curve)#

# demand for x2 as a function of both prices
alpha = 1
beta = 3
m = 10
cond = lambda p1: p1/m <= alpha/beta   # True when the solution is interior
f = lambda p: cond(p[0])*( beta * p[0] / (alpha *p[1]) ) + (1-cond(p[0])) * (m / p[1])
lb,ub = .5,5
x = y = np.linspace(lb,ub, 100)
X, Y = np.meshgrid(x, y)
zs = np.array([f((x,y)) for x,y in zip(np.ravel(X), np.ravel(Y))])
Z = zs.reshape(X.shape)
fig = plt.figure(dpi=160)
ax3 = fig.add_subplot(111, projection='3d')
ax3.plot_surface(X, Y, Z, 
            rstride=2, 
            cstride=2,
            alpha=0.7,
            linewidth=0.25)
plt.setp(ax3,xticks=[],yticks=[],zticks=[])
ax3.view_init(elev=22, azim=123)
plt.show()
_images/e0a532d62a7d7ffd43dc804523ed2b21b1d3cdc179913b448fa304aba510eab0.png

Fig. 116 Optimal choice of \(x_2\) as a function of \((p_1,p_2)\) (aka demand curve)#

Envelope theorems#

The next step after continuity is differentiability of the value function \(V(\theta)\), together with the marginal effect of relaxing the constraints (derivatives!)

Let’s start with an unconstrained optimization problem

\[ V(\theta) = \max_{x} f(x,\theta) \]

where:

  • \(f(x,\theta) \colon \mathbb{R}^N \times \mathbb{R}^K \to \mathbb{R}\) is an objective function

  • \(x \in \mathbb{R}^N\) are decision/choice variables

  • \(\theta \in \mathbb{R}^K\) are parameters

  • \(V(\theta) \colon \mathbb{R}^K \to \mathbb{R}\) is a value function

Envelope theorem for unconstrained problems

Let \(f(x,\theta) \colon \mathbb{R}^N \times \mathbb{R}^K \to \mathbb{R}\) be a differentiable function, and \(x^\star(\theta)\) be the maximizer of \(f(x,\theta)\) for every \(\theta\). Suppose that \(x^\star(\theta)\) is itself a differentiable function. Then the value function of the problem \(V(\theta) = f\big(x^\star(\theta),\theta\big)\) is differentiable w.r.t. \(\theta\) and

\[ \frac{\partial V}{\partial \theta_j} = \frac{\partial f}{\partial \theta_j} \big(x^\star(\theta),\theta\big), \forall j \]

In other words, the marginal changes in the value function are given by the partial derivative of the objective function evaluated at the maximizer.

Note

When \(K=1\), so that \(\theta\) is a scalar, the envelope theorem can be written as

\[ \frac{d}{d\theta} f\big(x^\star(\theta),\theta\big) = \frac{\partial f}{\partial \theta} \big(x^\star(\theta),\theta\big) \]

so that the whole content of the theorem is the change from the total to the partial derivative

Example

Consider \(f(x,a) = -x^2 +2ax +4a^2 \to \max_x\).
What is the (approximate) effect of a unit increase in \(a\) on the attained maximum?

FOC: \(-2x+2a=0\), giving \(x^\star(a) = a\).

So, \(V(a) = f(a,a) = 5a^2\), and \(V'(a)=10a\). The value increases at a rate of \(10a\) per unit increase in \(a\).

Using the envelope theorem, we could go directly to

\[ V'(a) = \frac{\partial f}{\partial a} (x^\star(a),a) = 2x + 8a \Big|_{x=a} = 10a \]
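
The same check can be done numerically: a minimal sketch (assuming scipy is available) that maximizes over \(x\) for nearby values of \(a\) and compares the finite-difference slope of \(V\) with the envelope formula.

import numpy as np
from scipy.optimize import minimize_scalar

f = lambda x, a: -x**2 + 2*a*x + 4*a**2
V = lambda a: -minimize_scalar(lambda x: -f(x, a)).fun   # value function via numerical maximization

a, h = 1.5, 1e-5
dV = (V(a + h) - V(a - h)) / (2*h)   # finite-difference derivative of the value function
print(dV, 10*a)                      # should agree with the envelope formula V'(a) = 10a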

Envelope theorem for constrained problems

Consider an equality constrained optimization problem

\[\begin{split} V(\theta) = \max_{x} f(x,\theta) \\ \text {subject to} \\ g_i(x,\theta) = 0, \; i\in\{1,\dots,I\}\\ \end{split}\]

where:

  • \(f(x,\theta) \colon \mathbb{R}^N \times \mathbb{R}^K \to \mathbb{R}\) is an objective function

  • \(x \in \mathbb{R}^N\) are decision/choice variables

  • \(\theta \in \mathbb{R}^K\) are parameters

  • \(g_i(x,\theta) = 0, \; i\in\{1,\dots,I\}\) where \(g_i \colon \mathbb{R}^N \times \mathbb{R}^K \to \mathbb{R}\), are equality constraints

  • \(V(\theta) \colon \mathbb{R}^K \to \mathbb{R}\) is a value function

Assume that the maximizer correspondence \(\mathcal{D}^\star(\theta) = \mathrm{arg}\max_x f(x,\theta)\) is single-valued and can be represented by the function \(x^\star(\theta) \colon \mathbb{R}^K \to \mathbb{R}^N\), with the corresponding Lagrange multipliers given by the function \(\lambda^\star(\theta) \colon \mathbb{R}^K \to \mathbb{R}^I\).

Assume that both \(x^\star(\theta)\) and \(\lambda^\star(\theta)\) are differentiable, and that the constraint qualification assumption holds. Then

\[ \frac{\partial V}{\partial \theta_j} = \frac{\partial \mathcal{L}}{\partial \theta_j} \big(x^\star(\theta),\lambda^\star(\theta),\theta\big), \forall j \]

where \(\mathcal{L}(x,\lambda,\theta)\) is the Lagrangian of the problem.

Note

What about the inequality constraints?

Well, if the solution is interior and none of the constraints are binding, the unconstrained version of the envelope theorem applies. If some of the constraints are binding, they can be treated as a set of equality constraints, and the constrained version of the envelope theorem applies. Care must be taken to avoid parameter changes that switch which constraints bind: at such points the value function is typically non-differentiable, and the envelope theorem does not apply at all!

Example

Back to the log utility case

\[\begin{split} u(x_1, x_2) = \alpha \log(x_1) + \beta \log(x_2) \to \max_{x_1, x_2} \\ \text{ subject to} \\ p_1 x_1 + p_2 x_2 = m \end{split}\]

The Lagrangian is

\[ \mathcal{L}(x_1,x_2,\lambda) = \alpha \log(x_1) + \beta \log(x_2) - \lambda(p_1 x_1 + p_2 x_2 - m) \]

The solution is

\[\begin{split} x_1^\star(p_1,p_2,m) = \frac{\alpha}{\alpha + \beta} \cdot \frac{m}{p_1} \\ x_2^\star(p_1,p_2,m) = \frac{\beta}{\alpha+\beta} \cdot \frac{m}{p_2} \\ \lambda^\star(p_1,p_2,m) = \frac{\alpha + \beta}{m} \end{split}\]

The value function is

\[ V(p_1,p_2,m) = \alpha \log\left( \frac{\alpha}{\alpha + \beta} \cdot \frac{m}{p_1} \right) + \beta \log\left( \frac{\beta}{\alpha+\beta} \cdot \frac{m}{p_2} \right) \]

We can verify the envelope theorem by noting that direct differentiation gives

\[ \frac{\partial V}{\partial p_1} = -\frac{\alpha}{p_1}, \quad \frac{\partial V}{\partial p_2} = -\frac{\beta}{p_2}, \quad \frac{\partial V}{\partial m} = \frac{\alpha+\beta}{m} \]

And applying the envelope theorem we have

\[ \frac{\partial V}{\partial p_1} = \frac{\partial \mathcal{L}}{\partial p_1}(x_1^\star,x_2^\star,\lambda^\star) = -\lambda^\star x_1^\star = -\frac{\alpha}{p_1} \]
\[ \frac{\partial V}{\partial p_2} = \frac{\partial \mathcal{L}}{\partial p_2}(x_1^\star,x_2^\star,\lambda^\star) = -\lambda^\star x_2^\star = -\frac{\beta}{p_2} \]
\[ \frac{\partial V}{\partial m} = \frac{\partial \mathcal{L}}{\partial m}(x_1^\star,x_2^\star,\lambda^\star) = \lambda^\star = \frac{\alpha+\beta}{m} \]
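
Again, this is easy to verify numerically: a minimal sketch (the parameter values are arbitrary) comparing finite-difference derivatives of the closed-form value function with the envelope predictions.

import numpy as np

alpha, beta = 0.4, 0.5

def V(p1, p2, m):
    # value function from plugging the maximizers back into the utility
    x1 = alpha/(alpha+beta) * m/p1
    x2 = beta/(alpha+beta) * m/p2
    return alpha*np.log(x1) + beta*np.log(x2)

p1, p2, m, h = 1.0, 1.2, 4.0, 1e-6
print((V(p1, p2, m+h) - V(p1, p2, m-h)) / (2*h), (alpha+beta)/m)   # dV/dm vs lambda*
print((V(p1+h, p2, m) - V(p1-h, p2, m)) / (2*h), -alpha/p1)        # dV/dp1 vs -lambda* x1*
print((V(p1, p2+h, m) - V(p1, p2-h, m)) / (2*h), -beta/p2)         # dV/dp2 vs -lambda* x2*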

Lagrange multipliers as shadow prices#

In the equality constrained optimization problem, the Lagrange multiplier \(\lambda_i\) can be interpreted as the shadow price of the \(i\)-th constraint: if the constraint is perturbed to \(g_i(x,\theta) = a\), then \(\lambda_i\) gives the marginal change in the value function as \(a\) moves away from zero, i.e. as the constraint \(g_i(x,\theta) = 0\) is relaxed.

Exercise: prove this statement

Extra study material#