Constrained optimization#
ECON2125/6012 Lecture 8 Fedor Iskhakov
Announcements & Reminders
Submit Test 3 before October 4, 23:59
The test will include the material from the last two lectures (unconstrained and constrained optimization) with some additional material on fundamentals of optimization from the lecture before (properties of optima and convexity)
Next week you will have a guest lecture by Dr. Esben Scriver Andersen who is a postdoc at RSE working on structural estimation of dynamic discrete choice and matching models
Topic: numerical methods for practical optimization problems

Plan for this lecture
Constrained vs unconstrained optimization
Equality constraints and the Lagrange method
Inequality constraints and the Kuhn-Tucker method
Constrained optimization under convexity
Supplementary reading:
Simon & Blume: 16.2, whole of chapter 18, 19.3
Sundaram: chapters 5 and 6
Setting up a constrained optimization problem#
Let’s start by recalling the definition of a general optimization problem
Definition
The general form of the optimization problem is

$$V(\theta) = \max_{x} f(x,\theta) \;\;\text{subject to}\;\; g_i(x,\theta) = 0,\; i \in \{1,\dots,I\}, \;\; h_j(x,\theta) \le 0,\; j \in \{1,\dots,J\}$$

where:
- $f(x,\theta) \colon \mathbb{R}^N \times \mathbb{R}^K \to \mathbb{R}$ is an objective function
- $x \in \mathbb{R}^N$ are decision/choice variables
- $\theta \in \mathbb{R}^K$ are parameters
- $g_i(x,\theta) = 0$, where $g_i \colon \mathbb{R}^N \times \mathbb{R}^K \to \mathbb{R}$, are equality constraints
- $h_j(x,\theta) \le 0$, where $h_j \colon \mathbb{R}^N \times \mathbb{R}^K \to \mathbb{R}$, are inequality constraints
- $V(\theta) \colon \mathbb{R}^K \to \mathbb{R}$ is a value function
Today we focus on the problem with constraints, i.e. $I + J > 0$.

The obvious classification that we follow:
- equality constrained problems, i.e. $I > 0$, $J = 0$
- inequality constrained problems, i.e. $J > 0$, which can include equalities as a special case
Note
Note that every constrained optimization problem can be seen as unconstrained if we define the objective function to have the domain that coincides with the admissible set:

$$\max_{x} f(x) \;\;\text{subject to}\;\; x \in A
\quad\text{is equivalent to}\quad
\max_{x} \tilde{f}(x), \;\text{ where }\; \tilde{f} \colon A \to \mathbb{R},\; \tilde{f}(x) = f(x)$$
Equality constraints and Lagrange method#
Constrained Optimization#
A major focus of econ: the optimal allocation of scarce resources
optimal = optimization
scarce = constrained
Standard constrained problems:
Maximize utility given budget
Maximize portfolio return given risk constraints
Minimize cost given output requirement
Example
Maximization of utility subject to budget constraint:

$$\max_{x_1, x_2} \; u(x_1, x_2) = \alpha \log(x_1) + \beta \log(x_2)
\;\;\text{subject to}\;\; p_1 x_1 + p_2 x_2 \le m$$

- $p_i$ is the price of good $i$, assumed $p_i > 0$
- $m$ is the budget, assumed $m > 0$
- for the log utility let $\alpha > 0$, $\beta > 0$

Fig. 86 Budget set

Fig. 87 Budget set

Fig. 88 Log utility

Fig. 89 Log utility
We seek a bundle $(x_1^*, x_2^*)$ that is maximal according to the utility function over the budget set $B = \{(x_1, x_2) \colon x_1 \ge 0,\, x_2 \ge 0,\, p_1 x_1 + p_2 x_2 \le m\}$.

That is,

$$u(x_1^*, x_2^*) \ge u(x_1, x_2) \quad \text{for all } (x_1, x_2) \in B$$
Visually, here is the budget set and objective function:

Fig. 90 Utility maximization
First observation: $u(x_1, x_2) \to -\infty$ as $x_1 \to 0$ or $x_2 \to 0$.

Hence we need consider only strictly positive bundles.

Second observation: never choose a point $(x_1, x_2)$ with $p_1 x_1 + p_2 x_2 < m$. Otherwise we can increase utility by a small change in consumption.

Hence we can replace the inequality constraint with the equality $p_1 x_1 + p_2 x_2 = m$.

Implication: just search along the budget line.
Substitution Method#
Substitute the constraint into the objective function. This changes the bivariate problem

$$\max_{x_1, x_2} \; \alpha\log(x_1) + \beta\log(x_2) \;\;\text{s.t.}\;\; p_1 x_1 + p_2 x_2 = m$$

into the univariate problem

$$\max_{x_1} \; \alpha\log(x_1) + \beta\log\!\left(\frac{m - p_1 x_1}{p_2}\right)$$

since all candidates satisfy $x_2 = (m - p_1 x_1)/p_2$.

Fig. 91 Utility maximization

Fig. 92 Utility maximization
The first order condition for the univariate problem gives

$$\frac{\alpha}{x_1} - \frac{\beta p_1}{m - p_1 x_1} = 0
\quad\Longrightarrow\quad
x_1^* = \frac{\alpha}{\alpha + \beta} \cdot \frac{m}{p_1}$$

Exercise: Verify this expression.

Exercise: Check the second order condition (strict concavity).

Substituting into the constraint gives

$$x_2^* = \frac{\beta}{\alpha + \beta} \cdot \frac{m}{p_2}$$
Fig. 93 Maximizer
Substitution method: recipe#
How to solve

$$\max_{x_1, x_2} f(x_1, x_2) \;\;\text{s.t.}\;\; g(x_1, x_2) = 0$$

Steps (see the sketch below):
1. Write the constraint as $x_2 = h(x_1)$ for some function $h$
2. Solve the univariate problem $\max_{x_1} f(x_1, h(x_1))$ to get $x_1^*$
3. Plug $x_1^*$ into $x_2 = h(x_1)$ to get $x_2^*$
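To make the recipe concrete, here is a minimal numerical sketch of the substitution method applied to the log-utility example. The parameter values are illustrative assumptions, not taken from the lecture.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# illustrative (assumed) parameter values
alpha, beta = 0.4, 0.6
p1, p2, m = 1.0, 2.0, 5.0

# Step 1: write the budget constraint as x2 = h(x1)
h = lambda x1: (m - p1 * x1) / p2

# Step 2: solve the univariate problem max u(x1, h(x1))
u = lambda x1, x2: alpha * np.log(x1) + beta * np.log(x2)
res = minimize_scalar(lambda x1: -u(x1, h(x1)),
                      bounds=(1e-6, m / p1 - 1e-6), method='bounded')
x1_star = res.x

# Step 3: plug x1* back into the constraint to get x2*
x2_star = h(x1_star)

print(x1_star, x2_star)                   # numerical solution
print(alpha * m / ((alpha + beta) * p1),  # analytical solution for comparison
      beta * m / ((alpha + beta) * p2))
```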
Limitations: substitution doesn’t always work
Example
Suppose that $g(x_1, x_2) = x_1^2 + x_2^2 - 1$.

Step 1 from the cookbook says: use $x_2 = h(x_1) = \sqrt{1 - x_1^2}$.

But $x_2 = -\sqrt{1 - x_1^2}$ also satisfies the constraint.

In other words, the constraint does not define a single function $x_2 = h(x_1)$.

As we meet more complicated constraints such problems intensify:
- the constraint defines a complex curve in $(x_1, x_2)$ space
- inequality constraints, etc.

We need a more general, systematic approach.
Tangency#
Consider again the problem

$$\max_{x_1, x_2} \; \alpha\log(x_1) + \beta\log(x_2) \;\;\text{s.t.}\;\; p_1 x_1 + p_2 x_2 = m$$

It turns out that the maximizer has the following property: the budget line is tangent to an indifference curve at the maximizer.

Fig. 94 Maximizer
In fact this is an instance of a general pattern.

Notation: let’s say that $(x_1, x_2)$ is interior to the budget line if $x_1 > 0$ and $x_2 > 0$.

In general, any interior maximizer $(x_1^*, x_2^*)$ of a differentiable utility function satisfies the tangency condition.

Otherwise we can do better:
Fig. 95 When tangency fails we can do better#
Necessity of tangency often rules out a lot of points. We exploit this fact to
Build intuition
Develop more general methods
Diversion: derivative of an implicit function
To continue with the tangency approach we need to know how to differentiate an implicit function.

Let’s approach this task in the following way:
- the utility function is given by $u(x_1, x_2)$
- the contour line is given by $u(x_1, x_2) = c$, which defines an implicit function $x_2 = g(x_1)$
- let the map $F \colon \mathbb{R} \to \mathbb{R}^2$ be given by $F(x_1) = \big(x_1, g(x_1)\big)$

The total derivative of $u \circ F$ is

$$\frac{d(u \circ F)}{d x_1} = \frac{\partial u}{\partial x_1} + \frac{\partial u}{\partial x_2}\, g'(x_1)$$

Differentiating the equation $u\big(x_1, g(x_1)\big) = c$ on both sides gives

$$\frac{\partial u}{\partial x_1} + \frac{\partial u}{\partial x_2}\, g'(x_1) = 0
\quad\Longrightarrow\quad
g'(x_1) = -\,\frac{\partial u / \partial x_1}{\partial u / \partial x_2}$$
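As a quick check of this formula, here is a sketch in sympy using the log utility from the running example; idiff differentiates the implicitly defined function directly.

```python
import sympy as sp

x1, x2, c, alpha, beta = sp.symbols('x1 x2 c alpha beta', positive=True)

# contour line u(x1, x2) = c implicitly defines x2 = g(x1)
u = alpha * sp.log(x1) + beta * sp.log(x2)

# slope of the indifference curve computed two ways
slope_implicit = sp.idiff(u - c, x2, x1)          # g'(x1) via implicit differentiation
slope_formula = -sp.diff(u, x1) / sp.diff(u, x2)  # -u_1 / u_2

print(sp.simplify(slope_implicit - slope_formula))  # 0: the two expressions agree
print(sp.simplify(slope_implicit))                  # -alpha*x2/(beta*x1)
```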
Using Tangency: Relative Slope Conditions#
Consider an equality constrained optimization problem where the objective and constraint functions are differentiable:

$$\max_{x_1, x_2} f(x_1, x_2) \;\;\text{s.t.}\;\; g(x_1, x_2) = 0$$
How to develop necessary conditions for optima via tangency?

Fig. 96 Contours

Fig. 97 Contours

Fig. 98 Tangency necessary for optimality#
How do we locate an optimal $(x_1^*, x_2^*)$?

Fig. 99 Slope of contour lines#
Let’s choose $(x_1, x_2)$ to equalize the slopes of the contour lines of $f$ and $g$.

That is, choose $(x_1, x_2)$ to solve

$$-\frac{f_1(x_1, x_2)}{f_2(x_1, x_2)} = -\frac{g_1(x_1, x_2)}{g_2(x_1, x_2)}$$

where $f_i$ denotes the partial derivative of $f$ with respect to $x_i$, and similarly for $g_i$.

Equivalent:

$$\frac{f_1(x_1, x_2)}{f_2(x_1, x_2)} = \frac{g_1(x_1, x_2)}{g_2(x_1, x_2)}$$

Also need to respect the constraint $g(x_1, x_2) = 0$.
Fig. 100 Condition for tangency#
Tangency condition algorithm#
In summary, when $f$ and $g$ are both differentiable:

1. (Impose slope tangency) Set
$$\frac{f_1(x_1, x_2)}{f_2(x_1, x_2)} = \frac{g_1(x_1, x_2)}{g_2(x_1, x_2)}$$
2. (Impose constraint) Set $g(x_1, x_2) = 0$
3. Solve simultaneously for $(x_1, x_2)$ pairs satisfying these conditions
Example
Consider again

$$\max_{x_1, x_2} \; \alpha\log(x_1) + \beta\log(x_2) \;\;\text{s.t.}\;\; g(x_1, x_2) = p_1 x_1 + p_2 x_2 - m = 0$$

Then the tangency condition reads

$$\frac{\alpha / x_1}{\beta / x_2} = \frac{p_1}{p_2}
\quad\Longleftrightarrow\quad
\frac{\alpha\, x_2}{\beta\, x_1} = \frac{p_1}{p_2}$$

Solving simultaneously with $p_1 x_1 + p_2 x_2 = m$ gives

$$x_1^* = \frac{\alpha}{\alpha+\beta}\cdot\frac{m}{p_1}, \qquad x_2^* = \frac{\beta}{\alpha+\beta}\cdot\frac{m}{p_2}$$

Same as before…
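The same two-equation system can be handed to a computer algebra system; a sketch in sympy with symbolic prices and budget:

```python
import sympy as sp

x1, x2, alpha, beta, p1, p2, m = sp.symbols(
    'x1 x2 alpha beta p1 p2 m', positive=True)

f = alpha * sp.log(x1) + beta * sp.log(x2)  # objective
g = p1 * x1 + p2 * x2 - m                   # constraint g = 0

# impose slope tangency and the constraint, then solve simultaneously
tangency = sp.Eq(sp.diff(f, x1) / sp.diff(f, x2),
                 sp.diff(g, x1) / sp.diff(g, x2))
sol = sp.solve([tangency, sp.Eq(g, 0)], [x1, x2], dict=True)
print(sol)  # x1 = alpha*m/((alpha+beta)*p1), x2 = beta*m/((alpha+beta)*p2)
```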
Slope Conditions for Minimization#
Good news: The conditions are exactly the same
In particular:
Lack of tangency means the point is not an optimizer
Constraint must be satisfied

Fig. 101 Lack of tangency#

Fig. 102 Condition for tangency#
The method of Lagrange multipliers#
Lagrange theorem
Let $f \colon \mathbb{R}^N \to \mathbb{R}$ and $g \colon \mathbb{R}^N \to \mathbb{R}^K$ be continuously differentiable functions, with $K < N$.

Let $D = U \cap \big\{x \colon g_i(x) = 0,\, i = 1,\dots,K\big\}$ where $U \subset \mathbb{R}^N$ is an open set.

Suppose that $x^* \in D$ is a local extremum of $f$ on $D$ and that the gradients of the constraint functions $g_i$ are linearly independent at $x^*$.

Then there exists a vector $\lambda^* \in \mathbb{R}^K$ such that

$$Df(x^*) = \lambda^* \cdot Dg(x^*) = \sum_{i=1}^{K} \lambda_i^* \, Dg_i(x^*)$$
Proof
See Sundaram 5.6, 5.7
Note
- $\lambda^*$ is called the vector of Lagrange multipliers
- The assumption that the gradients of the constraint functions $Dg_i(x^*)$ are linearly independent at $x^*$ is called the constraint qualification assumption
- The Lagrange theorem provides necessary first order conditions for both maximum and minimum problems
Definition
The function

$$\mathcal{L}(x, \lambda) = f(x) - \lambda \cdot g(x) = f(x) - \sum_{i=1}^{K} \lambda_i g_i(x)$$

is called a Lagrangian (function)
Different sources define the Lagrangian with minus or plus in front of the second term: mathematically the two definitions are equivalent, but the choice of the sign affects the interpretation of the Lagrange multipliers
I like to have a minus in the Lagrangian for consistency with the inequality constrained case where the sign is important
We come back to this in lecture 10
Fact
First order conditions for the Lagrangian coincide with the conditions in the Lagrange theorem, and together with the constraint qualification assumption they provide necessary first order conditions for the constrained optimization problem.
Example
Form the Lagrangian

$$\mathcal{L}(x_1, x_2, \lambda) = \alpha\log(x_1) + \beta\log(x_2) - \lambda(p_1 x_1 + p_2 x_2 - m)$$

FOC:

$$\frac{\partial\mathcal{L}}{\partial x_1} = \frac{\alpha}{x_1} - \lambda p_1 = 0, \qquad
\frac{\partial\mathcal{L}}{\partial x_2} = \frac{\beta}{x_2} - \lambda p_2 = 0, \qquad
\frac{\partial\mathcal{L}}{\partial \lambda} = -(p_1 x_1 + p_2 x_2 - m) = 0$$

From the first two equations we have $\lambda = \frac{\alpha}{p_1 x_1} = \frac{\beta}{p_2 x_2}$, and substituting into the budget constraint gives

$$x_1^* = \frac{\alpha}{\alpha+\beta}\cdot\frac{m}{p_1}, \qquad
x_2^* = \frac{\beta}{\alpha+\beta}\cdot\frac{m}{p_2}, \qquad
\lambda^* = \frac{\alpha+\beta}{m}$$
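The FOC system can also be solved symbolically; a minimal sketch of the same computation in sympy:

```python
import sympy as sp

x1, x2, lam, alpha, beta, p1, p2, m = sp.symbols(
    'x1 x2 lambda alpha beta p1 p2 m', positive=True)

# Lagrangian with the minus sign convention used in the lecture
L = alpha * sp.log(x1) + beta * sp.log(x2) - lam * (p1 * x1 + p2 * x2 - m)

# first order conditions w.r.t. x1, x2 and lambda
foc = [sp.diff(L, v) for v in (x1, x2, lam)]
sol = sp.solve(foc, [x1, x2, lam], dict=True)[0]
print(sol)  # x1* = alpha*m/((alpha+beta)*p1), x2* = beta*m/((alpha+beta)*p2),
            # lambda* = (alpha+beta)/m
```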
Lagrangian as tangency condition#
The “standard machine” for optimization with equality constraints: set

$$\mathcal{L}(x_1, x_2, \lambda) = f(x_1, x_2) - \lambda\, g(x_1, x_2)$$

and solve the system of three equations $\frac{\partial\mathcal{L}}{\partial x_1} = \frac{\partial\mathcal{L}}{\partial x_2} = \frac{\partial\mathcal{L}}{\partial \lambda} = 0$.

We have

$$\frac{\partial\mathcal{L}}{\partial x_1} = f_1(x_1, x_2) - \lambda\, g_1(x_1, x_2) = 0, \qquad
\frac{\partial\mathcal{L}}{\partial x_2} = f_2(x_1, x_2) - \lambda\, g_2(x_1, x_2) = 0$$

Hence

$$\lambda = \frac{f_1(x_1, x_2)}{g_1(x_1, x_2)} = \frac{f_2(x_1, x_2)}{g_2(x_1, x_2)}
\quad\Longrightarrow\quad
\frac{f_1(x_1, x_2)}{f_2(x_1, x_2)} = \frac{g_1(x_1, x_2)}{g_2(x_1, x_2)}$$

Finally, $\frac{\partial\mathcal{L}}{\partial \lambda} = -g(x_1, x_2) = 0$ recovers the constraint.

Hence the method leads us to the same conditions as the tangency condition method!
Second order conditions#
- similar to the unconstrained case, but applied to the Lagrangian
- have to do with local optima only
- similarly, provide sufficient conditions in the strict case, but only necessary conditions in the weak case

Using the rules of differentiation from multivariate calculus we can express the gradient and the Hessian of the Lagrangian w.r.t. $x$ as

$$\nabla_x \mathcal{L}(x, \lambda) = Df(x) - \lambda \cdot Dg(x), \qquad
H_x \mathcal{L}(x, \lambda) = Hf(x) - \lambda \cdot Hg(x)$$

where $\lambda \cdot Hg(x)$ denotes the sum $\sum_{i=1}^{K} \lambda_i Hg_i(x)$.
Fact (necessary SOC)
Let $f \colon \mathbb{R}^N \to \mathbb{R}$ and $g \colon \mathbb{R}^N \to \mathbb{R}^K$ be twice continuously differentiable functions, and let $D = U \cap \big\{x \colon g_i(x) = 0,\, i = 1,\dots,K\big\}$ where $U \subset \mathbb{R}^N$ is an open set.

Suppose $x^* \in D$ is a local extremum of $f$ on $D$, the constraint qualification assumption holds at $x^*$, and $\lambda^* \in \mathbb{R}^K$ is the corresponding vector of Lagrange multipliers. Then:

- If $f$ has a local maximum on $D$ at $x^*$, then $H_x\mathcal{L}(x^*, \lambda^*)$ defined above is negative semi-definite on the linear constraint set $\{v \colon Dg(x^*)\, v = 0\}$, that is
$$v' H_x\mathcal{L}(x^*, \lambda^*)\, v \le 0 \;\text{ for all } v \text{ such that } Dg(x^*)\, v = 0$$
- If $f$ has a local minimum on $D$ at $x^*$, then $H_x\mathcal{L}(x^*, \lambda^*)$ defined above is positive semi-definite on the linear constraint set $\{v \colon Dg(x^*)\, v = 0\}$, that is
$$v' H_x\mathcal{L}(x^*, \lambda^*)\, v \ge 0 \;\text{ for all } v \text{ such that } Dg(x^*)\, v = 0$$
Fact (sufficient SOC)
Let $f \colon \mathbb{R}^N \to \mathbb{R}$ and $g \colon \mathbb{R}^N \to \mathbb{R}^K$ be twice continuously differentiable functions, with $D$ as above.

Suppose there exist $x^* \in D$ and $\lambda^* \in \mathbb{R}^K$ such that the first order conditions hold, i.e. $Df(x^*) = \lambda^* \cdot Dg(x^*)$. Then:

- If $H_x\mathcal{L}(x^*, \lambda^*)$ defined above is negative definite on the linear constraint set $\{v \colon Dg(x^*)\, v = 0\}$, that is
$$v \ne 0 \text{ and } Dg(x^*)\, v = 0 \;\Longrightarrow\; v' H_x\mathcal{L}(x^*, \lambda^*)\, v < 0,$$
then $x^*$ is a local maximum of $f$ on $D$.
- If $H_x\mathcal{L}(x^*, \lambda^*)$ defined above is positive definite on the linear constraint set $\{v \colon Dg(x^*)\, v = 0\}$, that is
$$v \ne 0 \text{ and } Dg(x^*)\, v = 0 \;\Longrightarrow\; v' H_x\mathcal{L}(x^*, \lambda^*)\, v > 0,$$
then $x^*$ is a local minimum of $f$ on $D$.
Thus, the structure of both the necessary and the sufficient conditions is very similar to the unconstrained case.
The important difference is that the definiteness of the Hessian is checked on the linear constraint set $\{v \colon Dg(x^*)\, v = 0\}$ rather than on the whole space.
The linear constraint set represents the tangent hyperplane to the constraint set at $x^*$.
But how do we check for definiteness on a linear constraint set?
Definition
The Hessian of the Lagrangian with respect to $(\lambda, x)$ is called the bordered Hessian, and is given by

$$H\mathcal{L}(\lambda, x) = \begin{pmatrix}
[\,0\,]_{K \times K} & [-Dg(x)]_{K \times N} \\
[-Dg(x)']_{N \times K} & [Hf(x) - \lambda \cdot Hg(x)]_{N \times N}
\end{pmatrix}$$
- similarly to the Hessian w.r.t. $x$, the subscripts in the definition above denote the shape of the blocks in the bordered Hessian
- some textbooks place the “border”, i.e. the Jacobians of the constraints, in the lower right corner and the Hessian w.r.t. $x$ of the objective in the upper left corner, with corresponding changes in the propositions below
- the sign of the second term in the Lagrangian also appears in the border and may differ between different texts
Definition
Consider an $n \times n$ matrix $A$.

The leading principal minor of order $k \le n$ of $A$ is the determinant of the $k \times k$ matrix formed by deleting the last $n - k$ rows and columns of $A$.
Example
The leading principal minors of order $k = 1, 2$ of the matrix $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$ are $M_1 = a_{11}$ and $M_2 = \det(A) = a_{11}a_{22} - a_{12}a_{21}$ (see also the sketch below).
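A minimal numpy helper for computing leading principal minors; the matrix below is only an illustrative assumption.

```python
import numpy as np

def leading_principal_minors(A):
    """Determinants of the upper-left k-by-k submatrices of A, k = 1..n."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    return [np.linalg.det(A[:k, :k]) for k in range(1, n + 1)]

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
print(leading_principal_minors(A))  # [2.0, 5.0, 18.0]
```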
Fact
In the same settings as the propositions above, namely for twice continuously differentiable $f \colon \mathbb{R}^N \to \mathbb{R}$ and $g \colon \mathbb{R}^N \to \mathbb{R}^K$, with $x^*$ satisfying the first order conditions with multipliers $\lambda^*$ and the constraint qualification assumption, consider the last $N - K$ leading principal minors of the bordered Hessian $H\mathcal{L}(\lambda^*, x^*)$, i.e. those of order $2K + 1$ through $K + N$:

- if they alternate in sign, with the determinant of the full bordered Hessian having the same sign as $(-1)^N$, then $H_x\mathcal{L}(x^*, \lambda^*)$ is negative definite on the linear constraint set, and $x^*$ is a local maximum
- if they all have the same sign as $(-1)^K$, then $H_x\mathcal{L}(x^*, \lambda^*)$ is positive definite on the linear constraint set, and $x^*$ is a local minimum
Similarly to the unconstrained case, when none of the listed conditions for the bordered Hessian hold, it is indefinite and the point tested by the second order conditions is not a constrained optimizer.
Example
Let’s check the second order conditions for the consumer choice problem.

Solving the FOC we have as before

$$x_1^* = \frac{\alpha}{\alpha+\beta}\cdot\frac{m}{p_1}, \qquad
x_2^* = \frac{\beta}{\alpha+\beta}\cdot\frac{m}{p_2}, \qquad
\lambda^* = \frac{\alpha+\beta}{m}$$

To check the second order conditions compute the bordered Hessian at $(\lambda^*, x_1^*, x_2^*)$:

$$H\mathcal{L}(\lambda^*, x^*) = \begin{pmatrix}
0 & -p_1 & -p_2 \\
-p_1 & -\alpha/(x_1^*)^2 & 0 \\
-p_2 & 0 & -\beta/(x_2^*)^2
\end{pmatrix}$$

Because $N - K = 1$, we only need to check the last leading principal minor, i.e. the determinant of the full bordered Hessian:

$$\det H\mathcal{L}(\lambda^*, x^*) = \frac{\alpha\, p_2^2}{(x_1^*)^2} + \frac{\beta\, p_1^2}{(x_2^*)^2}$$

We have that the determinant of the bordered Hessian is positive, i.e. has the same sign as $(-1)^N = (-1)^2$.

Hence $H_x\mathcal{L}(x^*, \lambda^*)$ is negative definite on the linear constraint set, and $(x_1^*, x_2^*)$ is a local maximum.
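A numerical cross-check of this example under illustrative (assumed) parameter values:

```python
import numpy as np

alpha, beta, p1, p2, m = 0.4, 0.6, 1.0, 2.0, 5.0  # assumed values

# stationary point from the FOC
x1 = alpha / (alpha + beta) * m / p1
x2 = beta / (alpha + beta) * m / p2

# bordered Hessian at (lambda*, x1*, x2*), border in the upper left corner
H = np.array([[0.0, -p1, -p2],
              [-p1, -alpha / x1**2, 0.0],
              [-p2, 0.0, -beta / x2**2]])

# N - K = 1: only the full determinant needs checking against (-1)^N = +1
print(np.linalg.det(H))  # positive, hence a local maximum
```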
Lagrangian method: recipe#
1. Write down the Lagrangian function $\mathcal{L}(x, \lambda)$
2. Find all stationary points of $\mathcal{L}(x, \lambda)$ with respect to $x$ and $\lambda$, i.e. solve the system of first order equations
3. Derive the bordered Hessian $H\mathcal{L}(\lambda, x)$ and compute its value at the stationary points
4. Using the second order conditions, check whether the stationary points are local optima
5. Compare the function values at all identified local optima to find the global one

In practice the second order conditions (steps 3 and 4) may be skipped when, for example, the global shape of the objective function is known (convex case).
When does Lagrange method fail?#
Example
Consider the problem

$$\min_{x_1, x_2} \; x_2 \;\;\text{s.t.}\;\; g(x_1, x_2) = x_2^3 - x_1^2 = 0$$

Proceed according to the Lagrange recipe and form the Lagrangian $\mathcal{L}(x_1, x_2, \lambda) = x_2 - \lambda (x_2^3 - x_1^2)$.

The FOC system of equations

$$\frac{\partial\mathcal{L}}{\partial x_1} = 2\lambda x_1 = 0, \qquad
\frac{\partial\mathcal{L}}{\partial x_2} = 1 - 3\lambda x_2^2 = 0, \qquad
\frac{\partial\mathcal{L}}{\partial \lambda} = x_1^2 - x_2^3 = 0$$

does not have solutions! From the second equation $\lambda \ne 0$, so the first equation requires $x_1 = 0$; the constraint then gives $x_2 = 0$, which contradicts the second equation.

However, we can deduce that the constrained minimum is at $(x_1^*, x_2^*) = (0, 0)$: on the constraint set $x_2 = x_1^{2/3} \ge 0$, so $x_2$ cannot be made smaller than $0$.

Why does the Lagrange method fail?

The Lagrange method fails because the constraint qualification assumption is not satisfied at $(0, 0)$: the gradient $Dg(0, 0) = (-2x_1,\, 3x_2^2)\big|_{(0,0)} = (0, 0)$ is not a set of linearly independent vectors.
The Lagrange method may not work when:
- the constraint qualification assumption is not satisfied at the stationary point
- the FOC system of equations does not have solutions (thus no local optima)
- saddle points – second order conditions can help
- a local optimizer is not a global one – no way to check without studying analytic properties of each problem
Conclusion: great tool, but blind application may lead to wrong results!
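A sketch confirming both parts of this example symbolically: the FOC system has no solutions, while the gradient of the constraint vanishes at the optimizer.

```python
import sympy as sp

x1, x2, lam = sp.symbols('x1 x2 lambda', real=True)

f = x2             # objective to be minimized
g = x2**3 - x1**2  # constraint g = 0 with a cusp at the origin

L = f - lam * g
foc = [sp.diff(L, v) for v in (x1, x2, lam)]
print(sp.solve(foc, [x1, x2, lam], dict=True))  # []: no stationary points

# constraint qualification fails: Dg vanishes at the optimizer (0, 0)
Dg = sp.Matrix([sp.diff(g, x1), sp.diff(g, x2)])
print(Dg.subs({x1: 0, x2: 0}))  # Matrix([[0], [0]])
```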
Inequality constraints and Kuhn-Tucker method#
Turning now to the inequality constrained optimization problems
Example
Maximization of utility subject to budget constraint:

$$\max_{x_1, x_2} \; u(x_1, x_2) = x_1 + \log(x_2)
\;\;\text{subject to}\;\; p_1 x_1 + p_2 x_2 \le m, \;\; x_1 \ge 0, \;\; x_2 \ge 0$$

- $p_i$ is the price of good $i$, assumed non-negative
- $m$ is the budget, assumed non-negative

Apply the Lagrange method neglecting the non-negativity requirement and assuming no money is wasted, so that the budget constraint binds:

$$\mathcal{L}(x_1, x_2, \lambda) = x_1 + \log(x_2) - \lambda(p_1 x_1 + p_2 x_2 - m)$$

Solving the FOC, from the first equation we have $\lambda = 1/p_1$, then from the second equation $x_2 = p_1/p_2$, and the budget constraint gives $x_1 = m/p_1 - 1$.

Hence, for some admissible parameter values, for example $m < p_1$, we get $x_1 < 0$:

Fig. 103 Tangent point is infeasible#
Interpretation: no interior solution.

Put differently:
- every interior point on the budget line is dominated by the (infeasible) tangency point
- hence the solution must be on the boundary

Since $x_2 = 0$ is outside of the domain of the utility function, the corner solution is $x_1^* = 0$, $x_2^* = m/p_2$:
Fig. 104 Corner solution#
Let’s look at the systematic solution approach where the corner solutions will emerge naturally when they are optimal.
Karush-Kuhn-Tucker conditions (maximization)
Let $f \colon \mathbb{R}^N \to \mathbb{R}$ and $g \colon \mathbb{R}^N \to \mathbb{R}^K$ be continuously differentiable functions.

Let $D = U \cap \big\{x \colon g_i(x) \le 0,\, i = 1,\dots,K\big\}$ where $U \subset \mathbb{R}^N$ is an open set.

Suppose that $x^* \in D$ is a local maximum of $f$ on $D$ and that the gradients of the constraint functions $g_i$ corresponding to the binding constraints (those with $g_i(x^*) = 0$) are linearly independent at $x^*$.

Then there exists a vector $\lambda^* \in \mathbb{R}^K$ such that

$$Df(x^*) = \lambda^* \cdot Dg(x^*) = \sum_{i=1}^{K} \lambda_i^* \, Dg_i(x^*)$$

and

$$\lambda_i^* \ge 0 \;\text{ with }\; \lambda_i^* \, g_i(x^*) = 0, \quad i = 1,\dots,K$$
Proof
See Sundaram 6.5
Karush-Kuhn-Tucker conditions (minimization)
In the settings of the KKT conditions (maximization), suppose that $x^* \in D$ is a local minimum of $f$ on $D$ rather than a local maximum.

Then there exists a vector $\lambda^* \in \mathbb{R}^K$ such that

$$Df(x^*) = \lambda^* \cdot Dg(x^*)$$

and

$$\lambda_i^* \le 0 \;\text{ with }\; \lambda_i^* \, g_i(x^*) = 0, \quad i = 1,\dots,K$$
Very similar to the Lagrange theorem, but now we have inequalities!
The last set of conditions is called the complementary slackness conditions, and they play the following role:
- if for a given $i$, $g_i(x^*) = 0$, that is, the $i$-th constraint is binding, then the corresponding $\lambda_i^*$ acts as a Lagrange multiplier for an equality constraint
- if on the other hand for a given $i$, $g_i(x^*) < 0$, then the corresponding $\lambda_i^*$ must be zero, removing the term with $Dg_i(x^*)$ from the first condition
This way the KKT conditions combine the unconstrained and equality constrained conditions in one

Fig. 105 Binding and non-binding constraints
Karush-Kuhn-Tucker method: recipe#
Essentially the same as for the Lagrange method
1. Write down the Lagrangian function $\mathcal{L}(x, \lambda)$
2. Solve the KKT first order conditions to find the stationary points of $\mathcal{L}(x, \lambda)$ with respect to $x$, together with the non-negativity of the KKT multipliers and the complementary slackness conditions
3. Using the second order conditions with the subset of binding constraints, check whether the stationary points at the boundary are local optima; similarly, use the second order conditions for the unconstrained problem to investigate the interior stationary points
4. Compare the function values at all identified local optima to find the global one
Possible issues with KKT method are similar to the Lagrange method:
constraint qualification assumption
existence of constrained optima
local vs global optimality
Example
Returning to the utility maximization problem with budget constraint and non-negative consumption:

$$\max_{x_1, x_2} \; x_1 + \log(x_2)
\;\;\text{s.t.}\;\; p_1 x_1 + p_2 x_2 \le m, \;\; x_1 \ge 0, \;\; x_2 \ge 0$$

Form the Lagrangian with 3 inequality constraints (we have to flip the sign of the non-negativity constraints to stay within the general formulation $g_i(x) \le 0$):

$$\mathcal{L}(x_1, x_2, \lambda_1, \lambda_2, \lambda_3) = x_1 + \log(x_2) - \lambda_1 (p_1 x_1 + p_2 x_2 - m) - \lambda_2 (-x_1) - \lambda_3 (-x_2)$$

The necessary KKT conditions are given by the following system of equations and inequalities:

$$\frac{\partial\mathcal{L}}{\partial x_1} = 1 - \lambda_1 p_1 + \lambda_2 = 0$$
$$\frac{\partial\mathcal{L}}{\partial x_2} = \frac{1}{x_2} - \lambda_1 p_2 + \lambda_3 = 0$$
$$\lambda_1 \ge 0, \quad p_1 x_1 + p_2 x_2 \le m, \quad \lambda_1 (p_1 x_1 + p_2 x_2 - m) = 0$$
$$\lambda_2 \ge 0, \quad x_1 \ge 0, \quad \lambda_2 x_1 = 0$$
$$\lambda_3 \ge 0, \quad x_2 \ge 0, \quad \lambda_3 x_2 = 0$$

The KKT conditions can be solved systematically by considering all combinations of zero and positive multipliers:

1. $\lambda_1 = 0$, $\lambda_2 = 0$, $\lambda_3 = 0$: the first equation becomes $1 = 0$, which is inconsistent with the initially set multipliers.
2. $\lambda_1 > 0$, $\lambda_2 = 0$, $\lambda_3 = 0$: this is the exact case we looked at with the Lagrange method ignoring the non-negativity conditions on consumption. The solution is $x_1 = m/p_1 - 1$ and $x_2 = p_1/p_2$ if it also holds that $x_1 \ge 0$ and $x_2 > 0$, i.e. $m \ge p_1$.
3. $\lambda_1 = 0$, $\lambda_2 = 0$, $\lambda_3 > 0$: complementary slackness requires $x_2 = 0$. The case of $x_2 = 0$ is outside of the domain of the utility function and could in fact be excluded from the start.
4. $\lambda_1 > 0$, $\lambda_2 = 0$, $\lambda_3 > 0$: inconsistent similarly to the previous case, as it also requires $x_2 = 0$.
5. $\lambda_1 = 0$, $\lambda_2 > 0$, $\lambda_3 = 0$: the second equation becomes $1/x_2 = 0$, which is inconsistent with the finiteness of $x_2$; moreover, the first equation gives $\lambda_2 = -1 < 0$.
6. $\lambda_1 > 0$, $\lambda_2 > 0$, $\lambda_3 = 0$: we have the following system in this case
$$1 - \lambda_1 p_1 + \lambda_2 = 0, \qquad \frac{1}{x_2} = \lambda_1 p_2, \qquad x_1 = 0, \qquad p_1 x_1 + p_2 x_2 = m$$
From the last equation $x_2 = m/p_2$, then $\lambda_1 = 1/m$ and $\lambda_2 = p_1/m - 1 \ge 0$, which requires $m \le p_1$.
7. $\lambda_1 = 0$, $\lambda_2 > 0$, $\lambda_3 > 0$: inconsistent similar to case 3, as it requires $x_2 = 0$.
8. $\lambda_1 > 0$, $\lambda_2 > 0$, $\lambda_3 > 0$: inconsistent similarly to the previous case.

To summarize, the solution to the KKT conditions is given by the following cases (it’s easy to see that the two solutions coincide for the equality $m = p_1$ in the parameter condition):

$$(x_1^*, x_2^*) = \begin{cases}
\left(\dfrac{m}{p_1} - 1, \; \dfrac{p_1}{p_2}\right), & m \ge p_1 \\[8pt]
\left(0, \; \dfrac{m}{p_2}\right), & m \le p_1
\end{cases}$$

Thus, the corner solution is included in the solution set of the KKT conditions.
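As a cross-check of the case analysis, here is a sketch solving the same problem numerically with scipy’s SLSQP method. The parameter values are assumed, chosen with $m < p_1$ so that the corner case applies.

```python
import numpy as np
from scipy.optimize import minimize

p1, p2, m = 2.0, 1.0, 1.0  # assumed values with m < p1 (corner solution case)

u = lambda x: x[0] + np.log(x[1])  # quasilinear utility

res = minimize(lambda x: -u(x),             # maximize u by minimizing -u
               x0=(0.1, 0.5),               # feasible starting point
               bounds=[(0, None), (1e-9, None)],  # x1 >= 0, x2 > 0
               constraints=[{'type': 'ineq',      # m - p1*x1 - p2*x2 >= 0
                             'fun': lambda x: m - p1 * x[0] - p2 * x[1]}],
               method='SLSQP')
print(res.x)  # approximately (0, m/p2) = (0, 1), the corner solution
```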
Second order conditions#
Nearly identical to the second order conditions in the equality constraints case, except written only for the constraints which are binding/active at $x^*$
Example

Consider the function $f(x_1, x_2) = x_1^3/3 - 3x_2^2 + 5x_1 - 6x_1 x_2$ together with the constraint $(x_1/4)^2 + (x_2/8)^2 = 1$, plotted by the following code.
```python
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # registers the 3d projection on older matplotlib
from matplotlib import cm

# objective function
f = lambda x: x[0]**3/3 - 3*x[1]**2 + 5*x[0] - 6*x[0]*x[1]

# grid for the contour and surface plots
x = y = np.linspace(-10.0, 10.0, 100)
X, Y = np.meshgrid(x, y)
zs = np.array([f((xi, yi)) for xi, yi in zip(np.ravel(X), np.ravel(Y))])
Z = zs.reshape(X.shape)

# constraint curve: ellipse (x/a)^2 + (y/b)^2 = 1, parameterized by theta
a, b = 4, 8
theta = np.linspace(0, 2 * np.pi, 100)
X1 = a * np.cos(theta)
Y1 = b * np.sin(theta)
Z1 = np.array([f((xi, yi)) for xi, yi in zip(X1, Y1)])

# contour plot of the objective with the constraint curve overlaid
fig = plt.figure(dpi=160)
ax2 = fig.add_subplot(111)
ax2.set_aspect('equal', 'box')
ax2.contour(X, Y, Z, 50, cmap=cm.jet)
ax2.plot(X1, Y1)
plt.setp(ax2, xticks=[], yticks=[])

# surface plot of the objective with the constraint curve lifted onto it
fig = plt.figure(dpi=160)
ax3 = fig.add_subplot(111, projection='3d')
ax3.plot_wireframe(X, Y, Z, rstride=2, cstride=2, alpha=0.7, linewidth=0.25)
ax3.plot(X1, Y1, Z1, c='red')
plt.setp(ax3, xticks=[], yticks=[], zticks=[])
ax3.view_init(elev=18, azim=154)
plt.show()
```
Extra material#
Story of William Karush and his contribution to the KKT theorem by Richard W. Cottle (download pdf)