Exercise set F
Please see the general comment on the tutorial exercises.
Question F.1
Consider a function $f(x) = x^T A x$, where $A$ is a square matrix that is not necessarily symmetric. Is $f(x)$ a quadratic form?
Hint
Recall that the definition of a quadratic form calls for the symmetry of its matrix.
Hint
Given a square matrix $A$, the matrix $\tfrac{1}{2}(A + A^T)$ is symmetric.
Question F.2
Consider a quadratic form $f(x) = x^T A x$, where $A$ is an $n \times n$ matrix and $x \in \mathbb{R}^n$.
Using the product rule of multivariate calculus, derive the gradient and Hessian of $f(x)$.
Hint
You can assume that the matrix $A$ is symmetric.
Question F.3
This exercise takes you on a tour of a binary logit model and its properties.
Consider a model in which a decision maker is making a choice between two alternatives.
To rationalize the data, the econometrician assumes that the utility of each alternative $j \in \{1, 2\}$ is given by a scalar product of a vector of parameters $\theta$ and a vector of observed attributes $x_j$ of that alternative.
In line with the random utility model, the econometrician also assumes that the utility of each alternative contains an additively separable random component with an appropriately centered type I extreme value distribution, such that the choice probabilities for the two alternatives are given by a vector function $p(\theta)$ with components

$$p_j(\theta) = \frac{\exp(x_j^T \theta)}{\exp(x_1^T \theta) + \exp(x_2^T \theta)}, \quad j = 1, 2,$$

where $x_j$ is the attribute vector of alternative $j$ and $\theta$ is the vector of parameters.
In order to estimate the vector of parameters of the model, the econometrician maximizes the log-likelihood of the observed choices,
where the individual log-likelihood contribution is given by a scalar product of the vector of observed choice indicators and the vector of logged choice probabilities.
Assignments:
Write down the optimization problem the econometrician is solving. Explain the meaning of each part.
What are the variables the econometrician has control over in the estimation exercise?
What variables should be treated as parameters of the optimization problem?
Elaborate on whether the solution can be guaranteed to exist.
What theorem should be applied?
What conditions of the theorem are met?
What conditions of the theorem are not met?
Derive the gradient and Hessian of the log-likelihood function. Make sure that all multiplied vectors and matrices are conformable.
Derive conditions under which the likelihood function has a unique maximizer (and thus the logit model has a unique maximum likelihood estimator).
Solutions
Question F.1
See the last answer in this math stackexchange post.
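The symmetrization argument behind the linked answer can be checked numerically. The sketch below (example matrix and point assumed for illustration, not from the original) verifies that $x^T A x$ is unchanged when $A$ is replaced by its symmetric part $\tfrac{1}{2}(A + A^T)$:

```python
# Numerical sketch: any x'Ax equals x'Bx with the symmetric part
# B = (A + A') / 2, so symmetry of the matrix can be assumed
# without loss of generality.

def quad_form(M, x):
    """Compute x' M x for a 2x2 matrix M and a 2-vector x."""
    return sum(x[i] * M[i][j] * x[j] for i in range(2) for j in range(2))

A = [[1.0, 4.0],
     [2.0, 3.0]]                          # non-symmetric example matrix
B = [[(A[i][j] + A[j][i]) / 2 for j in range(2)] for i in range(2)]

x = [0.7, -1.3]
print(quad_form(A, x), quad_form(B, x))   # identical values
```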
Question F.2
A possible answer:
Represent the quadratic form as a dot product of two functions: $f(x) = g(x) \cdot h(x)$, where $g(x) = x$ and $h(x) = Ax$. The Jacobian of $g$ is the identity matrix $I$, and the Jacobian of $h$ is $A$.
The last Jacobian can be easily derived by representing matrix multiplication as a linear combination of columns: differentiating $Ax$ with respect to each element of $x$ gives the corresponding column of $A$, so the Jacobian of $x \mapsto Ax$ is $A$ itself.
Applying the dot product rule of differentiation, we have

$$Df(x) = h(x)^T Dg(x) + g(x)^T Dh(x) = (Ax)^T I + x^T A = x^T A^T + x^T A = x^T (A + A^T) = 2 x^T A.$$
The last transformation uses the transpose-of-a-product rule together with the symmetry of $A$.
The final answer is the gradient $\nabla f(x) = 2Ax$ and the Hessian $Hf(x) = 2A$.
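As a quick sanity check (a sketch, not part of the original solution; the symmetric matrix and evaluation point are assumed for illustration), a central finite difference reproduces the gradient $2Ax$:

```python
# Finite-difference check that for symmetric A the gradient of
# f(x) = x'Ax equals 2Ax.

A = [[2.0, 1.0],
     [1.0, 3.0]]                          # symmetric example matrix

def f(x):
    return sum(x[i] * A[i][j] * x[j] for i in range(2) for j in range(2))

x = [0.5, -1.0]
h = 1e-6
# central difference along each coordinate direction
grad_fd = [(f([x[0] + h * (i == 0), x[1] + h * (i == 1)]) -
            f([x[0] - h * (i == 0), x[1] - h * (i == 1)])) / (2 * h)
           for i in range(2)]
# analytic gradient 2Ax
grad_exact = [2 * sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]
print(grad_fd, grad_exact)                # agree up to truncation error
```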
Question F.3
The optimization problem is:
We can control $\theta$ (the coefficients to be estimated). We treat the data on attributes and observed choices as parameters of the optimization problem.
We can try to apply the Weierstrass extreme value theorem.
The objective function is continuous, so this condition of the theorem is met.
But the domain is not compact (the parameter space is closed but unbounded), so this condition of the theorem is not met.
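A hypothetical one-observation example (values assumed for illustration, not from the original) shows why unboundedness of the domain matters: with a single decision maker choosing alternative 1 and attribute difference $z = x_1 - x_2 > 0$, the log-likelihood $\log p_1(\theta) = -\log(1 + e^{-z\theta})$ increases monotonically in $\theta$ and approaches its supremum of $0$ without ever attaining it, so a maximizer need not exist.

```python
import math

def loglike(theta, z=1.0):
    """Log-likelihood of choosing alternative 1 in a binary logit
    with scalar attribute difference z = x1 - x2."""
    return -math.log(1.0 + math.exp(-z * theta))

# the log-likelihood keeps increasing along the ray theta -> infinity
for theta in (0.0, 1.0, 5.0, 20.0):
    print(theta, loglike(theta))
```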
Denote by $J$ the Jacobian of the choice probability function $p$ with respect to $\theta$, and by $H$ the Hessian of $p$ with respect to $\theta$.
Notice that
We calculate the three terms on the r.h.s. one by one:
Thus,
The Jacobian (gradient) of the MLE objective function is:
We set the gradient to zero to obtain the first order conditions.
Thus, the Hessian of the MLE objective function is:
If the Hessian
is negative definite for all $\theta$, then the MLE objective is strictly concave, so the first order conditions have at most one solution, and any solution of the first order conditions is the unique maximizer of the log-likelihood function, i.e., the unique maximum likelihood estimator. Note that negative definiteness of the Hessian alone does not guarantee that a maximizer exists: a strictly concave function may increase along a ray without ever attaining its supremum.
Let us find the conditions under which the Hessian is negative definite.
Notice that
Thus, we get a sufficient condition for a unique maximizer: the attribute vectors of the two alternatives must differ, $x_1 \ne x_2$.
It is easy to show that this condition is also necessary: if $x_1 = x_2$, both choice probabilities equal one half for every value of $\theta$, so the log-likelihood is constant in $\theta$ and has no unique maximizer.
Thus, the logit model has a unique ML estimator if and only if $x_1 \ne x_2$.
The intuition is that for the model to be estimable (identified, in econometrics jargon), the two alternatives cannot have identical attributes ($x_1 \ne x_2$).
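The identification condition can be illustrated numerically. In the sketch below (scalar parameter and example values assumed, not from the original), the second derivative of one log-likelihood contribution in a binary logit with attribute difference $z = x_1 - x_2$ equals $-z^2\, p\,(1 - p)$: strictly negative whenever the alternatives differ, and identically zero when they coincide.

```python
import math

def hessian_contribution(z, theta):
    """Second derivative of one binary-logit log-likelihood contribution
    with respect to a scalar parameter theta, where z = x1 - x2."""
    p = 1.0 / (1.0 + math.exp(-z * theta))   # probability of alternative 1
    return -z**2 * p * (1.0 - p)

print(hessian_contribution(1.5, 0.3))   # negative: strictly concave
print(hessian_contribution(0.0, 0.3))   # zero: identification fails, x1 == x2
```

Note that the second derivative does not depend on which alternative was actually chosen, which is why concavity of the binary logit likelihood is a property of the attributes alone.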