🔬 Tutorial problems epsilon

\(\epsilon\).1

Consider a function \(f : \mathbb{R}^N \ni x \mapsto x^{T}Bx \in \mathbb{R}\), where the \(N \times N\) matrix \(B\) is not symmetric.

Show that the same function can be represented as \(x^{T}Ax\) where \(A\) is symmetric.

Given a square matrix \(M\), you can use the identity \(M = \tfrac{1}{2}(M+M^{T}) + \tfrac{1}{2}(M-M^{T})\), where the first component is symmetric and the second is skew-symmetric (its transpose equals its negative).

Fact

If \(A\) and \(B\) are conformable for matrix multiplication, then

\[ (AB)^{T} = B^T A^T \]
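A numerical sanity check can complement the proof, although it does not replace it. The sketch below, in plain Python, builds a random non-symmetric \(B\), takes the symmetric component from the identity in the hint, and compares the two quadratic forms at a random point. The helper names (`transpose`, `symmetric_part`, `quadratic_form`) are illustrative and not part of the course material.

```python
import random


def transpose(M):
    """Return the transpose of a square matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def symmetric_part(M):
    """Return the symmetric component (M + M^T) / 2 from the hint."""
    Mt = transpose(M)
    n = len(M)
    return [[0.5 * (M[i][j] + Mt[i][j]) for j in range(n)] for i in range(n)]


def quadratic_form(M, x):
    """Return the scalar x^T M x."""
    n = len(x)
    return sum(x[i] * M[i][j] * x[j] for i in range(n) for j in range(n))


random.seed(0)  # fixed seed so the check is reproducible
N = 4
B = [[random.uniform(-1, 1) for _ in range(N)] for _ in range(N)]  # generic, not symmetric
A = symmetric_part(B)
x = [random.uniform(-1, 1) for _ in range(N)]

# Difference between the two quadratic forms; expected to be at rounding level.
gap = abs(quadratic_form(B, x) - quadratic_form(A, x))
print("difference between x'Bx and x'Ax:", gap)
```

Agreement at one random point is only evidence, so the algebraic argument is still needed to cover every \(x\).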

\(\epsilon\).2

Consider a function \(f : \mathbb{R}^N \ni {\bf x} \mapsto {\bf x}'A{\bf x} \in \mathbb{R}\), where the \(N \times N\) matrix \(A\) is symmetric.

Using the product rule of multivariate calculus, derive the gradient and Hessian of \(f\). Make sure that all multiplied vectors and matrices are conformable.

You can assume that \({\bf x}\) is a column vector, and that any vector function of \({\bf x}\) is also a column vector.

Definition

Let \(A\) denote an open set in \(\mathbb{R}^N\), and let \(f \colon A \to \mathbb{R}\). Assume that \(f\) is twice differentiable at \(x \in A\).

The total derivative of the gradient \(\nabla f\) of the function \(f\) at the point \(x\) is called the Hessian matrix of \(f\), denoted by \(Hf\) or \(\nabla^2 f\), and is given by the \(N \times N\) matrix

\[\begin{split} Hf(x) = \nabla^2 f(x) = \left( \begin{array}{ccc} \frac{\partial^2 f}{\partial x_1 \partial x_1}(x) & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_N}(x) \\ \vdots & \ddots & \vdots \\ \frac{\partial^2 f}{\partial x_N \partial x_1}(x) & \cdots & \frac{\partial^2 f}{\partial x_N \partial x_N}(x) \end{array} \right) \end{split}\]
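Once the gradient and Hessian have been derived by hand, the formulas can be compared with finite-difference approximations. The sketch below is one possible way to do this in plain Python. The matrix `A`, the point `x0`, the step sizes, and the helper names are all arbitrary choices made for illustration.

```python
def quadratic_form(A, x):
    """Return the scalar x' A x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))


def numerical_gradient(f, x, h=1e-5):
    """Approximate the gradient of f at x by central differences."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2 * h))
    return grad


def numerical_hessian(f, x, h=1e-4):
    """Approximate the Hessian of f at x by second-order central differences."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                # Evaluate f after moving coordinate i by si*h and coordinate j by sj*h.
                y = list(x)
                y[i] += si * h
                y[j] += sj * h
                return f(y)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H


# Arbitrary symmetric matrix and evaluation point (illustrative values only).
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, -1.0],
     [0.0, -1.0, 4.0]]
x0 = [1.0, -2.0, 0.5]


def f(x):
    return quadratic_form(A, x)


grad_num = numerical_gradient(f, x0)
hess_num = numerical_hessian(f, x0)
print("numerical gradient:", grad_num)
print("numerical Hessian:", hess_num)
```

Evaluating the hand-derived expressions at the same `x0` and comparing them entry by entry with `grad_num` and `hess_num` gives a quick check on both the formulas and the conformability of the products.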

\(\epsilon\).3

In which direction should one move from a given point in order to increase the value of the function most rapidly:

  1. \(\quad\) \(f(x,y) = 4x^2y\) from the point \((2,3)\)

  2. \(\quad\) \(f(x,y) = y^2 e^{3x}\) from the point \((0,3)\)

Present your answer as a vector of length 1.

[Simon and Blume, 1994]: Exercises 14.18, 14.19

Review the definition and facts about the gradient of a multivariate function.
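A hand-computed direction can be cross-checked numerically by approximating the gradient with central differences and rescaling it to unit length. The following plain-Python sketch does this for the two functions above. The helper names and the step size `h` are assumptions made for illustration.

```python
import math


def numerical_gradient(f, point, h=1e-6):
    """Approximate the gradient of f at point by central differences."""
    grad = []
    for i in range(len(point)):
        pp, pm = list(point), list(point)
        pp[i] += h
        pm[i] -= h
        grad.append((f(pp) - f(pm)) / (2 * h))
    return grad


def steepest_ascent_direction(f, point):
    """Return the gradient of f at point rescaled to Euclidean length 1."""
    g = numerical_gradient(f, point)
    norm = math.sqrt(sum(c * c for c in g))
    if norm == 0.0:
        raise ValueError("gradient is zero: no direction of steepest ascent")
    return [c / norm for c in g]


def f1(p):
    x, y = p
    return 4 * x**2 * y


def f2(p):
    x, y = p
    return y**2 * math.exp(3 * x)


d1 = steepest_ascent_direction(f1, [2.0, 3.0])
d2 = steepest_ascent_direction(f2, [0.0, 3.0])
print("direction for problem 1:", d1)
print("direction for problem 2:", d2)
```

The printed vectors are approximations, so they should match the exact answers only up to several decimal places.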

\(\epsilon\).4

A critical point of a multivariate function is a point at which all partial derivatives are zero.

Compute the critical points of the following functions:

  1. \(\quad\) \(x^4+x^2-6xy + 3y^2\)

  2. \(\quad\) \(x^2-6xy+2y^2+10x+2y-5\)

  3. \(\quad\) \(xy^2+x^3y-xy\)

  4. \(\quad\) \(3x^4+3x^2y-y^3\)

  5. \(\quad\) \(x^2+6xy+y^2-3yz+4z^2-10x-5y-21z\)

  6. \(\quad\) \((x^2+2y^2+3z^2) e^{-(x^2+y^2+z^2)}\)

[Simon and Blume, 1994]: Exercises 17.1, 17.2
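Candidate critical points found by hand can be tested by confirming that every partial derivative is numerically close to zero there. The sketch below shows one way to do that in plain Python, using the first function in the list, the origin as an easy candidate, and a second point that is not critical. The helper `is_critical` and the tolerance `tol` are hypothetical choices, not part of the exercises.

```python
def numerical_gradient(f, point, h=1e-6):
    """Approximate the gradient of f at point by central differences."""
    grad = []
    for i in range(len(point)):
        pp, pm = list(point), list(point)
        pp[i] += h
        pm[i] -= h
        grad.append((f(pp) - f(pm)) / (2 * h))
    return grad


def is_critical(f, point, tol=1e-6):
    """Return True if all approximate partial derivatives are within tol of zero."""
    return all(abs(g) < tol for g in numerical_gradient(f, point))


def f1(p):
    """First function in the list: x^4 + x^2 - 6xy + 3y^2."""
    x, y = p
    return x**4 + x**2 - 6 * x * y + 3 * y**2


origin_ok = is_critical(f1, [0.0, 0.0])   # the origin, an easy candidate to verify by hand
shifted_ok = is_critical(f1, [1.0, 0.0])  # a point where the partial derivatives do not vanish
print("origin is critical:", origin_ok)
print("(1, 0) is critical:", shifted_ok)
```

The same helper accepts points in three variables, so it can also be applied to the last two functions in the list. It only confirms candidates and does not find all critical points, which still requires solving the first-order conditions.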

\(\epsilon\).5

Compute the directional derivative of \(f(x,y) = xy^2 + x^3y\) at the point \((4,-2)\) in the direction of the vector \((1/\sqrt{10},3/\sqrt{10})\).

Proceed in two different ways:

  • first, using the definition of the directional derivative, write down a function \(g \colon \mathbb{R} \to \mathbb{R}\) of \(h\) given as a slice of the original function \(f(x,y)\) through the given point in the given direction; then differentiate this function and compute the derivative at \(h=0\)

  • second, use the gradient formula, and verify that the same answer is obtained

[Simon and Blume, 1994]: Exercise 14.20

Follow the example in the lecture notes.
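The two approaches can be mirrored numerically as a check on the hand calculation. In the plain-Python sketch below, the slice \(g(h)\) is differentiated at \(h=0\) by a central difference, and the gradient route uses numerically approximated partial derivatives rather than the analytic gradient the exercise asks for. The helper names and the step size are assumptions made for illustration.

```python
import math


def f(x, y):
    return x * y**2 + x**3 * y


point = (4.0, -2.0)
direction = (1 / math.sqrt(10), 3 / math.sqrt(10))  # already of length 1


def g(h):
    """Slice of f through `point` along `direction`, as a function of h."""
    return f(point[0] + h * direction[0], point[1] + h * direction[1])


def derivative_at_zero(func, step=1e-6):
    """Central-difference approximation of func'(0)."""
    return (func(step) - func(-step)) / (2 * step)


def numerical_partials(x, y, step=1e-6):
    """Central-difference approximations of the two partial derivatives of f."""
    fx = (f(x + step, y) - f(x - step, y)) / (2 * step)
    fy = (f(x, y + step) - f(x, y - step)) / (2 * step)
    return fx, fy


# Way 1: differentiate the one-dimensional slice at h = 0.
via_slice = derivative_at_zero(g)

# Way 2: inner product of the (approximate) gradient with the direction.
fx, fy = numerical_partials(*point)
via_gradient = fx * direction[0] + fy * direction[1]

print("via slice:   ", via_slice)
print("via gradient:", via_gradient)
```

Both printed values should agree with each other and with the exact answer up to the accuracy of the finite differences.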