Differentiation converts rate-of-change intuition into rigorous objects such as gradients, Jacobians, and directional derivatives. For a scalar field $f:\mathbb{R}^n \to \mathbb{R}$, the gradient $\nabla f(x)$ is the vector that best linearizes $f$ near $x$, yielding the approximation $f(x+\delta) \approx f(x) + \nabla f(x)^\top \delta$. For vector-valued functions $h:\mathbb{R}^n \to \mathbb{R}^m$, the Jacobian matrix $J_h(x) \in \mathbb{R}^{m \times n}$ collects the partial derivatives $[J_h(x)]_{ij} = \partial h_i / \partial x_j$ and acts as the linear map that transports perturbations: $h(x+\delta) \approx h(x) + J_h(x)\,\delta$. This perspective powers sensitivity propagation in Dynamic Programming and Bellman Equations and local linearization inside EKF and UKF Overview.
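As a quick numerical illustration of this first-order picture (the function cost, the evaluation point, and the perturbation below are made up for the example), the gradient-based prediction tracks the true change up to second-order error:

import jax.numpy as jnp
from jax import grad

# Illustrative scalar field cost: R^2 -> R (hypothetical, for demonstration only)
cost = lambda x: jnp.log(1.0 + x[0] ** 2) + jnp.cos(x[1])

x = jnp.array([0.5, 0.3])
delta = 1e-3 * jnp.array([1.0, -2.0])            # small perturbation

exact = cost(x + delta)                          # true perturbed value
predicted = cost(x) + grad(cost)(x) @ delta      # cost(x) + grad cost(x)^T delta

print("exact    :", exact)
print("predicted:", predicted)
print("gap      :", jnp.abs(exact - predicted))  # O(|delta|^2) for smooth functions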
Directional derivatives specify how costs change along a search direction $d$: $D_d f(x) = \nabla f(x)^\top d$. Second derivatives capture curvature through the Hessian $\nabla^2 f(x)$, whose eigenvalue signature determines whether a stationary point is a minimum, a maximum, or a saddle (sketched after the code example below). When constraints enter, we pair gradients with Lagrange multipliers; the resulting stationarity conditions foreshadow the optimality system derived in Karush-Kuhn-Tucker Conditions.
import jax.numpy as jnp
from jax import grad, jacfwd

# Scalar field f: R^2 -> R and an evaluation point
f = lambda x: jnp.sin(x[0]) + x[1] ** 2
x0 = jnp.array([1.2, -0.4])

# Gradient of the scalar field at x0
grad_f = grad(f)(x0)
print("Gradient:", grad_f)

# Vector-valued map h: R^2 -> R^2 and its Jacobian via forward-mode autodiff
h = lambda x: jnp.array([x[0] * x[1], jnp.exp(x[0])])
print("Jacobian:\n", jacfwd(h)(x0))
Chain rules stitch together component derivatives: the composition rule $J_{g \circ h}(x) = J_g(h(x))\,J_h(x)$ keeps cropping up in optimal control and reinforcement learning. Recognizing when gradients vanish (stationary points) and how to exploit sparsity in Jacobians lays the groundwork for Line Search Methods, shooting formulations in Direct Transcription, and policy updates in Policy Gradient Methods.
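As an illustrative check (reusing h and x0 from the snippet above, with a hypothetical outer cost g), the gradient of the composition computed directly by autodiff matches the hand-assembled chain-rule product $J_h(x_0)^\top \nabla g(h(x_0))$:

g = lambda y: jnp.sum(y ** 2)                   # hypothetical outer cost on h's output

grad_autodiff = grad(lambda x: g(h(x)))(x0)     # gradient of the composition g(h(x))
grad_chain = jacfwd(h)(x0).T @ grad(g)(h(x0))   # J_h(x0)^T grad g(h(x0))

print("autodiff  :", grad_autodiff)
print("chain rule:", grad_chain)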