Calculus — Differentiation

Differentiate scalar and vector-valued functions, producing the gradients and Jacobians needed for optimization and estimator design.

Differentiation converts rate-of-change intuition into rigorous objects such as gradients, Jacobians, and directional derivatives. For a scalar field $f: \mathbb{R}^n \to \mathbb{R}$, the gradient $\nabla f(\mathbf{x})$ is the vector that best linearizes $f$ near $\mathbf{x}$, yielding the approximation $f(\mathbf{x}+\Delta \mathbf{x}) \approx f(\mathbf{x}) + \nabla f(\mathbf{x})^\top \Delta \mathbf{x}$. For vector-valued functions $g: \mathbb{R}^n \to \mathbb{R}^m$, the Jacobian matrix $J_g(\mathbf{x})$ collects the partial derivatives and acts as the linear map that transports perturbations. This perspective powers sensitivity propagation in Dynamic Programming and Bellman Equations and local linearization inside EKF and UKF Overview.

Directional derivatives specify how costs change along a search direction $\mathbf{p}$: $D_{\mathbf{p}} f(\mathbf{x}) = \nabla f(\mathbf{x})^\top \mathbf{p}$. Second derivatives capture curvature through the Hessian $\nabla^2 f(\mathbf{x})$, which determines whether a stationary point is a minimum, maximum, or saddle via eigenvalue signatures. When constraints enter, we pair gradients with Lagrange multipliers; the resulting stationarity conditions foreshadow the optimality system derived in Karush-Kuhn-Tucker Conditions.

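The short JAX snippet below computes both objects for concrete functions: grad returns the gradient of a scalar field, and jacfwd assembles the Jacobian of a vector-valued map at the same point.
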
import jax.numpy as jnp
from jax import grad, jacfwd

# Scalar field f: R^2 -> R and an evaluation point.
f = lambda x: jnp.sin(x[0]) + x[1] ** 2
x0 = jnp.array([1.2, -0.4])

# Reverse-mode autodiff returns the gradient of f at x0.
grad_f = grad(f)(x0)
print("Gradient:", grad_f)

# Vector-valued map h: R^2 -> R^2; forward-mode jacfwd assembles its Jacobian at x0.
h = lambda x: jnp.array([x[0] * x[1], jnp.exp(x[0])])
print("Jacobian:\n", jacfwd(h)(x0))

The chain rule stitches together component derivatives: the identity $\frac{d}{dt} f(\mathbf{x}(t)) = \nabla f(\mathbf{x}(t))^\top \dot{\mathbf{x}}(t)$ keeps cropping up in optimal control and reinforcement learning. Recognizing when gradients vanish (stationary points) and how to exploit sparsity in Jacobians lays the groundwork for Line Search Methods, shooting formulations in Direct Transcription, and policy updates in Policy Gradient Methods.
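
As a quick sketch of that identity, take an assumed trajectory x(t) = (sin t, t^2), chosen here purely for illustration, and compare differentiating the composition t -> f(x(t)) directly against the dot product of the gradient with the velocity.

import jax.numpy as jnp
from jax import grad, jacfwd

f = lambda x: jnp.sin(x[0]) + x[1] ** 2             # same scalar field as above
x_of_t = lambda t: jnp.array([jnp.sin(t), t ** 2])  # assumed trajectory x(t)

t = 0.7
# Left-hand side: differentiate the composed map t -> f(x(t)) directly.
lhs = grad(lambda s: f(x_of_t(s)))(t)
# Right-hand side: gradient at x(t) dotted with the velocity x_dot(t).
rhs = grad(f)(x_of_t(t)) @ jacfwd(x_of_t)(t)
print("d/dt f(x(t)):", lhs, " grad . x_dot:", rhs)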

See also