Linear Algebra

Null Space and Image Space

What the map destroys, and where it can land

01 · First principles
Two questions you can ask any map

A matrix A is a machine: vectors in, vectors out. Before computing anything, there are exactly two structural questions worth asking about such a machine. What does it destroy? And what can it produce? Linear algebra gives each answer a subspace.

Working slogan: the null space is the machine's blind spot; the image is its reach.

02 · The picture
A rank-1 map of the plane

Take a 2×2 matrix of rank 1. Geometrically it flattens the entire plane onto a single line (the image), and one entire line of inputs (the null space) lands on the origin. Every other point of the plane shares its output with a whole line of accomplices: x and x + n give the same Ax whenever n is in the null space.

[Figure: the input plane ℝ², with the null-space line sent to 0 by A; the image is a line, so the 2D output has collapsed to 1D.]

A rank-1 map: one input direction (terracotta) is crushed to the origin; all outputs land on one line (green). Whole lines of inputs become single points.
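The picture can be checked numerically. The sketch below assumes NumPy and uses a hypothetical rank-1 matrix chosen only for illustration; it reads the image direction and the null direction off the singular value decomposition, then confirms that x and x + n share an output.

```python
import numpy as np

# Hypothetical rank-1 matrix: both columns are multiples of (1, 2).
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])

# SVD: A = U @ diag(s) @ Vt, with singular values s in decreasing order.
U, s, Vt = np.linalg.svd(A)
tol = 1e-10
rank = int(np.sum(s > tol))

image_dir = U[:, 0]  # left singular vector with nonzero singular value: spans the image
null_dir = Vt[-1]    # right singular vector with zero singular value: spans the null space

# Any input and its "accomplice" x + t * null_dir give the same output.
x = np.array([3.0, -1.0])
same_output = np.allclose(A @ x, A @ (x + 5.0 * null_dir))

print("rank:", rank)
print("A @ null_dir is zero:", np.allclose(A @ null_dir, 0.0))
print("x and x + n share an output:", same_output)
```

For this particular A, the image is the line through (1, 2) and the null space is the line through (2, -1); the SVD returns unit vectors along those lines, up to sign.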

03 · The payoff
Solving Ax = b is two membership tests

Every question about solving a linear system reduces to these two subspaces. Solvability is a question about the image; uniqueness is a question about the null space. They are independent questions, which is why all four combinations occur.

b ∈ image?    Null space     Solutions of Ax = b
yes           trivial {0}    exactly one
yes           nontrivial     infinitely many: x* + (anything in the null space)
no            either         none: b is out of reach (least squares finds the nearest reachable point)

The structure of the infinite case deserves a sentence: solutions form a translated copy of the null space. Find one particular solution x*, and every other solution is x* plus a null vector — the ambiguity in the answer is exactly the blind spot of the map. The unreachable case is where least squares lives: project b onto the image and solve for that instead.
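The two membership tests can be sketched in a few lines. This is a minimal illustration assuming NumPy; the helper name describe_solutions and its tolerance are hypothetical choices, not a standard API. It uses np.linalg.lstsq for a particular (least-squares) solution and the SVD for a basis of the null space.

```python
import numpy as np

def describe_solutions(A, b, tol=1e-10):
    """Classify Ax = b: is b in the image, and is the null space trivial?"""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)

    # Null space: right singular vectors whose singular value is (numerically) zero.
    _, s, Vt = np.linalg.svd(A)
    rank = int(np.sum(s > tol * s.max()))
    null_basis = Vt[rank:].T  # columns span the null space; shape (n, n - rank)

    # lstsq returns the minimum-norm least-squares solution x*.
    x_star = np.linalg.lstsq(A, b, rcond=None)[0]

    # b is in the image exactly when the least-squares residual vanishes.
    reachable = np.linalg.norm(A @ x_star - b) <= tol * max(1.0, np.linalg.norm(b))

    if not reachable:
        verdict = "none"
    elif null_basis.shape[1] == 0:
        verdict = "exactly one"
    else:
        verdict = "infinitely many"
    return verdict, x_star, null_basis

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])

# b = (1, 2) lies on the image line, so solutions exist but are not unique.
verdict, x_star, null_basis = describe_solutions(A, [1.0, 2.0])
other = x_star + 7.0 * null_basis[:, 0]  # x* plus a null vector is another solution
print(verdict, np.allclose(A @ other, [1.0, 2.0]))

# b = (1, 0) is off the image line: no solution, and x_star is the least-squares fit.
verdict_off, x_ls, _ = describe_solutions(A, [1.0, 0.0])
print(verdict_off, A @ x_ls)
```

In the unreachable case, A @ x_ls is the projection of b onto the image, which is the point least squares actually solves for.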

04 · Conservation law
Rank–nullity

The two subspaces are not independent in size. For A mapping ℝⁿ → ℝᵐ:

rank(A) + dim null(A) = n

Here rank(A) is the dimension of the image (the dimensions that survive), and dim null(A) counts the dimensions destroyed.

Read it as a conservation law: every one of the n input dimensions is either transmitted to the output or annihilated; none go missing, none are created. A map cannot crush a direction and keep it. This single identity ties the note together: a wide matrix (n > m) must have a nontrivial null space, because at most m dimensions can survive, so at least n − m are destroyed; a tall matrix (m > n) can be injective but can never fill its codomain, because its rank is at most n < m. (Whether the surviving dimensions are as many as possible is the subject of rank; whether the inputs being crushed were "redundant" columns is linear dependence in disguise.)
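A quick numerical check of the identity, assuming NumPy. The helper rank_and_nullity is a hypothetical name, and the random matrices are illustrative only. Rank is counted from the nonzero singular values, while nullity is counted separately, as the number of orthonormal input directions that A sends to zero.

```python
import numpy as np

def rank_and_nullity(A, tol=1e-10):
    """Count surviving and destroyed input dimensions of A (illustrative helper)."""
    A = np.asarray(A, dtype=float)
    _, s, Vt = np.linalg.svd(A)  # rows of Vt: an orthonormal basis of the input space
    rank = int(np.sum(s > tol))  # dimensions that survive
    nullity = int(sum(np.linalg.norm(A @ v) <= tol for v in Vt))  # dimensions destroyed
    return rank, nullity

rng = np.random.default_rng(0)
wide = rng.standard_normal((2, 5))  # maps R^5 -> R^2, so n = 5, m = 2
tall = rng.standard_normal((5, 2))  # maps R^2 -> R^5, so n = 2, m = 5

for name, M in [("wide", wide), ("tall", tall)]:
    r, k = rank_and_nullity(M)
    n = M.shape[1]
    print(f"{name}: rank={r}, nullity={k}, n={n}, rank + nullity == n: {r + k == n}")
```

The wide matrix is forced to destroy at least three input dimensions. The tall one here destroys none, yet its rank of 2 falls short of the 5-dimensional codomain.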

05 · Why ML cares
Flat directions and unidentifiable parameters

The named connection: the null space is where unidentifiability lives.

  1. Flat directions of the loss. Near a minimum, moving parameters by a small δ changes the model output by roughly Jδ, where J is the Jacobian of outputs with respect to parameters. Any δ in the null space of J leaves the outputs, and hence the loss, unchanged to first order: the loss is locally flat along it. Overparameterised networks have enormous such null spaces, so minima are not isolated points but high-dimensional valleys.
  2. Unidentifiable parameters. In linear regression with collinear features, XᵀX has a nontrivial null space; infinitely many weight vectors fit the data identically, and the data cannot distinguish them. Ridge regression adds λI precisely to remove the null space and pick one answer (see singular matrices).
  3. Reach as a ceiling. A linear layer's outputs live in the image of its weight matrix: rank bounds what the layer can express. Low-rank adapters (LoRA) exploit the converse — restrict the update to a small image and you constrain, cheaply, how much the map can change.
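Items 2 and 3 can be illustrated on toy data. The sketch below assumes NumPy; the synthetic features, the value of λ, and the factor names B and C are all hypothetical choices made for the example and are not taken from any particular library.

```python
import numpy as np

rng = np.random.default_rng(0)

# Unidentifiable parameters: the second feature is exactly twice the first.
x1 = rng.standard_normal(50)
X = np.column_stack([x1, 2.0 * x1])
y = 3.0 * x1

n = np.array([2.0, -1.0])  # X @ n = 2*x1 - 2*x1 = 0, so n is in the null space of X
w_a = np.array([3.0, 0.0])
w_b = w_a + 10.0 * n       # a very different weight vector
same_fit = np.allclose(X @ w_a, X @ w_b)  # the data cannot tell them apart

gram = X.T @ X
gram_rank = np.linalg.matrix_rank(gram, tol=1e-8)  # singular: rank 1, not 2

# Ridge: gram + lam * I is invertible for lam > 0, so one answer is picked.
lam = 1e-3
w_ridge = np.linalg.solve(gram + lam * np.eye(2), X.T @ y)

print("identical predictions:", same_fit)
print("rank of X^T X:", gram_rank)
print("ridge weights:", w_ridge, "component along n:", w_ridge @ n)

# Reach as a ceiling: an update built from factors d x r and r x d has rank at most r.
d, r = 6, 2
B = rng.standard_normal((d, r))
C = rng.standard_normal((r, d))
delta_W = B @ C
update_rank = np.linalg.matrix_rank(delta_W, tol=1e-8)
print("rank of low-rank update:", update_rank, "<= r =", r)
```

The ridge solution has no component along the null direction n: among all weight vectors that fit equally well, the penalty favours the one with the smallest norm, which is what "pick one answer" amounts to here.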