DEV Community

Priyam Jain

Understanding Row Space, Column Space, Rank, and Nullity with a Simple House Price Dataset

Linear algebra is everywhere in machine learning, even if it’s often hidden behind frameworks like PyTorch or TensorFlow. Concepts like row space, column space, rank, and nullity aren’t just abstract math: they tell us what a linear model can represent, learn, and predict from a given dataset.

In this blog, we’ll explore these ideas using a simple house price prediction dataset.


1. The Dataset

Suppose we have a dataset of houses with 3 features:

| House | Size (sq ft) | Bedrooms | Age (years) |
|-------|--------------|----------|-------------|
| 1     | 1200         | 3        | 10          |
| 2     | 1500         | 4        | 5           |
| 3     | 2700         | 7        | 15          |

We can represent this as a matrix \(X\):

\[
X =
\begin{bmatrix}
1200 & 3 & 10 \\
1500 & 4 & 5 \\
2700 & 7 & 15
\end{bmatrix}
\]

Rows = houses
Columns = features
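As a minimal sketch, the same matrix can be written in plain Python as a list of rows. The variable names here are illustrative, and house 3's age is taken as 15 so that the third row equals the sum of the first two, which is the relationship the rest of the post relies on.

```python
# Feature matrix X: one row per house, one column per feature.
# Assumption: house 3's age is 15, so that row 3 = row 1 + row 2.
X = [
    [1200, 3, 10],  # house 1: size (sq ft), bedrooms, age (years)
    [1500, 4, 5],   # house 2
    [2700, 7, 15],  # house 3
]

rows = [list(r) for r in X]           # each row describes one house
columns = [list(c) for c in zip(*X)]  # each column describes one feature

print("rows:   ", rows)
print("columns:", columns)
```

Transposing with `zip(*X)` is all it takes to switch between the "houses" view (rows) and the "features" view (columns).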


2. Column Space: The Space of Predictions

The column space is the set of all linear combinations of the columns of a matrix.

In simple terms:

  • Each column represents a feature across all houses.
  • For a linear model \(\hat{y} = Xw\), the column space is the set of all prediction vectors the model can produce for these houses: every choice of weights \(w\) gives some combination of the columns.

For our dataset, column space captures the independent patterns in the features.

For example:

  • Column 1 = [1200, 1500, 2700] → “Size pattern”
  • Column 2 = [3, 4, 7] → “Bedrooms pattern”
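  • Column 3 = [10, 5, 15] → “Age pattern”

If one column can be written as a combination of the others, it adds no new direction to the column space. That happens here: Size = 360 × Bedrooms + 12 × Age for every house, so only two of the three columns are independent.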
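To make the "combination of columns" idea concrete, here is a small sketch showing that the predictions of a linear model are a weighted sum of the feature columns. The weights `w` are hypothetical numbers picked only for illustration, and house 3's age is assumed to be 15.

```python
X = [
    [1200, 3, 10],
    [1500, 4, 5],
    [2700, 7, 15],  # assumption: age 15, so row 3 = row 1 + row 2
]
size, bedrooms, age = (list(col) for col in zip(*X))

# Hypothetical weights, not fitted to any real prices.
w = [100, 5000, -200]

# Predictions computed house by house: y_hat[i] = X[i] . w
y_hat = [sum(x * wj for x, wj in zip(row, w)) for row in X]

# The same vector written as a weighted sum of the three columns.
combo = [w[0] * s + w[1] * b + w[2] * a
         for s, b, a in zip(size, bedrooms, age)]

print(y_hat == combo)  # the two views agree

# The size column is itself a combination of the other two columns,
# so it adds no new direction to the column space.
size_from_others = [360 * b + 12 * a for b, a in zip(bedrooms, age)]
print(size == size_from_others)
```

Whatever weights you choose, the prediction vector stays inside the span of the columns; that span is the column space.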


3. Row Space: The Feature Patterns in the Dataset

The row space is the set of all linear combinations of the rows.

  • Each row represents one house (all its features).
  • Row space captures all feature patterns that exist in the dataset.

Let’s look closely:

  • \(r_1 = [1200, 3, 10]\)
  • \(r_2 = [1500, 4, 5]\)
  • \(r_3 = [2700, 7, 15]\)

Notice:

\[
r_3 = r_1 + r_2
\]

so \(r_3\) is a linear combination of \(r_1\) and \(r_2\).

✅ This means:

  • r3’s features (size, bedrooms, age) can be predicted exactly from r1 and r2
  • Mathematically, it adds no new direction in the row space

So, in essence:

Row space = all house patterns that can be represented by combining existing houses

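In ML terms, a linear model can only learn how price changes along directions in the row space. A new house with a component outside it (a mix of size, bedrooms, and age that no combination of the training houses produces) may get unreliable predictions, because the data say nothing about that direction. Size alone is not the problem: a house that is just a scaled-up copy of an existing one still lies in the row space.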
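Here is a rough sketch of how you could test whether a house lies in the row space of this particular dataset. It assumes house 3's age is 15 and leans on a shortcut that only works because the rows span a 2D plane inside 3D feature space: the cross product of r1 and r2 is perpendicular to that plane, so a house is in the row space exactly when its dot product with that normal is zero. The helper names are made up for this example.

```python
r1 = [1200, 3, 10]
r2 = [1500, 4, 5]
r3 = [2700, 7, 15]  # assumption: age 15

# r3 adds no new direction: it is exactly r1 + r2.
print(r3 == [a + b for a, b in zip(r1, r2)])

def cross(u, v):
    """Cross product of two 3D vectors."""
    return [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Normal to the plane spanned by r1 and r2 (valid only for this 3D, rank-2 case).
normal = cross(r1, r2)

def in_row_space(house):
    """True if the house is a combination of r1 and r2."""
    return dot(house, normal) == 0

print(in_row_space([2400, 6, 20]))   # 2 * r1: large, but still in the row space
print(in_row_space([3900, 10, 25]))  # 2 * r1 + r2
print(in_row_space([1200, 3, 20]))   # same as house 1 but older: outside
```

For larger datasets this shortcut no longer applies; a more general check compares the rank of the matrix before and after appending the new house as an extra row.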


4. Rank: How Many Independent Directions Exist

  • Rank = number of linearly independent rows, which always equals the number of linearly independent columns
  • In our example:

    • r1 and r2 are independent
    • r3 is dependent
  • So, rank = 2

Rank tells us how many independent patterns exist in the dataset.
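One way to check the rank by hand is Gaussian elimination: reduce the matrix and count the pivots. The sketch below uses Python's `fractions.Fraction` so the arithmetic is exact; the function name `matrix_rank` is my own, and house 3's age is assumed to be 15. In day-to-day work you would more likely call a library routine such as NumPy's `np.linalg.matrix_rank`.

```python
from fractions import Fraction

def matrix_rank(matrix):
    """Count pivots after Gaussian elimination, using exact fractions."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        # Look for a row not yet used as a pivot with a nonzero entry here.
        pivot = next((i for i in range(rank, n_rows) if m[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate the entries below the pivot.
        for i in range(rank + 1, n_rows):
            factor = m[i][col] / m[rank][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank

X = [
    [1200, 3, 10],
    [1500, 4, 5],
    [2700, 7, 15],  # assumption: age 15, so row 3 = row 1 + row 2
]
X_T = [list(col) for col in zip(*X)]

print(matrix_rank(X))      # rank counted over rows
print(matrix_rank(X_T))    # same number when counted over columns
print(matrix_rank(X[:2]))  # dropping house 3 does not lower the rank
```

The row count and the column count agree, and removing the dependent house leaves the rank unchanged, which is exactly what "r3 adds no new direction" means.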


5. Null Space: What the Model Ignores

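  • Null space = all vectors \(x\) such that \(Xx = 0\)
  • Intuition: combinations of feature weights that do not affect predictions. Adding a null-space vector to the weights leaves \(Xw\) unchanged for every house in the dataset
  • A nonzero vector in the null space means the features are linearly dependent: one feature can be written as a combination of the others, so the data cannot pin down its weight
  • Nullity = dimension of the null space. By the rank-nullity theorem, rank + nullity = number of columns, so here nullity = 3 - 2 = 1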
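As a quick check, the sketch below verifies one null-space vector for this dataset and shows that shifting the weights along it leaves the predictions untouched. It assumes house 3's age is 15; the vector [1, -360, -12] follows from Size = 360 × Bedrooms + 12 × Age, and the weights `w` are hypothetical.

```python
X = [
    [1200, 3, 10],
    [1500, 4, 5],
    [2700, 7, 15],  # assumption: age 15, so row 3 = row 1 + row 2
]

def predict(X, w):
    """Linear predictions X @ w, one value per house."""
    return [sum(x * wj for x, wj in zip(row, w)) for row in X]

# 1 * Size - 360 * Bedrooms - 12 * Age is zero for every house.
null_vec = [1, -360, -12]
print(predict(X, null_vec))  # a vector of zeros

# Hypothetical weights, and the same weights shifted along the null direction.
w = [100, 5000, -200]
w_shifted = [wj + 7 * nj for wj, nj in zip(w, null_vec)]

# Different weights, identical predictions on this dataset.
print(w != w_shifted and predict(X, w) == predict(X, w_shifted))

# Rank-nullity: rank + nullity = number of columns.
n_columns = len(X[0])
rank = 2
nullity = n_columns - rank
print(nullity)
```

This is why a fitted linear model on such data has no single "correct" set of weights: any multiple of the null vector can be added without changing what it predicts for these houses.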

6. Why This Matters in Machine Learning

  1. Column space = all possible outputs the model can produce
  2. Row space = all feature patterns the model sees during training
  3. Rank = how much independent information is in the data
  4. Null space = directions that don’t affect predictions

Understanding these helps with:

  • Detecting redundant features
  • Avoiding multicollinearity in regression
  • Understanding limitations of your model for prediction
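The multicollinearity point can be illustrated with the normal equations of ordinary least squares, \(X^T X w = X^T y\), which have a unique solution only when \(X^T X\) is invertible. The sketch below (house 3's age assumed to be 15, helper names invented for this example) shows that \(X^T X\) is singular with all three features and becomes invertible once the redundant size column is dropped.

```python
X = [
    [1200, 3, 10],
    [1500, 4, 5],
    [2700, 7, 15],  # assumption: age 15, so row 3 = row 1 + row 2
]

def gram(columns):
    """X^T X built from a list of feature columns."""
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in columns]
            for ci in columns]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

size, bedrooms, age = (list(col) for col in zip(*X))

# All three features: determinant 0, so X^T X cannot be inverted.
print(det3(gram([size, bedrooms, age])))

# Drop the redundant size column: the determinant is no longer 0.
print(det2(gram([bedrooms, age])))
```

Which feature to drop is a modelling choice; any one of the three would restore invertibility here, since each can be written in terms of the other two.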

7. Summary Table

| Concept      | Intuition (House Price)                               |
|--------------|-------------------------------------------------------|
| Column space | All possible patterns of predictions using features   |
| Row space    | All feature patterns that exist in the dataset        |
| Rank         | Number of independent feature/house patterns          |
| Null space   | Feature combinations that don’t influence prediction  |

8. Key Takeaways

  • A dependent row like r3 adds no new direction to the row space, but it is still a labelled example, so we usually keep it in training: its price gives the fit one more data point to average over
  • Column space tells you what predictions are possible
  • Row space tells you what the model can learn from the dataset

By looking at row space and column space, you can visualize:

  • What your dataset contains
  • What your model can learn
  • Which features or houses are redundant

Understanding these concepts helps make linear regression and deep learning models more interpretable.

