DEV Community

Michael
Michael

Posted on • Originally published at gbase.cn

count(*) vs count(1) vs count(column_name) in GBase 8c

In GBase 8c, the China‑domestically developed database from GBASE, count is one of the most frequently used functions. Developers often mix count(*), count(1), and count(column_name) without fully understanding the differences. This article breaks down their similarities and critical distinctions with concrete examples.

What They Have in Common

  • Purpose: All three aggregate and return the number of rows as bigint.
  • Usage: They all work with WHERE, GROUP BY, and HAVING clauses.
  • Result without NULLs: When no NULLs exist in the specified columns, all three return identical results.

Test table:

CREATE TABLE student(id INT PRIMARY KEY, name VARCHAR(50), age INT);
INSERT INTO student VALUES(1, 'aaa', 20), (2, 'bbb', 21), (3, 'ccc', 22);
Enter fullscreen mode Exit fullscreen mode
SELECT count(*) FROM student;   -- 3
SELECT count(1) FROM student;   -- 3
SELECT count(id) FROM student;  -- 3
Enter fullscreen mode Exit fullscreen mode

Key Differences

1. NULL Handling

  • count(*) and count(1): Count row existence, unaffected by NULLs in any column.
  • count(column_name): Counts only non‑NULL values in the specified column; NULLs are silently ignored.

Insert a row with NULLs:

INSERT INTO student VALUES(4, NULL, NULL);
Enter fullscreen mode Exit fullscreen mode
SELECT count(*) FROM student;   -- 4
SELECT count(1) FROM student;   -- 4
SELECT count(name) FROM student;-- 3
SELECT count(age) FROM student; -- 3
SELECT count(id) FROM student;  -- 4
Enter fullscreen mode Exit fullscreen mode

2. Semantics and Readability

  • count(*): The SQL standard way — "count all rows". Clear and universal.
  • count(1): An extension — "count if the record exists" via the constant 1. Less readable.
  • count(column_name): Means "count non‑NULL values in this column", not simply "count rows".

3. Performance

  • count(*) and count(1): GBase 8c's optimizer transforms both into the same execution plan. No full table scan is required; both are equally fast.
  • count(column_name): Performance depends on indexing. With an index, it's efficient. Without an index, a full table scan plus row‑by‑row NULL checks is needed — the larger the table, the bigger the gap.

Recommendations

  • To count rows: Use count(*). It's clear, standard, and optimized in a gbase database.
  • To count non‑NULL values: Use count(column_name) and ensure the column is indexed.
  • count(1) works as a substitute, but it's not the best choice when readability matters.

Understanding these distinctions helps you write more precise and performant queries on GBASE's GBase 8c, avoiding subtle statistical errors caused by NULLs.

Top comments (0)