The launch post for php_clickhouse 0.6.0 covered the framing: native binary protocol, soft fork of the stalled SeasClick, modern ClickHouse types, 30-40% faster than HTTP at high throughput. That post landed April 25, 2026. Today (May 1, 2026) the current tag is 0.8.1, and I'm calling the extension stable.
The six days in between were a focused quality cycle, not a feature sprint. Three buckets:
- Performance. Insert and write paths build native ClickHouse columns one at a time directly from row-major input. Peak intermediate PHP memory dropped from N_rows × N_cols zvals to one column.
- Security. Strict full-consumption parsers across Map, narrow-int, Int128 / UInt128, geo, DateTime64, Time64, hex literals, and typed parameters. Wrong-type input throws instead of corrupting memory or coercing silently to zero. Recursive type-conversion gained a depth cap so adversarial server schemas can't blow the stack.
- Stability. Per-Client state moved from file-scope std::map banks onto the zend_object itself. Unblocks ZTS, plugs leaks on bailout, fixes a refcount bug on the progress callback. Insert path recovers the native handle on every server-side rejection point so a thrown insert no longer wedges the connection.
Three releases (0.7.0, 0.8.0, 0.8.1) closed the API gap with the most-used HTTP client, refactored the extension's state model, hardened the insert surface, and surfaced one upstream UB fix that has since merged into clickhouse-cpp.
Here's the work.
0.7.0: Closing the Ergonomics Gap with smi2/phpClickHouse
The native binary protocol is 30-40% faster than HTTP at high throughput. Most teams won't trade a familiar API for that, so the native client has to match the ergonomic surface of the most-used PHP HTTP client (smi2/phpClickHouse). 0.7.0 is the release that actually does that.
What landed:
- setSettings(array) for client-wide ClickHouse settings (max_execution_time, max_memory_usage, async_insert). Per-call settings as a 5th array argument on select() / insert() / execute() / writeStart(). Per-call overrides global.
- Server-side typed parameters via the {name:Type} placeholder syntax. Routed through Query::SetParam so the server quotes and parses according to the declared type. Plain {name} placeholders keep their existing client-side identifier-substitution behavior. Arrays format as ClickHouse array literals so Array(UInt32), Array(String) round-trip cleanly.
- setProgressCallback(?callable) invoked for every Progress packet during a query (rows, bytes, total_rows, written_rows, written_bytes).
- getStatistics() returning rows_read, bytes_read, total_rows, written_rows, written_bytes, blocks, rows_before_limit, applied_limit, elapsed_ms from the last completed query. Reset at the start of each query.
- Structured ClickHouseException: server_code (e.g. 159 for TIMEOUT_EXCEEDED), server_name (DB::Exception), query_id. Populated on server errors and on any throw with a query-id context.
- insertAssoc(table, rows) derives the column list from the keys of the first row.
- SQL helpers: databaseSize(), tablesSize(), partitions(), showTables(), showCreateTable(), getServerUptime(). Each validates identifiers against the safe-character set.
- Sub-second timeouts via connect_timeout_ms, receive_timeout_ms, send_timeout_ms config keys. Override the existing seconds-based keys when present.
- Per-client query log accumulator: enableLogQueries(bool) toggles, getLogQueries() returns and clears. Each entry carries sql, query_id, elapsed_ms, rows_read, bytes_read, error_code, error_message.
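The "per-call overrides global" rule from the first bullet is easiest to picture as a map merge. What follows is a minimal C++ sketch of that idea, not the extension's actual code: the Settings alias and merge_settings() are hypothetical names, and values are kept as plain strings for simplicity.

```cpp
#include <map>
#include <string>

// Hypothetical alias: setting name -> value, both kept as strings.
using Settings = std::map<std::string, std::string>;

// Start from the client-wide settings, then let every per-call entry
// replace the global value that shares its key.
Settings merge_settings(const Settings& global, const Settings& per_call) {
    Settings effective = global;
    for (const auto& entry : per_call) {
        effective[entry.first] = entry.second;  // per-call wins on conflict
    }
    return effective;
}
```

Keys that only exist client-wide pass through untouched, so a per-call array only needs to name the settings it wants to change.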
The other under-the-hood change in 0.7.0 was migrating to a stub-driven arginfo workflow (clickhouse.stub.php → generated clickhouse_arginfo.h). Method parameter and return types are now declared at the engine boundary and visible to Reflection, IDEs, and static analyzers. Behavior is unchanged for correctly-typed callers; wrong-type callers now hit ZPP at the boundary instead of a custom thrown exception inside the method body.
None of 0.7.0 is novel on its own. The point is that without these the native client made you pay an ergonomics tax to get the speed. 0.7.0 settles that tab.
0.8.0: Per-Object State, ZTS, and Streaming
The 0.6.0 / 0.7.0 surface stored per-Client state in seven file-scope std::map<int, ...> banks keyed on Z_OBJ_HANDLE: the Client*, the in-flight insert Block, the ClientStats, the global settings, the progress and profile callbacks, the log toggle, the query log buffer.
That works, and it has three durability problems baked in:
- No ZTS support. Threaded SAPIs share that file-scope state across threads. The 0.6.0 code gated MINIT with a hard error when --enable-zts was on. ClickHouse from RoadRunner / FrankenPHP / Swoole / php-pm was a non-starter.
- Leaks on bailout. PHP's userspace __destruct doesn't run on fatal errors, so the map entries (and the underlying Client* and any half-open insert stream) leaked.
- Refcount bug on the progress callback. A struct copy of the registered callable went stale once the calling scope exited, and the next progress packet hit a freed zval.
0.8.0 moved the per-Client state onto the zend_object itself via custom create_object / free_obj handlers. The seven file-scope maps disappear entirely. ZTS gating at MINIT was deleted in the same release.
The refactor unblocks three things at once:
- Threaded SAPIs. No global state to thread-isolate, so ZTS Linux is a first-class target now. CI grew a linux-zts job (PHP 8.4 ZTS built from source).
- Cleanup on bailout. free_obj runs unconditionally, including on fatal errors. The Client* and any half-open insert stream get torn down properly.
- The progress-callback fix lands. setProgressCallback now uses ZVAL_COPY instead of a struct copy, so the callable doesn't get freed out from under the next packet.
A Windows config.w32 shipped in the same release, rewritten from a 9-line warning stub to a full Windows build script that mirrors config.m4's source list and flags. Optional --enable-clickhouse-openssl plumbing is mirrored via CHECK_LIB("libssl.lib", ...). CI exercises Windows as a build + extension-load smoke test (no live ClickHouse on Windows yet).
Streaming reads
0.8.0 introduced two new read paths for result sets that don't fit comfortably in a single PHP array:
$it = $ch->selectStream("SELECT id, payload FROM events WHERE day = today()");
foreach ($it as $row) {
    process($row);
}
selectStream() returns a ClickHouseRowIterator (Iterator + Countable) that walks blocks lazily. The iterator survives unset($client) because blocks own their column data via shared_ptr.
For unbounded streams where you don't want to count or rewind:
$ch->selectStreamCallback(
    "SELECT id, body FROM events_unbounded",
    fn(array $row) => writeToS3($row),
);
The callback fires once per row as blocks arrive, never accumulating the full result.
The plain select() path is unchanged and remains the faster choice when you actually want a full PHP array. The streaming variants exist for the row-millions case where you don't.
Geo, LowCardinality(Nullable), and the Map matrix
The type surface expanded too:
- Geo types Point, Ring, Polygon, MultiPolygon round-trip via ColumnGeo. Point as [Float64, Float64], the others as nested arrays.
- LowCardinality(Nullable(String)) and LowCardinality(Nullable(FixedString)) round-trip on read and write.
- The insert path now accepts any Map(K, V) over scalar K and V (String, all signed/unsigned integer widths, Float32/64, UUID) plus LowCardinality(String) keys and values. The read path mirrors the same matrix except for LowCardinality keys (vendor gap). Previously only five hardcoded combinations worked.
- SimpleAggregateFunction(f, T) reads transparently as T.
Missing geo support was one of the two big reasons people stayed on the HTTP client. The other was streaming.
Other 0.8.0 surfaces worth naming
- selectStatement() returns a ClickHouseStatement result wrapper: Iterator, Countable, ArrayAccess, JsonSerializable, plus fetchOne() / fetchKeyPair() / fetchColumn() / toArray() / statistics(). Read-only (offsetSet / offsetUnset throw). Carries a per-call stats snapshot so it survives the client running other queries afterwards.
- setVerbose(bool|callable) for protocol-level lifecycle tracing. Pass true for JSON lines on STDERR, or a callable invoked with ($eventName, $context). Events: select_start, data_block, select_finish, execute_start, execute_finish, server_exception. No-op when off, so the hot path stays cheap in production.
- DDL helpers: isExists(), showDatabases(), showProcesslist(), getServerVersion(), tableSize(), truncateTable(), dropPartition(). All identifier args validated; dropPartition SQL-escapes the partition value.
- Client introspection: resetConnection(), getServerInfo() (name, version, revision, timezone, display_name), getCurrentEndpoint() (host/port of the active endpoint when an endpoints[] pool is in use), setProfileCallback(), ping_before_query config key.
- query_id echoed through getStatistics() so callers can correlate a stats snapshot to a server-side query in system.query_log.
- smi2-style sugar: setSettings() returns $this for chaining, setSetting(key, value) for the single-key form, setDatabase(string) issues USE and updates the cached default used by databaseSize() / showTables(), getter aliases (getServerCode(), getServerName(), getQueryId()) on ClickHouseException.
IPv4 / IPv6 crash, fixed
This one's worth calling out as a bug-of-the-release. clickhouse-cpp v2.6.1 made ColumnIPv4 / ColumnIPv6 siblings of (not subclasses of) ColumnUInt32 / ColumnFixedString. The 0.6.0 / 0.7.0 read paths were doing As<ColumnUInt32>() / As<ColumnFixedString>() on IP columns, which now returned null instead of dispatching. The next dereference segfaulted the worker.
Fixed by switching to ColumnIPv*::AsString(row) for canonical dotted-quad / ::1 form. If you hit a crash on IP column reads pre-0.8.0, this is why.
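The failure mode is generic to checked downcasts: once two classes are siblings, a cast from one to the other yields null rather than a usable pointer. Here is a small C++ sketch of that shape. The class names and the as() helper are hypothetical stand-ins, not the clickhouse-cpp types, and the dotted-quad formatting is only illustrative.

```cpp
#include <cstdint>
#include <memory>
#include <string>

// Hypothetical stand-ins: two column types that are siblings under one base.
struct Column { virtual ~Column() = default; };
struct UInt32Column : Column { std::uint32_t value = 0; };
struct IPv4Column : Column {  // sibling of UInt32Column, not a subclass
    std::uint32_t host_order = 0;
    std::string AsString() const {
        return std::to_string((host_order >> 24) & 0xFF) + "." +
               std::to_string((host_order >> 16) & 0xFF) + "." +
               std::to_string((host_order >> 8) & 0xFF) + "." +
               std::to_string(host_order & 0xFF);
    }
};

// Checked downcast: returns nullptr when the dynamic type does not match.
template <typename T>
std::shared_ptr<T> as(const std::shared_ptr<Column>& col) {
    return std::dynamic_pointer_cast<T>(col);
}

// Read path that tests every cast result before dereferencing it.
std::string read_cell(const std::shared_ptr<Column>& col) {
    if (auto ip = as<IPv4Column>(col)) return ip->AsString();
    if (auto num = as<UInt32Column>(col)) return std::to_string(num->value);
    return "<unsupported>";
}
```

The crash described above corresponds to skipping the null check on the as<UInt32Column>() result for an IP column.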
Distribution: pre-built binaries via PIE
Binaries for Linux glibc (x86_64 + arm64) and macOS (x86_64 + arm64) are now available. On a supported platform the install collapses to one line:
pie install iliaal/php_clickhouse
No vendored clickhouse-cpp build, no abseil compile, no five-minute make. TLS still requires the source build (pie install iliaal/php_clickhouse --enable-clickhouse-openssl), but that's a smaller set of users.
0.8.1: The Insert Path That Recovers
0.8.0 was the architecture release. 0.8.1 was the hardening pass: nine rounds of reviewer-driven fixes, mostly on the insert and write surface plus the type-conversion boundary.
The headline bug:
ClickHouseException: cannot execute query while inserting
If a server-side insert rejection (missing table, bad column, CHECK constraint, schema drift) threw out of BeginInsert / SendInsertBlock / EndInsert, the vendored client's inserting_ flag stayed set. Subsequent select / execute on the same handle threw the message above until the caller manually called resetConnection().
0.8.1 wraps every server-side rejection point in a connection-reset-then-rethrow. Same handle stays usable.
Destructor cleanup mirrors the same dirty/clean recovery split: an in-flight streaming insert with sent blocks is dropped via ResetConnection on unset() rather than committed via EndInsert. Clean sessions still EndInsert. Avoids partial commits on script bailout.
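The reset-then-rethrow wrapping can be sketched in a few lines of C++. This is a toy model under stated assumptions, not the extension's code: the Connection struct is hypothetical, its inserting field stands in for the vendored client's inserting_ flag, and with_insert_recovery() is an illustrative name.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical connection model for illustration only.
struct Connection {
    bool inserting = false;

    void BeginInsert(bool server_rejects) {
        inserting = true;
        if (server_rejects) throw std::runtime_error("server rejected insert");
    }
    void EndInsert() { inserting = false; }
    void ResetConnection() { inserting = false; }  // drop state, start clean
    void Select() {
        if (inserting) {
            throw std::runtime_error("cannot execute query while inserting");
        }
    }
};

// Wrap one rejection point: on any exception, reset the handle first, then
// rethrow the original error so the caller still sees the server's message.
template <typename Fn>
void with_insert_recovery(Connection& conn, Fn&& step) {
    try {
        step();
    } catch (...) {
        conn.ResetConnection();
        throw;
    }
}
```

After a rejected BeginInsert routed through the wrapper, a follow-up Select() on the same Connection succeeds instead of hitting the stuck flag.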
Memory: column-at-a-time insert
Pre-0.8.1, insert() and write() materialized a full column-major PHP zval matrix from the user's row-major input before building the native ClickHouse columns. For a 1M-row × 30-column insert that's 30M zvals sitting in PHP memory while the column build runs.
0.8.1 builds native columns one at a time directly from the row-major input. Peak intermediate PHP memory drops from N_rows × N_cols to one column.
insertAssoc() benefited from the same change: no more positional copy of input rows. The column gatherer reads each column directly from the original associative rows, and key validation uses zend_hash_exists against the first row's HashTable instead of allocating a new std::string for every row key.
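To make the memory argument concrete, here is a rough C++ sketch of the column-at-a-time shape. It uses plain integer vectors instead of zvals and native ClickHouse columns, and gather_column() and insert_column_at_a_time() are hypothetical names rather than the extension's functions.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Row = std::vector<std::int64_t>;

// Build column `col` straight from the row-major input. The rows are only
// read, never copied into a transposed matrix.
std::vector<std::int64_t> gather_column(const std::vector<Row>& rows,
                                        std::size_t col) {
    std::vector<std::int64_t> column;
    column.reserve(rows.size());
    for (const Row& row : rows) {
        column.push_back(row.at(col));  // at() throws on a too-narrow row
    }
    return column;
}

// Hand each finished column to `sink` before starting the next one, so at
// most one intermediate column is alive at any time.
template <typename Sink>
void insert_column_at_a_time(const std::vector<Row>& rows, std::size_t n_cols,
                             Sink&& sink) {
    for (std::size_t col = 0; col < n_cols; ++col) {
        sink(col, gather_column(rows, col));
    }
}
```

The older approach corresponds to calling gather_column() for every column up front and holding all the results at once, which is where the N_rows × N_cols peak came from.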
Strict parsers across the type surface
Map, narrow-int (Int8 / Int16 / Int32 / their unsigned siblings), Int128 / UInt128, geo, DateTime64, Time64 insert paths now use full-consumption strict parsers. Non-numeric strings, fractional doubles, non-finite floats, and out-of-range values throw instead of silently coercing to 0 / 0.0 inside the column.
UInt64 inserts gained a shared strict_zval_u64 parser that accepts decimal and hex strings above ZEND_LONG_MAX on both the scalar and Map(*, UInt64) paths. Reads continue to surface upper-half values as decimal strings.
The class of bug strict parsing eliminates is the worst kind of insert bug: the string "foo" lands in an Int32 column as 0, no error, no audit trail. Now it throws.
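"Full-consumption" means the parser must reach the end of the input, not just find a numeric prefix. Below is a minimal C++ sketch of that rule for an Int32 value. It is an illustration of the technique, not the extension's parser, and strict_parse_i32() is a hypothetical name.

```cpp
#include <cctype>
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <stdexcept>
#include <string>

// Strict decimal Int32 parse: the whole string must be a number and the
// value must fit, otherwise throw instead of returning 0.
std::int32_t strict_parse_i32(const std::string& text) {
    // strtoll would skip leading whitespace; a strict parser rejects it.
    if (text.empty() ||
        std::isspace(static_cast<unsigned char>(text.front()))) {
        throw std::invalid_argument("not an integer: '" + text + "'");
    }
    errno = 0;
    char* end = nullptr;
    const long long value = std::strtoll(text.c_str(), &end, 10);
    // Full consumption: parsing must stop exactly at the end of the input.
    if (end != text.c_str() + text.size()) {
        throw std::invalid_argument("not an integer: '" + text + "'");
    }
    if (errno == ERANGE ||
        value < std::numeric_limits<std::int32_t>::min() ||
        value > std::numeric_limits<std::int32_t>::max()) {
        throw std::out_of_range("out of Int32 range: '" + text + "'");
    }
    return static_cast<std::int32_t>(value);
}
```

With this shape, "foo", "12abc", and "1.5" all fail the end-of-input check, and "2147483648" fails the range check, so none of them can land in a column as a silent 0.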
Validation and reentry
A few smaller fixes worth naming:
- write() rejects rows narrower or wider than the writeStart column count. The previous path took the first row's element count as authoritative, so [1] against writeStart(t, ['a','b']) landed 1 into column a with b defaulted server-side.
- insert() rejects rows with extra positional or named cells. A row like [1, 99] against a single-column table previously landed as 1 with 99 lost.
- A failed later write() no longer commits previously sent blocks. The catch path tracks whether any block has been sent in the current writeStart() session and chooses ResetConnection (discard) over EndInsert (commit) on a dirty session.
- insertAssoc() rejects integer-keyed later rows and any key-set drift from the first row. The first row defines the column set; every later row must match.
- Enum8 / Enum16 inserts reject undeclared integers, NULL on non-Nullable columns, and unknown string names.
- Single-token placeholder validator: {name} placeholders accept exactly one identifier and reject comma-separated lists. Comma-list callers must use array form.
- Same-client reentry guard: a userland progress / profile callback that fires another query on the same handle now throws cleanly instead of crashing the worker on the next ReceiveData.
- Recursive type-conversion depth cap (32) keeps deeply nested structures (Array(Array(...)), Map(K, Tuple(...))) from blowing the stack.
23 new PHPTs (072–094) pin all of the above.
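The depth cap from the last bullet follows a common pattern: thread a depth counter through the recursion and bail out before the stack does. Here is a hedged C++ sketch that only unwraps nested Array(...) type names. The function name is hypothetical, and whether the real cap counts 32 inclusively or exclusively is an assumption here.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

constexpr std::size_t kMaxDepth = 32;  // cap value taken from the post

// Walk nested Array(...) wrappers recursively and return the innermost type
// name. Each nesting level costs one stack frame, so refuse to go past
// kMaxDepth instead of trusting the schema.
std::string innermost_type(const std::string& type, std::size_t depth = 0) {
    if (depth > kMaxDepth) {
        throw std::runtime_error("type nesting exceeds depth cap");
    }
    const std::string prefix = "Array(";
    const bool wrapped = type.size() > prefix.size() &&
                         type.compare(0, prefix.size(), prefix) == 0 &&
                         type.back() == ')';
    if (!wrapped) {
        return type;
    }
    // Strip one "Array(" ... ")" layer and recurse on what is inside.
    return innermost_type(
        type.substr(prefix.size(), type.size() - prefix.size() - 1),
        depth + 1);
}
```

In this sketch 32 nested wrappers still resolve and the 33rd throws, which turns an adversarial schema into an ordinary exception.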
Upstream: One Fix Merged Back to clickhouse-cpp 🎆
The ASan job added in 0.8.0 caught a latent UB in the vendored library. Nobody had been hitting it in production, but UBSan flagged it on every empty LowCardinality(String) value:
runtime error: null pointer passed as argument 2,
which is declared to never be null
ColumnStringBlock::AppendUnsafe was calling memcpy(pos, str.data(), str.size()) unconditionally. When str was constructed from an empty std::string, str.data() is allowed to be NULL, and libc declares memcpy's second argument with __attribute__((nonnull)) regardless of the size. Every libc no-ops memcpy(_, NULL, 0) in practice, so the bug was benign on real workloads, but the false-positive UBSan trip was adding noise to the extension's ASan job and obscuring real findings.
Patch: guard the memcpy with if (str.size() > 0). Submitted upstream as clickhouse-cpp#489, merged 2026-04-27. The local patch in lib/clickhouse-cpp/LOCAL_PATCHES.md will drop the next time the vendored library bumps.
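For readers who haven't run into this rule before, the guard looks roughly like the following. This is a standalone C++ sketch of the pattern, not the upstream patch verbatim, and append_bytes() is a hypothetical helper name.

```cpp
#include <cstddef>
#include <cstring>

// Append `size` bytes from `src` at `pos` and return the new write position.
// memcpy's pointer arguments must be valid even when size is 0, so a null
// `src` with size 0 has to skip the call entirely.
char* append_bytes(char* pos, const char* src, std::size_t size) {
    if (size > 0) {
        std::memcpy(pos, src, size);
    }
    return pos + size;
}
```

The empty case becomes a no-op that never reaches memcpy, which is what keeps the sanitizer quiet.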
What's Still Missing
Two limitations carry forward from clickhouse-cpp v2.6.1:
- SELECT ... WITH TOTALS and SETTINGS extremes=1 throw unimplemented 7 from the cpp layer. The vendored library does not dispatch the Totals / Extremes packet types (upstream issue #297). getTotals() / getExtremes() are deferred.
- Map(LowCardinality(K), V) reads are not yet decoded by the vendored library (writes succeed). showProcesslist() selects a fixed projection of standard columns to avoid the unsupported Map columns (ProfileEvents, Settings, used_*).
If either blocks your workload, file an issue at github.com/iliaal/php_clickhouse with the schema and a minimal repro. Both are upstream and tracked.
The repo is at github.com/iliaal/php_clickhouse. Install via PIE: pie install iliaal/php_clickhouse (add --enable-clickhouse-openssl for TLS). The original launch post that framed the fork story sits at ilia.ws.