If you've worked with PostgreSQL or MySQL, you're probably spoiled by tools like Prisma, Knex, or Flyway. You write a migration, run a command, and your schema updates. Simple.
Then you try ClickHouse, and you realize: there's no built-in migration tool. No `clickhouse migrate` command. Nothing.
This is the story of how I built my own migration system for ClickHouse, why it works surprisingly well, and what I learned about replicated vs distributed tables along the way.
Why ClickHouse Doesn't Have Built-in Migrations
ClickHouse is an OLAP database. It's designed for analytics, not for the kind of schema evolution that OLTP databases deal with daily.
The ClickHouse philosophy is different:
- Schema changes are rare. In analytics, you typically design your schema upfront. You're not adding columns every sprint like you might in a product database.
- ALTER is limited. You can add columns, but you can't easily modify existing ones. ClickHouse optimizes for read performance, not schema flexibility.
- Different priorities. The team focuses on query performance, compression, and distributed computing. Migration tooling is left to the ecosystem.
- Immutable-ish data. Analytics data is often append-only. You're not updating rows constantly, so migrations matter less.
Info
If you're building a product on ClickHouse, you will need migrations. Your schema will evolve. And you need a way to manage that across dev, staging, and production.
Evaluating the Options
ClickHouse officially states: "There is no standard schema migration tool for ClickHouse."
That said, they do list nine community tools that support ClickHouse:
| Tool | Language | Notes |
|---|---|---|
| Goose | Go | Simple, file-based migrations |
| golang-migrate | Go | Popular, supports many databases |
| Atlas | Go | Declarative, HashiCorp-style |
| Flyway | Java | Enterprise-grade, widely used |
| Liquibase | Java | XML/YAML based, change tracking |
| Bytebase | Go | Full platform with UI, approval workflows |
| Sqitch | Perl | Dependency-based, no ORM |
| clickhouse-migrations | Node.js | Community tool, lightweight |
| Houseplant | Python | Python-native approach |
When I evaluated these, here's what I found:
golang-migrate is solid and popular. But it's a separate Go binary to manage, and I wanted something that lived in my TypeScript monorepo.
Atlas offers declarative schema management where you define the end state and it figures out the diff. Interesting, but I wanted imperative migrations where I control exactly what SQL runs.
Bytebase is a full database CI/CD platform with ClickHouse support. Great for larger teams needing approval workflows, but overkill for my needs.
Flyway / Liquibase are enterprise Java tools. Powerful, but heavy. Not a natural fit for a TypeScript/Node.js stack.
clickhouse-migrations is a Node.js tool that looked promising at first: same language as my stack, SQL file-based, with multi-query file support. I started using it, but when I dug deeper:
| Feature | clickhouse-migrations | What I Needed |
|---|---|---|
| Rollbacks | No (migrations are immutable) | Yes (down() methods for safe rollbacks) |
| Cluster logic | Plain SQL only | Programmatic (same migration, different behavior) |
| Distributed table helpers | Manual SQL | Built-in helpers |
| ZooKeeper cleanup | No | Yes (clean up replica paths on reset) |
| Dev/prod parity | Separate SQL files | Single migration with conditional logic |
The Dealbreaker
I need the same migration to behave differently in single-node dev vs clustered production. In dev, I want a simple MergeTree. In prod, I need ReplicatedReplacingMergeTree with ZooKeeper paths, plus a Distributed table on top. Pure SQL files can't do that without maintaining two separate migration sets.
So I decided to roll my own. Build something simple, TypeScript-native, that does exactly what I need.
My Approach: A Custom Migration System
Here's the migration system I built. It's roughly 300 lines of TypeScript and handles everything I need.
The Core Idea
Each migration is a class with up() and down() methods:
import type { ClickHouseClient } from '@clickhouse/client';
import { Migration } from './migration';

export class CreateEventsTable extends Migration {
  name = '001_events';

  async up(client: ClickHouseClient): Promise<void> {
    await client.command({
      query: `
        CREATE TABLE IF NOT EXISTS ${this.tableName('events')} (
          id UUID,
          project_id String,
          event LowCardinality(String),
          timestamp DateTime64(3, 'UTC'),
          properties String,
          -- ... more columns
        )
        ENGINE = ReplacingMergeTree(inserted_at)
        PARTITION BY toYYYYMM(timestamp)
        ORDER BY (project_id, timestamp, id)
      `
    });
  }

  async down(client: ClickHouseClient): Promise<void> {
    await this.dropTable(client, 'events');
  }
}
Migration Tracking
I store applied migrations in a _migrations table:
CREATE TABLE _migrations (
  name String,
  applied_at DateTime64(3, 'UTC') DEFAULT now64(3)
)
ENGINE = MergeTree()
ORDER BY name
The runner checks this table, compares against registered migrations, and applies what's pending.
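The core of runner.ts boils down to a simple loop: read the applied names, skip them, run the rest in order. A simplified sketch (the `migrations` registry and helper names here are illustrative, not the exact code):

import type { ClickHouseClient } from '@clickhouse/client';
import { migrations } from './migrations'; // ordered list of Migration instances (illustrative)

async function getAppliedNames(client: ClickHouseClient): Promise<Set<string>> {
  const result = await client.query({
    query: 'SELECT name FROM _migrations ORDER BY name',
    format: 'JSONEachRow',
  });
  const rows = await result.json<{ name: string }>();
  return new Set(rows.map((r) => r.name));
}

export async function migrate(client: ClickHouseClient): Promise<void> {
  const applied = await getAppliedNames(client);
  for (const migration of migrations) {
    if (applied.has(migration.name)) continue; // already applied, skip
    await migration.up(client);
    // Record it immediately so the migration never runs twice
    await client.insert({
      table: '_migrations',
      values: [{ name: migration.name }],
      format: 'JSONEachRow',
    });
  }
}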
The CLI
Simple commands that do what you expect:
pnpm migrate # Apply all pending
pnpm migrate:up # Apply next one
pnpm migrate:down # Rollback last one
pnpm migrate:status # Show what's applied/pending
pnpm migrate:reset # Nuclear option - drop everything
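Each script calls into cli.ts, which is just a dispatch on the first argument. A sketch of the shape (the function names beyond `migrate` are illustrative):

import { createClient } from '@clickhouse/client';
import { migrate, rollbackLast, printStatus } from './runner';

async function main() {
  const [command = 'migrate'] = process.argv.slice(2);
  const client = createClient({
    url: process.env.CLICKHOUSE_URL ?? 'http://localhost:8123',
    database: process.env.CLICKHOUSE_DB ?? 'userloom_dev',
  });

  switch (command) {
    case 'migrate': await migrate(client); break;       // apply all pending
    case 'down':    await rollbackLast(client); break;  // roll back last applied
    case 'status':  await printStatus(client); break;   // show applied/pending
    default:
      console.error(`Unknown command: ${command}`);
      process.exitCode = 1;
  }
  await client.close();
}

main().catch((err) => { console.error(err); process.exit(1); });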
Environment Support
I run the same migrations against dev, staging, and prod:
pnpm migrate --env=prod
Each environment has its own database (userloom_dev, userloom_staging, userloom_prod) on the same cluster.
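Each environment maps to a small config object, roughly like this (field names and exact values are illustrative; dev is shown pointing at a local single node):

// config.ts (sketch): one database per environment.
export interface EnvConfig {
  url: string;       // ClickHouse HTTP endpoint
  database: string;  // per-environment database
  cluster?: string;  // set only where migrations must run ON CLUSTER
}

export const configs: Record<string, EnvConfig> = {
  dev:     { url: 'http://localhost:8123', database: 'userloom_dev' },
  staging: { url: process.env.CLICKHOUSE_URL ?? '', database: 'userloom_staging', cluster: 'production_cluster' },
  prod:    { url: process.env.CLICKHOUSE_URL ?? '', database: 'userloom_prod', cluster: 'production_cluster' },
};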
Info
This setup makes it easy to get started and test. For production environments, a common best practice is to maintain a single environment's database per ClickHouse cluster and manage separation and scalability through table design and cluster configuration.
The Tricky Part: Replicated vs Distributed Tables
This is where ClickHouse gets interesting. In a clustered setup, you don't just have one table. You have two:
Replicated Tables (The Real Data)
CREATE TABLE events_replicated ON CLUSTER production_cluster (
  -- columns
)
ENGINE = ReplicatedReplacingMergeTree(
  '/clickhouse/tables/{shard}/userloom_prod/events_replicated',
  '{replica}',
  inserted_at
)
This creates the table on every node. Each shard holds a portion of the data, and replicas within a shard keep copies for redundancy.
The ZooKeeper path (/clickhouse/tables/{shard}/...) is crucial. It's how replicas coordinate.
Distributed Tables (The Query Layer)
CREATE TABLE events ON CLUSTER production_cluster
AS userloom_prod.events_replicated
ENGINE = Distributed(
  production_cluster,
  userloom_prod,
  events_replicated,
  xxHash64(distinct_id)
)
This is a virtual table that routes queries to the right shards. When you query events, ClickHouse figures out which shards have the data and aggregates results.
The sharding key (xxHash64(distinct_id)) determines how data is distributed. Same user = same shard = faster queries.
The Migration Gotcha
Here's what bit me: Distributed tables cache the schema at creation time.
When I added new columns to events_replicated:
ALTER TABLE events_replicated ON CLUSTER production_cluster
ADD COLUMN is_bot UInt8 DEFAULT 0
The replicated table updated fine. But the distributed table? Still had the old schema. Queries for is_bot failed.
ClickHouse Gotcha
When you ALTER a replicated table, the distributed table doesn't automatically pick up the new schema. You must drop and recreate the distributed table.
The fix:
export class RecreateEventsDistributed extends Migration {
  name = '006_recreate_events_distributed';

  async up(client: ClickHouseClient): Promise<void> {
    // Only needed in cluster mode
    if (!this.config.cluster) return;

    // Drop the distributed table
    await client.command({
      query: `DROP TABLE IF EXISTS ${this.tableName('events')} ON CLUSTER ${this.config.cluster} SYNC`
    });

    // Recreate it (picks up the new schema from the replicated table)
    await this.createDistributedTable(client, 'events', 'events_replicated', 'distinct_id');
  }
}
This is a ClickHouse quirk you learn the hard way. Now I know: after ALTER on replicated tables, recreate the distributed layer.
What I Love About This Approach
It's simple. No external tools. No separate processes. Just TypeScript that runs SQL.
It's explicit. I see exactly what SQL runs. No magic. No ORM trying to be clever.
It handles clustering. The same migration works in dev (single node) and prod (clustered). Helper methods abstract the differences:
// Returns "ON CLUSTER production_cluster" in prod, empty string in dev
this.onCluster()

// Returns ReplicatedReplacingMergeTree in prod, ReplacingMergeTree in dev
this.getEngine('ReplacingMergeTree', 'events', 'inserted_at')
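A simplified sketch of how helpers like these can be implemented, assuming the base class holds a config with an optional cluster name (names and paths follow the earlier examples; the real code differs in detail):

// Illustrative sketch of the base Migration class helpers.
interface MigrationConfig {
  database: string;
  cluster?: string; // undefined in single-node dev
}

abstract class Migration {
  constructor(protected config: MigrationConfig) {}

  // "ON CLUSTER <name>" in prod, empty string in dev
  protected onCluster(): string {
    return this.config.cluster ? `ON CLUSTER ${this.config.cluster}` : '';
  }

  // Replicated* engine with ZooKeeper path in prod, plain engine in dev
  protected getEngine(engine: string, table: string, version?: string): string {
    if (!this.config.cluster) {
      return version ? `${engine}(${version})` : `${engine}()`;
    }
    const zkPath = `/clickhouse/tables/{shard}/${this.config.database}/${table}`;
    const args = [`'${zkPath}'`, `'{replica}'`, ...(version ? [version] : [])];
    return `Replicated${engine}(${args.join(', ')})`;
  }
}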
Rollbacks work. Every migration has a down(). I can roll back confidently.
It's fast. No network calls to external services. No waiting for a platform to figure out the diff. Just SQL.
Lessons Learned
Key Takeaways
- Track migrations in the database itself. Don't rely on file timestamps or external state.
- Use `IF NOT EXISTS` everywhere. Makes migrations idempotent. Safe to re-run.
- Understand the replicated/distributed split. It's not obvious coming from single-node databases.
- ZooKeeper paths matter. Mess them up and you'll have replication issues.
- `SYNC` is your friend. When dropping tables in a cluster, `DROP TABLE ... SYNC` waits for all nodes. Without it, you might try to recreate before the drop completes.
- Clean up ZooKeeper on reset. Dropping a replicated table doesn't remove its ZooKeeper path. You need to explicitly delete it, or the next CREATE will fail (see the sketch below).
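For that last point, ClickHouse's SYSTEM DROP REPLICA statement can remove stale replica metadata from ZooKeeper/Keeper. A sketch of how the reset path can use it (the replica name and the shard segment in the path are illustrative placeholders and must match what your migrations actually created):

import type { ClickHouseClient } from '@clickhouse/client';

// Remove a stale replica path from ZooKeeper after the table itself is gone.
// SYSTEM DROP REPLICA refuses to drop the replica of a still-existing local
// table, so run this only after DROP TABLE has completed.
async function cleanupReplicaPath(client: ClickHouseClient, table: string): Promise<void> {
  await client.command({
    // 'replica_1' and the shard segment '01' are illustrative placeholders
    query: `SYSTEM DROP REPLICA 'replica_1' FROM ZKPATH '/clickhouse/tables/01/userloom_prod/${table}'`,
  });
}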
The Code Structure
The whole system is about 300 lines across a few files:
| File | Purpose |
|---|---|
| migration.ts | Base class with helpers |
| runner.ts | Finds and executes migrations |
| cli.ts | Command-line interface |
| config.ts | Environment configuration |
Each migration is another 50-100 lines depending on complexity.
Is it as polished as Prisma? No. Does it do exactly what I need with zero dependencies? Yes.
Would I Recommend This?
If you're building a product on ClickHouse and you're comfortable with SQL: yes, roll your own.
If you want something off-the-shelf and don't mind Go: golang-migrate is solid.
If you're a larger team and want approval workflows and audit logs: look at Bytebase.
But for a small team building fast? A simple TypeScript migration runner gets the job done. It's been running in production for months now, handling schema changes across a 3-shard, 3-replica cluster without issues.
Sometimes the best tool is the one you build yourself.
Next up: Building the JavaScript SDK for client-side event tracking and batching.