The most precious things in life are memories and reflection.
Time flies: ten years have passed. On April Fools’ Day 2015, Max asked me, in all seriousness, “Do you want to start a company together?” From that moment, I jumped on the TiDB train. The ride has been bumpy and brilliant.
In these ten years, I’ve been part of every step — from zero to one, from a small open-source project to a global product. Standing at this point in time, I want to write down the real story and lessons from this journey. It’s a record of the past, and I also hope it helps friends who love technology and entrepreneurship.
This series isn’t about bragging or sugar-coating. I want to share, honestly:
- How a simple technical dream slowly became real
- How a programmer painfully — but necessarily — changes when tech meets business
- How open source and customer success became TiDB’s core DNA
- How to win global trust with technology and product
- How a programmer grows into a manager — and then a technical leader
I believe these real stories are the most valuable — and most worth recording.
Alright, enough preface. Let’s begin this trip down memory lane.
I. The Journey Begins — An April Fools’ Invitation
The April Fools’ Day message
April 1, 2015. For programmers, it’s a day to joke around. I didn’t expect a message that felt too crazy to be true.
It was from Max. We’d collaborated in the open-source community, but had never met in person. The message was short:
“We’re going to start a company. We want to build a distributed database — open source. Want to join?”
My brain froze. On April Fools’ Day? Was he messing with me?
I replied without thinking: “Are you kidding me?”
Max replied immediately: “I’m serious.”
I was stunned. The “joke” felt very real.
Why me?
Why did Max think of me? Looking back, the answer is open source.
We had worked together through issues and PRs. Even without meeting, we already knew each other’s coding style and personality. That’s the magic of open source: in our world, you don’t rely on looks; your code is your face. Your skill and your character are all on GitHub.
Without open source, we would’ve been strangers. Even if we’d met for coffee, trust wouldn’t come so fast. Open source made us trust each other before we ever shook hands.
In a way, open source is like a matchmaking site for programmers — way more effective than coffee chats.
Two shocks: open source and remote work
I still had doubts, but I kept chatting with Max. The more we talked, the more I felt they weren’t joking.
“Open-source distributed database, change the world.” Ten years ago, that sounded… well, a bit “teenage-dream.” Big dream, maybe a bit silly. Especially since none of us had written database source code before.
I initially said no — the company would be in Beijing, and I lived in Zhuhai. I didn’t want to move.
Then Max said: “You can work from home.”
Remember, this was ten years ago. A founder actively offering remote work? That was bold — even crazy.
At that moment I thought: this company will either fail fast, or be very, very cool.
First in-person meeting felt like old friends
I booked a flight to Beijing and finally met Max — and the other two founders, Ed and Dylan.
Believe it or not, it was our first time meeting, but it felt like a reunion. No awkwardness at all. Pretty magical.
Why so much trust so fast? Again — open source.
From day one, open source became part of TiDB’s bones — and later our strongest card when we went global.
“True open source,” almost fully transparent
From the first day, we chose “true open source.” What does that mean?
We made almost everything public on GitHub — code, issues, bugs, PRs — totally transparent. Customers and peers could see everything, complain about anything.
People asked, “Aren’t you afraid customers will see all your bugs?”
We felt the opposite. Radical transparency builds trust.
Customers could watch how we work, how we find and fix bugs, and how the product grows. In the end, this transparency brought us more trust than companies who try to hide their problems.
Why build a distributed database?
TiDB started with one pain: sharding.
As internet developers, we suffered from MySQL sharding. Every schema change meant a long night. It was painful.
We wanted a truly scalable distributed database that would end sharding nightmares.
At the start, the idea was simple: solve our own pain. Later, we learned that this pure intention became our core strength — and a big reason customers chose us.
The brave and the clueless
Back then, none of us had database development experience. We were heavy users, not creators. Why did we dare to build one?
Sometimes, not knowing is a gift. As the saying goes: “Ignorance is fearless.”
With that courage, I became PingCAP’s first full-time employee and began a ten-year journey building TiDB.
To this day, I’m still proud of that decision.
II. The Joy of Building
When is building a product the happiest?
Programmers often ask: “When is coding the happiest?”
My answer is simple: before the product has users.
Before customers show up, you can just write code. No angry calls, no urgent tickets, no midnight firefights. Pure joy.
But products are for users. That happy time can’t last forever.
We weren’t that crazy
Building a database from scratch sounds insane. But we didn’t start blind. Google had published papers on systems like Spanner. CockroachDB had started a year earlier. HBase existed.
Reinventing the wheel isn’t shameful. Reinventing it with your eyes closed is. We stood on giants’ shoulders and borrowed many proven ideas.
Why Go?
First step: pick a language. We chose Go — no hesitation.
Why? We loved Go and knew it well. In unknown territory, the safest tool is the one you know.
Our strategy: get it running first.
MySQL compatibility
We decided to be MySQL-compatible. That choice made TiDB a great MySQL alternative later.
But wow, the pitfalls:
- Weird MySQL syntax corners
- Endless compatibility quirks
It’s a double-edged sword. But on balance, it helped more than it hurt.
Simple optimizer, classic executor
Today TiDB has an advanced optimizer and engine. Back then, we had no optimizer background.
So we followed the programmer’s rule: “If you don’t know, build the simplest thing.” We wrote a rule-based optimizer. For execution, we used the classic Volcano model: each operator pulls rows from the operator below it by calling next. Simple and effective.
That fearless simplicity gave us a usable first version quickly.
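To make the pull-based idea concrete, here is a minimal Python sketch of a Volcano-style pipeline. It is only an illustration, not TiDB’s actual executor (which is written in Go). The names Operator, Scan, Filter, Project, and drain are made up for this example, and rows are assumed to be plain dictionaries.

```python
from typing import Callable, Iterable, Iterator, List, Optional

Row = dict  # assumption for this sketch: a row is a plain dict


class Operator:
    """Every operator exposes next(), which returns one row or None when done."""

    def next(self) -> Optional[Row]:
        raise NotImplementedError


class Scan(Operator):
    """Leaf operator: hands out rows from an in-memory table one at a time."""

    def __init__(self, rows: Iterable[Row]) -> None:
        self._it: Iterator[Row] = iter(rows)

    def next(self) -> Optional[Row]:
        # Built-in next() with a default returns None once the table is exhausted.
        return next(self._it, None)


class Filter(Operator):
    """Pulls from its child until a row satisfies the predicate."""

    def __init__(self, child: Operator, predicate: Callable[[Row], bool]) -> None:
        self._child = child
        self._predicate = predicate

    def next(self) -> Optional[Row]:
        while (row := self._child.next()) is not None:
            if self._predicate(row):
                return row
        return None


class Project(Operator):
    """Keeps only the requested columns of each row pulled from its child."""

    def __init__(self, child: Operator, columns: List[str]) -> None:
        self._child = child
        self._columns = columns

    def next(self) -> Optional[Row]:
        row = self._child.next()
        if row is None:
            return None
        return {c: row[c] for c in self._columns}


def drain(root: Operator) -> List[Row]:
    """Drives the plan: the caller pulls from the root, which pulls from below."""
    out: List[Row] = []
    while (row := root.next()) is not None:
        out.append(row)
    return out
```

A plan is built by nesting operators, for example Project(Filter(Scan(rows), predicate), ["id"]), and calling drain on the root. Nothing runs until the root is asked for a row, which is what makes the model so easy to reason about.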
Test-driven for real
From day one, we took testing seriously. It’s a database — people’s data is on the line. One big mistake, and we’d be out of a job.
We once aimed for 100% test coverage. Sounds crazy, but it found tons of bugs.
Besides unit tests, we did “extreme” things:
- Ported SQLite’s sqllogictest — felt painful at the time, but later we were so glad we did
- Learned Clojure to run Jepsen tests — caught many hidden transaction bugs
- Used TLA+ to model-check core algorithms — hard work, better sleep
- Built chaos engineering tools, and eventually open-sourced Chaos Mesh, which many people love
Rust and TiKV: dodging the C++ trap
Next challenge: storage. We refused to use C++. Too complex. Too risky.
Rust had just hit 1.0, so we took the leap:
- Great performance and memory safety; if it compiles, it tends to run well
- Strong community, which helped TiKV attract contributors
But Rust had downsides back then:
- Slow compile times
- Young ecosystem — lots of wheels to build
We also made a naming mistake: Region. In TiKV, as in HBase, it means a data shard. But cloud providers use “Region” for geography, which caused endless confusion with customers. Lesson: naming is not trivial.
Spanner dreams, real-world compromises
We borrowed ideas from Spanner, but our customers didn’t have Google’s infrastructure — no TrueTime, no Colossus. Most ran on IDC hardware or VMs.
So we chose a shared-nothing design. It fit the time. But as cloud took over, its limits became clearer. We’d have to evolve beyond it.
Summary: the golden, tech-driven days
Looking back, those early days were a programmer’s paradise:
- Focused days, simple joy
- New technical challenges every week
- Passion that drew brilliant hackers to join us
That foundation carried us forward.
III. Commercialization: How Do We Survive?
The soul question: “Where are the customers?”
Early on, we loved the tech high. We didn’t seriously ask:
“How will we make money?”
We thought great code was enough. “Business” felt dirty to our idealistic programmer minds.
But a company isn’t a charity. No revenue, no future. The earlier you answer “who buys, and why,” the fewer punches you take later.
I regret not thinking about this sooner. Focusing on customer needs earlier would have helped the company — and my own growth.
From “tech-first” to “customer-first” — the painful switch
The hardest transition for engineers? Moving from tech-first to customer-first.
We learned this the hard way.
TiDB had a parameter that, when false, gave great performance but risked data loss on crash. In 2.0, we changed the default to true.
Sounded reasonable. But some open-source users upgraded and saw performance fall off a cliff. Their businesses broke.
We realized: upgrades should keep old behavior for existing users, while new installs get the new default. A simple thought — but we missed it because we were stuck in our tech logic instead of real customer scenarios.
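The rule above can be sketched in a few lines of Python. This is a hedged illustration, not how TiDB resolves its configuration. The option name durable_writes, the two default tables, and the function effective_config are hypothetical, and an install is assumed to count as an upgrade whenever a previous version is recorded.

```python
from typing import Any, Dict, Optional

# Hypothetical option and defaults, for illustration only.
OLD_DEFAULTS: Dict[str, Any] = {"durable_writes": False}  # fast, may lose data on crash
NEW_DEFAULTS: Dict[str, Any] = {"durable_writes": True}   # safe, slower


def effective_config(
    user_config: Dict[str, Any], previous_version: Optional[str]
) -> Dict[str, Any]:
    """Resolve settings so that an upgrade never silently changes behavior.

    - Fresh install (no previous version): start from the new defaults.
    - Upgrade: start from the defaults the cluster was already running with.
    - In both cases, anything the user set explicitly wins.
    """
    base = NEW_DEFAULTS if previous_version is None else OLD_DEFAULTS
    merged = dict(base)
    merged.update(user_config)
    return merged
```

With this shape, an upgraded cluster keeps its old performance profile until the operator opts in to the safer setting, while a new install gets the safer setting from the start.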
We repeated similar mistakes a few times. Each time, it reminded us:
“Build for real customer scenarios — not for your own elegance.”
The painful game launch failure
One game company chose TiDB for a global launch. On launch day, a single hot SQL brought TiDB to its knees. We watched it happen. It was brutal.
If we had:
- Reviewed their SQL earlier
- Built tools to split hotspots
…maybe we could’ve avoided it.
But there’s no “maybe” in production. They failed to launch. That taught us:
- Real customer scenarios are complex
- Databases are not just tech — you also need delivery, service, support
- Always respect the customer’s business
After that, we never casually showed off our tech again.
A bank launch success: a turning point
Soon we got another chance: a bank core system project.
This time we were fully pragmatic:
- We walked every business path with the customer
- Reviewed everything that might break
- Fixed weak spots fast
- Prepared full launch and rollback plans
It worked. The launch succeeded. The customer praised us. That success built our confidence — and pushed us from “tech-first” to “customer success-first.”
“A product is good only if the customer says it is. When customers win, the product wins.”
The long road to business maturity
The switch from tech-driven to customer- and business-driven was long and painful. Honestly, even now, the “tech-driven bug” is still inside us.
But databases are different:
- They’re not just products — they’re services
- Product quality, delivery, and support all matter
- Customer success is the real north star
Summary: making money isn’t shameful
One sentence to fellow programmers:
“Making money isn’t shameful. Customers paying you is the best proof your tech creates value.”
We write code so customers can succeed. Then the company succeeds. And so do we.
IV. The Power of Scalability
From complaints to growth: TiDB 3.0’s turning point
Early TiDB had poor performance. Customers complained: “It scales, but it’s slow.”
We knew we had to fix performance. In 3.0, we optimized key paths with multi-threading, and performance improved several-fold.
After 3.0, customer adoption grew fast.
“Once performance crosses the threshold, scalability becomes a superpower.”
The bike that wouldn’t unlock
One morning, I tried to unlock a shared bike. It wouldn’t open. People around me complained too.
Later, I learned that the TiDB cluster behind that system had gone down, so the app couldn’t unlock bikes.
I felt terrible. Our product messed up my daily life. But I also realized:
“TiDB is running core systems at many companies now.”
Scalability earns trust — but when you fail, the blast radius is big.
Midnight disk-full horror
At a customer with 100+ TB clusters, disks filled up overnight. TiKV crashed repeatedly. Chaos.
Deleting logs was too slow. Migrating data produced snapshots that also ate disk.
We stayed until 4 a.m., and found a way:
- Slow down scheduling to reduce snapshots
- Throttle writes so the system could rebalance
It worked. But it reminded us: big scale means big responsibility.
Why customers came back from competitors
Overseas, we lost to big-name competitors a few times. A few months later, customers returned.
The reason: at real scale, the competitors couldn’t keep up. TiDB could.
Scalability isn’t a slogan — it decides real-world choices.
Fast growth, hidden risks
After 3.0, growth was almost too smooth. Confidence turned into overconfidence. Later, quality problems exploded.
But that’s a later chapter.
Summary: power and responsibility
Scalability won us customers — and more duty. The more core the workload, the less room for error.
“Scalability is our superpower — and a heavy responsibility.”
V. From Chaos to First Light (Product- and Customer-Driven)
The pain of TiDB 4.0
We moved fast — too fast. 4.0 exposed the cost. We kept a “one big release per year” cadence, but internally it was chaos: too many features, not enough testing, quality drifting out of control.
We shipped 4.0 and followed it with 12 patch releases to stabilize. Customer complaints hit hard. My mornings began with “Did another customer blow up overnight?”
We had to change.
The arrogance of “no PMs”
We made a huge mistake: we basically had no PMs.
We even said, proudly, “We don’t need PMs. Engineers are the best PMs.”
Terrible idea.
Without PMs:
- No one set feature priorities
- No one drove product strategy
- No one deeply studied customer scenarios
We ended up with a stew of features and no clear direction.
Introducing PMs: the real turn
We finally accepted reality: engineers can’t do everything. Product needs product owners.
When PMs joined:
- Feature priorities became clear
- Each release had a theme
- We focused on customer problems and regained trust
Release cadence: from 1 year → 2 months → 6 months
After 4.0, we tried a “train model” every two months — small, fast releases with better quality.
But too many versions confused customers. Bug fixes required endless cherry-picks. Dev cost ballooned.
We eventually settled on 6-month releases:
- Good balance of stability and speed
- Reasonable version count
- Enough time to ensure quality
Quality first: starting with 6.0
After the 4.0 pain, we made quality #1 in 6.0:
- Reduced memory usage, fewer OOMs
- Smoothed disk I/O
- Focused on stability over fancy features
Customer feedback improved quickly.
In the cloud era: back to one LTS per year
Cloud changed the game:
- We can ship features faster on cloud
- Use early cloud feedback to validate
- Then roll stable features into an annual LTS
Innovation speed + LTS stability = a strong combo.
Back to basics: what is a database?
We asked ourselves:
“What is a database, really?”
Answer:
- It doesn’t need a thousand fancy features
- It must make data handling safe, stable, and easy
That insight reshaped our roadmap.
Summary: from chaos to clarity
We shifted from messy growth to a focus on product and customer success:
- PMs matter
- Quality comes first
- Cadence must fit reality
- Cloud lets us innovate safely
- And we re-learned the simple truth of databases
VI. TiDB Cloud: A Hard, Rewarding Journey
The early misunderstanding: “Just run it on the cloud?”
We thought cloud was simple:
“Just run TiDB on the cloud.”
Reality slapped us. Hard. Many times.
Same as early Kubernetes: “Just write a YAML.” But production is a different world.
Early Kubernetes pain
In 2018, we built TiDB Cloud on Kubernetes. AWS EKS wasn’t mature yet. We chose an open-source project called Gardener and ran K8s ourselves.
That decision became a nightmare:
- Poor stability
- Huge maintenance cost
- Engineers suffered daily
It took years to migrate to managed EKS. Lesson:
“In the cloud era, trust the cloud vendors’ pace of progress.”
From local disks to cloud disks: a mindset shift
We were stubborn about local disks:
- Better performance, we said
- We spent huge effort making K8s schedule local disks
- Later, cloud disks improved fast, while operating local disks became a nightmare
As customers grew, local disk Ops alone could drown us.
Looking back, it was a fight between programmer perfectionism and cloud reality — and reality won.
Running a cloud service: we sell service, not software
With software, you ship and leave. Ops is the customer’s job.
With cloud, the customer buys service:
- SLA
- Maintenance windows
- Security and compliance
- When there’s an issue, you jump in — now
Doing cloud Ops well is way harder than writing good software. But that pain grew us into a service-oriented team.
Next-gen TiDB Cloud: from shared-nothing to shared-everything
Classic TiDB was shared-nothing — great for IDC days, but cloud exposed limits:
- Node failures trigger data rescheduling
- Storage expansion ties to physical nodes, expensive and slow
We began a big shift:
- Move data from cloud disks to S3 object storage
- Evolve from shared-nothing to a more shared-everything style
- Split into micro-services to improve elasticity and resource efficiency
“Won’t S3 be slow?” Not if you layer caching and optimize the access paths. (More on that another time.)
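As a rough picture of what “layer caching” can mean, here is a generic read-through LRU cache in Python. It is a textbook technique shown under stated assumptions, not a description of TiDB Cloud’s actual design. The class ReadThroughCache and its fetch callback are invented for this sketch, and fetch stands in for any slow object-store read.

```python
from collections import OrderedDict
from typing import Callable


class ReadThroughCache:
    """Serves hot keys from memory and falls back to a slow fetch on a miss."""

    def __init__(self, fetch: Callable[[str], bytes], capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._fetch = fetch          # stand-in for a slow object-store read
        self._capacity = capacity    # maximum number of cached objects
        self._items: "OrderedDict[str, bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> bytes:
        if key in self._items:
            # Mark as most recently used and serve from memory.
            self._items.move_to_end(key)
            self.hits += 1
            return self._items[key]
        # Miss: pay the slow read once, then remember the result.
        self.misses += 1
        value = self._fetch(key)
        self._items[key] = value
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # evict the least recently used key
        return value
```

The point of the sketch is only that repeated reads of hot data stop touching the slow store. A real system would also need invalidation, sizing by bytes rather than object count, and probably more than one cache tier.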
Splitting services helped a lot:
- Heavy I/O tasks (like compaction) became independent
- Scale up under load, scale down when idle
- Lower cloud costs
After this, we finally found the right posture for the cloud.
Summary: pain, lessons, and a clearer future
We paid a lot of tuition on cloud:
- K8s choices, disk choices, architecture changes
- Service Ops and customer support systems
Today, TiDB Cloud already accounts for more than half of company revenue. That says it all: cloud isn’t just the future. It’s the present.
The road was hard, but we found our way. And there’s much further to go.
VII. Going Global: A Programmer’s International Challenge
Global trust is hard
When we faced the global market, a question appeared:
“Why would companies trust a database from a bunch of unknowns?”
Databases hold a company’s life. Trust doesn’t come easy.
Open source: our ace card
TiDB was open source from day one. That gave us an advantage:
- Customers could try it themselves
- They could read the code and see how bugs were handled
- Trust barriers dropped
A Japanese payments company found us this way. They used AWS Aurora at first, but big promos crashed it. They searched open-source options, tried TiDB, and switched. Today, TiDB is one of the mainstream choices in Japan’s payment sector.
Open source is powerful.
When a TiKV node finally crashed, we smiled
The same customer later reported: “A TiKV node crashed!”
We checked and found it hadn’t been restarted in three years.
We were shocked — and thrilled. That kind of stability is a strong selling point.
In another incident, an AWS AZ in Japan went down. Their services had issues, but TiDB kept serving. That boosted global trust.
Donating to CNCF: more than code dumps
To boost global awareness, we did two big things:
- Donated TiKV to CNCF; it became a CNCF graduated project
- Donated Chaos Mesh to CNCF, helping many engineers test reliability
These donations increased trust and visibility worldwide.
Rust: a surprise global booster
Choosing Rust for TiKV helped global adoption:
- The Rust community is global and active
- Engineers loved contributing
- TiKV became a star project in the Rust world
That indirectly helped us win overseas customers.
Real internationalization is local
We used to think “internationalization” meant translating docs and flying a salesperson over.
We learned:
“The best internationalization is localization.”
Culture matters. Tech alone won’t save you if support isn’t local and responsive.
So we:
- Built local teams in the US, Europe, Japan, and Southeast Asia
- Offered 24/7 global-local support
- Hired local sales and presales to understand culture and needs
That’s how we built a real global presence.
A dream in the Computer History Museum
In 2018, Ed and I visited the Computer History Museum in Mountain View. We joked:
“I hope TiDB will be in a display case here someday.”
If that day comes, it means TiDB truly earned global recognition.
That dream still pushes us forward.
Summary: from programmers to a truly global team
Our global path was tough but full of gains:
- Open source opened doors
- Rust and CNCF raised our global reputation
- Localization is the heart of internationalization
- Global teams are the foundation of global customer success
Today, with many customers worldwide, we can proudly say:
“We are an open-source database trusted around the world.”
VIII. Customer Success: A Programmer’s Highest Pride
What is customer success?
Customer success has been PingCAP’s core value from day one.
At first, we believed success meant:
“Write elegant, efficient, awesome code.”
Over time, we learned the real standard:
Customer success.
If our database makes a customer’s business better, then we truly succeeded.
Customers are the best teachers
Why do we emphasize customer success?
Yes, it’s practical — customers pay the bills.
But more importantly, customers teach us:
- They know their business best
- Real needs push tech forward
- They help us break our own mental limits
Pushing limits: importing a 50 TB single table
A North American customer asked:
“Can you import a 50 TB single table?”
We failed the first few times. The customer was furious: “Fix it in a week or we cancel.”
We worked day and night, optimized, and made it happen.
Then another customer asked: “What about 100 TB?” Thanks to the 50 TB work, they succeeded on their own.
We realized:
“TiDB can go further than we thought.”
SaaS and one million tables
A top SaaS customer asked:
“Can TiDB support 1,000,000 tables?”
We were shocked. A typical OLTP deployment doesn’t have that many. But in multi-tenant SaaS, each tenant has its own database with a few tables, so with enough tenants, one cluster needs a million tables.
We had never designed for that.
We refactored deeply:
- Schema layer
- Optimizer
- Memory use per table
We made it. After that, more SaaS customers came.
Should engineers support customers directly?
Engineers prefer quiet coding. Few want customer calls and on-site visits.
But we decided engineers must support customers. We created a Customer Advocate role:
- Assign an engineer owner to key customers
- They understand the scenario deeply
- They coordinate help when issues arise
One engineer met with the same customer 200+ times in a year. Sounds crazy. But the result was great:
- Customers got expert help
- Engineers got real feedback
- Satisfaction and loyalty rose
That customer moved from HBase to TiDB, and is now cutting over larger Aurora workloads, too.
The value is in real scenarios
We learned:
- Customers are experts
- Real scenarios beat paper designs
- Only by going to the front lines can we build what customers actually need
That approach won us broad recognition.
Summary: the highest pride
Switching from tech-first to customer-first isn’t easy. But once you do, you see:
“Customer success is the programmer’s highest pride.”
Every “thank you,” every bit of trust, every customer win — that’s our fuel.
We’ll keep doing one thing:
Put customer success first, and keep building a better database.
IX. My Ten Years of Growth: From Programmer to Technical Leader
Ten years as “Employee No. 1”
I joined as PingCAP’s first official employee and lived through TiDB’s whole journey from zero to global. The company changed fast, and so did I — moving from coder to technical leader.
First management lesson: respect Conway’s Law
Conway’s Law:
“Organizations design systems that mirror their communication structures.”
To build a great product, build the right organization.
TiDB is distributed, so our organization had to be distributed. That brings challenges:
- Different time zones — how do we communicate well?
- How to avoid silos?
- How to keep collaboration fast?
We suffered early, then learned to collaborate asynchronously.
The art of async: code and docs beat meetings
With a global team, meetings are hard. We centered communication on GitHub — code and docs:
- Every feature has clear docs
- Every major decision has a design doc and review
- Daily work happens in issues and PRs
It’s far more efficient than endless meetings.
The power of terminology
We didn’t respect terminology early on. Global teams got confused by words that meant different things to different people.
I led many standardization efforts. Once we aligned terms, communication sped up.
Leveling up: from team lead to department head
Going from IC to lead is tough. Going from lead to department head is a different game:
- Delegate and trust
- Focus on culture and environment
- Motivate managers and their teams
These skills go beyond pure tech. They stretched me a lot.
Cultural shift: product- and customer-first
Culture matters more than process. Our culture focus: customer success.
We repeated it in meetings, reviews, and plans:
“We write code for real customer needs.”
That culture improved quality and satisfaction.
What is a true technical leader?
I used to think great tech was enough. Now I know:
- Tech is the baseline
- Communication, organization, empathy, business sense, and customer view are just as important
A real technical leader understands people, business, and customers — not just code.
Summary: the next ten years start now
In ten years, I grew from engineer to leader. I watched our product grow and changed with it.
I’m still excited:
- To keep growing
- To lead the team to bigger wins
- To become a better technical leader
Just like the day I joined:
“My next ten-year journey is just beginning.”
X. Goodbye, First Decade — Hello, Next Journey
There’s so much more than I can fit here. Ten years hold too many moments.
Looking back: there were surprises, challenges, and so much growth. From Max’s April Fools’ message to today’s globally known open-source distributed database; from chasing technical perfection to holding onto customer success as the root; from a simple startup dream to a global, cloud-era product — this journey gave me far more than I imagined.
If I must summarize:
- Open source is a belief that wins trust from users and community
- Product focus + customer success = real growth
- Scalability is not only a technical edge — it’s a growth engine for customers and the business
- Internationalization requires localization — and expands both culture and thinking
- A programmer’s real pride isn’t lines of code — it’s customers succeeding because of your product
- Growing from tech to management is painful but necessary — it teaches lessons beyond code
TiDB has walked ten years. My journey is just starting. I believe the next decade will bring more growth, more challenges, and — always — customer success. We’ll keep building a better product for customers around the world.
Thank you to every customer, partner, and colleague over these ten years. Your trust and company made all this possible.
See you in the next decade.
Acknowledgments
This article came from my spoken words, recorded with ChatGPT, then turned into text and organized with GPT-5. I guess GPT-5 picked up some of my writing style from the recording, so I only did light edits. AI is getting stronger and stronger. In the next decade, TiDB will also have many new stories in AI — but that’s for another time.