Snowflake Database Kernel Engineer Interview: Storage Engine and Transaction Implementation Deep Dive
A complete review of the three technical interview rounds for Snowflake's Database Kernel Engineer role, written by a candidate with 3 years of database kernel development experience. It covers C++, B+ trees, MVCC, WAL, and storage engine design, with real questions and preparation tips.
Background
I've been doing database kernel development for 3 years, previously working at a database startup on storage engine-related work, primarily writing storage layer and transaction layer code in C++. I'm fairly familiar with data structures like B+ trees and LSM-Trees. Snowflake's Database Kernel Engineer position has always been my target — after all, Snowflake is a benchmark in cloud databases, with top-tier technical depth and engineering scale.
I was referred by a former colleague for the Database Kernel Engineer position at Snowflake. About a week after the referral, I received an interview invitation. The entire process consisted of three technical rounds with no separate HR round, spanning about three weeks.
Interview Process Review
Round 1: C++ + Data Structures (~60 minutes)
The first interviewer was a senior developer on the team. After briefly discussing my project experience, we dove right into technical questions.
C++ Section: The interviewer asked how smart pointers are implemented and how shared_ptr's reference counting stays thread-safe. I explained that the count in the control block is updated with atomic operations, which protects the count itself but not the object being pointed to. Then came questions about RAII, move semantics, and perfect forwarding. There was also a detail question: why can't unique_ptr be copied? I explained that its copy constructor and copy assignment operator are deleted. The interviewer followed up by asking how to pass a unique_ptr between functions, and I said to transfer ownership with std::move.
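To make the ownership points concrete, here is a minimal sketch of the unique_ptr and shared_ptr behavior discussed above. The Page type and the consume, make_page, and demo_ownership functions are hypothetical names invented for illustration; this is not code from the interview.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical page type, used only for illustration.
struct Page {
    int id;
    explicit Page(int i) : id(i) {}
};

// Taking the unique_ptr by value makes this function a "sink":
// the caller must hand over ownership explicitly with std::move.
int consume(std::unique_ptr<Page> page) {
    assert(page != nullptr);
    return page->id;  // the Page is destroyed when this function returns
}

// Returning by value transfers ownership out of the function.
std::unique_ptr<Page> make_page(int id) {
    return std::make_unique<Page>(id);
}

// Returns true if ownership and reference counts behave as described.
bool demo_ownership() {
    std::unique_ptr<Page> p = make_page(7);
    // std::unique_ptr<Page> q = p;  // would not compile: copy constructor is deleted
    int id = consume(std::move(p));  // ownership moves into consume()
    bool moved_from_is_null = (p == nullptr);  // a moved-from unique_ptr is null

    // shared_ptr: the reference count in the control block is updated
    // atomically, so copying handles across threads is safe. Access to the
    // Page itself is NOT synchronized by shared_ptr.
    auto s1 = std::make_shared<Page>(1);
    auto s2 = s1;  // count goes from 1 to 2
    return id == 7 && moved_from_is_null && s2.use_count() == 2;
}
```

Passing by value plus std::move is only one convention. If the callee merely needs to look at the object without taking ownership, passing a reference or raw pointer to the Page is usually simpler.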
Data Structures Section: The interviewer asked me to hand-write an LRU Cache with O(1) get and put operations. I used the classic hash table + doubly linked list approach. After writing it, the interviewer asked me to analyze time and space complexity. Then came a question about the differences between red-black trees and B+ trees, and why B+ trees are chosen for database indexes over red-black trees. I explained B+ tree advantages from the disk I/O perspective: larger fan-out, fewer levels, and range query friendliness.
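A minimal sketch of the hash table + doubly linked list approach, assuming int keys and values and a capacity of at least one. The class and member names are my own choices here, and std::list stands in for a hand-rolled doubly linked list.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <optional>
#include <unordered_map>
#include <utility>

class LRUCache {
public:
    explicit LRUCache(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);  // assumption of this sketch
    }

    // O(1) average: one hash lookup plus an O(1) relink to the front.
    std::optional<int> get(int key) {
        auto it = index_.find(key);
        if (it == index_.end()) return std::nullopt;
        // splice moves the node without invalidating the stored iterator
        items_.splice(items_.begin(), items_, it->second);
        return it->second->second;
    }

    // O(1) average: update in place, or evict the tail and insert at the front.
    void put(int key, int value) {
        auto it = index_.find(key);
        if (it != index_.end()) {
            it->second->second = value;
            items_.splice(items_.begin(), items_, it->second);
            return;
        }
        if (items_.size() == capacity_) {
            index_.erase(items_.back().first);  // least recently used is at the back
            items_.pop_back();
        }
        items_.emplace_front(key, value);
        index_[key] = items_.begin();
    }

private:
    using Item = std::pair<int, int>;  // (key, value)
    std::size_t capacity_;
    std::list<Item> items_;            // front = most recently used
    std::unordered_map<int, std::list<Item>::iterator> index_;
};
```

Both operations are O(1) on average (the hash lookup dominates), and space is O(capacity) for the list nodes plus the hash table entries. This sketch is not thread-safe.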
Algorithm Section: Given a problem — find the first element greater than or equal to target in a sorted array, essentially implementing lower_bound. I wrote a binary search, and the interviewer asked me to prove correctness using loop invariants.
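One way to write this with the invariant spelled out, as a sketch. The function name lower_bound_index is my own; the half-open interval [lo, hi) is one of several valid formulations.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the index of the first element >= target, or a.size() if none.
// Loop invariant (n = a.size()):
//   every element in [0, lo) is <  target
//   every element in [hi, n) is >= target
std::size_t lower_bound_index(const std::vector<int>& a, int target) {
    std::size_t lo = 0, hi = a.size();  // invariant holds trivially: both ranges are empty
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;  // lo <= mid < hi, no overflow
        if (a[mid] < target) {
            lo = mid + 1;  // sorted input: everything in [0, mid] is < target
        } else {
            hi = mid;      // sorted input: everything in [mid, n) is >= target
        }
        assert(lo <= hi);  // the interval shrinks but never inverts
    }
    // lo == hi: by the invariant this is the boundary between "< target"
    // and ">= target", which is exactly the answer.
    return lo;
}
```

Termination follows because hi - lo strictly decreases on every iteration: either lo grows past mid, or hi drops to mid, which is less than the old hi.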
Round 1 overall felt foundational but detailed — not surface-level rote questions.
Round 2: B+ Trees + MVCC + WAL (~90 minutes)
Round 2 was the most hardcore round of the entire interview, with a senior database kernel expert as the interviewer.
B+ Tree Section: The interviewer asked me to walk through B+ tree insertion and deletion in detail, including node splitting and merging. I drew a diagram of a split and explained how the middle key is promoted to the parent. The interviewer followed up on how to handle B+ trees under concurrency, and I discussed latch coupling (the crabbing protocol) and the B-link tree optimization. Then came a deeper question: what happens if a crash occurs in the middle of a node split? I explained that WAL makes the split atomic: the log records describing the split are written before the modified pages are persisted, so recovery can replay or discard the split as a whole.
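The split logic can be sketched for the leaf case as follows. This is a simplified in-memory model under my own assumptions: keys only, no values, no latching, no logging, and the Leaf, SplitResult, and split_leaf names are hypothetical. In a leaf split the separator is copied up (it stays in the right leaf), whereas an internal-node split moves the middle key up.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Simplified in-memory leaf: sorted keys plus a sibling link for range scans.
struct Leaf {
    std::vector<int> keys;
    Leaf* next = nullptr;
};

struct SplitResult {
    std::unique_ptr<Leaf> right;  // newly created right sibling
    int separator;                // key the caller inserts into the parent
};

// Splits an overfull leaf in half. The upper half moves to a new right
// sibling, and the first key of that sibling becomes the separator.
SplitResult split_leaf(Leaf& left) {
    assert(left.keys.size() >= 2);
    auto mid = left.keys.begin() +
               static_cast<std::ptrdiff_t>(left.keys.size() / 2);

    auto right = std::make_unique<Leaf>();
    right->keys.assign(mid, left.keys.end());
    left.keys.erase(mid, left.keys.end());

    // Keep the leaf chain intact: left -> right -> old successor.
    right->next = left.next;
    left.next = right.get();

    int separator = right->keys.front();
    return SplitResult{std::move(right), separator};
}
```

A real engine would do this on latched buffer-pool pages and write log records covering both leaves and the parent before any of those pages reach disk, which this sketch leaves out entirely.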
MVCC Section: The interviewer asked me to detail InnoDB's MVCC implementation, including hidden columns (DB_TRX_ID, DB_ROLL_PTR), Undo Log version chains, and Read View visibility rules. I explained the different Read View generation timing for RC and RR isolation levels. The interviewer followed up on why RR level can't completely prevent phantom reads — I discussed the difference between current reads and snapshot reads, and how Next-Key Lock solves phantom reads. Then came an open-ended question: if you were to design an MVCC scheme, how would you approach it? I answered from three dimensions: version storage, garbage collection, and visibility determination.
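The Read View visibility rules can be sketched roughly like this. It is a simplified model, not InnoDB source code: the struct and field names are my own, and the version chain is a plain linked list standing in for the undo log chain reached through DB_ROLL_PTR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

using TrxId = std::uint64_t;

// Snapshot of transaction state taken when the Read View is created.
struct ReadView {
    TrxId creator;              // transaction that owns this view
    TrxId min_active;           // smallest transaction id active at creation
    TrxId next_id;              // id the next new transaction would receive
    std::vector<TrxId> active;  // sorted ids of transactions active at creation

    bool visible(TrxId row_trx_id) const {
        assert(std::is_sorted(active.begin(), active.end()));
        if (row_trx_id == creator) return true;     // our own change
        if (row_trx_id < min_active) return true;   // committed before the snapshot
        if (row_trx_id >= next_id) return false;    // started after the snapshot
        // In between: visible only if it had already committed at snapshot time.
        return !std::binary_search(active.begin(), active.end(), row_trx_id);
    }
};

// One row version; trx_id plays the role of DB_TRX_ID and older the role
// of DB_ROLL_PTR.
struct Version {
    TrxId trx_id;
    int value;
    const Version* older;
};

// Walks the chain from newest to oldest and returns the first visible version.
const Version* find_visible(const Version* newest, const ReadView& view) {
    for (const Version* v = newest; v != nullptr; v = v->older) {
        if (view.visible(v->trx_id)) return v;
    }
    return nullptr;  // the row did not exist for this snapshot
}
```

In this model, the RC versus RR difference is only about when the ReadView is built: once per statement for RC, once per transaction (at its first consistent read) for RR.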
WAL Section: The interviewer asked about WAL principles and purposes, Redo Log write flow (writing to Log Buffer first, then flushing to disk), and Group Commit optimization. Then came a key question: how to ensure consistency between Redo Log and Binlog? I discussed the two-phase commit approach, XA protocol flow, and how to recover if a crash occurs at a certain stage. The interviewer was satisfied with this answer and followed up on the difference between distributed transaction 2PC and internal database 2PC.
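The crash-recovery side of the Redo Log and Binlog two-phase commit boils down to a small decision table, sketched below. The enum and function names are hypothetical, and the binlog is modeled as just the set of transaction ids (XIDs) that were fully written to it.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>

using Xid = std::uint64_t;

// State of a transaction as found in the redo log after a crash.
enum class RedoState { Active, Prepared, Committed };

enum class RecoveryAction { Rollback, Commit, None };

// Decides what recovery should do with one transaction.
// binlog_xids: XIDs whose events were completely written to the binlog.
RecoveryAction decide(RedoState state, Xid xid,
                      const std::unordered_set<Xid>& binlog_xids) {
    switch (state) {
        case RedoState::Committed:
            return RecoveryAction::None;      // both logs already agree
        case RedoState::Active:
            return RecoveryAction::Rollback;  // never reached prepare
        case RedoState::Prepared:
            // Crash between prepare and commit: the binlog is the arbiter.
            // If the binlog has the transaction, replicas may have it too,
            // so it must be committed; otherwise it is rolled back.
            return binlog_xids.count(xid) != 0 ? RecoveryAction::Commit
                                               : RecoveryAction::Rollback;
    }
    assert(false && "unreachable: all RedoState values handled above");
    return RecoveryAction::Rollback;
}
```

The point of the prepare step is that a transaction can never be committed in the storage engine without also being in the binlog, or the other way around.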
Round 2 lasted a full 90 minutes — mentally exhausting but genuinely satisfying.
Round 3: Storage Engine Design + Project Deep Dive (~75 minutes)
The Round 3 interviewer was likely at the technical director level, with questions leaning more toward architecture and design.
Storage Engine Design: The interviewer posed an open-ended challenge — if you were to design a storage engine supporting high-concurrency writes, how would you approach it? I started from the LSM-Tree architecture, discussing the layered structure of MemTable, Immutable MemTable, and SSTable, along with Compaction strategies. The interviewer followed up on how to solve LSM-Tree's read amplification problem — I discussed Bloom Filters, tiered indexing, and caching strategies. Then the interviewer asked me to compare LSM-Tree and B+ Tree pros and cons, and which scenarios suit each. I provided a comparative analysis across three dimensions: write amplification, read amplification, and space amplification.
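As an illustration of how a Bloom filter cuts read amplification, here is a minimal sketch. The class name, the double-hashing scheme built on std::hash, and the sizing are my own assumptions and do not reflect how LevelDB or RocksDB implement their filters.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Per-SSTable filter: "definitely absent" answers let a read skip the
// table without touching disk; "maybe present" answers still need a lookup.
class BloomFilter {
public:
    BloomFilter(std::size_t bits, std::size_t hashes)
        : bits_(bits, false), hashes_(hashes) {
        assert(bits > 0 && hashes > 0);
    }

    void add(const std::string& key) {
        for (std::size_t i = 0; i < hashes_; ++i) bits_[slot(key, i)] = true;
    }

    // false: the key was never added. true: the key may have been added
    // (false positives are possible, false negatives are not).
    bool may_contain(const std::string& key) const {
        for (std::size_t i = 0; i < hashes_; ++i) {
            if (!bits_[slot(key, i)]) return false;
        }
        return true;
    }

private:
    // Double hashing: derive the i-th probe from two base hashes.
    std::size_t slot(const std::string& key, std::size_t i) const {
        std::size_t h1 = std::hash<std::string>{}(key);
        std::size_t h2 = std::hash<std::string>{}(key + '#') | 1;  // force odd
        return (h1 + i * h2) % bits_.size();
    }

    std::vector<bool> bits_;
    std::size_t hashes_;
};
```

On the read path, each SSTable's filter would be checked before its index or data blocks, so most tables that cannot contain the key are skipped.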
Project Deep Dive: The interviewer asked me to walk through a page compression feature I had built, from the requirements and design to performance testing. I focused on dictionary compression and how we handled variable-length pages. The interviewer was interested in the compression ratio improvement and asked for specific test data. The interviewer then asked about the most difficult technical problem I'd encountered, and I described a concurrency bug investigation from symptom to diagnosis to fix. The interviewer listened attentively.
Comprehensive Assessment: The interviewer asked about my understanding of cloud-native databases, the architectural advantages of storage-compute separation, and the architectural differences between Snowflake and Aurora. I analyzed from the perspectives of log-as-data philosophy, shared storage implementation, and RO node scalability.
Real Questions Summary
1. How does shared_ptr's reference counting ensure thread safety?
2. Why can't unique_ptr be copied? How to pass it between functions?
3. Hand-write an LRU Cache with O(1) get and put
4. Why choose B+ trees over red-black trees for database indexes?
5. B+ tree node splitting and merging process?
6. How to handle B+ trees in concurrent scenarios? Latch coupling approach?
7. What happens if a crash occurs during B+ tree node splitting?
8. InnoDB MVCC implementation principles?
9. Why can't RR isolation level completely prevent phantom reads?
10. How would you design an MVCC scheme?
11. WAL principles and purposes? Redo Log write flow?
12. How to ensure consistency between Redo Log and Binlog?
13. Design a storage engine supporting high-concurrency writes
14. How to solve LSM-Tree's read amplification problem?
15. LSM-Tree vs B+ Tree pros and cons comparison?
16. Architectural differences between Snowflake and Aurora?
Tips and Advice
1. C++ fundamentals must be solid down to the details: Database kernel development has very high C++ requirements — it's not enough to just know how to write code; you need to understand underlying mechanisms. Smart pointers, RAII, and move semantics are almost guaranteed to come up. I recommend carefully reading "Effective C++" and "C++ Concurrency in Action."
2. Data structures must be hand-writable and clearly explainable: B+ trees are a core topic in database kernel interviews. You need to not only explain the principles clearly but also hand-write key operations. I recommend implementing a simplified B+ tree yourself.
3. Transactions and concurrency control are paramount: MVCC, WAL, and 2PC are the soul of database kernels and will almost certainly be deeply probed in interviews. I recommend reading the transaction chapters of "Database System Concepts" and "MySQL Internals: InnoDB Storage Engine."
4. System design requires a holistic perspective: Round 3's storage engine design doesn't test a specific component — it tests your understanding of the entire storage engine architecture. I recommend reading LevelDB and RocksDB source code to understand the complete LSM-Tree implementation.
5. Project experience must have depth: Interviewers will deep-dive into your project details, including the reasoning behind design decisions, problems encountered, and solutions. I recommend preparing 2-3 in-depth projects and organizing them using the STAR method.
FAQ
Q: How advanced must C++ skills be for Snowflake database kernel interviews?
A: Fairly advanced. It's not enough to be able to write C++; you need to understand the C++ memory model, concurrency mechanisms, and template metaprogramming. I recommend at least 30,000 lines of C++ project experience.
Q: Can I pass without database kernel experience?
A: Very difficult. Snowflake's position explicitly requires database kernel development experience. If you only have application-layer experience, I recommend contributing to open-source database projects like TiDB or RocksDB first.
Q: Will the interview cover operating systems and computer architecture?
A: Yes. Database kernels are closely tied to the operating system, and interviewers will ask about memory management, file systems, CPU caches, and related topics. I recommend reviewing the key chapters of an operating systems course.
Q: Is the work intensity high on the Snowflake database team?
A: From what I understand, the work intensity is moderate to high, but the technical atmosphere is excellent and you can learn a lot. The interviewer also mentioned the team has a well-established technical sharing mechanism.