Skip to main content
MySQL DBA Interview Questions
MySQL DBA Interview Questions
1. MySQL Architecture & Internals (Advanced)
- Explain the internal architecture of InnoDB buffer pool, change buffer, double-write buffer, redo logs, undo logs, and adaptive hash index.
- What happens internally when you run this query?
UPDATE users SET balance = balance + 100 WHERE id = 10; - Explain the full journey of a transaction from BEGIN → COMMIT inside MySQL.
- How does the MySQL optimizer choose an execution plan?
- Why does MySQL sometimes prefer a full table scan even when an index exists?
- Explain InnoDB crash recovery phases in detail.
- Why is InnoDB redo logging physical and binlogging logical?
- Explain MVCC implementation in InnoDB (hidden transaction IDs, rollback pointers).
- What is the difference between
persistent statistics and histograms? - What happens in memory and on disk when a page split occurs?
🟦 2. Advanced Storage Engine Concepts (Senior Level)
- Why does InnoDB sometimes hold gap locks even for equality conditions?
- Explain next-key locking and how it prevents phantom reads.
- How does InnoDB purge process work? What causes purge lag?
- Explain fragmentation types in InnoDB and how to fix each.
- Why does InnoDB not reclaim space after DELETE or TRUNCATE in certain cases?
- What is the importance of
log sequence number (LSN)? - How is insert buffer merging handled during crash recovery?
- How does InnoDB handle index cardinality and stats updates?
🟦 3. Replication (Architect-Level)
- Explain GTID lifecycle in MySQL during write, relay, and apply stages.
- What is an “errant transaction”? How do you detect and fix it?
- Why can multi-threaded replication still run single-threaded?
- Explain replication lag root causes and deep-dive diagnosis steps.
- What is WRITESET-based parallel replication? When should you use it?
- What is the difference between logical clock and commit order?
- Why does DDL break replication? How do you prevent DDL drift?
- What are the challenges of multi-source replication?
- How do you rebuild a replica without stopping writes on master?
- Explain binlog crash recovery and binlog checksum.
🟦 4. High Availability, Group Replication, Failover
- Explain the internal working of MySQL Group Replication (GCS, certification layer, applier threads).
- Why do multi-primary clusters generate conflicts? How to prevent them?
- What causes cluster “expelled member” events?
- How does quorum calculation work in Group Replication?
- What is flow control and why can it freeze your cluster?
- How does MySQL Router detect primary role changes?
- What is a split-brain scenario? How to prevent it?
- When does Group Replication lose consistency guarantees?
- How do you design zero-downtime failover with GTID replication?
- How do you avoid data loss during failover?
🟦 5. Performance Optimization (Expert Level)
- What are the top 10 signals from Performance Schema that indicate bottlenecks?
- Why does MySQL perform poorly on write-heavy workloads? How to fix it?
- What is the impact of redo log size on performance?
- How do you detect CPU-based bottlenecks using PFS wait events?
- What leads to excessive temporary table creation?
- Why can an index degrade performance instead of improving it?
- How do you tune MySQL for:
- 90% reads / 10% writes
- 80% writes / 20% reads
- What is the impact of page flush, checkpoint age, and I/O capacity?
- How do you detect and fix plan regression in MySQL 8?
- How do histograms help the optimizer?
🟦 6. Query Optimization (Hard)
- Explain the scenarios where EXPLAIN output is misleading.
- How do you optimize multi-table JOIN with millions of rows?
- Why is ORDER BY + LIMIT slow even with index?
- How do you optimize queries using window functions?
- Why does MySQL use filesort? How do you eliminate it?
- What causes BLOATED B+Tree index? How to fix it?
- Why does MySQL ignore a composite index?
- How do you drastically reduce temp-table creation for analytic queries?
- Why does MySQL store temp tables on disk even when memory is available?
🟦 7. Locking, Deadlocks, Concurrency
- Explain in detail InnoDB deadlock internal detection.
- Why do deadlocks increase after adding an index?
- What is metadata lock? How does it impact production?
- How do you track down a long-running transaction holding locks?
- What causes “Waiting for flush” or “Waiting for table metadata lock”?
- Why do online ALTER operations cause application timeouts?
- How does MySQL handle lock escalation?
🟦 8. Partitioning, VLDB, Scaling
- Why does partition pruning fail?
- Should you partition or shard? When?
- What happens to indexes when table is partitioned?
- How do you scale MySQL beyond 1B rows?
- Why do partitions degrade write performance?
- What is the impact of global index in partitioning?
- How do you migrate a 5TB table with near-zero downtime?
🟦 9. Backup, Restore, and DR (Senior Level)
- Explain the architecture of XtraBackup.
- Why can a partial prepare stage make backup unrecoverable?
- How do you perform cross-region PITR?
- How do you restore a single table from physical backup?
- Explain backup consistency under heavy write load.
- How does MySQL handle encrypted backup and restore?
- What causes binlog corruption? How do you fix it?
🟦 10. Disaster Recovery, Failover, Chaos Testing
- Master crashes with unsent transactions — how do you avoid data loss?
- How do you detect and fix GTID gaps after failover?
- How do you simulate crash recovery for DR testing?
- How do you handle MySQL crash with half-written pages?
- How do you perform cluster-wide failover testing without downtime?
🟦 11. Cloud, Aurora, RDS MySQL
- When will Aurora replica lag even though hardware is superior?
- Why does RDS failover take 40–90 seconds?
- How is aurora redo logging different from InnoDB?
- How do you recover an RDS instance stuck in storage full state?
- How do you fix “InnoDB: Write throttled” in cloud environments?
- How does Aurora Global DB replication actually work?
- Why does RDS performance degrade after large parameter change?
🟦 12. Senior-Level Scenario Questions
- CPU suddenly spikes to 90% on master — walk through your step-by-step investigation.
- Replication lag increases from 1 sec → 3600 sec — what do you do?
- MySQL crashes during heavy writes — how do you diagnose crash cause?
- Application performance drops suddenly — how do you isolate DB bottleneck?
- A 400GB table ALTER is stuck for 12 hours — how do you recover?
- A table reports corruption in production — what are immediate next steps?
- Failover happens but replicas refuse writes — why?
- EXPLAIN shows correct index but query still slow — what next?
- MySQL slows down at peak time every day — how to find the reason?
- Your MySQL instances show “too many connections” — how to fix root cause?
Comments
Post a Comment