How it works: the magic of the B-Tree
Under the hood, MongoDB uses something called a B-Tree to store its indexes. Imagine a massive, sorted filing cabinet. Instead of looking through every drawer, you have a directory at the top. This directory tells you which shelf to go to, and that shelf tells you exactly which folder to grab. In computer science terms, this turns a slow O(n) search into a super-fast O(log n) search. Even if your data grows 10x larger, a lookup only needs a few extra steps, so your query speed barely changes.
Why the B-Tree is awesome
- Self-Balancing: It automatically reorganizes itself as you add or delete data, so it stays efficient and balanced.
- Sorted Data: Because the data is kept in order, it’s perfect for range queries (like “find all users aged 18 to 25”) and sorting.
- Memory Efficient: MongoDB’s storage engine (WiredTiger) uses “prefix compression” to save space. For example, if you index emails like user1@gmail.com and user2@gmail.com, the leading part they share (user) is stored only once per page, which saves RAM and disk. Keep in mind that only a common beginning is compressed this way, not a common ending like @gmail.com.
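To make the O(log n) claim and the range-query point a bit more concrete, here is a minimal Python sketch. It is not how MongoDB or WiredTiger is actually implemented: a plain sorted list stands in for the B-Tree, and the function names (binary_search_steps, range_query) are made up for this illustration.

```python
import bisect

def binary_search_steps(sorted_keys, target):
    """Return (found, steps), where steps counts the keys we compared."""
    lo, hi, steps = 0, len(sorted_keys), 0
    while lo < hi:
        mid = (lo + hi) // 2
        steps += 1
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_keys) and sorted_keys[lo] == target
    return found, steps

def range_query(sorted_keys, low, high):
    """All keys with low <= key <= high, located with two binary searches."""
    start = bisect.bisect_left(sorted_keys, low)
    end = bisect.bisect_right(sorted_keys, high)
    return sorted_keys[start:end]

small = list(range(1_000))
large = list(range(10_000))  # 10x more keys

found_small, steps_small = binary_search_steps(small, 999)
found_large, steps_large = binary_search_steps(large, 9_999)
# 10x the data costs only a handful of extra comparisons, not 10x more
print(steps_small, steps_large)

ages = sorted([31, 18, 25, 40, 22, 17, 19])
print(range_query(ages, 18, 25))  # [18, 19, 22, 25]
```

Because the keys are sorted, the range query only has to find the two ends of the range and read everything in between, which is the same reason a sorted index handles “aged 18 to 25” so well.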
Choosing your weapons: common index types
Not all data is the same, so MongoDB gives you a few different index types to work with.
1. Single field index
The bread and butter of indexing. You pick one field (like username) and index it. By default, every collection has a unique index on the _id field, but you’ll usually want to add more for your most-searched fields.
2. Compound index
This is where you index multiple fields together (e.g., { lastName: 1, firstName: 1 }). This is a lifesaver when you often filter by one field and then sort by another. Pro-tip: The order of fields matters! An index on { a, b } can help queries that filter on { a } or on { a, b }, but it generally won’t help a query that filters on { b } alone, because b is not a leading field (a “prefix”) of the index.
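The prefix rule can be sketched in a few lines of Python. This is a deliberately simplified model that only looks at which fields a query filters on; the real query planner considers much more. The helper name usable_prefix is hypothetical.

```python
def usable_prefix(index_fields, query_fields):
    """Return the leading index fields that the query filters on.

    An empty list means the query cannot use the index as a prefix match.
    """
    prefix = []
    for field in index_fields:
        if field not in query_fields:
            break  # the chain of leading fields is broken here
        prefix.append(field)
    return prefix

index = ["a", "b"]  # models an index on { a: 1, b: 1 }
print(usable_prefix(index, {"a"}))       # ['a']
print(usable_prefix(index, {"a", "b"}))  # ['a', 'b']
print(usable_prefix(index, {"b"}))       # []
```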
3. Multikey index (for arrays)
If you index a field that stores an array (like tags: ["tech", "news"]), MongoDB automatically makes it a “multikey” index. It creates a separate index entry for every item in the array, making it easy to find documents that contain a specific tag.
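Here is a rough Python picture of “one entry per array item”. It is a toy model rather than MongoDB’s storage format: multikey_entries is a made-up helper, the sample documents are invented, and documents that lack the field are skipped to keep the sketch short.

```python
def multikey_entries(documents, field):
    """Build sorted (key, _id) pairs, one pair per array element."""
    entries = []
    for doc in documents:
        if field not in doc:
            continue  # simplification: this sketch ignores missing fields
        value = doc[field]
        values = value if isinstance(value, list) else [value]
        for item in values:
            entries.append((item, doc["_id"]))
    return sorted(entries)

docs = [
    {"_id": 1, "tags": ["tech", "news"]},
    {"_id": 2, "tags": ["tech"]},
]
entries = multikey_entries(docs, "tags")
print(entries)  # [('news', 1), ('tech', 1), ('tech', 2)]

# Finding every document tagged "tech" is now a lookup on the key
print([doc_id for key, doc_id in entries if key == "tech"])  # [1, 2]
```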
4. Partial & sparse indexes
Sometimes you don’t need to index everything.
- Sparse Indexes only include documents that actually have the field you’re indexing.
- Partial Indexes let you use a filter expression. For example, you could index only the “Active” users to keep your index small and snappy.
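The difference between the two comes down to which documents get an index entry at all. The sketch below models that choice in plain Python, with invented sample data and hypothetical helper names. In MongoDB itself you would express the same intent with the sparse and partialFilterExpression options when creating the index.

```python
def sparse_index_ids(documents, field):
    """Sparse: keep only documents that actually contain the field."""
    return [doc["_id"] for doc in documents if field in doc]

def partial_index_ids(documents, predicate):
    """Partial: keep only documents matching a filter (a callable here)."""
    return [doc["_id"] for doc in documents if predicate(doc)]

users = [
    {"_id": 1, "email": "a@example.com", "status": "active"},
    {"_id": 2, "status": "inactive"},  # no email field
    {"_id": 3, "email": "c@example.com", "status": "inactive"},
]

print(sparse_index_ids(users, "email"))                             # [1, 3]
print(partial_index_ids(users, lambda d: d["status"] == "active"))  # [1]
```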
The “Golden Rule”: the ESR framework
If you’re building a compound index, follow the ESR Rule (Equality, Sort, Range). This is the secret to perfect indexing:
- Equality (E): Put fields you’re matching exactly (like status: "active") first.
- Sort (S): Put fields you use to sort results next. This lets MongoDB return results in order “for free” without using extra RAM for an in-memory sort.
- Range (R): Put fields you’re doing range queries on (like price: { $gt: 100 }) last.
| Order | Type | Example |
|---|---|---|
| 1st | Equality | categoryId: 123 |
| 2nd | Sort | createdAt: -1 |
| 3rd | Range | price: { $gt: 50 } |
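As a small sketch of the rule in action, the hypothetical helper below takes the fields of a query, grouped by how they are used, and returns the index keys in ESR order. It is only an illustration of the ordering and does not talk to a database.

```python
def esr_index_keys(equality, sort, range_fields):
    """Order index keys as Equality, then Sort, then Range.

    equality:     field names matched exactly
    sort:         (field, direction) pairs, direction 1 or -1
    range_fields: field names used with operators like $gt or $lt
    """
    keys = [(field, 1) for field in equality]
    keys += list(sort)
    keys += [(field, 1) for field in range_fields]

    # Drop duplicates, keeping each field at its first (earliest) position
    seen, ordered = set(), []
    for field, direction in keys:
        if field not in seen:
            seen.add(field)
            ordered.append((field, direction))
    return ordered

keys = esr_index_keys(["categoryId"], [("createdAt", -1)], ["price"])
print(keys)  # [('categoryId', 1), ('createdAt', -1), ('price', 1)]
```

The result matches the table above: the corresponding index definition would be { categoryId: 1, createdAt: -1, price: 1 }.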
The “Holy Grail”: covered queries
A query is “covered” when every single piece of information you asked for is already inside the index itself. When this happens, MongoDB doesn’t even have to look at the documents on the disk. It answers straight from the index, which is usually already sitting in RAM. This is the fastest possible way to get data. To do this, you have to use Projections to tell MongoDB exactly which fields you want, and make sure all those fields are included in your index. Watch out for _id: it is returned by default, so exclude it in the projection unless it is part of the index.
When to stop: the cost of indexing
Indexes are great, but they aren’t free. Every time you insert, update, or delete a document, MongoDB has to update every index that the change affects. This is called Write Amplification. If you have 20 indexes on one collection, a single insert can mean up to 20 extra index writes on top of the document itself.
Signs you have too many indexes
- Your insert and update operations are getting noticeably slower.
- Your “Working Set” (the data and indexes you use most) is bigger than your available RAM, causing “page faults” that lag your system.
Maintenance: keeping it clean
Database patterns change. An index you built a year ago might be dead weight today. Pruning these is the easiest way to boost your write performance.
- Redundant Indexes: If you have an index on { a: 1, b: 1 }, you usually don’t need a separate index on { a: 1 }. The first index already covers the second.
- Unused Indexes: If an index hasn’t been used in 30 days, it’s just costing you disk space and write speed without helping anyone.
To find unused indexes, use the $indexStats aggregation stage. It tells you how many times each index has been touched since the server last started.
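Both checks are easy to automate once you have the index definitions and usage counters in hand. The sketch below works on hand-written sample data shaped like what $indexStats reports (a name plus an accesses.ops counter); the index names, numbers, and helper functions are all made up for illustration, and no database connection is involved.

```python
def is_prefix(shorter, longer):
    """True if key list `shorter` is a strict leading prefix of `longer`."""
    return len(shorter) < len(longer) and longer[:len(shorter)] == shorter

def find_redundant(indexes):
    """Names of indexes whose keys are a prefix of another index's keys."""
    redundant = []
    for name, keys in indexes.items():
        others = (k for other, k in indexes.items() if other != name)
        if any(is_prefix(keys, other_keys) for other_keys in others):
            redundant.append(name)
    return redundant

def find_unused(index_stats):
    """Names of indexes with zero recorded operations, ignoring _id_."""
    return [s["name"] for s in index_stats
            if s["accesses"]["ops"] == 0 and s["name"] != "_id_"]

# Sample data, invented for this example
indexes = {
    "a_1": [("a", 1)],
    "a_1_b_1": [("a", 1), ("b", 1)],
    "c_1": [("c", 1)],
}
stats = [
    {"name": "_id_", "accesses": {"ops": 0}},
    {"name": "a_1_b_1", "accesses": {"ops": 1520}},
    {"name": "c_1", "accesses": {"ops": 0}},
]

print(find_redundant(indexes))  # ['a_1']
print(find_unused(stats))       # ['c_1']
```

Treat the output as a list of candidates rather than a delete list. A prefix index with special options (unique, for example) still does a job the longer index cannot, and the counters only cover the time since the last restart.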
Quick troubleshooting: explain() is your friend
If a query is slow, run .explain("executionStats") on it.
- Look for IXSCAN: This means it’s using an index. Good!
- Look for COLLSCAN: This means it’s scanning everything. Bad!
- Compare totalKeysExamined to nReturned: If you’re looking at 10,000 keys just to find 5 documents, your index isn’t selective enough and needs a rethink.
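The checklist above can be turned into a small script. The sketch below assumes the explain output has already been loaded into a Python dict and uses the classic nested winningPlan layout (newer server versions may nest the plan differently). The sample dict is hand-written, and the helper names and the 10x threshold are arbitrary choices for this example.

```python
def plan_stages(plan):
    """Collect stage names from a winningPlan-style nested dict."""
    stages = [plan["stage"]]
    if "inputStage" in plan:
        stages += plan_stages(plan["inputStage"])
    for child in plan.get("inputStages", []):
        stages += plan_stages(child)
    return stages

def diagnose(explain_output, max_ratio=10):
    """Return a short list of findings for an executionStats explain."""
    findings = []
    stages = plan_stages(explain_output["queryPlanner"]["winningPlan"])
    if "COLLSCAN" in stages:
        findings.append("COLLSCAN: full collection scan")
    if "IXSCAN" in stages:
        findings.append("IXSCAN: index used")

    stats = explain_output["executionStats"]
    examined, returned = stats["totalKeysExamined"], stats["nReturned"]
    # max_ratio is an assumed rule of thumb, not a MongoDB setting
    if examined > max_ratio * max(returned, 1):
        findings.append(f"examined {examined} keys for {returned} results")
    return findings

# Hand-written sample shaped like explain("executionStats") output
sample = {
    "queryPlanner": {"winningPlan": {
        "stage": "FETCH",
        "inputStage": {"stage": "IXSCAN", "indexName": "status_1"},
    }},
    "executionStats": {"nReturned": 5, "totalKeysExamined": 10000},
}

print(diagnose(sample))
# ['IXSCAN: index used', 'examined 10000 keys for 5 results']
```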

