Lesson 16 of 20

Indexes & Performance

What Are Indexes?

An index is a data structure that speeds up data retrieval. Without an index, the database must scan every row in a table to find matching results (called a full table scan). With an index, the database can jump directly to the relevant rows, much like an index in a book helps you find a topic without reading every page.

Indexes dramatically improve SELECT query performance but come with trade-offs: they consume additional storage space and slow down INSERT, UPDATE, and DELETE operations because the index must be maintained.

  • Primary Key columns are automatically indexed
  • UNIQUE columns are automatically indexed
  • Foreign Key columns should usually be indexed manually
  • Columns frequently used in WHERE, JOIN, and ORDER BY are good candidates for indexes
Example
-- Create an index on a single column
CREATE INDEX idx_users_email ON users(email);

-- Create an index on the username column
CREATE INDEX idx_users_username ON users(username);

-- Create a composite index (multiple columns)
CREATE INDEX idx_users_active_date
ON users(is_active, join_date);

-- View all indexes on a table (MySQL)
SHOW INDEX FROM users;

-- Drop an index
DROP INDEX idx_users_username ON users;
Notes
  • Do not over-index. Each index takes up space and slows down writes. Only create indexes on columns that are frequently searched, filtered, or joined.

EXPLAIN & Query Optimization

The EXPLAIN statement shows how the database engine plans to execute a query. It reveals whether indexes are being used, how many rows are being scanned, and what join strategies are applied. This is your primary tool for diagnosing slow queries.

Query optimization is the practice of rewriting queries and adding indexes to make them run faster. Even small improvements can have a massive impact when your database grows to millions of rows.

  • EXPLAIN shows the query execution plan without running the query
  • Look for 'ALL' in the type column — it means a full table scan (usually bad)
  • Look for 'ref' or 'const' — these mean an index is being used (good)
  • The 'rows' column shows how many rows the engine expects to examine
Example
-- Analyze a query execution plan
EXPLAIN SELECT * FROM users WHERE email = 'alice@example.com';

-- Compare with and without index
-- Without index (likely full table scan):
EXPLAIN SELECT * FROM users WHERE bio LIKE '%developer%';

-- Tips for faster queries:
-- 1. Use specific columns instead of SELECT *
SELECT username, email FROM users WHERE age > 25;

-- 2. Avoid functions on indexed columns in WHERE
-- Slow (cannot use index):
SELECT * FROM users WHERE YEAR(join_date) = 2026;
-- Fast (can use index):
SELECT * FROM users
WHERE join_date >= '2026-01-01' AND join_date < '2027-01-01';

-- 3. Use LIMIT when you only need a few rows
SELECT * FROM users ORDER BY join_date DESC LIMIT 10;
Notes
  • A query that scans 1,000 rows using an index can be hundreds of times faster than scanning 1,000,000 rows without one. Always check EXPLAIN for slow queries.