Primary Keys, Foreign Keys & Relationships
Good database design starts with understanding how your data relates to itself. Tables are connected through relationships, and these relationships are enforced using primary keys and foreign keys.
There are three types of relationships: one-to-one (1:1), one-to-many (1:N), and many-to-many (N:N). The type of relationship determines how you structure your tables and where you place foreign keys.
- One-to-One (1:1): Each user has one profile. The foreign key goes in either table.
- One-to-Many (1:N): One customer has many orders. The foreign key goes in the 'many' table (orders).
- Many-to-Many (N:N): Students enroll in many courses, and courses have many students. Requires a junction table.
- Primary Key: Uniquely identifies each row. Usually an auto-incrementing integer or a UUID.
- Foreign Key: A column that points to the primary key of another table, establishing the relationship.
-- One-to-One: Users and Profiles
CREATE TABLE users (
id INT PRIMARY KEY AUTO_INCREMENT,
username VARCHAR(50) NOT NULL UNIQUE
);
CREATE TABLE profiles (
id INT PRIMARY KEY AUTO_INCREMENT,
user_id INT UNIQUE, -- UNIQUE enforces 1:1
full_name VARCHAR(100),
avatar_url VARCHAR(255),
FOREIGN KEY (user_id) REFERENCES users(id)
);
-- One-to-Many: Users and Posts
CREATE TABLE posts (
id INT PRIMARY KEY AUTO_INCREMENT,
user_id INT,
title VARCHAR(200),
content TEXT,
FOREIGN KEY (user_id) REFERENCES users(id)
);
-- Many-to-Many: Students and Courses
CREATE TABLE students (id INT PRIMARY KEY AUTO_INCREMENT, name VARCHAR(100));
CREATE TABLE courses (id INT PRIMARY KEY AUTO_INCREMENT, title VARCHAR(100));
CREATE TABLE enrollments (
student_id INT,
course_id INT,
enrolled_date DATE,
PRIMARY KEY (student_id, course_id),
FOREIGN KEY (student_id) REFERENCES students(id),
FOREIGN KEY (course_id) REFERENCES courses(id)
); - The junction table (enrollments) uses a composite primary key made from both foreign keys. This prevents a student from enrolling in the same course twice.
Normalization: 1NF, 2NF, 3NF
Normalization is the process of organizing a database to reduce data redundancy and improve data integrity. It follows a series of rules called normal forms. Most well-designed databases aim for Third Normal Form (3NF).
Each normal form builds on the previous one. Normalization helps prevent problems like update anomalies (changing data in one place but forgetting another), insertion anomalies, and deletion anomalies.
- 1NF (First Normal Form): Each column contains only atomic (indivisible) values. No repeating groups or arrays.
- 2NF (Second Normal Form): Meets 1NF AND every non-key column depends on the entire primary key (relevant for composite keys).
- 3NF (Third Normal Form): Meets 2NF AND no non-key column depends on another non-key column (no transitive dependencies).
-- BAD: Violates 1NF (multiple values in one column)
-- CREATE TABLE orders (
-- id INT PRIMARY KEY,
-- customer_name VARCHAR(100),
-- products VARCHAR(500) -- 'Laptop, Mouse, Keyboard'
-- );
-- GOOD: 1NF - separate table for order items
CREATE TABLE orders (
id INT PRIMARY KEY AUTO_INCREMENT,
customer_id INT,
order_date DATE
);
CREATE TABLE order_items (
id INT PRIMARY KEY AUTO_INCREMENT,
order_id INT,
product_name VARCHAR(100),
quantity INT,
price DECIMAL(10, 2),
FOREIGN KEY (order_id) REFERENCES orders(id)
);
-- BAD: Violates 3NF (city depends on zip_code, not the primary key)
-- customer_id | name | zip_code | city
-- GOOD: 3NF - separate zip_codes table
-- customers: customer_id | name | zip_code
-- zip_codes: zip_code | city | state - Over-normalization can hurt performance by requiring too many joins. Sometimes a small amount of denormalization is acceptable for read-heavy applications. The key is finding the right balance.
