Loading source
Pulling the file list, source metadata, and syntax-aware rendering for this listing.
Source from repo
Optimize SQL queries with index strategies, query plan analysis, and rewrite patterns for faster database performance.
Files
Skill
Size
Entrypoint
Format
Open file
Syntax-highlighted preview of this file as included in the skill package.
references/details.md
1# sql-optimization-patterns — detailed patterns and worked examples23## Optimization Patterns45### Pattern 1: Eliminate N+1 Queries67**Problem: N+1 Query Anti-Pattern**89```python10# Bad: Executes N+1 queries11users = db.query("SELECT * FROM users LIMIT 10")12for user in users:13orders = db.query("SELECT * FROM orders WHERE user_id = ?", user.id)14# Process orders15```1617**Solution: Use JOINs or Batch Loading**1819```sql20-- Solution 1: JOIN21SELECT22u.id, u.name,23o.id as order_id, o.total24FROM users u25LEFT JOIN orders o ON u.id = o.user_id26WHERE u.id IN (1, 2, 3, 4, 5);2728-- Solution 2: Batch query29SELECT * FROM orders30WHERE user_id IN (1, 2, 3, 4, 5);31```3233```python34# Good: Single query with JOIN or batch load35# Using JOIN36results = db.query("""37SELECT u.id, u.name, o.id as order_id, o.total38FROM users u39LEFT JOIN orders o ON u.id = o.user_id40WHERE u.id IN (1, 2, 3, 4, 5)41""")4243# Or batch load44users = db.query("SELECT * FROM users LIMIT 10")45user_ids = [u.id for u in users]46orders = db.query(47"SELECT * FROM orders WHERE user_id IN (?)",48user_ids49)50# Group orders by user_id51orders_by_user = {}52for order in orders:53orders_by_user.setdefault(order.user_id, []).append(order)54```5556### Pattern 2: Optimize Pagination5758**Bad: OFFSET on Large Tables**5960```sql61-- Slow for large offsets62SELECT * FROM users63ORDER BY created_at DESC64LIMIT 20 OFFSET 100000; -- Very slow!65```6667**Good: Cursor-Based Pagination**6869```sql70-- Much faster: Use cursor (last seen ID)71SELECT * FROM users72WHERE created_at < '2024-01-15 10:30:00' -- Last cursor73ORDER BY created_at DESC74LIMIT 20;7576-- With composite sorting77SELECT * FROM users78WHERE (created_at, id) < ('2024-01-15 10:30:00', 12345)79ORDER BY created_at DESC, id DESC80LIMIT 20;8182-- Requires index83CREATE INDEX idx_users_cursor ON users(created_at DESC, id DESC);84```8586### Pattern 3: Aggregate Efficiently8788**Optimize COUNT Queries:**8990```sql91-- Bad: Counts all rows92SELECT COUNT(*) FROM orders; -- Slow on large tables9394-- Good: Use estimates for approximate counts95SELECT reltuples::bigint AS estimate96FROM pg_class97WHERE relname = 'orders';9899-- Good: Filter before counting100SELECT COUNT(*) FROM orders101WHERE created_at > NOW() - INTERVAL '7 days';102103-- Better: Use index-only scan104CREATE INDEX idx_orders_created ON orders(created_at);105SELECT COUNT(*) FROM orders106WHERE created_at > NOW() - INTERVAL '7 days';107```108109**Optimize GROUP BY:**110111```sql112-- Bad: Group by then filter113SELECT user_id, COUNT(*) as order_count114FROM orders115GROUP BY user_id116HAVING COUNT(*) > 10;117118-- Better: Filter first, then group (if possible)119SELECT user_id, COUNT(*) as order_count120FROM orders121WHERE status = 'completed'122GROUP BY user_id123HAVING COUNT(*) > 10;124125-- Best: Use covering index126CREATE INDEX idx_orders_user_status ON orders(user_id, status);127```128129### Pattern 4: Subquery Optimization130131**Transform Correlated Subqueries:**132133```sql134-- Bad: Correlated subquery (runs for each row)135SELECT u.name, u.email,136(SELECT COUNT(*) FROM orders o WHERE o.user_id = u.id) as order_count137FROM users u;138139-- Good: JOIN with aggregation140SELECT u.name, u.email, COUNT(o.id) as order_count141FROM users u142LEFT JOIN orders o ON o.user_id = u.id143GROUP BY u.id, u.name, u.email;144145-- Better: Use window functions146SELECT DISTINCT ON (u.id)147u.name, u.email,148COUNT(o.id) OVER (PARTITION BY u.id) as order_count149FROM users u150LEFT JOIN orders o ON o.user_id = u.id;151```152153**Use CTEs for Clarity:**154155```sql156-- Using Common Table Expressions157WITH recent_users AS (158SELECT id, name, email159FROM users160WHERE created_at > NOW() - INTERVAL '30 days'161),162user_order_counts AS (163SELECT user_id, COUNT(*) as order_count164FROM orders165WHERE created_at > NOW() - INTERVAL '30 days'166GROUP BY user_id167)168SELECT ru.name, ru.email, COALESCE(uoc.order_count, 0) as orders169FROM recent_users ru170LEFT JOIN user_order_counts uoc ON ru.id = uoc.user_id;171```172173### Pattern 5: Batch Operations174175**Batch INSERT:**176177```sql178-- Bad: Multiple individual inserts179INSERT INTO users (name, email) VALUES ('Alice', '[email protected]');180INSERT INTO users (name, email) VALUES ('Bob', '[email protected]');181INSERT INTO users (name, email) VALUES ('Carol', '[email protected]');182183-- Good: Batch insert184INSERT INTO users (name, email) VALUES185('Alice', '[email protected]'),186('Bob', '[email protected]'),187('Carol', '[email protected]');188189-- Better: Use COPY for bulk inserts (PostgreSQL)190COPY users (name, email) FROM '/tmp/users.csv' CSV HEADER;191```192193**Batch UPDATE:**194195```sql196-- Bad: Update in loop197UPDATE users SET status = 'active' WHERE id = 1;198UPDATE users SET status = 'active' WHERE id = 2;199-- ... repeat for many IDs200201-- Good: Single UPDATE with IN clause202UPDATE users203SET status = 'active'204WHERE id IN (1, 2, 3, 4, 5, ...);205206-- Better: Use temporary table for large batches207CREATE TEMP TABLE temp_user_updates (id INT, new_status VARCHAR);208INSERT INTO temp_user_updates VALUES (1, 'active'), (2, 'active'), ...;209210UPDATE users u211SET status = t.new_status212FROM temp_user_updates t213WHERE u.id = t.id;214```215216## Advanced Techniques217218### Materialized Views219220Pre-compute expensive queries.221222```sql223-- Create materialized view224CREATE MATERIALIZED VIEW user_order_summary AS225SELECT226u.id,227u.name,228COUNT(o.id) as total_orders,229SUM(o.total) as total_spent,230MAX(o.created_at) as last_order_date231FROM users u232LEFT JOIN orders o ON u.id = o.user_id233GROUP BY u.id, u.name;234235-- Add index to materialized view236CREATE INDEX idx_user_summary_spent ON user_order_summary(total_spent DESC);237238-- Refresh materialized view239REFRESH MATERIALIZED VIEW user_order_summary;240241-- Concurrent refresh (PostgreSQL)242REFRESH MATERIALIZED VIEW CONCURRENTLY user_order_summary;243244-- Query materialized view (very fast)245SELECT * FROM user_order_summary246WHERE total_spent > 1000247ORDER BY total_spent DESC;248```249250### Partitioning251252Split large tables for better performance.253254```sql255-- Range partitioning by date (PostgreSQL)256CREATE TABLE orders (257id SERIAL,258user_id INT,259total DECIMAL,260created_at TIMESTAMP261) PARTITION BY RANGE (created_at);262263-- Create partitions264CREATE TABLE orders_2024_q1 PARTITION OF orders265FOR VALUES FROM ('2024-01-01') TO ('2024-04-01');266267CREATE TABLE orders_2024_q2 PARTITION OF orders268FOR VALUES FROM ('2024-04-01') TO ('2024-07-01');269270-- Queries automatically use appropriate partition271SELECT * FROM orders272WHERE created_at BETWEEN '2024-02-01' AND '2024-02-28';273-- Only scans orders_2024_q1 partition274```275276### Query Hints and Optimization277278```sql279-- Force index usage (MySQL)280SELECT * FROM users281USE INDEX (idx_users_email)282WHERE email = '[email protected]';283284-- Parallel query (PostgreSQL)285SET max_parallel_workers_per_gather = 4;286SELECT * FROM large_table WHERE condition;287288-- Join hints (PostgreSQL)289SET enable_nestloop = OFF; -- Force hash or merge join290```291