Knowing SELECT and WHERE is just the entry fee. What separates a junior analyst from someone who can answer real business questions fast is fluency with the 20 SQL patterns that appear again and again in actual dashboards, reports, and ad-hoc requests. This post covers those patterns — not theoretical — with scenarios drawn from Kozhikode textile distributors, Kochi IT company HR data, and e-commerce funnel analytics that Indian analysts actually encounter.
1–5: Window Functions You Cannot Avoid
1. RANK and DENSE_RANK for Leaderboards
A Kozhikode textile distributor wants to rank their stockists by quarterly fabric purchases. Ties happen when two stockists buy exactly the same volume — which the regional manager wants to see as a shared rank, not skipped numbers.
SELECT
stockist_name,
district,
quarterly_purchase_value,
RANK() OVER (ORDER BY quarterly_purchase_value DESC) AS rank_with_gaps,
DENSE_RANK() OVER (ORDER BY quarterly_purchase_value DESC) AS rank_no_gaps
FROM stockist_quarterly_summary
WHERE fy_quarter = 'Q3-FY2026';
Use DENSE_RANK when the business context is "position on a leaderboard". Use RANK when you need to know exactly how many peers outperformed a given row.
2. ROW_NUMBER for Deduplication
An e-commerce order table often has duplicate entries from webhook retries. To keep only one row per order ID — the most recent one:
WITH ranked_orders AS (
SELECT *,
ROW_NUMBER() OVER (
PARTITION BY order_id
ORDER BY created_at DESC
) AS rn
FROM raw_orders
)
SELECT * FROM ranked_orders WHERE rn = 1;
3. LAG and LEAD for Period-over-Period Comparison
A Kochi IT company wants to compare each month's headcount against the previous month to flag departments with unusual attrition:
SELECT
dept_name,
report_month,
headcount,
LAG(headcount) OVER (PARTITION BY dept_name ORDER BY report_month) AS prev_month_headcount,
headcount - LAG(headcount) OVER (PARTITION BY dept_name ORDER BY report_month) AS net_change
FROM monthly_headcount_snapshot
ORDER BY dept_name, report_month;
LEAD does the same in the opposite direction — useful for calculating how many days until the next event, like the next purchase by a customer.
4. Running Totals with SUM OVER
Cumulative revenue by day for an e-commerce order funnel report:
SELECT
order_date,
daily_revenue,
SUM(daily_revenue) OVER (ORDER BY order_date ROWS UNBOUNDED PRECEDING) AS cumulative_revenue
FROM daily_revenue_summary
WHERE order_date BETWEEN '2026-04-01' AND '2026-06-30';
5. Moving Averages for Smoothing Noisy Data
A 7-day rolling average of daily orders removes weekend dips from the trend line:
SELECT
order_date,
daily_orders,
AVG(daily_orders) OVER (
ORDER BY order_date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS rolling_7day_avg
FROM daily_order_counts;
6–9: CTEs and Subqueries
6. Basic CTE for Readable Multi-Step Logic
CTEs make complex queries readable by naming intermediate steps. For an employee performance analysis at a Kochi IT company with multiple sub-calculations:
WITH dept_avg AS (
SELECT
dept_id,
AVG(performance_score) AS avg_score
FROM employee_reviews
WHERE review_year = 2026
GROUP BY dept_id
),
above_avg_employees AS (
SELECT e.emp_name, e.dept_id, e.performance_score, d.avg_score
FROM employee_reviews e
JOIN dept_avg d ON e.dept_id = d.dept_id
WHERE e.performance_score > d.avg_score AND e.review_year = 2026
)
SELECT * FROM above_avg_employees ORDER BY dept_id, performance_score DESC;
7. Recursive CTE for Hierarchy Traversal
An org chart query — find all direct and indirect reports under a given manager:
WITH RECURSIVE org_tree AS (
-- Base: the manager themselves
SELECT emp_id, emp_name, manager_id, 0 AS depth
FROM employees
WHERE emp_id = 1042 -- Target manager's emp_id
UNION ALL
-- Recursive: their reports, and reports of reports
SELECT e.emp_id, e.emp_name, e.manager_id, ot.depth + 1
FROM employees e
INNER JOIN org_tree ot ON e.manager_id = ot.emp_id
)
SELECT * FROM org_tree ORDER BY depth, emp_name;
8. Subquery in WHERE for Existence Checks
Find customers who placed an order this quarter but not last quarter — useful for "returning after lapse" campaign lists:
SELECT DISTINCT customer_id, customer_name
FROM orders
WHERE order_date BETWEEN '2026-04-01' AND '2026-06-30'
AND customer_id NOT IN (
SELECT DISTINCT customer_id
FROM orders
WHERE order_date BETWEEN '2026-01-01' AND '2026-03-31'
);
9. Correlated Subquery for Row-by-Row Logic
For each product in a Kozhikode distributor's catalogue, find the most recent purchase price paid:
SELECT
p.product_id,
p.product_name,
(
SELECT purchase_price
FROM purchase_history ph
WHERE ph.product_id = p.product_id
ORDER BY purchase_date DESC
LIMIT 1
) AS latest_purchase_price
FROM products p;
10–13: CASE WHEN for Segmentation
10. Customer Value Segmentation
Classify customers as High, Medium, or Low value based on 12-month spend — a common input for CRM campaign targeting:
SELECT
customer_id,
customer_name,
annual_spend,
CASE
WHEN annual_spend >= 500000 THEN 'High Value'
WHEN annual_spend >= 100000 THEN 'Medium Value'
ELSE 'Low Value'
END AS customer_segment
FROM customer_annual_summary
WHERE summary_year = 2026;
11. CASE WHEN in Aggregation (Conditional Count)
Count orders by status in a single row per customer — instead of joining multiple filtered aggregations:
SELECT
customer_id,
COUNT(*) AS total_orders,
SUM(CASE WHEN status = 'delivered' THEN 1 ELSE 0 END) AS delivered,
SUM(CASE WHEN status = 'returned' THEN 1 ELSE 0 END) AS returned,
SUM(CASE WHEN status = 'cancelled' THEN 1 ELSE 0 END) AS cancelled
FROM orders
WHERE order_date >= '2026-01-01'
GROUP BY customer_id;
12. Indian Fiscal Year Grouping
Indian businesses report on an April–March fiscal year. This query correctly assigns each transaction to its fiscal year:
SELECT
CASE
WHEN MONTH(transaction_date) >= 4
THEN YEAR(transaction_date)
ELSE YEAR(transaction_date) - 1
END AS fiscal_year,
SUM(amount) AS total_revenue
FROM transactions
GROUP BY
CASE
WHEN MONTH(transaction_date) >= 4
THEN YEAR(transaction_date)
ELSE YEAR(transaction_date) - 1
END
ORDER BY fiscal_year;
A January 2026 transaction gets fiscal_year = 2025 (FY2025-26), while an April 2026 transaction gets fiscal_year = 2026 (FY2026-27). This aligns with how Indian CA firms and finance teams label fiscal years.
13. Quarter within Indian Fiscal Year
SELECT
CASE
WHEN MONTH(order_date) IN (4,5,6) THEN 'Q1'
WHEN MONTH(order_date) IN (7,8,9) THEN 'Q2'
WHEN MONTH(order_date) IN (10,11,12) THEN 'Q3'
ELSE 'Q4'
END AS fiscal_quarter,
COUNT(*) AS order_count,
SUM(order_value) AS total_value
FROM orders
WHERE YEAR(order_date) IN (2025, 2026)
GROUP BY fiscal_quarter
ORDER BY MIN(order_date);
14–17: Joins, Self-Joins, and Set Operations
14. Self-Join for Employee–Manager Pairs
SELECT
e.emp_name AS employee,
m.emp_name AS manager,
e.dept_name
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.emp_id
ORDER BY e.dept_name, e.emp_name;
15. Finding Gaps with OUTER JOIN
Identify which days in a date range had zero orders — useful for spotting payment gateway outages at Indian e-commerce companies:
WITH date_series AS (
SELECT DATE('2026-04-01') + INTERVAL seq DAY AS dt
FROM (SELECT 0 seq UNION SELECT 1 UNION SELECT 2 /* ... to 89 */) nums
WHERE DATE('2026-04-01') + INTERVAL seq DAY <= '2026-06-30'
)
SELECT ds.dt, COALESCE(COUNT(o.order_id), 0) AS orders
FROM date_series ds
LEFT JOIN orders o ON DATE(o.created_at) = ds.dt
GROUP BY ds.dt
HAVING orders = 0
ORDER BY ds.dt;
16. UNION vs UNION ALL for Combining Data Sources
Combine offline and online sales into a single analysis table:
SELECT order_id, customer_id, order_date, order_value, 'online' AS channel
FROM online_orders
WHERE order_date >= '2026-04-01'
UNION ALL -- Use UNION if you need deduplication; UNION ALL is faster
SELECT bill_no, customer_id, bill_date, bill_value, 'offline' AS channel
FROM pos_transactions
WHERE bill_date >= '2026-04-01'
ORDER BY order_date;
17. Subquery vs JOIN Performance
A subquery in WHERE often performs differently from a JOIN to the same data. For large tables in MySQL, prefer EXISTS over IN with subqueries:
-- Often faster on large tables than IN (subquery)
SELECT c.customer_id, c.customer_name
FROM customers c
WHERE EXISTS (
SELECT 1 FROM orders o
WHERE o.customer_id = c.customer_id
AND o.order_date >= '2026-01-01'
);
18–20: Advanced Aggregation and Query Optimisation
18. GROUP BY with ROLLUP for Subtotals
Sales analysis by region and district with automatic subtotals — useful for MIS reports sent to Kerala-based FMCG distributors:
SELECT
COALESCE(region, 'ALL REGIONS') AS region,
COALESCE(district, 'ALL DISTRICTS') AS district,
SUM(sales_value) AS total_sales
FROM distributor_sales
WHERE fy_year = 2026
GROUP BY ROLLUP(region, district)
ORDER BY region, district;
ROLLUP produces subtotal rows automatically — you get a row for each district, a row for each region total, and a grand total row, all in a single query.
19. PIVOT Pattern Using Conditional Aggregation
Turn rows into columns — monthly revenue by product category, formatted for pivot-style reporting:
SELECT
product_category,
SUM(CASE WHEN MONTH(order_date) = 4 THEN order_value ELSE 0 END) AS Apr,
SUM(CASE WHEN MONTH(order_date) = 5 THEN order_value ELSE 0 END) AS May,
SUM(CASE WHEN MONTH(order_date) = 6 THEN order_value ELSE 0 END) AS Jun,
SUM(CASE WHEN MONTH(order_date) = 7 THEN order_value ELSE 0 END) AS Jul,
SUM(CASE WHEN MONTH(order_date) = 8 THEN order_value ELSE 0 END) AS Aug,
SUM(CASE WHEN MONTH(order_date) = 9 THEN order_value ELSE 0 END) AS Sep
FROM orders
WHERE order_date BETWEEN '2026-04-01' AND '2026-09-30'
GROUP BY product_category;
20. EXPLAIN PLAN for Query Optimisation
When a dashboard query is slow, the first step is understanding what the database engine is actually doing:
EXPLAIN SELECT
c.customer_id,
c.customer_name,
SUM(o.order_value) AS lifetime_value
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
WHERE o.order_date >= '2025-01-01'
GROUP BY c.customer_id, c.customer_name
HAVING SUM(o.order_value) > 50000;
Look for "Full Table Scan" or "Using temporary; Using filesort" in the EXPLAIN output — these indicate missing indexes. For the query above, an index on orders(customer_id, order_date) will typically reduce execution time by an order of magnitude on tables with millions of rows.
An e-commerce funnel analysis at a Kochi company that takes 40 seconds to run often drops to under 2 seconds after adding a composite index on the join and filter columns. Always check EXPLAIN before adding an index — sometimes the query plan reveals a structural issue that an index alone cannot fix.
Frequently Asked Questions
What is the difference between RANK and DENSE_RANK in SQL?
RANK assigns the same rank to tied rows and then skips numbers — two rows tied at rank 2 are both ranked 2, and the next row is ranked 4. DENSE_RANK also assigns the same rank to ties but does not skip numbers — the next row after two tied rank-2 rows receives rank 3. For sales leaderboards where multiple representatives might hit the same revenue, DENSE_RANK gives cleaner results because rank numbers stay consecutive.
How do I calculate Indian fiscal year (April–March) groupings in SQL?
Use a CASE WHEN to shift months: when the month is April (4) or later, use the calendar year as the fiscal year; otherwise subtract 1. A January 2026 transaction belongs to FY2025, while an April 2026 transaction belongs to FY2026. Wrap this in a CTE for reuse across multiple aggregations in the same query.
When should I use a CTE instead of a subquery in SQL?
Use a CTE when you need to reference the same intermediate result more than once, when the logic has more than two levels of nesting, or when you want to make the query readable for the next analyst who inherits it. Use a subquery for simple one-off filters where creating a named reference adds no clarity. In PostgreSQL and BigQuery, the query optimiser treats them differently — profile both if performance matters on large datasets.