Aggregating and Pivoting Data in Microsoft SQL Server 2008 T-SQL
- 3/25/2009
Unpivoting
Unpivoting is the opposite of pivoting—namely, rotating columns to rows. Unpivoting is usually used to normalize data, but it has other applications as well.
In my examples, I’ll use the PvtCustOrders table, which you create and populate by running the following code:
USE tempdb; IF OBJECT_ID('dbo.PvtCustOrders') IS NOT NULL DROP TABLE dbo.PvtCustOrders; GO SELECT custid, COALESCE([2006], 0) AS [2006], COALESCE([2007], 0) AS [2007], COALESCE([2008], 0) AS [2008] INTO dbo.PvtCustOrders FROM (SELECT custid, YEAR(orderdate) AS orderyear, qty FROM dbo.Orders) AS D PIVOT(SUM(qty) FOR orderyear IN([2006],[2007],[2008])) AS P; UPDATE dbo.PvtCustOrders SET [2007] = NULL, [2008] = NULL WHERE custid = 'D'; -- Show the contents of the table SELECT * FROM dbo.PvtCustOrders;
This generates the following output:
custid 2006 2007 2008 ------ ----------- ----------- ----------- A 22 40 10 B 20 12 15 C 22 14 20 D 30 NULL NULL
The goal in this case is to generate a result row for each customer and year, containing the customer ID (custid), order year (orderyear), and quantity (qty).
I’ll start with a solution that does not use the native UNPIVOT operator. Here as well, try to think in terms of logical query processing as described in Chapter 1.
The first step in the solution is to generate three copies of each base row—one for each year. You can achieve this by performing a cross join between the base table and a virtual auxiliary table that has one row per year. The SELECT list can then return custid and orderyear and also calculate the target year’s qty with the following CASE expression:
CASE orderyear WHEN 2006 THEN [2006] WHEN 2007 THEN [2007] WHEN 2008 THEN [2008] END AS qty
You achieve unpivoting this way, but you also get rows corresponding to NULL values in the source table (for example, for customer D in years 2007 and 2008). To eliminate those rows, create a derived table out of the solution query and, in the outer query, eliminate the rows with the NULL in the qty column.
Here’s the complete solution, followed by its output:
SELECT custid, orderyear, qty FROM (SELECT custid, orderyear, CASE orderyear WHEN 2006 THEN [2006] WHEN 2007 THEN [2007] WHEN 2008 THEN [2008] END AS qty FROM dbo.PvtCustOrders CROSS JOIN (SELECT 2006 AS orderyear UNION ALL SELECT 2007 UNION ALL SELECT 2008) AS OrderYears) AS D WHERE qty IS NOT NULL; custid orderyear qty ------ ----------- ----------- A 2006 22 A 2007 40 A 2008 10 B 2006 20 B 2007 12 B 2008 15 C 2006 22 C 2007 14 C 2008 20 D 2006 30 D 2007 0 D 2008 0
As of SQL Server 2008, you can replace the current definition of the derived table D with a table value constructor based on the VALUES clause, like so:
SELECT custid, orderyear, qty FROM (SELECT custid, orderyear, CASE orderyear WHEN 2006 THEN [2006] WHEN 2007 THEN [2007] WHEN 2008 THEN [2008] END AS qty FROM dbo.PvtCustOrders CROSS JOIN (VALUES(2006),(2007),(2008)) AS OrderYears(orderyear)) AS D WHERE qty IS NOT NULL;
Either way, using the native proprietary UNPIVOT table operator is dramatically simpler, as the following query shows:
SELECT custid, orderyear, qty FROM dbo.PvtCustOrders UNPIVOT(qty FOR orderyear IN([2006],[2007],[2008])) AS U;
Unlike the PIVOT operator, I find the UNPIVOT operator simple and intuitive, and obviously it requires significantly less code than the alternative solutions. UNPIVOT’s first input is the target column name to hold the source column values (qty). Then, following the FOR keyword, you specify the target column name to hold the source column names (orderyear). Finally, in the parentheses of the IN clause, you specify the source column names that you want to unpivot ([2006],[2007],[2008]).