Snowflake PIVOT 101: How to Master PIVOT and UNPIVOT (2024)
Pivoting and unpivoting are essential data transformation techniques in Snowflake. They transpose your data, converting rows into columns and vice versa. Such transformations are not just cosmetic; they pave the way for smarter data analysis and reporting.
In this article, we will explore what pivoting and unpivoting operations do, their use cases, and how to leverage built-in Snowflake PIVOT and Snowflake UNPIVOT functions to put these concepts into practice.
What Is PIVOT and UNPIVOT?
1) What Do You Mean by Pivoting?
Pivoting is the process of rotating data from row format to column format. Imagine having a table where each row represents a month, and you want to turn those months into individual columns. That's where pivoting comes into play.
2) What Do You Mean by Unpivoting?
Unpivoting does the reverse: it transforms columns into rows. If you have a wide table with many columns that you'd rather represent as rows, unpivoting is your go-to operation.
In general, the two operations are used as follows.
Common Use Cases of PIVOT in Snowflake
- Data Summarization: Pivoting can help summarize data, turning detailed rows into summarized columns.
- Cross-Tabulation: Creating matrix formats for data representation.
- Data Reporting: Making data more readable and presentable for reports.
Common Use Cases of Snowflake UNPIVOT
- Data Normalization: Turning wide tables with redundant columns into a more normalized format.
- Data Preparation: Preparing data for tools or applications that require a specific row-based format.
- Data Cleaning: Simplifying complex datasets by reducing the number of columns.
Note: Snowflake UNPIVOT is NOT exactly the reverse of Snowflake PIVOT as it cannot undo aggregations made by PIVOT.
Now let's look at how Snowflake PIVOT and Snowflake UNPIVOT commands can be used to accomplish pivoting and unpivoting data.
What Does PIVOT Do in Snowflake?
Snowflake's PIVOT is a SQL operation used to transform rows into columns. It's particularly useful when you want to convert unique row values from one column into multiple columns in the output, aggregating data in the process. This operation is common in data analysis and reporting tasks.
Using PIVOT, the unique values from a specific column that were previously organized into separate rows can be rotated to align related data points into columns instead. This condenses the information and provides a more consolidated analytical view.
Syntax for Snowflake PIVOT
Snowflake PIVOT must be used within the FROM clause exclusively. When dealing with a subquery, it should be written within the subquery's FROM clause.
The syntax is:
SELECT ...
FROM ...
PIVOT (
<aggregate_function>(<pivot_column>)
FOR <value_column>
IN (<pivot_values>)
)
The key parameters are:
- aggregate_function: Aggregation you want to apply, such as AVG, COUNT, MAX, MIN, or SUM.
- pivot_column: Column whose values are aggregated to populate the new columns.
- value_column: Column whose distinct values become the names of the new columns.
- pivot_values: List of values in the value_column that you want to turn into new columns.
Aggregations Supported by PIVOT Command
Snowflake PIVOT command supports several built-in aggregate functions that can be used to combine and transform the data during the pivoting operation. These aggregate functions are:
1) AVG: Calculates the average value of the specified column for each group of rows. It can be used with numeric data types.
Example:
SELECT *
FROM table_name
PIVOT(AVG(pivot_column) FOR value_column IN (pivot_values))
2) COUNT: Counts the number of non-null values in the specified column for each group of rows. It can be used with any data type.
Example:
SELECT *
FROM table_name
PIVOT(COUNT(pivot_column) FOR value_column IN (pivot_values))
3) MAX: Returns the maximum value in the specified column for each group of rows. It can be used with numeric, string, date, and timestamp data types.
Example:
SELECT *
FROM table_name
PIVOT(MAX(pivot_column) FOR value_column IN (pivot_values))
4) MIN: Returns the minimum value in the specified column for each group of rows. It can be used with numeric, string, date, and timestamp data types.
Example:
SELECT *
FROM table_name
PIVOT(MIN(pivot_column) FOR value_column IN (pivot_values))
5) SUM: Calculates the sum of all values in the specified column for each group of rows. It can be used with numeric data types.
Example:
SELECT *
FROM table_name
PIVOT(SUM(pivot_column) FOR value_column IN (pivot_values))
Note that the aggregate function you choose should be compatible with the data type of the pivot column. For example, you cannot use the AVG function on a string column, or the SUM function on a date column.
Snowflake Pivot Examples
Consider a table containing a school's student grades for different subjects.
First, let's create a table called StudentGrades and insert some values into it.
CREATE TABLE StudentGrades (
StudentID INT,
Subject VARCHAR,
Grade INT
);
INSERT INTO StudentGrades (StudentID, Subject, Grade)
VALUES
(3, 'Math', 92),
(3, 'History', 89),
(3, 'Science', 94),
(3, 'English', 88),
(4, 'Math', 76),
(4, 'History', 84),
(4, 'Science', 82),
(4, 'English', 91),
(5, 'Math', 89),
(5, 'History', 87),
(5, 'Science', 90),
(5, 'English', 93),
(6, 'Math', 79),
(6, 'History', 85),
(6, 'Science', 88),
(6, 'English', 86),
(7, 'Math', 83),
(7, 'History', 88),
(7, 'Science', 85),
(7, 'English', 87),
(8, 'Math', 90),
(8, 'History', 86),
(8, 'Science', 89),
(8, 'English', 92),
(9, 'Math', 88),
(9, 'History', 87),
(9, 'Science', 86),
(9, 'English', 90),
(10, 'Math', 91),
(10, 'History', 85),
(10, 'Science', 87),
(10, 'English', 88);
Now, let's pivot the table so that the Math and History subjects each become a separate column.
SELECT * FROM StudentGrades
PIVOT (MAX(Grade) FOR Subject IN ('Math', 'History')) AS p;
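This returns one row per student with three columns: STUDENTID, 'Math', and 'History'. For example, the row for student 3 holds 92 for Math and 89 for History.
The pivot has placed each student's Math and History grades in separate columns; Science and English are left out because they are not in the IN list.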
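To make clear what PIVOT does here, the same result can be sketched with plain conditional aggregation. This is only an illustrative equivalent, not how Snowflake runs PIVOT internally; the output names math and history are chosen for this example.

```sql
-- Rough equivalent of the PIVOT query above, using conditional aggregation.
-- Every column that is not aggregated (here StudentID) acts as a grouping key.
SELECT
    StudentID,
    MAX(CASE WHEN Subject = 'Math'    THEN Grade END) AS math,
    MAX(CASE WHEN Subject = 'History' THEN Grade END) AS history
FROM StudentGrades
GROUP BY StudentID
ORDER BY StudentID;
```

Seeing the GROUP BY written out explains why PIVOT always needs an aggregate function: several source rows can map to the same output cell.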
Getting Column Names Without Quotes Using PIVOT
Remember that you can also include the column names in the AS clause if you prefer them without quotes, or if you wish to display different column names, like this:
SELECT * FROM StudentGrades
PIVOT (MAX(Grade) FOR Subject IN ('Math', 'History')) AS p (student_id, math, history);
How to Pivot Multiple Columns?
The Snowflake PIVOT function is powerful, but it's designed to pivot on a single aggregate function. What if you want to pivot on multiple aggregates, like both SUM and AVG? There's no direct way, but with the clever use of UNION, you can achieve this.
Let's say you have a table BookSales that tracks the number of books sold and the total revenue for different genres.
Step-by-Step Guide to Pivot Multiple Columns in Snowflake
Step 1—Create 'BookSales' Table
CREATE TABLE BookSales (
BookID INT,
Genre VARCHAR,
BooksSold INT,
Revenue FLOAT
);
Step 2—Inserting dummy data to 'BookSales' table.
INSERT INTO BookSales (BookID, Genre, BooksSold, Revenue)
VALUES
(1, 'Fiction', 100, 1500.00),
(2, 'Non-Fiction', 80, 1200.00),
(3, 'Fiction', 110, 1650.00),
(4, 'Non-Fiction', 85, 1275.00),
(5, 'Fiction', 105, 1575.00),
(6, 'Non-Fiction', 90, 1350.00),
(7, 'Fiction', 95, 1425.00),
(8, 'Non-Fiction', 88, 1320.00),
(9, 'Fiction', 108, 1620.00),
(10, 'Non-Fiction', 86, 1290.00);
Step 3—PIVOTing Multiple Columns in Snowflake
To pivot on both the SUM and the AVG of Revenue for each genre, you can use the following query, which runs one PIVOT per aggregate and stacks the two result rows with UNION ALL:
-- Total Revenue
SELECT 'Total Revenue' AS Metric, *
FROM (
SELECT Genre, Revenue
FROM BookSales
) AS SourceTable
PIVOT (
SUM(Revenue)
FOR Genre IN ('Fiction', 'Non-Fiction')
) AS PivotTable
UNION ALL
-- Average Revenue
SELECT 'Average Revenue' AS Metric, *
FROM (
SELECT Genre, Revenue
FROM BookSales
) AS SourceTable
PIVOT (
AVG(Revenue)
FOR Genre IN ('Fiction', 'Non-Fiction')
) AS PivotTable;
What Does UNPIVOT Do in Snowflake?
Snowflake UNPIVOT transforms columns into rows. It takes a name for the value column, a name for the name column, and a list of columns to unpivot, and for each input row it outputs one row per listed column.
Note that Snowflake UNPIVOT is not exactly the reverse of PIVOT, as it cannot undo aggregations made by the Snowflake PIVOT function.
Here's the basic syntax of the Snowflake UNPIVOT command:
SELECT ...
FROM ...
UNPIVOT (
<value_column>
FOR <name_column>
IN (<column_list>)
)
The key parameters are:
- value_column: Column that will hold the values from the unpivoted columns.
- name_column: Column that will hold the names of the unpivoted columns.
- column_list: List of columns you want to unpivot.
How to Use UNPIVOT in Snowflake?
Consider a hypothetical scenario where we track the quantity of fruits in two different stores.
1) Creating the Table and Inserting Dummy Values
We'll use a simple FruitInventory table with one quantity column per store: "StoreA" and "StoreB".
CREATE TABLE FruitInventory (
FruitName VARCHAR,
StoreA INT,
StoreB INT
);
Let's insert data for different fruits.
INSERT INTO FruitInventory (FruitName, StoreA, StoreB)
VALUES
('Apple', 100, 120),
('Banana', 80, 90),
('Cherry', 50, 45),
('Date', 30, 35),
('Fig', 40, 38),
('Grape', 60, 65),
('Honeydew', 20, 18),
('Kiwi', 25, 28),
('Lemon', 70, 75),
('Mango', 55, 50);
2) Using Snowflake UNPIVOT to unpivot the data
Now, let's UNPIVOT it!
To transform the FruitInventory table to have a row for each store's fruit quantity, use the following query:
SELECT FruitName,
column_name AS Store,
Quantity
FROM FruitInventory
UNPIVOT (Quantity FOR column_name IN (StoreA, StoreB)) AS u;
The result is a table with columns FruitName, Store, and Quantity, with one row per fruit per store (20 rows for the sample data).
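To show what UNPIVOT is doing, the query above can be approximated with UNION ALL, one branch per unpivoted column. This is a sketch for illustration only: the uppercase store labels and the NULL filters are assumptions made to mirror the operator's usual output.

```sql
-- Rough equivalent of the UNPIVOT query: one SELECT per source column.
SELECT FruitName, 'STOREA' AS Store, StoreA AS Quantity
FROM FruitInventory
WHERE StoreA IS NOT NULL
UNION ALL
SELECT FruitName, 'STOREB' AS Store, StoreB AS Quantity
FROM FruitInventory
WHERE StoreB IS NOT NULL;
```

The UNPIVOT form scales better: adding a third store means adding one name to the IN list instead of another UNION ALL branch.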
Bonus—Advanced Pivot and Unpivot Techniques in Snowflake
There are several more advanced Snowflake pivot and unpivot techniques that can be applied:
1) Dynamic Pivoting using Stored Procedures
In many cases, the pivot column values might not be known in advance. Snowflake supports dynamic pivoting directly: the IN list of PIVOT accepts the ANY keyword or a subquery in place of literal values. Where that is not an option, you can still use stored procedures or scripting outside of Snowflake to build the PIVOT query from the distinct values in the dataset.
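As a minimal sketch of pivoting without a hard-coded value list, the Snowflake documentation describes an ANY keyword and a subquery form for the IN clause. The queries below assume your account has this feature and reuse the StudentGrades table from earlier.

```sql
-- Pivot on every distinct Subject, with output columns ordered by subject name.
SELECT *
FROM StudentGrades
PIVOT (MAX(Grade) FOR Subject IN (ANY ORDER BY Subject));

-- Pivot only on the subjects returned by a subquery.
SELECT *
FROM StudentGrades
PIVOT (
    MAX(Grade)
    FOR Subject IN (
        SELECT DISTINCT Subject
        FROM StudentGrades
        WHERE Subject <> 'English'
    )
);
```

Because the output columns depend on the data, anything that consumes these results has to tolerate a schema that changes between runs.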
2) Pivoting on Multiple Aggregates
As we discussed above, while the Snowflake PIVOT function is designed for a single aggregate function, you can use UNION or JOIN operations to combine results from multiple PIVOT operations, each with a different aggregate.
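The JOIN variant can be sketched as follows, reusing the BookSales table. Each PIVOT is wrapped in a CTE and its output columns are renamed in the AS clause; the CTE and column names are chosen for this example only.

```sql
WITH totals AS (
    SELECT *
    FROM (SELECT Genre, Revenue FROM BookSales)
    PIVOT (SUM(Revenue) FOR Genre IN ('Fiction', 'Non-Fiction'))
        AS p (fiction_total, non_fiction_total)
),
averages AS (
    SELECT *
    FROM (SELECT Genre, Revenue FROM BookSales)
    PIVOT (AVG(Revenue) FOR Genre IN ('Fiction', 'Non-Fiction'))
        AS p (fiction_avg, non_fiction_avg)
)
-- Each CTE yields a single row here, so a CROSS JOIN places them side by side.
SELECT *
FROM totals
CROSS JOIN averages;
```

If the pivoted input kept a grouping column such as a region or a year, you would join the two CTEs on that column instead of using CROSS JOIN.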
3) Handling NULLs and Defaults
Sometimes, not all pivot column values will have corresponding data. In such cases, you can use the IFNULL or COALESCE functions to handle NULL values and provide default values.
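Two ways of supplying a default are sketched below against the StudentGrades table. 'Art' is a hypothetical subject with no rows in the sample data, added only to force NULLs; DEFAULT ON NULL is the clause named in the Snowflake PIVOT documentation and is assumed to be available in your account.

```sql
-- Option 1: let PIVOT substitute 0 wherever a cell would be NULL.
SELECT *
FROM StudentGrades
PIVOT (MAX(Grade) FOR Subject IN ('Math', 'Art') DEFAULT ON NULL (0));

-- Option 2: rename the pivoted columns, then apply COALESCE per column.
SELECT
    student_id,
    COALESCE(math, 0) AS math,
    COALESCE(art, 0)  AS art
FROM StudentGrades
PIVOT (MAX(Grade) FOR Subject IN ('Math', 'Art')) AS p (student_id, math, art);
```

Option 2 is more verbose, but it allows a different default for each column.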
4) Pivoting on String Aggregates
Beyond numerical aggregates like SUM() or AVG(), you can also pivot on string aggregates, such as LISTAGG, to concatenate strings from rows into a single column.
5) Unpivoting Multiple Groups of Columns
If you have a table with multiple sets of columns that need to be unpivoted into separate rows, you can use multiple UNPIVOT operations in combination with UNION.
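A minimal sketch of this pattern follows. It assumes a hypothetical FruitStock table (not part of this article's data) with two column groups: StoreA_Qty and StoreB_Qty as integers, and StoreA_Price and StoreB_Price as floats.

```sql
-- Unpivot the quantity group and the price group separately, then stack them.
-- Quantities are cast to FLOAT so both branches return the same value type.
SELECT FruitName, 'Quantity' AS Metric, Store, MetricValue
FROM (
    SELECT FruitName,
           StoreA_Qty::FLOAT AS StoreA,
           StoreB_Qty::FLOAT AS StoreB
    FROM FruitStock
)
UNPIVOT (MetricValue FOR Store IN (StoreA, StoreB))
UNION ALL
SELECT FruitName, 'Price' AS Metric, Store, MetricValue
FROM (
    SELECT FruitName,
           StoreA_Price AS StoreA,
           StoreB_Price AS StoreB
    FROM FruitStock
)
UNPIVOT (MetricValue FOR Store IN (StoreA, StoreB));
```

Renaming each group to the same StoreA and StoreB names inside the subqueries keeps the Store labels identical across both branches.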
6) Filtering During Unpivot
Sometimes, you might not want to unpivot all columns or all values. You can use WHERE clauses in conjunction with UNPIVOT to filter out specific rows or values during the unpivoting process.
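As a sketch, filtering can happen on either side of the UNPIVOT, shown here with the FruitInventory table. The thresholds and the excluded fruit are arbitrary examples.

```sql
-- Filter after unpivoting: keep only store/fruit pairs with at least 50 units.
SELECT FruitName, Store, Quantity
FROM FruitInventory
UNPIVOT (Quantity FOR Store IN (StoreA, StoreB))
WHERE Quantity >= 50;

-- Filter before unpivoting: restrict the source rows in a subquery.
SELECT FruitName, Store, Quantity
FROM (
    SELECT *
    FROM FruitInventory
    WHERE FruitName <> 'Date'
)
UNPIVOT (Quantity FOR Store IN (StoreA, StoreB));
```

To leave a column out entirely, simply omit it from the IN list.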
7) Unpivoting with Additional Data
If you need to include additional columns in the result that aren't part of the UNPIVOT operation, you can include them in the SELECT statement. This is useful for retaining context or additional data attributes.
8) Handling Aggregations during Unpivot
Because Snowflake PIVOT applies an aggregate function such as AVG(), COUNT(), MAX(), MIN(), or SUM(), unpivoting a pivoted result returns the aggregated values, not the original detail rows. If you need the detail rows, or a different aggregation, go back to the source table rather than unpivoting the pivoted output.
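A small round trip over the BookSales table illustrates the point. The CTE name and the column aliases are chosen for this sketch.

```sql
WITH pivoted AS (
    SELECT *
    FROM (SELECT Genre, Revenue FROM BookSales)
    PIVOT (SUM(Revenue) FOR Genre IN ('Fiction', 'Non-Fiction'))
        AS p (fiction, non_fiction)
)
-- Unpivoting yields one row per genre holding the summed revenue,
-- not the ten original BookSales rows.
SELECT Genre, TotalRevenue
FROM pivoted
UNPIVOT (TotalRevenue FOR Genre IN (fiction, non_fiction));
```

With the sample data this returns two rows, carrying the totals 7770 and 6435; the individual book rows cannot be recovered from them.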
9) Handling NULLs
By default, Snowflake UNPIVOT leaves out rows whose value is NULL, so NULLs in the source columns silently disappear from the result. If you keep them with the INCLUDE NULLS option, you can then use WHERE clauses to filter out specific rows or functions like IFNULL or COALESCE to provide default values.
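Two ways to keep rows for NULL quantities are sketched below. INCLUDE NULLS is taken from the Snowflake UNPIVOT documentation and is assumed to be available in your account; the COALESCE variant relies only on standard SQL.

```sql
-- Option 1: keep NULL quantities as rows (assumes INCLUDE NULLS is supported).
SELECT FruitName, Store, Quantity
FROM FruitInventory
UNPIVOT INCLUDE NULLS (Quantity FOR Store IN (StoreA, StoreB));

-- Option 2: replace NULLs with a default before unpivoting.
SELECT FruitName, Store, Quantity
FROM (
    SELECT FruitName,
           COALESCE(StoreA, 0) AS StoreA,
           COALESCE(StoreB, 0) AS StoreB
    FROM FruitInventory
)
UNPIVOT (Quantity FOR Store IN (StoreA, StoreB));
```

Option 2 changes the data (a missing count becomes 0), so use it only when a default is actually meaningful.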
What Is the Difference Between Snowflake PIVOT and UNPIVOT?
Snowflake PIVOT | Snowflake UNPIVOT |
Snowflake PIVOT rotates a table by turning unique values from one column into multiple columns, effectively transforming rows into columns. | Snowflake UNPIVOT rotates a table by turning a set of columns into name/value rows, effectively transforming columns into rows. |
Converts a narrow table (e.g., empid, month, sales) into a wider table (e.g., empid, jan_sales, feb_sales, mar_sales). | Converts a wide table (e.g., empid, jan_sales, feb_sales, mar_sales) into a narrower table (e.g., empid, month, sales). |
Syntax: SELECT ... FROM ... PIVOT (<aggregate_function>(<pivot_column>) FOR <value_column> IN (<pivot_values>)) | Syntax: SELECT ... FROM ... UNPIVOT (<value_column> FOR <name_column> IN (<column_list>)) |
Requires an aggregate function (AVG, COUNT, MAX, MIN, SUM) to combine the grouped values from the pivot column. | Does not require an aggregate function, as it simply redistributes data into a new format. |
Supports the DEFAULT ON NULL clause to specify a default value for NULL values in the pivoted result. | Leaves out rows with NULL values by default; the INCLUDE NULLS option keeps them. |
Aggregates the source rows, so the original detail rows cannot be recovered from the pivoted output. | Can rotate pivoted columns back into rows, but returns the aggregated values rather than the original rows. |
Useful for transforming data from a long format to a wide format, especially for reporting or analytical purposes. | Useful for transforming data from a wide format to a long format, which can be more suitable for certain data processing tasks or for normalizing data. |
Conclusion
And that's a wrap! Pivoting and unpivoting are integral techniques for Snowflake data transformation and analysis. Snowflake PIVOT and UNPIVOT provide powerful functionality to pivot and unpivot data with simple SQL syntax.
In this article, we covered:
- Snowflake PIVOT / UNPIVOT syntax and parameters
- Common use cases for Snowflake PIVOT and UNPIVOT
- Pivoting/unpivoting example queries
- Advanced Snowflake Pivot and Unpivot Techniques
- Difference between Snowflake PIVOT and UNPIVOT
Pivoting and unpivoting data can help you reshape your data stored in tables for analysis and business intelligence needs.
FAQs
What is pivoting?
Pivoting is the process of rotating data from a row format to a column format.
What is unpivoting?
Unpivoting is the process of transforming data from a column format to a row format.
What is pivot in SQL?
In SQL, the PIVOT operation (sometimes called pivot table or crosstab) rotates rows into columns in a similar way to the Snowflake PIVOT function.
What does pivot do in Snowflake?
Snowflake PIVOT function is used to transform data from a row-level format to a columnar format, allowing for a more structured and readable presentation of data.
What is the opposite of pivot in Snowflake?
The opposite operation of PIVOT in Snowflake is UNPIVOT, which transforms columns back into rows.
How does the Snowflake UNPIVOT function differ from Snowflake PIVOT?
The Snowflake PIVOT function transforms data from rows to columns, while the UNPIVOT function does the opposite, converting data from a columnar format back to a row-level format.
How do you unpivot data in Snowflake SQL?
To unpivot data in Snowflake SQL, use UNPIVOT in the FROM clause and specify the value_column, name_column, and column_list parameters.
Can you rename the output column names in Snowflake PIVOT?
Yes. To use different output column names, or names without quotes, list the column names in the AS clause.
Can Snowflake UNPIVOT undo aggregations made by Snowflake PIVOT?
No, Snowflake UNPIVOT cannot undo aggregations made by Snowflake PIVOT.
How do you order the results after using the Snowflake PIVOT function?
After using Snowflake PIVOT function, you can order the results using the ORDER BY clause followed by the desired column name.
Can I use multiple aggregate functions in a Snowflake PIVOT query?
Yes, but it requires structuring your query carefully and might involve multiple Snowflake PIVOT operations.
What is dynamic pivoting in Snowflake?
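Dynamic pivoting refers to pivoting when the pivot column values are not known in advance. In Snowflake this can be done with the ANY keyword or a subquery in the IN list, or by generating the query with stored procedures or external scripting.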
Is there a performance difference between PIVOT and UNPIVOT?
Performance depends largely on the dataset's size and structure. PIVOT also aggregates while UNPIVOT only reshapes, so measure both on your own workload rather than assuming they cost the same.
How do I retain additional data columns when using the UNPIVOT function?
When using UNPIVOT, you can include additional columns in the SELECT statement to retain them in the result, providing context or additional data attributes.
How can you unpivot multiple groups of columns in Snowflake?
You can use multiple UNPIVOT operations combined with UNION to unpivot multiple groups of columns.
Can you filter data during unpivoting in Snowflake?
Yes, you can use WHERE clauses in conjunction with UNPIVOT to filter out specific rows or values during unpivoting.