Automating Pivot Queries
Suppose you want to create a result set where the columns of a table become the rows or the rows become the columns. In other words, you want to pivot your table, transforming columns into rows or rows into columns. How do you do that? It turns out it is quite easy using SQL. In this post I give a PL/SQL procedure that automates the creation of pivot queries to pivot columns into rows.
Tom Kyte has a great post at AskTom describing pivot queries (link). He shows how to pivot columns into rows and vice-versa using SQL queries. He also provides a PL/SQL procedure for rows to columns pivoting that returns a refcursor. Because I need to pivot columns into rows quite often when working with Bioinformatics data, I thought that it would be a good idea to automate the whole approach. The utility presented here (pivotc2r) complements that provided in AskTom. It has the following features:
- Automatic pivoting of columns into rows instead of rows into columns
- Automatic determination of number of columns to pivot
- Transform only columns of a given data type
- Support, through nested DECODE statements, as many columns as allowed by the RDBMS
- Generation of a table or view instead of a refcursor for output
SELECT deptno, job, ename, sal FROM emp;We would like to transform the data so that we keep ENAME and JOB as is and pivot SAL and DEPTNO into rows. The procedure pivotc2r does that and has the following specification:
DEPTNO JOB ENAME SAL
---------- --------- ---------- ----------
20 CLERK SMITH 800
30 SALESMAN ALLEN 1600
30 SALESMAN WARD 1250
20 MANAGER JONES 2975
30 SALESMAN MARTIN 1250
30 MANAGER BLAKE 2850
10 MANAGER CLARK 2450
20 ANALYST SCOTT 3000
10 PRESIDENT KING 5000
30 SALESMAN TURNER 1500
20 CLERK ADAMS 1100
30 CLERK JAMES 950
20 ANALYST FORD 3000
10 CLERK MILLER 1300
14 rows selected.
CREATE OR REPLACE PACKAGE pivot_pkg ASThe first two arguments are the name for the input and output tables or views. The argument P_OUT_TYPE indicates if the output should produce a table ('TABLE') or a view ('VIEW'). The next argument, P_DATA_TYPE, states the data type (default NUMBER) of the columns to be pivot. All columns of this type are pivoted with the exception of those in the P_ANCHOR argument. The P_ANCHOR argument is an array of column names. It indicates the columns that should stay CROSS RECORD (down the page). For our example P_ANCHOR = ('ENAME','JOB').
TYPE array IS TABLE OF VARCHAR2(30);
TYPE refcursor IS REF CURSOR;
PROCEDURE pivotc2r( p_input_name IN VARCHAR2,
p_output_name IN VARCHAR2,
p_out_type IN VARCHAR2 DEFAULT 'VIEW',
p_data_type VARCHAR2 DEFAULT 'NUMBER',
p_anchor IN array DEFAULT NULL,
p_anchor_label IN array DEFAULT NULL,
p_pivot_label IN VARCHAR2 DEFAULT 'COL_NAME',
p_value_label IN VARCHAR2 DEFAULT 'VALUE');
END;
/
The next three arguments allow assigning new names to the output columns. P_ANCHOR_LABEL is an array with the same number of elements as the P_ANCHOR array. If NULL, the original anchor column names are kept. P_PIVOT_LABEL is the name for the column in the output indicating the name of the original columns. P_VALUE_LABEL is the name of the column holding the values of the original columns.
Before describing the code, let's take a look at an example. First we create a view of the table EMP with the columns we want in the output:
CREATE VIEW emp1 ASNext we execute the procedure:
SELECT ename, job, deptno, sal
FROM emp;
DROP VIEW outemp1;As illustrated in the above results, the procedure automatically pivoted SAL and DEPTNO into rows by combining their values into a single column (VALUE). Another column (COLUMN_NAME) indicates the name of the original column for each value. The implementation of the package body is quite simple:
BEGIN
pivot_pkg.pivotc2r
( p_input_name => 'EMP1',
p_output_name => 'OUTEMP1',
p_anchor => pivot_pkg.array('ENAME','JOB'));
END;
/SQL> SELECT * FROM outemp1;
ENAME JOB COL_NA VALUE
---------- --------- ------ ----------
SMITH CLERK DEPTNO 20
ALLEN SALESMAN DEPTNO 30
WARD SALESMAN DEPTNO 30
JONES MANAGER DEPTNO 20
MARTIN SALESMAN DEPTNO 30
BLAKE MANAGER DEPTNO 30
CLARK MANAGER DEPTNO 10
SCOTT ANALYST DEPTNO 20
KING PRESIDENT DEPTNO 10
TURNER SALESMAN DEPTNO 30
ADAMS CLERK DEPTNO 20
JAMES CLERK DEPTNO 30
FORD ANALYST DEPTNO 20
MILLER CLERK DEPTNO 10
SMITH CLERK SAL 800
ALLEN SALESMAN SAL 1600
WARD SALESMAN SAL 1250
JONES MANAGER SAL 2975
MARTIN SALESMAN SAL 1250
BLAKE MANAGER SAL 2850
CLARK MANAGER SAL 2450
SCOTT ANALYST SAL 3000
KING PRESIDENT SAL 5000
TURNER SALESMAN SAL 1500
ADAMS CLERK SAL 1100
JAMES CLERK SAL 950
FORD ANALYST SAL 3000
MILLER CLERK SAL 1300
28 rows selected.
CREATE OR REPLACE PACKAGE BODY pivot_pkg ASThe procedure does mainly string manipulation for creating the dynamic SQL for the transformation. First, it obtains the names of the columns in the input that match the specified data type. Anchor columns are filtered out. The remaining columns are the ones that are pivoted. The rest of the code implements the pivot approach described in AskTom. If the number of columns exceeds 127, the code creates a new DECODE statement nested within the previous one. This is done to avoid exceeding the maximum (255) number of arguments allowed in a single DECODE statement. Each column contributes with two arguments to the DECODE statement
PROCEDURE pivotc2r( p_input_name IN VARCHAR2,
p_output_name IN VARCHAR2,
p_out_type IN VARCHAR2 DEFAULT 'VIEW',
p_data_type VARCHAR2 DEFAULT 'NUMBER',
p_anchor IN array DEFAULT NULL,
p_anchor_label IN array DEFAULT NULL,
p_pivot_label IN VARCHAR2 DEFAULT 'COL_NAME',
p_value_label IN VARCHAR2 DEFAULT 'VALUE')
AS
v_cursor_col refcursor;
v_cols array;
v_sql long;
v_sql1 long;
v_sql2 long;
v_max_cols NUMBER;
v_parens NUMBER := 1;
v_max_decode NUMBER := 127;
BEGIN
-- Get column name for the input table...
--
v_sql := 'SELECT column_name FROM user_tab_columns ' ||
'WHERE table_name=:1 AND data_type=:2 ';
-- Filter the anchor columns
IF (p_anchor IS NOT NULL) THEN
v_sql := v_sql || ' AND column_name NOT IN(''' ||
p_anchor(1) || '''';
FOR i IN 2..p_anchor.count LOOP
v_sql := v_sql || ',''' || p_anchor(i)||'''';
END LOOP;
v_sql := v_sql || ')';
END IF;
EXECUTE IMMEDIATE v_sql BULK COLLECT INTO v_cols
USING p_input_name, p_data_type;
v_max_cols := v_cols.count;
-- Now, construct the query to pivot the columns into
-- rows.
v_sql :='CREATE ' || p_out_type || ' ' || p_output_name ||
' AS SELECT ';
-- Add anchor columns and get maximum number of columns
IF (p_anchor IS NOT NULL) THEN
FOR i IN 1..p_anchor.count LOOP
v_sql := v_sql || p_anchor(i);
IF (p_anchor_label IS NOT NULL) THEN
v_sql := v_sql || ' ' || p_anchor_label(i) ||',';
ELSE
v_sql := v_sql || ',';
END IF;
END LOOP;
v_sql1 := rtrim(v_sql1,',');
END IF;
-- Pivot the columns
v_sql1 := ' DECODE(a.r, ';
v_sql2 := ' DECODE(a.r, ';
FOR i IN 1..v_cols.count LOOP
IF (MOD(i,v_max_decode) = 0) THEN
v_sql1 := v_sql1 ||' DECODE(a.r, ';
v_sql2 := v_sql2 ||' DECODE(a.r, ';
v_parens := v_parens + 1;
END IF;
v_sql1 := v_sql1 || i ||',''' || v_cols(i) || ''',';
v_sql2 := v_sql2 || i ||',' || v_cols(i) || ',';
END LOOP;
v_sql1 := rtrim(v_sql1,',');
v_sql1 := rpad(v_sql1,v_parens+length(v_sql1),')') ||' '
||p_pivot_label;
v_sql2 := rtrim(v_sql2,',');
v_sql2 := rpad(v_sql2,v_parens+length(v_sql2),')') ||' '
||p_value_label;
v_sql := v_sql || v_sql1 || ',' || v_sql2;
-- Add the original data
v_sql := v_sql ||
' FROM (SELECT * FROM ' || p_input_name;
-- Add the counter to drive the decode statements
v_sql := v_sql || '),' ||
'(SELECT rownum r FROM all_objects ' ||
' WHERE rownum <= ' || v_max_cols || ') a';
EXECUTE IMMEDIATE v_sql;
END pivotc2r;
END;
/
SHOW ERRORS;
The above procedure can be easily modified to take a refcursor or a query string as an input. In this case, we can use the DBMS_SQL package to handle the unknown number of columns in the input data. It can also be made into a pipelined table function.
EDIT: The maximum number of arguments allowed in a single decode, counting all arguments individually, is 255. I have changed the text to reflect this and the code as well. The only code change required was to initialize the v_max_decode variable to 127.
Hi on db 10.2 the package/procedure pivotc2r is compiled without errors.
When using the procedure to create outemp1 out of the emp1 view, we got :
BEGIN
*
ERROR at line 1:
ORA-00905: missing keyword
ORA-06512: at "SYSTEM.PIVOT_PKG", line 78
ORA-06512: at line 2.
Any Idea?
Posted by Anonymous | 11/09/2006 05:46:00 AM
I have just tried it out on a newly installed 10.2.0.1 database and it worked okay.
Make sure that you are able to create the header, the body, and the input view without a problem. Then run the PL/SQL snippet in the blog that invokes the procedure.
Posted by Marcos | 11/10/2006 06:28:00 PM