Un-Pivot Datasets

[Table of Contents]

Data Academy's Un-Pivot Datasets can be used to re-normalize pivoted data in your Data Warehouse. This process is simple, but it can seem intimidating at first, so we've created a five-step guide to help you through it.

Take the following example - a table showing annual sales by region:

yearnorthsouth
2005
10
20
2006
20
30
2007
20
40

Our aim here is to re-normalize this data thus:

yearregionsales
2005
North
10
2005
South
20
2006
North
20
2006
South
30
etc...


1. Create an Un-Pivot Dataset

Drill into your Transformation Space, and add DS_SalesByRegion as an "UnPivot" Dataset:


2. Set The Column-To-Value1 Properties

The "Column-To-Value Name" is the name of the output field which will contain the column identifiers from the source. In our case, the columns "North" and "South" are to be transformed into an output field called "Region". Set the "Column-To-Value Name" property to "Region", and save the Dataset. (You can also set the Column-To-Value Type, if your source column headers contain numeric data, such as years.)


3. Add Pass-Through Field(s) & Source Primary Key

The first field we are interested in is "Year" which we are going to pass-through straight from the source to the target. Select "Year" from the fields drop-down, and click "Add".

You will be taken straight to the Field Details, where you need to identify this field as the Source Primary Key. (It is always required to identify the Source Primary Key when running an un-pivot.)


4. Add Un-Pivot Columns

Back in the Dataset details, add the two un-pivot columns - North and South. We need to carry out the same operations for both, but Data Academy will help you by defaulting the previously used values when you come to add the second of them! Select "North" and click "Add".

In the detail screen, we need to do two things: ensure that "Pass Through Field" is un-checked; and set the "Target Field". The "Target Field" is the column which will contain the values from the source data - in our case, the number of sales - so ensure the value is set to "Sales".


Our completed field list should look like this. Note that the Source Primary Key is identified by a red icon!


5. Build!

You are now ready to build your Transformation Space! The resulting data should look like this:


More Complex Scenarios

In some cases, you will find yourself with a more complex scenario, where you have more than one column-type to un-pivot - your source table may contain "Sales Value North" and "Sales Quantity North", which need to be un-pivoted into fields called "Value" and "Quantity" respectively.

In this instance, you will need to set the Target Field to Quantity/ Value; and also set the Column-To-Value2 (which will default to the source column name - ie. "Sales Quantity North") to just "North".


Un-Pivoting Multiple (or Unknown) Fields

As of Data Academy 5.5, you can un-pivot multiple (or unknown) fields with a single field definition, by using Regular Expressions. To achieve this, you need to add an un-pivot field in the normal way, but set the "RegEx Match" property. If this property is set, Data Academy will also evaluate the Column-To-Value property as a Regular Expression.


This example matches columns that end in " Total" - e.g. "Aberdeen Total", "Birmingham Total", etc. - note the space before "Total".

RegEx Match: .+Total$

Column-To-Value: .+[^ Total]

To match columns that start with "Total" - e.g. "Total Aberdeen", "Total Birmingham", etc.

RegEx Match: ^Total

Column-To-Value: [^Total ].+

Because columns are processed in order, you can match any column that has not already been matched - e.g. "Aberdeen", "Brimingham" etc, with no qualifier.

RegEx Match: .+

Column-To-Value: .+

(You should be very careful with this last option - it really will attempt to un-pivot everything that is either not a pass-through field, or has not previously been matched.)

Additional Notes

  • In all cases where multiple un-pivot operations are carried out, the Column-to-Value results must match exactly, or you will get no data (zero rows) returned in your dataset.

 —

(1) In earlier versions of Data Academy, "Key Value Column".

(2) In earlier versions of Academy, "Pivot Value".