Tip of the Week

Unpivoting and Pivoting your Data to make it suitable for analysis

Data preprocessing describes the concept of processing raw data to prepare it for analysis. The preprocessing will transform the data into a format that will be more easily and effectively processed for the purpose of the data analysis. There are a number of different tools and methods used for preprocessing.  Some people will perform preprocessing in a database, others will perform it in a 3rd party application, and yet another solution is to use Spotfire to both pre process your raw data and also analyze the output.

Even though Spotfire is quite good at supporting large scale enterprise-wide analytical applications, it is also useful for smaller-scale analysis.  Typically, in this scenario, the data that needs to be quickly analyzed is not in a suitable form, but there is too much overhead pre-processing them in another application.  This is where some of the built-in capabilities come in handy. Spotfire provides a variety of capabilities to pre-process data before (and after) it is loaded inside Spotfire.   One such capability we will look at in this article is unpivoting and pivoting.

 
Assume you have survey data which asks a series of questions where responders can choose one of many possible responses.  When the data is collected, it may be stored in a data table as shown below:

 

However, we want to analyze the distribution of results for each individual question. To do this, we need to transform the data.  But how? This is actually a two step process.

The first process is we need to normalize the data so that we have a single column for the questions asked in the survey and then another single column for the responses. We do this by using the ‘unpivot’ transformation. We select the questions we want to analyze as the ‘Columns to transform’ as shown in the screen shot below:


 

 

At this point we have the normalized data we need. The next step is we need to then pivot the data to have a row for each unique question which has fields showing the count for each response for the question.

You can do this using the ‘pivot’ transformation to pivot as shown below.

 

 

Using TIBCO Spotfire Professional, you can combine both the unpivot and pivot into the data access process so this process happens before the data is even loaded into TIBCO Spotfire (or you can also do this to the raw data set after it is loaded in TIBCO Spotfire so you can have both the raw data and the transformed data into Spotfire at once):

 

 

The final result is one row for each question and a column for each response:

 

 

If you have many different questions with different responses, then you should unpivot and pivot each grouping so they end up as a different data table in Spotfire.

 

Comments

 

Unpivoting and Pivoting your Data to make it suitable for analysis … Get Pivot said:

Pingback from  Unpivoting and Pivoting your Data to make it suitable for analysis … Get Pivot

February 21, 2010 11:53 PM
 

georg said:

Thank you for the case study. As usual, it is a good use case, well presented. However, I personally try to avoid doing these kind of operations in Spotfire and try to do this before I load the data, for example inside a relational database. This can indeed be a lot tricker than doing it in Spotfire but the problem with Spotfire is (unless I am missing something) that you have to perform all the operations again if you want to replace the table. For anything except the simplest analyses, I find that I very rarely get everything right from the beginning, there is almost always a column that you want to include later on, or a piece of data that you want to transform after you have started building your analysis, or perhaps you made a mistake with the logic of your initial query, or the requirements slightly change at the data level.

If you have performed complex post-loading operations in Spotfire, like a lot of pivoting or using calculated columns, you need to do it all over again which can be very fiddly and you may not remember all the steps you did in the first place. Therefore, it would be good if you could "save" the operations that you do post-loading. This would be a very good improvement for a future release (forgive me if this is already possible and I missed it, how can this be done ?).

Georg Richter

February 23, 2010 3:45 AM
 

sumeetg said:

I agree. In fact, for operations such as "Change and replace Column", if someone later wants to see the actual operation performed, it is difficult to see what was actually done.

Sometimes, using Pivots at the information link level can help in creating a data table that can be re-used in limited use cases if you are not doing complex data transformations later on (Pivots in the information link can be changed).

It would definitely be a good improvement, if you could modify the transformation steps for previously created data tables - I know that it would save me a lot of re-work !

-Sumeet

February 23, 2010 5:20 PM
 

Crocs Women's MLB Beach | Ladies Casual Shoes | Shoes Beauty Wisdom said:

Pingback from  Crocs Women's MLB Beach | Ladies Casual Shoes | Shoes Beauty Wisdom

March 2, 2010 8:37 PM
 

First computer - First invented - Computer first invented said:

Pingback from  First computer - First invented - Computer first invented

August 13, 2010 4:45 PM
 

Web site of the state - State web - Texas state government web site said:

Pingback from  Web site of the state - State web - Texas state government web site

August 23, 2010 4:51 PM
 

Construction in management project - Management case - Construction management project management said:

Pingback from  Construction in management project - Management case - Construction management project management

September 15, 2010 2:32 AM
 

jplee said:

I am totally bummed out that you cannot go back in and modify a data transformation - that's exactly what I need to do, and recreating the transforms from scratch is not a workable option.

Filtering experiences a similar problem - you don't know what the state of filtering is at any time, which is exacerbated by having filtering schemes and bookmarks.

Can we please have this data transformation issue addressed?

thanks,

JPL

November 2, 2010 12:59 PM
 

jplee said:

Oh, one more thing - I'd like to change the data type while pivoting - not all the columns are the same type. Why do we need to add another data type change afterwards? And does this have an effect on performance? I'm teasing out 500K rows as it is, and that will grow...

JPL

November 2, 2010 1:01 PM
 

Renger said:

The ability to change the pivot settings is indeed essential for this to work. I often receive additional data (columns) and then need to redo the pivoting + add the calculated columns from scratch. That's a lot of work. I know that the clipboard shows the steps that have been taken to get to the pivot table and it can be redone but it's still a hassle.

December 6, 2011 12:26 PM
 

uggkensington said:

http://www.uggkensingtons.org/

http://www.uggkensingtons.org/ugg-fox-fur-short-boots-5531-c-54.html | UGG Fox Fur Short Boots 5531

http://www.uggkensingtons.org/ugg-kensington-boots-5678-c-2.html | UGG Kensington Boots 5678

http://www.uggkensingtons.org/ugg-rainier-eskimo-boots-5189-c-85.html | UGG Rainier Eskimo Boots 5189

http://www.uggkensingtons.org/ugg-retro-cargo-boots-1895-c-3.html | UGG Retro Cargo Boots 1895

December 27, 2011 8:17 PM
 

uggskensingtonsale said:

www.uggskensingtonsale.org

www.uggskensingtonsale.orgugg-adirondack-boots-ii-c-6.html | UGG Adirondack Boots II

www.uggskensingtonsale.orgugg-adirondack-tall-boots-c-11.html | UGG Adirondack Tall Boots

www.uggskensingtonsale.orgugg-amberlee-boots-c-34.html | UGG Amberlee Boots

www.uggskensingtonsale.orgugg-annabelle-boots-c-35.html |

December 28, 2011 4:22 AM
 

uggretrocargosale said:

January 4, 2012 8:51 PM

Leave a Comment

(required) 
(optional)
(required) 
Submit

Syndication

Other Spotfire Blogs

Spotfire's interactive information visualization and analytic solutions give users a remarkable experience for quickly and easily querying data and reporting results for superior business intelligence. From portfolio management and customer retention programs to key processes such as CRM, marketing, research, bioinformatics, yield and asset management and design for manufacturing, enterprises around the world rely on Spotfire's business analytics software to improve operational performance.

©Copyright 2000-2011 TIBCO Software Inc | Privacy Policy | Terms of Use I Blog I Contact Us I Content Center