Tip of the Week

Showing Distinct Rows in your Data Table

One of the most commonly asked questions I get in training is how to show distinct rows inside a Data Table.  People need the full set of rows for most of the analysis, but may also want to be able to display a table with just distinct values.

Assume you have a simple data set like the following:

And in at least one location, you want to show only distinct rows, as shown below:

What you need to do is create a new Data Table , which takes your original Data Table as the source and then pivots it.

To start, you goto File > Add Data Table.

 

 

You will then choose the source type to be ‘Existing Data Table in my Analysis’ (shown as Step 1 in the figure above), then if you have more than one Data Table loaded, make sure you select the right Data Table (shown in Step 2 in the figure above), and then select the  ‘Pivot’ Transformation and click ‘Add…’ (shown in Step 3 in the figure above). The Pivot Data dialog will appear.

 

The ‘Row identifier’ should be the first column in the data set, (or the key column).

The ‘Column titles’ and ‘Values and aggregation methods’ should both be ‘None’.

The ‘Transfer columns and aggregation methods’ should list all the remaining columns in the original Data Table, with the expression set to UniqueConcatentate for all columns.

You can then set the ‘Transfer column naming pattern’ to  %T (which just references the Column name and no aggregation method).

What this does is not really pivot the data, rather we use the pivot tool to transfer over all the columns in our original Data Table, but we apply a UniqueConcatenate aggregation to all the transfer columns. This will create the distinct effect we are looking for.

 
When you are done, you can click ‘OK’ to close the Pivot Data dialog and then ‘Finish’ to close the Add Data Table dialog.

When you are done, you will see a new data table added that has just distinct row.

You can also add a Data Relation so that the two Data Tables are related. This will allow users to mark rows in one of the data tables , and also mark related rows in the second Data Table. This is shown below where the user marked a row in the ‘Distinct’ Data Table, and then the related rows were also marked in the ‘Original Data Table’


Published Dec 06 2009, 08:57 PM by Kevin Hanegan
Filed under: ,

Comments

 

hvmarck said:

Whenever I need to show distinct rows only I simply use a Cross Table, using all the columns on the vertical axis, leaving the Horizontal Axes and the Cell values empty. For this data set this results in exactly the same thing.

In case a column in your data set contains different values for a given value in the key column, e.g. if your data looks like this

1 a c d

1 a c e

2 b f e

3 c r t

4 a f g

this can be done with a custom expression for the Vertical Axis of the Cross Table:

<[Column 1] NEST UniqueConcatenate([Column 2]) OVER ([Column 1]) as [Column 2] NEST UniqueConcatenate([Column 3]) OVER ([Column 1]) as [Column 3] NEST UniqueConcatenate([Column 4]) OVER ([Column 1]) as [Column 4]>

December 8, 2009 7:16 AM
 

Kevin Hanegan said:

Thanks for the comment. You are spot on, we even have an upcoming related tip that discussed the cross table approach. The only difference is some customer may want to use the raw data from the newly created data table in this approach in other visualizations or other ways than just the cross table.  However, if you just want to display it in a table, the cross table approach , as you said, is the same....great lead in for the future tips, as well, so we appreciate it:)

Kevin

December 8, 2009 10:50 AM
 

Showing Distinct Rows in your Data Table - Tip of the Week Get Pivot said:

Pingback from  Showing Distinct Rows in your Data Table - Tip of the Week Get Pivot

December 8, 2009 11:52 PM
 

Christof said:

Hi,

if you want to find out how often a column is in the dataset and still get them unique you can do the following:

Concatenate all columns into one (column name: [key]):

concatenate([Number2],[Number1],[Text2],[Text1])

Then count the keys "per" key (column name: [Duplicates]:

count([key]) OVER ([key])

Find the first line for each [key]:

if((rank(baserowid(),"asc",[key]))=1,"unique","duplicates")

You will get a new filter for to filter out duplicates.

Kind regards,

Christof

December 15, 2009 4:26 PM
 

Changing your Habits to Build Muscle | Bodybuilding Fitness Wisdom said:

Pingback from  Changing your Habits to Build Muscle | Bodybuilding Fitness Wisdom

December 18, 2009 2:25 AM
 

uggkensington said:

http://www.uggkensingtons.org/

http://www.uggkensingtons.org/ugg-fox-fur-short-boots-5531-c-54.html | UGG Fox Fur Short Boots 5531

http://www.uggkensingtons.org/ugg-kensington-boots-5678-c-2.html | UGG Kensington Boots 5678

http://www.uggkensingtons.org/ugg-rainier-eskimo-boots-5189-c-85.html | UGG Rainier Eskimo Boots 5189

http://www.uggkensingtons.org/ugg-retro-cargo-boots-1895-c-3.html | UGG Retro Cargo Boots 1895

December 27, 2011 8:17 PM
 

uggkensington said:

http://www.uggkensingtons.org/

http://www.uggkensingtons.org/ugg-fox-fur-short-boots-5531-c-54.html | UGG Fox Fur Short Boots 5531

http://www.uggkensingtons.org/ugg-kensington-boots-5678-c-2.html | UGG Kensington Boots 5678

http://www.uggkensingtons.org/ugg-rainier-eskimo-boots-5189-c-85.html | UGG Rainier Eskimo Boots 5189

http://www.uggkensingtons.org/ugg-retro-cargo-boots-1895-c-3.html | UGG Retro Cargo Boots 1895

December 27, 2011 8:17 PM
 

uggskensingtonsale said:

www.uggskensingtonsale.org

www.uggskensingtonsale.orgugg-adirondack-boots-ii-c-6.html | UGG Adirondack Boots II

www.uggskensingtonsale.orgugg-adirondack-tall-boots-c-11.html | UGG Adirondack Tall Boots

www.uggskensingtonsale.orgugg-amberlee-boots-c-34.html | UGG Amberlee Boots

www.uggskensingtonsale.orgugg-annabelle-boots-c-35.html |

December 28, 2011 4:31 AM
 

uggretrocargosale said:

January 4, 2012 8:52 PM

Leave a Comment

(required) 
(optional)
(required) 
Submit

Syndication

Tags

Other Spotfire Blogs

Spotfire's interactive information visualization and analytic solutions give users a remarkable experience for quickly and easily querying data and reporting results for superior business intelligence. From portfolio management and customer retention programs to key processes such as CRM, marketing, research, bioinformatics, yield and asset management and design for manufacturing, enterprises around the world rely on Spotfire's business analytics software to improve operational performance.

©Copyright 2000-2011 TIBCO Software Inc | Privacy Policy | Terms of Use I Blog I Contact Us I Content Center