Tip of the Week

Performing Logical Comparisons of Data

There are times when you want to create a visualization that will show the relationship of various sets of data. Some of this can be done using a details visualizations, but it is also nice to be able to do this in a single visualization. For this, we will use the Pie Chart.

Let’s assume we are analyzing stock data.  Inside our data table, we have a column called Signal, which is an indicator from analysts whether the stock should be bought or sold and also a column called Trend, which describes the current trend of the stock, whether it is going up (designated with a + sign) or down  (designated with a – sign).  If we would like to do a comparison of both of these columns to see their relationships, we can create a Pie chart which colors based off both the Signal and Trend columns.
 


When we do that, you can see the finished Pie Chart:


This chart shows us that 54.2% of the stocks signaling as Sell are on the down trend, and only 19% are on an upward trend. For stocks signaling as Buy, it is about even split between downward and upward trends.

What if your data is not structured properly to do this type of comparison? For example what if there was no column in your raw data that comprised what you wanted to compare? You do have a couple options.

First you can create an expression on one of the color axis. For example, let’s assume we are looking at the same stock data and we want to compare the Signal (which is in our dataset) and also which stocks have the highest volume.  We do have a column in the data table which is for the stocks Avg Volume, but its not categorical data, its continuous.  We can use an expression to make it categorical based off some logic.  We can say we want all stocks with average volumes over the UOF (Upper Outer Fence) to be labeled as Yes and then all others as No.

To do this we right click on one of the axis in the Color legend, and select Custom Expression.



In the expression dialog, we want to use the following expression:

  if(([Avg Vol])>=(UOF([Avg Vol])),"Yes","No") As [Highest Volumn]

Since this color axis also has another column in it, the Signal column, we will need to use the NEST keyword.  For more information on the NEST keyword, see the link on it to the help file or consider taking our Computational Analytics training course to master the entire expression language.

The final expression will look like the following:



 
And the resulting Pie Chart looks like the following:
 

Let’s show another example from another domain.  This data set contains generic sales values for 'My Product' and Total Market sales.  Customers are identified with a numeric CUSTOMER CODE, and the customers are divided into four different Class groups based on corporate opinion on how the customer should be approached.  In addition, the Potential of each customer is rated as High or Low, and information about the location (country and global region) of each customer is provided.



 
When then may want to identify ‘key customers’ as those who are higher than one standard deviation above the average series value.  For this , we can create an expression , either as a Calculated Column or inside the Color Axis:

if(([My Product Sales])>(Avg([My Product Sales]) + StdDev([My Product Sales])),"My Key Customers","No") As [My Product Customers]


We can then do the same thing for Total Market Sales instead of My Product Sales to get those sales which are higher than one standard deviation above the average series value for all the market sales.

if(([Total Market Sales])>(Avg([Total Market Sales]) + StdDev([Total Market Sales])),"Total Market Key Customers","No") As [Total Market Customers]

If we put these expressions on the color axis, we can compare the logical groupings of the two.



Representing a logical AND, how many customers are key in both My Product Sales and Total Market Sales?  The Answer is shown in blue and is 13.


If you want to know how many customers are key in My Product Sales but not in Total Market Sales, the answer is shown in red and is 33.

If you want to know how many customers are key in Total Market Sales but not in My Product Sales, the answer is shown in yellow and is 34.

If you want to know how many customers are not key in either My Product Sales or Total Market Sales, the answer is shown in green and is 220.

In our example above, we were able to categorize the column using an expression, but let’s say the column you want to use as pie of the logical comparison cannot easily be used in an expression. Then, we can rely on the Tag Panel to create a Tag Collection for this column.   Below is a picture showing the same exact data set, but the records above one standard deviation are manually marked and tagged:
 

 

Comments

 

The ???Mini-Rack??? Approach To Blade Server Design | M. Sean McGee | Chassis Coachwork Design said:

Pingback from  The ???Mini-Rack??? Approach To Blade Server Design | M. Sean McGee | Chassis Coachwork Design

May 4, 2010 2:55 AM
 

Casey said:

great posting Kevin.

I am trying to build a custom expression that only visualizes the number(i.e. count) of iterations of a string variable within a certain visualization.  

I've tried all sorts of combinations, but cant seem to come up with a winner.  I thought something like this would work, but it doesn't return data.

if([column 1] = "string", count([column 1]))

July 14, 2011 9:06 AM
 

uggkensington said:

http://www.uggkensingtons.org/

http://www.uggkensingtons.org/ugg-fox-fur-short-boots-5531-c-54.html | UGG Fox Fur Short Boots 5531

http://www.uggkensingtons.org/ugg-kensington-boots-5678-c-2.html | UGG Kensington Boots 5678

http://www.uggkensingtons.org/ugg-rainier-eskimo-boots-5189-c-85.html | UGG Rainier Eskimo Boots 5189

http://www.uggkensingtons.org/ugg-retro-cargo-boots-1895-c-3.html | UGG Retro Cargo Boots 1895

December 27, 2011 8:13 PM
 

uggskensingtonsale said:

www.uggskensingtonsale.org

www.uggskensingtonsale.orgugg-adirondack-boots-ii-c-6.html | UGG Adirondack Boots II

www.uggskensingtonsale.orgugg-adirondack-tall-boots-c-11.html | UGG Adirondack Tall Boots

www.uggskensingtonsale.orgugg-amberlee-boots-c-34.html | UGG Amberlee Boots

www.uggskensingtonsale.orgugg-annabelle-boots-c-35.html |

December 28, 2011 4:16 AM
 

uggretrocargosale said:

January 4, 2012 8:49 PM
 

Data Scientists Gaining Business Value from Data Mashups said:

Pingback from  Data Scientists Gaining Business Value from Data Mashups

January 17, 2012 11:22 PM
 

Data Scientists Gain Business Value from Data Mashups said:

Pingback from  Data Scientists Gain Business Value from Data Mashups

January 19, 2012 6:21 AM

Leave a Comment

(required) 
(optional)
(required) 
Submit

Syndication

Tags

Other Spotfire Blogs

Spotfire's interactive information visualization and analytic solutions give users a remarkable experience for quickly and easily querying data and reporting results for superior business intelligence. From portfolio management and customer retention programs to key processes such as CRM, marketing, research, bioinformatics, yield and asset management and design for manufacturing, enterprises around the world rely on Spotfire's business analytics software to improve operational performance.

©Copyright 2000-2011 TIBCO Software Inc | Privacy Policy | Terms of Use I Blog I Contact Us I Content Center