Big Data Analytics
with TIBCO Spotfire®
Interactive, Visual Analytics for Hadoop and other Big Data Stores
Democratizing Big Data with Visual Analytics
Turbo-charge your business analytics and address Big Data challenges, from the routine to the complex, with the Spotfire analytics platform. Spotfire is the only platform that empowers business users with an intuitive, easy-to-use interface to leverage the full spectrum of big data analytics technology, without requiring any data science or IT expertise.
The Spotfire interface remains consistent whether you are analyzing a small dataset or performing advanced analytics on a multi-terabyte big data cluster with complex data from sensor, social, Point-of-Sale (PoS), and geo-location sources. Users of any skill level navigate rich, insightful dashboards and analytical workflows simply by interacting with visualizations that represent aggregations of billions of data points.
Big Data Connectivity for High Performance Analytics
Spotfire offers three primary types of native integration with Hadoop and other big data sources:
- Visualizing Data: Native out-of-the-box data connectors that facilitate super fast interactive data visualizations.
- Performing Calculations:
  - Bring the engine to the data: Integration with in-datasource distributed computing frameworks that enable data calculations of any complexity on big data.
  - Bring the data to the engine: Integration with external statistical engines that get data directly from any data source, including traditional databases.
Together, these modes of integration offer a combination of visual data discovery and advanced analytics. They enable business users to access, combine, and analyze data from any underlying data structures with dashboards and workflows that are powerful and easy to use.
1. MapReduce, Spark, H2O, and TERR (TIBCO Enterprise Runtime for R) support distributed computing in Hadoop. Fuzzy Logix supports Teradata, Aster and Netezza.
2. TERR can be deployed as the advanced analytics engine in Hadoop nodes that are driven by MapReduce or Spark. TERR can also be called on Teradata nodes.
Big Data Connectors
Spotfire Big Data connectors support in-datasource, in-memory, and on-demand data access modes. This flexibility makes fast interactive visualizations possible: data calculations run within the data stores, and data is moved into client memory only if and when it is needed. Spotfire native data connectors include:
- Certified Hadoop data connectors for Apache Hive, Apache Spark SQL, Cloudera Hive, Cloudera Impala, Databricks Cloud, Hortonworks, MapR Drill and Pivotal HAWQ
- Other certified big data connectors include Teradata, Teradata Aster and Netezza
- Connectors for OSI PI historical and real-time sensor data sources
Learn more about data access with Spotfire data connectors.
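The difference between the in-datasource and in-memory access modes can be illustrated with a minimal sketch. It does not use any Spotfire connector API; Python's built-in sqlite3 module stands in for a big data store, and the `sales` table, its columns, and the function names are hypothetical.

```python
import sqlite3

# Hypothetical stand-in for a big data store; table and column names are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10.0), ("east", 15.0), ("west", 7.5), ("west", 2.5), ("north", 4.0)],
)

def totals_in_datasource(conn):
    """In-datasource: the store aggregates and returns one row per region."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region"
    ).fetchall()
    return dict(rows)

def totals_in_memory(conn):
    """In-memory: every row is pulled to the client, which aggregates locally."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM sales"):
        totals[region] = totals.get(region, 0.0) + amount
    return totals

pushed = totals_in_datasource(conn)
pulled = totals_in_memory(conn)
```

Both functions produce the same totals. The first moves only the aggregated rows to the client, which is what matters when the underlying table holds billions of rows.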
In-Datasource Distributed Computing
In addition to convenient Spotfire point-and-click SQL operations that run distributed within the datasource, Spotfire can initiate advanced statistical and machine learning algorithms that run in-datasource on very large datasets and return only the results needed for visualizations in Spotfire:
- Users interact with point-and-click dashboards that call scripts using the TERR instance embedded in Spotfire.
- The TERR scripts initiate distributed computing jobs via MapReduce, H2O, SparkR, or Fuzzy Logix.
- These jobs drive high-performance engines deployed on the Hadoop or other datasource nodes.
- TERR can be deployed as the advanced analytics engine in Hadoop nodes that are driven by MapReduce or Spark. It can also be called on Teradata nodes.
- Results are visualized in Spotfire.
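The pattern behind these steps, computing on each node and returning only small results, can be sketched in a few lines. This is an illustration in plain Python under stated assumptions, not TERR, MapReduce, or Spark code: each inner list plays the role of the data held on one node, and the function names are hypothetical.

```python
from functools import reduce

def map_partition(values):
    """Runs 'on a node': reduce a partition to (count, sum, sum of squares)."""
    return (len(values), sum(values), sum(v * v for v in values))

def combine(a, b):
    """Merge two partial results; only these small tuples travel between nodes."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])

def summarize(partitions):
    """Combine per-partition statistics into the values a visualization needs."""
    n, total, total_sq = reduce(combine, map(map_partition, partitions), (0, 0.0, 0.0))
    mean = total / n
    variance = total_sq / n - mean * mean  # population variance
    return {"count": n, "mean": mean, "variance": variance}

# Each inner list stands in for the rows stored on one node (made-up data).
partitions = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
summary = summarize(partitions)
```

The raw rows never leave their partition; only three numbers per partition are combined. The sum-of-squares formula is kept for brevity and can lose precision on real data, so a production job would likely use a more numerically stable method.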
TERR for Advanced Analytics
TIBCO Enterprise Runtime for R (TERR) is TIBCO's commercially supported advanced analytic engine for the R language that is faster, more scalable and more robust than open source R. It is embedded in Spotfire and can also be deployed as the advanced analytics engine in Hadoop nodes, in TIBCO Spotfire Statistics Services and in TIBCO Event Analytics.
Putting it all together
Combining all these powerful functionalities means that very sophisticated and robust analytic use cases can be encapsulated in easy-to-use interactive workflows. This empowers business users to visualize, analyze, and share the results without worrying about the details of the underlying data architecture.
Example: Spotfire interface for configuring, running and visualizing the results of a model that identifies characteristics of lost shipments. Through this interface business users can perform calculations using both TERR and the H2O distributed computing framework against shipment transaction data stored in a Hadoop cluster.
Analytical Breadth for Big Data
Let's face it, in today's interconnected world you are faced with complex problems and difficult decisions. Spotfire offers powerful capabilities that empower you to transform vast amounts of diverse data into meaningful insights and make the best possible decisions, fast.
Advanced and Predictive Analytics: Users interact with point-and-click Spotfire dashboards to drive a rich array of advanced capabilities that enable prediction, simulation, and optimization. With big data, analysis can be performed in-datasource, only bringing back the aggregations and results needed to populate Spotfire visualizations.
Content Analytics: Spotfire provides visualization and analytics on the largely untapped dimension of big data: unstructured text that is captured but hidden in documents, reports, CRM notes, weblogs, social posts, and other sources. Spotfire allows you to visually analyze text-based data in 27 languages and blend it with structured data to add context and detail and obtain deeper insights.
Location Analytics: Multi-layer high resolution maps are an excellent way to visualize big data. Spotfire's rich mapping capabilities allow you to create maps with as many reference and feature layers as you need, including calculated advanced analytics features. In addition to geographical maps, Spotfire supports custom maps to visualize data for warehouses, factory floors, semiconductor wafers, and many others.
Machine Learning: A broad class of machine learning methods is available in Spotfire as point-and-click data functions that users can invoke. Data scientists have access to the underlying R code and can extend the data function collection. The machine learning functions are shared with the user community for easy reuse. Available methods include:
- Linear and logistic regression
- Decision trees, random forests, gradient boosting machines (gbm)
- Generalized additive models
- Neural networks
Machine learning allows you to go beyond traditional analytics to identify structure in data that addresses business-critical problems, for example:
- Prediction of customer behavior: customer segmentation, customer churn, cross-sell/up-sell propensity.
- Fighting financial crime.
- Internet of Things (IoT):
  - Production planning and value estimation of oil fields.
  - Optimization of manufacturing equipment, processes and yield.
- Risk assessment
- Price optimization
Real-time Event Analytics: Insights from visual analytics and modeling in Spotfire can be deployed, at the press of a button, to event processing systems and scored/run on real-time streaming data. This allows you to monitor real-time data and alert end users, such as marketers or engineers, when an anomaly occurs or a new trend begins to emerge. The alerts can combine recent event data with historical data, providing context to enable users to investigate an event's importance and quickly decide on any necessary intervention.
TIBCO StreamBase is integrated with Spotfire for such real-time streaming analytics. StreamBase does real-time math on streaming data using rules and models published in Spotfire. StreamBase applies the Spotfire insights to streaming data in an automated manner, pushing notifications to a wide array of channels including text, email, database, and BPM systems.
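The idea of deriving a rule from historical data and then applying it to each streaming event can be sketched as follows. This is a plain-Python illustration, not StreamBase or Spotfire code: the three-standard-deviation rule, the event fields, the sensor readings, and the function names are all assumptions made for the example.

```python
from statistics import mean, stdev

def build_rule(history, threshold=3.0):
    """Derive a simple rule from historical data: flag values far from the mean."""
    mu, sigma = mean(history), stdev(history)

    def is_anomaly(value):
        return abs(value - mu) > threshold * sigma

    return is_anomaly

def monitor(stream, rule, notify):
    """Apply the rule to each event as it arrives and notify on anomalies."""
    for event in stream:
        if rule(event["value"]):
            notify(event)

# Made-up sensor readings: history defines "normal", events arrive one at a time.
history = [20.1, 19.8, 20.3, 20.0, 19.9, 20.2]
events = [
    {"id": 1, "value": 20.1},
    {"id": 2, "value": 27.5},
    {"id": 3, "value": 19.7},
]

alerts = []
monitor(iter(events), build_rule(history), alerts.append)
```

Here `alerts.append` stands in for a notification channel such as email or text. Only the event whose value falls far outside the historical range is passed to it.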
- Scalable data visualizations: Spotfire big data visualizations can scale to represent billions of rows of data within an analysis.
- Intuitive user interface: Spotfire dashboards and analytic workflows can encapsulate sophisticated use cases that enable business users to visualize, analyze, run calculations, and share the results.
- Flexible data architecture: Spotfire's seamless user experience is made possible by the richness of options to access data of any size, perform calculations of any type, and efficiently visualize data aggregations or row-level details.
- Agile platform: Spotfire's agile platform empowers business analysts to drive advanced analytic workflows and applications for big data and become truly data-driven.
Discover other Spotfire Solutions
- Spotfire Content Analytics
Discover the human side of Big Data
Visualize and analyze text-based unstructured content and derive contextual value for deeper insights.
- Spotfire Predictive Analytics
Anticipate What’s Next
Forecast emerging trends, take preemptive action to minimize risk, and make educated decisions with greater confidence.
- TIBCO Cloud Compute (Grid Server + TERR)
High Performance Computing for Everyone
TIBCO Cloud Compute Grid is high performance computing on the public cloud, easy for everyone to use. For computational work like Monte Carlo simulations, Cloud Compute Grid is much easier than MapReduce. You can even run complex statistical models multiple orders of magnitude faster than open source R, on a single computer.
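Monte Carlo simulations suit grid computing because each batch of random trials is independent of the others. The sketch below shows that structure in plain Python, with a local thread pool standing in for grid workers. It does not use any TIBCO Cloud Compute API, and the function names, seeds, and sample counts are illustrative assumptions.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor

def count_hits(seed, samples):
    """One independent task: count random points that land inside the unit circle."""
    rng = random.Random(seed)  # a per-task seed keeps tasks independent and repeatable
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits

def estimate_pi(tasks=4, samples_per_task=20_000):
    """Fan the tasks out to workers, then combine their counts into one estimate."""
    with ThreadPoolExecutor(max_workers=tasks) as pool:
        hits = pool.map(count_hits, range(tasks), [samples_per_task] * tasks)
        total = sum(hits)
    return 4.0 * total / (tasks * samples_per_task)

pi_estimate = estimate_pi()
error = abs(pi_estimate - math.pi)
```

Because the tasks share no state, adding workers only changes how the batches are scheduled, not how the result is combined.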