Tag Archives: Visualization

Resilient Business Software

The outline of the resilient (almost symbiotic) business software platform includes:
  • Capturing as much reality as possible. This is the untapped potential of the big-data revolution in business. It is also the critical benefit of “creating feedback loops”.
  • Representing reality with as little transformation as possible. Graph databases, naturally.
  • A deep concept of data provenance, meaning that attached to each piece of data is everything you may want to know about how it came to be. This is really applying the graph database concept to individual pieces of data, be they input, imported or calculated.
  • A related concept is deep versioning of code, data and processes. This enables agility in business processes and avoids further complicating exceptions due to institutional amnesia.
  • Visualization as an exploration of data, process and human interaction. See exceptions sticking out farther than their cohorts. 3D density and sparsity show positive and negative space. Graphs don’t have one representation; play with the levers to change perspectives, change the domain and compare to the past.
  • Pervasive predictions and recommendations. This takes the modern approach of “look at a ton of data” rather than complex algorithms. Predict the data that is needed by the shop foreman at 12pm because it’s what has been requested in the past. In a graph database, paths become well worn like hiking trails, showing what activities and people successfully resolved the problem.
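To make the provenance idea concrete, here is a minimal sketch in Python (my choice of language; the post names none). The `Datum` class, its field names and the sample values are all hypothetical. It assumes only that each value carries references to the values it was derived from, so the history of any number can be walked like a small graph.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Datum:
    """A value plus a record of how it came to be (all field names are hypothetical)."""
    value: float
    source: str          # "input", "imported" or "calculated"
    actor: str           # person or process that produced the value
    inputs: tuple = ()   # the Datum objects this value was derived from
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def lineage(datum, depth=0):
    """Walk the provenance graph and return one indented line per ancestor."""
    lines = [f"{'  ' * depth}{datum.value} ({datum.source}, by {datum.actor})"]
    for parent in datum.inputs:
        lines.extend(lineage(parent, depth + 1))
    return lines


# Two raw values and one calculated value that remembers where it came from.
quantity = Datum(40.0, "input", "shop foreman")
unit_cost = Datum(2.5, "imported", "supplier price feed")
total = Datum(quantity.value * unit_cost.value, "calculated", "costing job",
              inputs=(quantity, unit_cost))

print("\n".join(lineage(total)))
```

A real platform would presumably store these links in the graph database itself rather than in Python objects; the point is only that the calculated value never loses track of its inputs.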

Still, I see friction. This is a resilient system that helps people fit software processes to the business needs. But it’s only an aid. Business won’t run on auto-pilot. Repeatable, accountable actions will always be necessary. Rule-breaking exceptions always happen, even with a system designed to minimize them. And there will be plenty of counter-examples where minimizing exceptions will destroy some competitive aspect of the company.

What would emerge? Some things like running an entire production facility from your phone because the system can anticipate patterns and predict needs. It then feeds you only what you likely need to know, with links for more details. Eventually the body of knowledge includes a good number of exceptions and the path to resolution gets recorded (and thus reinforced) time and again.
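The "look at a ton of data" style of prediction can be as plain as counting. The sketch below is an illustration under stated assumptions, not a description of any product: the request log, the role names and the `predict` function are invented for the example.

```python
from collections import Counter, defaultdict

# Hypothetical request log: (role, hour of day, report requested).
request_log = [
    ("foreman", 12, "open work orders"),
    ("foreman", 12, "open work orders"),
    ("foreman", 12, "machine downtime"),
    ("foreman", 7, "shift roster"),
    ("planner", 12, "material shortages"),
]

# Count how often each report was requested in each (role, hour) context.
history = defaultdict(Counter)
for role, hour, report in request_log:
    history[(role, hour)][report] += 1


def predict(role, hour, top_n=1):
    """Return the reports most often requested in this context, most frequent first."""
    counts = history.get((role, hour), Counter())
    return [report for report, _ in counts.most_common(top_n)]


print(predict("foreman", 12))
```

Every new request adds to the counts, which is the "well worn trail" effect: the more often a path is used, the more strongly it is suggested next time.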

Low-Cost Data Analysis & Visualization: It’s Getting Better All The Time

Over the weekend I have revisited Tableau, enjoyed some success with MonetDB, tried to turn MySQL into a hundred million row data warehouse, been underwhelmed with Firebird, installed Greenplum and spent many frustrated hours with Talend Open Studio, Pentaho Kettle and Jitterbit.

Of course, I could just buy QlikView, but what can be done for less $money? Unfortunately, data warehouses and BI front-ends are not sexy problems in the open-source community. Graphs and charts get a little more attention, but you’ll need to write your own code to glue them to your application.

In summary, what can I say about our options?

First, write your own ETL. Why do open-source ETL tools like Talend and Kettle work so hard to rebuild Informatica? It reminds me of Linux in the 1990s, when the community wanted to beat Windows, kept working to look like Windows, and wondered when victory would arrive. Informatica, like OLAP and mainframes, is from an era when memory was scarce; languages were low-level, slow to compile & run, abstracted little and were not at all portable. On top of that, ODBC drivers were tightly controlled and costly.

But now we can pick from many great scripting languages. Today’s languages abstract the hard parts, are easy to read, can be edited while executing and talk to any system, database, web service or application. I think the next direction for ETL will be a simple (but extensible) transformation language using an ORM wrapper… Rails on ETL. Until that arrives, you can achieve everything you need with PHP, Perl, Ruby and others.
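As a rough illustration of how little code a hand-written ETL step needs, here is a sketch in Python (one of the "others"; the same shape works in PHP, Perl or Ruby). The CSV columns, the `sales` table and the in-memory SQLite database standing in for a warehouse are all assumptions made for the example.

```python
import csv
import io
import sqlite3

# Stand-in for an exported source file; the columns are hypothetical.
raw_csv = io.StringIO(
    "order_id,region,amount\n"
    "1, east ,100.50\n"
    "2,WEST,20\n"
    "3,east,not-a-number\n"
)


def extract(handle):
    """Yield one dict per CSV row."""
    yield from csv.DictReader(handle)


def transform(rows):
    """Normalize region names and skip rows whose amount is not numeric."""
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # a real job would log or quarantine the rejected row
        yield int(row["order_id"]), row["region"].strip().lower(), amount


def load(conn, records):
    """Insert the cleaned records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (order_id INTEGER, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)
    conn.commit()


warehouse = sqlite3.connect(":memory:")  # in-memory stand-in for the warehouse
load(warehouse, transform(extract(raw_csv)))
totals = warehouse.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(totals)
```

Because each stage is a generator, rows stream through one at a time, so the same structure should hold up on files far larger than memory.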

Best option for low-cost data warehouse?

Create a Bullet Graph In QlikView + Video

What settings do you use for the gauge and bar charts? Watch the video!

Stephen Few, who spoke at the QlikView conference in April, devised the bullet graph a few years ago. A QlikView customer used bullet graphs and sparklines and was very generous to allow QlikTech to post a working demo of their application. I’m going to build the bullet graphs from that app. You can download and dissect a QVW copy of that app from the QlikView demo website.

Bullet Graph Demo App Example

The bullet graph in QlikView is a bar chart overlaid on a gauge chart. The demo app uses a technique of aligning the targets on all the graphs to 100% of current year budget. The formula for the black line is current year actuals divided by current year budget. The darker gauge section shows prior year actuals over current year budget.

Bullet Graph Diagram

This technique has several advantages:

  • Because PY and CY actuals are both divided by CY budget, they are still in harmony. You can see at a glance that current year sales are significantly lower than last year’s sales.
  • Actual divided by Budget unifies many measures with wildly different scales, making chart maintenance easier without hurting accuracy.
  • Without this technique, you would need to write expressions for the gauge chart and for the maximum values of the bar and gauge charts. With this technique, they are simply 1 and 1.5.
  • There is additional context in answering the question, “If we were repeating last year’s performance, would we be beating our budget, and by how much?”
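The arithmetic behind the technique is easy to check outside QlikView. The Python sketch below is not QlikView expression syntax; the function name and the sample figures are hypothetical, and reading "1 and 1.5" as the target and the shared axis maximum is my assumption.

```python
def bullet_values(cy_actual, py_actual, cy_budget):
    """Scale both years' actuals by the current-year budget so the target is always 1.0."""
    if cy_budget <= 0:
        raise ValueError("current-year budget must be positive")
    return {
        "bar": cy_actual / cy_budget,    # black line: CY actual over CY budget
        "gauge": py_actual / cy_budget,  # darker segment: PY actual over CY budget
        "target": 1.0,                   # 100% of CY budget (assumed reading of "1")
        "axis_max": 1.5,                 # shared maximum (assumed reading of "1.5")
    }


# Hypothetical measures with very different scales: (CY actual, PY actual, CY budget).
measures = {
    "Sales ($)": (800_000, 1_100_000, 1_000_000),
    "Units shipped": (450, 380, 400),
}
for name, (cy, py, budget) in measures.items():
    v = bullet_values(cy, py, budget)
    print(f"{name}: bar={v['bar']:.2f} gauge={v['gauge']:.2f}")
```

Both measures land on the same 0 to 1.5 axis even though one is in dollars and the other in units, which is why one chart definition can be reused for every measure.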

I hope you’ve enjoyed this tour through bullet graphs. Take a look at the demo app for sparklines in action as well.

Interactive Information Visualization

Enrico Bertini at Visuale asks how important interactivity is in information visualization. As a proponent of QlikView, Spotfire, Tableau and others, I think it’s extremely important. Interactivity is the future; it’s “make or break.”

I’ve been implementing speed-of-thought interactive BI tools for 6 years and I don’t want to do it any other way. When I watched my first seasoned executive lose restraint and laugh uncontrollably as he got instant answers to his hardest questions, I knew this was the only way to go. When my end-user training sessions end late because everyone is so excited about what they can do, it’s clear that people NEED interactivity.

Response to the Tableau 3.0 Webinar

I finally got around to watching the Tableau 3.0 webinar. I agree with their very excited presenter that Tableau 3.0 is a leap forward. The support of ad-hoc grouping of dimension elements is excellent as is the enhanced support of ad-hoc sets. The annotations look good and act sensibly. Generally, the new features are focused on ease of use, better statistical analysis, and report clarity. All good things. Here are 3.0 examples.

Annotations should be required in every BI tool. The ability to mark reference lines and data points on graphs and tables is critical to clear communication. Placing an annotation on a point in space does not require a data point to exist there, another nice feature. The smart BI vendors are focusing on collaboration and communication among users.

“Groups” stole their name from the “groups” of 2.x, which are now the “sets” of 3.0, and can be used like so: similar dimension values such as coffee and tea, which may need to be represented in the database as separate product lines, can now be combined on the fly within Tableau by an end user under the simple heading “drinks”. This would make it easy to answer a question about food vs drink sales without the need to export to Excel and spend more time adding up the drink categories. In short, “groups” bring dimension values together and “sets” allow for separating special values from the rest of a dimension’s values, and both can be done by the end user. Pretty nice.
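For readers who think in code, the difference between the two features can be sketched in a few lines of Python. This illustrates the concepts only, not Tableau's implementation; the sales figures, the "food" fallback heading and the `promoted` set are invented for the example.

```python
from collections import defaultdict

# Hypothetical sales rows: (product line, sales).
sales = [("coffee", 120), ("tea", 80), ("muffins", 60), ("sandwiches", 140)]

# A "group" maps several dimension values onto one combined heading.
groups = {"coffee": "drinks", "tea": "drinks"}

grouped = defaultdict(int)
for product, amount in sales:
    # Anything not explicitly grouped falls under "food" (an assumption for this example).
    grouped[groups.get(product, "food")] += amount

# A "set" singles out special values and treats everything else as the remainder.
promoted = {"tea", "muffins"}
in_set = sum(amount for product, amount in sales if product in promoted)
out_of_set = sum(amount for product, amount in sales if product not in promoted)

print(dict(grouped))
print(in_set, out_of_set)
```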

I think the strongest competitor for visualization is Spotfire. However, Tableau’s use of live database interaction will become an advantage as data warehouse implementations shift to high-performance in-memory read-optimized databases. Was that over-hyphenated? Spotfire’s initial data loads are inflexible and I wouldn’t recommend it if you need to update a large dataset frequently.

Unlike QlikView, all of Tableau’s data needs to be in a single database. With good design, this is not a performance issue. The problem is that the extra expense of hardware and software to store a separate data warehouse and run ETL processing may push Tableau’s final price tag far above QlikView, which can easily pull from multiple sources and uses its own high-speed database.

Heartbeat

I was glad to hear, in a presentation from Vertica, that they will be releasing their product for free use under a certain data set size. I do not know whether this is intended to distinguish development from production systems or to let smaller companies run the product for free (and help establish a user base).

Also, I am evaluating Spotfire DXP as well as the upcoming features of QlikView 8. I’ll post a review of both when time and/or NDAs permit.