String vs. Numeric Comparisons

I don’t know if you use a lot of Sum-If logic like I do. It’s a blessing and a curse. For example: sum(if(Category=’Sales’,Amount)). It just doesn’t scale up to tens of millions of rows. Even with only a million rows, if you have a sheet loaded with these expressions, QlikView will calculate the chart objects one at a time. The user may have to wait half a minute for the sheet to update.

Until set analysis comes out in version 8.5 (and it won’t fix everything), try using numeric comparisons. sum(if(CategoryCode=40,Amount)). In practice you’ll see a 10X speed improvement, but it’s possible to have bigger gains. The reason, as I learned from Nik Bowman, is that numeric comparisons can be done natively on the processor, where the string comparisons have a lot of overhead code.

Using QlikView Tickets

Starting in Server 8.2, QlikView authentication is much more flexible with tickets. I’m just getting started with them myself, and I’ll be sharing the resources I find along the way.

Big thanks to Dan English, Johan Jeanson and Claes Linsefors for my education about tickets at Qonnections 2008.

If you have a working QV Server implementation including an IIS or QVS HTTP frontend, copy this URL and see the ticket number in the response:
http://localhost/QvAjaxZfc/qvsviewclient.asp?cmd=<Global%20method=”GetTicketForMe”/>
NOTE: The double-quotes don’t copy correctly. Be sure to fix them or you’ll get an “invalid syntax” error.

Copy the ticket value to the end of this request to open a file:
qvp://localhost/YourDocument.qvw?TICKET= (paste ticket number here)

Behind the scenes the QlikView Server (QVS) is passed the username that authenticated to the web server. QVS issues a ticket in an XML response. You apply the ticket to your document request. QVS confirms the ticket, looks up the username that was passed, and then checks if that username can open the document (using NTFS permissions or the QlikView Document Metadata Service [DMS]). You are presented with the document in your client.

Most likely you are running IIS and you were authenticated using the same Integrated Windows Authentication (IWA) or a user/password combination that you use every day. In that case, this demo isn’t very impressive because you didn’t see different behavior. The key difference is that unlike QlikView versions prior to 8.2, your authentication to the server is not tied to your Windows account. QVS did not rely on IWA to pass your credentials, as it normally does. Instead QVS trusts that if you are authenticated to the web server, then you are a real user.

What this all means is that you don’t need IIS to authenticate the user. Combine flexible authentication with flexible authorization using DMS, and you can use two different schemes simultaneously. For example, local users on the Windows network will authenticate transparently using IIS and IWA. Meanwhile, remote users that are not a part of your network can enter a user/pass combo, or be referred through an Apache server, or the ERP system.

Faster Reloads With “Core Control”

QlikView is multi-threaded in many areas but not during the load process. Next time you’re running a load script, open the Windows Task Manager. If you have 2 cores, the processor usage will never go above 50%. You can to do this while QlikView is running and see what actions are single- and multi-threaded.

Load times on large datasets can be dramatically improved by splitting the reload process into pieces and assigning pieces to separate cores. A good place to start is monitoring how your Windows installation allocates processes to cores. I’ve only seen one tool that shows processes on each core. If you know of any others, please post a comment.

Now that you have evaluated the automatic allocation, how can you specify the core directly? One tool that ships with Windows is the Task Manager. Unfortunately, it can only alter a running process and it does not save your configuration for the next run. NOTE: Vista allows command-line affinity setting and Server 2008 might as well.

There are several choices of utilities to control core affinity. Some are command-line such as BeyondLogic’s, others have a GUI and store your choice for the future such as Tom’s Hardware Guide’s. The downside with THG’s tool is that you will need to make copies of QV.exe with different names so each gets a unique process name. The upside is that the CPU affinity setting is automatic.

Obviously, trying to do this with Publisher would be a challenge. The upcoming Publisher 8.5 will support command-line execution which may work with these utilities, however you will lose visibility in the Publisher console. It’s all tradeoffs, but at least CPU affinity is another tool in your belt.

cpu1.png CPU Control

ttam_add_application.gif  Tom’s Hardware Guide Task Assignment Manager

Feedback For Expression Variables

My thanks to Henric Cronström for taking the time to review my post on expression variables. He said the name for this new feature is still being decided. And he had some valuable information for getting started.

Your text is good and everything is correct. But you may want to add the following

1) The “all” qualifier is removed from the documentation and is replaced by “{1} total” which does exactly the same. “all” will still work in formulas in documents, though.

2) The syntax {$<Year=2002>} replaces the list of selected values with “2002″.

3) The syntax {$<Year+=2002>} adds “2002″ to the list of selected values.

4) The syntax {$<Year+=-2002>} removes “2002″ from the list of selected values.

5) The syntax {$<-Year+=2002-2001>} adds “2002″ and removes “2001″ to the list of excluded values. (Forced exclusion – see And-mode or red selection in attached app.)

Cheers

You can download the application here.

Seeing Dollar Signs in QlikView 8.5

One of the more interesting features coming in QlikView 8.5 is “Dollar-Sign Expansion with an Expression”, Section 24.5 in the coming Reference Manual. One place in the manual they are called “on-the-fly variables”, but I prefer “expression variables”.

In many ways expression variables are similar to using the TOTAL qualifier because they are calculated across the set of selections and are not affected by a dimension value in a chart. If you use the ALL qualifier inside an expression variable, it looks at the entire dataset regardless of selections, as ALL normally does. For day-to-day use as part of set-analysis this is all you need to know.

For the most part, you can get the same results by using  the TOTAL or ALL qualifier. What makes expression variables different is that they are optimized for speed by being evaluated only when necessary. This is demonstrated with the expression variable $(=rand()). It is evaluated once and, as far as I can tell, will remain unchanged through reloads, saving and reopening. If you aren’t familiar with rand(), it generates a random decimal number between 0 and 1. Anytime QlikView reevaluates the expression, the value will change and we can clearly see it.

New caching algorithms make expression variables very efficient. Expressions that refer to field names are evaluated for each combination of selections–only as a new combination is queried–and the results are cached. You can see this behavior with the expression $(=Year+rand()). There are additional optimizations so that expressions that do not change are not reevaluated. For example, $(=max(ALL Year)+rand()) is evaluated once because max(ALL Year) will never change.

QlikView organizes the caching scheme by the string value of the expression. $(=rand()) and $(=   rand()) will have different values but both will maintain their values throughout the application. As another example, $(=max({1} Year)+rand()) and $(=max(ALL Year)+rand()) contain an equivalent “max” expression, but yield a different value.

By design, expression variables always give the same result for each row. This is half the reason why set analysis is so much faster than Sum-If logic. The other half is that QlikView compares the expression variable to the whole field contents and not another calculated value. As a result, this reduces the problem of set analysis to making a parallel set of selections, which is what QlikView does best.  On the other hand, an If statement can refer to the current row, and it can be affected by the dimensions for that row, so Sum-If logic must recalculate for each and every row.

This is my own analysis from watching the Beta software. Behavior may change for the final release and I might have misunderstood what’s happening inside the application. I will continue to post what I learn about expression variables. If you can confirm or disprove my analysis, or if you have an interesting use for expression variables, tell me about it.

Create a Bullet Graph In QlikView + Video

What settings do you use for the gauge and bar charts? Watch the video!

Stephen Few, who spoke at the QlikView conference in April, devised the bullet graph a few years ago. A QlikView customer used bullet graphs and sparklines and was very generous to allow QlikTech to post a working demo of their application. I’m going to build the bullet graphs from that app. You can download and dissect a QVW copy of that app from the QlikView demo website.

Bullet Graph Demo App Example

The bullet graph in QlikView is a bar chart overlayed on a gauge chart. The demo app uses a technique of aligning the targets on all the graphs to 100% of current year budget. The formula for the black line is current year actuals divided by current year budget. The darker gauge section shows prior year actuals over current year budget.

Bullet Graph Diagram

This technique has several advantages:

  • Because PY and CY actuals are both divided by CY budget, they are still in harmony. You can visually see that current year sales is significantly less than last year’s sales.
  • Actual divided by Budget unifies many measures with wildly different scales, making chart maintenance easier without hurting accuracy.
  • Without this technique, you would need to write expressions for the gauge chart expression, and maximum values for bar and gauge charts. With this technique, they are 1 and 1.5.
  • There is additional context in answering the question, “If we were repeating last year’s performance, would we be beating our budget, and by how much?”

I hope you’ve enjoyed this tour through bullet graphs. Take a look at the demo app for sparklines in action as well.

What settings do you use for the gauge and bar charts? Watch the video!

Missing Feature In Set Analysis

There needs to be a way to say “apply current selections, but ignore any selections in field A”. I submitted this as a feature request, but I hope someone from QlikTech sees this post. My proposed syntax is sum({$<!Year><!Month><!Week><!Date>} Sales). Otherwise, you need to use a positive assertion that lists each and every value, or calculates it on the fly. See my previous post and the use of concat() within a variable definition.

I guess I’ll eat those words. It seems, though I only have the beta and it may change, that to ignore selections in a given field, you can override the field value with an empty set. For example: sum({$<Year=>} Sales).

QlikView 8.5 Introduces “Set Analysis”

This post has been edited since being published to show the Set Analysis syntax as of QV 9.

“Set analysis is the best feature since the new charting engine in version 4,” says Jonas Nachmanson.

That may not reveal the excitement, but it’s true! Set analysis is a big feature.

Want to sum excluded values that are not a part of current selections? Want a daily sales report that doesn’t break if someone makes a conflicting data selection? Then set analysis is for you! Now you can calculate across any slice of the dataset, regardless of what fields and values are currently selected.

You have fine-grained control over the selections that apply to any expression. Add, change, eliminate, invert or tweak any selections. This is all without the speed penalty of Sum-If logic, which can only see its intersection with current selections and therefore breaks when any conflicting selection is made. The result is faster calculation, predictable dynamic behavior, better support for dynamic graphs and tables, and faster implementation for some very frequent requests.

sum(ALL Sales)-sum(Sales) could be rewritten with set analysis as sum({1} Sales)-sum(Sales) but should be written as sum({1-$} Sales). {1} means the entire data model. {$} means the current selections. {0} means no selections at all.

1) Apply current selections, but act as if the current year is always set to the variable vCurrentYear: sum({$<Year={$(vCurrentYear)}>} Sales)

2) Apply current selections, but ignore any selections in the Year field: sum({$<Year=>} Sales).

The syntax $(=expression) is also new to 8.5. It adds dynamic variable creation. I plan to cover ideas for this new feature in a future post.

3) Sum sales for the prior year, with the prior year being relative to what is currently selected: sum( {$<Year={$(=Only(Year)-1)}>} Sales ). No more clearing dates with macros!

Set analysis immediately sparked some new ideas for interacting with QlikView and I think we’ll see many more. For example, the combination of set analysis using bookmarks and input fields improves QlikView as a financial forecasting tool, the frequent first question from CFOs.