One of the most useful tricks shared at the QlikView conference was from Nik Boman on improving the data extraction from databases.
ODBC is a slow protocol, running orders of magnitude slower than the database or a typical Ethernet connection. Very pricey ETL tools for data warehousing get around this by extracting through multiple connections to the database, and there’s no reason that a QlikView infrastructure can’t take advantage of it.
For example, run two copies of QlikView at the same time and extract approximately half of the data set with each. First, make a copy of the QV.exe file and give it a unique name. You can open QV.exe and your unique copy at the same time. You can run three or more copies of QlikView with this method.
Next, decide how to divide your data set; it could be based on date, country, state, or half the alphabet, for example. What you want is to divide the data set into roughly equal segments, one for each copy of QlikView.
How does each copy of QlikView know which segment to load? One way to do this dynamically is to use the command-line to set a variable in the script. Reference this variable in the SQL SELECT statement in the script: WHERE YearField=$(vYearVariable). See the reference manual for command-line options.
Your mileage will vary. Some databases don’t do much better with simultaneous ODBC reads. Oracle does quite well.
2 responses so far ↓
1 Hugh Williams // Aug 16, 2008 at 7:10 pm
Its been a long time since I have heard the “myth” that “ODBC is slow” , something that companies like OpenLink Software have spent years dispelling …
If an implementation of a given ODBC Driver happens to be slow that does not mean the ODBC API itself is slow.
Some database vendors (like DB2 and others) are now adpoting and promoting ODBC as the preferred API for application development, due to the richness of the API which if written correctly provides just as good performance as their other APIs, ditto for JDBC.
ODBC also provides a generic interface for database access across multiple databases enabling application re-use without the need to re-write should one change their database, thereby eliminating the lock in to proprietary APIs, which is always a bad thing …
2 JJ // Sep 9, 2008 at 8:04 am
Fair enough, it’s the implementation and not the protocol. But I’m sure it hasn’t been a “long time” since so many ODBC connections continue to be slow. (*cough* DB2 *cough*)
Leave a Comment