<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>... and points beyond &#187; Vertica</title>
	<atom:link href="http://andpointsbeyond.com/category/vertica/feed/" rel="self" type="application/rss+xml" />
	<link>http://andpointsbeyond.com</link>
	<description>mostly about data</description>
	<lastBuildDate>Wed, 05 May 2010 23:26:58 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Vertica for the Cloud</title>
		<link>http://andpointsbeyond.com/2008/12/11/vertica-for-the-cloud/</link>
		<comments>http://andpointsbeyond.com/2008/12/11/vertica-for-the-cloud/#comments</comments>
		<pubDate>Fri, 12 Dec 2008 06:36:01 +0000</pubDate>
		<dc:creator>Jay Jakosky</dc:creator>
				<category><![CDATA[Vertica]]></category>
		<category><![CDATA[cloud]]></category>
		<category><![CDATA[data warehouse]]></category>
		<category><![CDATA[database]]></category>

		<guid isPermaLink="false">http://andpointsbeyond.com/2008/12/11/vertica-for-the-cloud/</guid>
		<description><![CDATA[While I have my head in the clouds, I should mention that Vertica has a cloud solution that they manage for you. Not new, but gives some perspective.
With competitive offerings in the $10-20k per terabyte range, this is an attractive offer and a great way to try before you invest when you have that much data.
I [...]


]]></description>
			<content:encoded><![CDATA[<p>While I have my head in the clouds, I should mention that Vertica has a cloud solution that they manage for you. Not new, but gives some perspective.</p>
<p>With competitive offerings in the $10-20k per terabyte range, this is an attractive offer and a great way to try before you invest when you have that much data.</p>
<p>I hear Vertica is a screamer, but I can&#8217;t imagine getting sub-second results for 3 TB of data on 3 virtualized servers, for the same reasons I gave in my previous post.</p>
<p><a href="http://www.vertica.com/_pdf/verticacloudpricing">Vertica for the Cloud Pricing</a></p>


]]></content:encoded>
			<wfw:commentRss>http://andpointsbeyond.com/2008/12/11/vertica-for-the-cloud/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Low-Cost Data Analysis &amp; Visualization: It&#8217;s Getting Better All The Time</title>
		<link>http://andpointsbeyond.com/2008/09/07/low-cost-data-analysis-visualization-its-getting-better-all-the-time/</link>
		<comments>http://andpointsbeyond.com/2008/09/07/low-cost-data-analysis-visualization-its-getting-better-all-the-time/#comments</comments>
		<pubDate>Mon, 08 Sep 2008 02:43:45 +0000</pubDate>
		<dc:creator>Jay Jakosky</dc:creator>
				<category><![CDATA[MPP]]></category>
		<category><![CDATA[QlikView]]></category>
		<category><![CDATA[Tableau]]></category>
		<category><![CDATA[Vertica]]></category>
		<category><![CDATA[business intelligence]]></category>
		<category><![CDATA[data warehouse]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[emerging technology]]></category>
		<category><![CDATA[interactive analysis]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://andpointsbeyond.com/?p=213</guid>
		<description><![CDATA[Over the weekend I have revisited Tableau, enjoyed some success with MonetDB, tried to turn MySQL into a hundred million row data warehouse, been underwhelmed with Firebird, installed Greenplum and spent many frustrated hours with Talend Open Studio, Pentaho Kettle and Jitterbit.
Of course, I could just buy QlikView, but what can be done for less [...]


]]></description>
			<content:encoded><![CDATA[<p>Over the weekend I have revisited <a href="http://tableausoftware.com/">Tableau</a>, enjoyed some success with <a href="http://monetdb.cwi.nl/">MonetDB</a>, tried to turn <a href="http://dev.mysql.com/">MySQL</a> into a hundred million row data warehouse, been underwhelmed with <a href="http://www.firebirdsql.org/">Firebird</a>, installed <a href="http://www.greenplum.com/">Greenplum</a> and spent many frustrated hours with <a href="http://www.talend.com/index.php">Talend Open Studio</a>, <a href="http://kettle.pentaho.org/">Pentaho Kettle</a> and <a href="http://www.jitterbit.com/">Jitterbit</a>.</p>
<p>Of course, I could just buy <a href="http://qlikview.com/home.aspx?LangType=1033">QlikView</a>, but what can be done for less $money? Unfortunately data warehouses and BI front-ends are not sexy problems in the opensource community. <a href="http://www.collegeathome.com/blog/2008/06/05/50-cool-things-you-can-do-with-google-charts-api/">Graphs and charts</a> get a little more attention, but you&#8217;ll need to write your own code to glue them to your application.</p>
<p><strong>In summary, what can I say about our options?</strong></p>
<p>First, write your own ETL. Why do opensource ETL tools like Talend and Kettle work so hard to rebuild <a href="http://www.informatica.com/Pages/index.aspx">Informatica</a>? It reminds me of Linux in the 1990s when the community wanted to beat Windows and kept working to look like Windows and wondering when victory would arrive. Informatica, like OLAP and mainframes, is from an era when memory was scarce; languages were low-level, slow to compile &amp; run, abstracted little and were not at all portable. On top of that, ODBC drivers were tightly controlled and costly.</p>
<p>But now we can pick from many great scripting languages. Today&#8217;s languages abstract the hard parts, are easy to read, can be edited while executing and talk to any system, database, web service or application. I think the next direction for ETL will be a simple (but extensible) transformation language using an ORM wrapper&#8230; Rails on ETL. Until that arrives, you can achieve everything you need with PHP, Perl, Ruby and others.</p>
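<p>As a rough illustration of the hand-written ETL argued for above, here is a minimal sketch in Python, standing in for PHP, Perl or Ruby. Everything specific in it is hypothetical: the CSV layout, the <code>sales</code> table and the function names are invented for the example, and an in-memory SQLite database stands in for whatever warehouse you actually load into.</p>

```python
import csv
import io
import sqlite3

# Hypothetical source extract; in practice this would be a file or a query result.
RAW = """order_id,region,amount
1, west ,19.99
2,EAST,5.00
3,west,
"""


def extract(text):
    """Yield one dict per CSV row."""
    yield from csv.DictReader(io.StringIO(text))


def transform(rows):
    """Normalize region names and skip rows that have no amount."""
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue
        yield int(row["order_id"]), row["region"].strip().lower(), float(amount)


def load(conn, records):
    """Insert the cleaned records into a (hypothetical) sales table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (order_id INTEGER, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl(text, conn):
    """Run extract, transform and load, then return the total amount per region."""
    load(conn, transform(extract(text)))
    return conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()


conn = sqlite3.connect(":memory:")
totals = run_etl(RAW, conn)
```

<p>Each stage is a plain generator or function, so swapping the CSV reader for a database cursor, or SQLite for another target, only touches one stage.</p>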
<p><strong>Best option for low-cost data warehouse?</strong></p>
<p><span id="more-213"></span></p>
<p>Check out the totally free <a href="http://monetdb.cwi.nl/">MonetDB</a>. Unless <a href="http://www.vertica.com/">Vertica</a> or <a href="http://www.infobright.com/">InfoBright</a> reconsiders releasing a low/no cost option, MonetDB will likely mature to become a first-choice column-store database. It&#8217;s an academic project that has earned a sizeable development community and user base. The product is functional today for tens of millions of rows (maybe more). So far I have personally worked with a few million rows in MonetDB and I&#8217;d like to use it again. With a little focus on usability and packaging, it could be a contender.</p>
<p>Greenplum, freely available for development, won&#8217;t help. The architecture is designed around Massively Parallel Processing. As a single, standalone installation, it&#8217;s basically just PostgreSQL. You won&#8217;t see extra performance without a farm of servers.</p>
<p>To my surprise, MySQL itself is not too bad. The MyISAM tables are speedy and <a href="http://tomictech.com/2008/06/16/building-a-data-warehouse-on-a-budget-with-mysql-51/">Alex Tomic wrote a post</a> about using multiple queries against the Archive storage engine and how to steal an index with that engine. With basic MyISAM on a fast server, I&#8217;m running 10GB table scans in under a minute, but moderate aggregations take a few minutes. Architecturally, MySQL is limited. One query = one thread = one core. Running two simultaneous queries is an option, but MySQL still would not do the kind of transparent, optimized caching that you need for a warehouse. Throughput is limited to disk I/O speed. InfoBright has built a column-store storage engine for MySQL but it&#8217;s targeted at the enterprise only.</p>
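<p>The "run simultaneous queries" workaround can be sketched as follows: split one aggregation into id ranges, run each range as its own query on its own connection, and add up the partial results in the application. This is only a sketch under stated assumptions. SQLite stands in for MySQL so the example is self-contained, the <code>facts</code> table and all function names are hypothetical, and nothing here measures whether the split is actually faster on a given server.</p>

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor


def build_demo_db(path, rows=1000):
    """Create a hypothetical fact table with ids 1..rows and amount equal to id."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE facts (id INTEGER PRIMARY KEY, amount INTEGER)")
    conn.executemany(
        "INSERT INTO facts VALUES (?, ?)", ((i, i) for i in range(1, rows + 1))
    )
    conn.commit()
    conn.close()


def sum_slice(path, lo, hi):
    """Run one query over one id range on its own connection."""
    conn = sqlite3.connect(path)
    try:
        (total,) = conn.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM facts WHERE id BETWEEN ? AND ?",
            (lo, hi),
        ).fetchone()
        return total
    finally:
        conn.close()


def parallel_sum(path, max_id, workers=2):
    """Split the id range into one slice per worker and add up the partial sums."""
    step = -(-max_id // workers)  # ceiling division
    slices = [(lo, min(lo + step - 1, max_id)) for lo in range(1, max_id + 1, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(lambda s: sum_slice(path, *s), slices))


with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "demo.db")
    build_demo_db(db_path)
    grand_total = parallel_sum(db_path, max_id=1000, workers=2)
```

<p>The combining step is trivial for sums and counts; averages and distinct counts need the partial results to carry more information before they can be merged.</p>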
<p><strong>What about the front end?</strong></p>
<p>For the money and quality and ease of integration, it&#8217;s hard to beat <a href="http://tableausoftware.com/">Tableau</a>. $1800 isn&#8217;t cheap, but for a small business that truly needs to analyze patterns, this will do the job and it makes very pretty charts. The most recent version has integrated support for mapping based on zip code, area code, state, country and others. The maps also incorporate Census and USGS data and are pulled live from an online source. They look great! Tableau has always had a smooth, easy-to-understand layout and a crisp look that makes each chart very attractive in a presentation. It also automatically guesses what chart you want based on the quality &amp; number of aggregates and dimensions.</p>
<p>The drawback is that Tableau doesn&#8217;t have its own high-speed database or ETL tool. Tableau can&#8217;t shine until a low/no-cost read-optimized database is available. Until then, it does support the most common databases and data warehouses, both commercial and open-source. Except it can&#8217;t handle generic ODBC and I don&#8217;t know why.</p>
<p>There&#8217;s <a href="http://www.jaspersoft.com/">JasperSoft</a> = CrystalReports + OLAP + Informatica + Web Dashboards. Each component is from a different opensource project, so they don&#8217;t all use the same platform or interface, and they can&#8217;t all read the same data sources. The democratization of BI is NOT going to come from enterprise tools made cheap; it will come from simple disruptive tools that add new ideas and polish with each release. Sorry, Jasper.</p>
<p><strong>What would I use to build a reporting system for a smaller business?</strong></p>
<p>Well, assuming we&#8217;re doing it to make more money, not to keep up appearances, the best choice is still to pay the money for QlikView. It reads ODBC, OLE DB, text files and Excel&#8211;everything a business needs. The ETL language is easy to understand for any businessperson who has put together an Access database or enjoys Excel formulas (blech!). The GUI front-end designer is powerful &amp; straightforward. And the in-memory database behind QlikView is so incredibly fast that I routinely analyze 10 million rows in a split-second. It&#8217;s a one-stop shop.</p>
<p>Tableau is a good option but you lose the database and ETL. Maybe you don&#8217;t have a large volume of data or maybe it&#8217;s all in one view in the database&#8211;Tableau could work for you.</p>
<p>At a lower cost? Well, it definitely comes down to tradeoffs in coder skill, money, development time and ease of use. Whereas in QlikView anyone can write the basic code to read a couple of tables, all other solutions demand heavy lifting somewhere.</p>
<p><strong>If I was doing it for free?</strong></p>
<p>I&#8217;d start with PHP, and possibly Ruby. Read from a database, calculate, generate Google Charts, and maybe use one of the <a href="http://www.maani.us/xml_charts/">low/no-cost Flash-based charting libraries for interactive splash</a>. In a future post I&#8217;d like to cover ORMs and Google Chart APIs and how they can help get these projects off and running quickly.</p>
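<p>A rough sketch of the "calculate, then generate a chart" step, written in Python rather than PHP. It assumes the image-URL style of the Google Chart API, where the chart type, size, data and labels travel as the query parameters <code>cht</code>, <code>chs</code>, <code>chd</code> and <code>chl</code>. The endpoint and parameter names are assumptions to check against Google&#8217;s documentation, and the region totals are made up.</p>

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed endpoint for the image-URL chart API; verify before relying on it.
CHART_BASE = "http://chart.apis.google.com/chart"


def pie_chart_url(totals, size="400x200"):
    """Build a pie chart URL from a {label: value} mapping.

    Values are rescaled to 0..100 relative to the largest one, on the
    assumption that the text data encoding expects that range.
    """
    labels = list(totals)
    peak = max(totals.values())
    scaled = [round(100 * totals[label] / peak) for label in labels]
    params = {
        "cht": "p",                                   # chart type: pie
        "chs": size,                                  # width x height in pixels
        "chd": "t:" + ",".join(map(str, scaled)),     # text-encoded data
        "chl": "|".join(labels),                      # slice labels
    }
    return CHART_BASE + "?" + urlencode(params)


# Hypothetical query result: total sales per region.
url = pie_chart_url({"west": 50, "east": 25})
parsed = parse_qs(urlsplit(url).query)
```

<p>The resulting URL would go into an <code>img</code> tag, which is the kind of glue code the post says you have to write yourself.</p>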
<p>Got any ideas? I&#8217;m always on the lookout for a faster, cheaper, better way to create these solutions.</p>
<p><a href="http://www.collegeathome.com/blog/2008/06/05/50-cool-things-you-can-do-with-google-charts-api/">50 Cool Things You Can Do with Google Charts</a></p>


]]></content:encoded>
			<wfw:commentRss>http://andpointsbeyond.com/2008/09/07/low-cost-data-analysis-visualization-its-getting-better-all-the-time/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>More on Vertica</title>
		<link>http://andpointsbeyond.com/2007/02/16/more-on-vertica/</link>
		<comments>http://andpointsbeyond.com/2007/02/16/more-on-vertica/#comments</comments>
		<pubDate>Fri, 16 Feb 2007 20:34:00 +0000</pubDate>
		<dc:creator>Jay Jakosky</dc:creator>
				<category><![CDATA[Vertica]]></category>
		<category><![CDATA[data warehouse]]></category>
		<category><![CDATA[database]]></category>

		<guid isPermaLink="false">http://andpointsbeyond.com/?p=109</guid>
		<description><![CDATA[The ideas in this paper will be incorporated into the Vertica database product. And unfortunately it won&#8217;t be open source. At least that&#8217;s what one company employee commented on Slashdot.
In the same way that RAID design options (e.g. 1, 5 and 10) can accommodate one or more drive failures, the Vertica system will distribute the same slice [...]


]]></description>
			<content:encoded><![CDATA[<p><span style="font-family:georgia;">The ideas </span><a href="http://www.mit.edu/%7Edna/vldb.pdf">in this paper</a><span style="font-family:georgia;"> will be incorporated into the Vertica database product. And unfortunately it won&#8217;t be open source. At least that&#8217;s what one company employee commented on Slashdot.</span></p>
<p><span style="font-family:georgia;">In the same way that RAID design options (e.g. 1, 5 and 10) can accommodate one or more drive failures, the Vertica system will distribute the same slice of the database to several servers. A grid of commodity hardware can act as a high-availability system and Vertica&#8217;s shared-nothing architecture enables this feature without complex design or execution.</span></p>
<blockquote><p>We call a system that tolerates K failures K-safe. C-Store will be configurable to support a range of values of K.</p></blockquote>
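<p>To make the K-safe definition concrete, here is a minimal sketch in Python. It assumes the simplest possible placement, each slice copied to K + 1 consecutive nodes in round-robin order, which is an illustration and not the placement scheme from the paper. The node names and function names are hypothetical.</p>

```python
from itertools import combinations


def place_segments(num_segments, nodes, k):
    """Assign each segment to k + 1 consecutive nodes, round-robin."""
    if k + 1 > len(nodes):
        raise ValueError("need at least k + 1 nodes")
    return {
        seg: [nodes[(seg + offset) % len(nodes)] for offset in range(k + 1)]
        for seg in range(num_segments)
    }


def survives(placement, failed):
    """True if every segment still has at least one copy on a live node."""
    return all(
        any(node not in failed for node in copies) for copies in placement.values()
    )


def is_k_safe(placement, nodes, k):
    """Check every possible set of k simultaneous node failures."""
    return all(survives(placement, set(dead)) for dead in combinations(nodes, k))


nodes = ["n1", "n2", "n3", "n4"]
placement = place_segments(num_segments=8, nodes=nodes, k=1)
one_safe = is_k_safe(placement, nodes, 1)
two_safe = is_k_safe(placement, nodes, 2)
```

<p>With two copies of every slice the grid rides out any single node failure, but losing both nodes that hold the same slice takes that slice offline, so this placement is 1-safe and not 2-safe.</p>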
<p><span style="font-family:georgia;">Inserts and updates are performed on a separate data store and merged in batches. Deletes are marked with bitmasks. Rather than building a complex locking scheme for grid members, data in the read-only and write stores is stamped with a time &#8220;epoch&#8221;. Queries specify an epoch. It&#8217;s an elegant implementation that is very well suited to a data warehouse.</span></p>
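<p>The epoch idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the paper&#8217;s design: the class and method names are hypothetical, both stores are plain lists, and a per-row deletion epoch stands in for the delete bitmask. It shows only how a query pinned to an epoch gets a consistent view without locking.</p>

```python
class EpochStore:
    """Toy write store plus read store, with every row stamped by epoch."""

    def __init__(self):
        self.read_store = []   # merged rows
        self.write_store = []  # recent inserts awaiting the next batch merge
        self.epoch = 0

    def advance_epoch(self):
        """Move the clock forward; later changes belong to the new epoch."""
        self.epoch += 1
        return self.epoch

    def insert(self, value):
        """Inserts land in the write store, stamped with the current epoch."""
        self.write_store.append(
            {"value": value, "inserted": self.epoch, "deleted": None}
        )

    def delete(self, value):
        """Mark matching rows as deleted instead of removing them."""
        for row in self.read_store + self.write_store:
            if row["value"] == value and row["deleted"] is None:
                row["deleted"] = self.epoch

    def merge(self):
        """Batch-move everything from the write store into the read store."""
        self.read_store.extend(self.write_store)
        self.write_store = []

    def query(self, as_of):
        """Return the values visible at the given epoch."""
        visible = []
        for row in self.read_store + self.write_store:
            inserted_by_then = row["inserted"] <= as_of
            still_present = row["deleted"] is None or row["deleted"] > as_of
            if inserted_by_then and still_present:
                visible.append(row["value"])
        return visible


store = EpochStore()
store.insert("a")        # epoch 0
store.advance_epoch()    # epoch 1
store.insert("b")
store.merge()
store.advance_epoch()    # epoch 2
store.delete("a")
```

<p>A query as of epoch 1 still sees both rows, while a query as of epoch 2 sees only <code>b</code>. Readers never wait on writers, because each one simply ignores anything stamped after its own epoch.</p>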


]]></content:encoded>
			<wfw:commentRss>http://andpointsbeyond.com/2007/02/16/more-on-vertica/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>What&#8217;s Vertica?</title>
		<link>http://andpointsbeyond.com/2007/02/16/whats-vertica/</link>
		<comments>http://andpointsbeyond.com/2007/02/16/whats-vertica/#comments</comments>
		<pubDate>Fri, 16 Feb 2007 05:46:00 +0000</pubDate>
		<dc:creator>Jay Jakosky</dc:creator>
				<category><![CDATA[Vertica]]></category>
		<category><![CDATA[data warehouse]]></category>
		<category><![CDATA[database]]></category>

		<guid isPermaLink="false">http://andpointsbeyond.com/?p=108</guid>
		<description><![CDATA[Started by a major contributor to the Ingres and Postgres projects, Vertica is implementing a read-optimized database that is an excellent fit for the data warehouse world. Given the founder&#8217;s support of open-source, I expect this company will follow the hybrid commercial/FOSS model of MySQL and others. Some core design features include highly compact storage, [...]


]]></description>
			<content:encoded><![CDATA[<p>Started by a major contributor to the Ingres and Postgres projects, Vertica is <a href="http://www.mit.edu/%7Edna/vldb.pdf">implementing a read-optimized database</a> that is an excellent fit for the data warehouse world. Given the founder&#8217;s support of open-source, I expect this company will follow the hybrid commercial/FOSS model of MySQL and others. Some core design features include highly compact storage, total ad-hoc read optimization, and a shared-nothing grid design that is dead easy to implement with commodity (not High-Availability) hardware. Via <a href="http://developers.slashdot.org/article.pl?sid=07/02/14/2020251">Slashdot.</a></p>
<p><a href="http://www.networkworld.com/news/2007/021407-vertica-oracle.html">New database company raises funds, nabs ex-Oracle bigwigs &#8211; Network World</a></p>


]]></content:encoded>
			<wfw:commentRss>http://andpointsbeyond.com/2007/02/16/whats-vertica/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
