Debunking Another Myth: Column-Stores vs. Vertical Partitioning

Editor's note: This post was co-authored by Daniel Abadi and Samuel Madden.

Last week we discussed the myth that a heavily indexed row-store can provide column-store-like performance. In particular, there is a common misconception that a column-store is basically like having a row-store with an index on every column. In fact, an indexed column in a row-store and a regular column in a column-store are very different data structures: an index maps column values to tuple IDs while a column maps tuple IDs to column values. The different data structures are useful in different situations.

This week, we debunk another commonly proposed approach for making a row-store perform like a column-store: vertically partitioning the row-store. Vertical partitioning is a technique that can be used to enhance performance on read-mostly data warehouse workloads. The idea is to store an n-column table as n new tables, each containing two columns: a tuple ID column and one data-value column from the original table. For example, the three-column table:

[Image: myth2_img1.jpg — an example three-column customers table]

could be transformed into the following three tables:
[Image: myth2_img2.jpg — the three two-column vertical partitions]

Each table is clustered by Tuple ID. Queries over the original table are then rewritten into queries over the new tables. For example, the simple query:

SELECT name, city
FROM customers

would be rewritten into:

SELECT t1.value, t2.value
FROM Name AS t1, City AS t2
WHERE t1.tuple_id = t2.tuple_id

Note that a vertically partitioned store functions like a column-store. If only 2 out of the 3 attributes of a table are requested by a query (as in the example above), then only 2 out of the 3 partitions need to be accessed. Of course, a join must be performed to merge these multiple attributes together, but since each vertical partition is clustered by the join key (Tuple ID), the join is not expensive. Furthermore, due to the clustering on Tuple ID, a vertically partitioned data structure is like a column-store in that each partition is a Tuple_ID-to-value mapping (which, as we showed in last week's post, is the opposite of the value-to-Tuple_ID mapping used in indexes).
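The transformation and rewrite above can be sketched end to end with Python's built-in sqlite3 module. This is a toy illustration only; the table contents and column set are invented, and a real vertically partitioned warehouse would of course not live in SQLite:

```python
import sqlite3

# In-memory database for the sketch.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# The original row-store table (example data, invented for illustration).
cur.execute("CREATE TABLE customers (tuple_id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
cur.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                [(1, "Alice", "Boston"), (2, "Bob", "Chicago")])

# One two-column table per attribute, each clustered by tuple_id
# (the PRIMARY KEY gives us the clustering in this sketch).
cur.execute("CREATE TABLE Name (tuple_id INTEGER PRIMARY KEY, value TEXT)")
cur.execute("CREATE TABLE City (tuple_id INTEGER PRIMARY KEY, value TEXT)")
cur.execute("INSERT INTO Name SELECT tuple_id, name FROM customers")
cur.execute("INSERT INTO City SELECT tuple_id, city FROM customers")

# The rewritten query: reassemble the requested attributes by
# joining the two partitions on tuple_id.
rows = cur.execute("""
    SELECT t1.value, t2.value
    FROM Name AS t1, City AS t2
    WHERE t1.tuple_id = t2.tuple_id
""").fetchall()
print(rows)  # [('Alice', 'Boston'), ('Bob', 'Chicago')]
```

Note that the third partition is never touched by the rewritten query, which is exactly the I/O saving the approach is after.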

Hence, a column-store and a vertically partitioned row-store are very similar. They have the same set of trade-offs relative to a normal row-store (better for read queries, slower for non-batch write queries). Thus, it is only natural to expect the two approaches to perform similarly well on read-mostly data warehouse workloads.

Surprisingly, in practice, this is not the case.

In the same SIGMOD 2008 paper that we discussed in last week's blog posting, "Column-Stores vs. Row-Stores: How Different Are They Really?," we implemented vertical partitioning inside of a commercial row-store on a read-only benchmark. The benchmark we used was the Star Schema Benchmark, a recently proposed benchmark designed to be more "typical" of data warehousing data and queries than TPC-H. We compared the performance of the vertically partitioned commercial row-store with the same row-store under a more normal configuration (optimized by a professional DBA using best practices) and with a column-store. The results are shown in the figure below:

[Image: myth2_img3.jpg — benchmark results for the column-store, the conventionally configured row-store, and the vertically partitioned row-store]

On this read-only data warehouse benchmark, one expects the column-store to outperform the row-store. We found this to indeed be the case. However, the surprising result is that instead of the vertically partitioned row-store performing similarly to the column-store as the discussion above suggests it should, it performs an order of magnitude worse than the column-store and a factor of 3 worse than the original row-store!

In the paper, we explore the reasons behind this surprising result, looking in detail at performance on different queries in the benchmark. The following is a list of the high-level reasons we found for the poor performance of vertical partitioning on the row-store.

1) Data sizes. Table scans are often the best choice (relative to index scans) for data access in data warehouse workloads. The reason is that, in general, a large amount of data is needed by the query operators (e.g., to perform a summarization or an aggregation). A good rule of thumb is that unless fewer than 0.1-1% of the tuples are accessed by a query, an index scan will be slower than (or no faster than) a simple sequential scan. However, the time to perform a table scan is directly proportional to the size of the table.

Unfortunately, the Tuple ID column significantly increases the total storage required for the vertical partitioning approach. Since the columns that are accessed from the fact table in data warehouse queries are generally foreign keys (for joins with a dimension table) or are otherwise a numeric type (such as 'price' or 'revenue') that will be input to an aggregate operator, the size of the Tuple ID is generally on the order of the same size as the actual column value data. Thus, the existence of the Tuple ID column basically doubles the size of the table.

Furthermore, each tuple in a table contains a tuple header. Usually the size of a tuple header is insignificant, since the size of the actual data stored in a tuple is generally much larger. However, in the vertical partitioning case, where each table is really narrow (each containing two narrow columns), the size of the tuple header relative to the size of the rest of the tuple is no longer insignificant.  For example, in the open source PostgreSQL database, which is reputed to have a small header relative to most commercial products, the header size is 24 bytes. Compare this to the 8 bytes of actual data per tuple for a two-column vertical partition containing two integer columns. Column-stores avoid the tuple header overhead problem by storing tuple headers just once in their own separate column, and using concurrency control schemes that do not require checking the tuple header on every tuple access.
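The header overhead described above is easy to quantify. A back-of-envelope calculation, using the 24-byte PostgreSQL header figure from the text as the example:

```python
# Per-tuple header overhead in a two-integer-column vertical partition,
# using the 24-byte PostgreSQL header figure cited above as an example.
HEADER_BYTES = 24
INT_BYTES = 4                              # one 4-byte integer column

data_bytes = 2 * INT_BYTES                 # tuple_id + value: 8 bytes of actual data
tuple_bytes = HEADER_BYTES + data_bytes    # 32 bytes stored per tuple

overhead_fraction = HEADER_BYTES / tuple_bytes
print(f"{overhead_fraction:.0%} of each stored tuple is header")  # 75%
```

In other words, three quarters of every tuple in such a partition is bookkeeping rather than data.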

The bottom line is that the full 17-column fact table in our benchmark took up approximately 4 GB of space after compression. Meanwhile, each vertical partition took up between 0.7 and 1.1 GB of space after compression. In the Star Schema Benchmark, the average query accesses 4-5 vertical partitions. This results in approximately the same amount of data being scanned in the vertical partitioning case as in the normal row-store case (even though only about 25% of the vertical partitions need to be accessed per query). Thus, these space overheads completely eliminate the 'efficient data reading' advantage of vertical partitioning!
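Plugging the numbers from the paragraph above into a quick calculation makes the point concrete:

```python
# Scan-size arithmetic using the benchmark figures quoted above.
full_table_gb = 4.0               # compressed 17-column fact table
partition_gb = (0.7 + 1.1) / 2    # average compressed vertical partition
partitions_per_query = 4.5        # average over the benchmark queries

scanned_gb = partitions_per_query * partition_gb
print(round(scanned_gb, 2))       # ~4 GB: about the same as a full table scan
```

So even though a query touches only a quarter of the partitions, it scans roughly as many bytes as a full scan of the original fact table.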

(As a side note, the row-store product that we used would not allow the use of 'virtual tuple ids' if we wanted to use tuple id as a join key. If the product allowed this, some of the space overhead could have been avoided).

2)  Horizontal partitioning. Under the normal configuration, the row-store product was able to take advantage of horizontal partitioning to improve performance. In particular, many of the queries in the benchmark contain a predicate on the 'year' attribute of the date dimension table. The row-store was able to use a 'partition-by-reference' feature to partition the fact table by the year dimension attribute to yield a factor of two performance improvement on average. Once we vertically partitioned the data, we could no longer use the 'partition-by-reference' feature to horizontally partition the vertical partitions, since only one of the vertical partitions contained the appropriate reference to the date dimension table. If the product allowed for an extra level of indirection in 'partition-by-reference' (using the tuple id) we could have partitioned the vertical partitions by year and seen the factor of two performance improvement; but as far as we know, no row-store product allows for 'partition-by-reference-by-reference'.

3) Partition joins. Although joins between vertical partitions can generally be done using high-performance techniques thanks to the clustering by tuple ID, the sheer number of joins required by the vertical partitioning approach occasionally overwhelmed the query optimizer, resulting in suboptimal plans. Column-stores generally avoid this issue by considering joins of columns within the same table separately from joins across tables; since within-table joins are so common, they typically heuristically push high-performance column merge-joins to the bottoms of query plans.
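To see why these within-table joins are cheap, here is a minimal sketch of a merge-join over two partitions that are both sorted by tuple ID (the column data is invented; real systems operate on pages of values, not Python lists). A single pass over each input suffices, with no hashing or sorting at join time:

```python
def merge_join(left, right):
    """Join two lists of (tuple_id, value) pairs, each sorted by tuple_id."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lid, lval = left[i]
        rid, rval = right[j]
        if lid == rid:              # matching tuple IDs: emit a joined row
            out.append((lid, lval, rval))
            i += 1
            j += 1
        elif lid < rid:             # advance whichever side is behind
            i += 1
        else:
            j += 1
    return out

# Toy partitions, pre-sorted by tuple ID (values invented for illustration).
name = [(1, "Alice"), (2, "Bob"), (3, "Carol")]
city = [(1, "Boston"), (3, "Denver")]
print(merge_join(name, city))  # [(1, 'Alice', 'Boston'), (3, 'Carol', 'Denver')]
```

Each individual join is linear in the partition sizes; the optimizer's difficulty is not the cost of any one join but planning a query that contains many of them.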

We found a variety of other reasons why vertical partitioning does not in practice yield column-store performance, including a lack of column-oriented compression and a lack of optimizations for fixed-width attributes; but the above three reasons were responsible for the bulk of the observed performance difference.

In conclusion, our published experiments show that vertically partitioning a row-store can yield very different performance relative to using a column-store. The observed poor performance of the vertically partitioned row-store is likely the reason why most DBAs do not employ complete vertical partitioning to improve row-store performance; rather, they tend to use hybrid methods such as multi-column vertical partitions or column-subset materialized views (nonetheless, our paper contains some experiments that show that the column-store still outperforms the row-store, even when the row-store contains the optimal set of materialized views).

These results present a snapshot of the state of affairs as they exist today. However, as column-stores continue to gain market share in the data warehousing market, we expect some of the commercial row-store products to fix some of the issues mentioned above. The data size issue seems to be particularly low hanging fruit--it is easy to envision a row-store that uses virtual tuple IDs as join keys and that avoids the duplication of tuple headers across vertical partitions. So while vertical partitioning performance can (and most likely will) improve, how close it can get to column-store performance remains an open question.

Intuitively, one expects a DBMS designed specifically for vertical partitioning (i.e., a column-store) to have a fundamental advantage, since column-stores can make column-specific optimizations in all parts of the DBMS code (the query executor, the query optimizer, the storage layer, etc.), while a row-store will occasionally have to make trade-offs to perform well for both row-oriented and column-oriented processing. In our next blog posting, we will explore this topic in more detail, showing how column-oriented optimizations made to DBMS components outside of the traditional (storage layer) realm associated with column-stores can yield order-of-magnitude performance improvements. Hence, it will require widespread and extensive code modifications for row-stores to even approach column-store performance.

 

About this Post

This page contains a single post by Daniel Abadi published on July 31, 2008 3:46 PM.
