In-memory database systems (IMDSs) offer breakthrough performance by eliminating I/O, caching, data transfer, and other overhead that is hard-wired into traditional "on-disk" database management systems (DBMSs). But some applications require a high level of data durability. In other words, what happens to the data if somebody pulls the plug?
As a solution, IMDSs offer transaction logging, in which changes to the database are recorded in a log that can be used to recover the database automatically in the event of failure.
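To make the idea concrete, here is a minimal sketch of transaction logging in Python. It is a toy key-value store, not McObject's implementation: the class name LoggedStore, the JSON-lines log format, and the method names are all assumptions made for illustration. Each change is appended to a log file and forced to storage before it is applied in memory, and reopening the store replays the log to rebuild the in-memory data.

```python
import json
import os
import tempfile


class LoggedStore:
    """Toy in-memory key-value store with an append-only transaction log.

    Hypothetical illustration only; names and log format are assumptions.
    """

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}          # the "database" lives entirely in memory
        self._replay()          # recover state from any existing log
        self._log = open(log_path, "a", encoding="utf-8")

    def _replay(self):
        """Rebuild the in-memory data by re-applying every logged change."""
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as f:
            for line in f:
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    break       # torn final record from a crash: stop here
                self._apply(record)

    def _apply(self, record):
        if record["op"] == "put":
            self.data[record["key"]] = record["value"]
        elif record["op"] == "delete":
            self.data.pop(record["key"], None)

    def _commit(self, record):
        # Append the change to the log and force it to storage first,
        # then apply it in memory. Only the log touches the device.
        self._log.write(json.dumps(record) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())
        self._apply(record)

    def put(self, key, value):
        self._commit({"op": "put", "key": key, "value": value})

    def delete(self, key):
        self._commit({"op": "delete", "key": key})

    def close(self):
        self._log.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "txn.log")
    store = LoggedStore(path)
    store.put("a", 1)
    store.put("b", 2)
    store.delete("a")
    store.close()               # simulate shutdown; memory contents are gone
    recovered = LoggedStore(path)
    print(recovered.data)       # state rebuilt purely from the log
    recovered.close()
```

Note that the only storage traffic in this sketch is a sequential append to the log; reads never touch the device at all.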
Wait a second, critics say — doesn't this logging simply re-introduce storage-related latency, which is the very thing IMDSs have "designed out" to gain their high performance? Won't an IMDS with transaction logging (IMDS+TL) perform about the same as an on-disk DBMS?
As a leading IMDS vendor, McObject has always offered logical (if pretty technical) explanations of why this is not the case, i.e., why IMDSs with transaction logging retain their speed advantage. But nothing beats cold, hard evidence, so we decided to test our own claims. We also wanted to see how different data storage technologies affected any performance difference, so we ran all the tests using a hard-disk drive (HDD), a solid-state drive (SSD), and a state-of-the-art NAND flash memory platform (Fusion ioDrive2). In the case of the traditional DBMS, these devices stored both the database records and the transaction log. With the IMDS+TL, they stored only the transaction log.
Let’s talk results first, and then the reasons why. For record inserts, moving from an on-disk DBMS to an IMDS with transaction logging (but still storing the transaction log on HDD) delivered a 3.2x performance gain. In other words, you can have your data durability and triple your speed (and then some) for inserts, even when using garden-variety HDD storage.
It gets even better, though, using today’s faster storage options to hold the IMDS’s transaction log. For example, storing the transaction log on flash memory-based SSDs boosted IMDS+TL insert performance 9.69x over the DBMS writing to hard disk.
Next, we swapped in the Fusion ioDrive2 to store the transaction log, and racked up a 20.05x performance gain over the DBMS+HDD combination. That’s right, inserting records into the database became about 20 times as fast (roughly 1,900% faster), while retaining transaction logging’s compelling data durability benefit (compelling, that is, to anyone who has lost critical data in a system crash). The gain was even more dramatic for database deletes, at 23.19x.
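The three insert figures can also be compared with one another to isolate the effect of the log device alone. The short Python sketch below does that arithmetic. It assumes all three multipliers are measured against the same on-disk DBMS+HDD baseline, which is how the text reads; the names gains and relative_gain are made up for this example.

```python
# Insert-performance multipliers quoted above, each relative to the
# on-disk DBMS on HDD (assumed to be a common baseline).
gains = {"HDD": 3.2, "SSD": 9.69, "Fusion ioDrive2": 20.05}


def relative_gain(device, reference="HDD"):
    """Speedup of IMDS+TL with the log on `device` versus on `reference`."""
    return gains[device] / gains[reference]


for device in gains:
    print(f"Log on {device}: {relative_gain(device):.2f}x vs. log on HDD")
```

Under that assumption, moving only the transaction log from HDD to SSD is worth about 3x for inserts, and moving it to the Fusion ioDrive2 is worth a little over 6x.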
Why is an in-memory database system with transaction logging so much faster than a disk-based DBMS for the most I/O-intensive operations? First, on-disk DBMSs cache large amounts of data in memory to avoid disk I/O. The algorithms required to manage this cache are a drain on speed, and an IMDS (with or without transaction logging) eliminates the caching subsystem entirely. There are other reasons, too. (As promised, they are technical.) For a full discussion, download McObject's free white paper, "In Search of Data Durability and High Performance: Benchmarking In-Memory & On-Disk Databases with Hard-Disk, SSD and Memory-Tier NAND Flash."
The paper describes other test scenarios — and in all of them, the Fusion ioDrive2 dramatically outperformed the other storage devices. Other pages on this Web site explain the reasons more eloquently and completely than I can. My understanding as a "database guy" is that Fusion ioDrive2 differs from SSD storage in that it presents flash to the host system as a new memory tier, integrating flash close to the host CPU and eliminating hardware and software layers that would otherwise introduce latency by standing between CPU and SSD storage devices.
As someone who works with system designers seeking the fastest possible responsiveness in fields ranging from telecommunications to financial trading, I can tell you that the prospect of making I/O-intensive (high-overhead) operations roughly 20 times faster certainly finds a receptive audience.