My name is Stuart Frost. I founded DATAllegro in 2003 and I've been the CEO of the company from the beginning.
As CEOs go, I'm pretty technical and still get heavily involved in specifying the architecture of the product, although I haven't written any of the DATAllegro code (much to the relief of the engineering team).
I have a degree in electronic engineering and started my career as a programmer in the telecoms and defense industries back in England, writing low-level code for such things as phone exchanges and sonar and radar systems. While I didn't know it at the time, I guess this mix of software and hardware was an ideal grounding for what I do now: leading an appliance vendor.
I started my first company, SELECT Software Tools, in 1988 and ran it as CEO & Founder for 10 years, through several rounds of funding and a Nasdaq IPO that brought me to the US. The VC that backed SELECT made a 26x return. After leaving that company in 1998, I took a couple of years off and missed most of the Internet boom. Great timing!
By late 2002, I was looking for my next startup idea. While at SELECT, I'd been involved in several large database design projects (SELECT was a software design tools company), so I started studying the DBMS market to see if there were any disruptive opportunities and quickly started focusing on the data warehousing sector.
The database market in general was a no-go area for VCs through the 1990s. After all, Oracle had won, hadn't they? This started to change with the introduction of a couple of strong open source databases, namely MySQL and Postgres, and accelerated when Netezza attacked the data warehousing market.
Netezza came to market with an interesting business model and value proposition:
- It leveraged an open source DBMS (Postgres) to reduce engineering costs and time to market.
- It used an appliance business model to create a tightly integrated software and hardware stack, thereby removing a significant area of complexity for DBAs and system admin staff.
- It shifted to sequential I/O from the more typical random I/O generated by the incumbents. This allowed the use of much larger and cheaper SATA disk drives and led to a highly competitive price/performance ratio.
However, there is a significant flaw in Netezza's strategy - in achieving the third point (the shift to sequential I/O), they created a highly proprietary hardware platform and, effectively, a proprietary software platform (with little of Postgres remaining).
Netezza secured its first few customers around the time DATAllegro was being founded. Looking at the Netezza architecture, I realized that there was an opportunity to create a similar value proposition while using a completely non-proprietary platform. Hence, my vision was to create a massively parallel DW appliance with an embedded, off-the-shelf open source DBMS (Ingres) running on Linux and using completely standard servers, networking and storage from major vendors.
DATAllegro
Almost five years after starting DATAllegro, I'm very pleased to see that my vision has become a reality. We now have a highly competitive DW appliance that uses an array of Dell servers (or Bull servers in Continental Europe), Cisco networking and EMC storage.
Each server runs a highly tuned copy of the Ingres DBMS on SUSE Linux. Our proprietary software turns these separate databases into a massively parallel, shared-nothing database system that offers incredibly good performance, especially under complex mixed workloads.
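To make the shared-nothing idea concrete, here is a minimal Python sketch. It is not DATAllegro's software: the hash-based row placement, the function names and the in-memory "nodes" are all assumptions made purely for illustration. Each node holds its own slice of the data, computes a partial answer locally, and a coordinator merges the partial results.

```python
from collections import defaultdict


def node_for(key, num_nodes):
    """Pick a node for a row from its distribution key (assumed scheme)."""
    # A simple, deterministic hash so the example behaves the same on every run.
    return sum(key.encode("utf-8")) % num_nodes


def distribute(rows, num_nodes, key_column):
    """Place each row on exactly one node; nodes share nothing."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[node_for(row[key_column], num_nodes)].append(row)
    return nodes


def local_sum(node_rows, group_column, value_column):
    """Work done independently on one node: a grouped partial sum."""
    totals = defaultdict(int)
    for row in node_rows:
        totals[row[group_column]] += row[value_column]
    return dict(totals)


def parallel_sum(nodes, group_column, value_column):
    """Coordinator step: merge the partial sums returned by every node."""
    merged = defaultdict(int)
    for node_rows in nodes:
        partial = local_sum(node_rows, group_column, value_column)
        for group, subtotal in partial.items():
            merged[group] += subtotal
    return dict(merged)


if __name__ == "__main__":
    sales = [
        {"customer": "alice", "region": "east", "amount": 10},
        {"customer": "bob", "region": "east", "amount": 20},
        {"customer": "carol", "region": "west", "amount": 5},
    ]
    nodes = distribute(sales, num_nodes=3, key_column="customer")
    print(parallel_sum(nodes, "region", "amount"))
```

In a real system the per-node work would run concurrently on separate servers; the loop here only shows the division of labour between the nodes and the coordinator.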
The appliance model is key to getting great performance. Tuning a large database using traditional approaches is extremely difficult and requires highly skilled DBAs. One of the main problems is the difficulty of understanding and tuning the interface between the DBMS software and the underlying OS and hardware platform. Database vendors such as Oracle and Microsoft have to build their software to run on any hardware. Hence there is a plethora of tuning parameters and options for the DBAs and sys admins to set up. In the appliance model, we have the luxury of controlling the entire software and hardware stack from SQL to storage. As a result, we can hide all of the complexity.
Another very important aspect of performance is ensuring sequential reads under a complex workload. Traditional databases do not do a good job in this area - even though some of the management tools might tell you that they are! What we typically see is that the combination of RAID arrays and intervening storage infrastructure conspires to break even large reads by the database into very small reads against each disk. The end result is that most large DW installations have very large arrays of expensive, high-speed disks behind them - and still suffer from poor performance.
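The way a striped array breaks up a large read can be sketched in a few lines of Python. This assumes a plain RAID-0 style layout with a fixed stripe unit; the function name, the 64 KiB stripe unit and the eight-disk array are illustrative assumptions, not a description of any particular product.

```python
def split_read(offset, length, stripe_unit, num_disks):
    """Split one logical read into (disk, size) chunks under simple striping."""
    chunks = []
    end = offset + length
    while offset < end:
        stripe_index = offset // stripe_unit      # which stripe unit we are in
        disk = stripe_index % num_disks           # stripe units rotate across disks
        within = offset % stripe_unit             # position inside the stripe unit
        size = min(stripe_unit - within, end - offset)
        chunks.append((disk, size))
        offset += size
    return chunks


if __name__ == "__main__":
    KIB = 1024
    # One 1 MiB read issued by the database (assumed sizes).
    chunks = split_read(0, 1024 * KIB, stripe_unit=64 * KIB, num_disks=8)
    print(len(chunks), "requests, largest is", max(s for _, s in chunks) // KIB, "KiB")
```

No single disk ever sees a request larger than the stripe unit. Once several queries run at the same time, each disk's small requests are interleaved with those of other queries, which is how a read that looks sequential to the database can turn into random I/O at the disk.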
Through a lot of trial and error, smart engineering and code changes to the database engine, we've been able to create a platform that sustains sequential reads - even under very high levels of concurrency. This allows us to use relatively low-cost, high-capacity SATA disk drives and therefore to offer very strong price/performance.
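A back-of-the-envelope model shows why sustained sequential reads matter so much for inexpensive drives. The model and the numbers below (10 ms to position the head, 60 MB/s streaming rate) are assumptions chosen for illustration only, not measurements of any drive or of the DATAllegro appliance.

```python
def effective_throughput(request_bytes, positioning_s, transfer_bytes_per_s):
    """Bytes per second delivered when every request pays one positioning delay."""
    transfer_s = request_bytes / transfer_bytes_per_s
    return request_bytes / (positioning_s + transfer_s)


if __name__ == "__main__":
    POSITIONING_S = 0.010      # assumed seek plus rotational delay
    STREAM_RATE = 60e6         # assumed sustained transfer rate in bytes/s

    for label, size in [("64 KiB requests", 64 * 1024),
                        ("16 MiB requests", 16 * 1024 * 1024)]:
        rate = effective_throughput(size, POSITIONING_S, STREAM_RATE)
        print(f"{label}: {rate / 1e6:.1f} MB/s")
```

Under these assumptions the drive delivers only a small fraction of its streaming rate when it has to reposition before every small request, and close to the full rate when reads stay large and sequential. That gap is what makes high-capacity SATA drives viable once the reads are kept sequential.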
Exciting Times
It's an exciting time to be involved in the data warehousing market. It's rare to see a $30bn market go through such a rapid transition, with a few powerful incumbents under attack from several fast-moving, innovative disruptors.
In my next few blog entries, I'll be talking about the various players in the market and how I think they fit in and stack up. Don't worry, it won't be the usual self-serving PR blog - I'll be honest and straightforward about how I see the strengths and weaknesses of the various players, including DATAllegro.