SAP HANA versus Microsoft Business Intelligence and xVelocity, take two…

Not surprisingly, my original post generated huge traffic to my blog, as well as a couple of good comments. I am sorry that it took me a while to respond, but now that it is Saturday and I am at Starbucks, I don’t really have any excuse not to respond anymore…

Before I jump into a sentence-by-sentence analysis of the pro-HANA comment left on my original post, I just want to articulate the original point I was trying to make:

  • I am not saying that HANA does not work (in fact, that point is outside the scope of my post, although there is now enough factual evidence of quite a gap between the marketing hype and the actual capabilities of the product as it is available today)
  • I am, however, challenging the “innovation” angle by saying that there is nothing new in the way HANA is framed with respect to “in memory”, “appliance” and “real-time BI”
  • I am further challenging the value of HANA by saying that, in my opinion, everything HANA brings to the table in its current state and beyond can be accomplished using commodity software and hardware at a fraction of the cost, with the same or better performance

I find that there is surprisingly little content available today comparing and contrasting HANA with the Microsoft BI stack, with Microsoft taking a deaf-mute position on the subject and with SAP generating a lot of marketing hype (unfortunately often nonsensical) around the product.

So let’s try to take a deep dive into some of the claims made about HANA, using the aforementioned comment to guide the discussion.

“it is an appliance: Yes, but an open appliance” – I am not saying that there is no value in an appliance; in fact, I have recommended various BI appliances before. However, any appliance comes with a significantly higher cost and some handcuffs in the form of a stringent maintenance contract. It makes perfect sense if there is no commodity option available, but one of the main points I am trying to make is that a comparable or better level of performance can be achieved using the Microsoft BI stack on a commodity server from HP, Dell, and the like, any of whom will be happy to sell you a 16-core server with a terabyte of RAM for less than $20k.

HANA “works with detailed data, no Aggregates, No tuning needed, no caching or cubes to maintain the data, Real-time information, simplicity and agility” – there is actually a lot wrong in this statement, but being the patient man that I am, I will let most of it slide, or address it in some other post. There is one thing I cannot leave unaddressed, though, and that is how completely irresponsible it is. The insinuation here is that just because it is an appliance and is in memory, one can casually load all of one’s transactional data into the thing and get sub-second, real-time answers without doing any additional maintenance or design. I promise to be as civil as I can possibly be, but this is completely ridiculous.

Having an in-memory database brings a lot of benefits with respect to performance, but a bad design coupled with a bad query is certain to bring any hardware to its knees, particularly in a multi-user environment. Somehow, every vendor selling an in-memory solution makes these types of claims, and it scares me, because it creates the perception that one can solve any problem by just throwing more money and hardware at it, which is simply not true. Many of the best practices for designing a BI solution still apply to in-memory databases, and there will never be hardware fast enough to invalidate the years of knowledge the BI world has accumulated over time. This, by the way, explains why HANA does in fact have an Aggregation Engine, as illustrated by the picture below (right in the middle) taken from the SAP HANA Technical Overview document.
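To make the point concrete, here is a minimal sketch of the kind of design work that still pays off on any engine, in-memory or not: a pre-computed aggregate, expressed in SQL Server as an indexed view. The table and column names are hypothetical, for illustration only.

```sql
-- A hypothetical detail-level fact table.
-- An indexed view "materializes" the daily totals so that dashboard
-- queries do not rescan every transaction row, no matter how fast the box is.
CREATE VIEW dbo.vSalesByDay
WITH SCHEMABINDING
AS
SELECT SaleDate,
       SUM(Amount)  AS TotalAmount,
       COUNT_BIG(*) AS RowCnt       -- COUNT_BIG is required in indexed views with GROUP BY
FROM dbo.FactSales
GROUP BY SaleDate;
GO

-- The unique clustered index is what physically materializes the view.
CREATE UNIQUE CLUSTERED INDEX ix_vSalesByDay
    ON dbo.vSalesByDay (SaleDate);
```

Whether the engine is disk-based or in-memory, deciding which aggregates to maintain is still design work; no appliance does it for you.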

HANA “covers both OLTP and OLAP needs which is unique in the market place” – this is simply not true; SQL Server has covered both OLTP and OLAP for at least fifteen years. ROLAP mode in Analysis Services multidimensional models and DirectQuery mode in Tabular models bring the two workloads together, although combining OLTP and OLAP under one umbrella is not always a good thing.
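For illustration, this is roughly what the xVelocity side looks like on a transactional table in SQL Server 2012 (the table name is hypothetical). Note that in this release a columnstore index makes the table read-only, so loads are typically handled with partition switching or index rebuilds:

```sql
-- A columnstore (xVelocity) index over a hypothetical OLTP fact table.
-- Analytical scans use the compressed column segments; the same engine
-- still serves the transactional workload on the base table.
CREATE NONCLUSTERED COLUMNSTORE INDEX ix_FactSales_cs
    ON dbo.FactSales (ProductID, SaleDate, Amount);
```

In SQL Server 2012 the table becomes read-only while this index exists, which is the trade-off one of the commenters below rightly points out.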

HANA allows you to “create applications that run business logic in the in-memory engine rather than moving huge amounts of data from the DB layer to the App layer, which is also unique” – this statement is easy to misinterpret, but all I can do is offer my own interpretation, so here it is. Somehow SAP has managed to create a perception that all of the data in HANA is stored only in memory. That either means that HANA can never be turned off, as we all know what happens to computer memory when the hardware is off, or it means that there is a database there somewhere. The latter is actually true, as the data is persisted using a good old database engine… a very old database engine, in fact: HANA was cobbled together from three different products, with MaxDB (formerly known as Adabas D) being one of them. For me personally, it is nostalgic to see that the mainframe-era database I had to work with over fifteen years ago is now coming back and taking a bigger part of the database landscape again, but other than seeing an old dog learn new tricks, I am not sure how much genuine innovation we can find here. I understand that there is some effort to leverage more of the Sybase database engine in HANA as well, and very soon Adabas may be a thing of the past, but regardless, the main point is that a database is definitely part of HANA, with all the database-related baggage attached to it (which is not necessarily a bad thing).

HANA “brings Real-Time capabilities: Real Time replication of the data, so that ETL Jobs are not needed” – this statement is misleading and, from what I understand, also not true. HANA is a BI appliance (although it comes in several flavors, and not every flavor requires one), and as such it has no real-time capabilities per se. My research indicates that SAP currently offers two approaches to load data, and both require a separate investment in data-load technologies that are not really part of HANA:

  1. Business Objects Data Services – a great and robust technology to, ironically, develop ETL jobs that load data into HANA
  2. Trigger-based technology (SAP SLT) – which, as the picture below (from the SAP web site) indicates, is clearly not part of HANA

Selling any BI solution under the premise that it will provide real-time BI is, generally speaking, irresponsible as well. The problem of real-time BI cannot be solved with hardware or in-memory approaches alone. The trigger-based approach has inherent architectural risks that are easily validated by a quick web search on “SAP SLT problem” (in my case, Google brought back 129,000 results in 0.24 seconds). I am sure there are cases where that technology works great, but again, there is nothing in HANA per se that makes real-time data integration possible.
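For readers unfamiliar with the trigger-based pattern, here is a simplified sketch (generic SQL, hypothetical table names, not SAP SLT itself) of what this style of replication does to the source system: every transactional write also fires a trigger that records the change, which is exactly where the extra load and risk on the OLTP side come from.

```sql
-- A change-log table populated by a trigger on the source system.
CREATE TABLE OrdersChangeLog (
    OrderID    INT      NOT NULL,
    ChangeType CHAR(1)  NOT NULL,              -- 'I' = insert, for brevity
    ChangedAt  DATETIME NOT NULL DEFAULT GETDATE()
);
GO

-- Every insert into Orders now performs two writes instead of one;
-- a separate replication job later ships the logged rows to the target.
CREATE TRIGGER trg_Orders_Insert ON Orders
AFTER INSERT
AS
BEGIN
    INSERT INTO OrdersChangeLog (OrderID, ChangeType)
    SELECT OrderID, 'I' FROM inserted;
END;
```

The replication itself happens outside the target appliance, which is the point: the “real time” lives in the load technology, not in HANA.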

“BW does have an OLAP engine” – if BW had an OLAP engine, it would not have to sit on top of a relational database. BWA adds an in-memory component to the architecture, and there is a flavor of HANA specifically designed for BW, in which case BW uses HANA’s data engine, but facts are facts: BW is a metadata/semantic layer that uses another data engine to physically run its queries. Microsoft has been certified by SAP to use its xVelocity technology for BW (find more details here), which, again, makes it possible to leverage existing investments in Microsoft software and hardware to achieve the same performance benefits as HANA using the same column-store, in-memory type of technology.

This turned out to be a bit longer than I hoped, and there are still a lot of points to be made in comparing HANA with the Microsoft BI stack. I will conclude with another statement that echoes my original post somewhat.

SAP is in an interesting business situation where, in my opinion, it is running out of things to sell to its customers. The Business Objects acquisition has been extremely successful, fueling a great share of SAP’s revenue. The ERP business in general has reached its apex in terms of market saturation and maturity, and I would not be surprised if every SAP customer has already acquired some Business Objects in one fashion or another. This explains the recent wave of acquisitions made by SAP, as well as the big bet on HANA. HANA therefore makes complete sense for SAP, and having what is commonly considered the best field sales force out there, SAP is going to sell a lot of it. My goal with this post is not necessarily to discourage anyone from investing in this technology. My goal is to dispel the vaporware promises and distill the conversation down to some factual information, so that the people with $$$ can make a better decision, one not based on marketing hype. I believe we are heading toward a heterogeneous environment where Microsoft is certain to own the end-user BI experience, SAP is certain to own ERP and applications, and everything in between is still a lot of grey area. I have no doubt that, as time goes by, SAP will be able to identify clear differentiators for HANA; however, today, just about everything that is supposed to be an innovative differentiator for HANA is already available in the Microsoft BI stack.

6 thoughts on “SAP HANA versus Microsoft Business Intelligence and xVelocity, take two…”

  1. I agree that, just like many products out there, SAP HANA is marketed as the best thing since sliced bread, and that a lot of claims from SAP are easily proven false, as you just did.

    However, one thing that I will dispute is that xVelocity is functionally the same as SAP HANA’s columnar features. xVelocity only works on read-only tables, whereas SAP HANA manages full ACID in real time. I’m not a Microsoft BI guy, so I can’t claim that a real-time data mart is impossible to implement on the MS stack, but xVelocity definitely can’t support that.

    The thing is that “really” real-time data marts are a requirement for a tiny percentage of actual business applications, so SAP HANA really fills a niche, rather than being a product that will sweep away the likes of SQL Server and Oracle… Unless the pricing is dropped dramatically.

    Also, I have other gripes about SAP HANA. Their programming environment is very poor at the moment (e.g. you can’t even create your own SQL function), and their claim that not having to manually create materialized views and indexes speeds up development time will easily be outweighed by other development quirks.

    By the way, I’m extremely keen to find some performance benchmarks and real-life experiences of OLTP on SAP HANA. So far I have found plenty on OLAP and nothing on OLTP, so I’m still fairly skeptical of all the claims that ERPs will be ported to HANA and run hunky-dory.

  2. Re: HANA allows to “create applications that run business logic in the in-memory engine rather than moving huge amounts of data from the DB layer to the App layer, which is also unique”

    Actually, what SAP means is that you can run stored procedures in HANA. They seem to think this is new, and have developed their own procedural SQL dialect (called SQLScript) rather than using the ANSI one.

    Where this gets sneaky is that SAP is rewriting Suite to use stored procedures, which is a massive speed gain, but they will only do this for HANA, not Oracle, DB2 or SQL Server. This gives a massive advantage to HANA that outweighs the inherent inefficiencies of column store for OLTP.

  3. I am currently working on a project migrating from Microsoft SSAS to HANA Views, and this blog expresses exactly what I’m feeling. HANA Views are far from doing what Microsoft SSAS does. Thanks very much; very useful.

    1. This is true if you don’t know how to implement HANA views. HANA views are closer to writing a query (without coding) than to building elegant SSAS models.
      But in the end, if you learn how to build them, you’ll have one powerful model with virtually no limitations, and you won’t need to schedule any fancy background tasks to refresh your data.
      I have experienced two DWH implementations, both with terabytes of data and both with huge investments. At the end of the story, on the MS stack most of the data was updated in nightly runs lasting hours, and every time you need to fix or enhance the model it’s a pain. HANA is real-time 24/7, has no issue processing billions of records in seconds, and changes are much, much simpler.
