SAP Hana vs Microsoft SQL Server, the war is on…

and the Germans are making the first move… I have written two posts about Hana vs. xVelocity (Part I and Part II) making the basic point that there is nothing truly visionary or revolutionary in SAP Hana from a technology perspective, and that a customer can get all the benefits (or rather alleged benefits, as many early Hana adopters will attest) and the same or better level of performance on commodity hardware, without having to pay millions for a hardware appliance. I was not saying that there is anything particularly wrong with Hana, although it is obviously still an immature product, since not a single SAP customer is running SAP ERP in production on it yet. Rather, my point was that there is no need to pay millions of dollars for functionality that is readily available for a lot less.

SAP’s official target with Hana until 6:46:59 AM today was Oracle, but it seems that Microsoft SQL Server is now officially a target as well, which at the very least is a good indication of how seriously SAP is taking Microsoft’s in-memory story.

I am still working on Part III of the xVelocity vs. Hana saga, so I just wanted to write up a quick critique of the aforementioned post from SAP:
 

  • “There are lots of technical reasons to believe that HANA is a far superior product today than what Microsoft has announced will be available a few years out” – again, not a single link or reference to back up this bold claim. Last time I checked, 40% or so of SAP deployments run on Microsoft SQL Server, as compared to 0% that run on Hana, so calling Hana a “far superior product” is just silly. But why provide evidence when misinformation will suffice? Here are some more unsubstantiated pearls:
    • “The SQL Server products require a batch process to build the index” (obviously he has no clue how clustered indexes work)
    • “OLTP product does only OLTP so a redundant copy of the data is required” (this is so not true it made me laugh)
    • “There are odd SQL limitations that require hand-tuning to get it all to work” (seriously? Odd limitations? How odd??? And what are they??????)
  • The third paragraph of the post gives us good insight into what SAP thinks a DBA’s job description is. Unfortunately it also makes it apparent that whoever wrote that piece has no clue about what real DBAs do, and more importantly what they don’t do (build cubes? re-architect the eco-system?). Clearly, the article is not intended for a technical audience.
  • “Shared-nothing gave us the basis for ‘big data’ databases” – this is an oxymoronic statement; by definition, big data is something that is too big to be stored in memory, and SAP itself positions Hadoop as the big data answer. I am starting to get a feeling that this article was written by a marketing guy who does not even know SAP’s own products and how they work.
  • “The starting point for HANA is based on the recognition that the current and upcoming hardware technologies are capable of solving for all of this in a single database instance if only the database was re-written to fully utilize the hardware” – this is just silly; give us at least one example to prove the point. HANA does not even support the newest Intel Ivy Bridge processors, so how fully optimized for the hardware is it really??? Sounds like SAP’s strategy is for IT departments to become hostages of the hardware vendors.
  • I am going to skip the apples-to-oranges comparison between the two test cases from Microsoft and SAP because, other than the fact that they are obviously not apples to apples, I don’t have (and the article does not provide) any factual information to compare and contrast the two.

And then the author closes by stating that Microsoft SQL Server’s “1995 architecture was designed for single-core x486 systems with 256MB of RAM”. This is not cool because it’s just a straight-up lie. But it is also ironic, because the so-called innovative Hana platform, according to this Wiki page, is cobbled together from several not-so-new technologies:

  1. MaxDB, also known as Adabas – a 35-year-old database that used to live only on mainframes
  2. TREX – at least 15 years old
  3. P*Time – acquired by SAP in 2005 (notice a pattern here: all of the IP in HANA is acquired)

So it does not take a lot of research to understand that if one takes the marketing fluff out, there is nothing really innovative or new in Hana, nothing that has not existed or been available from many other vendors for a very long time. This SAP blog post is not very significant in making a good case for HANA, but my guess is it was not really intended to. The real intention is for SAP’s marketing department to officially declare war on Microsoft SQL Server; now it’s up to Microsoft to make the next move.

15 thoughts on “SAP Hana vs Microsoft SQL Server, the war is on…”

  1. Awesome article outlining some realities about the great MS developments, instead of SAP’s misconceptions. For many of the customers I work with, a realistic approach to a BI solution is what they need, not more overpriced vapour-ware.

  2. I do agree with you that the marketing effort around HANA is way over the top. Two weeks ago I was at SAPPHIRE in Madrid, and if I got a penny for each time HANA was mentioned, it would have paid for the trip by itself.

    But I think there has to be a segregation between the technical and the marketing aspects of HANA. While I hate the latter, the reality is that in-memory databases will be a game changer, from several perspectives:
    – processing time: if today we are kind of used to waiting until the next day to see some reports (due to the “processing time” of the data), in-memory will make us demand reporting in real time. Think about the advantages a retail company can draw from such a technology, for instance.
    – simplification of the data model: I do believe that you have missed the point about the OLTP statement. In the current world, systems use a normal RDBMS for transactional systems (e.g. ERP) and then use OLAP-optimised systems (e.g. BI) for reporting. Reporting directly on the transactional system is slow, and using an OLAP system for transactions is slow and inefficient. This is what the duplication of data refers to: you need an ETL process to transfer data from the transactional to the analytical data stores, and you end up with data (albeit in different formats) duplicated across the systems. Even in the BI world it is very common, when dealing with very high volumes of data, to produce aggregates to improve query response time (see the sketch after this list). With HANA practically all of this optimisation is no longer necessary. You just dump all the transactional data in it and run whatever queries you want.
    – reduction of data volume: databases are notoriously hungry for storage space, which is understandable given the optimisations they need to provide for data access and query performance. It becomes even more visible in the case of OLAP-optimised systems. What HANA promises is an 8x to 20x reduction in the space needed to store the same data. If your database is 1TB now, under HANA it could be 100GB – which significantly reduces the price gap between the solutions.
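
    To make the duplication point concrete, here is a minimal sketch (all table and column names are made up for illustration) of the classic ETL copy from a transactional store into a reporting store, plus the kind of pre-computed aggregate that HANA claims to make unnecessary:

      -- Copy new transactional rows into the reporting (warehouse) schema;
      -- the same data now exists twice, in two different formats.
      INSERT INTO dw.FactSales (OrderDate, ProductId, Amount)
      SELECT o.OrderDate, o.ProductId, o.Amount
      FROM erp.Orders AS o
      WHERE o.OrderDate >= '20121101';

      -- Pre-compute an aggregate table so that reporting queries respond quickly.
      SELECT OrderDate, ProductId, SUM(Amount) AS TotalAmount
      INTO dw.AggSalesByDay
      FROM dw.FactSales
      GROUP BY OrderDate, ProductId;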

    In conclusion, as much as I hate the marketing machine and the plain stupid claims they come up with in their “white papers”, dismissing HANA as a fad or as marketing hype would be a mistake. I’m certain that Oracle and Microsoft will follow the trend and offer their own in-memory products (100% in-memory, not hybrids). In the meantime SAP can still enjoy the benefit of being first on the market, and be quick to dismiss anything coming from any competitor as inferior and late to the game.

    1. Alex, the problem is that once you “segregate” marketing from technical, there is not much left in Hana to admire:
      • Hana is just a database; there is nothing real-time about it. It requires separate software – SAP Landscape Transformation services (database trigger based, running on NetWeaver infrastructure) – for real-time data loads
      • “simplification of data model” – it is possible that I missed some point related to that, but last time I checked, SAP had 80,000 (as in eighty thousand) tables across its various modules. Do you really think they will simplify their backend to allow real-time analysis on that data? HANA is in-memory and available today, yet SAP does not run on it, so just being in-memory is not enough. In fact, I guarantee you that if and when SAP ERP does run on HANA, customers will have no more real-time access to their data than they do today. SAP uses some very big “clustered” tables, such as BSEG, that cannot be queried directly using SQL, and no “in-memory” gimmicks can easily fix that.
      • “no aggregations necessary” – well, Hana does have an aggregation engine (please see Part I on Hana on my blog), but xVelocity does away with aggregations as well, with no need for an appliance
      • “Reduction of data volume” – again, Hana does not do anything there that cannot be done with xVelocity for a fraction of the cost
      And lastly, “follow the trend”. This is where I am not sure why you think that SAP is the trend setter. SAP is well known for its business acumen, but I would never think of SAP as an innovator. All of their innovation comes from acquisitions, so that SAP’s best-in-class sales force can sell the hell out of it, then move on to the next thing to sell and let the first thing quietly die down (i.e., BW, BObj, BPC, SRC, etc.)
      I agree with what I think is your overall point – there are some major changes taking place in the database world that take advantage of having lots of CPU cores and RAM. There are lots of vendors playing in this space today, including Oracle, SAP, and QlikView. Some features are implemented better by some than by others, but overall the in-memory game is becoming commoditized, just like the hardware business. Which is exactly the point I am trying to make: in-memory is a commodity technology, therefore there is no reason to pay a premium price for sub-par products. xVelocity has everything that HANA has (and then some), leveraging technology companies already own, existing hardware, and the skill set they already have in-house.

  3. Before calling anyone out on inaccuracies, you should get your own facts straight. There are several unfortunate mistakes and misconceptions about SQL Server in your post. (I don’t know SAP and HANA, so I can’t comment on your statements about those technologies).

    “Hekaton that will be available in 2014-2015”
    No reference given, indeed. But Hekaton was announced as being part of the next major release of SQL Server during the PASS Summit earlier this month. And some years ago (I think it was somewhere between the releases of SQL Server 2005 and SQL Server 2008), Microsoft announced its intention to release a new major version every 2-3 years. So adding 2-3 years to the last major release (2012) does indeed work out to 2014-2015.

    “Microsoft demoed updateable column store clustered index (Hekaton)”
    Hekaton and columnstore index are two completely different things. Columnstore indexes are already available in SQL Server 2012, but with some limitations – the most important one being that the index is read-only, so they are far from trivial to use in an OLTP database and even require some care in a DW database. Though columnstore indexes are often called “in-memory”, they are not truly in-memory; they are cached in memory, stored on disk, and will continue to work (and benefit performance) when their size exceeds available memory. In November, Microsoft announced (and demonstrated) support for updateable columnstore clustered indexes in “the next major version of SQL Server” (no date announced) and in the next version of the Parallel Data Warehouse Edition (available first half of 2013).
    Hekaton is the code name for a completely different technology that Microsoft announced this November. This is a truly in-memory technology, and it is targeted at increasing the throughput of high-volume OLTP databases with many thousands of transactions per second. Like updateable clustered columnstore indexes, this feature will be available in the next major release of SQL Server. A CTP of Hekaton has been announced to be available “soon”, and some customers have been using early development versions of Hekaton for some time already.

    “The SQL Server products require a batch process to build the index”
    Though I think it’s a bit far-fetched to describe a single CREATE INDEX statement as a batch process, building the columnstore index IS indeed a separate operation that has to be executed before the columnstore index is usable, in the current implementation. Updateable clustered columnstore indexes will change this, but they are not yet available (except, maybe, to selected customers in the Microsoft TAP program).
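
    For illustration, this is roughly what that single statement looks like in SQL Server 2012 (the fact table and its columns are hypothetical):

      -- Build a nonclustered columnstore index on a (hypothetical) fact table.
      -- One DDL statement; in SQL Server 2012 the table then becomes read-only
      -- until the index is disabled or dropped.
      CREATE NONCLUSTERED COLUMNSTORE INDEX ncci_FactSales
      ON dbo.FactSales (DateKey, ProductKey, StoreKey, SalesAmount, Quantity);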

    “OLTP product does only OLTP so a redundant copy of the data is required”
    This statement is a bit unclear, but I think it is a reference to Project Hekaton. Based on what I have seen at the PASS Summit in November, where Hekaton was announced, I think that columnstore indexes and Hekaton are, at least for now, built as separate products addressing separate problems. They might merge one day in the future, but I expect them to be separate in the first version(s).

    “There are odd SQL limitations that require hand-tuning to get it all to work”
    One thing you CAN say about the post from SAP is that it is (deliberately?) confusing about which point addresses which issue, making it hard to know when they are talking about the current implementation of columnstore indexes, the announced updateable clustered columnstore indexes, or Hekaton. After following some references, I think that this statement refers to a limitation of the current implementation of columnstore indexes. This implementation boosts the speed of SQL Server processing versus processing with traditional (row store) indexes by 50-100x (based on my own, not scientifically relevant, benchmarks) by using a combination of two techniques: saving I/O (as a result of using columnar storage), and speeding up processing (as a result of a new “batch mode” processing model built into the SQL Server engine that eliminates a lot of overhead by processing rows in batches). The current implementation of batch mode is indeed quite limited; many queries can’t use batch mode without rewriting them. The statement from SAP that these limitations “require hand-tuning to get it all to work” is not true – without hand-tuning, it will still work and you’ll still get a 5-10x performance increase (again, based on my own non-scientific benchmarks) from the columnar storage. But the current implementation does require hand-tuning of some queries to get the full performance benefit of the combination of columnar storage and batch mode processing (one example of such a rewrite is sketched below).
    For the record, Microsoft has also announced (not in the keynote, but in one of the breakout sessions of the PASS Summit) that the next major release of SQL Server will lift “almost all” of those limitations. I did try to get a more exact specification of the “almost” part in that announcement, but they refused to comment – my theory is that engineers are still working on the last edge cases. But all the problematic queries I knew about worked fine in batch mode, without any tuning, during the demos I saw.
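
    As one illustration of the kind of hand-tuning involved, here is a sketch based on the commonly documented SQL Server 2012 workaround for scalar aggregates, using a hypothetical fact table (not a definitive recipe):

      -- A plain scalar aggregate typically does not run in batch mode in SQL Server 2012.
      SELECT SUM(SalesAmount) AS TotalSales
      FROM dbo.FactSales;

      -- Rewritten with an inner GROUP BY so the expensive scan and partial
      -- aggregation can run in batch mode; the cheap outer SUM over a handful
      -- of rows runs in row mode.
      SELECT SUM(s.PartialSum) AS TotalSales
      FROM (SELECT SUM(SalesAmount) AS PartialSum
            FROM dbo.FactSales
            GROUP BY DateKey) AS s;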


    Hugo Kornels, SQL Server MVP

    1. Hugo, there is simply not enough public/official information to really say how xVelocity and Hekaton are related to each other, and one can only speculate about the date of the next release of SQL Server (I will take a stab at speculating about it later today), so I will take that offending paragraph out, as it really is not that important for the point I was trying to make anyway.

      The main point of my post, however, is still very valid – Hana is being marketed and differentiated on technical specs as a superior and more mature product, which is simply not true.

      1. You are totally right about the release date. And I know nothing about HANA, so I won’t comment on that.
        But I do know quite a lot about Columnstore (I refuse to call it xVelocity, only because the MS marketing peeps thought they had to change a name that everyone had just gotten used to), and I have learned a few things about Hekaton at the PASS Summit, especially during a Friday afternoon session that I attended. PASS has recorded the session, and it is available online (I don’t know if access to the recording is freely available or restricted to registered attendees, though). To find it, go to the Summit2012 site, log in with the account you used to register, then go to the session list, filter on speaker “Dandy Weyn” and pick the “xVelocity/in-memory” session. The first 25 minutes are about Columnstore, the next 3 minutes are about StreamInsight, and the rest of the session is about Hekaton.

  4. There’s nothing innovative in HANA? It’s just a database? This blog is so erroneous it’s shocking, so allow me to enlighten you with a brief summary.

    HANA provides FOUR core services, completely integrated and ALL in memory: a DB service, an Analytic service, a Predictive service and finally a sentiment analysis/unstructured text engine. It also includes an ETL tool, so the response you just thought of around transformation services is irrelevant.

    This means that it’s more than a database. SQL Server doesn’t have native predictive or sentiment capabilities. And no matter what silliness is in this blog, SQL Server does not offer ANY of this in memory. There are two reasons this matters:

    1) You don’t have to constantly copy and move the data from one location to another in order to perform storage, analytic, predictive and sentiment functions on data.
    2) All your work in HANA happens ridiculously faster than on a traditional disk-based (even solid state) RDBMS. If you disagree with this, then you clearly don’t understand the difference between read/write speeds on disk or solid state disk vs. DRAM.

    Shouldn’t HANA at LEAST get credit as a broader platform than SQL Server, and also credit for the fact that it’s all in memory TODAY??? How can you come to the conclusion that it’s not innovative when you don’t understand it? I tell you what: whenever Hekaton (I am trying to resist the puns here) launches, let’s do a comparison then, so it’s fair. Talk to you in 2015 🙂

    1. Your comment, as enlightening as it was, did NOT seem to highlight a single “erroneous” point in my post. I have repeatedly stated that the technical sophistication of Hana is not something I was focused on in my analysis.
      Hana is cobbled together from some mature, proven and robust components, like Adabas – an old-school mainframe database – TREX (http://en.wikipedia.org/wiki/SAP_HANA), etc. I do not believe Hana includes an ETL tool; from what I saw, BOBJ Data Services and/or SAP Landscape Transformation Services (trigger based) are the means to get the data in, or I guess any ETL tool, as long as there is an ODBC/JDBC/etc. driver available.
      I really don’t understand your fascination with “in-memory”. “In-memory” is just a marketing gimmick that all software vendors are exploiting to sell more of their products, much like “web services” or “XML” before.
      I have tried to improve my golf game by buying a more expensive set of clubs, but it did not work. One cannot improve one’s cooking skills by buying a more expensive refrigerator. A company will not get better at implementing Business Intelligence by buying HANA (or any other tool/technology, for that matter). It takes skill and experience first and foremost, and I am not even sure that technology would come second.
      SAP ERP has 80,000 tables, you have to be absolutely insane to think that you can just slap HANA on top of that and not “have to constantly copy and move the data from one location to another in order to perform storage, analytic, predictive and sentiment functions on data”.
      “Shouldn’t HANA at LEAST get credit as a broader platform than SQL Server, and also credit for the fact that it’s all in memory TODAY???” – Absolutely not and here is why:
      – SQL Server has a row-store database engine just like HANA (and I am guessing it works better than Hana’s, since SAP ERP runs on SQL Server today and not on Hana)
      – An analytical engine (that supports MDX, like Hana, but also DAX)
      – A prediction and data mining engine (a lot more robust than Hana’s)
      – A data visualization engine (Reporting Services and PowerView), while Hana requires a separate purchase of Business Objects
      – Master Data Management – no such thing for Hana
      – Data Quality Services – no such thing in Hana; you need DS from BObj for extra $$$
      – An ETL tool (SSIS), while Hana requires additional purchases of BOBJ DS or SLT
      So, it would be silly to say that Hana is a broader platform than SQL Server.
      Apple charges $200 more for its 64 GB iPhone than for its 16 GB iPhone, although the difference in cost between those products is negligible. Why do they do it? Well, because they can. People will rationalize it to themselves to justify the purchase, so companies that are paying millions and millions of $$$ for Hana will obviously rationalize it as well, but that does not make their decision practical or logical. SAP has the best sales force in the business and they are doing a great job selling Hana; their customers are doing a great job convincing themselves that their purchase was well justified.

      1. In April of 2012 SAP was so excited about its new database HANA that it was throwing $492 million into programs to get people to use it. Hana has been available for almost two years and there are still no prices displayed publicly, because they do not want to embarrass the early adopters with lower pricing. Only one hundred Hana systems live after two years and half a billion dollars in subsidies is not good.

  5. Hi,

    HANA vs. Microsoft SQL 2012:
    1. Too early to compare. HANA is getting a lot of attention due to SAP’s marketing activities in the ERP space.
    2. Microsoft remains Microsoft, as always. Microsoft has a huge MsSQL deployment base, with developers and partners around the globe. HANA has yet to reach 1% of that global reach in terms of partners and developers.
    3. Microsoft’s strategy will be to avoid conflict with SAP, as customers are widely using MsSQL on the SAP platform. Microsoft has everything to gain from this.
    4. MsSQL 2012 will be more of a game changer in the market than HANA, due to various factors including price and promotion.
    5. Aggressive pricing from Microsoft can stop the progress of HANA before it becomes a devil for Microsoft and others.

    Interesting space. My bet is on Microsoft SQL for multiple reasons.

  6. In his 2006 talk “Flash is Good”, Microsoft’s Jim Gray (father of Transaction Processing Theory and a Turing Award Winner) predicted that “RAM locality is King” and “Main Memory DB will be commonplace”.

    A decade or two earlier, Gene Amdahl told us that “The best IO is the one you don’t have to do” (because the data is kept in memory). The idea of keeping data in DRAM close to the processor is nothing new.

    The idea that SAP have “invented” anything more than a new level of Hype-Cycle marketing malarkey is just silly. Microsoft have been taking advantage of so-called “In-Memory Database” technology forever, as each generation of Microsoft products makes optimum use of improvements in Intel’s memory subsystem technologies. Even back in 2005, giving SQL Server enough memory caused it to virtually stop doing I/O for most DSS workloads. MS Exchange is another example – Exchange 2003 was terribly spindle hungry and required vast farms of 15K rpm disks, while Exchange 2010 is perfectly happy with slow SATA disks and does virtually zero random disk IO, because it has moved its data structures “in-memory”.

    Good software stops doing random IO when there is enough DRAM available, because good programmers listened to Gene Amdahl 20 years ago. SAP HANA is good software, but no better than “generic” SQL Server – except for the hype.

    1. Just wondering where you store the transactional data in HANA? You still need to store it somewhere and then stream it across; surely you can blame the bottleneck on the other side, because they’re not in “memory”.
