Few thoughts on the SQL Server 2014 launch event in San Francisco

I just finished watching the hosted customer even in San Francisco where the Microsoft CEO Satya Nadella, COO Kevin Turner, CVP of Data Platform Group Quentin Clark and the SQL Server Product Marketing GM Eron Kelly officially introduced the new release of the SQL Server 2014.

In this post I am not going to give any assessment of the product itself (but obviously it’s a big thumbs up from me) but rather reflect on the presentation/event itself in retrospect and see what actually stood out for me, what I remembered and what I was not particularly impressed with. So, again, this is by no means a review of the product or the event, but more of the test to see what I actually remembered having spent almost two hours watching it.

The event was kicked off by Satya Nadella and to be honest I don’t think I remember a single thing from his speech. I personally do not have an informed opinion on him yet, but based on my conversations with folks who actually know him I really want to like the guy and hope he is successful in this new job. Charismatic leader he is not, which is not necessarily a bad thing, but it definitely does not help with me remembering what he had to say.

Eron popped in briefly to do some demos early in the recording, but I will do a general overview of all demos at the end.

Kevin “The Walmart” Turner took the podium after. He definitely bumped up the charisma factor a little bit, but his unfortunate habit of reading the script off the teleprompter had somewhat of a dampening effect on me. He constantly emphasized the price to performance aspect of the MS BI platform which was a little bit too much for my taste. Putting my marketing guru hat on (I probably should have put marketing guru in quotes, huh), I would probably frame SQL Server as the best and most performant and user friendly BI and Database platform in the world, but I am not the COO of Microsoft and he is so what the hell do I know… Kevin closed his speech with thanking the customers and partners and leaving his personal email address on the screen, which I found odd, I am picking up some power dynamics with Satya there, although I have been accused of reading things into things too much… Still, I thought it was odd; I think that some of the things he said were more of the prerogative of a CEO.

Quentin Clark regurgitated some of the messaging from the previous speakers, liberally garnishing his speech with words like Polybase, Hadoop, APS and ambient intelligence. To be frank, I could not understand half the points he was trying to make, but to be completely transparent, the White Sox had two runners on bases at a time so I may have found my focus drift away a little bit for a while.

Now I can talk a little bit about demos. I happen to be a very opinionated individual on that subject given the fact that building and running demos is a big part of my daily life. I happen to have a philosophical difference with how the demos where done for this sessions. I do have a very high opinion of both Quentin and Eron, however, I definitely think that they need to forget about sticking to the script. Also, it felt to me that the story that ran through the BI demo was completely fabricated and being ridiculously subjective, I felt like the demos were very mechanical and not engaging. I personally was disappointed that we did not see a single new BI feature. Eron did a good job of demoing all of the Power BI components, but at least for me, I was longing for something new and exciting but my personal and subjective needs were not met.

The in memory OLTP features of SQL Server 2014 got a lukewarm reception from the audience, which I found astonishing. The lady from the product team that did the demo tried to do what she could to elicit some reaction from the group but shockingly (no sarcasm intended) the audience either missed the significance of these features or was somewhat fatigued with having seen similar demos a few times. In any case, I was definitely impressed with some of the market share numbers and also the customer stories from NASDAQ and the absolutely ridiculous data volumes that they process using SQL Server 2014.

Again, this post is more of an unorganized dump of what I thought were the most salient points from the presentation. Overall, I enjoyed watching the session (in part thanks to the MLB app on my phone) but I cannot honestly find any compelling reasons to recommend that anyone spends two hours watching it, instead, I would read a pretty good summary of the basic massaging by Quentin Clark on the SQL Server Blog here….

What is the Microsoft Power BI connector for SAP?

I have been getting some questions about it recently so I decided to write a few lines in addition to what’s now available on the Power BI site. Here is a list of some key points about the tool:

  • No middle-ware components are necessary to provide the connectivity between Excel and the Business Objects server. In order to connect, a user will only need to provide the URL to Business Objects and be able to be authenticated by it
  • Only the latest version of Business Objects is supported (4.1 SP2)
  • Only UNX type of universes are supported (must be created with the Information Design tool), the older version of universe (UNV) that typically was created using a Universe Designer, is not currently supported
  • Any universes on Business Objects 4.0 or 3.x are not supported
  • Only relational universes are supported, multidimensional universes are not supported. However, a relational universe that exposes Hana or BW data is supported
  • Universes cannot contain prompts or parameters
  • Power Pivot refresh from the cloud is not yet supported, but supposedly this feature is coming at some point (my assumption is that the generic Power Pivot refresh from Power Query will need to be supported by Power BI first before the SAP Connector can provide Data refresh in the cloud)

The March 2014 Release of Power Query is now available to download

Microsoft has released the latest update for my favorite Power BI tool. There are several cool albeit minor improvements to the functionality. The post at the link does a great job of describing the new changes, so I am not going to rehash them here. What I do find interesting, however, is that Microsoft is sort of going on the record of implicitly committing to monthly products updates schedule with the chosen naming convention. I have been fairly vocal about being somewhat skeptical with respect to the “Cloud first, cloud only…” company strategy, but if we take the Power Query team as an example, it is definitely very encouraging to see the promised frequent releases actually materialize.

Now, if only the Power View/HTML5 team could keep up with this cadence….

How to delete a Power Query from the Data Catalog

Data Catalog is one of the most interesting features of Power BI, however, there are several strange idiosyncrasies that I would like to surface in this post. The first one is the fact there is no way to access the data catalog from the Power BI admin site (if there is one, it must be using some ninja stealth hiding techniques because I still have not been able to find it).

The only way that I was able to get to it is by going to the list of Shared Queries in the Search box of the Power Query ribbon, previewing one and then clicking on the “View in Portal” option

This option will take you to a Power BI Data catalog portal that will list things like Usage Analytics, My Queries and Data Sources

Please see a sample list of Shared Power Queries above. The interesting idiosyncrasy is that there is no way to delete a query from this list. The only option that we have is to see the Analytics information about the selected query

In the course of me playing with the tool, I created a number of test queries; after a while I decide to delete some of them and that presented a problem because I could not find an option to do that.

So if you find yourself in a similar predicament, here is how you do it. It is very simple albeit not very intuitive. To my knowledge, currently only the person who created a query can delete it. Assuming that this is your query you are trying to get rid of, this is what you need to do:

  1. Go to Power Query section of your Excel ribbon
  2. Make sure that you are Signed In
  3. Click on Shared in Manage Queries

  4. Find the query you would like to delete in the Shared Queries pane on the right
  5. Right click on the query and select Delete

You are done.

On a similar topic, if you are trying to get rid of your OData sources, the only way to delete those is to go to Gateway configuration and delete the Data Source in the Gateway definition…

Power BI Data Management Gateway and OData

I just got back from the SharePoint conference and wanted to write down a few points about the Power BI Data Management Gateway before I forgot.

It appears that there may be a progression in how Power Query may get used by an organization. I cannot say I am 100% clear on all the scenarios and what the best practice might be, but I am going to walk through several steps of how Power Query may be adopted / deployed in an enterprise and then parallel it to the functionality provided by the Data Management Gateway OData feeds

Step 1 – Standalone installation

This is the simplest scenario and most likely it will be the most common initially. In this case, a Power User will install Power Query from this link http://aka.ms/powerquery and will use all the great features of the tool to source, transform and load data into his/her Power Pivot workbook. Let’s say, for example, that I have a Store dimension in my SQL Server data warehouse. I can use my windows credentials to get authenticated and authorized by the SQL Server before I can pull my Store information into my Power Pivot model. Let’s say I add a number of transformations to enrich my Store dimension and to massage the data just the way I like it.

Here is a quick summary of what I would have thus far

Power Query script location Workbook
Security credentials Local, Secure Store
Data refresh Peer-to-peer, Excel to SQL Server DW

Step 2 – Power Query in the Data Catalog

Now that I am happy with what I have done with my Stores, I would like for everybody else in my company to be able to use what I have built, so I right click on my power query in the Workbook Queries pane and click Share

After we have filled in the required fields, our query is saved in the Enterprise Data Catalog. It is important to recognize that although the data catalog resides in the online Power BI portal, no data actually travel in and out of the cloud. In other words

Power Query script location Data Catalog (in the cloud)
Security credentials Local, Secure Store
Data refresh Peer-to-peer, Excel to SQL Server DW

To clarify, when I created the Power Query, I had to connect to SQL Server with my security credentials, when you try to consume my shared query, you will have to use yours and when you load data into your model, the data will flow directly from SQL Server into your Excel workbook

Step 3 – Expose Data Using Data Management Gateway

This is where it gets a little bit tricky. Microsoft vaguely describes the Gateway as a “client agent that provides access to on-premises data sources in your organization“. From what I can tell, and I think this will take a bit more time before we have a perfectly clear understanding of that technology, the Gateway servers two following functions:

  1. It provides VPN-like connectivity between the Power BI tenant in the Microsoft Cloud and the local or corporate data center environment, please see some additional information on data refresh here
  2. It allows to create Data Sources and expose them as OData sources, searchable using the Power Query search functionality

I have already discussed #1 before, therefore, I just wanted to zero in on #2.

Essentially, if I have a corporate data asset and I am trying to decide on the best way to expose it to my users I have two options:

  1. Use Gateway to wrap OData around it
  2. Create a standalone Power Query and to publish it into the Data Catalog

Initially, I was very excited about the option #1 but now, I think in most cases I would be leaning towards using the option #2 and here is why

Power Query script location Gateway (running on a host server on the corporate network)
Security credentials Hardcoded during the Data Source creation, shared by all consumers
Data refresh SQL Server->Gateway Host->Excel
Folding Limited support in Gateway/OData, Rich support with Data Catalog Shared Power Queries

Please note some major differences between using Data Catalog vs. the Gateway:

  • Data Catalog allows each user to use his or her security credentials to access corporate data vs a hardcoded account stored within Gateway
  • Data Catalog allows for peer-to-peer connectivity where Gateway becomes a single choke point for all Excel to SQL Server communications
  • Folding or pushing transformation all the way into the original data source to minimize loading unnecessary data or running unnecessary data transformation in power query engine in excel is somewhat supported using Gateway and OData but given the architecture limitations it is not supported quite as well as it is in the peer to peer power query architecture

This topic requires additional analysis, but at least initially, it appears to me that I would be more inclined to use the Gateway as the tool to provide the connectivity between my data center and Power BI tenant so that I can set up Data Refresh. I am, however, struggling with a good scenario where the Gateway and OData feeds would be more preferable to the functionality of the Data Catalog and Shared Power Queries.

HTML5 and Anonymous access in Power BI… or check out the new Demo Gallery

I don’t expect that anyone will be blown away, at least initially, with the new demo gallery now available on the Microsoft Power BI site because to a business user, the content of those demos will look a little difficult to relate to if not a little nonsensical. Zany layouts aside, there is, however, a very significant piece of news to celebrate. Jen Underwood has already mentioned this in her excellent post but now we have a somewhat official, albeit not very conspicuous, evidence that two of the most requested features of the Microsoft BI stack that we have been anxiously awaiting for a very long time are finally becoming available:

  • HTML5 support for Power View
  • Support for Anonymous authentication and ability to expose Power View on a public anonymous site

Try to right click on the demos in the gallery and you will see that the familiar Silverlight menu option is no longer available, instead you will see a normal set of options that show up when you right click on any typical HTML element of the page. If you examine the source code a little bit close you will see <iframe
frameborder=”0″ allowtransparency=”true” scrolling=”no” id=”” style=”width:1024px; height:650px;” src=”https://powerbishowcase.sharepoint.com/_layouts/15/guestaccess.aspx?guestaccesstoken=sEKhHj4pZbI%2bedyTIbQ%2fywmdj7j9pVw0VsXZGwXN7ew%3d&docid=05e4c65f3966e46a9b72d2d3681955ac0&action=embedview&wdBIPreview=1″></iframe>
that the HTML5 code is generated inside an IFRAME. Looks like there
are several parameters, that will hopefully be documented soon, that are involved:

  1. Powerbishowcase.sharepoint.com – is the main URL
  2. Guestaccesstoken
  3. DocID
  4. Action (Embedview is already well documented)
  5. wdBIPreview

In fact, if you put all of the pieces together, you will be able construct this link https://powerbishowcase.sharepoint.com/_layouts/15/guestaccess.aspx?guestaccesstoken=sEKhHj4pZbI%2bedyTIbQ%2fywmdj7j9pVw0VsXZGwXN7ew%3d&docid=05e4c65f3966e46a9b72d2d3681955ac0&action=embedview&wdBIPreview=1

It is good to see the Power BI site to get regular updates, particularly of such great significance. Definitely looking forward to the hopefully soon upcoming documentation for these awesome new features.

What is Power BI Gateway?

After I wrote the step by step guide on integrating on premise data with Power BI models I realized that I actually dispensed some bad advice. The basic premise of the post was to demonstrate how one may use Power Query to source data into a Power Pivot model in a very elegant way (at least elegant to me). The only tiny problem with the approach described in that post was the fact that it did not work. Currently, Power Pivot cannot be refreshed in Power BI if its source data is loaded using Power Query.

It is my understanding that this support should be available in future (because it makes sense, certainly at least to me it seems that it makes more sense than the lack of thereof given the fact that we are dealing with Power BI here and it just seems that all “power” things are meant to be connected somehow), but as of right now only SQL Server and Oracle data sources are supported using OLE DB drivers.

I am afraid I am still a little bit hazy on what exactly Power BI Data Management Gateway is supposed to be doing but I will try to summarize what I found out empirically and how I got the data refresh to work.

The problem we are trying to solve is the following – my SQL server database is located in my corporate data center on its own IP network while my Power Pivot Excel model is hosted somewhere in the cloud presumably in a Microsoft managed data center somewhere. The two IP networks (my corporate and Microsoft data centers) are not talking to each other. Normally, some sort of a VPN solution would have to be implemented to provide that visibility between the two environments. And it appears that that’s all there is to the Data Management Gateway.

Let’s walk through a few screen shots to see how this works. First, when we set up our data source, we need to specify either a host name or an IP address of our SQL Server

Then we use the same IP address or a host name when we set up Power Pivot connection (which means that you have to be inside of your corporate network to do it)

Note that I am using the same host name both in Data source connection set up and Power Pivot connection information. Obviously, anyone inside Microsoft data center would not be able to ping my host server – Karabas, but that’s where the Gateway comes to rescue. As long as the host name in the Power Pivot connection is the same as the one defined in the Power BI Admin Center, the data refresh will work.

One word of caution, the only way to refresh your Power Pivot workbook stored in Power BI portal is by using “Schedule Data Refresh” feature

If you try to refresh your workbook in a browser and then use Refresh All Connection options there

You will get an error telling you that this functionality is not supported – “Error: OnPremise error: We were not able to refresh the data connections. On-premise data sources can only be refreshed via scheduled refresh in Power BI for Office 365.”

How to access Data Catalog in Power BI

I may be using the term Data Catalog somewhat liberally here, but since we don’t have the official definition of this term yet, I hope I don’t get myself in too much trouble here.

As I have been experimenting with various features of Power BI, I have been creating and sharing queries in the Data Catalog; however, at some point I started wondering what sort of governance may be necessary to implement so that an organization does not end up with hundreds or thousands of random of queries. At a minimum I thought it would be nice to have a list of queries already defined. However, having spent quite a bit of time on the Power BI Admin center and having clicked and explored every option available there I could not find a way to see a simple list of the shared Power Queries. After randomly stumbling around I found the Manage Data Portal.

I eventually found several way to get to it. The easiest one is to click on Power Query and then Shared

Now you should see your shared queries in the search results list

If you “peek” at the data by mouse over the query or right clicking and picking the “peek” option a screen like the one below will appear

If you select View in Portal option, you will be taken to the Manage Data Portal.

I thought I was somewhat of a clunky to get to this page but I still cannot find a faster way to it from the Admin Center. I decided to hack around the help section and see if I can get some answers that way. Pasting “usage analytics” in to the search box brought me to this help page that does a great job of explaining this feature and also contains more links with additional information.

I am still trying to find a way to delete the unused queries from the data catalog but that will be a topic that will be covered in a different post.

HTML5 support In Power View and Power BI

Now that Power BI is Generally Available (GA) I thought it would be useful to do a quick analysis of how Power View visualizations are supported in Power BI across the three media – Silverlight (default), HTML5 and Windows 8 App Please find a table below for a quick reference

Feature

Silverlight

HTML5

Windows 8 App

Background Images

Yes

No

No

Images in Tiles

Yes (but not from external sites)

No

No

Background colors

Yes

No

No

Filters – charts

Yes

Yes

Yes

Filters – dashboard

Yes

Yes

Yes

Maximize Charts

Yes

No

No

Sort charts

Yes

Yes

Yes

Line Charts

Yes

Yes

Yes

Bar Charts

Yes

Yes

Yes

Scatter Plots

Yes

Yes

Yes

Cards

Yes

No

No

Table/Matrix

Yes

Yes

Yes

Matrix with KPI

Yes

Yes

Yes

Drill down on charts

Yes

Yes

Yes

Drill down on matrix

Yes

Yes

No

Scatter Plot (static)

Yes

Yes

Yes

Scatter Plot (with play axis)

Yes

No

No

Tiles

Yes

Yes

No

Maps

Yes

No

No

Drill-through

No

No

No

       

It is easy to see that there is certain parity between features available in HTML5 and Windows 8 app. Also, I just wanted to make a few general observations about the performance. Generally speaking, the fastest chart rendering and drill down experience is available in the Windows 8 app. In most cases, HTML5 version seems to be doing a little better with refreshing charts however, I have found several instances where the Silverlight version of the dashboard ran order of magnitude faster than the HTML5 option. It seems like every week, more and more Silverlight-only features become available on the HTM5 version which seems to lend some support towards the “cloud first” software release strategy from Microsoft.

How to configure on premise access in Power BI and use Power Query to load on premise data into Power Pivot

In this post I will walk you through all the steps necessary to expose your on premise enterprise data to your Power BI users of Power Query in Excel and also will demonstrate how to source that data into a Power Pivot workbook.

I am not going to give you too much theory, rather, I will just walk you through all the steps with explanations where necessary (one assumption is that whoever will be walking through these steps will have admin level permissions to the Power BI site)

1 – Set up Gateway

Gateway has two components, first it has to be configured in your Power BI portal, second, you have to download and run a program that will provide the synchronization between your on premise and cloud environments

So let’s go ahead and configure it, go to portal.onmicrosoft.com and login with your Power BI credentials, after you logged in, click on Admin-> Power BI

This should take you to the Power BI Admin center; click on Gateways next.

In my case I already had created a BIGateway; you will need to click on “+ new gateway” bottom and go through the wizard to have it installed and registered

2 – Set up data source

After the gateway is created, we need to set up a data source

First, in the data source usage, make sure that OData feed is enabled

Then click next to provide connection info

Click on Credentials to provide login/password information (this will download a run “Configure authentication” app)

Note that in my case Credentials Type says “Loading…”, my guess it’s because the Gateway app is running on a different machine, unfortunately I am not writing this post on the machine that is running the gateway app, on that machine, however, this screen give me an option to either provide windows or SQL credentials type and then specify login and password

After that you can click on data settings and select tables and views that you would like to expose through OData to your end users

IMPORTANT: something I learned the hard way, your tables or views have to have at least on not null-able column or they will be grayed out on the list of views or tables and you will not be able to select them.


Follow the screen directions to finisht this process up.

Now that your data sources have been defined, you can use Power Query to find and consume them.

3 – Power Query

In order to find your tables, go to Power Query ribbon and click Online Search (note, it takes a few seconds for all the sync activities to run their course so give it a minute)

Now make sure that you have Organization selected in your Search drop down

Now you can type in your search criteria and click search to get my available data sets back

And this is where the fun begins.

4 – Source to Power Pivot

It took me a while to figure out the best way to source that data in Excel (and when I say the best, I only mean it in my humble opinion). First, I went a standard Power Query route and right-clicked on one of the data sets and then clicked Edit. That opened up a Power Query window that subsequently allowed me to either load data to either Worksheet or Data model

If you pick Load to worksheet, Power Query will load the selected data into an Excel table, selecting the Load to Data Model option will load the data into the workbook’s Power Pivot.

I, however, recommend neither of these options. User Power Query to apply whatever transformations on the selected data set but make sure that both of these options are NOT checked before you click on “Apply and Close” button. Doing that will not load your data in Excel just yet, but rather will save that connection information in your Data->Existing connections->Connections in your Workbook

Now you can fire up the Power Pivot window, click on Existing connections, scroll to the bottom of the list and use our Power Query connections to load data in Power Pivot

Now we just need to create our relationships and add calculations

Enjoy!

PS.  apparently, the Power Query connections cannot be refreshed using Schedule Data Refresh option of Power BI, I provided the explanation and a workaround here