Power BI Data Management Gateway and OData

I just got back from the SharePoint conference and wanted to write down a few points about the Power BI Data Management Gateway before I forgot.

It appears that there may be a progression in how Power Query may get used by an organization. I cannot say I am 100% clear on all the scenarios and what the best practice might be, but I am going to walk through several steps of how Power Query may be adopted / deployed in an enterprise and then parallel it to the functionality provided by the Data Management Gateway OData feeds

Step 1 – Standalone installation

This is the simplest scenario and most likely it will be the most common initially. In this case, a Power User will install Power Query from this link http://aka.ms/powerquery and will use all the great features of the tool to source, transform and load data into his/her Power Pivot workbook. Let’s say, for example, that I have a Store dimension in my SQL Server data warehouse. I can use my windows credentials to get authenticated and authorized by the SQL Server before I can pull my Store information into my Power Pivot model. Let’s say I add a number of transformations to enrich my Store dimension and to massage the data just the way I like it.

Here is a quick summary of what I would have thus far

Power Query script location Workbook
Security credentials Local, Secure Store
Data refresh Peer-to-peer, Excel to SQL Server DW

Step 2 – Power Query in the Data Catalog

Now that I am happy with what I have done with my Stores, I would like for everybody else in my company to be able to use what I have built, so I right click on my power query in the Workbook Queries pane and click Share

After we have filled in the required fields, our query is saved in the Enterprise Data Catalog. It is important to recognize that although the data catalog resides in the online Power BI portal, no data actually travel in and out of the cloud. In other words

Power Query script location Data Catalog (in the cloud)
Security credentials Local, Secure Store
Data refresh Peer-to-peer, Excel to SQL Server DW

To clarify, when I created the Power Query, I had to connect to SQL Server with my security credentials, when you try to consume my shared query, you will have to use yours and when you load data into your model, the data will flow directly from SQL Server into your Excel workbook

Step 3 – Expose Data Using Data Management Gateway

This is where it gets a little bit tricky. Microsoft vaguely describes the Gateway as a “client agent that provides access to on-premises data sources in your organization“. From what I can tell, and I think this will take a bit more time before we have a perfectly clear understanding of that technology, the Gateway servers two following functions:

  1. It provides VPN-like connectivity between the Power BI tenant in the Microsoft Cloud and the local or corporate data center environment, please see some additional information on data refresh here
  2. It allows to create Data Sources and expose them as OData sources, searchable using the Power Query search functionality

I have already discussed #1 before, therefore, I just wanted to zero in on #2.

Essentially, if I have a corporate data asset and I am trying to decide on the best way to expose it to my users I have two options:

  1. Use Gateway to wrap OData around it
  2. Create a standalone Power Query and to publish it into the Data Catalog

Initially, I was very excited about the option #1 but now, I think in most cases I would be leaning towards using the option #2 and here is why

Power Query script location Gateway (running on a host server on the corporate network)
Security credentials Hardcoded during the Data Source creation, shared by all consumers
Data refresh SQL Server->Gateway Host->Excel
Folding Limited support in Gateway/OData, Rich support with Data Catalog Shared Power Queries

Please note some major differences between using Data Catalog vs. the Gateway:

  • Data Catalog allows each user to use his or her security credentials to access corporate data vs a hardcoded account stored within Gateway
  • Data Catalog allows for peer-to-peer connectivity where Gateway becomes a single choke point for all Excel to SQL Server communications
  • Folding or pushing transformation all the way into the original data source to minimize loading unnecessary data or running unnecessary data transformation in power query engine in excel is somewhat supported using Gateway and OData but given the architecture limitations it is not supported quite as well as it is in the peer to peer power query architecture

This topic requires additional analysis, but at least initially, it appears to me that I would be more inclined to use the Gateway as the tool to provide the connectivity between my data center and Power BI tenant so that I can set up Data Refresh. I am, however, struggling with a good scenario where the Gateway and OData feeds would be more preferable to the functionality of the Data Catalog and Shared Power Queries.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s