Clustering the Power BI Data Gateway for High Availability (and Low visibility?)
The functionality to cluster the Power BI Data Gateway for high availability was introduced with the November update of the Data Gateway. This has been in the works for a while though I must admit the workaround I describe back in June 2017 on SQL Shack (in the appendix) is straight forward enough and I’ll probably be keeping that in place for now until the problems I describe here get ironed out.
The documentation provided on Microsoft Docs is great and can get you setup with no issues; it’s the lack of some basic information and a few configuration drawbacks that is keeping me from implementing this new feature at the moment.
Getting set up
So, let’s assume you are operating with the workaround for disaster recovery at the moment and you have your Data Gateway app on your main production server (GWAY01) and you have the app installed on a secondary, DR server (GWAY02) waiting for a manual failover.
- STOP! If you’re not comfortable with PowerShell, jump ahead to a restriction implementation of this functionality I detail later
- To implement the clustering we have to first update the main/production server, which isn’t best practice but it’s new functionality and there’s no way around it I’m afraid. The name of this Data Gateway is what the cluster will be called so make sure you don’t start with your DR one first. (Assuming it’s named as such). For testing I used PRODUCTION_GATEWAY
- You’d then update the DR server to the same version, but you can’t. There is no way to setup clustering on an already installed Gateway application (updated or not).
- You have to uninstall the application and then re-install it.
- During configuration you will be asked to take over an existing Gateway or create a new one. You should select to create a new one, set a name and check to add it to an existing cluster where you should see any other Gateways you own, you’ll also need the Recovery Key of the Gateway [cluster] you’re adding it to.
- And you’re done. The setup is that simple, which is great. No complicated DNS, failover settings or witness servers etc.
When you’ve completed the setup steps on the second node of your new cluster you’ll notice absolutely no indication that it is a member of a cluster. Check the main node too. Nothing there either.
Log into the Power BI Service and it’s just the same. No visibility of how many servers are in your cluster or what those are so you need to write that down somewhere!
You can see two Gateways here but only the PRODUCTION_GATEWAY is clustered. They both look exactly the same from the Manage Gateways page though.
The information is there and you can get to it using PowerShell but it’s not yet straight forward…
I love PowerShell and I even wrote a module with functions to query Power BI metadata but there should always be another way to get this vital information.
The documentation I mentioned earlier points you to a PowerShell module file included in the November update. You can load this file & use the commands they provide to get information about your Gateway cluster and its members or make changes to clusters.
#Start in the Data Gateway's folder
cd 'C:\Program Files\On-premises data gateway'
#Import the module file into this session
#Login to Azure using the provided email. A pop-up prompt will ask for password
Login-OnPremisesDataGateway -EmailAddress CPorteous@SQLGlasgow.co.uk
#Get all Gateway Clusters (This will include Gateways that aren't clustered)
#Get the member Gateways inside this sepecific Cluster ID
Get-OnPremisesDataClusterGateways -ClusterObjectId d05fc213-XXXX-XXXX-XXXX-9c52cab6b2e7
You might think this is an optional piece of information; maybe you aren’t familiar with PowerShell and don’t have time to dive into it. Either way, you need to know how to execute these commands
At the time of writing this, the only way to remove a server from a Gateway cluster is by using the Remove-OnPremisesDataGateway PowerShell command. You can uninstall the application but the Gateway cluster will keep looking for that server.
Remove-OnPremisesDataGateway -ClusterObjectId d05fc213-XXXX-XXXX-XXXX-9c52cab6b2e7 -GatewayObjectID 935e9482-XXXX-XXXX-XXXX-a1d881865816
This is less than ideal and I hope they fix these issues in the January update.