The Purpose and Pain of Azure Resource Locks

Azure resource locks prevent the deletion or modification of resources within an Azure tenant (depending on which type of lock is applied). They can be applied to subscriptions, resource groups, or individual resources. A lock overrides any permissions a user has.

An example lock applied directly to a Data Factory resource

There are two types of locks:

CanNotDelete - the resource can be read and modified, but not deleted, by anyone.
ReadOnly - the resource can be read, but not modified or deleted.
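
As a rough illustration of applying either type from the Azure CLI, here is a minimal sketch. The function name `apply_group_lock` and the lock-naming scheme are my own assumptions for the example; `az lock create` accepts `CanNotDelete` and `ReadOnly` for `--lock-type`.

```shell
#!/usr/bin/env bash
# Sketch: apply a lock to a resource group, rejecting unknown lock types.
# apply_group_lock and the "<group>-<type>" lock name are illustrative assumptions.
apply_group_lock() {
  local resource_group="$1" lock_type="$2"
  case "$lock_type" in
    CanNotDelete|ReadOnly) ;;
    *) echo "unknown lock type: $lock_type" >&2; return 2 ;;
  esac
  # Locks on a resource group are inherited by every resource inside it.
  az lock create \
    --name "${resource_group}-${lock_type}" \
    --lock-type "$lock_type" \
    --resource-group "$resource_group"
}
```

For example, `apply_group_lock rg-data CanNotDelete` would lock a (hypothetical) resource group called `rg-data`.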

In theory, these are perfect for preventing accidental (or deliberate) deletion of resources in Azure. They only operate at the “control plane” of a resource though, so they don’t prevent the deletion of data inside it. That still sounds great. Turn them on everywhere! That’s another layer of security in your cloud data platform. Right?

Digital Asbestos

I debated this heading, as I lost my Dad to asbestosis, but I honestly can’t think of a better way to describe this feature, and I’m not that thin-skinned either. For more information on asbestos, see Action on Asbestos.

Azure Resource Locks are NOT as simple or as beneficial as they appear on the surface. Focusing on the CanNotDelete lock, there’s a lot going on under the hood that isn’t apparent right away and that could severely impact the operation of resources at a later date. Many of these drawbacks are documented in the Azure Resource Locks docs. Here are a few excerpts:

A cannot-delete lock on a resource group prevents Azure Resource Manager from automatically deleting deployments in the history. If you reach 800 deployments in the history, your deployments will fail.
A cannot-delete lock on the resource group created by Azure Backup Service causes backups to fail. The service supports a maximum of 18 restore points. When locked, the backup service can’t clean up restore points. For more information, see Frequently asked questions-Back up Azure VMs.
A cannot-delete lock on a resource group prevents Azure Machine Learning from autoscaling Azure Machine Learning compute clusters to remove unused nodes.
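
To make the first excerpt concrete, here is a hedged sketch of a check that could run before a deployment to see how close a locked resource group is to the 800-deployment limit. The function name `deployment_history_check` and the default warning threshold of 700 are assumptions for illustration, not anything from the docs.

```shell
#!/usr/bin/env bash
# Sketch: warn when a resource group's deployment history nears the 800 limit.
# deployment_history_check and the default threshold (700) are illustrative assumptions.
deployment_history_check() {
  local resource_group="$1" threshold="${2:-700}"
  local count
  # length(@) is a JMESPath expression that counts the returned deployments.
  count=$(az deployment group list \
    --resource-group "$resource_group" \
    --query "length(@)" --output tsv) || return 1
  if [ "$count" -ge "$threshold" ]; then
    echo "WARNING: $resource_group has $count deployments (limit 800)"
    return 3
  fi
  echo "OK: $resource_group has $count deployments"
}
```

A non-zero return code lets a pipeline step fail early, or trigger a manual clean-up of old deployments, before the limit is actually hit.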

Not all drawbacks are documented. The most significant impact I’ve come across is on Azure Data Factory’s functionality.

Azure Data Factory

As Microsoft’s data orchestration tool in Azure, Data Factory is often central to data platform implementations and a prime candidate for such an Azure Resource lock to be applied.

Applying a CanNotDelete lock to a Data Factory resource, or to the resource group it is a member of (locks are inherited by child resources automatically), means you can no longer remove core components inside Data Factory, such as Linked Services.

Resource lock error

Playing devil’s advocate for a second, I would argue that outside of a development environment I should not be deleting linked services inside Data Factory anyway. These should be managed with a CI/CD pipeline, as I describe in a previous series on CI/CD for Azure Data Factory with Azure DevOps.

The problem is that changes made in a dev environment need to be propagated to your Test, UAT, and Production environments, and any deployment that removes a component will fail there because of the resource locks.

Triggers too

Making Data Factory an exception to resource locks won’t solve the problem entirely either. If you use a storage event trigger in Data Factory, you’re attaching an Event Grid event subscription to the storage account you want to trigger from. Any change to that trigger, such as disabling, removing, or updating it, will conflict with a resource lock on that storage account.

A BlobCreated event on a storage account
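
Before changing an event-based trigger, it can help to check whether the storage account behind it carries a lock. The sketch below is one way to do that; the function name `storage_account_locks` is an assumption. I'd also treat the output as a starting point only, since a lock inherited from the resource group or subscription may not appear when querying at the resource level.

```shell
#!/usr/bin/env bash
# Sketch: list the name and level of locks scoped to a storage account.
# storage_account_locks is an illustrative name. Locks inherited from the
# resource group or subscription may need a separate az lock list call.
storage_account_locks() {
  local resource_group="$1" account="$2"
  az lock list \
    --resource-group "$resource_group" \
    --resource-name "$account" \
    --resource-type Microsoft.Storage/storageAccounts \
    --query "[].[name, level]" --output tsv
}
```

Calling `storage_account_locks rg-data stdatalake` (both names hypothetical) prints one line per lock, with its name and its level (CanNotDelete or ReadOnly).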

Turn it off and on again

Going back to CI/CD pipelines, there’s a way around almost all of these problems. As part of a deployment pipeline we can introduce a step that programmatically removes the necessary resource locks. We can then continue with the normal deployment tasks before re-applying the locks at the end.

Deleting a lock with the Azure CLI

# Look up the lock's resource ID, then delete the lock by that ID
lockid=$(az lock show --name LockSite --resource-group exampleresourcegroup --resource-type Microsoft.Web/sites --resource-name examplesite --output tsv --query id)
az lock delete --ids "$lockid"
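
The remove, deploy, re-apply sequence could be wrapped up along these lines. This is only a sketch under a few assumptions: the function name `with_lock_removed` is made up, it handles a single lock at resource group level, and re-creating the lock this way does not carry over any notes that were on the original.

```shell
#!/usr/bin/env bash
# Sketch: remove a resource-group lock, run a deployment command, then
# re-create the lock whether or not the command succeeded.
# Usage: with_lock_removed <lock-name> <lock-type> <resource-group> <command...>
# with_lock_removed is an illustrative name, not part of the Azure CLI.
with_lock_removed() {
  local lock_name="$1" lock_type="$2" resource_group="$3"
  shift 3
  az lock delete --name "$lock_name" --resource-group "$resource_group" || return 1
  local status=0
  # Run the wrapped command and remember its exit code instead of aborting.
  "$@" || status=$?
  # Always put the lock back, even if the deployment failed.
  az lock create --name "$lock_name" --lock-type "$lock_type" \
    --resource-group "$resource_group" || return 1
  return "$status"
}
```

A pipeline step might then call `with_lock_removed LockGroup CanNotDelete exampleresourcegroup ./deploy.sh`, where `./deploy.sh` stands in for whatever the real deployment task is. Returning the wrapped command's exit code means a failed deployment still fails the pipeline, but only after the lock is back in place.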

This doesn’t really help us in development though, and I think that’s ok. We should have the right RBAC set up on development resources so that someone who doesn’t know what they’re doing can’t break things easily.

It’s sort of cheating though, isn’t it? Turn it off to do what we need, then turn it back on.