How to securely access Azure from Azure DevOps agents

Microsoft Azure and Azure DevOps (ADO) are two popular but distinct services. Azure is the cloud environment where you deploy and use cloud resources such as virtual machines, storage accounts, key vaults and container registries. ADO is a SaaS offering where developers can, among other things, commit their code and run CI/CD pipelines that build their applications and automatically deploy to Azure resources.

Azure resources such as storage accounts, key vaults and container registries have networking settings that, by default, expose the service to the public internet. For security reasons, however, it is recommended to allow access to these services only from where it is needed.

Public Access from Selected Networks

The issue with Azure DevOps (or other SaaS CI/CD tooling, for that matter) is that its pipelines run by default on publicly hosted agents. If we block public access, our pipelines will fail because the agents can no longer reach the Azure resources.

There are two recommended options for overcoming this problem depending on your environment.

Private access

Configuring private access by creating a private endpoint for your service and disabling public access is the most secure option. However, this option requires you to run your own self-hosted build agent. If your self-hosted agent runs on a VM with a fixed IP address, you can whitelist access to the Azure resource from just this VM.

The downside is that you have to configure, host and maintain your own agent(s), which is often not cost-efficient.

If you would like more information on how to configure your own agent please have a look at the Microsoft documentation: https://docs.microsoft.com/en-us/azure/devops/pipelines/agents/agents?view=azure-devops&tabs=browser

Public access

Microsoft-hosted agents connect to your Azure resources over the public internet. Each agent has its own public IP address, and you don’t know up front which agent is going to be used. You could whitelist the IP ranges of all Microsoft-hosted agents, but that list is dynamic and updated weekly (https://docs.microsoft.com/en-us/azure/devops/pipelines/agents/hosted?view=azure-devops&tabs=yaml#agent-ip-ranges). Even if you automate this, you would be allowing far more IP addresses than necessary and exposing your Azure resources more than needed.
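To get a feel for how much a range-based whitelist opens up, the following sketch counts the IPv4 addresses covered by a list of CIDR ranges. The ranges used here are hypothetical placeholders, not actual agent ranges; the real list has to come from the weekly file Microsoft publishes.

```shell
#!/bin/sh
# Sketch: count how many IPv4 addresses a set of CIDR ranges allows.
# The ranges passed in below are hypothetical placeholders, NOT real agent ranges.

cidr_size() {
  # Keep only the part after the slash, e.g. "10.0.0.0/24" -> "24".
  prefix="${1#*/}"
  # A /N range contains 2^(32-N) addresses.
  echo $(( 1 << (32 - prefix) ))
}

total_addresses() {
  total=0
  for range in "$@"; do
    total=$(( total + $(cidr_size "$range") ))
  done
  echo "$total"
}

# Three placeholder ranges already add up to tens of thousands of addresses.
total_addresses 10.0.0.0/16 10.1.0.0/20 10.2.0.0/24
```

Compare that total with the single address that a just-in-time rule for one agent would allow.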

Unfortunately, ADO is not a trusted Microsoft service; otherwise, simply checking the “Allow trusted Microsoft services” box would have made our lives very easy.

We came up with an automated alternative: whitelisting only the public IP address of the ADO-hosted agent that is currently running the pipeline.

The example below shows how to do this for an Azure Container Registry. The concept is similar for other Azure resources; just swap in the Azure CLI command for the resource you want to apply it to.
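As a rough illustration of swapping the command per resource type, the sketch below prints (rather than runs) the add-rule command for a container registry, a storage account or a key vault. The helper name network_rule_add_cmd and the resource names are made up for this example, and the flags are worth checking against the Azure CLI version you use.

```shell
#!/bin/sh
# Sketch: print the Azure CLI command that adds an IP rule for a resource type.
# network_rule_add_cmd is a hypothetical helper; it only echoes the command.

network_rule_add_cmd() {
  resource_type="$1"; resource_group="$2"; name="$3"; ip="$4"
  case "$resource_type" in
    acr)
      echo "az acr network-rule add --resource-group $resource_group --name $name --ip-address $ip" ;;
    storage)
      # Storage accounts take --account-name instead of --name.
      echo "az storage account network-rule add --resource-group $resource_group --account-name $name --ip-address $ip" ;;
    keyvault)
      echo "az keyvault network-rule add --resource-group $resource_group --name $name --ip-address $ip" ;;
    *)
      echo "unsupported resource type: $resource_type" >&2
      return 1 ;;
  esac
}

# Placeholder resource group, account name and IP address.
network_rule_add_cmd storage my-rg mystorageaccount 203.0.113.10
```

Each of these command groups also has a matching remove subcommand for the cleanup step.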

Get the IP address of the agent

First we need to get the public IP address of the agent and export it to a variable to be used in later tasks in the pipeline.

          - task: Bash@3
            displayName: Get agent IP address
            inputs:
              targetType: 'inline'
              script: |
                # Look up this agent's public IP over HTTPS and publish it as a pipeline variable for later tasks
                echo "##vso[task.setvariable variable=AGENT_IP_ADDRESS]$(curl -s https://ipinfo.io/json | jq -r '.ip')"
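Because the address comes from an external lookup service, it may be worth checking that the response really looks like an IPv4 address before passing it to a firewall rule. A minimal sketch of such a check follows; the function name is_ipv4 is made up for this example and is not part of the original pipeline.

```shell
#!/bin/sh
# Sketch: reject anything that is not a dotted-quad IPv4 address.
# is_ipv4 is a hypothetical helper name.

is_ipv4() {
  # Shape check: four dot-separated groups of one to three digits.
  printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$' || return 1
  # Range check: split on dots and make sure every octet is at most 255.
  old_ifs="$IFS"; IFS=.
  set -- $1
  IFS="$old_ifs"
  for octet in "$@"; do
    [ "$octet" -le 255 ] || return 1
  done
  return 0
}

# Try a valid address, an out-of-range one and an empty response.
for candidate in "203.0.113.10" "999.1.1.1" ""; do
  if is_ipv4 "$candidate"; then
    echo "ok: $candidate"
  else
    echo "rejected: '$candidate'"
  fi
done
```

In the pipeline, a failed check could simply exit with a non-zero status so that no rule is added for a malformed value.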

Whitelist the IP address

In theory, adding the rule should be enough. However, we found that when running multiple tasks in parallel, propagation of the firewall rules could take up to two minutes, even after Azure already listed the IP address in the whitelist. The task below therefore polls until the rule shows up and then sleeps for another 120 seconds.

          - task: AzureCLI@2
            displayName: Add agent IP to ACR firewall
            inputs:
              azureSubscription: ${{ variables.serviceconnection }}
              scriptType: bash
              scriptLocation: inlineScript
              inlineScript: |
                az acr network-rule add --subscription ${{ variables.subscription }} --resource-group ${{ variables.acr_resource_group }} --name ${{ variables.acr }} --ip-address $(AGENT_IP_ADDRESS)
                # Poll until the registry lists the agent IP among its network rules
                while true; do
                  if az acr network-rule list --subscription ${{ variables.subscription }} --resource-group ${{ variables.acr_resource_group }} --name ${{ variables.acr }} | jq -e '.[] | .[].ipAddressOrRange | select(. == "$(AGENT_IP_ADDRESS)")'; then
                    break
                  fi
                  echo "IP address $(AGENT_IP_ADDRESS) not found yet, sleeping 10 seconds..."
                  sleep 10
                done
                echo "IP address: $(AGENT_IP_ADDRESS) added, but still going to wait two minutes..."
                sleep 120
                echo "Firewall rules should have propagated by now."
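The polling loop above keeps going until the rule appears. If you would rather give up after a fixed number of attempts, a bounded variant could look like the sketch below. The helper name retry_until is made up, and the check being retried (for instance a wrapper around the az and jq command above) is left as an assumption.

```shell
#!/bin/sh
# Sketch: retry a command a limited number of times with a fixed delay.
# retry_until is a hypothetical helper, not part of the original pipeline.

retry_until() {
  max_attempts="$1"; delay="$2"; shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # Run the remaining arguments as the check; stop as soon as it succeeds.
    if "$@"; then
      return 0
    fi
    echo "Attempt $attempt/$max_attempts failed, sleeping ${delay}s..." >&2
    sleep "$delay"
    attempt=$(( attempt + 1 ))
  done
  # The check never succeeded within the allowed attempts.
  return 1
}

# Demo with a check that succeeds immediately. In a pipeline this might be
# something like: retry_until 30 10 ip_rule_present  (ip_rule_present is hypothetical)
retry_until 3 0 true && echo "condition met"
```

A non-zero return lets the task fail with a clear message instead of hanging until the job times out.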

Execute the task that uses the Azure resource

Instead of sleeping in the previous task, you could also set the retry count to a very high number. We set a retry count here as well, to handle temporary Azure API hiccups and make the pipeline more reliable.

          - task: Docker@2
            displayName: Push Docker container
            retryCountOnTaskFailure: 5
            inputs:
              containerRegistry: ${{ variables.registry }}
              repository: ${{ variables.repository }}
              command: 'push'
              tags: ${{ variables.tags }}

Remove the IP address again

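When the task that needs access has finished, remove the IP address again. Setting condition: always() on the cleanup task is a reasonable addition, so that the rule is also removed when an earlier task fails.

          - task: AzureCLI@2
            displayName: Remove agent IP from ACR firewall
            # Addition to the original example: run the cleanup even if an earlier task failed
            condition: always()
            inputs: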
              azureSubscription: ${{ variables.serviceconnection }}
              scriptType: bash
              scriptLocation: inlineScript
              inlineScript: |
                az acr network-rule remove --subscription ${{ variables.subscription }} --resource-group ${{ variables.acr_resource_group }} --name ${{ variables.acr }} --ip-address $(AGENT_IP_ADDRESS)
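If the add, use and remove steps all live in a single script instead of separate pipeline tasks, a shell trap is one way to make sure the rule is removed even when the work in between fails. The sketch below uses stand-in functions that only print what they would do; in a real script they would wrap the az acr network-rule add and remove commands shown above. All function names are made up for this example.

```shell
#!/bin/sh
# Sketch: guarantee cleanup of a temporary firewall rule with an EXIT trap.
# add_ip_rule / remove_ip_rule are stand-ins that only print; they do not call Azure.

add_ip_rule() { echo "add rule for $1"; }
remove_ip_rule() { echo "remove rule for $1"; }

with_temporary_ip_rule() {
  ip="$1"; shift
  (
    # The trap fires when this subshell exits, whether the command succeeded or not.
    trap 'remove_ip_rule "$ip"' EXIT
    add_ip_rule "$ip"
    # Run the actual work, e.g. a docker push, passed in as the remaining arguments.
    "$@"
  )
}

# Placeholder IP address and a harmless command standing in for the real work.
with_temporary_ip_rule 203.0.113.10 echo "pushing image"
```

The function returns the exit status of the wrapped command, so a failed push still fails the script after the rule has been removed.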

Conclusion

Microsoft-hosted agents give you a lot of flexibility and scalability for running your CI/CD pipelines, but keep in mind that these agents do not run in your own private environment. You therefore need to pay extra attention to how you secure your resources.

The above example is specific to Azure DevOps, but the concept may apply to many other public services. Just-in-time public access leaves a theoretical security risk, but in practice that risk is small: only a single IP address is allowed, and only for the duration of the pipeline run. If you still feel uncomfortable with your resources being temporarily exposed, opt for private access using self-hosted agents.
