Azure Data Factory, Event Driven ELT

Mar 11, 2022 8:34 AM

Personal Blog
Microsoft
Azure
Azure Data Factory
Azure Event Grid Topic
ELT
Event Driven

Getting started

A while ago, Azure Data Factory (ADF) received an update for its triggers, allowing pipelines to be triggered by events from an Azure Event Grid Topic. This is great, because it is now possible to do truly Event Driven ELT!

Why is Event Driven ELT great news? In enterprise environments, it is often a struggle to agree on when a department, team, or platform may load its data, due to the nature of the data warehouse or for other reasons. With this, you can make your Database Administrators (DBAs) very happy: instead of polling every x minutes to find out whether there are new deltas to load, the source can let you know when there are new deltas by posting an event to the Event Grid.

The Pipeline

To show an example, I created a simple pipeline called pl_wait, which waits for 30 seconds. Normally, you would use your copy pipeline for extracting and loading your data into your data lake, Azure SQL Database, or Synapse Analytics solution.

The Event Grid

Once your pipeline is ready to go, you need an Event Grid Topic. This can easily be created via the Azure Portal or via your Infrastructure as Code process.

When you have created your Event Grid Topic Service and then go to the Overview, you will see the Topic Endpoint on the right-hand side, which is the URL to which you send your events. In my example, this would be https://egt-blog-test.westeurope-1.eventgrid.azure.net/api/events.

Copy the URL for later. Then go to the Access keys tab and copy the value of Key 1, also for later use.

The Trigger

Back in ADF, go to the Manage tab (toolbox icon) and then to Triggers. Create a new trigger by clicking + New.

Give your trigger a proper name, choose Custom events under Type, and select your subscription and Event Grid Topic.

For the purpose of the example, I set Subject begins with to the value Wait and used the same value, Wait, for Event types. Don't forget to check the Start trigger box!
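To make the effect of these two settings concrete, here is a small Python sketch of how I understand the filtering: an event fires the trigger only when its subject starts with the configured prefix and its event type is in the configured list. This is an illustration, not the Event Grid implementation. The function name trigger_matches is hypothetical, and the exact, case-sensitive comparison is an assumption of this sketch; the real service may treat casing differently.

```python
def trigger_matches(event: dict, subject_begins_with: str, event_types: list) -> bool:
    """Return True if the event passes both filters of this sketch."""
    # Assumption: plain, case-sensitive prefix comparison on the subject.
    subject_ok = event.get("subject", "").startswith(subject_begins_with)
    # Assumption: the event type must exactly match one of the configured types.
    type_ok = event.get("eventType") in event_types
    return subject_ok and type_ok


wait_event = {"subject": "Wait", "eventType": "Wait"}
other_event = {"subject": "Load/Sales", "eventType": "Wait"}

print(trigger_matches(wait_event, "Wait", ["Wait"]))   # True
print(trigger_matches(other_event, "Wait", ["Wait"]))  # False
```

With the values from this post, only events whose subject begins with Wait and whose event type is Wait would start the pipeline.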

With this, you have created an Event Subscription to the Event Grid Topic Service.
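If you keep your ADF resources in source control, the same trigger also exists as a JSON definition. The Python sketch below builds what I would expect that definition to look like for this example, based on the CustomEventsTrigger type. The trigger name tr_wait and the subscription and resource group placeholders are hypothetical, and the topic name egt-blog-test is inferred from the endpoint URL above; compare the output with the code view of your own trigger before relying on it.

```python
import json

# Hypothetical placeholders: fill in your own subscription ID and resource group.
SUBSCRIPTION_ID = "<subscription-id>"
RESOURCE_GROUP = "<resource-group>"
TOPIC_NAME = "egt-blog-test"  # inferred from the Topic Endpoint in this post


def build_trigger_definition(pipeline_name: str, subject_begins_with: str,
                             event_types: list) -> dict:
    """Build a custom events trigger definition as a plain dict."""
    # The scope is the ARM resource ID of the Event Grid Topic.
    scope = (
        f"/subscriptions/{SUBSCRIPTION_ID}/resourceGroups/{RESOURCE_GROUP}"
        f"/providers/Microsoft.EventGrid/topics/{TOPIC_NAME}"
    )
    return {
        "name": "tr_wait",  # hypothetical trigger name
        "properties": {
            "type": "CustomEventsTrigger",
            "typeProperties": {
                "subjectBeginsWith": subject_begins_with,
                "events": event_types,
                "scope": scope,
            },
            "pipelines": [
                {
                    "pipelineReference": {
                        "referenceName": pipeline_name,
                        "type": "PipelineReference",
                    }
                }
            ],
        },
    }


definition = build_trigger_definition("pl_wait", "Wait", ["Wait"])
print(json.dumps(definition, indent=2))
```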

API call to Event Grid

Calling the endpoint of the Event Grid Topic can be done in many different ways and depends entirely on your environment. For on-prem environments, this can be done from, for example, a SQL job, but also via a service such as BizTalk. In this example I used the following PowerShell code to make the API call:

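# Event time in UTC, ISO 8601 format
$eventDate = (Get-Date).ToUniversalTime().ToString("s") + "Z"
# Access key (Key 1) of the Event Grid Topic; replace with your own
$eagSASkey = "OsgW/HI4uC7YoaFFdir/++1ZSP7j1Xme/J7FahFP1d8="
# Topic Endpoint of the Event Grid Topic; replace with your own
$eventTopicURL = "https://egt-blog-test.westeurope-1.eventgrid.azure.net/api/events"
# Random event ID; it is sent as a string in the JSON body
$eventID = Get-Random -Maximum 99999

# Event body in the Event Grid event schema (an array with one event)
$json = @"
[
  {
    "subject": "Wait",
    "id": "$eventID",
    "eventType": "Wait",
    "eventTime": "$eventDate",
    "data": {
      "name": "Let's wait a bit"
    },
    "dataVersion": "1"
  }
]
"@

# POST the event to the topic, authenticating with the aeg-sas-key header
Invoke-RestMethod -Uri $eventTopicURL -Method POST -Body $json -Headers @{"aeg-sas-key" = $eagSASkey} -ContentType 'application/json'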

Assign your own Access Key to the $eagSASkey variable and your own Topic URL to the $eventTopicURL variable. This script can be run from on-prem, the Azure Cloud Shell, etc. Run the script and see the magic!
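If PowerShell is not available in your environment, the same call can be made from almost any language. As a minimal sketch, here is a Python version that uses only the standard library. The function names build_event, build_request, and send_events are hypothetical helpers for this post, and using a GUID as the event ID is my own choice. The request itself mirrors the PowerShell call: a POST of a JSON array to the topic endpoint with the aeg-sas-key header.

```python
import json
import uuid
from datetime import datetime, timezone
from urllib import request


def build_event(subject: str, event_type: str, data: dict) -> dict:
    """Build one event in the Event Grid event schema."""
    return {
        "id": str(uuid.uuid4()),  # unique ID, sent as a string
        "subject": subject,
        "eventType": event_type,
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "data": data,
        "dataVersion": "1",
    }


def build_request(topic_url: str, sas_key: str, events: list) -> request.Request:
    """Prepare the POST request without sending it."""
    body = json.dumps(events).encode("utf-8")
    return request.Request(
        topic_url,
        data=body,
        method="POST",
        headers={"aeg-sas-key": sas_key, "Content-Type": "application/json"},
    )


def send_events(topic_url: str, sas_key: str, events: list) -> int:
    """Send the events and return the HTTP status code."""
    with request.urlopen(build_request(topic_url, sas_key, events)) as response:
        return response.status
```

To fire the trigger from this post, you would call send_events with your own Topic Endpoint, your own access key, and a list containing build_event("Wait", "Wait", {"name": "Let's wait a bit"}).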

The Pipeline run

After running the script, you will see in ADF, under Trigger runs, that the trigger was fired by the Event Grid event.

As you can see, the triggered pipeline run successfully waited 30 seconds, but as stated before, this could just as easily be your ELT pipeline.

Looking at the Activity log within the Event Grid Topic Service, you will also see that the EventSubscription from the ADF was triggered.

What's next?

I'm busy with some recordings, let's see if these will be done by next week! Stay tuned!