Airflow and automated OpsGenie alerts
Are you looking to automatically create and close alerts when tasks fail in Airflow? Read on, and I’ll show you a little trick.
So, at Unacast we have been using OpsGenie as our alerting system for a while now. We formerly used PagerDuty, but saw an option to shave off a few extra $ by switching. It turned out to be a pretty easy switch, and the two are quite similar in user experience and integrations. So, I’ll quickly show how we both create and close our alerts in OpsGenie from Airflow.
Creating OpsGenie alerts from Airflow
This one is pretty straightforward, and Airflow even has a hook for connecting to OpsGenie. To set it up, add an Airflow HTTP connection, using a key you’ve created under “…opsgenie.com/settings/api-key-management” as the password.
Next is to start using it in your operators. It basically comes down to adding a function reference as the on_failure_callback, like so:
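A minimal sketch of that wiring, assuming Airflow 2.4 or later (for the `schedule` argument and the `airflow.operators.bash` import path). The `report_failure` function, the DAG id and the task are hypothetical placeholders; a real reporter would take the callback’s place.

```python
from datetime import datetime


def report_failure(context):
    """Placeholder callback. Airflow calls it with the task context dict
    when a task instance fails."""
    ti = context["task_instance"]
    message = f"Task {ti.task_id} in DAG {ti.dag_id} failed"
    print(message)
    return message


def build_dag():
    # Imported lazily so the callback above can be tested without Airflow.
    from airflow import DAG
    from airflow.operators.bash import BashOperator

    with DAG(
        dag_id="opsgenie_alert_example",  # hypothetical DAG id
        start_date=datetime(2024, 1, 1),
        schedule=None,
        catchup=False,
    ) as dag:
        BashOperator(
            task_id="might_fail",  # hypothetical task
            bash_command="exit 1",
            # Pass the function itself, not the result of calling it.
            on_failure_callback=report_failure,
        )
    return dag
```

In a real DAG file you would assign the result at module level, for example `dag = build_dag()`, so the scheduler can discover it. Putting the callback in `default_args` instead applies it to every task in the DAG.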
Where the simplest implementation of the OpsGenieExceptionReporte could be something like this:
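What follows is a sketch under assumptions, not our exact code. It assumes the `apache-airflow-providers-opsgenie` package, whose `OpsgenieAlertHook` offers `create_alert(payload)`; older Airflow versions ship the hook under a different module and method name. The class name `OpsGenieExceptionReporter`, the `alert_alias` helper, the `hook_factory` argument and the `opsgenie_default` connection id are choices made for this illustration.

```python
def alert_alias(task_instance):
    """One alias per task, so a later success can address the same alert."""
    return f"{task_instance.dag_id}.{task_instance.task_id}"


def default_hook_factory(conn_id):
    # Assumption: a recent OpsGenie provider package is installed.
    # Imported lazily so the rest of the module works without Airflow.
    from airflow.providers.opsgenie.hooks.opsgenie import OpsgenieAlertHook

    return OpsgenieAlertHook(opsgenie_conn_id=conn_id)


class OpsGenieExceptionReporter:
    """Callable that can be used as an on_failure_callback."""

    def __init__(self, conn_id="opsgenie_default", hook_factory=default_hook_factory):
        self.conn_id = conn_id
        self.hook_factory = hook_factory  # injectable, which keeps it testable

    def build_payload(self, context):
        ti = context["task_instance"]
        return {
            # OpsGenie limits the message field to 130 characters.
            "message": f"Airflow task failed: {ti.dag_id}.{ti.task_id}"[:130],
            # The alias is what we reuse later to close the alert.
            "alias": alert_alias(ti),
            "description": f"Exception: {context.get('exception')!r}",
            "source": "airflow",
        }

    def __call__(self, context):
        payload = self.build_payload(context)
        self.hook_factory(self.conn_id).create_alert(payload)
        return payload
```

With this in place, a task would take `on_failure_callback=OpsGenieExceptionReporter()`. Because the alias is derived only from the DAG id and task id, repeated failures of the same task map to the same alias rather than a new one each time.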
I’ll show later how this can be made even nicer, but for now, let’s skip to how we can automatically close alerts. And btw, pay attention to where we set the alias for the alert. We will reuse this when closing the alert.
Automatically closing alerts
For automatically closing alerts, let’s hook on to the on_success_callback and create a class for closing alerts. It could look like this:
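Again a hedged sketch rather than our exact code. It assumes the `apache-airflow-providers-opsgenie` package, whose `OpsgenieAlertHook` has a `close_alert(identifier, identifier_type, payload)` method; on older Airflow versions you would have to call the OpsGenie close endpoint yourself. The class name `OpsGenieAlertCloser`, the `alert_alias` helper and the `hook_factory` argument are hypothetical.

```python
def alert_alias(task_instance):
    """Must produce the same alias that was used when creating the alert."""
    return f"{task_instance.dag_id}.{task_instance.task_id}"


def default_hook_factory(conn_id):
    # Assumption: a recent OpsGenie provider package is installed.
    from airflow.providers.opsgenie.hooks.opsgenie import OpsgenieAlertHook

    return OpsgenieAlertHook(opsgenie_conn_id=conn_id)


class OpsGenieAlertCloser:
    """Callable that can be used as an on_success_callback."""

    def __init__(self, conn_id="opsgenie_default", hook_factory=default_hook_factory):
        self.conn_id = conn_id
        self.hook_factory = hook_factory

    def __call__(self, context):
        ti = context["task_instance"]
        # Only bother OpsGenie when an earlier try may have raised an alert.
        if ti.try_number <= 1:
            return False
        self.hook_factory(self.conn_id).close_alert(
            identifier=alert_alias(ti),
            identifier_type="alias",  # look the alert up by alias, not by id
            payload={"source": "airflow", "note": "Closed after a successful retry"},
        )
        return True
```

The boolean return value is only there to make the sketch easy to test; Airflow ignores what a callback returns.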
Note that we reuse the alias here, and utilise the OpsGenieAlertHook.
We only close an alert if the task is on its second try or later, since we then assume that the previous try failed. OpsGenie doesn’t seem to mind if we try to close an alert that does not exist, or is already closed. This approach also means that if we mark a task as success, it will unfortunately not close the alert. It might be that the if task_instance.try_number > 1 check is not needed, but we felt that it was best not to test OpsGenie’s thresholds with needless close requests.
So there you have it. Easy? Good! Let’s see if we can clean up the message bit as well.
Jinja templating the message
It would be nice if we could easily customise parts of these alerts. So what better than to use Jinja? We already have the context at hand, so let’s try:
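A minimal sketch of the idea, assuming the `jinja2` package (which Airflow already depends on) and the `ti` and `exception` keys that Airflow puts in the context of a failure callback. The default templates and the `render_alert_text` function are hypothetical stand-ins, and the `*bold*` and `<url|text>` markup assumes the alert ends up in Slack.

```python
from jinja2 import Template

# Hypothetical defaults; override them per task where needed.
DEFAULT_MESSAGE = "Airflow task failed: {{ ti.dag_id }}.{{ ti.task_id }}"
DEFAULT_DESCRIPTION = (
    "*DAG:* `{{ ti.dag_id }}`\n"
    "*Task:* `{{ ti.task_id }}`\n"
    "*Exception:* `{{ exception }}`\n"
    "<{{ ti.log_url }}|View log>"
)


def render_alert_text(context, message=DEFAULT_MESSAGE, description=DEFAULT_DESCRIPTION):
    """Render both templates against the Airflow context dict."""
    return {
        # OpsGenie limits the message field to 130 characters.
        "message": Template(message).render(**context)[:130],
        "description": Template(description).render(**context),
    }
```

A reporter could accept `message` and `description` in its constructor and pass them through, so a single task can be given something like `message="{{ ti.task_id }} needs attention"` while every other task keeps the defaults.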
So here we now have a default description and message, which you can easily override per task using Jinja templating. And the code to use it is minimal. Worth mentioning is that this default message is geared towards Slack markdown, because at Unacast we receive our OpsGenie messages in Slack. Highly recommend it.
Would be very happy to hear if you use OpsGenie in some similar way, or if you have a nice solution to also close alerts on “mark as success”. Or if you just liked the post!