Parent: Production grade airflow scripting


Why?

You do this to draw a clear line: Airflow is just an orchestrator, and it should be kept separate from the business logic.

The golden rule

Your DAG file should read like a workflow diagram, not like an implementation. It should only describe what happens and when it happens.

Tasks should be lean, clean, and easy to read. Part of the reason is that DAG files are read many times as they are processed; putting too much in your DAGs causes overhead for the DAG Processor.
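One concrete way to keep parse-time overhead down is to move expensive imports and I/O inside the task callable, since anything at module level runs on every parse. A minimal sketch (the file path and function name are hypothetical, not from this note):

```python
# Anything at the top level of a DAG file runs every time the DAG Processor
# parses it. Heavy imports and file/DB reads belong inside the callable.

# BAD: executed on every parse, even when no task runs.
# import pandas as pd
# RULES = pd.read_csv("/opt/config/rules.csv")

def load_rules(path):
    """Load rules only when the task actually executes."""
    import csv  # deferred import: paid at task run time, not parse time

    with open(path) as f:
        return list(csv.DictReader(f))
```

The same idea applies to hooks and connections: build them inside the task function, never at module level.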

How?

This note has a prerequisite: Creating python modules

The gold standard for a task looks like this:

from airflow.decorators import task
from airflow.providers.microsoft.mssql.hooks.mssql import MsSqlHook

from compacter.sql_utils import get_pending_sql_keys

@task
def get_keys(distrik, limit):
    # Airflow owns the connection; the module owns the logic.
    engine = MsSqlHook(conn_id='mssql_creds').get_sqlalchemy_engine()
    return get_pending_sql_keys(engine, distrik, limit)

Very lean, no logic involved; it just tells you what gets run and when.
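For contrast, the business-logic side might look like the sketch below. This is hypothetical (the real compacter.sql_utils presumably targets MSSQL through the SQLAlchemy engine); the point is that the function accepts a connection/engine and knows nothing about Airflow:

```python
# Hypothetical sketch of a module like compacter/sql_utils.py: plain Python,
# no Airflow imports, fully testable on its own. The table name, columns,
# and signature are illustrative only.

def get_pending_sql_keys(conn, distrik, limit):
    """Return the keys still awaiting processing for one district."""
    cur = conn.execute(
        "SELECT key FROM pending_work WHERE distrik = ? LIMIT ?",
        (distrik, limit),
    )
    return [row[0] for row in cur.fetchall()]
```

Because the function takes any DB-API-style connection, it can be unit tested against an in-memory SQLite database without spinning up Airflow at all.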

Read how to grab DAG params here: Accessing Airflow Params

Resources: