

Author: Daniel Imberman (Bloomberg LP)

Introduction

As part of Bloomberg's continued commitment to developing the Kubernetes ecosystem, we are excited to announce the Kubernetes Airflow Operator: a mechanism for Apache Airflow, a popular workflow orchestration framework, to natively launch arbitrary Kubernetes Pods using the Kubernetes API.

What Is Airflow?

Apache Airflow is one realization of the DevOps philosophy of "Configuration As Code." Airflow allows users to launch multi-step pipelines using a simple Python object, the DAG (Directed Acyclic Graph). You can define dependencies, programmatically construct complex workflows, and monitor scheduled jobs in an easy-to-read UI.
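To make this concrete, here is a minimal sketch of such a DAG, assuming a stock Airflow 1.x installation; the pipeline name, task names, and bash commands are illustrative placeholders:

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

default_args = {
    'owner': 'airflow',
    'start_date': datetime(2018, 1, 1),
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
}

# The DAG itself is just a Python object; the schedule and
# retry policy are plain keyword arguments.
dag = DAG('example_pipeline',
          default_args=default_args,
          schedule_interval='@daily')

extract = BashOperator(task_id='extract',
                       bash_command='echo "extracting"',
                       dag=dag)
transform = BashOperator(task_id='transform',
                         bash_command='echo "transforming"',
                         dag=dag)
load = BashOperator(task_id='load',
                    bash_command='echo "loading"',
                    dag=dag)

# Dependencies are declared programmatically: extract runs
# first, then transform, then load.
extract >> transform >> load
```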

Why Airflow on Kubernetes?

Since its inception, Airflow's greatest strength has been its flexibility. Airflow offers a wide range of integrations for services ranging from Spark and HBase to services on various cloud providers, and it offers easy extensibility through its plug-in framework. However, one limitation of the project is that Airflow users are confined to the frameworks and clients that exist on the Airflow worker at the moment of execution. A single organization can have varied Airflow workflows, ranging from data science pipelines to application deployments, and this difference in use case creates issues in dependency management, as different teams might use vastly different libraries for their workflows.

To address this issue, we've utilized Kubernetes to allow users to launch arbitrary Kubernetes pods and configurations. Airflow users now have full power over their run-time environments, resources, and secrets, basically turning Airflow into an "any job you want" workflow orchestrator.
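As an illustration of what this unlocks, the sketch below uses the KubernetesPodOperator (which lives in Airflow 1.x's contrib package) to run a task in an arbitrary container image; the namespace, image, and task names here are placeholder assumptions, not a prescribed configuration:

```python
from datetime import datetime

from airflow import DAG
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator

dag = DAG('kubernetes_example',
          start_date=datetime(2018, 1, 1),
          schedule_interval='@once')

# Each task runs in its own pod, so it can pin whatever image and
# dependencies it needs, independent of the Airflow worker.
task = KubernetesPodOperator(
    namespace='default',            # assumed namespace
    image='python:3.6',             # any image the cluster can pull
    cmds=['python', '-c'],
    arguments=['print("hello from a pod")'],
    labels={'app': 'airflow'},
    name='airflow-pod-example',
    task_id='pod-example',
    get_logs=True,                  # stream pod logs back into the Airflow UI
    dag=dag,
)
```

Because the image is specified per task, two teams with conflicting library requirements can share one Airflow deployment while each task runs against its own dependencies.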
