PythonVirtualenvOperator

Allows one to run a function in a virtualenv that is created and destroyed automatically (with certain caveats).

Last Updated: May. 27, 2021

Access Instructions

Install the Apache Airflow provider package into your Airflow environment.

Import the module into your DAG file and instantiate it with your desired params.
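As a minimal sketch of the two steps above, the following shows one way the operator might be configured. It assumes an Airflow 2.x installation, where the operator is importable from airflow.operators.python; the task_id value, the callable_in_venv function, and the build_task helper are hypothetical names chosen for illustration. The Airflow import is kept inside build_task so the rest of the snippet can be read and run without Airflow installed.

```python
def callable_in_venv(greeting, name="world"):
    # Imports live inside the function body, because the function is run
    # on its own inside the virtualenv.
    import json

    return json.dumps({"message": f"{greeting}, {name}"})


# Keyword arguments for the operator. Apart from task_id (a standard
# operator argument), every key is a parameter listed on this page.
operator_kwargs = {
    "task_id": "virtualenv_example",  # hypothetical task name
    "python_callable": callable_in_venv,
    "requirements": [],  # extra pip requirements would go here
    "system_site_packages": False,
    "op_args": ["hello"],
    "op_kwargs": {"name": "airflow"},
}


def build_task(dag=None):
    # Assumption: Airflow 2.x import path. Not called here, so the
    # snippet does not require Airflow just to be executed.
    from airflow.operators.python import PythonVirtualenvOperator

    return PythonVirtualenvOperator(dag=dag, **operator_kwargs)


# Calling the function directly shows what the task would compute.
preview = callable_in_venv(*operator_kwargs["op_args"], **operator_kwargs["op_kwargs"])
```

In a DAG file, build_task(dag) would be called inside the DAG definition; the preview line only exercises the callable locally and does not create a virtualenv.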

Parameters

python_callable (function, required): A Python function with no references to outside variables, defined with def, which will be run in a virtualenv.
requirements (list[str]): A list of requirements as specified in a pip install command.
python_version (Optional[Union[str, int, float]]): The Python version to run the virtualenv with. Note that both 2 and 2.7 are acceptable forms.
use_dill (bool): Whether to use dill to serialize the args and result (pickle is the default). This allows more complex types but requires you to include dill in your requirements.
system_site_packages (bool): Whether to include system_site_packages in your virtualenv. See the virtualenv documentation for more information.
op_args (list): A list of positional arguments to pass to python_callable.
op_kwargs (dict): A dict of keyword arguments to pass to python_callable.
string_args (list[str]): Strings that are present in the global variable virtualenv_string_args, available to python_callable at runtime as a list[str]. Note that args are split by newline.
templates_dict (dict of str): A dictionary whose values are templates that the Airflow engine renders at some point between __init__ and execute. The rendered values are made available in your callable's context.
templates_exts (list[str]): A list of file extensions to resolve while processing templated fields, for example ['.sql', '.hql'].
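The note that string_args are split by newline has a practical consequence: a single string containing a line break arrives as two entries. The sketch below illustrates this under the assumption that the strings are transported one per line; round_trip_string_args is a hypothetical stand-in for that transport, not an Airflow function.

```python
def round_trip_string_args(string_args):
    # Assumed transport: the arguments are joined with newlines on the
    # Airflow side and split on newlines again inside the virtualenv.
    payload = "\n".join(str(arg) for arg in string_args)
    return payload.split("\n")


# Strings without line breaks survive unchanged.
clean = round_trip_string_args(["2021-05-27", "eu-west-1"])

# One string with an embedded newline comes back as two entries.
split = round_trip_string_args(["first line\nsecond line"])
```

Under this assumption it is safest to keep each entry of string_args on a single line and to encode multi-line values before passing them.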

Documentation

The function must be defined using def and must not be part of a class. All imports must happen inside the function, and no variables outside its scope may be referenced. A global variable named virtualenv_string_args is available (populated by string_args). In addition, you can pass arguments through op_args and op_kwargs, and the function can return a value. Note that if your virtualenv runs a different Python major version than Airflow, you cannot use return values, op_args, op_kwargs, or any macros provided to Airflow through plugins. You can still use string_args.
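The self-containment rule can be illustrated without Airflow. The sketch below is an in-process stand-in, not the operator's actual mechanism: it executes a function's source in a fresh namespace that holds only virtualenv_string_args, which approximates running the function in a separate interpreter. The names run_isolated, count_words, uses_outer, and OUTER_SETTING are hypothetical.

```python
# A callable that follows the rules: defined with def, imports inside the
# body, and only virtualenv_string_args read from the global scope.
GOOD_SOURCE = '''
def count_words():
    import collections
    words = [w for line in virtualenv_string_args for w in line.split()]
    return dict(collections.Counter(words))
'''

# A callable that breaks the rules by reading a module-level variable
# that will not exist where the function actually runs.
BROKEN_SOURCE = '''
def uses_outer():
    return OUTER_SETTING * 2
'''


def run_isolated(source, func_name, string_args):
    # Fresh global namespace: nothing from this module leaks in except
    # the string arguments, mimicking a separate process.
    namespace = {"virtualenv_string_args": list(string_args)}
    exec(source, namespace)
    return namespace[func_name]()


result = run_isolated(GOOD_SOURCE, "count_words", ["a b", "b c"])

try:
    run_isolated(BROKEN_SOURCE, "uses_outer", [])
    failure = None
except NameError as exc:
    # The outside variable is simply undefined in the isolated namespace.
    failure = str(exc)
```

The second call fails with a NameError because OUTER_SETTING exists only in the defining module, which is the kind of error to expect when a callable references variables outside its own scope.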

See also

For more information on how to use this operator, take a look at the guide: PythonVirtualenvOperator
