S3ListOperator

Amazon

List all objects in the bucket whose names begin with the given string prefix.

Last Updated: May 7, 2021

Access Instructions

Install the Amazon provider package into your Airflow environment.

Import the module into your DAG file and instantiate it with your desired params.
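For example (assuming a pip-based environment; the import path shown matches the Amazon provider package at the time of writing and may differ in newer releases):

# In your Airflow environment:
#   pip install apache-airflow-providers-amazon
from airflow.providers.amazon.aws.operators.s3_list import S3ListOperator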

Parameters

bucket (str): The S3 bucket where to find the objects. (templated)

prefix (str): Prefix string to filter the objects whose names begin with this prefix. (templated)

delimiter (str): The delimiter that marks key hierarchy. (templated)

aws_conn_id (str): The connection ID to use when connecting to S3 storage.

verify (bool or str): Whether or not to verify SSL certificates for the S3 connection. By default SSL certificates are verified. You can provide the following values (see the sketch below):

False: do not validate SSL certificates. SSL will still be used (unless use_ssl is False), but SSL certificates will not be verified.

path/to/cert/bundle.pem: A filename of the CA cert bundle to use. You can specify this argument if you want to use a different CA cert bundle than the one used by botocore.
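As a sketch, passing verify=False disables certificate validation (the task_id, bucket, and prefix below are illustrative placeholders):

s3_list_unverified = S3ListOperator(
    task_id='list_files_no_ssl_verify',
    bucket='my-bucket',
    prefix='data/',
    verify=False,
)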

Documentation

This operator lists all objects in the bucket whose names begin with the given string prefix.

It returns a Python list of object keys, which downstream tasks can retrieve via XCom.

Example:

The following operator would list all the files (excluding subfolders) under the customers/2018/04/ prefix in the data bucket.

from airflow.providers.amazon.aws.operators.s3_list import S3ListOperator

s3_file = S3ListOperator(
    task_id='list_s3_files',
    bucket='data',
    prefix='customers/2018/04/',
    delimiter='/',
    aws_conn_id='aws_customers_conn',
)
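A minimal sketch of a downstream task pulling that list via XCom (the PythonOperator usage and the print_s3_keys task name are illustrative, not part of this operator):

from airflow.operators.python import PythonOperator

def print_keys(ti):
    # Pull the list of keys returned by the list_s3_files task above.
    keys = ti.xcom_pull(task_ids='list_s3_files')
    for key in keys:
        print(key)

print_files = PythonOperator(
    task_id='print_s3_keys',
    python_callable=print_keys,
)

s3_file >> print_files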

Example DAGs

Improve this module by creating an example DAG.

  1. Add an `example_dags` directory to the top-level source of the provider package with an empty `__init__.py` file.
  2. Add your DAG to this directory. Be sure to include a well-written and descriptive docstring (a minimal sketch follows this list).
  3. Create a pull request against the source code. Once the package gets released, your DAG will show up on the Registry.
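A minimal sketch of what such an example DAG might look like (the DAG id, schedule, bucket, and prefix are placeholders):

"""Example DAG demonstrating the use of the S3ListOperator."""
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.s3_list import S3ListOperator

with DAG(
    dag_id='example_s3_list',
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    # List objects under a prefix; adjust bucket and prefix to your setup.
    list_keys = S3ListOperator(
        task_id='list_keys',
        bucket='my-bucket',
        prefix='data/',
    )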