MongoToS3Operator

Amazon

Operator that moves data from MongoDB to Amazon S3, querying Mongo with pymongo and writing to S3 with boto.


Last Updated: May 10, 2021

Access Instructions

Install the Amazon provider package into your Airflow environment.

Import the module into your DAG file and instantiate it with your desired params.
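A minimal sketch of the setup, assuming a recent Amazon provider release; the separate Mongo provider install and the exact import path are assumptions based on the operator's use of a Mongo connection, so check your provider version:

```python
# Install the providers (shell commands shown as comments; whether you also
# need the mongo provider depends on your environment):
#   pip install apache-airflow-providers-amazon
#   pip install apache-airflow-providers-mongo   # supplies the Mongo connection/hook

# Import the operator in your DAG file:
from airflow.providers.amazon.aws.transfers.mongo_to_s3 import MongoToS3Operator
```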

Parameters

mongo_conn_id (str): reference to a specific Mongo connection
aws_conn_id (str): reference to a specific S3 connection
mongo_collection (str): reference to a specific collection in your Mongo database
mongo_query (list): the query to execute, given as a list containing the query dict
s3_bucket (str): reference to a specific S3 bucket to store the data in
s3_key (str): the S3 key under which the file will be stored
mongo_db (str): reference to a specific Mongo database
replace (bool): whether to replace the file in S3 if it already exists
allow_disk_use (bool): whether to let MongoDB write temporary files to disk when the result set is too large to fit in memory
compression (str): type of compression to use for the output file in S3; currently only gzip is supported
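Putting the parameters together, a hedged sketch of a task instantiation; connection IDs, database, collection, bucket, and key names are placeholders, and treating a list query as an aggregation pipeline is an assumption about recent provider behavior:

```python
from airflow.providers.amazon.aws.transfers.mongo_to_s3 import MongoToS3Operator

export_mongo_to_s3 = MongoToS3Operator(
    task_id="export_mongo_to_s3",
    mongo_conn_id="mongo_default",                    # Mongo connection configured in Airflow
    aws_conn_id="aws_default",                        # AWS/S3 connection configured in Airflow
    mongo_db="my_db",                                 # placeholder database name
    mongo_collection="my_collection",                 # placeholder collection name
    mongo_query=[{"$match": {"status": "active"}}],   # a list is typically run as an aggregation pipeline
    s3_bucket="my-bucket",                            # placeholder bucket
    s3_key="exports/my_collection.json.gz",           # placeholder key for the output file
    replace=True,                                     # overwrite the key if it already exists
    allow_disk_use=True,                              # let MongoDB spill large result sets to disk
    compression="gzip",                               # only gzip is currently supported
)
```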

Documentation

The operator queries the given MongoDB collection with pymongo and writes the results to the specified S3 key using boto.
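Conceptually, the flow looks something like the following; this is a rough, simplified sketch using pymongo and boto3 directly (not the operator's actual code), with placeholder connection details and names:

```python
import gzip
import json

import boto3
from bson import json_util
from pymongo import MongoClient

# Query MongoDB (placeholder connection string, database, and collection).
client = MongoClient("mongodb://localhost:27017")
docs = client["my_db"]["my_collection"].find({"status": "active"})

# Serialize the documents to newline-delimited JSON; json_util handles BSON types.
payload = "\n".join(json.dumps(doc, default=json_util.default) for doc in docs)

# Optionally gzip the payload, then upload it to S3 (placeholder bucket and key).
body = gzip.compress(payload.encode("utf-8"))
boto3.client("s3").put_object(Bucket="my-bucket", Key="exports/my_collection.json.gz", Body=body)
```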

Example DAGs

Improve this module by creating an example DAG.

  1. Add an `example_dags` directory to the top-level source of the provider package with an empty `__init__.py` file.
  2. Add your DAG to this directory. Be sure to include a well-written and descriptive docstring (a minimal sketch follows this list).
  3. Create a pull request against the source code. Once the package gets released, your DAG will show up on the Registry.
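As a starting point, a minimal sketch of what such an example DAG file might look like; the DAG id, schedule, and all connection and resource names are illustrative:

```python
"""Example DAG demonstrating how to export a MongoDB collection to S3 with MongoToS3Operator."""
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.transfers.mongo_to_s3 import MongoToS3Operator

with DAG(
    dag_id="example_mongo_to_s3",
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,     # trigger manually
    catchup=False,
    tags=["example"],
) as dag:
    MongoToS3Operator(
        task_id="mongo_to_s3",
        mongo_conn_id="mongo_default",
        aws_conn_id="aws_default",
        mongo_db="my_db",
        mongo_collection="my_collection",
        mongo_query=[{"$match": {"status": "active"}}],
        s3_bucket="my-bucket",
        s3_key="exports/my_collection.json",
    )
```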
