LocalToAzureDataLakeStorageOperator

Microsoft Azure

Upload file(s) to Azure Data Lake

Last Updated: Nov. 30, 2020

Access Instructions

Install the Microsoft Azure provider package into your Airflow environment.

Import the module into your DAG file and instantiate it with your desired parameters, as sketched below.
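
A minimal sketch of those two steps follows, assuming Airflow 2.x with the apache-airflow-providers-microsoft-azure package and the import path airflow.providers.microsoft.azure.transfers.local_to_adls; verify both against your installed provider version. The DAG settings, file paths, and connection ID are placeholders to adapt to your environment.

# Provider install (assumed package name; on Airflow 1.10.x the backport
# package apache-airflow-backport-providers-microsoft-azure is used instead):
#   pip install apache-airflow-providers-microsoft-azure

from airflow import DAG
from airflow.providers.microsoft.azure.transfers.local_to_adls import (
    LocalToAzureDataLakeStorageOperator,
)
from airflow.utils.dates import days_ago

with DAG(
    dag_id="example_local_to_adls",
    start_date=days_ago(1),
    schedule_interval=None,
) as dag:
    upload_file = LocalToAzureDataLakeStorageOperator(
        task_id="upload_file",
        local_path="/tmp/example.csv",             # placeholder local file
        remote_path="folder/example.csv",          # placeholder remote destination
        azure_data_lake_conn_id="azure_data_lake_default",
    )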

Parameters

local_path (str): Local path to upload. Can be a single file, a directory (in which case the upload is recursive), or a glob pattern. Recursive glob patterns using ** are not supported.
remote_path (str): Remote path to upload to; if multiple files are uploaded, this is the directory root to write within.
nthreads (int): Number of threads to use. If None, uses the number of cores.
overwrite (bool): Whether to forcibly overwrite existing files or directories. If False and the remote path is a directory, the upload quits regardless of whether any files would be overwritten. If True, only matching filenames are actually overwritten.
buffersize (int): Number of bytes for the internal buffer (default 2**22). This block cannot be bigger than a chunk and cannot be smaller than a block.
blocksize (int): Number of bytes for a block (default 2**22). Within each chunk, a smaller block is written for each API call. This block cannot be bigger than a chunk.
extra_upload_options (dict): Extra upload options to pass to the hook's upload method.
azure_data_lake_conn_id (str): Reference to the Azure Data Lake connection.
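
To illustrate how these parameters fit together, the sketch below (reusing the imports and DAG context from the Access Instructions example) uploads every CSV matched by a glob pattern; the paths, thread count, and byte sizes are illustrative values, not recommendations.

    upload_reports = LocalToAzureDataLakeStorageOperator(
        task_id="upload_reports",
        local_path="/data/reports/*.csv",    # glob pattern; recursive ** globs are not supported
        remote_path="reports",               # directory root the matched files are written under
        overwrite=True,                      # only matching filenames are overwritten
        nthreads=8,                          # None would fall back to the number of cores
        buffersize=2 ** 22,                  # internal buffer size in bytes
        blocksize=2 ** 22,                   # bytes written per API call within a chunk
        extra_upload_options={},             # passed through to the hook's upload method
        azure_data_lake_conn_id="azure_data_lake_default",
    )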

Documentation

Upload file(s) to Azure Data Lake

See also

For more information on how to use this operator, take a look at the guide: LocalToAzureDataLakeStorageOperator
