SqoopHook

Sqoop

This hook is a wrapper around the Sqoop 1 binary. To use the hook, “sqoop” must be on the PATH.

Last Updated: Apr. 27, 2021

Access Instructions

Install the Sqoop provider package into your Airflow environment.

Import the module into your DAG file and instantiate it with your desired params.
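Because the hook shells out to the “sqoop” binary, it can help to confirm that the binary is discoverable before wiring the hook into a DAG. The following is a minimal sketch using only the Python standard library; the helper name `sqoop_on_path` is hypothetical and is not part of the provider package.

```python
import shutil


def sqoop_on_path(binary: str = "sqoop") -> bool:
    """Return True if an executable named `binary` is found on the PATH.

    Hypothetical helper for illustration; not part of the Sqoop provider.
    """
    return shutil.which(binary) is not None


if __name__ == "__main__":
    if sqoop_on_path():
        print("sqoop found at", shutil.which("sqoop"))
    else:
        print("sqoop is not on the PATH; the hook would be unable to run it")
```

Run this in the same environment (and as the same user) that executes your Airflow tasks, since the PATH may differ between a login shell and a worker process.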

Parameters

conn_id (str): Reference to the sqoop connection.
verbose (bool): Set sqoop to verbose.
num_mappers (int): Number of map tasks to import in parallel.
properties (dict): Properties to set via the -D argument.
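To make the parameters more concrete, here is a rough sketch of how such settings could translate into a Sqoop 1 command line. It is not the hook's actual implementation: `build_sqoop_args` is a hypothetical helper, and the ordering shown simply follows the Sqoop convention that generic `-D` arguments come right after the tool name. The `--verbose` and `--num-mappers` flags are standard Sqoop 1 options.

```python
from typing import Dict, List, Optional


def build_sqoop_args(
    tool: str,
    verbose: bool = False,
    num_mappers: Optional[int] = None,
    properties: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Build an argument list for the sqoop binary (illustrative only)."""
    args = ["sqoop", tool]

    # Generic Hadoop properties are passed as "-D key=value" pairs.
    for key, value in (properties or {}).items():
        args += ["-D", f"{key}={value}"]

    if verbose:
        args.append("--verbose")

    if num_mappers is not None:
        args += ["--num-mappers", str(num_mappers)]

    return args


if __name__ == "__main__":
    # The queue name below is a made-up example value.
    print(build_sqoop_args(
        "import",
        verbose=True,
        num_mappers=4,
        properties={"mapreduce.job.queuename": "etl"},
    ))
```

Returning a list rather than a single string avoids shell-quoting problems if the arguments are later handed to `subprocess`.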

Documentation

Additional arguments that can be passed via the ‘extra’ JSON field of the sqoop connection:

  • job_tracker: Job tracker, given as either `local` or `jobtracker:port`.

  • namenode: Namenode.

  • lib_jars: Comma-separated jar files to include in the classpath.

  • files: Comma-separated files to be copied to the map reduce cluster.

  • archives: Comma-separated archives to be unarchived on the compute machines.

  • password_file: Path to file containing the password.
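The ‘extra’ field is a JSON document, so one way to prepare it is to build a dictionary with the keys listed above and serialize it. The sketch below uses only the standard library; every value (host names, ports, file paths) is a made-up placeholder, and only the key names come from the list above.

```python
import json

# Keys match the options listed above; all values are hypothetical placeholders.
extra = {
    "job_tracker": "local",
    "namenode": "hdfs://namenode.example.com:8020",
    "lib_jars": "/opt/jars/driver.jar,/opt/jars/helpers.jar",
    "files": "/opt/conf/site.xml,/opt/conf/lookup.csv",
    "archives": "/opt/archives/deps.zip",
    "password_file": "/user/airflow/sqoop.password",
}

# Serialize to the JSON text that goes into the connection's 'extra' field.
extra_json = json.dumps(extra, indent=2)

if __name__ == "__main__":
    print(extra_json)
```

Include only the keys you need; the comma-separated values are plain strings, not JSON arrays.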

Example DAGs

Improve this module by creating an example DAG.

  1. Add an `example_dags` directory to the top-level source of the provider package with an empty `__init__.py` file.
  2. Add your DAG to this directory. Be sure to include a well-written and descriptive docstring.
  3. Create a pull request against the source code. Once the package gets released, your DAG will show up on the Registry.
