HiveToDynamoDBOperator

Amazon

Moves data from Hive to DynamoDB. Note that for now the data is loaded into memory before being pushed to DynamoDB, so this operator should only be used for relatively small amounts of data.


Last Updated: May 7, 2021

Access Instructions

Install the Amazon provider package into your Airflow environment.

Import the module into your DAG file and instantiate it with your desired parameters, as in the sketch below.
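
A minimal DAG sketch, assuming the operator is importable from the Amazon provider's hive_to_dynamodb transfers module (the provider itself installs with pip install apache-airflow-providers-amazon); the query, table name, keys, and connection IDs below are placeholders, not values from this page.

from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.transfers.hive_to_dynamodb import HiveToDynamoDBOperator

with DAG(
    dag_id="hive_to_dynamodb_example",
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    # Run a Hive query and write the resulting rows to a DynamoDB table.
    load_orders = HiveToDynamoDBOperator(
        task_id="hive_to_dynamodb",
        sql="SELECT order_id, customer_id, total FROM orders WHERE ds = '{{ ds }}'",
        table_name="orders",                       # placeholder DynamoDB table
        table_keys=["order_id"],                   # partition key (add a sort key if the table has one)
        region_name="us-east-1",
        hiveserver2_conn_id="hiveserver2_default",
        aws_conn_id="aws_default",
    )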

Parameters

sql (str, required): SQL query to execute against the Hive database. (templated)
table_name (str, required): target DynamoDB table
table_keys (list, required): partition key and sort key
pre_process (function): implement pre-processing of source data
pre_process_args (list): list of pre_process function positional arguments
pre_process_kwargs (dict): dict of pre_process function keyword arguments
region_name (str): AWS region name (example: us-east-1)
schema (str): Hive database schema
hiveserver2_conn_id (str): source Hive connection
aws_conn_id (str): AWS connection
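
If rows need to be reshaped before they are written, a callable can be supplied via pre_process together with pre_process_args and pre_process_kwargs. The sketch below assumes the operator calls that function with the query result as a pandas DataFrame (data) plus args and kwargs, and writes whatever it returns to DynamoDB; the function name and column names are made up for illustration.

import json

def scrub_rows(data, args=None, kwargs=None):
    # Drop rows with a null total and return the list-of-dicts shape
    # expected for DynamoDB batch writes.
    cleaned = data.dropna(subset=["total"])
    return json.loads(cleaned.to_json(orient="records"))

# Passed to the operator as, e.g.:
#   pre_process=scrub_rows,
#   pre_process_args=[],
#   pre_process_kwargs={},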

