gcp_project (str): The GCP project that houses the GCS buckets where the Expectations files are stored and where the validation files and data docs are written (e.g. HTML docs showing whether the data matches the Expectations).
expectations_suite_name (str): The name of the Expectation Suite containing the Expectations for the data. The suite must be stored in a JSON file with the same name as the suite (e.g. if the Expectation Suite named in the Expectations file is 'my_suite', the file should be called my_suite.json).
gcs_bucket (str): Google Cloud Storage bucket where the Expectations files are stored and where validation outputs and data docs will be saved (e.g. 'mybucket' in gs://mybucket/myprefix/myexpectationsfile.json).
gcs_expectations_prefix (str): Google Cloud Storage prefix where the Expectations file can be found (e.g. 'ge/expectations').
gcs_validations_prefix (str): Google Cloud Storage prefix where the validation output files should be saved (e.g. 'ge/validations').
gcs_datadocs_prefix (str): Google Cloud Storage prefix where the validation data docs files should be saved (e.g. 'ge/datadocs').
query (str): A SQL query that defines the set of data to be validated (i.e. compared against the Expectations). If query is set, table must not be.
table (str): The name of the BigQuery table (dataset_name.table_name) that defines the set of data to be validated. If table is set, query must not be.
bigquery_conn_id (str): Name of the BigQuery connection (as configured in Airflow) that holds the connection and credentials information needed to connect to BigQuery.
bq_dataset_name (str): The name of the BigQuery dataset in which any temporary tables needed by the GE validation process will be created.
send_alert_email (boolean): Whether to send an alert email if one or more Expectations are not met. Defaults to True. This requires an SMTP server to be configured in the Airflow config.
datadocs_link_in_email (boolean): Whether to include in the alert email a link to the data doc in GCS that shows the validation results. Defaults to False because extra setup is needed to serve HTML data docs stored in GCS. When set to False, only a GCS path to the results is included in the email. To include a clickable link, set up a GAE app to serve the data docs. For setup instructions, see: https://docs.greatexpectations.io/en/latest/guides/how_to_guides/configuring_data_docs/how_to_host_and_share_data_docs_on_gcs.html
datadocs_domain (str): The domain from which the data docs are served (e.g. ge-data-docs-dot-my-gcp-project.ue.r.appspot.com). This only needs to be set if datadocs_link_in_email is True.
email_to (str): Email address to receive alerts when Expectations are not met.
fail_task_on_validation_failure (boolean): Whether to fail the Airflow task if Expectations are not met. Defaults to True.
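
Several of the parameters above constrain one another, so it can help to check a configuration before handing it to the operator. The following is a minimal sketch, not part of the library: the helper names `check_validation_params` and `expectations_file_uri` are hypothetical, and treating "neither query nor table set" as an error is an assumption, since the descriptions only state that the two cannot both be set. The path layout follows the gcs_bucket, gcs_expectations_prefix and expectations_suite_name descriptions.

```python
from typing import Optional


def check_validation_params(
    query: Optional[str] = None,
    table: Optional[str] = None,
    datadocs_link_in_email: bool = False,
    datadocs_domain: Optional[str] = None,
) -> None:
    """Raise ValueError if the combination breaks the documented rules.

    Hypothetical helper for illustration; the operator may perform its
    own checks differently.
    """
    # `query` and `table` are mutually exclusive. Requiring exactly one
    # of them is an assumption made for this sketch.
    if bool(query) == bool(table):
        raise ValueError("Set exactly one of 'query' and 'table'.")
    # A clickable data docs link needs the domain that serves the docs.
    if datadocs_link_in_email and not datadocs_domain:
        raise ValueError(
            "'datadocs_domain' is required when "
            "'datadocs_link_in_email' is True."
        )


def expectations_file_uri(
    gcs_bucket: str,
    gcs_expectations_prefix: str,
    expectations_suite_name: str,
) -> str:
    """Build the gs:// URI where the suite's JSON file is expected."""
    # Drop stray slashes so the prefix joins cleanly with the bucket.
    prefix = gcs_expectations_prefix.strip("/")
    # The file carries the same name as the suite, plus '.json'.
    return f"gs://{gcs_bucket}/{prefix}/{expectations_suite_name}.json"


check_validation_params(table="my_dataset.my_table")
print(expectations_file_uri("mybucket", "ge/expectations", "my_suite"))
# gs://mybucket/ge/expectations/my_suite.json
```

Running checks like these before the task is defined surfaces configuration mistakes when the DAG is parsed rather than when the validation task runs.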