BigQueryExecuteQueryOperator

Provider: Google

Executes BigQuery SQL queries in a specific BigQuery database. This operator does not assert idempotency.


Last Updated: May 7, 2021

Access Instructions

Install the Google provider package into your Airflow environment.

Import the module into your DAG file and instantiate it with your desired params.
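
A minimal sketch of such a DAG file, assuming Airflow 2.x with the apache-airflow-providers-google package installed; the DAG id, schedule, and connection id below are illustrative placeholders rather than values taken from this page:

# Install the provider first, e.g.: pip install apache-airflow-providers-google
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import BigQueryExecuteQueryOperator

with DAG(
    dag_id="example_bigquery_execute_query",  # placeholder DAG id
    start_date=datetime(2021, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    # Run a simple standard SQL query using the default Google Cloud connection.
    simple_query = BigQueryExecuteQueryOperator(
        task_id="simple_query",
        sql="SELECT 1",
        use_legacy_sql=False,
        gcp_conn_id="google_cloud_default",
    )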

Parameters

sql: The SQL code to be executed (templated). Can receive a str representing a SQL statement, a list of str (SQL statements), or a reference to a template file. Template references are recognized by a str ending in '.sql'.
destination_dataset_table (str): A dotted (<project>.|<project>:)<dataset>.<table> that, if set, will store the results of the query. (templated)
write_disposition (str): Specifies the action that occurs if the destination table already exists. (default: 'WRITE_EMPTY')
create_disposition (str): Specifies whether the job is allowed to create new tables. (default: 'CREATE_IF_NEEDED')
allow_large_results (bool): Whether to allow large results.
flatten_results (bool): If true and the query uses the legacy SQL dialect, flattens all nested and repeated fields in the query results. allow_large_results must be true if this is set to false. For standard SQL queries, this flag is ignored and results are never flattened.
gcp_conn_id (str): (Optional) The connection ID used to connect to Google Cloud.
bigquery_conn_id (str): (Deprecated) The connection ID used to connect to Google Cloud. This parameter has been deprecated; pass the gcp_conn_id parameter instead.
delegate_to (str): The account to impersonate using domain-wide delegation of authority, if any. For this to work, the service account making the request must have domain-wide delegation enabled.
udf_config (list): The User Defined Function configuration for the query. See https://cloud.google.com/bigquery/user-defined-functions for details.
use_legacy_sql (bool): Whether to use legacy SQL (true) or standard SQL (false).
maximum_billing_tier (int): Positive integer that serves as a multiplier of the basic price. Defaults to None, in which case the value set in the project is used.
maximum_bytes_billed (float): Limits the bytes billed for this job. Queries with bytes billed beyond this limit will fail (without incurring a charge). If unspecified, this is set to your project default.
api_resource_configs (dict): A dictionary of 'configuration' params to be applied to the Google BigQuery Jobs API (https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs), for example, {'query': {'useQueryCache': False}}. Use it if you need to provide params that are not supported by BigQueryOperator arguments.
schema_update_options (Optional[Union[list, tuple, set]]): Allows the schema of the destination table to be updated as a side effect of the load job.
query_params (list): A list of dictionaries containing query parameter types and values, passed to BigQuery. The structure of each dictionary should match 'queryParameters' in the Google BigQuery Jobs API: https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs. For example, [{ 'name': 'corpus', 'parameterType': { 'type': 'STRING' }, 'parameterValue': { 'value': 'romeoandjuliet' } }]. (templated) See the sketch after this parameter list for a worked example.
labels (dict): A dictionary containing labels for the job/query, passed to BigQuery.
priority (str): Specifies a priority for the query. Possible values include INTERACTIVE and BATCH. The default value is INTERACTIVE.
time_partitioning (dict): Configure optional time partitioning fields, i.e. partition by field, type, and expiration, as per the API specifications.
cluster_fields (list[str]): Request that the result of this query be stored sorted by one or more columns. BigQuery supports clustering for both partitioned and non-partitioned tables. The order of columns given determines the sort order.
location (str): The geographic location of the job. Required except for US and EU. See details at https://cloud.google.com/bigquery/docs/locations#specifying_your_location
encryption_configuration (dict): (Optional) Custom encryption configuration (e.g., Cloud KMS keys). Example: encryption_configuration = { "kmsKeyName": "projects/testp/locations/us/keyRings/test-kr/cryptoKeys/test-key" }
impersonation_chain (Union[str, Sequence[str]]): Optional service account to impersonate using short-term credentials, or chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, the identities from the list must grant Service Account Token Creator IAM role to the directly preceding identity, with first account from the list granting this role to the originating account (templated).
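
To illustrate how several of the structured parameters above fit together, here is a rough sketch of a single task, intended to sit inside a DAG like the one shown under Access Instructions. The project, dataset, table, label, and KMS key names are hypothetical placeholders, and the public Shakespeare sample table is used only as example source data:

from airflow.providers.google.cloud.operators.bigquery import BigQueryExecuteQueryOperator

# Goes inside a `with DAG(...)` block. All resource names below are placeholders.
materialize_corpus_stats = BigQueryExecuteQueryOperator(
    task_id="materialize_corpus_stats",
    sql="""
        SELECT corpus, COUNT(*) AS n_rows
        FROM `bigquery-public-data.samples.shakespeare`
        WHERE corpus = @corpus
        GROUP BY corpus
    """,
    use_legacy_sql=False,  # query parameters require standard SQL
    query_params=[
        {
            "name": "corpus",
            "parameterType": {"type": "STRING"},
            "parameterValue": {"value": "romeoandjuliet"},
        }
    ],
    destination_dataset_table="my-project.my_dataset.corpus_stats",  # placeholder
    write_disposition="WRITE_TRUNCATE",
    create_disposition="CREATE_IF_NEEDED",
    time_partitioning={"type": "DAY"},  # ingestion-time daily partitioning
    cluster_fields=["corpus"],
    labels={"team": "analytics"},  # placeholder label
    encryption_configuration={
        # placeholder Cloud KMS key
        "kmsKeyName": "projects/my-project/locations/us/keyRings/my-kr/cryptoKeys/my-key"
    },
    location="US",
    gcp_conn_id="google_cloud_default",
)

Using WRITE_TRUNCATE in a sketch like this means reruns overwrite the destination table rather than appending to it, which is one common way to keep such a task effectively idempotent even though the operator itself does not assert idempotency.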

