MLEngineStartTrainingJobOperator

Google

Operator for launching an MLEngine training job.


Last Updated: May 7, 2021

Access Instructions

Install the Google provider package (apache-airflow-providers-google) into your Airflow environment.

Import the operator into your DAG file and instantiate it with the parameters your job needs.
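A minimal sketch of these two steps, assuming Airflow 2.x with the Google provider installed. The project, bucket, module name, and version strings are hypothetical placeholders, not values taken from this page; replace them with your own.

```python
from datetime import datetime

# Hypothetical placeholder values; every name, path, and version below is an assumption.
TRAINING_JOB_KWARGS = {
    "task_id": "start_training",
    "project_id": "my-gcp-project",
    "job_id": "training_job_{{ ds_nodash }}",  # templated, so it is unique per run date
    "region": "us-central1",
    "package_uris": ["gs://my-bucket/packages/trainer-0.1.tar.gz"],
    "training_python_module": "trainer.task",
    "training_args": ["--epochs", "10"],
    "scale_tier": "BASIC",
    "runtime_version": "2.4",
    "python_version": "3.7",
    "job_dir": "gs://my-bucket/jobs/{{ ds_nodash }}",
}


def build_dag():
    """Create a DAG containing a single MLEngine training task."""
    # Imported here so the module can be inspected without Airflow installed.
    from airflow import DAG
    from airflow.providers.google.cloud.operators.mlengine import (
        MLEngineStartTrainingJobOperator,
    )

    with DAG(
        dag_id="example_mlengine_training",
        start_date=datetime(2021, 1, 1),
        schedule_interval=None,  # trigger manually
        catchup=False,
    ) as dag:
        MLEngineStartTrainingJobOperator(**TRAINING_JOB_KWARGS)
    return dag
```

In a real DAG file, assign the result at module level (for example `dag = build_dag()`) so that the scheduler can discover it.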

Parameters

job_id (str): A unique ID for the submitted Google MLEngine training job. (templated)
region (str): The Google Compute Engine region to run the MLEngine training job in. (templated)
package_uris (List[str]): A list of Python package locations for the training job, which should include the main training program and any additional dependencies. This is mutually exclusive with a custom image specified via master_config. (templated)
training_python_module (str): The name of the Python module to run within the training job after installing the packages. This is mutually exclusive with a custom image specified via master_config. (templated)
training_args (List[str]): A list of command-line arguments to pass to the training program. (templated)
scale_tier (str): Resource tier for the MLEngine training job. (templated)
master_type (str): The type of virtual machine to use for the master worker. It must be set whenever scale_tier is CUSTOM. (templated)
master_config (dict): The configuration for the master worker. If this is provided, master_type must be set as well. If a custom image is specified, this is mutually exclusive with package_uris and training_python_module. (templated)
runtime_version (str): The Google Cloud ML runtime version to use for training. (templated)
python_version (str): The version of Python used in training. (templated)
job_dir (str): A Google Cloud Storage path in which to store training outputs and other data needed for training. (templated)
service_account (str): Optional service account to use when running the training application. The specified service account must have the iam.serviceAccounts.actAs role. The Google-managed Cloud ML Engine service account must have the iam.serviceAccountAdmin role for the specified service account. If set to None or missing, the Google-managed Cloud ML Engine service account will be used. (templated)
project_id (str): The Google Cloud project name within which the MLEngine training job should run. If set to None or missing, the default project_id from the Google Cloud connection is used. (templated)
gcp_conn_id (str): The connection ID to use when fetching connection info.
delegate_to (str): The account to impersonate using domain-wide delegation of authority, if any. For this to work, the service account making the request must have domain-wide delegation enabled.
mode (str): One of 'DRY_RUN' or 'CLOUD'. In 'DRY_RUN' mode, no real training job is launched, but the MLEngine training job request is printed out. In 'CLOUD' mode, a real MLEngine training job creation request is issued.
labels (Dict[str, str]): A dictionary of labels to attach to the training job.
impersonation_chain (Union[str, Sequence[str]]): Optional service account to impersonate using short-term credentials, or a chained list of accounts required to get the access_token of the last account in the list, which will be impersonated in the request. If set as a string, the account must grant the originating account the Service Account Token Creator IAM role. If set as a sequence, each identity in the list must grant the Service Account Token Creator IAM role to the directly preceding identity, with the first account in the list granting this role to the originating account. (templated)
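Several of the parameters above constrain one another. The helper below is a sketch of how those rules could be checked before building a task; it is not the operator's own validation. The function name `check_training_params` is hypothetical, and it assumes that a custom image is supplied under an `imageUri` key in master_config, which this page does not state.

```python
from typing import Dict, List, Optional


def check_training_params(
    scale_tier: Optional[str] = None,
    master_type: Optional[str] = None,
    master_config: Optional[Dict] = None,
    package_uris: Optional[List[str]] = None,
    training_python_module: Optional[str] = None,
) -> List[str]:
    """Return a list of violated constraints (empty when the combination is valid)."""
    problems = []

    # master_type is required whenever the CUSTOM scale tier is selected.
    is_custom = scale_tier is not None and scale_tier.upper() == "CUSTOM"
    if is_custom and not master_type:
        problems.append("master_type must be set when scale_tier is CUSTOM")

    # master_config may only be given together with master_type.
    if master_config and not master_type:
        problems.append("master_type must be set when master_config is provided")

    # Assumption: a custom image is signalled by an "imageUri" key in master_config.
    has_custom_image = bool(master_config) and master_config.get("imageUri") is not None
    if has_custom_image and (package_uris or training_python_module):
        problems.append(
            "a custom image is mutually exclusive with package_uris "
            "and training_python_module"
        )
    return problems
```

Running a check like this alongside mode='DRY_RUN' lets you review a request before any real training job is created.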

Documentation

Operator for launching an MLEngine training job.

See also

For more information on how to use this operator, take a look at the guide: Launching a Job
