File: //snap/google-cloud-cli/394/help/man/man1/gcloud_dataproc_batches_submit_spark-sql.1
.TH "GCLOUD_DATAPROC_BATCHES_SUBMIT_SPARK\-SQL" 1
.SH "NAME"
.HP
gcloud dataproc batches submit spark\-sql \- submit a Spark SQL batch job
.SH "SYNOPSIS"
.HP
\f5gcloud dataproc batches submit spark\-sql\fR \fISQL_SCRIPT\fR [\fB\-\-async\fR] [\fB\-\-batch\fR=\fIBATCH\fR] [\fB\-\-container\-image\fR=\fICONTAINER_IMAGE\fR] [\fB\-\-deps\-bucket\fR=\fIDEPS_BUCKET\fR] [\fB\-\-history\-server\-cluster\fR=\fIHISTORY_SERVER_CLUSTER\fR] [\fB\-\-jars\fR=[\fIJAR\fR,...]] [\fB\-\-kms\-key\fR=\fIKMS_KEY\fR] [\fB\-\-labels\fR=[\fIKEY\fR=\fIVALUE\fR,...]] [\fB\-\-metastore\-service\fR=\fIMETASTORE_SERVICE\fR] [\fB\-\-properties\fR=[\fIPROPERTY\fR=\fIVALUE\fR,...]] [\fB\-\-region\fR=\fIREGION\fR] [\fB\-\-request\-id\fR=\fIREQUEST_ID\fR] [\fB\-\-service\-account\fR=\fISERVICE_ACCOUNT\fR] [\fB\-\-staging\-bucket\fR=\fISTAGING_BUCKET\fR] [\fB\-\-tags\fR=[\fITAGS\fR,...]] [\fB\-\-ttl\fR=\fITTL\fR] [\fB\-\-user\-workload\-authentication\-type\fR=\fIUSER_WORKLOAD_AUTHENTICATION_TYPE\fR] [\fB\-\-vars\fR=[\fINAME\fR=\fIVALUE\fR,...]] [\fB\-\-version\fR=\fIVERSION\fR] [\fB\-\-network\fR=\fINETWORK\fR\ |\ \fB\-\-subnet\fR=\fISUBNET\fR] [\fIGCLOUD_WIDE_FLAG\ ...\fR]
.SH "DESCRIPTION"
Submit a Spark SQL batch job.
.SH "EXAMPLES"
To submit a Spark SQL job that runs the local script "my\-sql\-script.sql",
uploading it to "gs://my\-bucket", run:
.RS 2m
$ gcloud dataproc batches submit spark\-sql my\-sql\-script.sql \e
\-\-deps\-bucket=gs://my\-bucket \-\-region=us\-central1 \e
\-\-vars="NAME=VALUE,NAME2=VALUE2"
.RE
.SH "POSITIONAL ARGUMENTS"
.RS 2m
.TP 2m
\fISQL_SCRIPT\fR
URI of the script that contains Spark SQL queries to execute.
.RE
.sp
.SH "FLAGS"
.RS 2m
.TP 2m
\fB\-\-async\fR
Return immediately without waiting for the operation in progress to complete.
.TP 2m
\fB\-\-batch\fR=\fIBATCH\fR
The ID of the batch job to submit. The ID must contain only lowercase letters
(a\-z), numbers (0\-9), and hyphens (\-). The length of the name must be between
4 and 63 characters. If this argument is not provided, a randomly generated UUID
will be used.
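For example, with a hypothetical batch ID:
.RS 2m
$ gcloud dataproc batches submit spark\-sql my\-sql\-script.sql \e
    \-\-batch=my\-sql\-batch\-001 \-\-region=us\-central1 \e
    \-\-deps\-bucket=gs://my\-bucket
.RE
.sp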
.TP 2m
\fB\-\-container\-image\fR=\fICONTAINER_IMAGE\fR
Optional custom container image to use for the batch/session runtime
environment. If not specified, a default container image will be used. The value
should follow the container image naming format:
{registry}/{repository}/{name}:{tag}, for example,
gcr.io/my\-project/my\-image:1.2.3
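For example, using the image name format above with hypothetical values:
.RS 2m
$ gcloud dataproc batches submit spark\-sql my\-sql\-script.sql \e
    \-\-container\-image=gcr.io/my\-project/my\-image:1.2.3 \e
    \-\-region=us\-central1 \-\-deps\-bucket=gs://my\-bucket
.RE
.sp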
.TP 2m
\fB\-\-deps\-bucket\fR=\fIDEPS_BUCKET\fR
A Cloud Storage bucket to which workload dependencies are uploaded.
.TP 2m
\fB\-\-history\-server\-cluster\fR=\fIHISTORY_SERVER_CLUSTER\fR
Spark History Server configuration for the batch/session job. Resource name of
an existing Dataproc cluster to act as a Spark History Server for the workload
in the format: "projects/{project_id}/regions/{region}/clusters/{cluster_name}".
.TP 2m
\fB\-\-jars\fR=[\fIJAR\fR,...]
Comma\-separated list of jar files to be provided to the classpaths.
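For example, with hypothetical jar files stored in Cloud Storage:
.RS 2m
$ gcloud dataproc batches submit spark\-sql my\-sql\-script.sql \e
    \-\-jars=gs://my\-bucket/udfs.jar,gs://my\-bucket/extra\-lib.jar \e
    \-\-region=us\-central1 \-\-deps\-bucket=gs://my\-bucket
.RE
.sp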
.TP 2m
\fB\-\-kms\-key\fR=\fIKMS_KEY\fR
Cloud KMS key to use for encryption.
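The key is typically given as a full Cloud KMS resource name; for example, with
hypothetical key ring and key names:
.RS 2m
$ gcloud dataproc batches submit spark\-sql my\-sql\-script.sql \e
    \-\-kms\-key=projects/my\-project/locations/us\-central1/keyRings/my\-keyring/cryptoKeys/my\-key \e
    \-\-region=us\-central1 \-\-deps\-bucket=gs://my\-bucket
.RE
.sp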
.TP 2m
\fB\-\-labels\fR=[\fIKEY\fR=\fIVALUE\fR,...]
List of label KEY=VALUE pairs to add.
Keys must start with a lowercase character and contain only hyphens (\f5\-\fR),
underscores (\f5_\fR), lowercase characters, and numbers. Values must contain
only hyphens (\f5\-\fR), underscores (\f5_\fR), lowercase characters, and
numbers.
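For example, with hypothetical label keys and values that satisfy these rules:
.RS 2m
$ gcloud dataproc batches submit spark\-sql my\-sql\-script.sql \e
    \-\-labels=env=dev,team=data\-eng \e
    \-\-region=us\-central1 \-\-deps\-bucket=gs://my\-bucket
.RE
.sp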
.TP 2m
\fB\-\-metastore\-service\fR=\fIMETASTORE_SERVICE\fR
Name of a Dataproc Metastore service to be used as an external metastore in the
format: "projects/{project\-id}/locations/{region}/services/{service\-name}".
.TP 2m
\fB\-\-properties\fR=[\fIPROPERTY\fR=\fIVALUE\fR,...]
Specifies configuration properties for the workload. See Dataproc Serverless for
Spark documentation
(https://cloud.google.com/dataproc\-serverless/docs/concepts/properties) for the
list of supported properties.
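For example, setting standard Spark properties (illustrative values; see the
documentation above for the values Dataproc Serverless accepts):
.RS 2m
$ gcloud dataproc batches submit spark\-sql my\-sql\-script.sql \e
    \-\-properties=spark.executor.cores=4,spark.executor.memory=4g \e
    \-\-region=us\-central1 \-\-deps\-bucket=gs://my\-bucket
.RE
.sp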
.TP 2m
Region resource \- Dataproc region to use. Each Dataproc region constitutes an
independent resource namespace constrained to deploying instances into Compute
Engine zones inside the region. This represents a Cloud resource. (NOTE) Some
attributes are not given arguments in this group but can be set in other ways.
To set the \f5project\fR attribute:
.RS 2m
.IP "\(em" 2m
provide the argument \f5\-\-region\fR on the command line with a fully specified
name;
.IP "\(em" 2m
set the property \f5dataproc/region\fR with a fully specified name;
.IP "\(em" 2m
provide the argument \f5\-\-project\fR on the command line;
.IP "\(em" 2m
set the property \f5core/project\fR.
.RE
.sp
.RS 2m
.TP 2m
\fB\-\-region\fR=\fIREGION\fR
ID of the region or fully qualified identifier for the region.
To set the \f5region\fR attribute:
.RS 2m
.IP "\(bu" 2m
provide the argument \f5\-\-region\fR on the command line;
.IP "\(bu" 2m
set the property \f5dataproc/region\fR.
.RE
.sp
.RE
.sp
.TP 2m
\fB\-\-request\-id\fR=\fIREQUEST_ID\fR
A unique ID that identifies the request. If the service receives two batch
create requests with the same request_id, the second request is ignored and the
operation that corresponds to the first batch created and stored in the backend
is returned. Recommendation: Always set this value to a UUID. The value must
contain only letters (a\-z, A\-Z), numbers (0\-9), underscores (_), and
hyphens (\-). The maximum length is 40 characters.
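For example, with a hypothetical UUID:
.RS 2m
$ gcloud dataproc batches submit spark\-sql my\-sql\-script.sql \e
    \-\-request\-id=8e6c2f1a\-4b3d\-4e2a\-9c7f\-0a1b2c3d4e5f \e
    \-\-region=us\-central1 \-\-deps\-bucket=gs://my\-bucket
.RE
.sp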
.TP 2m
\fB\-\-service\-account\fR=\fISERVICE_ACCOUNT\fR
The IAM service account to be used for a batch/session job.
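For example, with a hypothetical service account:
.RS 2m
$ gcloud dataproc batches submit spark\-sql my\-sql\-script.sql \e
    \-\-service\-account=spark\-sa@my\-project.iam.gserviceaccount.com \e
    \-\-region=us\-central1 \-\-deps\-bucket=gs://my\-bucket
.RE
.sp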
.TP 2m
\fB\-\-staging\-bucket\fR=\fISTAGING_BUCKET\fR
The Cloud Storage bucket to use to store job dependencies, config files, and job
driver console output. If not specified, the default [staging bucket]
(https://cloud.google.com/dataproc\-serverless/docs/concepts/buckets) is used.
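For example, with a hypothetical bucket:
.RS 2m
$ gcloud dataproc batches submit spark\-sql my\-sql\-script.sql \e
    \-\-staging\-bucket=gs://my\-staging\-bucket \e
    \-\-region=us\-central1 \-\-deps\-bucket=gs://my\-bucket
.RE
.sp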
.TP 2m
\fB\-\-tags\fR=[\fITAGS\fR,...]
Network tags for traffic control.
.TP 2m
\fB\-\-ttl\fR=\fITTL\fR
The duration after which the workload will be unconditionally terminated, for
example, '20m' or '1h'. Run gcloud topic datetimes
(https://cloud.google.com/sdk/gcloud/reference/topic/datetimes) for information
on duration formats.
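For example, to terminate the workload after two hours:
.RS 2m
$ gcloud dataproc batches submit spark\-sql my\-sql\-script.sql \e
    \-\-ttl=2h \-\-region=us\-central1 \-\-deps\-bucket=gs://my\-bucket
.RE
.sp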
.TP 2m
\fB\-\-user\-workload\-authentication\-type\fR=\fIUSER_WORKLOAD_AUTHENTICATION_TYPE\fR
Whether to use END_USER_CREDENTIALS or SERVICE_ACCOUNT to run the workload.
.TP 2m
\fB\-\-vars\fR=[\fINAME\fR=\fIVALUE\fR,...]
Mapping of query variable names to values (equivalent to the Spark SQL command:
SET name="value";).
.TP 2m
\fB\-\-version\fR=\fIVERSION\fR
Optional runtime version. If not specified, a default version will be used.
.TP 2m
At most one of these can be specified:
.RS 2m
.TP 2m
\fB\-\-network\fR=\fINETWORK\fR
Network URI to connect the workload to.
.TP 2m
\fB\-\-subnet\fR=\fISUBNET\fR
Subnetwork URI to connect the workload to. The subnet must have Private Google
Access enabled.
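For example, with a hypothetical subnetwork that has Private Google Access
enabled:
.RS 2m
$ gcloud dataproc batches submit spark\-sql my\-sql\-script.sql \e
    \-\-subnet=projects/my\-project/regions/us\-central1/subnetworks/my\-subnet \e
    \-\-region=us\-central1 \-\-deps\-bucket=gs://my\-bucket
.RE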
.RE
.RE
.sp
.SH "GCLOUD WIDE FLAGS"
These flags are available to all commands: \-\-access\-token\-file, \-\-account,
\-\-billing\-project, \-\-configuration, \-\-flags\-file, \-\-flatten,
\-\-format, \-\-help, \-\-impersonate\-service\-account, \-\-log\-http,
\-\-project, \-\-quiet, \-\-trace\-token, \-\-user\-output\-enabled,
\-\-verbosity.
Run \fB$ gcloud help\fR for details.
.SH "NOTES"
This variant is also available:
.RS 2m
$ gcloud beta dataproc batches submit spark\-sql
.RE