HEX

File: //snap/google-cloud-cli/394/help/man/man1/gcloud_alpha_ai-platform_local_train.1
.TH "GCLOUD_ALPHA_AI\-PLATFORM_LOCAL_TRAIN" 1



.SH "NAME"
.HP
gcloud alpha ai\-platform local train \- run an AI Platform training job locally



.SH "SYNOPSIS"
.HP
\f5gcloud alpha ai\-platform local train\fR  \fB\-\-module\-name\fR=\fIMODULE_NAME\fR [\fB\-\-distributed\fR] [\fB\-\-evaluator\-count\fR=\fIEVALUATOR_COUNT\fR] [\fB\-\-job\-dir\fR=\fIJOB_DIR\fR] [\fB\-\-package\-path\fR=\fIPACKAGE_PATH\fR] [\fB\-\-parameter\-server\-count\fR=\fIPARAMETER_SERVER_COUNT\fR] [\fB\-\-start\-port\fR=\fISTART_PORT\fR;\ default=27182] [\fB\-\-worker\-count\fR=\fIWORKER_COUNT\fR] [\fIGCLOUD_WIDE_FLAG\ ...\fR] [\-\-\ \fIUSER_ARGS\fR\ ...]



.SH "DESCRIPTION"

\fB(ALPHA)\fR This command runs the specified module in an environment similar
to that of a live AI Platform Training Job.

This is especially useful in the case of testing distributed models, as it
allows you to validate that you are properly interacting with the AI Platform
cluster configuration. If your model expects a specific number of parameter
servers or workers (i.e. you expect to use the CUSTOM machine type), use the
\-\-parameter\-server\-count and \-\-worker\-count flags to further specify the
desired cluster configuration, just as you would in your cloud training job
configuration:

.RS 2m
$ gcloud alpha ai\-platform local train \-\-module\-name trainer.task \e
        \-\-package\-path /path/to/my/code/trainer \e
        \-\-distributed \e
        \-\-parameter\-server\-count 4 \e
        \-\-worker\-count 8
.RE

Unlike submitting a training job, the \-\-package\-path parameter can be
omitted, and will use your current working directory.

AI Platform Training sets a TF_CONFIG environment variable on each VM in your
training job. You can use TF_CONFIG to access the cluster description and the
task description for each VM.

Learn more about TF_CONFIG:
https://cloud.google.com/ai\-platform/training/docs/distributed\-training\-details.



.SH "POSITIONAL ARGUMENTS"

.RS 2m
.TP 2m
[\-\- \fIUSER_ARGS\fR ...]

Additional user arguments to be forwarded to user code. Any relative paths will
be relative to the \fBparent\fR directory of \f5\-\-package\-path\fR.


The '\-\-' argument must be specified between gcloud specific args on the left
and USER_ARGS on the right.


.RE
.sp

.SH "REQUIRED FLAGS"

.RS 2m
.TP 2m
\fB\-\-module\-name\fR=\fIMODULE_NAME\fR

Name of the module to run.


.RE
.sp

.SH "OPTIONAL FLAGS"

.RS 2m
.TP 2m
\fB\-\-distributed\fR

Runs the provided code in distributed mode by providing cluster configurations
as environment variables to subprocesses

.TP 2m
\fB\-\-evaluator\-count\fR=\fIEVALUATOR_COUNT\fR

Number of evaluators with which to run. Ignored if \-\-distributed is not
specified. Default: 0

.TP 2m
\fB\-\-job\-dir\fR=\fIJOB_DIR\fR

Cloud Storage path or local_directory in which to store training outputs and
other data needed for training.

This path will be passed to your TensorFlow program as the \f5\-\-job\-dir\fR
command\-line arg. The benefit of specifying this field is that AI Platform will
validate the path for use in training. However, note that your training program
will need to parse the provided \f5\-\-job\-dir\fR argument.

.TP 2m
\fB\-\-package\-path\fR=\fIPACKAGE_PATH\fR

Path to a Python package to build. This should point to a \fBlocal\fR directory
containing the Python source for the job. It will be built using
\fBsetuptools\fR (which must be installed) using its \fBparent\fR directory as
context. If the parent directory contains a \f5setup.py\fR file, the build will
use that; otherwise, it will use a simple built\-in one.

.TP 2m
\fB\-\-parameter\-server\-count\fR=\fIPARAMETER_SERVER_COUNT\fR

Number of parameter servers with which to run. Ignored if \-\-distributed is not
specified. Default: 2

.TP 2m
\fB\-\-start\-port\fR=\fISTART_PORT\fR; default=27182

Start of the range of ports reserved by the local cluster. This command will use
a contiguous block of ports equal to parameter\-server\-count + worker\-count +
1.

If \-\-distributed is not specified, this flag is ignored.

.TP 2m
\fB\-\-worker\-count\fR=\fIWORKER_COUNT\fR

Number of workers with which to run. Ignored if \-\-distributed is not
specified. Default: 2


.RE
.sp

.SH "GCLOUD WIDE FLAGS"

These flags are available to all commands: \-\-access\-token\-file, \-\-account,
\-\-billing\-project, \-\-configuration, \-\-flags\-file, \-\-flatten,
\-\-format, \-\-help, \-\-impersonate\-service\-account, \-\-log\-http,
\-\-project, \-\-quiet, \-\-trace\-token, \-\-user\-output\-enabled,
\-\-verbosity.

Run \fB$ gcloud help\fR for details.



.SH "NOTES"

This command is currently in alpha and might change without notice. If this
command fails with API permission errors despite specifying the correct project,
you might be trying to access an API with an invitation\-only early access
allowlist. These variants are also available:

.RS 2m
$ gcloud ai\-platform local train
.RE

.RS 2m
$ gcloud beta ai\-platform local train
.RE