.TH "GCLOUD_CONTAINER_AI_PROFILES" 1
.SH "NAME"
.HP
gcloud container ai profiles \- quickstart engine for GKE AI workloads
.SH "SYNOPSIS"
.HP
\f5gcloud container ai profiles\fR \fIGROUP\fR | \fICOMMAND\fR [\fIGCLOUD_WIDE_FLAG\ ...\fR]
.SH "DESCRIPTION"
The GKE Inference Quickstart simplifies deploying AI inference on Google
Kubernetes Engine (GKE). It provides tailored profiles based on Google's
internal benchmarks. You provide inputs such as your preferred open\-source
model (for example, Llama, Gemma, or Mistral) and your application's
performance target. From these inputs, the quickstart generates accelerator
choices with performance metrics, along with detailed, ready\-to\-deploy
profiles for compute, load balancing, and autoscaling. These profiles are
provided as standard Kubernetes YAML manifests, which you can deploy directly
or modify.
.sp
To visualize the benchmarking data that support these estimates, see the
accompanying Colab notebook:
https://colab.research.google.com/github/GoogleCloudPlatform/kubernetes\-engine\-samples/blob/main/ai\-ml/notebooks/giq_visualizations.ipynb
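.sp
As an illustrative invocation (the model identifier and flag shown here are
assumptions, not a guaranteed interface; run \fB$ gcloud container ai profiles
list \-\-help\fR for the authoritative flag set), listing compatible
accelerator profiles for a model might look like:
.RS 2m
$ gcloud container ai profiles list \-\-model=meta\-llama/Llama\-3.1\-8B\-Instruct
.RE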
.SH "GCLOUD WIDE FLAGS"
These flags are available to all commands: \-\-help.
Run \fB$ gcloud help\fR for details.
.SH "GROUPS"
\f5\fIGROUP\fR\fR is one of the following:
.RS 2m
.TP 2m
\fBbenchmarks\fR
Manage benchmarks for GKE Inference Quickstart.
.TP 2m
\fBmanifests\fR
Generate optimized Kubernetes manifests.
.TP 2m
\fBmodel\-server\-versions\fR
Manage supported model server versions for GKE Inference Quickstart.
.TP 2m
\fBmodel\-servers\fR
Manage supported model servers for GKE Inference Quickstart.
.TP 2m
\fBmodels\fR
Manage supported models for GKE Inference Quickstart.
.RE
.sp
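For instance, the \fBmanifests\fR group generates the deployable Kubernetes
YAML described above. A sketched invocation (the subcommand name and flag
values below are assumptions; run \fB$ gcloud container ai profiles manifests
\-\-help\fR to confirm the supported interface) might look like:
.RS 2m
$ gcloud container ai profiles manifests create \-\-model=google/gemma\-2\-9b\-it \-\-accelerator\-type=nvidia\-l4
.RE
.sp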
.SH "COMMANDS"
\f5\fICOMMAND\fR\fR is one of the following:
.RS 2m
.TP 2m
\fBlist\fR
List compatible accelerator profiles.
.RE
.sp
.SH "NOTES"
This variant is also available:
.RS 2m
$ gcloud alpha container ai profiles
.RE