.TH "GCLOUD_ALPHA_CONTAINER_AI_PROFILES" 1
.SH "NAME"
.HP
gcloud alpha container ai profiles \- quickstart engine for GKE AI workloads
.SH "SYNOPSIS"
.HP
\f5gcloud alpha container ai profiles\fR \fIGROUP\fR [\fIGCLOUD_WIDE_FLAG\ ...\fR]
.SH "DESCRIPTION"
\fB(ALPHA)\fR The GKE Inference Quickstart simplifies deploying AI inference
workloads on Google Kubernetes Engine (GKE). It provides tailored profiles based
on Google's internal benchmarks. You provide inputs such as your preferred
open\-source model (for example, Llama, Gemma, or Mistral) and your
application's performance target. Based on these inputs, the quickstart
generates accelerator choices with performance metrics, along with detailed,
ready\-to\-deploy profiles for compute, load balancing, and autoscaling. These
profiles are provided as standard Kubernetes YAML manifests, which you can
deploy as is or modify.
.sp
To visualize the benchmarking data that support these estimates, see the
accompanying Colab notebook:
https://colab.research.google.com/github/GoogleCloudPlatform/kubernetes\-engine\-samples/blob/main/ai\-ml/notebooks/giq_visualizations.ipynb
.SH "GCLOUD WIDE FLAGS"
These flags are available to all commands: \-\-help.
.sp
Run \fB$ gcloud help\fR for details.
.SH "GROUPS"
\f5\fIGROUP\fR\fR is one of the following:
.RS 2m
.TP 2m
\fBaccelerators\fR
\fB(ALPHA)\fR Manage supported accelerators for GKE Inference Quickstart.
.TP 2m
\fBmanifests\fR
\fB(ALPHA)\fR Generate optimized Kubernetes manifests.
.TP 2m
\fBmodel\-and\-server\-combinations\fR
\fB(ALPHA)\fR Manage supported model and model server combinations for GKE
Inference Quickstart.
.TP 2m
\fBmodel\-server\-versions\fR
\fB(ALPHA)\fR Manage supported model server versions for GKE Inference
Quickstart.
.TP 2m
\fBmodel\-servers\fR
\fB(ALPHA)\fR Manage supported model servers for GKE Inference Quickstart.
.TP 2m
\fBmodels\fR
\fB(ALPHA)\fR Manage supported models for GKE Inference Quickstart.
.RE
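.sp
As a minimal sketch, the following shell snippet prints the help command for
each group listed above. It relies only on the group names and the \-\-help flag
documented on this page; the variable \f5GROUPS_LIST\fR and the function
\f5print_help_commands\fR are hypothetical names used for illustration. The
commands are printed rather than executed, so the snippet runs without the
gcloud CLI installed.
.RS 2m
.nf
```shell
#!/bin/sh
# Group names copied from the GROUPS list on this page.
# GROUPS_LIST is a hypothetical variable name chosen for this sketch.
GROUPS_LIST="accelerators manifests model-and-server-combinations model-server-versions model-servers models"

# Print (do not run) the help command for each group.
# print_help_commands is a hypothetical helper, not part of gcloud.
print_help_commands() {
  for group in $GROUPS_LIST; do
    echo "gcloud alpha container ai profiles $group --help"
  done
}

print_help_commands
```
.fi
.RE
To read the help for a group, run any one of the printed lines in a shell where
the gcloud CLI is installed.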
.sp
.SH "NOTES"
This command is currently in alpha and might change without notice. If this
command fails with API permission errors despite specifying the correct project,
you might be trying to access an API with an invitation\-only early access
allowlist. This variant is also available:
.RS 2m
$ gcloud container ai profiles
.RE