This repository was archived by the owner on Jun 23, 2025. It is now read-only.

Reverting the TGI image version for LLAMA multiple GPUs in GKE samples #931

Conversation

raushan2016
Member

The current image overrides HF_HOME from /data to /tmp. Even after changing the mount path to /tmp, a regression in the newer TGI image causes out-of-GPU-memory errors on L4 and requires at least an A2 node. This PR rolls back the image version to get the sample working while the investigation happens in the background.
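
For readers unfamiliar with how HF_HOME and the volume mount path interact, here is a minimal sketch of a pod spec fragment, not the manifest changed in this PR. The container name, volume name, and image reference are hypothetical placeholders, and setting HF_HOME explicitly is an assumption made for illustration; the only point is that the Hugging Face cache directory and the mountPath need to agree.

```yaml
# Hypothetical fragment of a Deployment pod spec (illustration only).
# The image reference below is a placeholder, not the tag used in this PR.
containers:
- name: llm                            # hypothetical container name
  image: "REGISTRY/TGI_IMAGE:PINNED_TAG"  # pin to a known-good tag instead of a moving one
  env:
  - name: HF_HOME                      # Hugging Face cache directory
    value: /data                       # assumption: set explicitly so it cannot drift with the image
  volumeMounts:
  - name: model-cache                  # hypothetical volume name
    mountPath: /data                   # must match HF_HOME above
volumes:
- name: model-cache
  emptyDir: {}
```

If the image changes its default HF_HOME, a mount at the old path is silently unused, which is why pinning the tag or setting the variable explicitly keeps the two in step.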

Issue: GoogleCloudPlatform/kubernetes-engine-samples#1581

@annapendleton
Contributor

/gcbrun

@chengcongdu chengcongdu merged commit c985e95 into GoogleCloudPlatform:main Jan 15, 2025
6 checks passed