feat: online inferencing with gpus (downloader) #138
Conversation
Force-pushed from b2b97e6 to 185c429
Force-pushed from fc05f24 to 0295256
Force-pushed from 1f1b2ca to 34f0bdb
Force-pushed from 06b3137 to cc1ffc1
Force-pushed from b1b62b7 to 14271a4
Force-pushed from 14271a4 to 5fb8060
Increasing the disk size on the node pool is going to increase the cost quite a bit for something that will be used infrequently. I think we should investigate something more event-based, using Cloud Build or possibly Cloud Run Jobs, instead of increasing the disk size.
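For context, a minimal sketch of what the Cloud Build variant of this suggestion could look like, assuming a hypothetical cloudbuild.yaml with `_MODEL_ID`/`_MODEL_BUCKET` substitutions and an `hf-token` secret in Secret Manager; it could be run on demand with `gcloud builds submit` or wired to a trigger:

```yaml
# Hypothetical cloudbuild.yaml: download the model in one step, copy it to
# Cloud Storage in the next. /workspace is shared between steps and lives on
# the build worker, not on a GKE node pool disk.
steps:
- id: download-model
  name: python:3.12-slim
  entrypoint: sh
  secretEnv: ['HF_TOKEN']  # picked up by huggingface_hub; only needed for gated models
  args:
  - -c
  - |
    pip install --quiet huggingface_hub
    huggingface-cli download "${_MODEL_ID}" --local-dir /workspace/model
- id: upload-model
  name: gcr.io/google.com/cloudsdktool/cloud-sdk:slim
  entrypoint: gcloud
  args: ['storage', 'cp', '-r', '/workspace/model', 'gs://${_MODEL_BUCKET}/${_MODEL_ID}/']
substitutions:
  _MODEL_ID: Qwen/Qwen2.5-0.5B-Instruct  # hypothetical example model
  _MODEL_BUCKET: my-model-bucket         # hypothetical bucket name
availableSecrets:
  secretManager:
  - versionName: projects/$PROJECT_ID/secrets/hf-token/versions/latest
    env: HF_TOKEN
```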
Force-pushed from 521f0fe to 811f98a
Resolved review threads (both marked outdated):
- platforms/gke/base/use-cases/inference-ref-arch/terraform/cloud_storage/main.tf
- platforms/gke/base/use-cases/inference-ref-arch/terraform/deploy.sh
Force-pushed from 811f98a to 36e6e23
Refactored to use Cloud Storage directly, so no need to increase the boot disk size.
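For context, one way to write the download straight into a bucket on GKE is the Cloud Storage FUSE CSI driver, which mounts the bucket into the Pod so the model never has to be staged on an enlarged boot disk; a minimal sketch of the relevant Pod template fragment, with hypothetical volume and bucket names:

```yaml
# Pod template fragment (not a complete manifest): the downloader writes to
# /gcs, which is backed by the Cloud Storage bucket via gcsfuse.
metadata:
  annotations:
    gke-gcsfuse/volumes: "true"  # requests the gcsfuse sidecar injection
spec:
  containers:
  - name: downloader
    volumeMounts:
    - name: model-bucket
      mountPath: /gcs
  volumes:
  - name: model-bucket
    csi:
      driver: gcsfuse.csi.storage.gke.io
      volumeAttributes:
        bucketName: my-model-bucket  # hypothetical bucket name
```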
Force-pushed from 36e6e23 to ca01fc7
Minor comments, questions, and suggestions; other than that, LGTM.
Resolved review threads:
- ...ases/inference-ref-arch/kubernetes-manifests/model-download/load-model-to-cloud-storage.yaml (two threads)
- platforms/gke/base/use-cases/inference-ref-arch/online-inference-gpu/README.md (four threads, two marked outdated)
Commit message: Implement a Kubernetes Job to download models from Hugging Face to Cloud Storage.
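A minimal sketch of what such a Job could look like, assuming the Cloud Storage FUSE mount sketched above and hypothetical names (the model-downloader service account, hf-secret, the example model ID); the manifest actually added by this PR is kubernetes-manifests/model-download/load-model-to-cloud-storage.yaml:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: load-model-to-cloud-storage
spec:
  backoffLimit: 2
  template:
    metadata:
      annotations:
        gke-gcsfuse/volumes: "true"
    spec:
      restartPolicy: OnFailure
      serviceAccountName: model-downloader  # hypothetical; needs Workload Identity access to the bucket
      containers:
      - name: downloader
        image: python:3.12-slim
        command: ["/bin/sh", "-c"]
        args:
        - |
          pip install --quiet huggingface_hub &&
          huggingface-cli download "${MODEL_ID}" --local-dir "/gcs/${MODEL_ID}"
        env:
        - name: MODEL_ID
          value: Qwen/Qwen2.5-0.5B-Instruct  # hypothetical example model
        - name: HF_TOKEN  # only needed for gated models
          valueFrom:
            secretKeyRef:
              name: hf-secret
              key: hf_api_token
              optional: true
        volumeMounts:
        - name: model-bucket
          mountPath: /gcs
      volumes:
      - name: model-bucket
        csi:
          driver: gcsfuse.csi.storage.gke.io
          volumeAttributes:
            bucketName: my-model-bucket  # hypothetical bucket name
```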
Force-pushed from ca01fc7 to 4c53637