I have been experiencing a problem running an Autopilot GKE cluster. The problem prevents pods from running, so it's a little bit frustrating.
My configuration currently consists of only two workloads. One deployment is configured to give its pods 1 CPU and 2 GiB of RAM, but I constantly receive an error saying there is not enough CPU. In the pod's Events tab I can also see the error "GCE quota exceeded", but without any details about which quota I have exceeded. I've configured the deployment to scale to only 1 pod.
I also looked at my quota usage (IAM > Quotas), and the only quota with more than 50% usage is Persistent Disk SSD, yet the error I'm receiving indicates that CPUs are not available.
This is a screenshot from the Events tab:
This is a screenshot of my quota usage:
This is really confusing me. I don't understand why I'm receiving a quota error when I clearly have enough room to run 1 more CPU. I have also checked the minimum and maximum resource requests for my class:
If that table is correct, I should be within the boundaries: 250 mCPU < 1 CPU < 30 CPU and 512 MB < 2 GB < 110 GB. I really don't understand why GKE is not running my pod...
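The boundary check above can be reproduced with a small script. This is only a sketch: the helper names `parse_quantity` and `within` are hypothetical, and it assumes the memory figures are meant as binary units (`512Mi`, `2Gi`, `110Gi`) in Kubernetes quantity notation.

```python
from decimal import Decimal

# Suffixes used by Kubernetes resource quantities (subset; assumption: no
# exponent notation and no suffixes beyond these are needed here).
_SUFFIXES = {
    "Ki": Decimal(2) ** 10, "Mi": Decimal(2) ** 20, "Gi": Decimal(2) ** 30,
    "k": Decimal(10) ** 3, "M": Decimal(10) ** 6, "G": Decimal(10) ** 9,
    "m": Decimal(1) / 1000,  # milli, as in 250m CPU
}


def parse_quantity(text: str) -> Decimal:
    """Convert a quantity such as '250m', '2Gi' or '1' to a plain number."""
    # Try two-letter suffixes first so 'Mi' is not mistaken for something else.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * _SUFFIXES[suffix]
    return Decimal(text)


def within(request: str, minimum: str, maximum: str) -> bool:
    """Return True if minimum <= request <= maximum."""
    return parse_quantity(minimum) <= parse_quantity(request) <= parse_quantity(maximum)


# Limits as quoted from the table in the question.
print("cpu ok:", within("1", "250m", "30"))
print("memory ok:", within("2Gi", "512Mi", "110Gi"))
```

Passing this check only confirms that the request sits inside the per-Pod limits; it says nothing about project quota or about capacity in the zone.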
I have tried and investigated a lot, also other threads but im not able to find anything, hopefully someone has experienced the same problem and has succesfully solved it :)
Asked Feb 25 at 10:26 by comandantexd

1 Answer
The error you are encountering, "GCE out of resources. Pod is at risk of not being scheduled", indicates that your GKE Autopilot cluster is unable to allocate the resources (CPU, memory, etc.) needed to schedule your pod.
As per this GCP Autopilot troubleshooting document:
To resolve this issue, you can try the following:
Deploy the Pod in a different region or zone. If your Pod has a zonal restriction such as a topology selector, remove the restriction if you can. For instructions, see Place GKE Pods in specific zones.
Create a cluster in a different region and retry the deployment.
Try using a different compute class. Compute classes that are backed by smaller Compute Engine machine types are more likely to have available resources. For example, the default machine type for Autopilot has the highest availability. For a list of compute classes and the corresponding machine types, see When to use specific compute classes.
If you run GPU workloads, the requested GPU might not be available in your node location. Try deploying your workload in a different location or requesting a different type of GPU.
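As a rough illustration of the first and third suggestions, the sketch below removes a zonal `nodeSelector` from a Deployment manifest and either drops or swaps the compute class. The function name `relax_placement`, the Deployment name, the image and the zone are all hypothetical. It only handles `nodeSelector` (not `nodeAffinity`), and it assumes the standard `topology.kubernetes.io/zone` and `cloud.google.com/compute-class` keys are the ones in use.

```python
import copy
import json
from typing import Optional

ZONE_KEY = "topology.kubernetes.io/zone"
COMPUTE_CLASS_KEY = "cloud.google.com/compute-class"


def relax_placement(deployment: dict, compute_class: Optional[str] = None) -> dict:
    """Return a copy of the manifest without the zone selector.

    With compute_class=None the compute-class selector is removed as well,
    so the Pod falls back to the default Autopilot class.
    """
    result = copy.deepcopy(deployment)
    pod_spec = result["spec"]["template"]["spec"]
    selector = pod_spec.setdefault("nodeSelector", {})
    selector.pop(ZONE_KEY, None)
    if compute_class is None:
        selector.pop(COMPUTE_CLASS_KEY, None)
    else:
        selector[COMPUTE_CLASS_KEY] = compute_class
    if not selector:
        # Avoid leaving an empty nodeSelector behind.
        del pod_spec["nodeSelector"]
    return result


# Hypothetical manifest mirroring the question: 1 replica, 1 CPU, 2Gi memory.
example = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "my-app"},
    "spec": {
        "replicas": 1,
        "selector": {"matchLabels": {"app": "my-app"}},
        "template": {
            "metadata": {"labels": {"app": "my-app"}},
            "spec": {
                "nodeSelector": {ZONE_KEY: "us-central1-a", COMPUTE_CLASS_KEY: "Balanced"},
                "containers": [{
                    "name": "my-app",
                    "image": "example/image:tag",
                    "resources": {"requests": {"cpu": "1", "memory": "2Gi"}},
                }],
            },
        },
    },
}

# kubectl accepts JSON manifests, so the result can be saved and applied.
print(json.dumps(relax_placement(example), indent=2))
```

The original manifest is left untouched, which makes it easy to compare both versions before applying anything.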
You can also request a higher quota value by following the "request a higher quota value" guide, and check this document, which may help resolve the issue.
Edit:
Clean up the cluster by manually deleting any resources that are stuck or in an inconsistent state, then try creating a new cluster, which might resolve the issue.
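Before raising a request, it can help to see which regional quotas are actually close to their limit. The sketch below assumes the input is the JSON printed by `gcloud compute regions describe <region> --format=json`, which contains a `quotas` list with `metric`, `limit` and `usage` fields. The function name `quotas_near_limit` and the sample numbers are made up for illustration.

```python
from typing import List, Tuple


def quotas_near_limit(region_info: dict, threshold: float = 0.5) -> List[Tuple[str, float]]:
    """List (metric, usage ratio) pairs at or above the threshold, highest first."""
    hits = []
    for quota in region_info.get("quotas", []):
        limit = float(quota.get("limit", 0))
        usage = float(quota.get("usage", 0))
        # Skip quotas with a zero limit to avoid dividing by zero.
        if limit > 0 and usage / limit >= threshold:
            hits.append((quota["metric"], usage / limit))
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Made-up sample in the shape of the gcloud output (not real project data).
sample = {
    "name": "example-region",
    "quotas": [
        {"metric": "CPUS", "limit": 24.0, "usage": 3.0},
        {"metric": "SSD_TOTAL_GB", "limit": 500.0, "usage": 400.0},
        {"metric": "IN_USE_ADDRESSES", "limit": 8.0, "usage": 1.0},
    ],
}

for metric, ratio in quotas_near_limit(sample):
    print(f"{metric}: {ratio:.0%} used")
```

To run it against a real project, save the gcloud output to a file and pass the result of `json.load` to the function instead of `sample`. Note that a region can be out of capacity even when every quota shows headroom, so an empty result here does not rule out the "out of resources" case.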