
There were 2 pods running in my microservice, and both of them were restarted with the Kubernetes reason OOMKilled.

[Dashboard screenshot showing the OOMKilled restarts. The panel uses the following query: sum(0,increase(kube_pod_container_status_last_terminated_reason{cluster="prod_cluster",container=~"$service"}[$__range])) by (reason,container,namespace,pod) > 0 — a cleaned-up form of this query is sketched below.]

We analysed the memory and CPU trends for these pods [screenshots: CPU trend, Memory trend]. As we can see, CPU and memory look normal. These are the specs for this service: [screenshot: specs file]. The average CPU and memory consumption for both pods was also normal.

Next we suspected connection issues (with downstream services/databases/Kafka); this was checked but everything was fine and nothing unusual was observed. This led us to believe the issue might be at the node level, so we checked the node's memory consumption and found it was always fine [screenshots: container memory, node memory trend]. As seen, both container and node memory are fine, with no spikes or leaks observed. We also analysed the traffic patterns and latencies of all the downstream systems but still could not find anything.
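
As an aside on the dashboard query: sum(0, ...) is not standard PromQL, since sum aggregates a single expression (only aggregators such as topk, bottomk, quantile and count_values take a scalar parameter). Assuming the intent is to count OOMKilled terminations per pod over the dashboard range, a standard form of the panel query would look roughly like this:

    sum by (reason, container, namespace, pod) (
      increase(kube_pod_container_status_last_terminated_reason{cluster="prod_cluster", container=~"$service"}[$__range])
    ) > 0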

We analysed all the possible explanations for the OOMKilled restarts but could not reach anything conclusive; we expected to see a memory breach, yet the metrics show none.
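
For completeness, the termination reason reported by the kubelet can also be read directly from the container status rather than from a dashboard. A minimal check (POD_NAME is a placeholder for one of the affected pods):

    # prints the reason of the last terminated state, e.g. OOMKilled
    kubectl get pod POD_NAME -o jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}'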

asked Jan 18 at 16:21 by Yash Arora

1 Answer


There is a possibility that the memory limit set on the pod is too low. You can use the kubectl describe pod $POD_NAME command to check the events.
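
For example ($POD_NAME and $NAMESPACE are placeholders), the relevant information appears under "Last State: Terminated" in the describe output, and recent events for the pod can be listed separately:

    kubectl describe pod "$POD_NAME" -n "$NAMESPACE"
    # In the output, check "Last State: Terminated" -> Reason: OOMKilled, Exit Code: 137

    # Recent events for the pod (restarts, probe failures and evictions show up here)
    kubectl get events -n "$NAMESPACE" --field-selector involvedObject.name="$POD_NAME" --sort-by=.lastTimestamp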

I recommend making the memory limit higher.
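
As a sketch (the values below are made up; size them from your actual usage), raising the limit in the container section of the Deployment spec looks like this. For a JVM-based service, also make sure the heap settings (-Xmx or -XX:MaxRAMPercentage) leave headroom below the container limit, since the JVM's off-heap memory counts against that limit too:

    resources:
      requests:
        memory: "1Gi"
        cpu: "500m"
      limits:
        memory: "2Gi"
        cpu: "1"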

Tags: memory | Pod restart issue in java based microservice architecture | Stack Overflow