I'm working with NVIDIA's MIG (Multi-Instance GPU) technology to run inference in parallel. The technology works well, but I would like to know whether resources can be reassigned to a partition that already exists: either by adding currently unused resources to it, or by removing resources from an in-use partition and assigning them to another.

For now, my approach is to stop the process, destroy the partition, and recreate it. That causes a very large overhead, both from the MIG reconfiguration itself and from the time needed to reload the model, especially when the resources have to be taken from a partition that is in use. Does anyone know whether these reassignments can be done on the fly to avoid that overhead? I've seen that it's possible in Kubernetes-based systems. Is it possible to do something similar without relying on containers or virtual environments?
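
To make the destroy-and-recreate cycle concrete, here is a minimal sketch, assuming it is driven through the `nvidia-smi mig` command-line interface (the post does not say which tool is used). The helper names `build_repartition_commands` and `repartition`, the GPU index `0`, the GPU instance ID `1`, and the profile name `3g.20gb` are all hypothetical illustration values; valid profiles depend on the GPU model and can be listed with `nvidia-smi mig -lgip`. The workload running on the partition has to be stopped before these commands are issued, and the model reloaded afterwards, which is where the overhead described above comes from.

```python
import subprocess
from typing import List


def build_repartition_commands(
    gpu_index: int, gpu_instance_id: int, new_profile: str
) -> List[List[str]]:
    """Build the nvidia-smi calls that tear down one MIG GPU instance
    and create a replacement with a different profile.

    Nothing is executed here; the function only returns argument lists.
    """
    gpu = str(gpu_index)
    gi = str(gpu_instance_id)
    return [
        # 1. Destroy the compute instance(s) inside the GPU instance.
        ["nvidia-smi", "mig", "-dci", "-i", gpu, "-gi", gi],
        # 2. Destroy the GPU instance itself, releasing its slices.
        ["nvidia-smi", "mig", "-dgi", "-i", gpu, "-gi", gi],
        # 3. Create a new GPU instance with the requested profile;
        #    -C also creates the default compute instance for it.
        ["nvidia-smi", "mig", "-cgi", new_profile, "-C", "-i", gpu],
    ]


def repartition(
    gpu_index: int,
    gpu_instance_id: int,
    new_profile: str,
    dry_run: bool = True,
) -> List[List[str]]:
    """Print the commands (dry run) or execute them in order.

    Executing them usually requires root privileges and an idle partition.
    """
    commands = build_repartition_commands(gpu_index, gpu_instance_id, new_profile)
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the sequence at the first failing step.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical values: GPU 0, GPU instance 1, profile "3g.20gb".
    repartition(0, 1, "3g.20gb", dry_run=True)
```

With `dry_run=True` the script only prints the three commands, so the sequence can be reviewed before anything is changed on the GPU.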

Tags: cuda | Is it possible to dynamically allocate resources in MIG partitions on NVIDIA GPUs | Stack Overflow