
I started image generation via chat in Open WebUI on AUTOMATIC1111 (Schedule type: Karras, Sampling method: DPM++ 2M, CFG Scale: 5.5, W: 768, H: 768, Sampling Steps: 20). After ~20 seconds I got an image, but the VRAM was not released, so I decided to check which Ollama processes were currently running, and got the following:

root@PrivateAI:~# ps aux | grep ollama
root        2309  0.2  0.5 7020232 147716 pts/0  Sl+  15:21   0:03 ollama serve
root        9416  2.9  2.0 52018904 576780 pts/0 Sl+  15:38   0:08 /usr/local/lib/ollama/runners/cuda_v12_avx/ollama_llama_server runner --model /root/.ollama/models/blobs/sha256-87d5b13e5157d3a67f8e10a46d8a846ec2b68c1f731e3dfe1546a585432b8fa0 --ctx-size 2048 --batch-size 512 --n-gpu-layers 41 --mmproj /root/.ollama/models/blobs/sha256-42037f9f4c1b801eebaec1545ed144b8b0fa8259672158fb69c8c68f02cfe00c --threads 12 --parallel 1 --port 39237
root       11587  0.0  0.0   9144  2176 pts/2    S+   15:42   0:00 grep --color=auto ollama

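For what it's worth, I know I can check what Ollama still keeps loaded (and until when) with ollama ps; I am assuming the lingering VRAM usage is Ollama's keep-alive behaviour rather than anything in the runner flags, but I am not sure:

ollama ps
# prints NAME, ID, SIZE, PROCESSOR and UNTIL (the keep-alive deadline) for each loaded model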

I want to know:

1. Why is batch-size set to 512? If I understand it correctly, it is the number of generated images that go into the response... or not?
2. Why is ctx-size equal to 2048 and not 10240?
3. Is it possible to change these parameters? (See the sketch below for what I mean.)
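Here is what I had in mind for changing them; this is only a sketch, and I am assuming the runner's --ctx-size / --batch-size flags map onto Ollama's num_ctx / num_batch options (mymodel is a placeholder for my actual model name):

# Modelfile -- derive a variant of the model with overridden parameters
FROM mymodel
PARAMETER num_ctx 10240
PARAMETER num_batch 512

ollama create mymodel-10k -f Modelfile

or, per request through the API without creating a new model:

curl http://localhost:11434/api/generate -d '{
  "model": "mymodel",
  "prompt": "test",
  "options": { "num_ctx": 10240, "num_batch": 512 }
}'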

Server: Proxmox VE v8.3.3, RTX 3080 Ti (PCIe passthrough), driver 560.35.03, 28.0 GiB RAM, CPU [host] (2 sockets, 6 cores)
Ollama: v0.5.7
Open WebUI: v0.5.10
AUTOMATIC1111 (Stable Diffusion WebUI): v1.10.1, python 3.10.16, torch 2.1.2+cu121, xformers 0.0.23.post1, gradio 3.41.2
