Install Qwen Coder on Ollama and get it working in Open WebUI
Notes from AI - Research
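You can download a Claude 4b model via the Ollama CLI only if a variant with that name (claude-4b or similar, e.g. claude-3.5-sonnet:4b or claude-3-haiku:4b) actually exists in the Ollama library. Claude models are proprietary to Anthropic and have no openly published weights, so this is unlikely. To check (if your Ollama version has no search subcommand, browse the model library on ollama.com instead):
ollama search claude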
Recommendation #1 (The Powerhouse)
ollama pull qwen2.5-coder:3b
Recommendation #2 (The 2026 New Arrival)
ollama pull qwen3-coder:4b
^^^ not working
ollama pull qwen3.5:4b
^^^ working
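For coding, use this one, as it's trained more heavily on code (with no tag, Ollama pulls the model's latest tag; append :3b to get the 3B variant selected below):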
ollama pull qwen2.5-coder
To use it in Open WebUI:
- Once the download finishes in your terminal, go to your Open WebUI browser tab.
- Refresh the page.
- Click the model selector at the top (where it says Llama 3.2).
- Select Qwen2.5-Coder:3b from the list.
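If the model does not show up in the selector after a refresh, it can help to ask Ollama directly what it has installed. The following is a minimal sketch, assuming Ollama is listening on its default local address (http://localhost:11434) and that the GET /api/tags endpoint returns a JSON object with a "models" list; the helper names are hypothetical and only for illustration.

```python
import json
import urllib.request

# Assumption: Ollama's default local address; change it if yours differs.
OLLAMA_URL = "http://localhost:11434"


def installed_models(tags_response: dict) -> list[str]:
    """Extract model names from the JSON body returned by GET /api/tags."""
    return [m["name"] for m in tags_response.get("models", [])]


def fetch_tags(base_url: str = OLLAMA_URL) -> dict:
    """Ask the local Ollama server which models it has downloaded."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)


if __name__ == "__main__":
    wanted = "qwen2.5-coder:3b"
    try:
        names = installed_models(fetch_tags())
    except OSError as exc:
        # Connection refused, timeout, etc.: Ollama is probably not running.
        print(f"Could not reach Ollama at {OLLAMA_URL}: {exc}")
    else:
        print("Installed models:", names)
        print(f"{wanted} installed:", wanted in names)
```

If the model is listed here but not in Open WebUI, the problem is more likely the connection between Open WebUI and Ollama than the download itself.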
⚠️ Pro-Tip for Coding on 4GB of VRAM
When using a coding model, the "Context Window" (how much code the AI can "read" at once) is what eats up your remaining VRAM.
If you paste a 500-line script and the model crashes, the likely cause is that the "KV Cache" (the attention state stored for every token in the context) filled up your last 1GB of VRAM. To fix this in Open WebUI:
- Go to Settings -> Advanced Parameters.
- Find Context Length.
- Set it to 4096 rather than a larger value such as 8192 or 32k. This limits how much "history" the AI remembers but makes it far less likely to run out of VRAM on your GTX 1650 Super during a long coding session.
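To see why the context length matters so much, here is a rough back-of-the-envelope estimate of KV cache size. It is only a sketch: the layer count, KV head count, and head dimension below are assumed, illustrative values for a roughly 3B model with grouped-query attention, not confirmed specs, and real usage also depends on cache precision and runtime overhead.

```python
def kv_cache_bytes(context_len: int, n_layers: int, n_kv_heads: int,
                   head_dim: int, bytes_per_value: int = 2) -> int:
    """Estimate KV cache size: one key and one value vector per layer per token."""
    # The leading 2 accounts for storing both keys and values.
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_value


# Assumed, illustrative architecture numbers (not taken from the notes above).
ASSUMED = dict(n_layers=36, n_kv_heads=2, head_dim=128)

for ctx in (4096, 8192, 32768):
    mib = kv_cache_bytes(ctx, **ASSUMED) / 2**20
    print(f"context {ctx:>6}: ~{mib:.0f} MiB of KV cache (16-bit values)")
```

Under these assumptions the cache grows linearly with context: about 144 MiB at 4096 tokens, 288 MiB at 8192, and 1152 MiB at 32k, which is more than the 1GB of headroom mentioned above.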