From:
Kirill A. Korinsky <kirill@korins.ky>
Subject:
Re: misc/llama.cpp: update to b6934 with required update devel/libggml
To:
ports@openbsd.org
Date:
Mon, 03 Nov 2025 23:19:39 +0100

On Mon, 03 Nov 2025 19:22:14 +0100,
Stuart Henderson <stu@spacehopper.org> wrote:
> 
> On 2025/11/03 15:15, Kirill A. Korinsky wrote:
> > We don't have GPU support, but with -t 32 I ran the Qwen3 VL 30B model
> > on CPU only on an AMD Ryzen 9 7950X3D at an acceptable speed of about
> > 2 tokens/second, which is more or less usable. But it requires memory:
> > 120G as :datasize is enough.
> > 
> > Because we ship libggml as a dedicated port, it must be updated to the
> > latest version, and that version contains a bug which breaks large
> > models under a large number of threads:
> > https://github.com/ggml-org/llama.cpp/issues/16960
> 
> I was hoping to hold off updating llama until there was a new ggml
> release (implying they think it's stable-ish) rather than follow the
> bleeding edge, but if you want to, then do it... please keep an eye on
> the repo for fixes for any breakages, though.
>

Sure, I will.

On top of that, I'm waiting for upstream to decide on this PR.

I won't move forward until it's at least merged into llama.cpp.
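
For reference, the CPU-only run I described above was along these lines
(the model file name and prompt are illustrative, not the exact
invocation I used; adjust paths to taste):

  $ llama-cli -m Qwen3-VL-30B.gguf -t 32 \
      -p "Describe OpenBSD in one sentence." -n 128

The important part is -t 32 to use all the threads on the 7950X3D;
everything else is just a plain llama-cli text run.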

> 
> whisper still works, so with those changes it's ok with me.
> 

I'll incorporate your remarks into my local tree, thanks!

-- 
wbr, Kirill