Llamacpp - DEV Community

DEV Community

Jasur Yuldoshev

Jul 18

The model switcher that kernel-panicked my Mac

#llm #macos #debugging #llamacpp

5 min read

Pneumetron

Jul 18

GnLOLot Releases MiniCPM5-1B-Claude-Opus-Fable5-V2-Thinking-GGUF for Enhanced Local AI Development

#gguf #llamacpp #quantized #minicpm5

3 min read

Jul 14

Running a 1.5B-Parameter LLM Entirely On-Device for Mental Health — The NilaMind Architecture

#llamacpp #opensource #android

5 min read

ZyVOP

Jun 22

Can a $2,000 Mini PC Replace Your AI Cloud Bill?

#localai #strixhalo #hermesagent #llamacpp

9 min read

Creeta

Jun 18

llama-bench skipped FA on capable GPUs — b9437 corrects it

#llamacpp #llm #gguf #flashattention

7 min read

Kunal

Jun 5

Hermes Agent Desktop Free With Local LLMs: The Claude Code Alternative Nobody's Billing You For [2026]

#hermesagent #localllm #claudecodealternative #llamacpp

8 min read

Federico "SpeederX" Piana

Jul 4

What secretly eats your local LLMs' speed as your context fills up - Part 2

#ai #machinelearning #locallm #llamacpp

4 min read

Rost

May 24

Qwen 3.6 27B and 35B MTP vs Standard on 16GB GPU

#selfhosting #llm #ai #llamacpp

8 min read

Aurora

May 13

Self-Hosted AI Agent Systems: Why Local Inference Matters More Than You Think

#rust #ai #llamacpp #selfhosted

4 min read

TTFT and RAG efficiency insights

Deepu K Sasidharan

Jun 2

How fast is LlamaStash? Overhead, throughput, and a fair comparison with Ollama and LM Studio

#ai #llamacpp #benchmark #llm

24 min read

Jun 9

How to Tune llama.cpp --n-gpu-layers: A Practical VRAM Guide (2026)

#localllm #llamacpp #gpu #vram

4 min read

Jun 8

How to Tune --n-gpu-layers for Your VRAM Budget

#localllm #llamacpp #gpu #vram

4 min read

OpenAI proxy and VRAM-aware crash recovery

Deepu K Sasidharan

Jun 2

Introducing LlamaStash: a zero-overhead, terminal-native llama.cpp launcher

#ai #llamacpp #localllm #rust

11 min read

Jun 4

llama.cpp ngl: when -ngl 99 still runs on your CPU

#llamacpp #localllm #gpuoffloading #ngpulayers

5 min read

Storm Engine Technology.

Jun 3

llama.cpp b9455 Finally Caught vLLM: 70t/s on 2x3090 Qwen 27B UQ8

#llamacpp #llm #ai #opensource

3 min read
