Self-Enrichment

#llm

Articles tagged with #llm

MobileLLM-R1-950M meets Apple Silicon
From Stub to Coherent MLX Model, a story-shaped release note. Getting Meta's efficient 950M-parameter model running natively on Apple Silicon for fast, local inference. I've made a few small contributions to the llama.cpp community, including adding ...
Sep 15, 2025 · 6 min read
On KL-Divergence and Context Size Optimization in GGUF Quantization
Note: This paper was written with AI, and the exploration it describes was done collaboratively with AI. The "we" described here is us: me and a few models. With llama.cpp model quantization, properly adjusting models to keep their performance after ...
Oct 28, 2024 · 13 min read