🧪
Mac Playbook
โฑ 1 hr

LLaMA Factory on macOS

Install and fine-tune models with LLaMA Factory on MPS

Replaces DGX Spark: LLaMA Factory fine-tuning UI

Basic idea

LLaMA Factory is a unified fine-tuning framework that supports 100+ model architectures and multiple training paradigms through a single WebUI or CLI: SFT (Supervised Fine-Tuning), DPO (Direct Preference Optimization), ORPO, and PPO. Rather than writing training code from scratch, you configure a run through a form, click Start, and LLaMA Factory handles the rest.

On macOS, it routes all computation through the PyTorch MPS backend, so all the MPS constraints apply (float32 only, fallback env var for missing ops). LLaMA Factory's main advantage over writing raw HuggingFace Trainer code is its built-in support for DPO and RLHF workflows, which require substantial infrastructure to implement manually.
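To make the fallback requirement concrete: the variable must be exported in the shell before any process that imports PyTorch starts, so unsupported MPS ops fall back to CPU instead of raising errors. The variable name is PyTorch's own; everything after it is up to you.

```shell
# Set before launching anything that imports torch; ops missing from
# the MPS backend then run on CPU instead of raising NotImplementedError.
export PYTORCH_ENABLE_MPS_FALLBACK=1
echo "$PYTORCH_ENABLE_MPS_FALLBACK"
```

Adding the export line to your shell profile saves you from forgetting it on later runs, at the cost of silently masking missing-op slowdowns.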

What you'll accomplish

LLaMA Factory installed and serving its WebUI at localhost:7860, a completed LoRA fine-tune of Qwen2.5-7B-Instruct on the built-in alpaca_en_demo dataset (a small demo slice of the ~52K-sample Alpaca dataset), and the adapter saved to disk ready for inference or further evaluation.
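The run described above maps onto a small YAML config once exported from the WebUI. The key names below follow LLaMA Factory's example configs, but treat the specific values (LoRA rank, batch size, learning rate) as illustrative placeholders rather than tuned settings:

```yaml
# Sketch of a LoRA SFT run on MPS -- values are illustrative, not tuned
model_name_or_path: Qwen/Qwen2.5-7B-Instruct
stage: sft
do_train: true
finetuning_type: lora
lora_rank: 8
dataset: alpaca_en_demo
template: qwen
output_dir: saves/qwen2.5-7b/lora/sft
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 1.0
fp16: false   # float16 produces NaN loss on MPS; stay in float32
```

A config like this runs with `llamafactory-cli train <config>.yaml`, with the MPS fallback variable set in the same shell.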

What to know before starting

SFT vs DPO: SFT trains the model to predict the next token given an input; it learns "what to say." DPO trains from preference pairs (chosen vs rejected responses), teaching the model "what's better." LLaMA Factory supports both with the same WebUI.
LLaMA Factory wraps HuggingFace Trainer: All MPS limitations apply. Float16 training will produce NaN loss. The `PYTORCH_ENABLE_MPS_FALLBACK=1` environment variable must be set before launch, or ops missing from the MPS backend will raise errors instead of falling back to CPU.
Dataset formats: LLaMA Factory supports two main formats. Alpaca format has `instruction`, `input`, `output` fields. ShareGPT format has a `conversations` list with `from`/`value` pairs. Using the wrong format produces empty training batches.
WebUI maps to CLI arguments 1:1: Every WebUI field corresponds to a YAML config key. Once you've found working settings in the WebUI, export them to YAML for reproducible CLI runs.
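To make the two dataset formats above concrete, here is a minimal sketch writing the same sample in each. The field names are the formats' own; the text content and file name are arbitrary:

```python
import json

# One sample in Alpaca format: flat instruction/input/output fields.
alpaca_record = {
    "instruction": "Summarize the text.",
    "input": "LLaMA Factory is a fine-tuning framework.",
    "output": "A framework for fine-tuning LLMs.",
}

# The same sample in ShareGPT format: a "conversations" list of
# {"from": ..., "value": ...} turns, alternating human and gpt roles.
sharegpt_record = {
    "conversations": [
        {"from": "human",
         "value": "Summarize the text.\nLLaMA Factory is a fine-tuning framework."},
        {"from": "gpt", "value": "A framework for fine-tuning LLMs."},
    ]
}

# A dataset file is a JSON list of such records.
with open("my_dataset.json", "w") as f:
    json.dump([alpaca_record], f, indent=2)

print(sorted(alpaca_record))  # ['input', 'instruction', 'output']
```

A custom file like this also needs an entry in LLaMA Factory's `data/dataset_info.json` (declaring its path and format) before it shows up in the WebUI's dataset dropdown.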

Prerequisites

• macOS 12.3 or later
• Apple Silicon Mac (M1, M2, or M3 family)
• Python 3.9 or later
• git installed
• 16 GB+ unified memory (32 GB recommended for 7B models)
• ~20 GB free disk space for model weights and checkpoints

Time & risk

Duration: ~1 hour (install 15 min, model download 20-30 min, training varies)
Risk level: Medium. Clones a large repository and installs dozens of packages; use a venv.
Rollback: Deactivate and delete the venv, delete the cloned repository and downloaded model cache.
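The rollback can be sketched as a few commands. The venv and clone paths here are assumptions (substitute wherever you actually put them); the cache path follows the Hugging Face hub cache layout:

```shell
# Leave the venv if one is active (no-op otherwise)
deactivate 2>/dev/null || true
# Assumed locations for the venv and the cloned repo -- adjust to yours
rm -rf ~/llamafactory-venv ~/LLaMA-Factory
# Downloaded weights live in the shared HF cache; delete only this model
# so other projects' cached models survive
rm -rf ~/.cache/huggingface/hub/models--Qwen--Qwen2.5-7B-Instruct
```

Deleting the whole `~/.cache/huggingface` directory also works, but forces every other local HF project to re-download its models.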