Install and fine-tune models with LLaMA Factory on MPS
Replaces DGX Spark: LLaMA Factory
fine-tuning, UI
Basic idea
LLaMA Factory is a unified fine-tuning framework that supports 100+ model architectures and multiple training paradigms: SFT (Supervised Fine-Tuning), DPO (Direct Preference Optimization), ORPO, and PPO, all through a single WebUI or CLI. Rather than writing training code from scratch, you configure a run through a form, click Start, and LLaMA Factory handles the rest.
On macOS, it routes all computation through the PyTorch MPS backend, so all the MPS constraints apply (float32 only, fallback env var for missing ops). LLaMA Factory's main advantage over writing raw HuggingFace Trainer code is its built-in support for DPO and RLHF workflows, which require substantial infrastructure to implement manually.
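A minimal sketch of the fallback requirement: the variable must be in the environment before `torch` is imported, which is why it is usually exported in the shell rather than set inside a script. The `try/except` here is just to keep the sketch runnable without PyTorch installed.

```python
import os

# PYTORCH_ENABLE_MPS_FALLBACK must be set before torch is imported;
# otherwise any op missing an MPS kernel raises instead of running on CPU.
os.environ.setdefault("PYTORCH_ENABLE_MPS_FALLBACK", "1")

try:
    import torch  # optional here: only to confirm the MPS backend is visible
    mps_ready = torch.backends.mps.is_available()
except ImportError:
    mps_ready = False  # torch not installed; the env var is still set
```

The equivalent for a WebUI session is exporting the variable in the shell that launches LLaMA Factory, so every worker process inherits it.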
What you'll accomplish
LLaMA Factory installed and serving its WebUI at localhost:7860, a completed LoRA fine-tune of Qwen2.5-7B-Instruct on the built-in alpaca_en_demo dataset (a ~1K-sample demo subset of the 52K Alpaca dataset), and the adapter saved to disk ready for inference or further evaluation.
What to know before starting
SFT vs DPO: SFT trains the model to predict the next token given an input, so it learns "what to say." DPO trains from preference pairs (chosen vs rejected responses), teaching the model "what's better." LLaMA Factory supports both with the same WebUI.
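To make the preference-pair idea concrete, here is an illustrative DPO record as a Python dict. The field names (`instruction`/`input` plus `chosen`/`rejected`) follow the alpaca-style preference layout, but treat them as an assumption and check your `dataset_info.json` column mapping against LLaMA Factory's bundled examples.

```python
# One DPO training example (hedged sketch): same prompt, two candidate
# responses. DPO pushes probability mass toward "chosen" over "rejected".
pair = {
    "instruction": "Explain what LoRA is in one sentence.",
    "input": "",
    "chosen": (
        "LoRA freezes the base model and trains small low-rank adapter "
        "matrices, so fine-tuning touches only a fraction of the weights."
    ),
    "rejected": "LoRA is a long-range radio protocol for IoT devices.",
}
```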
LLaMA Factory wraps HuggingFace Trainer: All MPS limitations apply. Float16 training will produce NaN loss. The `PYTORCH_ENABLE_MPS_FALLBACK=1` environment variable must be set.
Dataset formats: LLaMA Factory supports two main formats. Alpaca format has `instruction`, `input`, `output` fields. ShareGPT format has a `conversations` list with `from`/`value` pairs. Using the wrong format produces empty training batches.
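The two formats side by side, with a small format check of the kind worth running before a long training job. The records are invented examples; `detect_format` is a hypothetical helper, not part of LLaMA Factory.

```python
# Alpaca format: flat instruction/input/output fields per record.
alpaca_record = {
    "instruction": "Translate to French.",
    "input": "Good morning",
    "output": "Bonjour",
}

# ShareGPT format: a "conversations" list of from/value turns.
sharegpt_record = {
    "conversations": [
        {"from": "human", "value": "Translate to French: Good morning"},
        {"from": "gpt", "value": "Bonjour"},
    ]
}

def detect_format(record):
    """Guess which of the two layouts a record uses (hypothetical helper)."""
    if "conversations" in record:
        return "sharegpt"
    if {"instruction", "output"} <= record.keys():
        return "alpaca"
    return "unknown"  # would yield empty batches if fed to the wrong loader
```

Running a check like this over the first few records of a custom dataset catches the empty-batch failure mode before any GPU time is spent.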
WebUI maps to CLI arguments 1:1: Every WebUI field corresponds to a YAML config key. Once you've found working settings in the WebUI, export them to YAML for reproducible CLI runs.
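As a sketch, an exported YAML for the LoRA run described above might look like the following. The key names match LLaMA Factory's published example configs, but the values here are placeholders: compare against the file the WebUI actually exports rather than copying this verbatim.

```yaml
# qwen_lora_sft.yaml — run with: llamafactory-cli train qwen_lora_sft.yaml
model_name_or_path: Qwen/Qwen2.5-7B-Instruct
stage: sft
do_train: true
finetuning_type: lora
lora_target: all
dataset: alpaca_en_demo
template: qwen
cutoff_len: 1024
output_dir: saves/qwen2.5-7b/lora/sft
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
fp16: false   # float16 produces NaN loss on MPS; stay in float32
```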