ComfyUI on macOS

Node-based image generation workflow with MPS backend

Replaces DGX Spark: Comfy UI

image generation, ui

Basic idea

ComfyUI is a node-based visual programming environment for image generation with Stable Diffusion and FLUX models. Instead of a single prompt box, you build a directed acyclic graph (DAG) where each node does exactly one thing: load a model, encode text, sample latents, decode to pixels, save an image. This visual composition lets you build complex pipelines — inpainting, ControlNet guidance, LoRA stacking, img2img, upscaling — by wiring nodes together without writing code. On macOS, ComfyUI uses PyTorch's MPS (Metal Performance Shaders) backend to accelerate inference on Apple Silicon's GPU.
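To make the graph idea concrete, here is a minimal sketch of a text-to-image workflow written as a Python dict in the shape of ComfyUI's API (JSON) workflow format, plus a small helper that derives an order in which the nodes could run. The checkpoint filename, prompts, and seed are placeholder assumptions, and `execution_order` is a hypothetical illustration, not ComfyUI's own scheduler.

```python
# Each key is a node id. A link is written as [source_node_id, output_index].
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"}},  # assumed filename
    "2": {"class_type": "CLIPTextEncode",   # positive prompt
          "inputs": {"text": "a lighthouse at dusk", "clip": ["1", 1]}},
    "3": {"class_type": "CLIPTextEncode",   # negative prompt
          "inputs": {"text": "blurry, low quality", "clip": ["1", 1]}},
    "4": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
    "5": {"class_type": "KSampler",
          "inputs": {"model": ["1", 0], "positive": ["2", 0], "negative": ["3", 0],
                     "latent_image": ["4", 0], "seed": 42, "steps": 25, "cfg": 7.0,
                     "sampler_name": "dpmpp_2m", "scheduler": "karras",
                     "denoise": 1.0}},
    "6": {"class_type": "VAEDecode",
          "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
    "7": {"class_type": "SaveImage",
          "inputs": {"images": ["6", 0], "filename_prefix": "ComfyUI"}},
}


def upstream(node):
    """Return the ids of nodes that feed this node's inputs."""
    return [v[0] for v in node["inputs"].values()
            if isinstance(v, list) and len(v) == 2 and isinstance(v[0], str)]


def execution_order(graph):
    """Depth-first topological sort; raises ValueError if the graph has a cycle."""
    order, done, visiting = [], set(), set()

    def visit(node_id):
        if node_id in done:
            return
        if node_id in visiting:
            raise ValueError(f"cycle detected at node {node_id}")
        visiting.add(node_id)
        for dep in upstream(graph[node_id]):
            visit(dep)
        visiting.discard(node_id)
        done.add(node_id)
        order.append(node_id)

    for node_id in graph:
        visit(node_id)
    return order


if __name__ == "__main__":
    for node_id in execution_order(workflow):
        print(node_id, workflow[node_id]["class_type"])
```

Every node appears only after the nodes it depends on, which is the property that makes the graph a DAG: the loader runs first, the sampler waits for both prompts and the empty latent, and the save node runs last.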

What you'll accomplish

By the end, you'll have ComfyUI running at localhost:8188 with a working text-to-image pipeline using SDXL, a clear understanding of how to build and modify node graphs, and the knowledge to install custom nodes for additional capabilities like ControlNet and upscalers.

What to know before starting

Stable Diffusion architecture: Three components work in sequence: (1) the CLIP text encoder converts your prompt into a sequence of token embeddings (CLIP's context window is 77 tokens), (2) the UNet denoiser iteratively removes noise from a latent tensor guided by those embeddings, (3) the VAE decoder translates the final latent tensor into a full-resolution RGB image.

Latent space: The UNet doesn't work on pixels. It works on a compressed 4-channel latent representation that is 8× smaller in each spatial dimension (a 1024×1024 image is a 128×128×4 latent). This makes the denoising steps computationally tractable.
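The arithmetic behind the latent size can be sketched in a few lines. `latent_shape` and `compression_ratio` are hypothetical helper names; the 4 channels and 8× downscale are the figures given above, and the ratio counts raw values (RGB pixels versus latent entries), not bytes.

```python
def latent_shape(width, height, channels=4, downscale=8):
    """Return (channels, latent_height, latent_width) for a given image size."""
    if width % downscale or height % downscale:
        raise ValueError(f"width and height must be multiples of {downscale}")
    return (channels, height // downscale, width // downscale)


def compression_ratio(width, height, channels=4, downscale=8):
    """How many RGB values the image holds per latent value."""
    c, h, w = latent_shape(width, height, channels, downscale)
    return (width * height * 3) / (c * h * w)


if __name__ == "__main__":
    print(latent_shape(1024, 1024))       # (4, 128, 128)
    print(compression_ratio(1024, 1024))  # 48.0
```

For a 1024×1024 image the UNet therefore processes 48 times fewer values per step than it would in pixel space.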

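Sampler: The algorithm that takes denoising steps. Different samplers (euler, dpmpp_2m) produce different quality/speed tradeoffs. The Karras noise schedule often improves results; in ComfyUI's KSampler node it is chosen in the separate scheduler field (karras) rather than as part of the sampler name, so the combination other tools call dpmpp_2m_karras is sampler dpmpp_2m with scheduler karras.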

CFG scale: Classifier-Free Guidance scale. Higher values make the output follow the prompt more strictly but can cause over-saturation and artifacts. For SDXL: 6–8 is a good range; above 12 often degrades quality.
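The effect of the CFG scale is easiest to see in the standard classifier-free guidance formula, which pushes the noise prediction away from the unconditional (empty or negative prompt) result and toward the prompt-conditioned one. The sketch below uses plain Python lists in place of tensors; `apply_cfg` is a hypothetical name, and real samplers apply the same arithmetic elementwise to full latent tensors.

```python
def apply_cfg(uncond, cond, scale):
    """Classifier-free guidance: uncond + scale * (cond - uncond), elementwise."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


if __name__ == "__main__":
    uncond = [0.0, 2.0]  # prediction without the prompt
    cond = [1.0, 2.0]    # prediction with the prompt
    print(apply_cfg(uncond, cond, 1.0))  # [1.0, 2.0]  (same as cond)
    print(apply_cfg(uncond, cond, 7.0))  # [7.0, 2.0]  (difference amplified 7x)
```

A scale of 1 reproduces the conditioned prediction unchanged. Larger scales amplify only the components where the two predictions disagree, which is why very high values tend to exaggerate contrast and saturation.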

VAE: The Variational Autoencoder, whose decoder turns latents into pixels. Each model checkpoint typically ships with a built-in VAE, but external VAE files (e.g., SDXL's `sdxl_vae.safetensors`) can improve color accuracy and sharpness.

safetensors format: A safe model checkpoint format that ComfyUI loads natively. It stores only tensor data and metadata, which avoids the arbitrary code execution risk of `.ckpt` (pickle) files. Always prefer `.safetensors` when downloading models.
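Because a safetensors file begins with an 8-byte little-endian header length followed by a JSON header, its contents can be inspected without executing anything from the file. The following is a minimal sketch using only the standard library; `read_safetensors_header` is a hypothetical helper, and the size cap is an arbitrary guard chosen for this example.

```python
import json
import struct

MAX_HEADER_BYTES = 100_000_000  # arbitrary safety cap for this sketch


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file as a dict."""
    with open(path, "rb") as f:
        prefix = f.read(8)
        if len(prefix) != 8:
            raise ValueError("file too short to be a safetensors file")
        (header_len,) = struct.unpack("<Q", prefix)  # little-endian uint64
        if header_len > MAX_HEADER_BYTES:
            raise ValueError("header length is implausibly large")
        raw = f.read(header_len)
        if len(raw) != header_len:
            raise ValueError("file ends before the header does")
    return json.loads(raw.decode("utf-8"))


def summarize(header):
    """Map tensor name -> (dtype, shape), skipping the optional metadata entry."""
    return {name: (info["dtype"], info["shape"])
            for name, info in header.items() if name != "__metadata__"}
```

Calling `summarize(read_safetensors_header(path))` on a downloaded checkpoint lists its tensor names, dtypes, and shapes, a quick sanity check that the download is a real model file rather than a truncated or mislabeled one.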

Prerequisites

• macOS 12.3+ (Monterey) — minimum for MPS backend in PyTorch

• Apple Silicon Mac (Intel Macs work but are significantly slower without MPS acceleration)

• Python 3.10+

• 16 GB+ unified memory

• 10–30 GB free disk space for models
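A quick way to check most of these prerequisites is a short script like the sketch below. The thresholds come from the list above; `check_prerequisites` and `parse_version` are hypothetical helpers, the unified-memory check is left out, and the MPS check at the end only runs if PyTorch is already installed.

```python
import platform
import shutil
import sys


def parse_version(text):
    """Turn '14.4.1' into (14, 4, 1); an empty string becomes ()."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def check_prerequisites(mac_version, machine, python_version, free_disk_gb):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if parse_version(mac_version) < (12, 3):
        problems.append("macOS 12.3 or newer is required for the MPS backend")
    if machine != "arm64":
        problems.append("not running natively on Apple Silicon (expect slow inference)")
    if tuple(python_version[:2]) < (3, 10):
        problems.append("Python 3.10 or newer is required")
    if free_disk_gb < 10:
        problems.append("less than 10 GB of free disk space for models")
    return problems


if __name__ == "__main__":
    free_gb = shutil.disk_usage(".").free / 1e9
    issues = check_prerequisites(platform.mac_ver()[0], platform.machine(),
                                 sys.version_info, free_gb)
    for issue in issues:
        print("WARNING:", issue)
    if not issues:
        print("Basic prerequisites look fine.")
    try:
        import torch
        print("MPS available:", torch.backends.mps.is_available())
    except ImportError:
        print("PyTorch is not installed yet; skipping the MPS check.")
```

Note that `platform.machine()` reports `x86_64` when Python itself runs under Rosetta, so a warning here on an Apple Silicon Mac usually means an Intel build of Python is in use.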

Time & risk

Duration: About 45 minutes, including the SDXL model download (6.9 GB)

Risk level: Low. ComfyUI runs as a local web server and makes no system changes

Rollback: Delete the `ComfyUI/` directory and uninstall the Python packages installed for it