Node-based image generation workflow with MPS backend
Replaces DGX Spark: Comfy UI
image generation, ui
Basic idea
ComfyUI is a node-based visual programming environment for image generation with Stable Diffusion and FLUX models. Instead of a single prompt box, you build a directed acyclic graph (DAG) where each node does exactly one thing: load a model, encode text, sample latents, decode to pixels, save an image. This visual composition lets you build complex pipelines (inpainting, ControlNet guidance, LoRA stacking, img2img, upscaling) by wiring nodes together without writing code. On macOS, ComfyUI uses PyTorch's MPS (Metal Performance Shaders) backend to accelerate inference on Apple Silicon's GPU.
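To make the graph idea concrete, here is a minimal sketch of a text-to-image graph written as a Python dictionary in the style of ComfyUI's API-format workflow JSON. The node class names are ComfyUI built-ins; the checkpoint filename, prompts, and sampler settings are placeholder assumptions. The script does not talk to ComfyUI. It only checks that the wiring is acyclic and prints one valid execution order.

```python
from graphlib import TopologicalSorter

# Each key is a node id. An input written as [node_id, output_index] is a wire
# from another node's output; anything else is a plain widget value.
# The checkpoint filename, prompts, and settings below are placeholders.
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"}},
    "2": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "a lighthouse at dusk", "clip": ["1", 1]}},
    "3": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "blurry, low quality", "clip": ["1", 1]}},
    "4": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
    "5": {"class_type": "KSampler",
          "inputs": {"model": ["1", 0], "positive": ["2", 0],
                     "negative": ["3", 0], "latent_image": ["4", 0],
                     "seed": 0, "steps": 25, "cfg": 7.0,
                     "sampler_name": "euler", "scheduler": "normal",
                     "denoise": 1.0}},
    "6": {"class_type": "VAEDecode",
          "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
    "7": {"class_type": "SaveImage",
          "inputs": {"images": ["6", 0], "filename_prefix": "example"}},
}


def upstream_ids(node):
    """Return the ids of the nodes whose outputs feed this node."""
    return {value[0] for value in node["inputs"].values()
            if isinstance(value, list)}


def execution_order(graph):
    """Topologically sort the graph; raises graphlib.CycleError on a cycle."""
    deps = {node_id: upstream_ids(node) for node_id, node in graph.items()}
    return list(TopologicalSorter(deps).static_order())


order = execution_order(workflow)
print(" -> ".join(workflow[n]["class_type"] for n in order))
```

Because every wire points from an upstream output to a downstream input, any order the sorter returns runs the loader before the sampler and the sampler before the decoder.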
What you'll accomplish
ComfyUI running at localhost:8188 with a working text-to-image pipeline using SDXL, a clear understanding of how to build and modify node graphs, and the knowledge to install custom nodes for additional capabilities like ControlNet and upscalers.
What to know before starting
Stable Diffusion architecture: Three components work in sequence: (1) CLIP text encoder converts your prompt to a 77-token embedding, (2) UNet denoiser iteratively removes noise from a latent tensor guided by that embedding, (3) VAE decoder translates the latent tensor into a full-resolution RGB image.
Latent space: The UNet doesn't work on pixels. It works on a compressed 4-channel latent representation that is 8× smaller in each spatial dimension (a 1024×1024 image is a 128×128×4 latent). This makes the denoising steps computationally tractable.
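The latent-space arithmetic can be checked with a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and the defaults (4 channels, 8× downscale) are the figures quoted above.

```python
def latent_shape(width, height, channels=4, downscale=8):
    """Return the (channels, height, width) shape of the latent for an image."""
    if width % downscale or height % downscale:
        raise ValueError("width and height must be multiples of the downscale factor")
    return (channels, height // downscale, width // downscale)


def element_ratio(width, height, channels=4, downscale=8):
    """How many times more values the RGB image holds than its latent."""
    c, h, w = latent_shape(width, height, channels, downscale)
    return (3 * width * height) / (c * h * w)


print(latent_shape(1024, 1024))   # (4, 128, 128)
print(element_ratio(1024, 1024))  # 48.0
```

For a 1024×1024 image the denoiser therefore touches 48 times fewer values per step than it would in pixel space.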
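Sampler: The algorithm that takes the denoising steps. Different samplers (euler, dpmpp_2m) produce different quality/speed tradeoffs. In ComfyUI the noise schedule is a separate scheduler setting on the KSampler node; choosing the karras scheduler often improves results. Other tools expose the same pairing under a single name such as dpmpp_2m_karras.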
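For intuition about what a Karras-style schedule changes, the sketch below computes the noise levels (sigmas) using the spacing formula from Karras et al. (2022). The default `sigma_min`, `sigma_max`, and `rho` values are illustrative assumptions, not values read from ComfyUI or from any particular checkpoint.

```python
def karras_sigmas(n, sigma_min=0.03, sigma_max=14.6, rho=7.0):
    """Return n noise levels from sigma_max down to sigma_min.

    Interpolates linearly in sigma**(1/rho) space, which packs more of the
    steps into the low-noise end than a plain linear spacing would.
    """
    if n < 2:
        raise ValueError("need at least two noise levels")
    min_inv = sigma_min ** (1.0 / rho)
    max_inv = sigma_max ** (1.0 / rho)
    return [(max_inv + (i / (n - 1)) * (min_inv - max_inv)) ** rho
            for i in range(n)]


sigmas = karras_sigmas(10)
print([round(s, 3) for s in sigmas])
```

The sampler then steps from one sigma to the next; the schedule only decides where those steps fall.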
CFG scale: Classifier-Free Guidance scale. Higher values make the output follow the prompt more strictly but can cause over-saturation and artifacts. For SDXL, 6 to 8 is a good range; above 12 often degrades quality.
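Classifier-free guidance is usually written as `uncond + scale * (cond - uncond)`: the sampler runs the denoiser with and without the prompt and extrapolates along the difference. The sketch below applies that formula to plain Python lists standing in for noise predictions; the function name and the toy numbers are hypothetical.

```python
def apply_cfg(cond, uncond, cfg_scale):
    """Combine prompted and unprompted noise predictions element by element."""
    return [u + cfg_scale * (c - u) for c, u in zip(cond, uncond)]


cond = [1.0, 2.0]    # toy prediction with the prompt
uncond = [0.0, 1.0]  # toy prediction with an empty prompt

print(apply_cfg(cond, uncond, 1.0))  # [1.0, 2.0]  (scale 1 returns the prompted prediction)
print(apply_cfg(cond, uncond, 7.0))  # [7.0, 8.0]  (larger scales push further from uncond)
```

The extrapolation is why very high scales overshoot: the result moves far outside the range of either prediction.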
VAE: The Variational Autoencoder decoder. Each model checkpoint typically ships with a built-in VAE, but external VAE files (e.g., SDXL's `sdxl_vae.safetensors`) can improve color accuracy and sharpness.
safetensors format: The secure model checkpoint format that ComfyUI expects. Avoids the arbitrary code execution risk of `.ckpt` (pickle) files. Always prefer `.safetensors` when downloading models.
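One reason `.safetensors` is safe to inspect is its layout: an 8-byte little-endian length, a JSON header of that length, then raw tensor bytes, with nothing to unpickle. The sketch below reads only the header using the standard library. The function names are hypothetical, and the code assumes a well-formed file.

```python
import json
import struct


def read_safetensors_header(path):
    """Parse only the JSON header of a .safetensors file; no tensor data is loaded."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # unsigned 64-bit, little-endian
        return json.loads(f.read(header_len).decode("utf-8"))


def summarize(header):
    """Map tensor name -> (dtype, shape), skipping the optional metadata entry."""
    return {name: (info["dtype"], info["shape"])
            for name, info in header.items()
            if name != "__metadata__"}
```

Calling `summarize(read_safetensors_header(path))` on a downloaded checkpoint lists its tensors without executing any code from the file, which is a quick sanity check before loading a model.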
Prerequisites
• macOS 12.3+ (Monterey): minimum for the MPS backend in PyTorch
• Apple Silicon Mac (Intel Macs work but are significantly slower without MPS acceleration)
• Python 3.10+
• 16 GB+ unified memory
• 10-30 GB free disk space for models
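Most of the prerequisites above can be checked from Python before installing anything. The following is a rough sketch, not an official ComfyUI check: the function name and result keys are hypothetical, unified memory is not checked, and the MPS entry reports False until PyTorch is installed. A Python running under Rosetta reports `x86_64` and would fail the Apple Silicon check.

```python
import platform
import shutil
import sys


def check_prerequisites(model_dir=".", min_free_gb=10):
    """Collect the prerequisite checks into a dict of name -> bool."""
    mac_version = platform.mac_ver()[0]  # empty string when not on macOS
    fields = [int(p) for p in mac_version.split(".") if p.isdigit()]
    macos_ok = bool(fields) and tuple((fields + [0])[:2]) >= (12, 3)
    free_gb = shutil.disk_usage(model_dir).free / 1e9

    results = {
        "macos_12_3_or_newer": macos_ok,
        "apple_silicon": platform.system() == "Darwin"
                         and platform.machine() == "arm64",
        "python_3_10_or_newer": sys.version_info >= (3, 10),
        "enough_free_disk": free_gb >= min_free_gb,
    }
    try:
        import torch  # only importable once PyTorch is installed
        results["mps_available"] = bool(torch.backends.mps.is_available())
    except (ImportError, AttributeError):
        results["mps_available"] = False
    return results


for name, ok in check_prerequisites().items():
    print(f"{'ok  ' if ok else 'FAIL'} {name}")
```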
Time & risk
Duration: 45 minutes including SDXL model download (6.9 GB)
Risk level: Low. ComfyUI runs as a local web server and makes no system changes
Rollback: Delete the `ComfyUI/` directory and uninstall Python packages