Local AI Image Generator on Cachy?

ROCm, Torch, Torchvision, etc. How do I get it running with AMD on Linux?
I want to run a realistic image generator locally, so I need a model. Flux seems promising. Models are not the problem; finding a GUI for Linux is. The dependencies also confuse me, by which I mean ROCm, torchvision, and that stuff.

Could someone elaborate on which categories the main software I need can be split into?

  1. GUI, 2. Model, 3. Dependencies?

ComfyUI and WebUI Forge (a fork of AUTOMATIC1111’s Stable Diffusion WebUI) are the most popular frontends. There are simpler ones, but they are less fully featured. If you’re serious about getting into it, you might as well learn to ride on a real bike.

Their GitHub pages contain detailed install instructions.

Civitai and Hugging Face are some of the biggest sources of models. There are many others.

Both Comfy and Forge depend on Python and work best with version 3.13. Cachy has already updated to 3.14, so it’s probably a good idea to use something like uv to set up a venv with Python 3.13. Being familiar with Python, pip, and uv would be very useful but is not strictly required.

After setting up the venv, any time instructions tell you to use pip or python, use uv pip or uv run python instead (while in the directory that contains the venv).

For example:

# Install uv:
sudo pacman -S uv

# Get ComfyUI:
git clone https://github.com/Comfy-Org/ComfyUI

# Setup uv virtual env with 3.13 in ComfyUI directory:
cd ComfyUI
uv venv --python=3.13 --seed

# Install AMD dependencies, as per ComfyUI instructions, but with uv:
uv pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm7.1
uv pip install -r requirements.txt

# Run ComfyUI
uv run main.py

If all goes well, you should be able to access ComfyUI by pointing a browser to http://127.0.0.1:8188/. You’ll need to download models into the models/ subdirectory. ComfyUI ships with examples that link to some models to get you started. See the ComfyUI page for more instructions.
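
If it does not go well (for example ComfyUI starts but runs on the CPU), a quick sanity check can narrow things down. The sketch below is only an illustration: check_rocm_nodes and check_torch_gpu are made-up helper names, and it assumes you run it from the ComfyUI directory with the uv venv from the steps above. It first looks for the device nodes ROCm uses, then asks PyTorch whether it was built for ROCm and can see a GPU.

```shell
#!/bin/sh
# Hypothetical helpers for checking an AMD/ROCm setup; not part of ComfyUI or uv.

# ROCm exposes the GPU through /dev/kfd (compute) and /dev/dri (render nodes).
# The optional argument replaces /dev and is only there to make testing easy.
check_rocm_nodes() {
    dev_root="${1:-/dev}"
    if [ -e "$dev_root/kfd" ] && [ -d "$dev_root/dri" ]; then
        echo "present"
    else
        echo "missing"
    fi
}

# Run from the ComfyUI directory so uv uses the venv created there.
# torch.version.hip is a version string on ROCm builds and None otherwise;
# ROCm GPUs are reported through the torch.cuda API.
check_torch_gpu() {
    uv run python -c '
import torch
print("torch:", torch.__version__)
print("hip:", torch.version.hip)
print("gpu available:", torch.cuda.is_available())
'
}
```

Source the file (or paste the functions into your shell), then call check_rocm_nodes followed by check_torch_gpu. If the nodes are missing, the problem is at the driver level. If hip prints None, the torch that got installed is not a ROCm build, so the install step with the ROCm index URL is worth repeating.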

For a treasure trove of links to resources, see here: More /sdg/ links
(Contains only links, but following some of those links may be very NSFW without warning)

Might wanna go for this one instead: lshqqytiger/stable-diffusion-webui-amdgpu (Stable Diffusion web UI) on GitHub

Seeing as he wants to use ROCm, and this is specifically geared for that (and active, unlike A1111 WebUI, since A1111 seems to have just disappeared).

Anyhow, I have some experience with this. Start with A1111 WebUI or the most promising of its forks. Going by star count, the amdgpu one is the most starred, but the second one is this:

It looks super promising, I might install it myself.

I would advise against starting with ComfyUI if you’re new to this, because ComfyUI is rather complicated and unintuitive. It’s geared towards people who want more than just the basics and has complexity to match.

A1111 WebUI was excellent for beginners: good for getting a basic grip on configuration and prompting, without any of the more advanced features that beginners don’t want.

So start with some simple, beginner-friendly UI, then switch to Comfy if you feel like you want to do something more elaborate, or run different types of AI (ComfyUI supports a lot of different types of models).

For running LLMs, if you’re interested in that, Ollama is the easiest way.

Yeah, I also started on A1111 back in the early days.

I agree, it’s a great place to start. Plenty of features, and the UI just dumps them right in front of you, so you know what to look up to learn more, or just click things to try them.

The “WebUI Forge” I linked is also an active fork of it. But that’s kind of the problem: Too many active forks. Efforts too spread out. Consequently, missing / limited features compared to Comfy. No matter which fork you use, you’re missing out on something, and will eventually need to switch forks as some fall behind and others pull ahead.

ComfyUI seems to have won the war. If I was starting today, I’d probably start there. But maybe that’s biased by the benefit of everything I already learned from A1111.

Whichever you go with, it’s probably best practice to sandbox it in docker or similar. Keeps it separate from your system, eliminates need for uv / venv, gives you more control over network access, etc. But setting that up properly is not trivial. And I’m not sure if AMD has GPU passthru for docker like Nvidia does.
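
On the AMD question: AMD’s ROCm documentation describes giving a container GPU access by passing the /dev/kfd and /dev/dri device nodes through, with no extra container runtime needed. The sketch below only prints such a command so you can review it before running anything. rocm_docker_cmd is a made-up helper name, and the default image (rocm/pytorch:latest) and the published port are assumptions for illustration, not something taken from the ComfyUI instructions.

```shell
#!/bin/sh
# Hypothetical helper: print a docker command that passes an AMD GPU into a container.
rocm_docker_cmd() {
    # Assumed default image; pass a different one as the first argument.
    image="${1:-rocm/pytorch:latest}"
    # /dev/kfd is the ROCm compute interface, /dev/dri holds the render nodes.
    # --group-add video lets the container user open those nodes
    # (some distros use the render group for this as well).
    # Publishing on 127.0.0.1 keeps the web UI reachable from this machine only.
    printf '%s\n' "docker run -it --rm --device=/dev/kfd --device=/dev/dri --group-add video -p 127.0.0.1:8188:8188 $image"
}

# Print the command for review instead of running it.
rocm_docker_cmd
```

Inside the container, ComfyUI would have to be started with --listen 0.0.0.0 for the published port to work, since by default it only listens on 127.0.0.1. You would still need to get ComfyUI and your models into the container, for example with a bind mount, which is the not-so-trivial part.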

It depends on how easily you grasp things. Are you the kind of guy that can pick up Blender in 3 days, or would you need 3 months?

If you’re the former, starting with ComfyUI is the answer; if you’re the latter, starting with something simpler is the answer.

I grasp things very fast, but I still needed a while to get used to Comfy. Ultimately, purely for image gen, I still prefer SD WebUI because I don’t need the advanced features of ComfyUI. Like the majority of ComfyUI users, I’m only doing things that I could have done more easily in SD WebUI lol.

I’m done with ComfyUI, sh!t connects to China (CCP) and Alibaba when installing.

I recommend ComfyUI. It’s a bit complicated at first, but it makes running AI very easy. And they’re coming out with ComfyUI apps so if you aren’t tech savvy, that’s ok.