>>115850
Essentially. It is, of course, always desirable to have sufficient RAM and CPU to do all the other stuff comfortably. How much that is depends on what else you do with your computer. If it is a dedicated AI workstation and you're not going to play games, do much web surfing, etc., in my opinion it would be perfectly fine and perfectly usable with a ten-year-old i5 Gen 6 four-banger and 4GB of RAM, especially if you are running Ubuntu or Pop!_OS instead of Windows. I bring up those two distros specifically because they are widely regarded as having some of the best nVidia driver support and compatibility.
You will need enough storage space for the models, and they get ridiculously yuge very fast. Putting the models on SSDs will make loading and switching models much faster, whether it's text and chat stuff or image gen stuff, but I know SSDs are going up in price and spinning platters are a lot cheaper.
You will need a big enough power supply for a beastly GPU and all those drives. I believe the 4090 is rated at 450 watts and can spike well above that for brief moments, so leave headroom. Or for a specialized AI system, there's the RTX Pro 6000 Blackwell with 96GB of VRAM, and that one is rated at 600 watts. The current king of big boppers for local AI is the nVidia Blackwell B200 with 192GB of VRAM, and some motherboards support a pair of them, but at 1200 watts each that's too much of a good thing. You're not going to get more than around 1500 watts through a wall socket with residential wiring at 110 volts, not without tripping the breakers or setting the house on fire. On top of that, getting LLM software or image diffusion software to "see" and communicate with more than a single GPU is non-trivial. I do recommend going full modular on a PSU for this kind of application.
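To put rough numbers on the wall socket limit, here's a back-of-the-envelope sketch. Every figure in it is my assumption, not a measurement: a 15 amp breaker, 120 volts nominal, the common 80% rule of thumb for continuous loads, a 90% efficient PSU, and a made-up 150 watts for the rest of the system.

```python
# Rough power-budget sketch. All constants below are illustrative assumptions.
BREAKER_AMPS = 15         # assumed typical residential circuit
LINE_VOLTS = 120          # assumed nominal voltage
CONTINUOUS_FACTOR = 0.8   # rule of thumb: keep continuous loads at 80% of the breaker rating
PSU_EFFICIENCY = 0.9      # assumed; the wall sees more than the parts consume


def circuit_budget_watts():
    """Watts the circuit can supply continuously under the assumptions above."""
    return BREAKER_AMPS * LINE_VOLTS * CONTINUOUS_FACTOR


def wall_draw_watts(component_watts):
    """Draw at the wall for a list of component loads, after PSU losses."""
    return sum(component_watts) / PSU_EFFICIENCY


def fits_on_circuit(component_watts):
    return wall_draw_watts(component_watts) <= circuit_budget_watts()


if __name__ == "__main__":
    rest_of_system = 150  # hypothetical CPU + drives + fans
    print("budget:", circuit_budget_watts())
    print("one 600 W card fits:", fits_on_circuit([600, rest_of_system]))
    print("two 1200 W cards fit:", fits_on_circuit([1200, 1200, rest_of_system]))
```

Under those assumptions the budget comes out a bit under 1500 watts, so one 600 watt card fits with room to spare and a pair of 1200 watt cards does not.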
A chipset as old as the H110 or B150 will support an RTX 5090 or RTX Pro 6000, just at PCIe 3.0 speeds, which would not be great for gaming, but is perfectly fine for image gen and text.
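For a feel of why the older PCIe generation matters so little here, this sketch estimates how long it takes to push a model across an x16 slot. It uses the published per-lane transfer rates and 128b/130b encoding, so these are theoretical peaks and real throughput will be lower. The 24GB model size is a hypothetical example.

```python
# Theoretical peak PCIe bandwidth per generation (gigatransfers per second per lane).
GT_PER_S = {3: 8.0, 4: 16.0, 5: 32.0}
ENCODING = 128 / 130  # 128b/130b line encoding used by PCIe 3.0 and later


def pcie_gb_per_s(gen, lanes=16):
    """Peak one-way bandwidth in GB/s for a given generation and lane count."""
    return GT_PER_S[gen] * ENCODING / 8 * lanes


def upload_seconds(model_gb, gen, lanes=16):
    """Best-case time to copy model_gb gigabytes of weights to the GPU."""
    return model_gb / pcie_gb_per_s(gen, lanes)


if __name__ == "__main__":
    model_gb = 24  # hypothetical model size
    for gen in (3, 4, 5):
        print(f"PCIe {gen}.0 x16: {pcie_gb_per_s(gen):.1f} GB/s, "
              f"{upload_seconds(model_gb, gen):.2f} s to load {model_gb} GB")
```

The point is that the weights cross the bus once at load time, and after that most of the work happens inside the card's own VRAM, so a slower slot mostly costs you a second or two when loading.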
As for definitions, see https://www.ibm.com/think/topics/diffusion-models. Image generation diffusion models and LLMs are completely different beasts, though the portion that interprets your text prompt is a small, specialized text encoder, basically a cut-down language model, that feeds data to the diffusion model.
>GPT5
The last time I checked, Grok was estimated to be in the 4-5 trillion parameter range. Opus, Gemini, and GPT-5.4 are generally judged to be roughly similar, but the people in charge do not publicize these details. The Chinese have one called Kimi K2, from Moonshot AI, that they claim has one trillion parameters, and I will let you decide how plausible you think that is.
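To show what those parameter counts mean for local hardware, here's a rough sketch that turns a parameter count into the memory needed for the weights alone. It ignores context cache, activations, and overhead, so real requirements are higher. The bit widths and the 96GB card are just example inputs.

```python
import math


def weights_gb(params, bits_per_param=16):
    """Gigabytes needed to hold the weights only (1 GB = 1e9 bytes)."""
    return params * bits_per_param / 8 / 1e9


def cards_needed(params, vram_gb, bits_per_param=16):
    """Minimum number of cards whose combined VRAM holds the weights only."""
    return math.ceil(weights_gb(params, bits_per_param) / vram_gb)


if __name__ == "__main__":
    one_trillion = 1e12
    for bits in (16, 8, 4):
        print(f"{bits}-bit: {weights_gb(one_trillion, bits):.0f} GB, "
              f"{cards_needed(one_trillion, 96, bits)} x 96 GB cards")
```

Even squeezed down to 4 bits per parameter, a one trillion parameter model needs about 500GB for the weights, which is several 96GB cards before you account for anything else.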