I’ve been experimenting with generative AI ever since Stable Diffusion was released in Oct 2022.
This process brought me joy and some interesting ideas. Contrary to what most have written online, it rekindled my interest in drawing and writing.
I’ll share more on AI topics in upcoming posts, but for now, let me start with some information about what I use and why.
My PC has a moderate GPU capable of running some racing sim games 😉 Turns out, it was also capable of exploring the world of AI images locally.
My current primary OS is Ubuntu. I am an advanced user, but there is a mismatch between my GPU and some libraries, so I took the shorter route and abandoned Ubuntu for image generation. Some users do manage to get it working, so keep at it if you must.
On the other hand, my Windows 10 installs of SD projects went mostly flawlessly. That is, unless the project itself was broken by an update. You know: the "move fast and break things" mentality 🙂
Still, I was amazed at the pace and usability of these open source tools even from the very beginning.
I first used the official repo, then a few smaller ones that no longer work, then the now very popular AUTOMATIC1111, and finally the InvokeAI app. I also briefly explored ComfyUI.
There are a gazillion other projects, and a new tool or plugin appears every week, so do not limit yourself to these. There is also likely a plugin for your preferred image editing software, so check those too.
I find ComfyUI the most flexible, but also not as user friendly. I rarely if ever use it, simply because I’ve already familiarized myself with the other projects below.
InvokeAI has a great model manager, and its UI is different and IMHO better than the rest, especially if you're doing a lot of editing, inpainting and outpainting. The UI integrates well with custom LoRAs and is overall very user friendly.
AUTOMATIC1111 has the most extensions AFAIK and would be my suggestion if you wanna play around with additional functionality within a single repo. Its inpainting UI does not work well for me, which is my main complaint.
Midjourney vs StableDiffusion vs DALL-E
Overall, Midjourney is great. But I am a fan of open source and tinkering. Stable Diffusion gives you more and costs nothing (ok, it does cost time and some extra electricity). It allows you to create permanent personas, concepts, styles and poses. And all of these are very controllable.
OpenAI are really onto something with their ChatGPT+DALL-E 3 integration. It is amazing at understanding prompts. Still, SD excels at the things I find most interesting: direction, control, and extensibility. If DALL-E had OpenPose, character consistency, and fixed styles, that would be a killer combination. Until then, SD is my choice.
Here is a list of articles in this series:
- A year of AI Images – StableDiffusion (this article)
- AI Images – Experiments
- AI Images – In search of control (coming soon)