
The last mile of AI assisted creation is customization.

While AI tools like Midjourney are now capable of producing high-quality content, most of their output looks distinctly AI-generated and is easy to spot for anyone with some exposure to AI art.

The telltale characteristics include hyper-detail, stylistic inconsistency, implausible lighting, and over-complication. Unless artist names are used in the prompt, the results tend to share a very similar style; AI art has a lot of redundancy by default.

[Image: an arrangement of artifacts on a gradient background]

Created by Youran Song, with Midjourney AI

This is the last mile problem for creators using AI as a tool.

 

Although general-purpose models are already here, only those who can train and customize their own models will be able to unleash the true power of AI and add value to their creations. Training and fine-tuning a customized model is likely to be the most time-consuming and resource-intensive part of the process, and therefore the most valuable.

​

This is the documentation of my experiments with Stable Diffusion and Midjourney, attempting to build a model that reliably produces images that are creative, unique-looking, and in line with my existing personal style.

[Image: last-mile delivery illustration]

The first step is to understand Stable Diffusion.

 

The current version, 2.1, provides 20 sampling methods. Here is a comparison of 17 of them, all using the same parameter settings, seed value, and ControlNet input.

[Sampler comparison grid, one image per method: Euler a, Heun, DPM++ 2S a, DPM fast, DPM2 Karras, DPM++ 2M Karras, Euler, DPM2, DPM++ 2M, DPM adaptive, DPM2 a Karras, DPM++ SDE Karras, LMS, DPM2 a, DPM++ SDE, LMS Karras, DPM++ 2S a Karras]
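A sweep like this can also be scripted rather than clicked through. Below is a minimal sketch, assuming the Hugging Face diffusers library rather than the web UI used here; the prompt, step count, and sampler subset are placeholders, and the ControlNet conditioning is omitted for brevity.

import torch
from diffusers import (
    StableDiffusionPipeline,
    EulerAncestralDiscreteScheduler,
    EulerDiscreteScheduler,
    HeunDiscreteScheduler,
    LMSDiscreteScheduler,
    DPMSolverMultistepScheduler,
)

# Load SD 2.1 once; the scheduler is swapped out between runs.
pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
).to("cuda")

# A subset of the samplers compared above, mapped to diffusers schedulers.
# The "Karras" variants are the same solver with the Karras sigma schedule.
samplers = {
    "euler_a": EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config),
    "euler": EulerDiscreteScheduler.from_config(pipe.scheduler.config),
    "heun": HeunDiscreteScheduler.from_config(pipe.scheduler.config),
    "lms": LMSDiscreteScheduler.from_config(pipe.scheduler.config),
    "dpmpp_2m_karras": DPMSolverMultistepScheduler.from_config(
        pipe.scheduler.config, use_karras_sigmas=True
    ),
}

prompt = "an arrangement of artifacts, gradient background"  # placeholder prompt

for name, scheduler in samplers.items():
    pipe.scheduler = scheduler
    # Re-seed before every run so only the sampler differs between images.
    generator = torch.Generator("cuda").manual_seed(42)
    image = pipe(prompt, num_inference_steps=30, generator=generator).images[0]
    image.save(f"{name}.png")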

Since the quality of the results was far from satisfactory, I then sent them back into Midjourney as reference images. Here are some of the outcomes.

[Five Midjourney outputs generated from the Stable Diffusion results]

As a next step, these outputs will be run through Midjourney again to achieve higher fidelity, and then used as a dataset to train a customized Stable Diffusion LoRA model. The tagging will be done manually in BooruDatasetTagManager.
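The conventional layout for such a dataset, which BooruDatasetTagManager reads and writes, is one comma-separated .txt tag file next to each image, with matching filenames. A small sanity-check sketch under that assumption, with "dataset" as a hypothetical folder of the upscaled outputs:

from pathlib import Path

# Check that every image in the dataset has a matching, non-empty tag file,
# following the one-.txt-per-image convention used by booru-style taggers.
DATASET_DIR = Path("dataset")  # hypothetical folder name

for img in sorted(DATASET_DIR.glob("*.png")):
    tag_file = img.with_suffix(".txt")
    if not tag_file.exists():
        print(f"missing tags: {img.name}")
        continue
    tags = [t.strip() for t in tag_file.read_text().split(",") if t.strip()]
    print(f"{img.name}: {len(tags)} tags")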
