The last mile of AI-assisted creation is customization.
While AI tools like Midjourney can now produce high-quality content, most of their output still looks distinctly AI-generated, and anyone with some exposure to AI art can spot it. The telltale characteristics include hyper-detail, stylistic inconsistency, incorrect lighting, and over-complication. Unless artist names are used in the prompt, the results tend toward a very similar look: AI art is highly redundant by default.
Created by Youran Song, with Midjourney AI
This is the last mile problem for creators using AI as a tool.
Although general-purpose models are already here, only those who can train and customize their own models will be able to unleash the true power of AI and add value to their creations. Training and fine-tuning a customized model is likely to be the most time-consuming and resource-intensive part of the process, and therefore the most valuable.
This is the documentation of my experiments with Stable Diffusion and Midjourney, attempting to build a model that reliably produces images that are creative, unique-looking, and in line with my existing personal style.
The first step is to understand Stable Diffusion.
The current version, 2.1, provides 20 sampling methods. Here is a comparison of results using identical parameter settings, seed value, and ControlNet input.
DPM++ 2S a
DPM++ 2M Karras
DPM2 a Karras
DPM++ SDE Karras
DPM++ 2S a Karras
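The sampler sweep above can be sketched in code. This is a minimal illustration using the `diffusers` library rather than a web UI; the model ID, prompt handling, and the note mapping web-UI sampler names onto scheduler classes are my assumptions, not details from the original experiment.

```python
# Sampler names compared above (Automatic1111-style labels).
SAMPLERS = [
    "DPM++ 2S a",
    "DPM++ 2M Karras",
    "DPM2 a Karras",
    "DPM++ SDE Karras",
    "DPM++ 2S a Karras",
]

def sample_with(scheduler_cls, prompt, seed=42, **scheduler_kwargs):
    """Generate one image with a given scheduler from a fixed seed,
    so the sampling method is the only variable between runs."""
    # Heavy imports kept local so the list above is usable without a GPU.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-2-1"  # SD 2.1, as in the text
    )
    # Swap the scheduler while keeping its base config; for example,
    # DPMSolverMultistepScheduler with use_karras_sigmas=True roughly
    # corresponds to the web UI's "DPM++ 2M Karras" (an assumption --
    # the name mappings are not exact).
    pipe.scheduler = scheduler_cls.from_config(
        pipe.scheduler.config, **scheduler_kwargs
    )
    generator = torch.Generator("cpu").manual_seed(seed)  # fixed seed
    return pipe(prompt, generator=generator).images[0]
```

Calling `sample_with` once per scheduler class with the same `seed` reproduces the apples-to-apples setup: holding the seed, prompt, and parameters constant is what makes the resulting grid comparable across samplers.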
Since the quality of these results was far from satisfactory, I then sent them back into Midjourney as reference images. Here are some of the outcomes.
As a next step, these outputs will be run through Midjourney again to achieve higher fidelity, then used as a dataset to train a customized Stable Diffusion LoRA model. Tagging will be done manually in BooruDatasetTagManager.
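Once the dataset images are collected, each one needs a caption for LoRA training. Kohya-style trainers, and tag editors such as BooruDatasetTagManager, conventionally use a sidecar `.txt` file of comma-separated tags next to each image. The helper below is a small sketch of that convention; the file names and tags are illustrative, not from my actual dataset.

```python
from pathlib import Path

def write_caption(image_path: Path, tags: list[str]) -> Path:
    """Write a comma-separated tag list to a sidecar .txt file
    next to the image (e.g. 001.png -> 001.txt)."""
    caption_path = image_path.with_suffix(".txt")
    caption_path.write_text(", ".join(tags), encoding="utf-8")
    return caption_path
```

Captions written this way can then be reviewed and corrected by hand in BooruDatasetTagManager before training.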