AI Art Generation Handbook/ControlNet


What is ControlNet?

ControlNet example (22 February 2023)

As you experiment with Stable Diffusion, you may find that you want to create a character with a cool, imposing pose, but achieving it through prompts alone is a huge undertaking. You may generate hundreds of images and still be unable to get the kind of image you wanted.

This is where ControlNet comes to the rescue.

ControlNet is a neural network structure that works with Stable Diffusion to provide finer visual consistency and more control (pose, face, background, etc.) over AI image generation.

It was developed by the Stanford researcher Lvmin Zhang and first introduced to the public in February 2023.

It allows for the use of additional input conditions, such as sketches, outlines, depth maps, or human poses, to guide the output of Stable Diffusion.

This means that the output image can be controlled and guided by these additional inputs, modifying a variety of properties of the image, such as the pose, the expression, and the background, to produce a more accurate and desired result. For example:

Let's say you want to control the pose of a human in an image (as shown, a girl squatting down).

An AI model would be trained on a dataset of images of humans in different poses. Meanwhile, the ControlNet network is trained on the same dataset of images.

The ControlNet network would learn to associate the pose of the human in the image with the desired output of the diffusion model.

Once the ControlNet network is trained, you can use it to control the pose of the human in an image and use the prompt to change the subject from female to male, as shown.
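To make this concrete, here is a minimal sketch using the Hugging Face diffusers library (a Python alternative to the Web-UI workflow described later in this chapter). The model IDs (lllyasviel/sd-controlnet-openpose, runwayml/stable-diffusion-v1-5) and the pose file name are assumptions chosen for illustration; substitute your own checkpoints and pose map.

```python
import torch
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel, UniPCMultistepScheduler
from diffusers.utils import load_image

# Assumed input: an OpenPose skeleton extracted from a photo of a girl squatting down.
pose_map = load_image("pose_squatting.png")

# Load the pose ControlNet and attach it to a Stable Diffusion 1.5 pipeline.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-openpose", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
)
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
pipe.enable_model_cpu_offload()

# The prompt swaps the subject from female to male; the ControlNet keeps the squatting pose fixed.
image = pipe(
    "a man squatting down, photorealistic",
    image=pose_map,
    num_inference_steps=30,
).images[0]
image.save("man_squatting.png")
```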

How does ControlNet work?

ControlNet Simplified Flow

(1) The text prompt is sent to the ControlNet neural network.

(2) The trained AI model is then used to generate an image.

(3) ControlNet is then used to add extra conditions to the generated image.

(4) The generated image is then fine-tuned using ControlNet.

(5) The fine-tuned image is then output.
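The same flow can be read off a minimal diffusers call; note that in code the extra condition is passed into a single generation call, so steps (2) to (4) happen inside one denoising loop rather than as separate passes. The Canny model ID and the file names below are illustrative assumptions.

```python
import torch
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
from diffusers.utils import load_image

edge_map = load_image("edges.png")  # the extra condition, here a Canny edge map (step 3)

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
)
pipe.enable_model_cpu_offload()

result = pipe(
    "a futuristic glass building at sunset",  # the text prompt (step 1)
    image=edge_map,                           # ControlNet steers the denoising (steps 2-4)
    num_inference_steps=30,
)
result.images[0].save("building.png")         # the conditioned image is output (step 5)
```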





How to use ControlNet?

We assume you already have the Web-UI by Automatic1111 or SD.Next (Vladmandic) installed.

First of all, go to Extensions -> Available -> click the "Load from:" button to load the list of available extensions.

Search for the extension sd-webui-controlnet. Click the "Install" button on the right side.

Restart Automatic1111 completely to make sure the extension is properly installed.

Once installed, inside the txt2img or img2img tab you should see a ControlNet v1.1.xxx panel (usually below the Seed field). Click it to expand the menu.

You should see something similar to the example below.

Here are some of the settings found in the ControlNet panel:

To use any ControlNet model, you need to tick Enable.

If your graphics card does not have enough VRAM for the additional pre-processing, tick Low VRAM.

Pixel Perfect is suitable if you want ControlNet to retain most of the details of the picture.

(Note: this is useful if you want to change a photorealistic image into an anime style.)


Control Type is the type of ControlNet preprocessing that you want to apply to the image.

The list below shows the ControlNet types that currently work with SDXL (a sketch of the Canny preprocessing step follows the list).







ControlNet Type:

Canny
Pose
QR Code
Scribble
Line-art
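For the Canny type, the Web-UI runs the preprocessor for you, but it helps to see what it actually produces. Below is a minimal sketch of the Canny preprocessing step using OpenCV; the file names and thresholds (100/200 are common defaults) are assumptions you can tune.

```python
import cv2
import numpy as np
from PIL import Image

# Turn an ordinary photo into the edge map that the Canny ControlNet expects.
source = cv2.imread("reference_photo.png")            # any input photograph
edges = cv2.Canny(source, threshold1=100, threshold2=200)

# ControlNet conditioning images are 3-channel, so replicate the edge channel.
edges_rgb = np.stack([edges] * 3, axis=-1)
Image.fromarray(edges_rgb).save("reference_canny.png")
```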


Reference links

ControlNet - a layman's introduction to ControlNet

Hugging Face - download the pre-trained ControlNet models here

GitHub - the authors' original work and documentation are here