AI Art Generation Handbook/Node Prompting in Stable Diffusion

Generic node programming for SDXL in ComfyUI

The node-based "programming language" used in ComfyUI can be overwhelming, especially for users who are not familiar with it compared to other local UIs (such as Auto1111, Fooocus, Omost and such), but if we bite the bullet and get through it, it becomes easy.

As you can see, there are square blocks connected to one another by multiple coloured lines. In this context, the square blocks are known as "nodes" or "block diagrams", and the coloured lines are known as "wires" or "pipelines".

Control


Once you have successfully installed ComfyUI, let's learn how to control it.

Actions and expected results:
Left click on any empty space and drag: the whole canvas moves in the direction you drag.
Left click on any node and drag: the selected node moves in the direction you drag.
Right click on any empty space: a menu appears to "Add Node".
Right click on any node: a menu appears to edit the properties of the selected node.
Double click on an empty space: a search bar appears for nodes to be added.
Hover over the bottom-right corner of a node until the cursor changes to a resize arrow, then press and drag: the selected node is resized.

Anatomy of the nodes


Every node has these three characteristics:
Input Data Types (Red): Data flows into the node through these pipelines. Take note that the wire colours match the input data types.
Output Data Types (Blue): Data flows out of the node through these pipelines. Take note that the wire colours match the output data types.
Control Parameters (Green): The results of the calculations performed within the node are largely influenced by the values set in its control parameters.

Basically, each node (as shown on the left) is a big "black box": data flows into the node, the node performs abstracted and often complex calculations (based on its control parameters), and once the calculations are done, the data is output to the next node for further processing.
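
To make the black-box idea concrete, here is a minimal Python sketch (an illustration only, not ComfyUI's actual internals): the control parameters configure the node, and run() turns inputs into outputs.

    # Minimal sketch of a node as a black box: control parameters ("green")
    # configure it, inputs ("red") flow in, outputs ("blue") flow out.
    class Node:
        def __init__(self, **control_parameters):
            self.params = control_parameters

        def run(self, **inputs):
            raise NotImplementedError

    class Scale(Node):
        def run(self, **inputs):
            # multiply the incoming value by the "factor" control parameter
            return {"value": inputs["value"] * self.params["factor"]}

    out = Scale(factor=2.0).run(value=21.0)["value"]   # 42.0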

Node Programming


To learn node programming, we can clear the whole workflow by clicking the "Clear" button on the right. Then, following the text-to-image diagram of how it all works, we can recreate the workflow.

Node Programming - Add in new nodes


To add new nodes, we can double click on any empty space in the canvas and use the search function to find the required block diagrams more easily:

We will add these few important block diagrams:

Block Diagram Name: Description
Load Checkpoint [Diffusion Model]: Loads your preferred AI image model, stored at Comfyui\ComfyUI_windows_portable\ComfyUI\models\checkpoints. From Load Checkpoint, we can see it has three outputs that need to be connected: MODEL, CLIP and VAE.
CLIP Text Encode (Contrastive Language-Image Pre-Training) [Text Encoder]: CLIP is a neural network (released by OpenAI) that learns the relationship between images and text captions. We need to create two of these for now: one for the "Positive" prompt and one for the "Negative" prompt.
VAE Decode (Variational AutoEncoder) [Image Decoder]: A generative neural network that learns to compress and reconstruct pixels while also learning a probabilistic representation of the data; here it decodes the finished latent back into an image.
Empty Latent Image: Sets the final output size as well as the number of images per generation batch.
K-Sampler: Performs the actual sampling; the 'K' comes from the k-diffusion family of samplers it implements. Its steps and scheduler settings control how many denoising iterations the diffusion model runs and how the noise is removed at each step.
Save Image: Saves the output images to disk.
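
For reference, ComfyUI can also export a workflow in its API format ("Save (API Format)", visible once dev mode options are enabled), where each block diagram becomes an entry keyed by a node id. The sketch below declares the same block diagrams as a Python dict; the checkpoint filename and prompt texts are placeholders, and the pipelines are wired up in the next section.

    # The block diagrams from the table, in ComfyUI's API-format workflow
    # layout (node id -> class_type + inputs). Checkpoint filename and
    # prompts are placeholders; connections are added in the next section.
    nodes = {
        "1": {"class_type": "CheckpointLoaderSimple",      # Load Checkpoint
              "inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"}},
        "2": {"class_type": "CLIPTextEncode",              # positive prompt
              "inputs": {"text": "a scenic mountain lake at sunrise"}},
        "3": {"class_type": "CLIPTextEncode",              # negative prompt
              "inputs": {"text": "blurry, low quality"}},
        "4": {"class_type": "KSampler", "inputs": {}},     # wired up later
        "5": {"class_type": "EmptyLatentImage",            # output size / batch
              "inputs": {"width": 512, "height": 512, "batch_size": 1}},
        "6": {"class_type": "VAEDecode", "inputs": {}},    # latent -> pixels
        "7": {"class_type": "SaveImage", "inputs": {}},    # write images to disk
    }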

Node Programming - Connecting the nodes

ComfyUI Node Step 1

We can start connecting the block diagrams with pipelines.

Note: Either direction works, but if you want to connect many nodes to one, the recommended method is to drag from the block diagram's input (left side) to the other block diagram's output (right side) to minimise potential bugs and issues later on. See the part on connecting CLIP to the K-Sampler.

Let's start with the Load Checkpoint and K-Sampler block diagrams. Click on the MODEL output (the lavender circle); you will notice that only the model input is highlighted while the rest are greyed out (meaning they are not connectable).

Another point to take note of: pay attention to the colours of the input and output data types (only matching colours can be connected together).


Now, Load Checkpoint should be connected to K-Sampler through the MODEL pipeline.
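
In the API-format sketch above, a pipeline is simply a two-element list of [source node id, output slot index]. The MODEL wire we just dragged corresponds to a single assignment:

    # Point the K-Sampler's "model" input at output slot 0 (MODEL)
    # of the Load Checkpoint node ("1" in the sketch above).
    nodes["4"]["inputs"]["model"] = ["1", 0]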


Next, CLIP needs to be connected. As mentioned above, CLIP is the text encoder, so we can connect both CLIP Text Encode block diagrams.

However, before we start connecting those, rename the block diagrams to more appropriate names to prevent confusion. Right click on a block diagram and click on "Properties Panel". There should be Title, Mode, Color and Node name for S&R; change both Title and Node name for S&R to a more self-explanatory name, like POSITIVE CLIP for example.

Note: We want to distinguish which block diagram is for the positive prompt and which is for the negative prompt.

Connect the Load Checkpoint CLIP output to the clip input of both the positive and negative block diagrams (see the yellow pipelines). Lastly, connect each CONDITIONING output to the K-Sampler's positive and negative inputs (see the orange pipelines), according to the block diagram names.

You should have the connections as seen in the left screenshot by now.


Note: There are potential bugs when connecting from the CONDITIONING output data type to the positive or negative input data type, so reverse the sequence and connect from the positive/negative input to the CONDITIONING output.


As we can see in the Load Checkpoint block diagram, the VAE output is still unconnected.

We can click on the VAE output data type (red pipeline) to see what it can connect to; only the VAE Decode block diagram's vae input is highlighted.


Repeat the same in the VAE Decode block diagram for the samples input data type (pink pipeline), connecting it to the K-Sampler's LATENT output data type.


Finally, repeat the same for the VAE Decode IMAGE output data type (blue pipeline), connecting it to the images input data type of the Save Image block diagram.


You should now have a diagram similar to the one below, with only one connection left, between the K-Sampler and Empty Latent Image block diagrams, to make it functional.

Almost finished node connections

Connect the LATENT in Empty Latent Image to KSampler's latent_image to finish the workflow.

Congrats, you have finished the basic text-to-image workflow and should have a slightly better understanding of how this all works.
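
Putting the whole thing together in the API-format sketch, the finished workflow looks like the following. Every wire is a [source node id, output slot index] pair; the Load Checkpoint output slots are 0 = MODEL, 1 = CLIP, 2 = VAE, and the sampler settings shown are illustrative defaults, not values prescribed by this chapter.

    # All pipelines from the screenshots, expressed in API format.
    nodes["2"]["inputs"]["clip"] = ["1", 1]            # CLIP -> positive encoder
    nodes["3"]["inputs"]["clip"] = ["1", 1]            # CLIP -> negative encoder
    nodes["4"]["inputs"].update({
        "model": ["1", 0],                             # MODEL pipeline
        "positive": ["2", 0],                          # positive CONDITIONING
        "negative": ["3", 0],                          # negative CONDITIONING
        "latent_image": ["5", 0],                      # LATENT from Empty Latent Image
        "seed": 0, "steps": 20, "cfg": 8.0,
        "sampler_name": "euler", "scheduler": "normal", "denoise": 1.0,
    })
    nodes["6"]["inputs"].update({"samples": ["4", 0],  # LATENT from K-Sampler
                                 "vae": ["1", 2]})     # VAE pipeline
    nodes["7"]["inputs"].update({"images": ["6", 0],   # IMAGE pipeline
                                 "filename_prefix": "ComfyUI"})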


Node Programming - Generate Image


Try clicking the "Queue Prompt" button on the right sidebar. You may not notice it at first, but the KSampler node is suddenly highlighted in green, and then a green progress bar starts to move across it.

This means the AI art model is working.
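
Under the hood, "Queue Prompt" submits the workflow to ComfyUI's HTTP API. Here is a minimal sketch, assuming a default local install listening on 127.0.0.1:8188 and the nodes dict from the earlier sketches:

    import json
    import urllib.request

    # POST the workflow to ComfyUI's /prompt endpoint, much as the
    # browser does when "Queue Prompt" is clicked.
    req = urllib.request.Request(
        "http://127.0.0.1:8188/prompt",
        data=json.dumps({"prompt": nodes}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read()))  # includes the queued prompt_id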


To learn what each of the settings means, go to AI Art Generation Handbook/Stable Diffusion settings.

Improving ComfyUI


Save Prompt Names as Image Output File Names


As we generate images, they get the generic default name of ComfyUI_0001.png, but did you know that with some additional steps we can save the prompt as the filename in ComfyUI?

Here is how we do it very easily.

Naming prompts as filenames

On the Positive CLIP block diagram, right click to check its Node name for S&R.

Use the value you keyed in earlier (in this case "Positive CLIP") and type the following into filename_prefix inside the Save Image block diagram:

%Positive_CLIP.text% (refer to the picture on the left for clarity).

From now on, you will see the filenames of the saved images change accordingly, as long as the Save Image block diagram is not altered in any way.
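
In the API-format sketch, this is a one-line change on the Save Image node. It is an assumption here that the same %Node name.widget% search-and-replace is applied to workflows queued through the API:

    # Use the positive prompt text as the output filename prefix
    # (assumes S&R substitution also runs for API-queued workflows).
    nodes["7"]["inputs"]["filename_prefix"] = "%Positive_CLIP.text%"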

Improve Image Generation Quality


SDXL and later models are trained on 1024 x 1024 px images, so anything smaller than that will result in low-quality images, as shown in the experiments here.


Therefore, on the Empty Latent Image block diagram, you can adjust both width and height up to 1024 px to get more aesthetic image generations.
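
In the earlier API-format sketch, that is just raising the Empty Latent Image parameters to SDXL's native resolution:

    # Bump the latent resolution to SDXL's native 1024 x 1024
    # before queueing the prompt.
    nodes["5"]["inputs"]["width"] = 1024
    nodes["5"]["inputs"]["height"] = 1024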


