# Datatypes
These are the most important built-in datatypes. You can also define your own.

Datatypes are used on the client side to prevent a workflow from passing the wrong form of data into a node - a bit like strong typing. The JavaScript client-side code will generally not allow a node output to be connected to an input of a different datatype, although a few exceptions are noted below.
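To set the scene, here is a minimal sketch of where datatype names appear in a custom node definition: each input in `INPUT_TYPES` and each entry in `RETURN_TYPES` names a datatype, and the front end only permits connections between matching names. The node itself is hypothetical, not a built-in.

```python
class InvertImage:
    # Hypothetical example node; the class name, category, and method are illustrative.
    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {"image": ("IMAGE",)}}

    RETURN_TYPES = ("IMAGE",)  # an output can only connect to an input of the same datatype
    FUNCTION = "invert"
    CATEGORY = "example"

    def invert(self, image):
        # IMAGE tensors are [B, H, W, C] with values in 0..1 (see below)
        return (1.0 - image,)
```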
## COMBO

- No additional parameters in `INPUT_TYPES`
- Python datatype: defined as `list[str]`; the output value is a `str`

Represents a dropdown menu widget. Unlike other datatypes, COMBO is not specified in `INPUT_TYPES` by a `str`, but by a `list[str]` corresponding to the options in the dropdown list, with the first option selected by default.
COMBO inputs are often dynamically generated at run time. For instance, in the built-in `CheckpointLoaderSimple` node, you find:

```python
"ckpt_name": (folder_paths.get_filename_list("checkpoints"), )
```

or they might just be a fixed list of options:

```python
"play_sound": (["no","yes"], {}),
```
Primitive and reroute nodes only exist on the client side. They do not have an intrinsic datatype, but when connected they take on the datatype of the input or output to which they have been connected (which is why they can't connect to a `*` input...).
## INT

- Additional parameters in `INPUT_TYPES`:
  - `default` is required
  - `min` and `max` are optional
- Python datatype: `int`
## FLOAT

- Additional parameters in `INPUT_TYPES`:
  - `default` is required
  - `min`, `max`, and `step` are optional
- Python datatype: `float`
## STRING

- Additional parameters in `INPUT_TYPES`:
  - `default` is required
- Python datatype: `str`
## BOOLEAN

- Additional parameters in `INPUT_TYPES`:
  - `default` is required
- Python datatype: `bool`
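Taken together, here is a hedged sketch of how these primitive types and their parameters might appear in an `INPUT_TYPES` declaration (the parameter names are illustrative):

```python
@classmethod
def INPUT_TYPES(cls):
    # Illustrative parameter names; the option dicts use the keys described above.
    return {
        "required": {
            "iterations": ("INT", {"default": 1, "min": 1, "max": 100}),
            "strength": ("FLOAT", {"default": 0.5, "min": 0.0, "max": 1.0, "step": 0.01}),
            "prefix": ("STRING", {"default": ""}),
            "enabled": ("BOOLEAN", {"default": True}),
        }
    }
```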
## IMAGE

- No additional parameters in `INPUT_TYPES`
- Python datatype: `torch.Tensor` with shape [B,H,W,C]
A batch of B images, height H, width W, with C channels (generally C=3 for RGB).
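As a hedged sketch of that layout (this helper is illustrative, not a ComfyUI API), converting a PIL image into an IMAGE tensor looks like:

```python
import numpy as np
import torch
from PIL import Image

def pil_to_comfy_image(pil_image: Image.Image) -> torch.Tensor:
    # Convert to the IMAGE layout described above: [B, H, W, C], floats in 0..1.
    arr = np.array(pil_image.convert("RGB")).astype(np.float32) / 255.0
    return torch.from_numpy(arr).unsqueeze(0)  # add the batch dimension, B=1
```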
## LATENT

- No additional parameters in `INPUT_TYPES`
- Python datatype: `dict`, containing a `torch.Tensor` with shape [B,C,H,W]
The dict passed contains the key `samples`, which is a `torch.Tensor` with shape [B,C,H,W] representing a batch of B latents, with C channels (generally C=4 for existing stable diffusion models), height H, width W. The height and width are 1/8 of the corresponding image size (which is the value you set in the Empty Latent Image node). Other entries in the dictionary contain things like latent masks.
The LATENT dictionary may contain additional keys:

- `samples`: the main latent tensor (required)
- `batch_index`: list of indices for batch processing
- `noise_mask`: optional mask for inpainting operations
- `crop_coords`: tuple of (top, left, bottom, right) for cropped regions
- `original_size`: tuple of (height, width) for the original image dimensions
- `target_size`: tuple of (height, width) for the target output dimensions
Channel counts for different models:
- SD 1.x/2.x: 4 channels
- SDXL: 4 channels
- SD3: 16 channels
- Flux: 16 channels
- Cascade: 4 channels (stage A), 16 channels (stage B)
Example LATENT structure:

```python
latent = {
    "samples": torch.randn(1, 4, 64, 64),  # [B, C, H, W]
    "batch_index": [0],
    "noise_mask": None,
    "crop_coords": (0, 0, 512, 512),
    "original_size": (512, 512),
    "target_size": (512, 512)
}
```
## MASK

- No additional parameters in `INPUT_TYPES`
- Python datatype: `torch.Tensor` with shape [H,W] or [B,C,H,W]
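Because a MASK can arrive in either shape, code consuming one often normalises it first. A minimal sketch (an illustrative helper, not a ComfyUI API):

```python
import torch

def mask_to_batch(mask: torch.Tensor) -> torch.Tensor:
    # Bring a MASK to [B, C, H, W] from either of the shapes described above.
    if mask.dim() == 2:                        # [H, W]
        return mask.unsqueeze(0).unsqueeze(0)  # -> [1, 1, H, W]
    return mask                                # already [B, C, H, W]
```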
## AUDIO

- No additional parameters in `INPUT_TYPES`
- Python datatype: `dict`, containing a `torch.Tensor` with shape [B, C, T] and a sample rate
The dict passed contains the key `waveform`, which is a `torch.Tensor` with shape [B, C, T] representing a batch of B audio samples, with C channels (C=2 for stereo and C=1 for mono), and T time steps (i.e., the number of audio samples). The dict contains another key, `sample_rate`, which indicates the sampling rate of the audio.
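For illustration, a minimal AUDIO structure matching that description: one second of silent stereo audio at 44.1 kHz.

```python
import torch

audio = {
    "waveform": torch.zeros(1, 2, 44100),  # [B, C, T]: one stereo clip, 44100 samples
    "sample_rate": 44100,
}
```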
## NOISE

The NOISE datatype represents a source of noise (not the actual noise itself). It can be represented by any Python object that provides a method to generate noise, with the signature `generate_noise(self, input_latent:Tensor) -> Tensor`, and a property `seed:Optional[int]`.

The seed is passed into the sample guider in SamplerCustomAdvanced, but does not appear to be used in any of the standard guiders. It is Optional, so you can generally set it to None.

When noise is to be added, the latent is passed into the `generate_noise` method, which should return a Tensor of the same shape containing the noise. See the noise mixing example.
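A minimal sketch of an object satisfying that interface (assuming only the signature described above; this class is illustrative, not part of ComfyUI):

```python
from typing import Optional
import torch

class SeededNoise:
    # Illustrative NOISE source: Gaussian noise from an optional fixed seed.
    def __init__(self, seed: Optional[int] = None):
        self.seed = seed

    def generate_noise(self, input_latent: torch.Tensor) -> torch.Tensor:
        generator = torch.Generator()
        if self.seed is not None:
            generator.manual_seed(self.seed)
        # Return noise with the same shape as the latent it will be added to.
        noise = torch.randn(input_latent.shape, generator=generator)
        return noise.to(input_latent)
```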
## SAMPLER

The SAMPLER datatype represents a sampler: a Python object providing a `sample` method. Stable diffusion sampling is beyond the scope of this guide; see `comfy/samplers.py` if you want to dig into this part of the code.
## SIGMAS

The SIGMAS datatype represents the values of sigma before and after each step in the sampling process, as produced by a scheduler. This is represented as a one-dimensional tensor, of length steps+1, where each element represents the noise expected to be present before the corresponding step, with the final value representing the noise present after the final step.

A normal scheduler, with 20 steps and denoise of 1, for an SDXL model, produces:

```
tensor([14.6146, 10.7468,  8.0815,  6.2049,  4.8557,
         3.8654,  3.1238,  2.5572,  2.1157,  1.7648,
         1.4806,  1.2458,  1.0481,  0.8784,  0.7297,
         0.5964,  0.4736,  0.3555,  0.2322,  0.0292,  0.0000])
```

The starting value of sigma depends on the model, which is why a scheduler node requires a MODEL input to produce a SIGMAS output.
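Since each element is the noise level before the corresponding step, the noise removed at step i is the difference between adjacent elements. A small sketch using the first three values above:

```python
import torch

sigmas = torch.tensor([14.6146, 10.7468, 8.0815])  # first entries of the tensor above
per_step = sigmas[:-1] - sigmas[1:]  # noise removed by each step
print(per_step)  # tensor([3.8678, 2.6653])
```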
## GUIDER

A GUIDER is a generalisation of the denoising process, as 'guided' by a prompt or any other form of conditioning. In Comfy the guider is represented by a callable Python object providing a `__call__(*args, **kwargs)` method, which is called by the sampler. The `__call__` method takes (in `args[0]`) a batch of noisy latents (tensor [B,C,H,W]), and returns a prediction of the noise (a tensor of the same shape).
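As a hedged sketch of that calling convention (the wrapper class is illustrative, not a built-in guider), a guider that simply rescales another guider's prediction might look like:

```python
import torch

class ScaledGuider:
    # Illustrative wrapper obeying the GUIDER interface described above.
    def __init__(self, inner_guider, scale: float = 1.0):
        self.inner = inner_guider
        self.scale = scale

    def __call__(self, *args, **kwargs) -> torch.Tensor:
        noisy_latents = args[0]                   # tensor [B, C, H, W]
        prediction = self.inner(*args, **kwargs)  # noise prediction, same shape
        return prediction * self.scale
```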
## Model datatypes

There are a number of more technical datatypes for stable diffusion models. The most significant ones are MODEL, CLIP, VAE and CONDITIONING.
### MODEL

The MODEL datatype represents the main diffusion model (UNet). It contains:

- `model`: the actual PyTorch model instance
- `model_config`: configuration parameters for the model
- `model_options`: runtime options and settings
- `device`: target device (CPU/GPU) for model execution
```python
# Accessing model information
def get_model_info(model):
    config = model.model_config
    return {
        "model_type": config.unet_config.get("model_type", "unknown"),
        "in_channels": config.unet_config.get("in_channels", 4),
        "out_channels": config.unet_config.get("out_channels", 4),
        "attention_resolutions": config.unet_config.get("attention_resolutions", [])
    }
```

### CLIP

The CLIP datatype represents text encoder models:
- `cond_stage_model`: the text encoder model
- `tokenizer`: text tokenization functionality
- `layer_idx`: which layer to extract embeddings from
- `device`: target device for text encoding
```python
# Working with CLIP models
def encode_text_with_clip(clip, text):
    tokens = clip.tokenize(text)
    cond, pooled = clip.encode_from_tokens(tokens, return_pooled=True)
    return [[cond, {"pooled_output": pooled}]]
```

### VAE

The VAE datatype handles encoding/decoding between pixel and latent space:
- `first_stage_model`: the VAE model instance
- `device`: target device for VAE operations
- `dtype`: data type for VAE computations
- `memory_used_encode`: memory usage tracking for encoding
- `memory_used_decode`: memory usage tracking for decoding
```python
# VAE operations
def encode_with_vae(vae, image):
    # Image should be in [B, H, W, C] format, values 0-1
    latent = vae.encode(image)
    return {"samples": latent}

def decode_with_vae(vae, latent):
    # Latent should be in [B, C, H, W] format
    image = vae.decode(latent["samples"])
    return image
```

### CONDITIONING

Processed text embeddings and associated metadata:
- `cond`: the conditioning tensor from text encoding
- `pooled_output`: pooled text embeddings (for SDXL and newer models)
- `control`: additional control information for ControlNet
- `gligen`: GLIGEN positioning data
- `area`: conditioning area specifications
- `strength`: conditioning strength multiplier
- `set_area_to_bounds`: automatic area boundary setting
- `mask`: conditioning masks for regional prompting
```python
# Working with conditioning
def modify_conditioning(conditioning, strength=1.0):
    modified = []
    for cond in conditioning:
        new_cond = cond.copy()       # each element is [tensor, dict]
        new_cond[1] = cond[1].copy()
        new_cond[1]["strength"] = strength
        modified.append(new_cond)
    return modified
```

## Extra options

Below is a list of officially supported keys that can be used in the 'extra options' portion of an input definition.
You can use additional keys for your own custom widgets, but should not reuse any of the keys below for other purposes.
Display and UI Parameters:

- `tooltip`: hover text description for the input
- `serialize`: whether to serialize this input in saved workflows
- `round`: number of decimal places for float display (FLOAT inputs only)
- `display`: display format ("number", "slider", etc.)
- `control_after_generate`: whether to show control after generation

Validation Parameters:

- `min`: minimum allowed value (INT, FLOAT)
- `max`: maximum allowed value (INT, FLOAT)
- `step`: step size for sliders (INT, FLOAT)
- `multiline`: enable multiline text input (STRING)
- `dynamicPrompts`: enable dynamic prompt processing (STRING)

Behavior Parameters:

- `forceInput`: force this parameter to be an input socket
- `defaultInput`: mark as the default input for this node type
- `lazy`: enable lazy evaluation for this input
- `hidden`: hide this input from the UI (for internal parameters)

File and Path Parameters:

- `image_upload`: enable image upload widget
- `directory`: restrict to directory selection
- `extensions`: allowed file extensions list

Advanced Parameters:

- `control_after_generate`: show control widget after generation
- `affect_alpha`: whether changes affect alpha channel
- `key`: custom key for parameter storage
| Key | Description |
|---|---|
| `default` | The default value of the widget |
| `min` | The minimum value of a number (FLOAT or INT) |
| `max` | The maximum value of a number (FLOAT or INT) |
| `step` | The amount to increment or decrement a widget |
| `label_on` | The label to use in the UI when the bool is True (BOOL) |
| `label_off` | The label to use in the UI when the bool is False (BOOL) |
| `defaultInput` | Defaults to an input socket rather than a supported widget |
| `forceInput` | `defaultInput` and also don't allow converting to a widget |
| `multiline` | Use a multiline text box (STRING) |
| `placeholder` | Placeholder text to display in the UI when empty (STRING) |
| `dynamicPrompts` | Causes the front-end to evaluate dynamic prompts |
| `lazy` | Declares that this input uses Lazy Evaluation |
| `rawLink` | When a link exists, rather than receiving the evaluated value, you will receive the link (i.e. `["nodeId", <outputIndex>]`). Primarily useful when your node uses Node Expansion. |
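As a hedged sketch pulling several of these keys together (the parameter names are illustrative):

```python
@classmethod
def INPUT_TYPES(cls):
    # Hypothetical inputs demonstrating the extra option keys in the table above.
    return {
        "required": {
            "denoise": ("FLOAT", {"default": 1.0, "min": 0.0, "max": 1.0,
                                  "step": 0.01}),
            "prompt": ("STRING", {"multiline": True, "dynamicPrompts": True,
                                  "placeholder": "Enter a prompt..."}),
            "enabled": ("BOOLEAN", {"default": True,
                                    "label_on": "enabled", "label_off": "disabled"}),
            "latent": ("LATENT", {"rawLink": True}),
        }
    }
```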