Introduction

"I want to generate high-precision images that match my intent."
SD Visual Prompt Editor is a visual editor for Stable Diffusion prompts. Beyond simply managing prompts, it visualizes their structure so that you can refine them and improve generation accuracy.

Long prompts tend to run into the same problems: you cannot tell which tags are actually having an effect, duplicates or contradictions creep in, and tags near the end are ignored because the prompt exceeds the token limit.
By displaying the text as colored tag blocks, this tool lets you spot these problems at a glance and fix them with intuitive operations.

✨ Three Core Concepts

👁️ Structural Visualization

The editor treats a prompt as a set of meaningful tag blocks rather than a plain string of text. This makes duplicated descriptions and unnatural tag ordering, which are easy to miss in raw text, much easier to spot visually.
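
As a rough illustration of what treating a prompt as tag blocks can mean, the following minimal sketch splits a comma-separated prompt into tags and reports duplicates. The function names `split_tags` and `find_duplicates` are hypothetical and are not taken from the tool itself; the case-insensitive comparison is also an assumption.

```python
from collections import Counter


def split_tags(prompt: str) -> list[str]:
    """Split a comma-separated prompt into trimmed, non-empty tags."""
    return [part.strip() for part in prompt.split(",") if part.strip()]


def find_duplicates(tags: list[str]) -> list[str]:
    """Return tags that occur more than once (assumption: case-insensitive match)."""
    counts = Counter(tag.lower() for tag in tags)
    return [tag for tag, n in counts.items() if n > 1]


prompt = "1girl, long hair, smile, Long Hair, outdoors"
tags = split_tags(prompt)
print(tags)                   # the individual tag blocks
print(find_duplicates(tags))  # tags that appear more than once
```

Once the prompt is a list rather than a string, checks such as duplicate detection or reordering become simple list operations, which is the idea behind the block view.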

🎨 Frequency Identification

Based on the Danbooru tag dataset, the editor uses color to show how often each tag appears in the training data (its popularity).

⚠️ Important: The coloring only reflects a lookup against the Danbooru tag dictionary.
Because Stable Diffusion can also interpret natural language through CLIP (its text encoder), an uncolored tag is not necessarily one the AI cannot understand.

*A mechanism to check and visualize whether a token exists in the CLIP vocabulary is planned for a future release.
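
The dictionary lookup described above could be sketched as follows. Everything specific here is an assumption made for illustration: the `TAG_COUNTS` values are invented placeholder numbers (not real Danbooru statistics), the thresholds and color names are arbitrary, and `frequency_color` is a hypothetical helper rather than the tool's actual function.

```python
# Placeholder excerpt of a tag dictionary: tag name -> post count.
# The numbers are invented for this example, not real statistics.
TAG_COUNTS = {
    "1girl": 5_000_000,
    "long_hair": 3_000_000,
    "cowboy_shot": 400_000,
    "heterochromia": 80_000,
}


def frequency_color(tag: str, counts: dict[str, int] = TAG_COUNTS) -> str:
    """Map a tag to a color bucket by how often it appears in the dictionary."""
    # Assumption: dictionary keys are lowercase with underscores instead of spaces.
    key = tag.strip().lower().replace(" ", "_")
    count = counts.get(key)
    if count is None:
        return "none"    # not in the dictionary; CLIP may still understand it
    if count >= 1_000_000:
        return "red"     # very common tag (arbitrary threshold)
    if count >= 100_000:
        return "orange"  # moderately common tag (arbitrary threshold)
    return "blue"        # rare tag


for tag in ["Long Hair", "heterochromia", "a castle on a hill"]:
    print(tag, "->", frequency_color(tag))
```

Note that the natural-language phrase falls into the "none" bucket only because it is missing from the dictionary, which mirrors the caveat above.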

📏 Token Optimization

Displaying a separator line every 75 tokens helps keep important parts of the prompt from being ignored and guides you toward a structure that the AI can interpret more easily.

💡 Why 75 tokens?
Stable Diffusion (CLIP) processes a prompt in chunks of 75 tokens.
When a prompt crosses a chunk boundary, the association between words on either side can weaken, and elements in the later part may be reflected less strongly in the image.
Keeping this line in mind and placing the elements you cannot do without toward the front of the prompt tends to make results more stable.

👉 Click here for a detailed explanation of "chunking" and how it prevents image quality collapse
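
To make the 75-token line concrete, here is a minimal sketch that assigns each tag to a chunk. It does not use a real CLIP tokenizer: `estimate_tokens` is a hypothetical stand-in that simply counts words and punctuation marks, so its counts will differ from the real tokenizer. The rule of one token per separating comma is likewise an assumption.

```python
import re

CHUNK_SIZE = 75  # prompt tokens per chunk, as described above


def estimate_tokens(tag: str) -> int:
    """Rough stand-in for a CLIP tokenizer: count words and punctuation marks."""
    return len(re.findall(r"\w+|[^\w\s]", tag))


def assign_chunks(tags: list[str], chunk_size: int = CHUNK_SIZE) -> list[tuple[str, int, int]]:
    """Return (tag, start_token, chunk_index) for each tag in order."""
    placed = []
    position = 0
    for tag in tags:
        placed.append((tag, position, position // chunk_size))
        position += estimate_tokens(tag) + 1  # +1 for the comma after the tag (assumption)
    return placed


tags = [f"tag{i}" for i in range(40)]
for tag, start, chunk in assign_chunks(tags):
    if chunk > 0:
        print(f"first tag past the separator: {tag} (starts at token {start})")
        break
```

A separator line would be drawn before the first tag whose chunk index changes; tags that must not be weakened can then be moved ahead of it.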

🚀 Quick Start (Usage Cycle)

This tool is intended to be used together with a WebUI (such as Automatic1111).

Input (Import):
Copy an existing prompt from the WebUI and paste it into this tool's "Prompt Editor".

Visualization (Convert):
Click the "⬇️ Reflect in Visual Editor" button in the center of the screen. The text is parsed and expanded into tags.

Refinement (Refine):
Rework the composition by dragging and dropping tags to reorder them, or by adding missing tags from the search palette.

Use (Export):
Click the "Copy as Prompt" button, paste the revised prompt back into the WebUI, and generate an image.
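
The four steps above amount to a round trip from text to tags and back. The sketch below models that cycle under simple assumptions (comma-separated tags, no weights or special syntax); `to_tags`, `move_tag`, and `to_prompt` are hypothetical names, not the tool's API.

```python
def to_tags(prompt: str) -> list[str]:
    """Convert: parse pasted prompt text into a list of tags."""
    return [part.strip() for part in prompt.split(",") if part.strip()]


def move_tag(tags: list[str], tag: str, new_index: int) -> list[str]:
    """Refine: return a copy of the list with one tag moved to a new position."""
    reordered = list(tags)
    reordered.remove(tag)  # raises ValueError if the tag is not present
    reordered.insert(new_index, tag)
    return reordered


def to_prompt(tags: list[str]) -> str:
    """Export: join the tags back into prompt text for the WebUI."""
    return ", ".join(tags)


pasted = "outdoors, smile, 1girl"          # Import: text copied from the WebUI
tags = to_tags(pasted)                      # Convert
tags = move_tag(tags, "1girl", 0)           # Refine: move the key tag to the front
print(to_prompt(tags))                      # Export: text to paste back
```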

The following pages explain how to use each feature in detail.
Go to 1. Basic Display / Mode Switching ➡️