- Download the latest release from here
- Right-click the downloaded file and select `Run` or `Open`
- Wait for the application to start
(The Windows executable triggers false positives in some antivirus software. If you don't trust the pre-built binary, you can build the executable yourself.)
- Create a virtual environment and activate it
python3 -m venv venv
For Windows:
venv\Scripts\activate
For macOS and Linux:
source venv/bin/activate
- Install the requirements
pip install -r requirements.txt
- Run the application
python gui.py
It may take a while to initialize the application for the first time.
- Copy an image to your clipboard or select a file
- Click the `load image from clipboard` or `load image from file` button
- Click the `analyze image` button or press the keybinding
- The tags will be displayed in a new window (the first analysis takes a while because the pre-trained model has to be downloaded)
- You can copy the tags to your clipboard by clicking the `copy tags to clipboard` button

Extra:
- Checking `Unload model after every analysis` can save some memory, but each analysis will take longer
- You can choose the tag format; `Booru` and `Stable Diffusion` formats are currently supported
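
As a rough illustration of what the two tag formats could look like, here is a minimal sketch. It is an assumption, not the code in `gui.py`: it assumes the Booru format keeps underscores and separates tags with spaces, while the Stable Diffusion format replaces underscores with spaces, escapes parentheses, and separates tags with commas. The function name `format_tags` and the style names are hypothetical.

```python
def format_tags(tags: list[str], style: str = "booru") -> str:
    """Join tags in one of two assumed output styles (hypothetical helper)."""
    if style == "booru":
        # Assumed Booru style: underscores kept, tags separated by spaces.
        return " ".join(tags)
    if style == "stable_diffusion":
        # Assumed prompt style: spaces instead of underscores, and
        # parentheses escaped so they are not read as prompt weighting.
        cleaned = [
            tag.replace("_", " ").replace("(", r"\(").replace(")", r"\)")
            for tag in tags
        ]
        return ", ".join(cleaned)
    raise ValueError(f"unknown tag style: {style}")


example_tags = ["red_eyes", "cat_(animal)"]
booru_text = format_tags(example_tags, "booru")
sd_text = format_tags(example_tags, "stable_diffusion")
print(booru_text)
print(sd_text)
```

The actual output of the application may differ in separators or escaping; check the copied tags to confirm.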
After the first run, a `config.ini` file will be created in the same directory as the script. You can change the configuration there.
[GUI]
shortcut = Ctrl+Shift+I
unload_model_when_done = False
tag_format = booru
[Tagger]
model = wd-swinv2-v3
threshold = 0.35
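
If you want to inspect or validate these settings from your own script, the file can be parsed with Python's standard `configparser` module. The sketch below parses an inline copy of the sample above rather than the real file; the helper name `load_settings` is hypothetical, and this is not necessarily how the application itself reads its configuration.

```python
import configparser

# Inline copy of the sample config.ini shown above.
SAMPLE_CONFIG = """
[GUI]
shortcut = Ctrl+Shift+I
unload_model_when_done = False
tag_format = booru

[Tagger]
model = wd-swinv2-v3
threshold = 0.35
"""


def load_settings(text: str) -> dict:
    """Parse the INI text and return typed values (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "shortcut": parser.get("GUI", "shortcut"),
        # getboolean accepts values such as True/False, yes/no, on/off, 1/0.
        "unload_model_when_done": parser.getboolean("GUI", "unload_model_when_done"),
        "tag_format": parser.get("GUI", "tag_format"),
        "model": parser.get("Tagger", "model"),
        "threshold": parser.getfloat("Tagger", "threshold"),
    }


settings = load_settings(SAMPLE_CONFIG)
print(settings)
```

To read the real file instead, call `parser.read("config.ini")` in place of `read_string`.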
The default model is `wd-swinv2-v3`. I also recommend these models:
- `wd-swinv2-v3` (default, with good overall performance)
- `wd-convnext-v3` (might handle rotated images better than other models)
- `wd-vit-v3` (good at character recognition)
- `wd14-moat-v2` (in case you want to use the old model)
The default confidence threshold is `0.35`. Lower it if you want more tags, at the cost of accuracy.
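
To make the effect of the threshold concrete, here is a minimal sketch of threshold filtering. The scores are made up for illustration, `filter_tags` is a hypothetical helper, and the `>=` comparison is an assumption; the application may compare or order tags differently.

```python
def filter_tags(scores: dict[str, float], threshold: float = 0.35) -> list[str]:
    """Keep tags whose confidence reaches the threshold, highest first."""
    kept = [(tag, score) for tag, score in scores.items() if score >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return [tag for tag, _ in kept]


# Made-up confidences, not real model output.
example_scores = {"1girl": 0.98, "outdoors": 0.41, "hat": 0.30, "tree": 0.12}

default_tags = filter_tags(example_scores)                   # threshold 0.35
more_tags = filter_tags(example_scores, threshold=0.25)      # lower threshold
print(default_tags)
print(more_tags)
```

With these example scores, lowering the threshold from 0.35 to 0.25 adds `hat`, a tag the model is less sure about.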
- Users in mainland China might have trouble downloading the model from Hugging Face
- On macOS, the keybinding works when the script is executed from an IDE (e.g. PyCharm or VSCode), but not from a terminal. It also requires you to trust the IDE in `System Preferences -> Security & Privacy -> Privacy -> Input Monitoring` (not a safe practice, use at your own risk)
- Switching the keybinding through the GUI crashes on macOS (not sure why)
Original code by https://github.com/picobyte/stable-diffusion-webui-wd14-tagger
Public domain, except borrowed parts (e.g. `dbimutils.py`)