Compare commits

...

37 Commits

SHA1 Message Date
181ab5a8e7 Optimize /api/similar
2024-12-29 18:10:15 +01:00
fd192310c7 Add a forget function to dispose of orphans
Previously, there was no way of removing images from the database.
2024-12-29 16:22:50 +01:00
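A sketch of the resulting usage, inferred from cmdForget in the main.go diff
below (the database directory is G, the hash is an illustrative placeholder):
$ ./gallery forget G da39a3ee5e6b4b0d3255bfef95601890afd80709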
b73e0b4622 Order orphans by path
It costs more cycles, but the implicit SHA-1 ordering they had before
is pseudo-random.
2024-12-29 14:47:11 +01:00
0530c5d95f Fix /api/orphans with removed parent nodes 2024-12-29 14:17:07 +01:00
ce2e58b6bc Fix extremely slow removals 2024-12-29 13:41:07 +01:00
ca462ac005 Remember to optimize the database 2024-12-29 12:32:44 +01:00
e895beadb7 Add a check option to garbage collect DB files
2024-12-21 12:18:54 +01:00
615af97043 Add a sync option to exclude paths by regexp 2024-12-21 11:12:00 +01:00
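For example (a sketch based on the -exclude flag added to main.go below;
the regular expression is illustrative):
$ ./gallery sync -exclude '/\.thumbnails/' G ~/Pictures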
595db869e5 Add .gitignore 2024-12-21 09:38:44 +01:00
537b48dc22 deeptagger: flush batches
So that crashes do not disturb the output as much.
2024-12-14 22:56:26 +01:00
2c09745a9f deeptagger: fix README.adoc instructions
The images are under normal circumstances all symlinks.

What we're actually trying to express is `-not -type d`,
however that is not completely portable.
2024-12-08 22:16:11 +01:00
beb7c5e337 gallery: create DB directory in initialization
So that README.adoc instructions actually work.
2024-12-08 22:09:11 +01:00
19705527a0 Cleanup 2024-02-13 15:44:42 +01:00
9e22bd0e20 gallery: improve the README
2024-01-27 18:30:57 +01:00
d27d8655bb gallery: make it reverse proxy friendly 2024-01-27 18:09:48 +01:00
6d75ec60bf gallery: go back to ImageMagick v6
To cater to Debian.
2024-01-27 18:09:07 +01:00
84a94933b3 gallery: make it possible to collapse tag spaces 2024-01-23 11:30:55 +01:00
5e0e9f8a42 gallery: clean up, search in a transaction 2024-01-22 19:52:37 +01:00
083739fd4e gallery: implement AND/NOT for tag search 2024-01-22 19:52:35 +01:00
4f174972e3 gallery: move out a query from CTE 2024-01-22 19:51:13 +01:00
f9f22ba42c gallery: optimize the related tags query 2024-01-22 15:06:53 +01:00
7300773b96 deeptagger: give up on Windows altogether
ORT requires MSVC to build.  MSVC can't use MSYS2 libraries.

CMake + clang-cl doesn't work in MSYS2.

GraphicsMagick doesn't provide development packages for Windows.

There is no getopt on Windows.
2024-01-21 17:47:56 +01:00
05c3687ab1 gallery: use AsciiDoc for the README 2024-01-21 12:43:27 +01:00
aa65466a49 deeptagger: add an example of how to use it
And refer to CAFormer correctly.
2024-01-21 12:43:27 +01:00
454cfd688c gallery: document IM version requirement 2024-01-21 11:23:16 +01:00
1e3800cc16 deeptagger: fix Caformer
By using the smaller resolution, it starts noticing 2girls;
otherwise the output appears similar.
2024-01-21 10:38:46 +01:00
d37e9e821a gallery: try to avoid OOM while thumbnailing 2024-01-20 19:58:42 +01:00
de0dc58b99 gallery: don't be silent about signalled children 2024-01-20 18:35:29 +01:00
059825f169 gallery: add dhashes in one big DB transaction
iotop showed gigabytes of writes for a DB on the order of 100 MB.
2024-01-20 18:15:12 +01:00
4b054ac9cc deeptagger: swap columns to match 'gallery tag' 2024-01-20 18:00:28 +01:00
4131bc5d31 Add benchmarks against WDMassTagger 2024-01-19 20:02:24 +01:00
fd5e3bb166 Add CoreML benchmarks 2024-01-19 15:33:49 +01:00
77e988365d Add some benchmarks and information 2024-01-18 18:31:10 +01:00
8df76dbaab Make consistent batches a simple edit 2024-01-18 18:31:10 +01:00
819d2d80e0 Limit concurrency to number of hardware threads 2024-01-18 18:31:10 +01:00
36f6612603 Load images in multiple threads
This worsens CPU-only times by some five percent,
but can also make GPU-accelerated runtime twice as fast.
2024-01-18 18:31:10 +01:00
b4f28814b7 Add a deep tagger in C++ 2024-01-18 18:31:09 +01:00
16 changed files with 1794 additions and 169 deletions

.gitignore vendored Normal file (11 lines changed)

@@ -0,0 +1,11 @@
/gallery
/initialize.go
/public/mithril.js
/gallery.cflags
/gallery.config
/gallery.creator
/gallery.creator.user
/gallery.cxxflags
/gallery.files
/gallery.includes

LICENSE

@@ -1,4 +1,4 @@
Copyright (c) 2023, Přemysl Eric Janouch <p@janouch.name>
Copyright (c) 2023 - 2024, Přemysl Eric Janouch <p@janouch.name>
Permission to use, copy, modify, and/or distribute this software for any
purpose with or without fee is hereby granted.

README (14 lines changed)

@@ -1,14 +0,0 @@
This is gallery software designed to maintain a shadow structure
of your filesystem, in which you can attach metadata to your media,
and query your collections in various ways.
All media is content-addressed by its SHA-1 hash value, and at your option
also perceptually hashed. Duplicate search is an essential feature.
Prerequisites: Go, ImageMagick, xdg-utils
The gallery is designed for simplicity, and easy interoperability.
sqlite3, curl, jq, and the filesystem will take you a long way.
The intended mode of use is running daily automated sync/thumbnail/dhash/tag
batches in a cron job, or from a system timer. See test.sh for usage hints.

README.adoc Normal file (39 lines changed)

@@ -0,0 +1,39 @@
gallery
=======
This is gallery software designed to maintain a shadow structure
of your filesystem, in which you can attach metadata to your media,
and query your collections in various ways.
All media is content-addressed by its SHA-1 hash value, and at your option
also perceptually hashed. Duplicate search is an essential feature.
The gallery is designed for simplicity, and easy interoperability.
sqlite3, curl, jq, and the filesystem will take you a long way.
Prerequisites: Go, ImageMagick, xdg-utils
ImageMagick v7 is preferred, as it doesn't run out of memory as often.
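For instance, once the web interface from the commands below is running,
orphaned images can be listed over the JSON API (a sketch: the /api/orphans
path matches a handler in main.go, but the HTTP method is an assumption):
$ curl -s -X POST http://localhost:8080/api/orphans | jq .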
Getting it to work
------------------
# apt install build-essential git golang imagemagick xdg-utils
$ git clone https://git.janouch.name/p/gallery.git
$ cd gallery
$ make
$ ./gallery init G
$ ./gallery sync G ~/Pictures
$ ./gallery thumbnail G # parallelized, with memory limits
$ ./gallery -threads 1 thumbnail G # one thread only gets more memory
$ ./gallery dhash G
$ ./gallery web G :8080
The intended mode of use is running daily automated sync/thumbnail/dhash/tag
batches in a cron job, or from a systemd timer.
The _web_ command needs to see the _public_ directory,
and is friendly to reverse proxying.
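A minimal sketch of such a daily batch as a crontab entry
(assuming the repository from the commands above lives in ~/gallery):
@daily cd ~/gallery && ./gallery sync G ~/Pictures && ./gallery thumbnail G && ./gallery dhash G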
Demo
----
https://holedigging.club/gallery/

deeptagger/CMakeLists.txt Normal file (20 lines changed)

@@ -0,0 +1,20 @@
# Ubuntu 20.04 LTS
cmake_minimum_required (VERSION 3.16)
project (deeptagger VERSION 0.0.1 LANGUAGES CXX)
# Hint: set ONNXRuntime_ROOT to a directory with a pre-built GitHub release.
# (Useful for development, otherwise you may need to adjust the rpath.)
set (CMAKE_MODULE_PATH "${PROJECT_SOURCE_DIR}")
find_package (ONNXRuntime REQUIRED)
find_package (PkgConfig REQUIRED)
pkg_check_modules (GM REQUIRED GraphicsMagick++)
add_executable (deeptagger deeptagger.cpp)
target_compile_features (deeptagger PRIVATE cxx_std_17)
target_include_directories (deeptagger PRIVATE
${GM_INCLUDE_DIRS} ${ONNXRuntime_INCLUDE_DIRS})
target_link_directories (deeptagger PRIVATE
${GM_LIBRARY_DIRS})
target_link_libraries (deeptagger PRIVATE
${GM_LIBRARIES} ${ONNXRuntime_LIBRARIES})

deeptagger/FindONNXRuntime.cmake Normal file (11 lines changed)

@@ -0,0 +1,11 @@
# Public Domain
find_path (ONNXRuntime_INCLUDE_DIRS onnxruntime_c_api.h
PATH_SUFFIXES onnxruntime)
find_library (ONNXRuntime_LIBRARIES NAMES onnxruntime)
include (FindPackageHandleStandardArgs)
FIND_PACKAGE_HANDLE_STANDARD_ARGS (ONNXRuntime DEFAULT_MSG
ONNXRuntime_INCLUDE_DIRS ONNXRuntime_LIBRARIES)
mark_as_advanced (ONNXRuntime_LIBRARIES ONNXRuntime_INCLUDE_DIRS)

deeptagger/README.adoc Normal file (238 lines changed)

@@ -0,0 +1,238 @@
deeptagger
==========
This is an automatic image tagger/classifier written in C++,
primarily targeting various anime models.
Unfortunately, you will still need Python 3, as well as some luck, to prepare
the models by running download.sh. You will need about 20 gigabytes of space
for this operation.
"WaifuDiffusion v1.4" models are officially distributed with ONNX model exports
that do not support symbolic batch sizes. The script attempts to fix this
by running custom exports.
You're invited to change things to suit your particular needs.
Getting it to work
------------------
To build the evaluator, install a C++ compiler, CMake, and development packages
of GraphicsMagick and ONNX Runtime.
Prebuilt ONNX Runtime can be most conveniently downloaded from
https://github.com/microsoft/onnxruntime/releases[GitHub releases].
Remember to also install CUDA packages, such as _nvidia-cudnn_ on Debian,
if you plan on using the GPU-enabled options.
$ cmake -DONNXRuntime_ROOT=/path/to/onnxruntime -B build
$ cmake --build build
$ ./download.sh
$ build/deeptagger models/deepdanbooru-v3-20211112-sgd-e28.model image.jpg
The project requires a POSIX-compatible system to build.
Options
-------
--batch 1::
This program makes use of batches by decoding and preparing multiple images
in parallel before sending them off to models.
Batching requires appropriate models.
--cpu::
Force CPU inference, which is usually extremely slow.
--debug::
Increase verbosity.
--options "CUDAExecutionProvider;device_id=0"::
Set various ONNX Runtime execution provider options.
--pipe::
Take input filenames from the standard input.
--threshold 0.1::
Output weight threshold. Needs to be set higher on ML-Danbooru models.
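For example, to run on the second CUDA device with a higher threshold
(a sketch: device_id is a standard ONNX Runtime CUDA provider key,
and the model path follows what download.sh produces):
$ build/deeptagger --options 'CUDAExecutionProvider;device_id=1' \
--threshold 0.5 models/wd-v1-4-vit-tagger-v2.model image.jpg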
Tagging galleries
-----------------
The appropriate invocation depends on your machine and the chosen model.
Unless you have a powerful machine or use a fast model, it may take forever.
$ find "$GALLERY/images" -type l \
| build/deeptagger --pipe -b 16 -t 0.5 \
models/ml_caformer_m36_dec-5-97527.model \
| sed 's|[^\t]*/||' \
| gallery tag "$GALLERY" caformer "ML-Danbooru CAFormer"
Model benchmarks (Linux)
------------------------
These were measured with ORT 1.16.3 on a machine with a GeForce RTX 4090 (24G)
and a Ryzen 9 7950X3D (32 threads), on a sample of 704 images;
the full benchmark run took over eight hours. Times include model loading.
There is room for further performance tuning.
GPU inference
~~~~~~~~~~~~~
[cols="<,>,>", options=header]
|===
|Model|Batch size|Time
|WD v1.4 ViT v2 (batch)|16|19 s
|DeepDanbooru|16|21 s
|WD v1.4 SwinV2 v2 (batch)|16|21 s
|ML-Danbooru CAFormer dec-5-97527|16|25 s
|WD v1.4 ViT v2 (batch)|4|27 s
|WD v1.4 SwinV2 v2 (batch)|4|30 s
|DeepDanbooru|4|31 s
|ML-Danbooru TResNet-D 6-30000|16|31 s
|WD v1.4 MOAT v2 (batch)|16|31 s
|WD v1.4 ConvNeXT v2 (batch)|16|32 s
|ML-Danbooru CAFormer dec-5-97527|4|32 s
|WD v1.4 ConvNeXTV2 v2 (batch)|16|36 s
|ML-Danbooru TResNet-D 6-30000|4|39 s
|WD v1.4 ConvNeXT v2 (batch)|4|39 s
|WD v1.4 MOAT v2 (batch)|4|39 s
|WD v1.4 ConvNeXTV2 v2 (batch)|4|43 s
|WD v1.4 ViT v2|1|43 s
|WD v1.4 ViT v2 (batch)|1|43 s
|ML-Danbooru CAFormer dec-5-97527|1|52 s
|DeepDanbooru|1|53 s
|WD v1.4 MOAT v2|1|53 s
|WD v1.4 ConvNeXT v2|1|54 s
|WD v1.4 MOAT v2 (batch)|1|54 s
|WD v1.4 SwinV2 v2|1|54 s
|WD v1.4 SwinV2 v2 (batch)|1|54 s
|WD v1.4 ConvNeXT v2 (batch)|1|56 s
|WD v1.4 ConvNeXTV2 v2|1|56 s
|ML-Danbooru TResNet-D 6-30000|1|58 s
|WD v1.4 ConvNeXTV2 v2 (batch)|1|58 s
|===
CPU inference
~~~~~~~~~~~~~
[cols="<,>,>", options=header]
|===
|Model|Batch size|Time
|DeepDanbooru|16|45 s
|DeepDanbooru|4|54 s
|DeepDanbooru|1|88 s
|ML-Danbooru TResNet-D 6-30000|4|139 s
|ML-Danbooru TResNet-D 6-30000|16|162 s
|ML-Danbooru TResNet-D 6-30000|1|167 s
|WD v1.4 ConvNeXT v2|1|208 s
|WD v1.4 ConvNeXT v2 (batch)|4|226 s
|WD v1.4 ConvNeXT v2 (batch)|16|238 s
|WD v1.4 ConvNeXTV2 v2|1|245 s
|WD v1.4 ConvNeXTV2 v2 (batch)|4|268 s
|WD v1.4 ViT v2 (batch)|16|270 s
|ML-Danbooru CAFormer dec-5-97527|4|270 s
|WD v1.4 ConvNeXT v2 (batch)|1|272 s
|WD v1.4 SwinV2 v2 (batch)|4|277 s
|WD v1.4 ViT v2 (batch)|4|277 s
|WD v1.4 ConvNeXTV2 v2 (batch)|16|294 s
|WD v1.4 SwinV2 v2 (batch)|1|300 s
|WD v1.4 SwinV2 v2|1|302 s
|WD v1.4 SwinV2 v2 (batch)|16|305 s
|ML-Danbooru CAFormer dec-5-97527|16|305 s
|WD v1.4 MOAT v2 (batch)|4|307 s
|WD v1.4 ViT v2|1|308 s
|WD v1.4 ViT v2 (batch)|1|311 s
|WD v1.4 ConvNeXTV2 v2 (batch)|1|312 s
|WD v1.4 MOAT v2|1|332 s
|WD v1.4 MOAT v2 (batch)|16|335 s
|WD v1.4 MOAT v2 (batch)|1|339 s
|ML-Danbooru CAFormer dec-5-97527|1|352 s
|===
Model benchmarks (macOS)
------------------------
These were measured with ORT 1.16.3 on a MacBook Pro, M1 Pro (16GB),
macOS Ventura 13.6.2, on a sample of 179 images. Times include model loading.
There was often significant memory pressure and swapping,
which may explain some of the anomalies. CoreML often makes things worse,
and generally consumes a lot more memory than pure CPU execution.
The kernel panic was repeatable.
GPU inference
~~~~~~~~~~~~~
[cols="<2,>1,>1", options=header]
|===
|Model|Batch size|Time
|DeepDanbooru|1|24 s
|DeepDanbooru|8|31 s
|DeepDanbooru|4|33 s
|WD v1.4 SwinV2 v2 (batch)|4|71 s
|WD v1.4 SwinV2 v2 (batch)|1|76 s
|WD v1.4 ViT v2 (batch)|4|97 s
|WD v1.4 ViT v2 (batch)|8|97 s
|ML-Danbooru TResNet-D 6-30000|8|100 s
|ML-Danbooru TResNet-D 6-30000|4|101 s
|WD v1.4 ViT v2 (batch)|1|105 s
|ML-Danbooru TResNet-D 6-30000|1|125 s
|WD v1.4 ConvNeXT v2 (batch)|8|126 s
|WD v1.4 SwinV2 v2 (batch)|8|127 s
|WD v1.4 ConvNeXT v2 (batch)|4|128 s
|WD v1.4 ConvNeXTV2 v2 (batch)|8|132 s
|WD v1.4 ConvNeXTV2 v2 (batch)|4|133 s
|WD v1.4 ViT v2|1|146 s
|WD v1.4 ConvNeXT v2 (batch)|1|149 s
|WD v1.4 ConvNeXTV2 v2 (batch)|1|160 s
|WD v1.4 MOAT v2 (batch)|1|165 s
|WD v1.4 SwinV2 v2|1|166 s
|ML-Danbooru CAFormer dec-5-97527|1|263 s
|WD v1.4 ConvNeXT v2|1|273 s
|WD v1.4 MOAT v2|1|273 s
|WD v1.4 ConvNeXTV2 v2|1|340 s
|ML-Danbooru CAFormer dec-5-97527|4|445 s
|ML-Danbooru CAFormer dec-5-97527|8|1790 s
|WD v1.4 MOAT v2 (batch)|4|kernel panic
|===
CPU inference
~~~~~~~~~~~~~
[cols="<2,>1,>1", options=header]
|===
|Model|Batch size|Time
|DeepDanbooru|8|54 s
|DeepDanbooru|4|55 s
|DeepDanbooru|1|75 s
|WD v1.4 SwinV2 v2 (batch)|8|93 s
|WD v1.4 SwinV2 v2 (batch)|4|94 s
|ML-Danbooru TResNet-D 6-30000|8|97 s
|WD v1.4 SwinV2 v2 (batch)|1|98 s
|ML-Danbooru TResNet-D 6-30000|4|99 s
|WD v1.4 SwinV2 v2|1|99 s
|ML-Danbooru CAFormer dec-5-97527|4|110 s
|ML-Danbooru CAFormer dec-5-97527|8|110 s
|WD v1.4 ViT v2 (batch)|4|111 s
|WD v1.4 ViT v2 (batch)|8|111 s
|WD v1.4 ViT v2 (batch)|1|113 s
|WD v1.4 ViT v2|1|113 s
|ML-Danbooru TResNet-D 6-30000|1|118 s
|ML-Danbooru CAFormer dec-5-97527|1|122 s
|WD v1.4 ConvNeXT v2 (batch)|8|124 s
|WD v1.4 ConvNeXT v2 (batch)|4|125 s
|WD v1.4 ConvNeXTV2 v2 (batch)|8|129 s
|WD v1.4 ConvNeXT v2|1|130 s
|WD v1.4 ConvNeXTV2 v2 (batch)|4|131 s
|WD v1.4 MOAT v2 (batch)|8|134 s
|WD v1.4 ConvNeXTV2 v2|1|136 s
|WD v1.4 MOAT v2 (batch)|4|136 s
|WD v1.4 ConvNeXT v2 (batch)|1|146 s
|WD v1.4 MOAT v2 (batch)|1|156 s
|WD v1.4 MOAT v2|1|156 s
|WD v1.4 ConvNeXTV2 v2 (batch)|1|157 s
|===
Comparison with WDMassTagger
----------------------------
Measured using CUDA, on the same Linux computer as above, on a sample of
6352 images; times are given as minutes:seconds. We're a bit slower,
depending on the model. Batch sizes of 16 and 32 give practically
equivalent results for both tools.
[cols="<,>,>,>", options="header,autowidth"]
|===
|Model|WDMassTagger|deeptagger (batch)|Ratio
|wd-v1-4-convnext-tagger-v2 |1:18 |1:55 |68 %
|wd-v1-4-convnextv2-tagger-v2 |1:20 |2:10 |62 %
|wd-v1-4-moat-tagger-v2 |1:22 |1:52 |73 %
|wd-v1-4-swinv2-tagger-v2 |1:28 |1:34 |94 %
|wd-v1-4-vit-tagger-v2 |1:16 |1:22 |93 %
|===

deeptagger/bench-interpret.sh Executable file (51 lines changed)

@@ -0,0 +1,51 @@
#!/bin/sh -e
parse() {
awk 'BEGIN {
OFS = FS = "\t"
} {
name = $1
path = $2
cpu = $3 != ""
batch = $4
time = $5
if (path ~ "/batch-")
name = name " (batch)"
else if (name ~ /^WD / && batch > 1)
next
} {
group = name FS cpu FS batch
if (lastgroup != group) {
if (lastgroup)
print lastgroup, mintime
lastgroup = group
mintime = time
} else {
if (mintime > time)
mintime = time
}
} END {
print lastgroup, mintime
}' "${BENCH_LOG:-bench.out}"
}
cat <<END
GPU inference
~~~~~~~~~~~~~
[cols="<,>,>", options=header]
|===
|Model|Batch size|Time
$(parse | awk -F'\t' 'BEGIN { OFS = "|" }
!$2 { print "", $1, $3, $4 " s" }' | sort -t'|' -nk4)
|===
CPU inference
~~~~~~~~~~~~~
[cols="<,>,>", options=header]
|===
|Model|Batch size|Time
$(parse | awk -F'\t' 'BEGIN { OFS = "|" }
$2 { print "", $1, $3, $4 " s" }' | sort -t'|' -nk4)
|===
END

deeptagger/bench.sh Executable file (36 lines changed)

@@ -0,0 +1,36 @@
#!/bin/sh -e
if [ $# -lt 2 ] || ! [ -x "$1" ]
then
echo "Usage: $0 DEEPTAGGER FILE..."
echo "Run this after using download.sh, from the same directory."
exit 1
fi
runner=$1
shift
log=bench.out
: >$log
run() {
opts=$1 batch=$2 model=$3
shift 3
for i in $(seq 1 3)
do
start=$(date +%s)
"$runner" $opts -b "$batch" -t 0.75 "$model" "$@" >/dev/null || :
end=$(date +%s)
printf '%s\t%s\t%s\t%s\t%s\n' \
"$name" "$model" "$opts" "$batch" "$((end - start))" | tee -a $log
done
}
for model in models/*.model
do
name=$(sed -n 's/^name=//p' "$model")
for batch in 1 4 16
do
run "" $batch "$model" "$@"
run --cpu $batch "$model" "$@"
done
done

deeptagger/deeptagger.cpp Normal file (746 lines changed)

@@ -0,0 +1,746 @@
#include <getopt.h>
#include <Magick++.h>
#include <onnxruntime_cxx_api.h>
#ifdef __APPLE__
#include <coreml_provider_factory.h>
#endif
#include <algorithm>
#include <condition_variable>
#include <filesystem>
#include <fstream>
#include <iostream>
#include <mutex>
#include <queue>
#include <regex>
#include <set>
#include <stdexcept>
#include <string>
#include <thread>
#include <tuple>
#include <cstdio>
#include <cstdint>
#include <climits>
static struct {
bool cpu = false;
int debug = 0;
long batch = 1;
float threshold = 0.1;
// Execution provider name → Key → Value
std::map<std::string, std::map<std::string, std::string>> options;
} g;
// --- Configuration -----------------------------------------------------------
// Arguably, input normalization could be incorporated into models instead.
struct Config {
std::string name;
enum class Shape {NHWC, NCHW} shape = Shape::NHWC;
enum class Channels {RGB, BGR} channels = Channels::RGB;
bool normalize = false;
enum class Pad {WHITE, EDGE, STRETCH} pad = Pad::WHITE;
int size = -1;
bool sigmoid = false;
std::vector<std::string> tags;
};
static void
read_tags(const std::string &path, std::vector<std::string> &tags)
{
std::ifstream f(path);
f.exceptions(std::ifstream::badbit);
if (!f)
throw std::runtime_error("cannot read tags");
std::string line;
while (std::getline(f, line)) {
if (!line.empty() && line.back() == '\r')
line.erase(line.size() - 1);
tags.push_back(line);
}
}
static void
read_field(Config &config, std::string key, std::string value)
{
if (key == "name") {
config.name = value;
} else if (key == "shape") {
if (value == "nhwc") config.shape = Config::Shape::NHWC;
else if (value == "nchw") config.shape = Config::Shape::NCHW;
else throw std::invalid_argument("bad value for: " + key);
} else if (key == "channels") {
if (value == "rgb") config.channels = Config::Channels::RGB;
else if (value == "bgr") config.channels = Config::Channels::BGR;
else throw std::invalid_argument("bad value for: " + key);
} else if (key == "normalize") {
if (value == "true") config.normalize = true;
else if (value == "false") config.normalize = false;
else throw std::invalid_argument("bad value for: " + key);
} else if (key == "pad") {
if (value == "white") config.pad = Config::Pad::WHITE;
else if (value == "edge") config.pad = Config::Pad::EDGE;
else if (value == "stretch") config.pad = Config::Pad::STRETCH;
else throw std::invalid_argument("bad value for: " + key);
} else if (key == "size") {
config.size = std::stoi(value);
} else if (key == "interpret") {
if (value == "false") config.sigmoid = false;
else if (value == "sigmoid") config.sigmoid = true;
else throw std::invalid_argument("bad value for: " + key);
} else {
throw std::invalid_argument("unsupported config key: " + key);
}
}
static void
read_config(Config &config, const char *path)
{
std::ifstream f(path);
f.exceptions(std::ifstream::badbit);
if (!f)
throw std::runtime_error("cannot read configuration");
std::regex re(R"(^\s*([^#=]+?)\s*=\s*([^#]*?)\s*(?:#|$))",
std::regex::optimize);
std::smatch m;
std::string line;
while (std::getline(f, line)) {
if (std::regex_match(line, m, re))
read_field(config, m[1].str(), m[2].str());
}
read_tags(
std::filesystem::path(path).replace_extension("tags").string(),
config.tags);
}
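// An illustrative .model file, as generated by download.sh
// (key=value lines; everything after # is a comment):
//
//   name=WD v1.4 ViT v2
//   shape=nhwc
//   channels=bgr
//   normalize=false
//   pad=white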
// --- Data preparation --------------------------------------------------------
static float *
image_to_nhwc(float *data, Magick::Image &image, Config::Channels channels)
{
unsigned int width = image.columns();
unsigned int height = image.rows();
auto pixels = image.getConstPixels(0, 0, width, height);
switch (channels) {
case Config::Channels::RGB:
for (unsigned int y = 0; y < height; y++) {
for (unsigned int x = 0; x < width; x++) {
auto pixel = *pixels++;
*data++ = ScaleQuantumToChar(pixel.red);
*data++ = ScaleQuantumToChar(pixel.green);
*data++ = ScaleQuantumToChar(pixel.blue);
}
}
break;
case Config::Channels::BGR:
for (unsigned int y = 0; y < height; y++) {
for (unsigned int x = 0; x < width; x++) {
auto pixel = *pixels++;
*data++ = ScaleQuantumToChar(pixel.blue);
*data++ = ScaleQuantumToChar(pixel.green);
*data++ = ScaleQuantumToChar(pixel.red);
}
}
}
return data;
}
static float *
image_to_nchw(float *data, Magick::Image &image, Config::Channels channels)
{
unsigned int width = image.columns();
unsigned int height = image.rows();
auto pixels = image.getConstPixels(0, 0, width, height), pp = pixels;
switch (channels) {
case Config::Channels::RGB:
for (unsigned int y = 0; y < height; y++)
for (unsigned int x = 0; x < width; x++)
*data++ = ScaleQuantumToChar((*pp++).red);
pp = pixels;
for (unsigned int y = 0; y < height; y++)
for (unsigned int x = 0; x < width; x++)
*data++ = ScaleQuantumToChar((*pp++).green);
pp = pixels;
for (unsigned int y = 0; y < height; y++)
for (unsigned int x = 0; x < width; x++)
*data++ = ScaleQuantumToChar((*pp++).blue);
break;
case Config::Channels::BGR:
for (unsigned int y = 0; y < height; y++)
for (unsigned int x = 0; x < width; x++)
*data++ = ScaleQuantumToChar((*pp++).blue);
pp = pixels;
for (unsigned int y = 0; y < height; y++)
for (unsigned int x = 0; x < width; x++)
*data++ = ScaleQuantumToChar((*pp++).green);
pp = pixels;
for (unsigned int y = 0; y < height; y++)
for (unsigned int x = 0; x < width; x++)
*data++ = ScaleQuantumToChar((*pp++).red);
}
return data;
}
static Magick::Image
load(const std::string filename,
const Config &config, int64_t width, int64_t height)
{
Magick::Image image;
try {
image.read(filename);
} catch (const Magick::Warning &warning) {
if (g.debug)
fprintf(stderr, "%s: %s\n", filename.c_str(), warning.what());
}
image.autoOrient();
Magick::Geometry adjusted(width, height);
switch (config.pad) {
case Config::Pad::EDGE:
case Config::Pad::WHITE:
adjusted.greater(true);
break;
case Config::Pad::STRETCH:
adjusted.aspect(false);
}
image.resize(adjusted, Magick::LanczosFilter);
// The GraphicsMagick API doesn't offer any good options.
if (config.pad == Config::Pad::EDGE) {
MagickLib::SetImageVirtualPixelMethod(
image.image(), MagickLib::EdgeVirtualPixelMethod);
auto x = (int64_t(image.columns()) - width) / 2;
auto y = (int64_t(image.rows()) - height) / 2;
auto source = image.getConstPixels(x, y, width, height);
std::vector<MagickLib::PixelPacket>
pixels(source, source + width * height);
Magick::Image edged(Magick::Geometry(width, height), "black");
edged.classType(Magick::DirectClass);
auto target = edged.setPixels(0, 0, width, height);
memcpy(target, pixels.data(), pixels.size() * sizeof pixels[0]);
edged.syncPixels();
image = edged;
}
// Center it in a square patch of white, removing any transparency.
// image.extent() could probably be used to do the same thing.
Magick::Image white(Magick::Geometry(width, height), "white");
auto x = (white.columns() - image.columns()) / 2;
auto y = (white.rows() - image.rows()) / 2;
white.composite(image, x, y, Magick::OverCompositeOp);
white.fileName(filename);
if (g.debug > 2)
white.display();
return white;
}
// --- Inference ---------------------------------------------------------------
static void
run(std::vector<Magick::Image> &images, const Config &config,
Ort::Session &session, std::vector<int64_t> shape)
{
// For consistency, this value may be bumped to always be g.batch,
// but it does not seem to have an effect on anything.
shape[0] = images.size();
Ort::AllocatorWithDefaultOptions allocator;
auto tensor = Ort::Value::CreateTensor<float>(
allocator, shape.data(), shape.size());
auto input_len = tensor.GetTensorTypeAndShapeInfo().GetElementCount();
auto input_data = tensor.GetTensorMutableData<float>(), pi = input_data;
for (int64_t i = 0; i < images.size(); i++) {
switch (config.shape) {
case Config::Shape::NCHW:
pi = image_to_nchw(pi, images.at(i), config.channels);
break;
case Config::Shape::NHWC:
pi = image_to_nhwc(pi, images.at(i), config.channels);
}
}
if (config.normalize) {
pi = input_data;
for (size_t i = 0; i < input_len; i++)
*pi++ /= 255.0;
}
std::string input_name =
session.GetInputNameAllocated(0, allocator).get();
std::string output_name =
session.GetOutputNameAllocated(0, allocator).get();
std::vector<const char *> input_names = {input_name.c_str()};
std::vector<const char *> output_names = {output_name.c_str()};
auto outputs = session.Run(Ort::RunOptions{},
input_names.data(), &tensor, input_names.size(),
output_names.data(), output_names.size());
if (outputs.size() != 1 || !outputs[0].IsTensor()) {
fprintf(stderr, "Wrong output\n");
return;
}
auto output_len = outputs[0].GetTensorTypeAndShapeInfo().GetElementCount();
auto output_data = outputs.front().GetTensorData<float>(), po = output_data;
if (output_len != shape[0] * config.tags.size()) {
fprintf(stderr, "Tags don't match the output\n");
return;
}
for (size_t i = 0; i < images.size(); i++) {
for (size_t t = 0; t < config.tags.size(); t++) {
float value = *po++;
if (config.sigmoid)
value = 1 / (1 + std::exp(-value));
if (value > g.threshold) {
printf("%s\t%s\t%.2f\n", images.at(i).fileName().c_str(),
config.tags.at(t).c_str(), value);
}
}
}
fflush(stdout);
}
// - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
static void
parse_options(const std::string &options)
{
auto semicolon = options.find(";");
auto name = options.substr(0, semicolon);
auto sequence = options.substr(semicolon);
std::map<std::string, std::string> kv;
std::regex re(R"(;*([^;=]+)=([^;=]+))", std::regex::optimize);
std::sregex_iterator it(sequence.begin(), sequence.end(), re), end;
for (; it != end; ++it)
kv[it->str(1)] = it->str(2);
g.options.insert_or_assign(name, std::move(kv));
}
static std::tuple<std::vector<const char *>, std::vector<const char *>>
unpack_options(const std::string &provider)
{
std::vector<const char *> keys, values;
if (g.options.count(provider)) {
for (const auto &kv : g.options.at(provider)) {
keys.push_back(kv.first.c_str());
values.push_back(kv.second.c_str());
}
}
return {keys, values};
}
static void
add_providers(Ort::SessionOptions &options)
{
auto api = Ort::GetApi();
auto v_providers = Ort::GetAvailableProviders();
std::set<std::string> providers(v_providers.begin(), v_providers.end());
if (g.debug) {
printf("Providers:");
for (const auto &it : providers)
printf(" %s", it.c_str());
printf("\n");
}
// There is a string-based AppendExecutionProvider() method,
// but it cannot be used with all providers.
// TODO: Make it possible to disable providers.
// TODO: Providers will deserve some performance tuning.
if (g.cpu)
return;
#ifdef __APPLE__
if (providers.count("CoreMLExecutionProvider")) {
try {
Ort::ThrowOnError(
OrtSessionOptionsAppendExecutionProvider_CoreML(options, 0));
} catch (const std::exception &e) {
fprintf(stderr, "CoreML unavailable: %s\n", e.what());
}
}
#endif
#if TENSORRT
// TensorRT should be the more performant execution provider, however:
// - it is difficult to set up (needs logging in to download),
// - with WD v1.4 ONNX models, one gets "Your ONNX model has been generated
// with INT64 weights, while TensorRT does not natively support INT64.
// Attempting to cast down to INT32." and that's not nice.
if (providers.count("TensorrtExecutionProvider")) {
OrtTensorRTProviderOptionsV2* tensorrt_options = nullptr;
Ort::ThrowOnError(api.CreateTensorRTProviderOptions(&tensorrt_options));
auto [keys, values] = unpack_options("TensorrtExecutionProvider");
if (!keys.empty()) {
Ort::ThrowOnError(api.UpdateTensorRTProviderOptions(
tensorrt_options, keys.data(), values.data(), keys.size()));
}
try {
options.AppendExecutionProvider_TensorRT_V2(*tensorrt_options);
} catch (const std::exception &e) {
fprintf(stderr, "TensorRT unavailable: %s\n", e.what());
}
api.ReleaseTensorRTProviderOptions(tensorrt_options);
}
#endif
// See CUDA-ExecutionProvider.html for documentation.
if (providers.count("CUDAExecutionProvider")) {
OrtCUDAProviderOptionsV2* cuda_options = nullptr;
Ort::ThrowOnError(api.CreateCUDAProviderOptions(&cuda_options));
auto [keys, values] = unpack_options("CUDAExecutionProvider");
if (!keys.empty()) {
Ort::ThrowOnError(api.UpdateCUDAProviderOptions(
cuda_options, keys.data(), values.data(), keys.size()));
}
try {
options.AppendExecutionProvider_CUDA_V2(*cuda_options);
} catch (const std::exception &e) {
fprintf(stderr, "CUDA unavailable: %s\n", e.what());
}
api.ReleaseCUDAProviderOptions(cuda_options);
}
if (providers.count("ROCMExecutionProvider")) {
OrtROCMProviderOptions rocm_options = {};
auto [keys, values] = unpack_options("ROCMExecutionProvider");
if (!keys.empty()) {
Ort::ThrowOnError(api.UpdateROCMProviderOptions(
&rocm_options, keys.data(), values.data(), keys.size()));
}
try {
options.AppendExecutionProvider_ROCM(rocm_options);
} catch (const std::exception &e) {
fprintf(stderr, "ROCM unavailable: %s\n", e.what());
}
}
// The CPU provider is the default fallback, if everything else fails.
}
// - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
struct Thumbnailing {
std::mutex input_mutex;
std::condition_variable input_cv;
std::queue<std::string> input; // All input paths
int work = 0; // Number of images requested
std::mutex output_mutex;
std::condition_variable output_cv;
std::vector<Magick::Image> output; // Processed images
int done = 0; // Finished worker threads
};
static void
thumbnail(const Config &config, int64_t width, int64_t height,
Thumbnailing &ctx)
{
while (true) {
std::unique_lock<std::mutex> input_lock(ctx.input_mutex);
ctx.input_cv.wait(input_lock,
[&]{ return ctx.input.empty() || ctx.work; });
if (ctx.input.empty())
break;
auto path = ctx.input.front();
ctx.input.pop();
ctx.work--;
input_lock.unlock();
Magick::Image image;
try {
image = load(path, config, width, height);
if (height != image.rows() || width != image.columns())
throw std::runtime_error("tensor mismatch");
std::unique_lock<std::mutex> output_lock(ctx.output_mutex);
ctx.output.push_back(image);
output_lock.unlock();
ctx.output_cv.notify_all();
} catch (const std::exception &e) {
fprintf(stderr, "%s: %s\n", path.c_str(), e.what());
std::unique_lock<std::mutex> input_lock(ctx.input_mutex);
ctx.work++;
input_lock.unlock();
ctx.input_cv.notify_all();
}
}
std::unique_lock<std::mutex> output_lock(ctx.output_mutex);
ctx.done++;
output_lock.unlock();
ctx.output_cv.notify_all();
}
// - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
static std::string
print_shape(const Ort::ConstTensorTypeAndShapeInfo &info)
{
std::vector<const char *> names(info.GetDimensionsCount());
info.GetSymbolicDimensions(names.data(), names.size());
auto shape = info.GetShape();
std::string result;
for (size_t i = 0; i < shape.size(); i++) {
if (shape[i] < 0)
result.append(names.at(i));
else
result.append(std::to_string(shape[i]));
result.append(" x ");
}
if (!result.empty())
result.erase(result.size() - 3);
return result;
}
static void
print_shapes(const Ort::Session &session)
{
Ort::AllocatorWithDefaultOptions allocator;
for (size_t i = 0; i < session.GetInputCount(); i++) {
std::string name = session.GetInputNameAllocated(i, allocator).get();
auto info = session.GetInputTypeInfo(i);
auto shape = print_shape(info.GetTensorTypeAndShapeInfo());
printf("Input: %s: %s\n", name.c_str(), shape.c_str());
}
for (size_t i = 0; i < session.GetOutputCount(); i++) {
std::string name = session.GetOutputNameAllocated(i, allocator).get();
auto info = session.GetOutputTypeInfo(i);
auto shape = print_shape(info.GetTensorTypeAndShapeInfo());
printf("Output: %s: %s\n", name.c_str(), shape.c_str());
}
}
static void
infer(Ort::Env &env, const char *path, const std::vector<std::string> &images)
{
Config config;
read_config(config, path);
Ort::SessionOptions session_options;
add_providers(session_options);
Ort::Session session = Ort::Session(env,
std::filesystem::path(path).replace_extension("onnx").c_str(),
session_options);
if (g.debug)
print_shapes(session);
if (session.GetInputCount() != 1 || session.GetOutputCount() != 1) {
fprintf(stderr, "Invalid input or output shape\n");
exit(EXIT_FAILURE);
}
auto input_info = session.GetInputTypeInfo(0);
auto shape = input_info.GetTensorTypeAndShapeInfo().GetShape();
if (shape.size() != 4) {
fprintf(stderr, "Incompatible input tensor format\n");
exit(EXIT_FAILURE);
}
if (shape.at(0) > 1) {
fprintf(stderr, "Fixed batching not supported\n");
exit(EXIT_FAILURE);
}
if (shape.at(0) >= 0 && g.batch > 1) {
fprintf(stderr, "Requested batching for a non-batching model\n");
exit(EXIT_FAILURE);
}
int64_t *height = {}, *width = {}, *channels = {};
switch (config.shape) {
case Config::Shape::NCHW:
channels = &shape[1];
height = &shape[2];
width = &shape[3];
break;
case Config::Shape::NHWC:
height = &shape[1];
width = &shape[2];
channels = &shape[3];
break;
}
// Variable dimensions don't combine well with batches.
if (*height < 0)
*height = config.size;
if (*width < 0)
*width = config.size;
if (*channels != 3 || *height < 1 || *width < 1) {
fprintf(stderr, "Incompatible input tensor format\n");
return;
}
// By only parallelizing image loads here during batching,
// they never compete for CPU time with inference.
Thumbnailing ctx;
for (const auto &path : images)
ctx.input.push(path);
auto workers = g.batch;
if (auto threads = std::thread::hardware_concurrency())
workers = std::min(workers, long(threads));
for (auto i = workers; i--; )
std::thread(thumbnail, std::ref(config), *width, *height,
std::ref(ctx)).detach();
while (true) {
std::unique_lock<std::mutex> input_lock(ctx.input_mutex);
ctx.work = g.batch;
input_lock.unlock();
ctx.input_cv.notify_all();
std::unique_lock<std::mutex> output_lock(ctx.output_mutex);
ctx.output_cv.wait(output_lock,
[&]{ return ctx.output.size() == g.batch || ctx.done == workers; });
if (!ctx.output.empty()) {
run(ctx.output, config, session, shape);
ctx.output.clear();
}
if (ctx.done == workers)
break;
}
}
int
main(int argc, char *argv[])
{
auto invocation_name = argv[0];
auto print_usage = [=] {
fprintf(stderr,
"Usage: %s [-b BATCH] [--cpu] [-d] [-o EP;KEY=VALUE...] "
"[-t THRESHOLD] MODEL { --pipe | [IMAGE...] }\n", invocation_name);
};
static option opts[] = {
{"batch", required_argument, 0, 'b'},
{"cpu", no_argument, 0, 'c'},
{"debug", no_argument, 0, 'd'},
{"help", no_argument, 0, 'h'},
{"options", required_argument, 0, 'o'},
{"pipe", no_argument, 0, 'p'},
{"threshold", required_argument, 0, 't'},
{nullptr, 0, 0, 0},
};
bool pipe = false;
while (1) {
int option_index = 0;
auto c = getopt_long(argc, const_cast<char *const *>(argv),
"b:cdho:pt:", opts, &option_index);
if (c == -1)
break;
char *end = nullptr;
switch (c) {
case 'b':
errno = 0, g.batch = strtol(optarg, &end, 10);
if (errno || *end || g.batch < 1 || g.batch > SHRT_MAX) {
fprintf(stderr, "Batch size must be a positive number\n");
exit(EXIT_FAILURE);
}
break;
case 'c':
g.cpu = true;
break;
case 'd':
g.debug++;
break;
case 'h':
print_usage();
return 0;
case 'o':
parse_options(optarg);
break;
case 'p':
pipe = true;
break;
case 't':
errno = 0, g.threshold = strtod(optarg, &end);
if (errno || *end || !std::isfinite(g.threshold) ||
g.threshold < 0 || g.threshold > 1) {
fprintf(stderr, "Threshold must be a number within 0..1\n");
exit(EXIT_FAILURE);
}
break;
default:
print_usage();
return 1;
}
}
argv += optind;
argc -= optind;
// TODO: There's actually no need to slurp all the lines up front.
std::vector<std::string> paths;
if (pipe) {
if (argc != 1) {
print_usage();
return 1;
}
std::string line;
while (std::getline(std::cin, line))
paths.push_back(line);
} else {
if (argc < 1) {
print_usage();
return 1;
}
paths.assign(argv + 1, argv + argc);
}
// Load batched images in parallel (the first is for GM, the other for IM).
if (g.batch > 1) {
auto value = std::to_string(
std::max(long(std::thread::hardware_concurrency()) / g.batch, 1L));
setenv("OMP_NUM_THREADS", value.c_str(), true);
setenv("MAGICK_THREAD_LIMIT", value.c_str(), true);
}
// XXX: GraphicsMagick initializes signal handlers here,
// one needs to use MagickLib::InitializeMagickEx()
// with MAGICK_OPT_NO_SIGNAL_HANDER to prevent that.
//
// ImageMagick conveniently has the opposite default.
Magick::InitializeMagick(nullptr);
OrtLoggingLevel logging = g.debug > 1
? ORT_LOGGING_LEVEL_VERBOSE
: ORT_LOGGING_LEVEL_WARNING;
// Creating an environment before initializing providers in order to avoid:
// "Attempt to use DefaultLogger but none has been registered."
Ort::Env env(logging, invocation_name);
infer(env, argv[0], paths);
return 0;
}

deeptagger/download.sh Executable file (163 lines changed)

@@ -0,0 +1,163 @@
#!/bin/sh -e
# Requirements: Python ~ 3.11, curl, unzip, git-lfs, awk
#
# This script downloads a bunch of models into the models/ directory,
# after any necessary transformations to run them using the deeptagger binary.
#
# Once it succeeds, feel free to remove everything but *.{model,tags,onnx}
git lfs install
mkdir -p models
cd models
# Create a virtual environment for model conversion.
#
# If any of the Python stuff fails,
# retry from within a Conda environment with a different version of Python.
export VIRTUAL_ENV=$(pwd)/venv
export TF_ENABLE_ONEDNN_OPTS=0
if ! [ -f "$VIRTUAL_ENV/ready" ]
then
python3 -m venv "$VIRTUAL_ENV"
#"$VIRTUAL_ENV/bin/pip3" install tensorflow[and-cuda]
"$VIRTUAL_ENV/bin/pip3" install tf2onnx 'deepdanbooru[tensorflow]'
touch "$VIRTUAL_ENV/ready"
fi
status() {
echo "$(tput bold)-- $*$(tput sgr0)"
}
# Using the deepdanbooru package makes it possible to use other models
# trained with the project.
deepdanbooru() {
local name=$1 url=$2
status "$name"
local basename=$(basename "$url")
if ! [ -e "$basename" ]
then curl -LO "$url"
fi
local modelname=${basename%%.*}
if ! [ -d "$modelname" ]
then unzip -d "$modelname" "$basename"
fi
if ! [ -e "$modelname.tags" ]
then ln "$modelname/tags.txt" "$modelname.tags"
fi
if ! [ -d "$modelname.saved" ]
then "$VIRTUAL_ENV/bin/python3" - "$modelname" "$modelname.saved" <<-'END'
import sys
import deepdanbooru.project as ddp
model = ddp.load_model_from_project(
project_path=sys.argv[1], compile_model=False)
model.export(sys.argv[2])
END
fi
if ! [ -e "$modelname.onnx" ]
then "$VIRTUAL_ENV/bin/python3" -m tf2onnx.convert \
--saved-model "$modelname.saved" --output "$modelname.onnx"
fi
cat > "$modelname.model" <<-END
name=$name
shape=nhwc
channels=rgb
normalize=true
pad=edge
END
}
# ONNX preconversions don't have a symbolic first dimension, thus doing our own.
wd14() {
local name=$1 repository=$2
status "$name"
local modelname=$(basename "$repository")
if ! [ -d "$modelname" ]
then git clone "https://huggingface.co/$repository"
fi
# Though link the original export as well.
if ! [ -e "$modelname.onnx" ]
then ln "$modelname/model.onnx" "$modelname.onnx"
fi
if ! [ -e "$modelname.tags" ]
then awk -F, 'NR > 1 { print $2 }' "$modelname/selected_tags.csv" \
> "$modelname.tags"
fi
cat > "$modelname.model" <<-END
name=$name
shape=nhwc
channels=bgr
normalize=false
pad=white
END
if ! [ -e "batch-$modelname.onnx" ]
then "$VIRTUAL_ENV/bin/python3" -m tf2onnx.convert \
--saved-model "$modelname" --output "batch-$modelname.onnx"
fi
if ! [ -e "batch-$modelname.tags" ]
then ln "$modelname.tags" "batch-$modelname.tags"
fi
if ! [ -e "batch-$modelname.model" ]
then ln "$modelname.model" "batch-$modelname.model"
fi
}
# These models are an undocumented mess, thus using ONNX preconversions.
mldanbooru() {
local name=$1 size=$2 basename=$3
status "$name"
if ! [ -d ml-danbooru-onnx ]
then git clone https://huggingface.co/deepghs/ml-danbooru-onnx
fi
local modelname=${basename%%.*}
if ! [ -e "$basename" ]
then ln "ml-danbooru-onnx/$basename"
fi
if ! [ -e "$modelname.tags" ]
then awk -F, 'NR > 1 { print $1 }' ml-danbooru-onnx/tags.csv \
> "$modelname.tags"
fi
cat > "$modelname.model" <<-END
name=$name
shape=nchw
channels=rgb
normalize=true
pad=stretch
size=$size
interpret=sigmoid
END
}
status "Downloading models, beware that git-lfs doesn't indicate progress"
deepdanbooru DeepDanbooru \
'https://github.com/KichangKim/DeepDanbooru/releases/download/v3-20211112-sgd-e28/deepdanbooru-v3-20211112-sgd-e28.zip'
#wd14 'WD v1.4 ViT v1' 'SmilingWolf/wd-v1-4-vit-tagger'
wd14 'WD v1.4 ViT v2' 'SmilingWolf/wd-v1-4-vit-tagger-v2'
#wd14 'WD v1.4 ConvNeXT v1' 'SmilingWolf/wd-v1-4-convnext-tagger'
wd14 'WD v1.4 ConvNeXT v2' 'SmilingWolf/wd-v1-4-convnext-tagger-v2'
wd14 'WD v1.4 ConvNeXTV2 v2' 'SmilingWolf/wd-v1-4-convnextv2-tagger-v2'
wd14 'WD v1.4 SwinV2 v2' 'SmilingWolf/wd-v1-4-swinv2-tagger-v2'
wd14 'WD v1.4 MOAT v2' 'SmilingWolf/wd-v1-4-moat-tagger-v2'
# As suggested by author https://github.com/IrisRainbowNeko/ML-Danbooru-webui
mldanbooru 'ML-Danbooru CAFormer dec-5-97527' \
448 'ml_caformer_m36_dec-5-97527.onnx'
mldanbooru 'ML-Danbooru TResNet-D 6-30000' \
640 'TResnet-D-FLq_ema_6-30000.onnx'

initialize.sql

@@ -23,6 +23,7 @@ CREATE TABLE IF NOT EXISTS node(
) STRICT;
CREATE INDEX IF NOT EXISTS node__sha1 ON node(sha1);
CREATE INDEX IF NOT EXISTS node__parent ON node(parent);
CREATE UNIQUE INDEX IF NOT EXISTS node__parent_name
ON node(IFNULL(parent, 0), name);
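-- Note: a plain UNIQUE(parent, name) would not deduplicate root entries,
-- because SQLite unique indexes treat NULLs as distinct;
-- IFNULL(parent, 0) makes all root-level parents compare equal.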
@@ -76,7 +77,7 @@ CREATE TABLE IF NOT EXISTS tag_space(
id INTEGER NOT NULL,
name TEXT NOT NULL,
description TEXT,
CHECK (name NOT LIKE '%:%'),
CHECK (name NOT LIKE '%:%' AND name NOT LIKE '-%'),
PRIMARY KEY (id)
) STRICT;

main.go (579 lines changed)

@@ -41,6 +41,9 @@ import (
"golang.org/x/image/webp"
)
// #include <unistd.h>
import "C"
var (
db *sql.DB // sqlite database
galleryDirectory string // gallery directory
@@ -59,19 +62,47 @@ func hammingDistance(a, b int64) int {
return bits.OnesCount64(uint64(a) ^ uint64(b))
}
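// Illustrative: hamming(0b1010, 0b0011) == 2; perceptually similar images
// should have dhashes within a small Hamming distance of each other.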
type productAggregator float64
func (pa *productAggregator) Step(v float64) {
*pa = productAggregator(float64(*pa) * v)
}
func (pa *productAggregator) Done() float64 {
return float64(*pa)
}
func newProductAggregator() *productAggregator {
pa := productAggregator(1)
return &pa
}
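// Once registered as "product" below, the aggregator is usable from SQL,
// e.g. product(IFNULL(ta.weight, 0)) with GROUP BY i.sha1,
// as the searchCTEMulti query later in this file does.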
func init() {
sql.Register("sqlite3_custom", &sqlite3.SQLiteDriver{
ConnectHook: func(conn *sqlite3.SQLiteConn) error {
return conn.RegisterFunc("hamming", hammingDistance, true /*pure*/)
if err := conn.RegisterFunc(
"hamming", hammingDistance, true /*pure*/); err != nil {
return err
}
if err := conn.RegisterAggregator(
"product", newProductAggregator, true /*pure*/); err != nil {
return err
}
return nil
},
})
}
func openDB(directory string) error {
galleryDirectory = directory
var err error
db, err = sql.Open("sqlite3_custom", "file:"+filepath.Join(directory,
nameOfDB+"?_foreign_keys=1&_busy_timeout=1000"))
galleryDirectory = directory
if err != nil {
return err
}
_, err = db.Exec(initializeSQL)
return err
}
@@ -270,11 +301,10 @@ func cmdInit(fs *flag.FlagSet, args []string) error {
if fs.NArg() != 1 {
return errWrongUsage
}
if err := openDB(fs.Arg(0)); err != nil {
if err := os.MkdirAll(fs.Arg(0), 0755); err != nil {
return err
}
if _, err := db.Exec(initializeSQL); err != nil {
if err := openDB(fs.Arg(0)); err != nil {
return err
}
@@ -291,49 +321,7 @@ func cmdInit(fs *flag.FlagSet, args []string) error {
return nil
}
// --- Web ---------------------------------------------------------------------
var hashRE = regexp.MustCompile(`^/.*?/([0-9a-f]{40})$`)
var staticHandler http.Handler
var page = template.Must(template.New("/").Parse(`<!DOCTYPE html><html><head>
<title>Gallery</title>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel=stylesheet href=style.css>
</head><body>
<noscript>This is a web application, and requires Javascript.</noscript>
<script src=mithril.js></script>
<script src=gallery.js></script>
</body></html>`))
func handleRequest(w http.ResponseWriter, r *http.Request) {
if r.URL.Path != "/" {
staticHandler.ServeHTTP(w, r)
return
}
if err := page.Execute(w, nil); err != nil {
log.Println(err)
}
}
func handleImages(w http.ResponseWriter, r *http.Request) {
if m := hashRE.FindStringSubmatch(r.URL.Path); m == nil {
http.NotFound(w, r)
} else {
http.ServeFile(w, r, imagePath(m[1]))
}
}
func handleThumbs(w http.ResponseWriter, r *http.Request) {
if m := hashRE.FindStringSubmatch(r.URL.Path); m == nil {
http.NotFound(w, r)
} else {
http.ServeFile(w, r, thumbPath(m[1]))
}
}
// - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
// --- API: Browse -------------------------------------------------------------
func getSubdirectories(tx *sql.Tx, parent int64) (names []string, err error) {
return dbCollectStrings(`SELECT name FROM node
@@ -413,7 +401,7 @@ func handleAPIBrowse(w http.ResponseWriter, r *http.Request) {
}
}
// - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
// --- API: Tags ---------------------------------------------------------------
type webTagNamespace struct {
Description string `json:"description"`
@@ -499,7 +487,7 @@ func handleAPITags(w http.ResponseWriter, r *http.Request) {
}
}
// - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
// --- API: Duplicates ---------------------------------------------------------
type webDuplicateImage struct {
SHA1 string `json:"sha1"`
@@ -642,7 +630,7 @@ func handleAPIDuplicates(w http.ResponseWriter, r *http.Request) {
}
}
// - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
// --- API: Orphans ------------------------------------------------------------
type webOrphanImage struct {
SHA1 string `json:"sha1"`
@@ -670,7 +658,9 @@ func getOrphanReplacement(webPath string) (*webOrphanImage, error) {
}
parent, err := idForDirectoryPath(tx, path[:len(path)-1], false)
if err != nil {
if errors.Is(err, sql.ErrNoRows) {
return nil, nil
} else if err != nil {
return nil, err
}
@@ -697,7 +687,8 @@ func getOrphans() (result []webOrphan, err error) {
FROM orphan AS o
JOIN image AS i ON o.sha1 = i.sha1
LEFT JOIN tag_assignment AS ta ON o.sha1 = ta.sha1
GROUP BY o.sha1`)
GROUP BY o.sha1
ORDER BY path`)
if err != nil {
return nil, err
}
@@ -739,7 +730,7 @@ func handleAPIOrphans(w http.ResponseWriter, r *http.Request) {
}
}
// - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
// --- API: Image view ---------------------------------------------------------
func getImageDimensions(sha1 string) (w int64, h int64, err error) {
err = db.QueryRow(`SELECT width, height FROM image WHERE sha1 = ?`,
@@ -842,7 +833,7 @@ func handleAPIInfo(w http.ResponseWriter, r *http.Request) {
}
}
// - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
// --- API: Image similar ------------------------------------------------------
type webSimilarImage struct {
SHA1 string `json:"sha1"`
@@ -854,15 +845,17 @@ type webSimilarImage struct {
func getSimilar(sha1 string, dhash int64, pixels int64, distance int) (
result []webSimilarImage, err error) {
// For distance ∈ {0, 1}, this query is quite inefficient.
// In exchange, it's generic.
//
// If there's a dhash, there should also be thumbnail dimensions,
// so not bothering with IFNULL on them.
rows, err := db.Query(`
SELECT sha1, width * height, IFNULL(thumbw, 0), IFNULL(thumbh, 0)
FROM image WHERE sha1 <> ? AND dhash IS NOT NULL
AND hamming(dhash, ?) = ?`, sha1, dhash, distance)
// If there's a dhash, there should also be thumbnail dimensions.
var rows *sql.Rows
common := `SELECT sha1, width * height, IFNULL(thumbw, 0), IFNULL(thumbh, 0)
FROM image WHERE sha1 <> ? AND `
if distance == 0 {
rows, err = db.Query(common+`dhash = ?`, sha1, dhash)
} else {
// This is generic, but quite inefficient for distance ∈ {0, 1}.
rows, err = db.Query(common+`dhash IS NOT NULL
AND hamming(dhash, ?) = ?`, sha1, dhash, distance)
}
if err != nil {
return nil, err
}
@@ -952,35 +945,90 @@ func handleAPISimilar(w http.ResponseWriter, r *http.Request) {
}
}
// - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
// --- API: Search -------------------------------------------------------------
// The SQL building is the most miserable part of the whole program.
// NOTE: AND will mean MULTIPLY(IFNULL(ta.weight, 0)) per SHA1.
const searchCTE = `WITH
const searchCTE1 = `WITH
matches(sha1, thumbw, thumbh, score) AS (
SELECT i.sha1, i.thumbw, i.thumbh, ta.weight AS score
FROM tag_assignment AS ta
JOIN image AS i ON i.sha1 = ta.sha1
WHERE ta.tag = ?
),
supertags(tag) AS (
SELECT DISTINCT ta.tag
FROM tag_assignment AS ta
JOIN matches AS m ON m.sha1 = ta.sha1
),
scoredtags(tag, score) AS (
-- The cross join is a deliberate optimization,
-- and this query may still be really slow.
SELECT st.tag, AVG(IFNULL(ta.weight, 0)) AS score
FROM matches AS m
CROSS JOIN supertags AS st
LEFT JOIN tag_assignment AS ta
ON ta.sha1 = m.sha1 AND ta.tag = st.tag
GROUP BY st.tag
-- Using the column alias doesn't fail, but it also doesn't work.
HAVING AVG(IFNULL(ta.weight, 0)) >= 0.01
WHERE ta.tag = %d
)
`
const searchCTEMulti = `WITH
positive(tag) AS (VALUES %s),
filtered(sha1) AS (%s),
matches(sha1, thumbw, thumbh, score) AS (
SELECT i.sha1, i.thumbw, i.thumbh,
product(IFNULL(ta.weight, 0)) AS score
FROM image AS i, positive AS p
JOIN filtered AS c ON i.sha1 = c.sha1
LEFT JOIN tag_assignment AS ta ON ta.sha1 = i.sha1 AND ta.tag = p.tag
GROUP BY i.sha1
)
`
func searchQueryToCTE(tx *sql.Tx, query string) (string, error) {
positive, negative := []int64{}, []int64{}
for _, word := range strings.Split(query, " ") {
if word == "" {
continue
}
space, tag, _ := strings.Cut(word, ":")
negated := false
if strings.HasPrefix(space, "-") {
space = space[1:]
negated = true
}
var tagID int64
err := tx.QueryRow(`
SELECT t.id FROM tag AS t
JOIN tag_space AS ts ON t.space = ts.id
WHERE ts.name = ? AND t.name = ?`, space, tag).Scan(&tagID)
if err != nil {
return "", err
}
if negated {
negative = append(negative, tagID)
} else {
positive = append(positive, tagID)
}
}
// Don't return most of the database, and simplify the following builder.
if len(positive) == 0 {
return "", errors.New("search is too wide")
}
// Optimise single tag searches.
if len(positive) == 1 && len(negative) == 0 {
return fmt.Sprintf(searchCTE1, positive[0]), nil
}
values := fmt.Sprintf(`(%d)`, positive[0])
filtered := fmt.Sprintf(
`SELECT sha1 FROM tag_assignment WHERE tag = %d`, positive[0])
for _, tagID := range positive[1:] {
values += fmt.Sprintf(`, (%d)`, tagID)
filtered += fmt.Sprintf(` INTERSECT
SELECT sha1 FROM tag_assignment WHERE tag = %d`, tagID)
}
for _, tagID := range negative {
filtered += fmt.Sprintf(` EXCEPT
SELECT sha1 FROM tag_assignment WHERE tag = %d`, tagID)
}
return fmt.Sprintf(searchCTEMulti, values, filtered), nil
}
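// e.g. an illustrative query "caformer:1girl caformer:solo -caformer:comic"
// resolves each tag to its ID, INTERSECTs the positive matches,
// EXCEPTs the negative ones, and scores the result in searchCTEMulti.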
// - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
type webTagMatch struct {
SHA1 string `json:"sha1"`
ThumbW int64 `json:"thumbW"`
@@ -988,10 +1036,10 @@ type webTagMatch struct {
Score float32 `json:"score"`
}
func getTagMatches(tag int64) (matches []webTagMatch, err error) {
rows, err := db.Query(searchCTE+`
func getTagMatches(tx *sql.Tx, cte string) (matches []webTagMatch, err error) {
rows, err := tx.Query(cte + `
SELECT sha1, IFNULL(thumbw, 0), IFNULL(thumbh, 0), score
FROM matches`, tag)
FROM matches`)
if err != nil {
return nil, err
}
@@ -1009,32 +1057,78 @@ func getTagMatches(tag int64) (matches []webTagMatch, err error) {
return matches, rows.Err()
}
type webTagRelated struct {
Tag string `json:"tag"`
Score float32 `json:"score"`
type webTagSupertag struct {
space string
tag string
score float32
}
func getTagRelated(tag int64) (result map[string][]webTagRelated, err error) {
rows, err := db.Query(searchCTE+`
SELECT ts.name, t.name, st.score FROM scoredtags AS st
JOIN tag AS t ON st.tag = t.id
JOIN tag_space AS ts ON ts.id = t.space
ORDER BY st.score DESC`, tag)
func getTagSupertags(tx *sql.Tx, cte string) (
result map[int64]*webTagSupertag, err error) {
rows, err := tx.Query(cte + `
SELECT DISTINCT ta.tag, ts.name, t.name
FROM tag_assignment AS ta
JOIN matches AS m ON m.sha1 = ta.sha1
JOIN tag AS t ON ta.tag = t.id
JOIN tag_space AS ts ON ts.id = t.space`)
if err != nil {
return nil, err
}
defer rows.Close()
result = make(map[string][]webTagRelated)
result = make(map[int64]*webTagSupertag)
for rows.Next() {
var (
space string
r webTagRelated
tag int64
st webTagSupertag
)
if err = rows.Scan(&space, &r.Tag, &r.Score); err != nil {
if err = rows.Scan(&tag, &st.space, &st.tag); err != nil {
return nil, err
}
result[space] = append(result[space], r)
result[tag] = &st
}
return result, rows.Err()
}
type webTagRelated struct {
Tag string `json:"tag"`
Score float32 `json:"score"`
}
func getTagRelated(tx *sql.Tx, cte string, matches int) (
result map[string][]webTagRelated, err error) {
// Not sure if this level of efficiency is achievable directly in SQL.
supertags, err := getTagSupertags(tx, cte)
if err != nil {
return nil, err
}
rows, err := tx.Query(cte + `
SELECT ta.tag, ta.weight
FROM tag_assignment AS ta
JOIN matches AS m ON m.sha1 = ta.sha1`)
if err != nil {
return nil, err
}
defer rows.Close()
for rows.Next() {
var (
tag int64
weight float32
)
if err = rows.Scan(&tag, &weight); err != nil {
return nil, err
}
supertags[tag].score += weight
}
result = make(map[string][]webTagRelated)
for _, info := range supertags {
if score := info.score / float32(matches); score >= 0.1 {
r := webTagRelated{Tag: info.tag, Score: score}
result[info.space] = append(result[info.space], r)
}
}
return result, rows.Err()
}
@@ -1053,13 +1147,14 @@ func handleAPISearch(w http.ResponseWriter, r *http.Request) {
Related map[string][]webTagRelated `json:"related"`
}
space, tag, _ := strings.Cut(params.Query, ":")
tx, err := db.Begin()
if err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}
defer tx.Rollback()
var tagID int64
err := db.QueryRow(`
SELECT t.id FROM tag AS t
JOIN tag_space AS ts ON t.space = ts.id
WHERE ts.name = ? AND t.name = ?`, space, tag).Scan(&tagID)
cte, err := searchQueryToCTE(tx, params.Query)
if errors.Is(err, sql.ErrNoRows) {
http.Error(w, err.Error(), http.StatusNotFound)
return
@@ -1068,11 +1163,12 @@ func handleAPISearch(w http.ResponseWriter, r *http.Request) {
return
}
if result.Matches, err = getTagMatches(tagID); err != nil {
if result.Matches, err = getTagMatches(tx, cte); err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}
if result.Related, err = getTagRelated(tagID); err != nil {
if result.Related, err = getTagRelated(tx, cte,
len(result.Matches)); err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}
@@ -1082,7 +1178,47 @@ func handleAPISearch(w http.ResponseWriter, r *http.Request) {
}
}
// - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
// --- Web ---------------------------------------------------------------------
var hashRE = regexp.MustCompile(`^/.*?/([0-9a-f]{40})$`)
var staticHandler http.Handler
var page = template.Must(template.New("/").Parse(`<!DOCTYPE html><html><head>
<title>Gallery</title>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel=stylesheet href=style.css>
</head><body>
<noscript>This is a web application, and requires Javascript.</noscript>
<script src=mithril.js></script>
<script src=gallery.js></script>
</body></html>`))
func handleRequest(w http.ResponseWriter, r *http.Request) {
if r.URL.Path != "/" {
staticHandler.ServeHTTP(w, r)
return
}
if err := page.Execute(w, nil); err != nil {
log.Println(err)
}
}
func handleImages(w http.ResponseWriter, r *http.Request) {
if m := hashRE.FindStringSubmatch(r.URL.Path); m == nil {
http.NotFound(w, r)
} else {
http.ServeFile(w, r, imagePath(m[1]))
}
}
func handleThumbs(w http.ResponseWriter, r *http.Request) {
if m := hashRE.FindStringSubmatch(r.URL.Path); m == nil {
http.NotFound(w, r)
} else {
http.ServeFile(w, r, thumbPath(m[1]))
}
}
// cmdWeb runs a web UI against GD on ADDRESS.
func cmdWeb(fs *flag.FlagSet, args []string) error {
@@ -1156,6 +1292,9 @@ type syncContext struct {
stmtDisposeSub *sql.Stmt
stmtDisposeAll *sql.Stmt
// exclude specifies filesystem paths that should be seen as missing.
exclude *regexp.Regexp
// linked tracks which image hashes we've checked so far in the run.
linked map[string]struct{}
}
@@ -1250,7 +1389,7 @@ func syncIsImage(path string) (bool, error) {
}
func syncPingImage(path string) (int, int, error) {
out, err := exec.Command("magick", "identify", "-limit", "thread", "1",
out, err := exec.Command("identify", "-limit", "thread", "1",
"-ping", "-format", "%w %h", path+"[0]").Output()
if err != nil {
return 0, 0, err
@@ -1415,7 +1554,11 @@ func syncPostProcess(c *syncContext, info syncFileInfo) error {
case info.err != nil:
// * → error
if ee, ok := info.err.(*exec.ExitError); ok {
syncPrintf(c, "%s: %s", info.fsPath, ee.Stderr)
message := string(ee.Stderr)
if message == "" {
message = ee.String()
}
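// ee.String() comes from the embedded os.ProcessState and renders the
// exit status or terminating signal (e.g. "signal: killed"), so
// children that died without producing output no longer fail silently.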
syncPrintf(c, "%s: %s", info.fsPath, message)
} else {
return info.err
}
@@ -1560,6 +1703,12 @@ func syncDirectory(c *syncContext, dbParent int64, fsPath string) error {
fs = nil
}
if c.exclude != nil {
fs = slices.DeleteFunc(fs, func(f syncFile) bool {
return c.exclude.MatchString(filepath.Join(fsPath, f.fsName))
})
}
// Convert differences to a form more convenient for processing.
iDB, iFS, pairs := 0, 0, []syncPair{}
for iDB < len(db) && iFS < len(fs) {
@@ -1735,9 +1884,21 @@ const disposeCTE = `WITH RECURSIVE
HAVING count = total
)`
type excludeRE struct{ re *regexp.Regexp }
func (re *excludeRE) String() string { return fmt.Sprintf("%v", re.re) }
func (re *excludeRE) Set(value string) error {
var err error
re.re, err = regexp.Compile(value)
return err
}
// cmdSync ensures the given (sub)roots are accurately reflected
// in the database.
func cmdSync(fs *flag.FlagSet, args []string) error {
var exclude excludeRE
fs.Var(&exclude, "exclude", "exclude paths matching regular expression")
fullpaths := fs.Bool("fullpaths", false, "don't basename arguments")
if err := fs.Parse(args); err != nil {
return err
@@ -1775,7 +1936,7 @@ func cmdSync(fs *flag.FlagSet, args []string) error {
}
c := syncContext{ctx: ctx, tx: tx, pb: newProgressBar(-1),
linked: make(map[string]struct{})}
exclude: exclude.re, linked: make(map[string]struct{})}
defer c.pb.Stop()
if c.stmtOrphan, err = c.tx.Prepare(disposeCTE + `
@@ -1871,6 +2032,88 @@ func cmdRemove(fs *flag.FlagSet, args []string) error {
return tx.Commit()
}
// --- Forgetting --------------------------------------------------------------
// cmdForget is for purging orphaned images from the database.
func cmdForget(fs *flag.FlagSet, args []string) error {
if err := fs.Parse(args); err != nil {
return err
}
if fs.NArg() < 2 {
return errWrongUsage
}
if err := openDB(fs.Arg(0)); err != nil {
return err
}
tx, err := db.Begin()
if err != nil {
return err
}
defer tx.Rollback()
// Creating a temporary table seems justifiable in this case.
_, err = tx.Exec(
`CREATE TEMPORARY TABLE forgotten (sha1 TEXT PRIMARY KEY)`)
if err != nil {
return err
}
stmt, err := tx.Prepare(`INSERT INTO forgotten (sha1) VALUES (?)`)
if err != nil {
return err
}
defer stmt.Close()
for _, sha1 := range fs.Args()[1:] {
if _, err := stmt.Exec(sha1); err != nil {
return err
}
}
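// Note: DELETE ... RETURNING requires SQLite 3.35.0 (2021-03-12) or newer.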
rows, err := tx.Query(`DELETE FROM forgotten
WHERE sha1 IN (SELECT sha1 FROM node)
OR sha1 NOT IN (SELECT sha1 FROM image)
RETURNING sha1`)
if err != nil {
return err
}
defer rows.Close()
for rows.Next() {
var sha1 string
if err := rows.Scan(&sha1); err != nil {
return err
}
log.Printf("not an orphan or not known at all: %s", sha1)
}
if _, err = tx.Exec(`
DELETE FROM tag_assignment WHERE sha1 IN (SELECT sha1 FROM forgotten);
DELETE FROM orphan WHERE sha1 IN (SELECT sha1 FROM forgotten);
DELETE FROM image WHERE sha1 IN (SELECT sha1 FROM forgotten);
`); err != nil {
return err
}
rows, err = tx.Query(`SELECT sha1 FROM forgotten`)
if err != nil {
return err
}
defer rows.Close()
for rows.Next() {
var sha1 string
if err := rows.Scan(&sha1); err != nil {
return err
}
if err := os.Remove(imagePath(sha1)); err != nil &&
!os.IsNotExist(err) {
log.Printf("%s", err)
}
if err := os.Remove(thumbPath(sha1)); err != nil &&
!os.IsNotExist(err) {
log.Printf("%s", err)
}
}
return tx.Commit()
}
// --- Tagging -----------------------------------------------------------------
// cmdTag mass imports tags from data passed on stdin as a TSV
@@ -1993,36 +2236,54 @@ func collectFileListing(root string) (paths []string, err error) {
return
}
func checkFiles(root, suffix string, hashes []string) (bool, []string, error) {
func checkFiles(gc bool,
root, suffix string, hashes []string) (bool, []string, error) {
db := hashesToFileListing(root, suffix, hashes)
fs, err := collectFileListing(root)
if err != nil {
return false, nil, err
}
iDB, iFS, ok, intersection := 0, 0, true, []string{}
// There are two legitimate cases of FS-only database files:
// 1. There is no code to unlink images at all
// (although sync should create orphan records for everything).
// 2. thumbnail: failures may result in an unreferenced garbage image.
ok := true
onlyDB := func(path string) {
ok = false
fmt.Printf("only in DB: %s\n", path)
}
onlyFS := func(path string) {
if !gc {
ok = false
fmt.Printf("only in FS: %s\n", path)
} else if err := os.Remove(path); err != nil {
ok = false
fmt.Printf("only in FS (removing failed): %s: %s\n", path, err)
} else {
fmt.Printf("only in FS (removing): %s\n", path)
}
}
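// Both listings are sorted, so what follows is a plain two-pointer
// merge; e.g., db = [a, b, d] against fs = [b, c] reports a and d as
// only in the DB, c as only in the FS, and keeps [b] as the
// intersection.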
iDB, iFS, intersection := 0, 0, []string{}
for iDB < len(db) && iFS < len(fs) {
if db[iDB] == fs[iFS] {
intersection = append(intersection, db[iDB])
iDB++
iFS++
} else if db[iDB] < fs[iFS] {
ok = false
fmt.Printf("only in DB: %s\n", db[iDB])
onlyDB(db[iDB])
iDB++
} else {
ok = false
fmt.Printf("only in FS: %s\n", fs[iFS])
onlyFS(fs[iFS])
iFS++
}
}
for _, path := range db[iDB:] {
ok = false
fmt.Printf("only in DB: %s\n", path)
onlyDB(path)
}
for _, path := range fs[iFS:] {
ok = false
fmt.Printf("only in FS: %s\n", path)
onlyFS(path)
}
return ok, intersection, nil
}
@@ -2070,6 +2331,7 @@ func checkHashes(paths []string) (bool, error) {
// cmdCheck carries out various database consistency checks.
func cmdCheck(fs *flag.FlagSet, args []string) error {
full := fs.Bool("full", false, "verify image hashes")
gc := fs.Bool("gc", false, "garbage collect database files")
if err := fs.Parse(args); err != nil {
return err
}
@@ -2106,13 +2368,13 @@ func cmdCheck(fs *flag.FlagSet, args []string) error {
// This somewhat duplicates {image,thumb}Path().
log.Println("checking SQL against filesystem")
okImages, intersection, err := checkFiles(
okImages, intersection, err := checkFiles(*gc,
filepath.Join(galleryDirectory, nameOfImageRoot), "", allSHA1)
if err != nil {
return err
}
okThumbs, _, err := checkFiles(
okThumbs, _, err := checkFiles(*gc,
filepath.Join(galleryDirectory, nameOfThumbRoot), ".webp", thumbSHA1)
if err != nil {
return err
@@ -2121,11 +2383,11 @@ func cmdCheck(fs *flag.FlagSet, args []string) error {
ok = false
}
log.Println("checking for dead symlinks")
log.Println("checking for dead symlinks (should become orphans on sync)")
for _, path := range intersection {
if _, err := os.Stat(path); err != nil {
ok = false
fmt.Printf("%s: %s\n", path, err)
fmt.Printf("%s: %s\n", path, err.(*os.PathError).Unwrap())
}
}
@@ -2172,8 +2434,13 @@ func makeThumbnail(load bool, pathImage, pathThumb string) (
return 0, 0, err
}
// This is still too much, but it will be effective enough.
memoryLimit := strconv.FormatInt(
int64(C.sysconf(C._SC_PHYS_PAGES)*C.sysconf(C._SC_PAGE_SIZE))/
int64(len(taskSemaphore)), 10)
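// (_SC_PHYS_PAGES is a widespread extension rather than strict POSIX,
// and the division spreads physical memory across concurrent tasks.)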
// Create a normalized thumbnail. Since we don't particularly need
// any complex processing, such as surrounding of metadata,
// any complex processing, such as surrounding metadata,
// simply push it through ImageMagick.
//
// - http://www.ericbrasseur.org/gamma.html
@@ -2185,8 +2452,17 @@ func makeThumbnail(load bool, pathImage, pathThumb string) (
//
// TODO: See if we can optimize resulting WebP animations.
// (Do -layers optimize* apply to this format at all?)
cmd := exec.Command("magick", "-limit", "thread", "1", pathImage,
"-coalesce", "-colorspace", "RGB", "-auto-orient", "-strip",
cmd := exec.Command("convert", "-limit", "thread", "1",
// Do not invite the OOM killer, a particularly unpleasant guest.
"-limit", "memory", memoryLimit,
// ImageMagick creates files in /tmp, but that tends to be a tmpfs,
// which is backed by memory. The path could also be moved elsewhere:
// -define registry:temporary-path=/var/tmp
"-limit", "map", "0", "-limit", "disk", "0",
pathImage, "-coalesce", "-colorspace", "RGB", "-auto-orient", "-strip",
"-resize", "256x128>", "-colorspace", "sRGB",
"-format", "%w %h", "+write", pathThumb, "-delete", "1--1", "info:")
@@ -2237,7 +2513,10 @@ func cmdThumbnail(fs *flag.FlagSet, args []string) error {
w, h, err := makeThumbnail(*load, pathImage, pathThumb)
if err != nil {
if ee, ok := err.(*exec.ExitError); ok {
return string(ee.Stderr), nil
if message = string(ee.Stderr); message != "" {
return message, nil
}
return ee.String(), nil
}
return "", err
}
@@ -2390,14 +2669,29 @@ func cmdDhash(fs *flag.FlagSet, args []string) error {
}
}
stmt, err := db.Prepare(`UPDATE image SET dhash = ? WHERE sha1 = ?`)
// Commits are very IO-expensive in both WAL and non-WAL SQLite,
// so write this in one go. For a middle ground, we could batch the updates.
tx, err := db.Begin()
if err != nil {
return err
}
defer tx.Rollback()
// Mild hack: upgrade the transaction to a write one straight away,
// in order to rule out deadlocks (preventable failure).
if _, err := tx.Exec(`END TRANSACTION;
BEGIN IMMEDIATE TRANSACTION`); err != nil {
return err
}
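// (If the driver is mattn/go-sqlite3, the same effect could also be
// requested up front through the _txlock=immediate DSN parameter.)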
stmt, err := tx.Prepare(`UPDATE image SET dhash = ? WHERE sha1 = ?`)
if err != nil {
return err
}
defer stmt.Close()
var mu sync.Mutex
return parallelize(hexSHA1, func(sha1 string) (message string, err error) {
err = parallelize(hexSHA1, func(sha1 string) (message string, err error) {
hash, err := makeDhash(sha1)
if errors.Is(err, errIsAnimation) {
// Ignoring this common condition.
@@ -2411,6 +2705,10 @@ func cmdDhash(fs *flag.FlagSet, args []string) error {
_, err = stmt.Exec(int64(hash), sha1)
return "", err
})
if err != nil {
return err
}
return tx.Commit()
}
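The "middle ground" batching mentioned in the comment above could look roughly like this; batchSize is an illustrative constant, and the statement has to be re-prepared after every commit because prepared statements are tied to their transaction (a sketch, untested):
// Sketch only: commit every batchSize updates, bounding the work lost
// on a crash while keeping commit IO amortized.
const batchSize = 4096
done := 0
flush := func() (err error) {
	_ = stmt.Close()
	if err = tx.Commit(); err != nil {
		return err
	}
	if tx, err = db.Begin(); err != nil {
		return err
	}
	stmt, err = tx.Prepare(`UPDATE image SET dhash = ? WHERE sha1 = ?`)
	return err
}
// Inside the parallelize callback, under the mutex guarding stmt:
if done++; done%batchSize == 0 {
	if err := flush(); err != nil {
		return "", err
	}
}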
// --- Main --------------------------------------------------------------------
@@ -2427,6 +2725,7 @@ var commands = map[string]struct {
"tag": {cmdTag, "GD SPACE [DESCRIPTION]", "Import tags."},
"sync": {cmdSync, "GD ROOT...", "Synchronise with the filesystem."},
"remove": {cmdRemove, "GD PATH...", "Remove database subtrees."},
"forget": {cmdForget, "GD SHA1...", "Dispose of orphans."},
"check": {cmdCheck, "GD", "Run consistency checks."},
"thumbnail": {cmdThumbnail, "GD [SHA1...]", "Generate thumbnails."},
"dhash": {cmdDhash, "GD [SHA1...]", "Compute perceptual hashes."},
@@ -2452,6 +2751,8 @@ func usage() {
}
func main() {
threads := flag.Int("threads", -1, "level of parallelization")
// This implements the -h switch for us by default.
// The rest of the handling here closely follows what flag does internally.
flag.Usage = usage
@@ -2477,12 +2778,20 @@ func main() {
fs.PrintDefaults()
}
taskSemaphore = newSemaphore(runtime.NumCPU())
if *threads > 0 {
taskSemaphore = newSemaphore(*threads)
} else {
taskSemaphore = newSemaphore(runtime.NumCPU())
}
err := cmd.handler(fs, flag.Args()[1:])
// Note that the database object has a closing finalizer;
// we just additionally print any errors coming from there.
if db != nil {
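// The SQLite documentation suggests running PRAGMA optimize
// right before closing long-lived connections.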
if _, err := db.Exec(`PRAGMA optimize`); err != nil {
log.Println(err)
}
if err := db.Close(); err != nil {
log.Println(err)
}


@@ -10,7 +10,7 @@ function call(method, params) {
callActive++
return m.request({
method: "POST",
url: `/api/${method}`,
url: `api/${method}`,
body: params,
}).then(result => {
callActive--
@@ -98,7 +98,7 @@ let Thumbnail = {
if (!e.thumbW || !e.thumbH)
return m('.thumbnail.missing', {...vnode.attrs, info: null})
return m('img.thumbnail', {...vnode.attrs, info: null,
src: `/thumb/${e.sha1}`, width: e.thumbW, height: e.thumbH,
src: `thumb/${e.sha1}`, width: e.thumbW, height: e.thumbH,
loading})
},
}
@@ -472,13 +472,15 @@ let ViewBar = {
m('ul', ViewModel.paths.map(path =>
m('li', m(ViewBarPath, {path})))),
m('h2', "Tags"),
Object.entries(ViewModel.tags).map(([space, tags]) => [
m("h3", m(m.route.Link, {href: `/tags/${space}`}, space)),
m("ul.tags", Object.entries(tags)
.sort(([t1, w1], [t2, w2]) => (w2 - w1))
.map(([tag, score]) =>
m(ScoredTag, {space, tagname: tag, score}))),
]),
Object.entries(ViewModel.tags).map(([space, tags]) =>
m('details[open]', [
m('summary', m("h3",
m(m.route.Link, {href: `/tags/${space}`}, space))),
m("ul.tags", Object.entries(tags)
.sort(([t1, w1], [t2, w2]) => (w2 - w1))
.map(([tag, score]) =>
m(ScoredTag, {space, tagname: tag, score}))),
])),
])
},
}
@@ -492,7 +494,7 @@ let View = {
view(vnode) {
const view = m('.view', [
ViewModel.sha1 !== undefined
? m('img', {src: `/image/${ViewModel.sha1}`,
? m('img', {src: `image/${ViewModel.sha1}`,
width: ViewModel.width, height: ViewModel.height})
: "No image.",
])
@@ -609,13 +611,14 @@ let SearchRelated = {
view(vnode) {
return Object.entries(SearchModel.related)
.sort((a, b) => a[0].localeCompare(b[0]))
.map(([space, tags]) => [
m('h2', space),
.map(([space, tags]) => m('details[open]', [
m('summary', m('h2',
m(m.route.Link, {href: `/tags/${space}`}, space))),
m('ul.tags', tags
.sort((a, b) => (b.score - a.score))
.map(({tag, score}) =>
m(ScoredTag, {space, tagname: tag, score}))),
])
]))
},
}
@@ -646,7 +649,11 @@ let Search = {
m(Header),
m('.body', {}, [
m('.sidebar', [
m('p', SearchModel.query),
m('input', {
value: SearchModel.query,
onchange: event => m.route.set(
`/search/:key`, {key: event.target.value}),
}),
m(SearchRelated),
]),
m(SearchView),


@@ -24,11 +24,15 @@ a { color: inherit; }
.header .activity { padding: .25rem .5rem; align-self: center; color: #fff; }
.header .activity.error { color: #f00; }
summary h2, summary h3 { display: inline-block; }
.sidebar { padding: .25rem .5rem; background: var(--shade-color);
border-right: 1px solid #ccc; overflow: auto;
min-width: 10rem; max-width: 20rem; flex-shrink: 0; }
.sidebar input { width: 100%; box-sizing: border-box; margin: .5rem 0;
font-size: inherit; }
.sidebar h2 { margin: 0.5em 0 0.25em 0; padding: 0; font-size: 1.2rem; }
.sidebar ul { margin: .5rem 0; padding: 0; }
.sidebar ul { margin: 0; padding: 0; }
.sidebar .path { margin: .5rem -.5rem; }
.sidebar .path li { margin: 0; padding: 0; }
@@ -79,7 +83,7 @@ img.thumbnail, .thumbnail.missing { box-shadow: 0 0 3px rgba(0, 0, 0, 0.75);
.viewbar { padding: .25rem .5rem; background: #eee;
border-left: 1px solid #ccc; min-width: 20rem; overflow: auto; }
.viewbar h2 { margin: 0.5em 0 0.25em 0; padding: 0; font-size: 1.2rem; }
.viewbar h3 { margin: 0.25em 0; padding: 0; font-size: 1.1rem; }
.viewbar h3 { margin: 0.5em 0 0.25em 0; padding: 0; font-size: 1.1rem; }
.viewbar ul { margin: 0; padding: 0 0 0 1.25em; list-style-type: "- "; }
.viewbar ul.tags { padding: 0; list-style-type: none; }
.viewbar li { margin: 0; padding: 0; }


@@ -16,6 +16,9 @@ sha1duplicate=$sha1
cp $input/Test/dhash.png \
$input/Test/multiple-paths.png
gen -seed 15 -size 256x256 plasma:fractal \
$input/Test/excluded.png
gen -seed 20 -size 160x128 plasma:fractal \
-bordercolor transparent -border 64 \
$input/Test/transparent-wide.png
@@ -36,7 +39,7 @@ gen $input/Test/animation-small.gif \
$input/Test/video.mp4
./gallery init $target
./gallery sync $target $input "$@"
./gallery sync -exclude '/excluded[.]' $target $input "$@"
./gallery thumbnail $target
./gallery dhash $target
./gallery tag $target test "Test space" <<-END
@@ -47,7 +50,7 @@ END
# TODO: Test all the various possible sync transitions.
mv $input/Test $input/Plasma
./gallery sync $target $input
./gallery sync -exclude '/excluded[.]' $target $input
./gallery web $target :8080 &
web=$!