DOCS: document patchwork algorithm

Signed-off-by: Stefan Westerfeld <stefan@space.twc.de>

Commit b2ca2218 authored Feb 21, 2024 by Stefan Westerfeld
Showing with 123 additions and 2 deletions

docs/Makefile.am +4 -2
docs/audiowmark.md +74 -0
docs/example-spectrum.dat +16 -0
docs/example-spectrum.gp +29 -0

--- a/docs/Makefile.am
+++ b/docs/Makefile.am
@@ -4,12 +4,14 @@ doc_DATA  = audiowmark.pdf audiowmark.html

 GRAPHVIZ_PY = graphviz.py

-audiowmark.pdf: audiowmark.md $(GRAPHVIZ_PY)
+audiowmark.pdf: audiowmark.md $(GRAPHVIZ_PY) example-spectrum.png
 	pandoc -F $(GRAPHVIZ_PY) -V papersize:a4 -V geometry:margin=2cm $< -o $@

-audiowmark.html: audiowmark.md $(GRAPHVIZ_PY)
+audiowmark.html: audiowmark.md $(GRAPHVIZ_PY) example-spectrum.png
 	pandoc -F $(GRAPHVIZ_PY) $< -o $@

+example-spectrum.png: example-spectrum.dat example-spectrum.gp
+	gnuplot example-spectrum.gp
 # DEPS: apt install -y python3-pygraphviz python3-pandocfilters

 clean:

--- a/docs/audiowmark.md
+++ b/docs/audiowmark.md
@@ -662,6 +662,80 @@ During decoding, the same Pseudo Random Number Generator sequences R1…R6 are u
 By using the same AES key and a cryptographically secure PRNG, the sequences are uniformly distributed and deterministically reproducible but cannot be extrapolated.
 This prevents watermark extraction or modification by anyone without possession of the exact encoding key.

+## The Patchwork Algorithm
+
+![Example Spectrum](example-spectrum.png)
+
+To store one single bit inside a spectrum, **audiowmark** uses the patchwork
+algorithm. From the frequency bands of the spectrum (generated by computing the
+FFT of one frame), two groups are chosen in the frequency range of the watermark
+using the pseudo random number generator. These are called up- and down-bands.
+In the example above, the up-bands are red and the down-bands are green.
+Typically there are 30 up- and 30 down-bands and the other bands do not carry
+information.
+
+To embed a single bit, the following changes are made to the spectrum:
+
+ * to **store a 1 bit**, each magnitude of each up-band is increased by a small amount,
+   and each magnitude of each down-band is decreased by a small amount (this is
+   shown by the small arrows in the example image)
+
+ * to **store a 0 bit**, each magnitude of each up-band is decreased, and each magnitude of
+   each down-band is increased (the opposite of the small arrows in the example image)
+
+Since we have pseudo-randomly chosen the up- and down-bands from the spectrum,
+we can expect that if we sum up all values of the up-bands and sum up all
+values of the down-bands **before** embedding the bit, we will get a similar
+result (because both groups are drawn from the same bins and so share the same expected mean).
+
+However, since we increased all elements of the up-bands and decreased all
+elements of the down-bands **after embedding a 1 bit**, the sum of the up-bands
+should be **greater than** the sum of the down-bands.
+
+So to decode the bit from the spectrum, we can simply use the rule
+
+ * **decode as 1 bit**, if the sum of the up-bands is greater than the sum
+   of the down-bands
+
+ * **decode as 0 bit**, if the sum of the up-bands is smaller than the sum
+   of the down-bands
+
+In the actual implementation, increasing/decreasing the magnitude of the
+up-/down-bands is done by generating a watermark signal with the right
+magnitude/phase for each frame that only contains the changes. So we
+compute a delta spectrum, which is then passed to the IFFT, windowed and then
+added to the original audio, so that the sum has the desired modified spectrum
+magnitude.
+
+The detection is performed on dB values of the magnitudes of the spectrum
+obtained from the FFT, so the sums of the dB values of up-/down-bands are
+computed and compared to decide whether a 0 bit or 1 bit was received.
+
+The patchwork algorithm does not guarantee that encoding/decoding will always
+yield the right result at the lowest level of embedding/decoding one bit (as
+the difference of the up-/down-bands can be too big before embedding due to
+the original signal). However, error correction and the redundancy of embedding
+a bit in more than one frame make the whole process reliable at a higher level.
+
+There are three improvements over the basic patchwork algorithm described
+above, which make the watermark detection more accurate:
+
+* Soft decoding is used for the convolutional decoder. Instead of making a
+hard decision between a 0 and a 1 bit by comparing the two sums before
+decoding the convolutional code to obtain the message bits, the difference
+between the two sums is normalized and used as a soft-bit input for the
+Viterbi algorithm.
+
+* Instead of storing one data bit in each frame spectrum, a data bit uses up-
+and down-bands from different frames. This is called mix-encoding, which
+spreads the information of each data bit over many frames.
+
+* As described above, the original signal can have some negative effect
+on the performance of the decoder, since the sum of the up-bands and the
+sum of the down-bands will be different even before embedding the bits.
+To make detection more reliable, the original signal level for each bin is
+estimated as the average of its values in the previous and next spectrum, and
+this estimate is subtracted before computing the sums of the up- and down-bands.

 ## Speed Detection

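The basic embed/decode rule documented in the `docs/audiowmark.md` hunk above can be sketched in a few lines of Python. This is a minimal illustration, not the audiowmark implementation: `choose_bands`, `embed_bit` and `decode_bit` are hypothetical names, Python's `random.Random` stands in for the keyed PRNG, the bin count of 200 is arbitrary, and the multiplicative `factor` is an assumption (the text only says "a small amount"). The sketch also edits magnitudes directly instead of generating a delta spectrum and adding it to the audio.

```python
import math
import random


def choose_bands(n_bins, n_per_group, seed):
    """Pick two disjoint groups of bin indices (up-bands, down-bands)."""
    rng = random.Random(seed)  # hypothetical stand-in for the keyed PRNG
    picked = rng.sample(range(n_bins), 2 * n_per_group)
    return picked[:n_per_group], picked[n_per_group:]


def embed_bit(mags, up, down, bit, factor=1.05):
    """Return a copy of mags with the band magnitudes nudged for the given bit."""
    # Assumption: a multiplicative change; a 1 bit raises up-bands and lowers down-bands.
    up_gain = factor if bit == 1 else 1 / factor
    out = list(mags)
    for i in up:
        out[i] *= up_gain
    for i in down:
        out[i] /= up_gain
    return out


def decode_bit(mags, up, down):
    """Compare the sums of the dB values of the up- and down-bands."""
    def db(m):
        return 20 * math.log10(m)

    up_sum = sum(db(mags[i]) for i in up)
    down_sum = sum(db(mags[i]) for i in down)
    return 1 if up_sum > down_sum else 0


up, down = choose_bands(n_bins=200, n_per_group=30, seed=42)
flat = [1000.0] * 200  # idealized spectrum with identical magnitudes
print(decode_bit(embed_bit(flat, up, down, 1), up, down))
print(decode_bit(embed_bit(flat, up, down, 0), up, down))
```

The flat spectrum is chosen so that both sums are equal before embedding. With real audio the two sums differ from the start, which is why a single frame can decode incorrectly.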

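The third improvement in the documentation hunk above (subtracting an estimate of the original signal) can be illustrated with a toy example. The function names and the dB values are invented for illustration, and the sketch assumes that the previous and next frames approximate the unmarked signal in every bin, which real audio only satisfies roughly.

```python
def subtract_original_estimate(prev_db, cur_db, next_db):
    """Subtract the per-bin mean of the neighbouring frames from the current frame."""
    return [c - (p + n) / 2 for p, c, n in zip(prev_db, cur_db, next_db)]


def band_difference(values, up, down):
    """Sum over the up-bands minus sum over the down-bands."""
    return sum(values[i] for i in up) - sum(values[i] for i in down)


up, down = [0, 1], [2, 3]           # hypothetical band assignment
original = [60.0, 58.0, 40.0, 42.0]  # invented dB values, loud in the up-bands
marked = [59.0, 57.0, 41.0, 43.0]    # a 0 bit: up-bands lowered, down-bands raised by 1 dB

raw = band_difference(marked, up, down)
cleaned = band_difference(
    subtract_original_estimate(original, marked, original), up, down
)
print(raw, cleaned)
```

In this toy case the raw difference is positive, so a plain comparison would decode a 1 even though a 0 was embedded. After subtracting the estimate, the difference is negative and the bit decodes correctly.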
--- a/docs/example-spectrum.dat
+++ b/docs/example-spectrum.dat
+1 10513 3
+2 13061 2
+3 11180 1
+4 8732 2
+5 13586 3
+6 14266 1
+7 14647 1
+8 7102 3
+9 16472 2
+10 6856 2
+11 13923 1
+12 16685 2
+13 16025 3
+14 16708 3
+15 11310 1
+16 16395 3
--- a/docs/example-spectrum.gp
+++ b/docs/example-spectrum.gp
+set terminal pngcairo size 800,450
+set output 'example-spectrum.png'
+
+# Set bar width
+set boxwidth 0.5
+set style fill solid border -1
+set linetype 1 lc rgb "#990000" lw 1
+set linetype 2 lc rgb "#009900" lw 1
+set linetype 3 lc rgb "#666666" lw 1
+
+set arrow from graph 0,1 to graph 0,1.1 filled
+set arrow from graph 1,0 to graph 1.1,0 filled
+set tmargin 5
+set rmargin 10
+set border 3
+set tics nomirror
+set noxtics
+set noytics
+set grid
+set ylabel "Magnitude"
+set xlabel "Frequency"
+
+# Set the range for y-axis
+set yrange [0:*]
+
+# Plot the data with different colors
+plot "example-spectrum.dat" using 1:2:3 with boxes lc variable notitle, \
+     "<grep '1$' example-spectrum.dat" u 1:2:(0):(1500) with vectors lc 1 notitle, \
+     "<grep '2$' example-spectrum.dat" u 1:($2+1500):(0):(-1500) with vectors lc 2 notitle
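As a worked check of the figure, the following sketch applies the decoding rule to the values in `example-spectrum.dat`. The third column is read as the band type (1 = up-band, 2 = down-band, 3 = unused), matching the line types used in the gnuplot script. Treating the arrow length of 1500 as the embedding step is an assumption made for illustration, and the comparison uses the raw magnitudes, whereas the documentation states that the actual detection works on dB values.

```python
DATA = """\
1 10513 3
2 13061 2
3 11180 1
4 8732 2
5 13586 3
6 14266 1
7 14647 1
8 7102 3
9 16472 2
10 6856 2
11 13923 1
12 16685 2
13 16025 3
14 16708 3
15 11310 1
16 16395 3
"""
DELTA = 1500  # arrow length in example-spectrum.gp, assumed here as the step size

rows = [tuple(map(int, line.split())) for line in DATA.splitlines()]


def band_sums(rows, bit=None):
    """Sum up-band and down-band magnitudes, optionally after embedding a bit."""
    shift = 0 if bit is None else (DELTA if bit == 1 else -DELTA)
    up = sum(mag + shift for _, mag, kind in rows if kind == 1)
    down = sum(mag - shift for _, mag, kind in rows if kind == 2)
    return up, down


for bit in (None, 1, 0):
    up, down = band_sums(rows, bit)
    print(bit, up, down, "decodes as", 1 if up > down else 0)
```

With these example values the up-band sum (65326) already exceeds the down-band sum (61806) before anything is embedded, so the unmarked spectrum would decode as a 1. The assumed step is large enough to flip the comparison when a 0 is embedded, but the case shows why the documentation relies on error correction and redundancy rather than on a single frame.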