Give Your Screenshots a Sound — a Stream Deck Button That Dings When the Image Is on the Clipboard
A Stream Deck button that plays a ding the instant a phone screenshot lands in the clipboard — no more counting to three. Plus a foldable multi-display gotcha (and the fix),...

I have three phones wired to this machine and a Stream Deck button for each that grabs a screenshot and drops it straight into the clipboard. Handy — except for one dumb little ritual: I’d press the button, then count “one… two… three…” in my head and hope the image was actually there before I pasted. No feedback. Pure superstition.
My speech-to-text setup already solves this elegantly: when the transcribed text lands in the clipboard, it plays a soft “ding” (paplay + freedesktop’s complete.oga). You hear it, you paste, done. So why not steal the same trick for screenshots?
The button command was a one-liner:
adb -s <SERIAL> exec-out screencap -p | xclip -selection clipboard -t image/png
The key insight: on X11, xclip reads its input to the end, forks into the background to keep holding the selection, and the foreground process exits at exactly the moment the image is in the clipboard. So I don’t need a timer; I just chain the ding onto the end of the pipeline. And while I’m at it, I make failure audible too: if the phone is unplugged and adb fails, play an error sound instead. By default a pipeline only reports the exit status of its last command (here xclip), so set -o pipefail is what makes the pipe report the adb failure.
bash -c 'set -o pipefail; adb -s <SERIAL> exec-out screencap -p \
| xclip -selection clipboard -t image/png \
&& paplay /usr/share/sounds/freedesktop/stereo/complete.oga \
|| paplay /usr/share/sounds/freedesktop/stereo/dialog-error.oga'
Now: press button → ding the instant it’s ready, or an error blip if it didn’t copy. No more counting to three. Bonus: I run copyq as a clipboard manager, so every screenshot is also kept in history and persists even after the source process exits (an X11 gotcha worth its own post).
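If the exit-status logic feels abstract, here is a small Python model of how bash picks a pipeline's status with and without pipefail. It is only a sketch of the documented rule; the function name pipeline_status and the sample exit codes are made up for illustration.

```python
def pipeline_status(exit_codes, pipefail=False):
    """Model the status bash reports for `a | b | c`.

    exit_codes holds each command's exit code, left to right.
    """
    if not exit_codes:
        raise ValueError("a pipeline needs at least one command")
    if not pipefail:
        # Default behaviour: only the last command's status counts.
        return exit_codes[-1]
    # With pipefail: the rightmost non-zero status wins, else 0.
    for code in reversed(exit_codes):
        if code != 0:
            return code
    return 0


# Hypothetical case: adb fails with 1, xclip still exits 0.
print(pipeline_status([1, 0]))                 # 0, so the success ding would fire
print(pipeline_status([1, 0], pipefail=True))  # 1, so the error sound plays
```

One thing the model also makes visible: the && ... || chain only sees that single final status, which is why the pipefail switch matters so much here.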
The foldable plot twist (welcome to large screens)
Two phones worked perfectly. Then I added a third — a Pixel Fold — and the button started lying to me: it played the success ding, but the clipboard had no usable image. Paste → nothing.
The ding fired because the pipeline exited 0. But the bytes weren’t a PNG. When I dumped them, the start was plain text:
[Warning] Multiple displays were found, but no display id was specified! Defaulting to the
first display found, however this default is not guaranteed to be consistent across captures.
That’s the catch with foldables: they have more than one display (inner + cover), and screencap -p with no display specified prints that warning to stdout, mixed into the image stream — corrupting the PNG. A single-screen phone never hits this.
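A quick way to catch this kind of pollution is to check the first eight bytes of the capture against the PNG file signature. The snippet below is a minimal sketch: looks_like_png is a hypothetical helper name, and the sample byte strings are stand-ins for real captures.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the 8 bytes every PNG file starts with


def looks_like_png(data: bytes) -> bool:
    """Return True if the captured bytes begin with the PNG signature."""
    return data.startswith(PNG_SIGNATURE)


# Stand-in data: a "clean" capture and one with the warning text in front.
clean = PNG_SIGNATURE + b"...image data..."
polluted = b"[Warning] Multiple displays were found\n" + clean

print(looks_like_png(clean))     # True
print(looks_like_png(polluted))  # False
```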
My first fix was to name the display explicitly with -d <physical-id>, which you get from SurfaceFlinger (note -d 0 fails — it wants the full physical id):
adb shell dumpsys SurfaceFlinger --display-id
# Display 4619827677550801152 (HWC display 0) <- inner panel
# Display 4619827677550801153 (HWC display 1) <- cover panel
I pinned it to the cover screen and shipped. Then Roberto Orgiu (@_tiwiz), who does Developer Relations for large screens at Google, set me straight: the display id you want isn’t stable — it follows the fold state. “Display 0” maps to the inner screen when unfolded and the cover screen when folded. So hardcoding the cover id means a flawless screenshot of pure black the moment you open the device: the inner screen is now the active one, and the cover is off.
The robust fix is to stop hardcoding and just capture the active display — which is exactly what screencap with no -d does (“display 0” = the live screen) — and strip the warning text it prepends. One small wrapper:
adb -s <SERIAL> exec-out screencap -p \
| python3 -c 'import sys; d=sys.stdin.buffer.read(); i=d.find(b"\x89PNG\r\n\x1a\n"); sys.stdout.buffer.write(d[i:] if i>=0 else b"")' \
| xclip -selection clipboard -t image/png
Folded → you get the cover. Unfolded → you get the inner screen. Always the one you’re actually looking at. No display ids to chase.
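For readability, here is the same filter as the python3 -c one-liner above, spelled out, with one optional hardening that is my assumption rather than part of the original command: it returns a non-zero status when no PNG signature is found, so a pipeline running under set -o pipefail would play the error sound instead of the ding. The names strip_to_png and run are illustrative.

```python
import io
import sys

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def strip_to_png(data: bytes) -> bytes:
    """Drop anything before the PNG signature; return b"" if none is found."""
    start = data.find(PNG_SIGNATURE)
    return data[start:] if start >= 0 else b""


def run(stdin, stdout) -> int:
    """Filter one capture from stdin to stdout; return a shell-style status."""
    png = strip_to_png(stdin.read())
    if not png:
        # No image in the stream: report failure instead of passing nothing on.
        return 1
    stdout.write(png)
    return 0


# Demo with in-memory streams standing in for adb's output and xclip's input.
fake_capture = io.BytesIO(
    b"[Warning] Multiple displays were found\n" + PNG_SIGNATURE + b"fake-image-bytes"
)
result = io.BytesIO()
status = run(fake_capture, result)
print(status, result.getvalue().startswith(PNG_SIGNATURE))  # 0 True
```

Saved as a script (say strip_png.py, a hypothetical name) that ends with sys.exit(run(sys.stdin.buffer, sys.stdout.buffer)), it could take the place of the one-liner in the pipeline.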
Takeaway
Audio is an underrated UI: any command that ends by putting something on the clipboard can tell you it’s done with one extra &&. Your ears are a free progress bar. And if you support large-screen devices in any tooling, the lesson generalizes — always ask “which display?”. Big thanks to Roberto Orgiu (@_tiwiz) for the catch.