How the engine works
OLED Guard Pro is not a screensaver. It is a real-time, GPU-resident video pipeline that runs every frame, on every connected display, while you use your computer. This page is the technical tour.
The four-stage pipeline
┌──────────┐     ┌──────────┐     ┌──────────┐     ┌──────────┐
│ CAPTURE  │ ──▶ │  MODEL   │ ──▶ │  POLICY  │ ──▶ │COMPOSITE │
│   DXGI   │     │ exposure │     │ Auto/man │     │   DWM    │
└──────────┘     └──────────┘     └──────────┘     └──────────┘
1. Capture — DXGI Desktop Duplication
Windows ships an API called Desktop Duplication that gives you a GPU texture handle to whatever DWM is composing for a given display, every time it changes. We use it because:
- It runs on the GPU. The frame buffer never leaves video memory.
- It reliably captures borderless fullscreen games (which is the mode most gamers actually play in). Older approaches like BitBlt or PrintWindow do not.
- It supports HDR, multi-monitor, and high-refresh displays without us doing anything special.
2. Model — per-pixel exposure shader
A pixel shader processes each captured frame at native resolution. For every pixel it computes:
luminance = dot(pixelRGB, vec3(0.2126, 0.7152, 0.0722));
delta = luminance * frameTime;
exposure[p] = exposure[p] + delta;
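The same accumulation can be sketched on the CPU for illustration. This is a minimal pure-Python model, not the shader itself: the function names and the list-of-tuples frame layout are assumptions, and pixel values are taken to be linear RGB in the range 0..1.

```python
# Rec. 709 luma weights, matching the constants in the snippet above.
LUMA_WEIGHTS = (0.2126, 0.7152, 0.0722)

def luminance(rgb):
    """Weighted sum of the R, G, B components of one pixel."""
    return sum(w * c for w, c in zip(LUMA_WEIGHTS, rgb))

def accumulate_exposure(exposure, frame, frame_time):
    """Return a new exposure map after integrating one frame.

    exposure   -- list of floats, one per pixel
    frame      -- list of (r, g, b) tuples, same length as exposure
    frame_time -- seconds the frame was on screen
    """
    return [e + luminance(px) * frame_time for e, px in zip(exposure, frame)]

exposure = [0.0, 0.0]
frame = [(1.0, 1.0, 1.0), (0.0, 0.0, 0.0)]  # one white pixel, one black pixel
for _ in range(144):                         # one second at 144 Hz
    exposure = accumulate_exposure(exposure, frame, 1 / 144)
```

After one simulated second the white pixel has accumulated roughly one unit of exposure and the black pixel none, which is the behaviour the three shader lines describe.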
Two parallel passes also run:
- Motion envelope. A cheap temporal high-pass: how much has this pixel changed in the last N frames? Pixels with high motion get their exposure decayed faster, because moving content does not concentrate aging.
- Stability detector. A low-pass on the motion envelope: pixels whose value has been stable for many frames are flagged as “static” and become candidates for protection.
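One way to picture the two passes is as a pair of chained filters per pixel. The sketch below is an assumption about their shape, not the engine's shader: it uses the absolute frame-to-frame difference as the high-pass, exponential smoothing for both filters, and hypothetical constants (`motion_alpha`, `stability_alpha`, `threshold`).

```python
from dataclasses import dataclass

@dataclass
class PixelState:
    prev: float = 0.0        # luminance seen on the previous frame
    motion: float = 0.0      # motion envelope: smoothed frame-to-frame change
    stability: float = 1.0   # low-passed envelope; starts as "not yet stable"

def update(state, lum, motion_alpha=0.5, stability_alpha=0.1):
    """Advance one pixel's filters by one frame."""
    change = abs(lum - state.prev)                       # temporal high-pass
    state.motion += motion_alpha * (change - state.motion)
    state.stability += stability_alpha * (state.motion - state.stability)
    state.prev = lum

def is_static(state, threshold=0.01):
    """A pixel is a protection candidate once its envelope has stayed low."""
    return state.stability < threshold

def run(luminances):
    """Feed a sequence of luminance samples through a fresh pixel state."""
    state = PixelState()
    for lum in luminances:
        update(state, lum)
    return state
```

With these constants, a pixel held at a constant value for 100 frames is flagged static, while one that has been constant for only 5 frames, or one that flickers between black and white, is not.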
The exposure histogram is double-buffered in GPU memory. There is no readback to the CPU on the hot path.
3. Policy — what to do about it
The policy stage decides three things, per pixel, every frame:
- How much protection to apply.
- What kind of overlay to render — noise (default), static dim, pixel shift, or dithering, with optional vignette-style edge weighting layered on top.
- Where the field is shaped — uniform, vignette-weighted toward the edges, or driven by app-profile geometry.
In manual mode these come from the sliders on the Overlay page. In Automatic Mode the controller looks at the live signal classification (work, gaming, video, idle), the per-pixel motion envelope, and the dynamism trace, and picks a configuration that the model says minimises risk subject to a perceptibility budget. You can watch this happen in the Advanced > Live Classifier readout.
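"Minimises risk subject to a perceptibility budget" is a constrained selection, which can be sketched as follows. The candidate names, the risk and perceptibility scores, and the fallback rule are all hypothetical; the real controller's inputs and scoring are not documented here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    name: str
    risk: float            # modelled burn-in risk, lower is better
    perceptibility: float  # how visible the overlay is, lower is better

def choose(candidates, budget):
    """Lowest-risk candidate whose perceptibility fits within the budget."""
    allowed = [c for c in candidates if c.perceptibility <= budget]
    if not allowed:
        # Assumed fallback: nothing fits, so take the least visible option.
        return min(candidates, key=lambda c: c.perceptibility)
    return min(allowed, key=lambda c: c.risk)

candidates = [
    Candidate("noise-light", risk=0.60, perceptibility=0.10),
    Candidate("noise-strong", risk=0.35, perceptibility=0.30),
    Candidate("static-dim", risk=0.20, perceptibility=0.55),
]
picked = choose(candidates, budget=0.40)
```

Here `picked` is the "noise-strong" candidate: "static-dim" has lower risk but exceeds the budget. A signal classification such as gaming or idle could plausibly be modelled as a different budget value.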
4. Composite — DWM premultiplied alpha
A second shader renders the chosen overlay into a transparent always-on-top window. The Desktop Window Manager composites that window onto your desktop using premultiplied alpha — the same path it uses for Windows’ own animations. That is why the overlay works correctly in:
- SDR and HDR modes,
- borderless fullscreen games,
- variable-refresh displays (G-Sync / FreeSync),
- multi-monitor setups,
- mixed-DPI configurations.
DWM is doing the actual blending. We are just supplying a frame.
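For reference, the premultiplied-alpha "over" blend is a single multiply-add per channel. The sketch below shows the standard formula on plain tuples; it illustrates the math, not DWM's implementation, and the function names are ours.

```python
def premultiply(rgb, alpha):
    """Scale straight-alpha colour by its alpha to get premultiplied colour."""
    return tuple(c * alpha for c in rgb)

def blend_premultiplied(src_rgb, src_alpha, dst_rgb):
    """'Over' blend where src_rgb has already been multiplied by src_alpha."""
    return tuple(s + (1.0 - src_alpha) * d for s, d in zip(src_rgb, dst_rgb))

overlay = premultiply((0.0, 0.0, 0.0), 0.25)   # a 25% black dimming layer
result = blend_premultiplied(overlay, 0.25, (0.8, 0.8, 0.8))
```

A 25% black overlay on a 0.8 grey desktop pixel yields roughly 0.6 per channel, and an overlay pixel with zero alpha leaves the desktop pixel untouched.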
Per-display, in parallel
Each connected display runs its own copy of the pipeline. They do not share state. A monitor change, a hot-plug, a resolution change — the engine notices, drops the affected pipeline, and rebuilds it without disturbing the others.
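The rebuild-only-what-changed behaviour can be sketched as a reconcile step over a per-display map. `Pipeline`, `Engine`, and `sync` are hypothetical names for illustration, and the display mode is reduced to a `(width, height, refresh_hz)` tuple.

```python
class Pipeline:
    """Stand-in for one display's capture/model/policy/composite chain."""
    def __init__(self, display_id, mode):
        self.display_id = display_id
        self.mode = mode               # (width, height, refresh_hz)

class Engine:
    def __init__(self):
        self.pipelines = {}            # display_id -> Pipeline, no shared state

    def sync(self, displays):
        """Reconcile pipelines with the current {display_id: mode} mapping."""
        for display_id in list(self.pipelines):
            if display_id not in displays:
                del self.pipelines[display_id]          # display unplugged
        for display_id, mode in displays.items():
            current = self.pipelines.get(display_id)
            if current is None or current.mode != mode:
                # New display or changed mode: rebuild only this pipeline.
                self.pipelines[display_id] = Pipeline(display_id, mode)
```

Calling `sync` again after one display changes resolution replaces that display's `Pipeline` object and leaves the other displays' objects as they were.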
What runs on the CPU
The CPU does:
- shader compilation at startup,
- presets, configuration, and the React UI,
- foreground-window monitoring for app profiles,
- DDC/CI commands when you change brightness through the app.
The CPU does not see your screen contents. The frames stay in GPU memory the entire time.
Performance budget
A representative measurement at 1440p / 144 Hz on a mid-range GPU:
| Stage     | Cost per frame |
| --------- | -------------: |
| Capture   |       ~ 0.4 ms |
| Model     |       ~ 0.3 ms |
| Composite |       ~ 0.5 ms |
| Total     |       ~ 1.2 ms |
That is about 7% of a 16.7 ms frame at 60 Hz, or about 17% of the 6.9 ms frame at 144 Hz, but it runs on the GPU side rather than blocking your game’s render path, so wall-clock impact in benchmarks is typically below 1%. Higher resolutions, higher refresh rates, and weaker GPUs scale the cost; the relative shape stays the same.
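The share of the frame budget follows directly from the refresh rate. A short calculation, using the stage costs from the table above (the helper names are ours):

```python
# Approximate per-frame stage costs in milliseconds, from the table above.
STAGE_COST_MS = {"capture": 0.4, "model": 0.3, "composite": 0.5}

def frame_budget_ms(refresh_hz):
    """Time available per frame at a given refresh rate."""
    return 1000.0 / refresh_hz

def budget_share(refresh_hz):
    """Fraction of one frame's time taken by the whole pipeline."""
    return sum(STAGE_COST_MS.values()) / frame_budget_ms(refresh_hz)

for hz in (60, 144):
    print(f"{hz} Hz: {frame_budget_ms(hz):.1f} ms frame, {budget_share(hz):.1%} used")
```

The same 1.2 ms total is a little over 7% of a frame at 60 Hz and a little over 17% at 144 Hz, which is why refresh rate matters when reading the table.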
What was deliberately left out
A few design choices we re-validated more than once:
- No cell-based intelligence layer. A cell-based protection layer that reasoned over 32 × 32 tiles was prototyped and abandoned. Per-pixel modelling is more honest about the physics; tiles produced staircase artifacts at content boundaries.
- No CPU-side burn-in heuristics. The engine does not try to recognise “this is a Discord sidebar” or “this is a YouTube logo.” Recognition is fragile and ages badly. Exposure is the universal physical quantity.
- No telemetry pipeline. Per-pixel histograms never leave your machine. We have no servers that receive them, by design.
If you want to see the engine in motion, the Advanced engine page in the desktop app exposes the live classifier, the per-knob auto-controller traces, and the live signals strip at 60 Hz.