Real-hardware results
This document records the methodology and numbers for ViFi's real-hardware heart-rate estimation. It is written so a technical reviewer can verify what was measured, how, and what it does and does not prove.
4.15 bpm cross-session HR mean absolute error against a Polar H10 chest strap, on $44 of commodity ESP32-S3 hardware, single subject, single room, 4 paired captures totaling ~8 minutes of real-hardware data.
| Holdout | Trained on | HR MAE | Within ±5 bpm | Bias |
|---|---|---|---|---|
| session5 | sessions 3, 4 | 3.89 bpm | 68.2% | +0.94 bpm |
| session4 | sessions 3, 5 | 4.41 bpm | 65.2% | +3.02 bpm |
| Mean | — | 4.15 bpm | 66.7% | +1.98 bpm |
The model never saw the held-out session during training. Both cross-session evaluations are on completely independent recordings.
Heart rate (HR), in beats per minute, every 10 seconds (5-second stride).
Each window's prediction is compared against the Polar H10 ground-truth HR at the window center, interpolated linearly between the H10's 1 Hz reading.
Metrics reported:
|predicted − true| across all windows(predicted − true), indicating systematic offset| Component | Spec |
|---|---|
| Transmitter (TX) | ESP32-S3-DevKitC-1U-N8R8, external dual-band antenna |
| Receiver (RX) | ESP32-S3-DevKitC-1U-N8R8, external dual-band antenna |
| Antenna pair | 2.4/5 GHz omnidirectional, vertical orientation |
| Pigtails | IPEX1 U.FL → RP-SMA Female, 8" |
| Firmware | Espressif ESP-IDF v6.0 wifi_csi_rx example, one-line patch for the WIFI_BW_HT* enum rename |
| Channel | WiFi channel 11, HT40 (40 MHz bandwidth) |
| Subcarriers | 192 per packet (LLTF + HT-LTF combined) |
| Packet rate (measured) | ~67–77 Hz per session |
| Ground-truth HR | Polar H10 chest strap |
CSI: a custom serial reader records raw bytes from the RX board's UART for exactly 120 seconds, alongside a metadata sidecar containing the measured packet rate.
HR: the Polar H10 streams 1 Hz HR notifications over BLE; auto-reconnects on Windows BLE drops (which happen every ~30 seconds and are handled transparently).
Both run as PowerShell background jobs in parallel; total wall-clock per capture: ~2.5 minutes.
| Session | CSI packets | HR readings | HR mean | HR range | Packet rate |
|---|---|---|---|---|---|
| session3 | 8,489 | 115 | ~89 bpm | 85–93 | 73.7 Hz |
| session4 | 8,026 | 118 | ~87 bpm | 80–95 | 66.8 Hz |
| session5 | 9,283 | 123 | ~88 bpm | 84–93 | 77.4 Hz |
(Session2 was excluded from cross-session evaluation because it was captured before the metadata-sidecar feature shipped, so its packet rate had to be assumed at 100 Hz which biases the FFT analysis.)
When ESP-IDF doesn't emit per-line timestamps (the default configuration), the parser synthesizes a uniform timestamp grid. Default 100 Hz was wrong — the watchdog throttles real packet rate to ~67–77 Hz. The capture tool writes a metadata sidecar with the measured rate; the parser auto-loads the sidecar to use the correct synthesized fs. This single correction is the difference between MAE 24 bpm (wrong assumed rate) and MAE 4 bpm (correct rate, after retraining).
The retraining tool accepts multiple paired captures and trains an XGBoost regressor on the union of all of them. To produce an honest generalization estimate, one session is held out from training, the model is retrained on the remaining sessions, then evaluated on the holdout.
The XGBoost regressor never sees the held-out session during training. Predictions on the holdout are pure generalization, not memorization.
These are the boundaries of the current dataset and the milestones queued up to extend each one. The pipeline and tooling are designed to scale into all of them — the work ahead is data collection and validation.
| Work | Hardware | HR MAE | Subjects | Conditions |
|---|---|---|---|---|
| PhaseBeat (Wang et al., IEEE INFOCOM 2017) | Intel 5300 NIC (~$500) | ~1.5 bpm | multiple | seated, controlled |
| FullBreathe (Zeng et al., UbiComp 2018) | Intel 5300 NIC | sub-bpm RR | multiple | varied orientation |
| ResBeat (Zhang et al., 2020) | Intel 5300 NIC | sub-bpm joint HR+RR | multiple | seated, controlled |
| ViFi | 2x ESP32-S3 (~$20) | 4.15 bpm cross-session | 1 | seated, single room |
ViFi's contribution is not algorithmic novelty (the signal-processing approach is from these papers). It is hardware-cost reduction by ~25× and the application-layer engineering (capture pipeline, metadata sidecar, leave-one-session-out tooling) that makes the result reproducible end-to-end.
| Effort | Expected MAE drop | Status |
|---|---|---|
| Multi-subject dataset (10+ subjects) | 4.15 → 2.5–3 bpm | Summer 2026 (funding-dependent) |
| Multi-room dataset (5+ rooms) | further generalization | Summer 2026 (funding-dependent) |
| Phase-domain features (CFO/SFO calibration) | → 1.5–2 bpm | Winter 2026 – Spring 2027 |
| 4-receiver array + multi-antenna averaging | further improvement | Mid–Late 2027 |
| Higher CSI packet rate (firmware patch) | marginal | 2028+ |
See the roadmap for the full sequence.
The full code — DSP pipeline, capture orchestrator, training scripts, prediction service, calibration, RF fingerprinting, audit log, and 102-test suite — is kept private during pre-clearance development. Available for technical review under NDA for investors, hospital partners, and prospective collaborators. Email hello@vifi.health.