Many clinics still archive 12-lead electrocardiograms (ECGs) as printed paper records or raster images embedded in PDF reports, which limits interoperability and downstream computational analysis. Here, we present a reproducible digitization protocol that converts ECG report images into WaveForm DataBase (WFDB)-compliant digital signals for standardized downstream use. The protocol consists of (i) grid detection and pixel-to-physical calibration, (ii) automated localization and labeling of the 12 lead regions using an object detection model, (iii) signal preprocessing and normalization, and (iv) waveform extraction using either a continuity-constrained path method (Viterbi) or an improved centerline-based approach. The workflow is designed to handle heterogeneous clinical layouts, variable grid visibility, and common imaging artifacts, and includes visual checkpoints for quality control at each stage. The protocol outputs time-aligned 12-lead WFDB records suitable for reproducible analysis in signal processing and machine-learning pipelines.
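As an illustration of steps (i) and (iv), the following is a minimal sketch of a continuity-constrained (Viterbi-style) trace extraction followed by pixel-to-physical conversion. It is not the protocol's actual implementation. The function names (`viterbi_trace`, `pixels_to_physical`), the cost-image convention (low cost where trace ink is likely), the jump penalty, and the calibration defaults of 25 mm/s and 10 mm/mV (the common ECG paper settings) are all assumptions made for this example; a real report should be calibrated from its detected grid and printed calibration mark.

```python
import numpy as np


def viterbi_trace(cost, jump_penalty=0.1, max_jump=5):
    """Return one row index per column that minimizes ink cost plus jump cost.

    cost: 2D array (rows, cols), low where the waveform is likely to be.
    jump_penalty and max_jump are hypothetical tuning parameters.
    """
    n_rows, n_cols = cost.shape
    rows = np.arange(n_rows)
    offsets = np.arange(-max_jump, max_jump + 1)

    # Candidate previous rows for every current row (same for all columns).
    prev = rows[:, None] + offsets[None, :]
    valid = (prev >= 0) & (prev < n_rows)
    prev_clipped = np.clip(prev, 0, n_rows - 1)
    step_cost = jump_penalty * np.abs(offsets)[None, :]

    total = cost[:, 0].astype(float)
    back = np.zeros((n_rows, n_cols), dtype=int)

    for c in range(1, n_cols):
        cand = np.where(valid, total[prev_clipped] + step_cost, np.inf)
        best = np.argmin(cand, axis=1)
        back[:, c] = prev_clipped[rows, best]
        total = cand[rows, best] + cost[:, c]

    # Backtrack from the cheapest end state.
    path = np.empty(n_cols, dtype=int)
    path[-1] = int(np.argmin(total))
    for c in range(n_cols - 1, 0, -1):
        path[c - 1] = back[path[c], c]
    return path


def pixels_to_physical(path_rows, baseline_row, px_per_mm,
                       mm_per_mv=10.0, mm_per_s=25.0):
    """Convert pixel rows to millivolts and derive the sampling rate.

    Assumes one sample per pixel column and image rows increasing downward.
    """
    path_rows = np.asarray(path_rows, dtype=float)
    millivolts = (baseline_row - path_rows) / (px_per_mm * mm_per_mv)
    sampling_rate_hz = px_per_mm * mm_per_s
    return millivolts, sampling_rate_hz


# Synthetic usage example: a sine-shaped trace drawn into an empty cost image.
n_rows, n_cols = 40, 60
true_rows = 20 + np.round(8 * np.sin(2 * np.pi * np.arange(n_cols) / 60)).astype(int)
cost_image = np.ones((n_rows, n_cols))
cost_image[true_rows, np.arange(n_cols)] = 0.0

traced_rows = viterbi_trace(cost_image)
signal_mv, fs = pixels_to_physical(traced_rows, baseline_row=20, px_per_mm=4.0)
```

The jump penalty is what enforces continuity: a path that leaves the trace to follow an isolated speck of noise pays for both the jump and the return. The resulting per-lead arrays, together with the derived sampling rate, are the kind of data that would then be written out as a WFDB record.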
DOI: 10.3791/70359-v