iPsychonaut/EGAP: v3.3.9 Logging and Bug Squashing

2026 · 0 citations · Other · Green Open Access

Authors

Abstract

Overview

This PR implements a complete end-to-end logging system for EGAP. Previously, no log file was ever written to disk -- all output was terminal-only and lost after each run. Every script in the pipeline now writes timestamped, color-coded log entries to a persistent log file on disk.

What's New

Persistent log files for every run
- utilities.py now exposes log_print() and initialize_logging_environment() as the single source of truth for all pipeline output
- Every print() call across all 15 sub-pipeline scripts has been replaced with log_print(), which simultaneously timestamps the message, writes it to disk, and prints it to the terminal with color coding by message type (NOTE, CMD, PASS, ERROR, WARN, SKIP, FAIL)
- generate_log_file() now opens in append mode ("a") so re-runs accumulate into the same log rather than overwriting it, with visible error reporting if file creation fails

Per-sample log files in the right place
- Each sample gets its own {sample_id}_log.txt written into output_dir/{species_id}/ -- directly alongside the ONT/, Illumina/, and assembly output folders, where it is easy to find
- Implemented via an EGAP_LOG_DIR environment variable set by EGAP.py and EGAP_TUI.py at the start of each sample's iteration; child processes inherit it automatically, so no subprocess script signatures needed to change
- The run-level orchestrator log ({output_dir_name}_log.txt) is still written at the root of output_dir

Real-time log tee from subprocesses
- EGAP.py now uses a _run_and_tee() helper that streams each line of subprocess stdout/stderr through log_print() in real time, so the master log captures all sub-process output as it happens

TUI log persistence
- EGAP_TUI.py now writes all log_line() output to a timestamped EGAP_{timestamp}_log.txt file in addition to the on-screen RichLog widget
- Per-sample EGAP_LOG_DIR is also set in the TUI per-sample loop

Per-sample pipeline loop ordering
- qc_assessment(final) and html_reporter are now executed inside the per-sample loop (after each sample finishes) rather than after all samples complete, ensuring QC and reports are generated even if a later sample fails

Bug fixes
- Fixed AttributeError: module 'datetime' has no attribute 'now' -- corrected to datetime.datetime.now()
- Fixed UNLOGGED ERROR: Unable to load the log file provided: None in all four preprocess_*.py scripts -- initialize_logging_environment() is now called before any log_print() in every main block
- Fixed SyntaxError: f-string expression part cannot include a backslash (Python < 3.12 compatibility) -- extracted os.path.basename() call into a plain variable before use in the f-string
- Bumped VERSION to 3.3.9

README
- Added Changelog section documenting v3.3.9 and v3.3.8
- Documented log file output locations and behavior
- Updated Future Improvements section

Files Changed
- EGAP.py -- _run_and_tee() helper, EGAP_LOG_DIR env var, per-sample loop restructure
- bin/utilities.py -- log_print(), initialize_logging_environment(), generate_log_file() improvements
- bin/EGAP_TUI.py -- TUI log persistence, EGAP_LOG_DIR, species_id handling
- All 15 sub-pipeline scripts -- print() replaced with log_print(), initialize_logging_environment() wired into every main block
- README.md -- Changelog and documentation updates

What's Changed
- V3.3.9 by @iPsychonaut in https://github.com/iPsychonaut/EGAP/pull/9

Full Changelog: https://github.com/iPsychonaut/EGAP/compare/v3.3.8...v3.3.9
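The persistent, append-mode logging described in these notes can be illustrated with a minimal sketch. It is not the code in utilities.py: the names sketch_log_path() and sketch_log_print(), their signatures, the timestamp format, and the ANSI color choices are all assumptions made for illustration. Only the EGAP_LOG_DIR variable, the {sample_id}_log.txt file name, the append mode, and the list of message types come from the notes.

```python
import os
import sys
import tempfile
from datetime import datetime

# Message types are taken from the release notes; the colors are assumptions.
COLORS = {
    "NOTE": "\033[94m", "CMD": "\033[96m", "PASS": "\033[92m",
    "ERROR": "\033[91m", "WARN": "\033[93m", "SKIP": "\033[95m",
    "FAIL": "\033[91m",
}
RESET = "\033[0m"


def sketch_log_path(sample_id, fallback_dir):
    """Hypothetical helper: pick the per-sample log file location.

    Uses EGAP_LOG_DIR when a parent process has set it, else fallback_dir.
    """
    log_dir = os.environ.get("EGAP_LOG_DIR", fallback_dir)
    os.makedirs(log_dir, exist_ok=True)
    return os.path.join(log_dir, f"{sample_id}_log.txt")


def sketch_log_print(message, log_path):
    """Hypothetical stand-in for log_print(): timestamp, append to disk, print."""
    stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    line = f"[{stamp}] {message}"
    try:
        # Append mode ("a") keeps earlier runs instead of truncating the file.
        with open(log_path, "a", encoding="utf-8") as handle:
            handle.write(line + "\n")
    except OSError as exc:
        # Report the failure visibly rather than silently dropping the entry.
        print(f"UNLOGGED ERROR: could not write {log_path}: {exc}", file=sys.stderr)
    msg_type = message.split(":", 1)[0].strip()
    color = COLORS.get(msg_type, "")
    print(f"{color}{line}{RESET if color else ''}")
    return line


# Demo: a parent sets EGAP_LOG_DIR once, then every call appends to the same file.
demo_dir = tempfile.mkdtemp()
os.environ["EGAP_LOG_DIR"] = demo_dir
demo_path = sketch_log_path("sample_A", fallback_dir=".")
sketch_log_print("NOTE: first run", demo_path)
sketch_log_print("PASS: second run appends", demo_path)
```

Because the log location is read from the environment rather than passed as an argument, a child script in a design like this would need no extra parameter to find the right file, which matches the motivation given in the notes.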
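The real-time tee from subprocesses can be sketched in a few lines. The function below is a hypothetical stand-in, not the actual _run_and_tee() in EGAP.py, whose signature is not shown in these notes; the sink parameter is an assumption that stands in for log_print(). The sketch merges stderr into stdout and forwards each line as soon as the child emits it.

```python
import subprocess
import sys


def sketch_run_and_tee(cmd, sink):
    """Hypothetical stand-in for _run_and_tee(): forward child output line by line.

    cmd  -- argument list for the child process
    sink -- callable that receives each output line (log_print() in the pipeline)
    """
    process = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so the log keeps emission order
        text=True,
        bufsize=1,  # line-buffered in text mode
    )
    # Iterating over the pipe yields lines as they arrive, not after exit.
    for line in process.stdout:
        sink(line.rstrip("\n"))
    process.stdout.close()
    return process.wait()


# Demo: collect the child's lines in a list instead of a log file.
captured = []
exit_code = sketch_run_and_tee(
    [sys.executable, "-c", "print('NOTE: step one'); print('PASS: step two')"],
    captured.append,
)
```

Reading the pipe line by line, rather than calling communicate() after the child exits, is what lets a master log fill in while a long-running step is still working.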
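Two of the bug fixes listed above follow common Python patterns, shown here as a small illustration. The path and message are hypothetical, and the commented-out failing expression is an assumed example of the error class, not the original line from EGAP.

```python
import datetime
import os

# `import datetime` binds the module, so datetime.now() raises
# AttributeError; the class is reached as datetime.datetime.
timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")

# Before Python 3.12, an f-string expression may not contain a backslash,
# so a line such as the following (assumed example) fails to compile:
#   f"NOTE: writing {os.path.basename(path).replace('\\', '/')}"
# Evaluating the expression into a plain variable first avoids the problem.
path = "output_dir/species_id/sample_log.txt"  # hypothetical path
base_name = os.path.basename(path)
message = f"NOTE: writing {base_name} at {timestamp}"
```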

Topics & Keywords

Publication Details

Published in: Zenodo (CERN European Organization for Nuclear Research)

DOI: 10.5281/zenodo.19362889
