Architecture
trait2gene is structured as a thin Python orchestration layer around external
MAGMA and upstream-compatible PoPS components.
Core responsibilities
The package owns:
the CLI surface and command routing
config parsing and validation
resource resolution and manifest writing
standardized output layout
provenance and run metadata
locus prioritization and report generation
MAGMA stays external, and PoPS compatibility is preserved through vendored upstream scripts plus a small compatibility layer.
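As an illustration of the manifest-writing responsibility listed above, the following is a minimal sketch of recording resolved resources with checksums. The function names `sha256_of` and `write_manifest`, and the manifest fields `path`, `sha256`, and `bytes`, are assumptions made for this example; they are not trait2gene's actual API or manifest schema.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(resources: dict[str, Path], manifest_path: Path) -> dict:
    """Write a JSON manifest describing each resolved resource.

    Hypothetical schema: one entry per resource name, holding the
    absolute path, the content checksum, and the size in bytes.
    """
    entries = {
        name: {
            "path": str(path.resolve()),
            "sha256": sha256_of(path),
            "bytes": path.stat().st_size,
        }
        for name, path in resources.items()
    }
    manifest_path.write_text(json.dumps(entries, indent=2, sort_keys=True))
    return entries
```

Recording a checksum next to each path is one way to make a later run able to tell whether an input changed, which is the kind of question provenance metadata is meant to answer.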
Execution model
The full pipeline follows this high-level sequence:
validate config and input contracts
resolve resources and write manifests
run or copy MAGMA outputs
prepare feature matrices
execute PoPS
prioritize genes within loci
write tables, metadata, and reports
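The sequence above can be sketched as an ordered list of stages run one after another. This is a simplified illustration, not the package's orchestration code: the stage names in `STAGE_ORDER` and the `run_pipeline` function are hypothetical, and each stage is modeled as a plain callable that receives a shared context dictionary.

```python
from typing import Callable

# Hypothetical stage names mirroring the sequence described above.
STAGE_ORDER = [
    "validate",
    "resolve_resources",
    "magma",
    "prepare_features",
    "pops",
    "prioritize",
    "report",
]

# A stage takes the shared run context and mutates it in place.
Stage = Callable[[dict], None]


def run_pipeline(stages: dict[str, Stage], context: dict) -> list[str]:
    """Run every stage in STAGE_ORDER and return the names that completed.

    A stage signals failure by raising; the exception propagates, so
    later stages never run on top of a failed earlier one.
    """
    completed: list[str] = []
    for name in STAGE_ORDER:
        stages[name](context)
        completed.append(name)
    return completed


def make_recording_stage(name: str) -> Stage:
    """Build a placeholder stage that only logs its own name."""
    def stage(context: dict) -> None:
        context.setdefault("log", []).append(name)
    return stage
```

For example, `run_pipeline({n: make_recording_stage(n) for n in STAGE_ORDER}, {})` returns the stage names in the order listed in `STAGE_ORDER`.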
Package layout
Important package areas:
trait2gene.cli: CLI commands and Typer app wiring
trait2gene.config: Pydantic models and config loaders
trait2gene.resources: manifest and resource resolution helpers
trait2gene.engine: planning, logging, provenance, and pipeline orchestration
trait2gene.workflows: stage implementations
trait2gene.domain: loci, ranking, and QC helpers
trait2gene.report: HTML and table rendering
trait2gene.vendor: vendored upstream PoPS scripts
Why the stages stay separate
Each stage is exposed as its own command so you can:
validate configs without running heavy steps
reuse precomputed MAGMA outputs
prepare features independently of PoPS
rerun reporting or prioritization on existing outputs
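To make the benefit of separate stages concrete, here is a small sketch of choosing a contiguous subset of stages to run, for example only prioritization and reporting on existing outputs. The `STAGES` list and `select_stages` helper are hypothetical names for illustration; the real package exposes stages as CLI commands, whose names are not shown here.

```python
from typing import Optional

# Hypothetical stage names, in pipeline order.
STAGES = [
    "validate",
    "resolve_resources",
    "magma",
    "prepare_features",
    "pops",
    "prioritize",
    "report",
]


def select_stages(first: str, last: Optional[str] = None) -> list[str]:
    """Return the stages from `first` through `last`, inclusive.

    If `last` is omitted, run through the end of the pipeline.
    Unknown stage names raise ValueError via list.index.
    """
    start = STAGES.index(first)
    end = STAGES.index(last) if last is not None else len(STAGES) - 1
    if end < start:
        raise ValueError(f"{last!r} comes before {first!r} in the pipeline")
    return STAGES[start : end + 1]


# Rerun only the tail of the pipeline on existing outputs.
print(select_stages("prioritize"))  # ['prioritize', 'report']

# Validate a config without running any heavy step.
print(select_stages("validate", "validate"))  # ['validate']
```

Because each stage only depends on outputs written by earlier stages, a slice like this can start wherever suitable outputs already exist on disk.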