Validate annotation engines against a truth table
Source: R/validation.R
validate_annotation_engines.Rd
Runs one or more annotation engines, then compares the chosen annotation fields against a user-supplied truth table keyed by Ensembl gene ID.
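To make the comparison concrete, here is a minimal base-R sketch of scoring one engine's output against a truth table. It is not the package's implementation: the data frames `pred` and `truth`, their contents, and the helper `field_agreement()` are all hypothetical, and the scoring rule (exact match, with a missing prediction counted as a mismatch) is an assumption.

```r
# Hypothetical engine output and truth table, both keyed by Ensembl gene ID.
pred <- data.frame(
  gene_id = c("ENSG00000141510", "ENSG00000012048", "ENSG00000139618"),
  symbol  = c("TP53", "BRCA1", NA),
  stringsAsFactors = FALSE
)
truth <- data.frame(
  gene_id = c("ENSG00000141510", "ENSG00000012048", "ENSG00000139618"),
  symbol  = c("TP53", "BRCA1", "BRCA2"),
  stringsAsFactors = FALSE
)

# Join on the gene ID, then report the fraction of truth genes whose
# predicted value matches exactly (a missing prediction counts as a mismatch).
field_agreement <- function(pred, truth, field, key = "gene_id") {
  merged <- merge(truth, pred, by = key, all.x = TRUE,
                  suffixes = c("_truth", "_pred"))
  hit <- merged[[paste0(field, "_truth")]] == merged[[paste0(field, "_pred")]]
  mean(!is.na(hit) & hit)
}

agreement <- field_agreement(pred, truth, "symbol")
```

Here two of the three truth genes have a matching symbol, so `agreement` is 2/3. Repeating the same join per engine and per field would give a summary table of the kind the function is described as producing.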
Usage
validate_annotation_engines(
  expr,
  meta = NULL,
  truth,
  truth_gene_col = "gene_id",
  species = c("auto", "human", "mouse"),
  annotation_preset = NULL,
  engines = c("biomart", "orgdb", "ensdb", "hybrid"),
  fields = NULL,
  strip_version = TRUE,
  biomart_version = 102,
  biomart_host = NULL,
  biomart_mirror = NULL,
  assay_name = NULL,
  gene_id_col = NULL,
  sample_col = "sample",
  output_file = NULL,
  detail_file = NULL,
  verbose = TRUE
)
Arguments
- expr
Expression table with gene_id as the first column, or a SummarizedExperiment-like object.
- meta
Metadata table with sample as the first column. Leave NULL when expr is a SummarizedExperiment.
- truth
A data frame containing a gene ID column (gene_id by default; see truth_gene_col) and one or more truth annotation fields such as symbol or gene_name.
- truth_gene_col
Column in truth containing the Ensembl gene ID.
- species
One of "auto", "human", or "mouse".
- annotation_preset
Optional preset that fixes a reproducible annotation configuration. Supported values are "human_v102", "mouse_v102", "human_tpm_v102", "mouse_tpm_v102", "human_count_v102", and "mouse_count_v102".
- engines
Character vector of annotation engines to compare.
- fields
Optional annotation fields to validate. Defaults to the intersection of the columns in truth and the package's standard annotation fields.
- strip_version
Whether to remove Ensembl version suffixes from both the prediction and truth tables.
- biomart_version
Fixed Ensembl release used by the biomaRt backend.
- biomart_host
Optional explicit Ensembl host.
- biomart_mirror
Optional Ensembl mirror name.
- assay_name
Optional assay name when expr is a SummarizedExperiment.
- gene_id_col
Optional row-data column containing Ensembl IDs when expr is a SummarizedExperiment.
- sample_col
Metadata column to use as sample when expr is a SummarizedExperiment.
- output_file
Optional CSV path for the validation summary.
- detail_file
Optional CSV path for per-gene validation details.
- verbose
Whether to emit progress messages.
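The strip_version and fields defaults can be illustrated with a small base-R sketch. This is an assumption about the behaviour, not code from the package: `strip_ensembl_version()` and `standard_fields` are hypothetical names, the version suffixes in the example IDs are illustrative, and the package's real list of standard annotation fields is not given here.

```r
# Remove a trailing Ensembl version suffix such as ".18" (assumed format:
# a dot followed by digits at the end of the ID).
strip_ensembl_version <- function(ids) sub("\\.[0-9]+$", "", ids)

# Hypothetical truth table with versioned IDs and one non-annotation column.
truth <- data.frame(
  gene_id = c("ENSG00000141510.18", "ENSG00000012048.23"),
  symbol  = c("TP53", "BRCA1"),
  note    = c("curated", "curated"),
  stringsAsFactors = FALSE
)
truth$gene_id <- strip_ensembl_version(truth$gene_id)

# Hypothetical stand-in for the package's standard annotation fields.
standard_fields <- c("symbol", "gene_name", "biotype")

# Default for `fields` as described: columns present in both the truth
# table and the standard set.
fields <- intersect(names(truth), standard_fields)
```

With this truth table only `symbol` would be validated; the `note` column is ignored because it is not in the assumed standard set. Stripping versions on both sides keeps the join from failing when the truth table and an engine report different versions of the same gene.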