Tutorial: Harmonize the Basic Fixture
This tutorial uses the small fixture in examples/basic/ to show the full local
workflow without requiring network access. Run it from a clone of the
omicsmeta repository so the example files are available.
Install
python -m pip install omicsmeta
If you are editing the repository, install from the source checkout instead:
python -m pip install -e ".[dev,docs]"
Run Harmonization
omicsmeta harmonize examples/basic/metadata.tsv \
--output examples/basic/harmonized.tsv \
--unmapped examples/basic/unmapped.tsv \
--unmapped-summary-output examples/basic/unmapped_summary.tsv \
--sample-output examples/basic/samples.tsv \
--report examples/basic/qc_report.html
Inspect Outputs
The detailed output contains accepted ontology mappings:
head examples/basic/harmonized.tsv
The sample-wide table contains one row per input sample:
head examples/basic/samples.tsv
The unmapped summary is the best starting point for manual curation:
cat examples/basic/unmapped_summary.tsv
Benchmark the Fixture
Compare accepted direct mappings against the expected known-answer table:
python scripts/benchmark_mapping.py \
--input examples/basic/metadata.tsv \
--truth examples/basic/expected_harmonized.tsv \
--output-json examples/basic/benchmark.json
The benchmark reports precision, recall, and F1 overall and by semantic field. Inferred terms are excluded from this score so the metric reflects direct mapping behavior.
Run the Bundled Suite
The repository also includes a manifest of known-answer cases that covers the basic fixture and several GEO-style snippets:
python scripts/benchmark_mapping.py \
--manifest benchmarks/known_answer_suite.tsv \
--output-json benchmark_suite.json