bioartifact

Machine-readable validation for bioinformatics outputs

A lightweight Python package and command-line tool for inspecting workflow artifacts, validating named contracts, and returning deterministic JSON for agents, benchmark systems, and reproducibility pipelines.

$ bioartifact inspect variants.vcf.gz
{
  "schema_version": "1.0.0",
  "artifact_type": "vcf",
  "valid": true,
  "summary": {
    "records": 2,
    "sample_count": 1,
    "gzip": true
  }
}

bioartifact checks structure and workflow compatibility. It is not a workflow engine, biological interpretation layer, or replacement for samtools, bcftools, FastQC, or MultiQC.
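
For an agent or pipeline consuming the inspect output above, a minimal Python sketch might look like this. It relies only on the fields visible in that example (schema_version, artifact_type, valid, summary). The helper name summarize_inspect_result is hypothetical and not part of bioartifact.

```python
import json

# Example payload copied from the `bioartifact inspect` output shown above.
EXAMPLE_OUTPUT = """
{
  "schema_version": "1.0.0",
  "artifact_type": "vcf",
  "valid": true,
  "summary": {"records": 2, "sample_count": 1, "gzip": true}
}
"""


def summarize_inspect_result(raw_json: str) -> str:
    """Return a one-line summary of an inspect result (hypothetical helper)."""
    result = json.loads(raw_json)
    status = "valid" if result["valid"] else "invalid"
    # Assumption: summary keys may differ between artifact types,
    # so they are formatted generically rather than by name.
    summary = result.get("summary", {})
    details = ", ".join(f"{key}={value}" for key, value in sorted(summary.items()))
    return f"{result['artifact_type']}: {status} ({details})"


print(summarize_inspect_result(EXAMPLE_OUTPUT))
# prints: vcf: valid (gzip=True, records=2, sample_count=1)
```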

Start

Install and run one check

The core package has no runtime dependencies. JSON is the default output format for every command.

pip install bioartifact
bioartifact inspect sample.bam
bioartifact validate peaks.narrowPeak \
  --contract narrowpeak
bioartifact validate-manifest workflow_manifest.json
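
The same commands can be driven from Python with the standard library. The sketch below is one possible wrapper, not a bioartifact API: build_command and run_json are hypothetical names, and it assumes the JSON result is written to stdout. Exit-code behavior is not described here, so the return code is left uninterpreted.

```python
import json
import subprocess
from typing import Any, Callable, List, Optional


def build_command(subcommand: str, path: str,
                  contract: Optional[str] = None) -> List[str]:
    """Build an argv list matching the CLI calls shown above."""
    cmd = ["bioartifact", subcommand, path]
    if contract is not None:
        cmd += ["--contract", contract]
    return cmd


def run_json(cmd: List[str],
             runner: Callable[..., Any] = subprocess.run) -> dict:
    """Run a command and parse its stdout as JSON.

    Assumption: the JSON document is the only content on stdout.
    `runner` is injectable so the parsing can be exercised without the CLI.
    """
    completed = runner(cmd, capture_output=True, text=True)
    return json.loads(completed.stdout)
```

With bioartifact installed and a local peaks.narrowPeak file, run_json(build_command("validate", "peaks.narrowPeak", contract="narrowpeak")) would mirror the validate command above.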

Coverage

Artifacts and contracts

Artifact types

FASTQ, FASTA, SAM, BAM, VCF, BED, narrowPeak, GTF, GFF, CSV, TSV, and HTML reports.

Contracts

FASTQ structure, paired FASTQ synchronization, sorted and indexed BAM expectations, valid VCF structure, narrowPeak structure, and differential-expression table columns.
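
To make one of these contracts concrete, here is a rough illustration of what a paired FASTQ synchronization check involves. It is not bioartifact's implementation. It assumes plain four-line FASTQ records and treats a trailing /1 or /2 on the read name as a mate suffix. The function names are hypothetical.

```python
from itertools import zip_longest
from typing import Iterable, Iterator


def read_ids(lines: Iterable[str]) -> Iterator[str]:
    """Yield the read identifier from each 4-line FASTQ record."""
    for index, line in enumerate(lines):
        if index % 4 != 0:
            continue  # only the first line of each record is a header
        header = line.strip()
        fields = header[1:].split()
        if not header.startswith("@") or not fields:
            raise ValueError(f"line {index + 1}: malformed header {header!r}")
        read_id = fields[0]
        # Drop a trailing /1 or /2 mate suffix used by older naming schemes.
        if read_id.endswith(("/1", "/2")):
            read_id = read_id[:-2]
        yield read_id


def pairs_synchronized(r1_lines: Iterable[str], r2_lines: Iterable[str]) -> bool:
    """True when both files list the same read IDs in the same order."""
    missing = object()  # marks the shorter file running out of records
    pairs = zip_longest(read_ids(r1_lines), read_ids(r2_lines), fillvalue=missing)
    return all(left == right for left, right in pairs)
```

Both arguments can be open file handles, so the check streams through the files without loading them into memory.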

Manifests

Validate expected workflow outputs, named contracts, and required companion files such as BAM or VCF indexes.
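
As an illustration of the companion-file idea, the sketch below looks for an index next to a BAM or bgzipped VCF using the common samtools and tabix naming conventions (.bai, .tbi, .csi). It does not reflect bioartifact's manifest format or its internal logic. Both function names are hypothetical.

```python
from pathlib import Path
from typing import List


def candidate_indexes(artifact: Path) -> List[Path]:
    """List index paths conventionally written next to an artifact."""
    name = artifact.name
    if name.endswith(".bam"):
        return [
            artifact.with_name(name + ".bai"),  # sample.bam.bai
            artifact.with_suffix(".bai"),       # sample.bai
            artifact.with_name(name + ".csi"),  # sample.bam.csi
        ]
    if name.endswith(".vcf.gz"):
        return [
            artifact.with_name(name + ".tbi"),  # variants.vcf.gz.tbi
            artifact.with_name(name + ".csi"),  # variants.vcf.gz.csi
        ]
    return []  # no companion index expected for other types


def has_companion_index(artifact: Path) -> bool:
    """True when at least one conventional index file exists."""
    return any(path.exists() for path in candidate_indexes(artifact))
```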

Interface

Stable JSON outputs

Discovery commands expose supported artifact types, contract metadata, required arguments, and public JSON schemas. Schemas pin the current schema_version.

bioartifact contracts
bioartifact types
bioartifact schema
bioartifact schema contract_result
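
Because results carry a schema_version, a consumer can refuse output it was not written for. The sketch below is a minimal guard under that assumption. require_schema_version is a hypothetical helper, and the pinned value is the one shown in the inspect example, not a statement about future releases.

```python
# Version string taken from the inspect example; adjust to the schema in use.
PINNED_SCHEMA_VERSION = "1.0.0"


def require_schema_version(result: dict,
                           pinned: str = PINNED_SCHEMA_VERSION) -> dict:
    """Return the result unchanged, or raise if its schema_version differs."""
    actual = result.get("schema_version")
    if actual != pinned:
        raise ValueError(
            f"unexpected schema_version {actual!r}; expected {pinned!r}"
        )
    return result
```

An exact match is the strictest choice. A consumer that tolerates compatible revisions could compare only part of the version string instead, though this page does not state how versions are incremented.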