Skip to content

Exporting Data from Scout

There are many scenarios where users may want to export information from Scout. This document outlines the available export options and how to use them via the command line.

Main export command:

scout export

Available Export Commands

Scout provides the following subcommands for data export:

Options:
  --help  Show this message and exit.

Commands:
  cases        Export case data
  database     Export the Scout MongoDB database
  exons        Export exon regions
  genes        Export genes
  hpo_genes    Export HPO-associated genes
  managed      Export managed variants
  mt_report    Export a mitochondrial variants report
  panel        Export gene panels
  transcripts  Export transcripts
  variants     Export variants
  verified     Export verified variants

Each command is described in detail below.


Exporting Cases (cases)

Export case information based on institute, status, or various filters. When exporting cases, it is possible to restrict the list of he exported data by adding optional parameters:

Options:
  --case-id TEXT                  Case ID to search for
  -i, --institute TEXT            Institute ID
  -r, --reruns                    Include rerun-requested cases
  -f, --finished                  Include archived or solved cases
  --causatives                    Include cases with causative variants
  --research-requested            Include cases with research requested
  --rerun-monitor                 Include cases with continuous rerun monitoring
  --is-research                   Include cases in research mode
  -s, --status [prioritized|inactive|active|solved|archived|ignored]
                                  Filter by case status
  --within-days INTEGER           Filter by event date (days ago)
  --json                          Output result in JSON format

The command's output is a list of case documents in the same format as they are saved into the database (dictionaries). By specifying the --json, documents are printed in json format.


Exporting the Database (database)

Dump the entire Scout MongoDB database (or subsets).

Options:
  -o, --out PATH       Output directory for the dump [default: ./dump]
  --uri TEXT           MongoDB connection URI
  --all-collections    Include variant collections
  --help               Show this message and exit

Exporting Exons (exons)

Export all exons or limit to a specific gene. Default format is a .bed file, but they can be exported as a json file by using the --json option.

Options:
  -b, --build [37|38]             Genome build version
  -hgnc, --hgnc-id INTEGER        Filter by HGNC ID
  --json                          Output in JSON format

Output Format: A tab-delimited list with the following header:

#Chrom  Start   End ExonId  Transcripts HgncIDs HgncSymbols

Exporting Genes (genes)

Export all genes from the database. Default format is a .bed file, but they can be exported as a json file by using the --json option.

Options:
  -b, --build [37|38|GRCh38]      Genome build version [default: 37]
  --json                          Output in JSON format

Output Format: A tab-delimited list with the following header:

#Chromosome Start   End Hgnc_id Hgnc_symbol

Exporting HPO-Associated Genes (hpo_genes)

Export HGNC IDs for all genes associated with a given HPO term.

Example Usage:

scout export hpo_genes HP:0000371

Result:

#Gene_id    Count
9986    1
9987    1
9988    1
6193    1
7067    1

Exporting Managed Variants (managed)

Export managed variants in VCF format.

Example Output (VCF):

##fileformat=VCFv4.2
##INFO=<ID=END,Number=1,Type=Integer,Description="End position of the variant">
##INFO=<ID=TYPE,Number=1,Type=String,Description="Type of variant">
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO
1   1295250 .   C   T   .   .   END=1295250;TYPE=SNV
..

Exporting Mitochondrial Reports (mt_report)

Export a mitochondrial report for a given case.

Options:
  --case_id TEXT    Required case ID
  --outpath TEXT    Path to the output file
  --test            Run in test mode

Exporting Gene Panels (panel)

Export gene panels with clinical annotation in BED format.

Options:
  -b, --build [37|38|GRCh38]      Genome build version [default: 37]
  --version FLOAT                 Specify panel version
  --bed                           Export in BED-like format

Output Header (default format):

#hgnc_id    symbol  disease_associated_transcripts  reduced_penetrance  mosaicism   database_entry_version  inheritance_models  custom_inheritance_models   comment

Exporting Transcripts (transcripts)

Export a list of transcripts in BED format. Use the following command: scout export transcripts

Options:
  -b, --build [37|38|GRCh38]      Genome build version [default: 37]

Exporting Causative Variants (variants)

Export causative variants for a case or an institute. Output is VCF by default, or JSON if specified.

Options:
  -c, --collaborator TEXT         Collaborator ID [default: cust000]
  -d, --document-id TEXT          Search for a specific variant
  --case-id TEXT                  Case ID to search for
  --json                          Output in JSON format

Exporting Verified Variants (verified)

Similar to the variants export, but limited to verified variants. Output is VCF by default, or JSON if specified.

Options:
  -c, --collaborator TEXT         Collaborator ID [default: cust000]
  -d, --document-id TEXT          Search for a specific variant
  --case-id TEXT                  Case ID to search for
  --json                          Output in JSON format