1. Introduction to SeqWare
  2. Installation
  3. Getting Started
  4. SeqWare Pipeline
  5. SeqWare MetaDB
  6. SeqWare Portal
  7. SeqWare Web Service
  8. SeqWare Query Engine
  9. Glossary
  10. Frequently Asked Questions
  11. APIs
  12. Source Code
  13. Plugins
  14. Modules
  15. Advanced Topics

Batch Metadata Injection

Batch Metadata Injection is a SeqWare plugin that allow you to inject basic metadata for a sequencer run, study, and other associated information. It has several mechanisms available to insert these data. In the default mode, fields specified on the command line and other information provided in a file allow you to create the metadata without prompting. Alternatively, you can use interactive mode to be prompted for each step in creating the metadata, which may occur with or without an input file..

Description

The plugin was originally developed to import metadata needed to process data that originated outside of, or could not be entered into, the LIMs system at OICR. Therefore, it mimics the process of a sequencer run being entered into the LIMs and the import process necessary in the SeqWare MetaDB. The system uses the name as a unique identifier for Study and Experiment, allowing you to add additional samples to a pre-existing Study. Every sequencer run is considered to be unique though.

Information is collected first and stored in-memory before attempting to create the objects in the database, so you may exit the process at any time prior to database insertion without causing any database corruption.

Sequencer Run - this element descibes the flowcell or single run of a sequencer.

  • run name - a descriptive name of the flowcell
  • run directory - the absolute path to the directory where the data from the flowcell or run resides
  • run description - any other identifying information

Lane - this element describes a lane of sequencing in the flowcell. It may contain one or more samples or barcoded samples in the form of IUSes (below). It is unique to a Sequencer Run.

  • lane number - usually a number between 1 and 8
  • library strategy - XXXX
  • library selection - XXXX
  • library source - XXXX
  • study type

IUS - short for Individual Unit of Sequencing, this element describes a sample loaded in a lane, or a barcoded sample in a lane. It is unique to a Lane and Sequencer Run.

  • barcode - a short series of characters that is the identity of the barcode.
  • name - a descriptive name (optional)
  • description - any other information (optional)

Study - this element contains the information about the project that the samples belong to. This element can contain many samples from different lanes of sequencing.

  • title - the title of the project
  • description - any other information about the project
  • center name - the name of the coordinating center
  • center project - the name of the project at the coordinating center

Experiment - this element can be used to distinguish different types of experiments within a project, for example samples that relate to WG or exome sequencing.

  • name
  • description

Sample - this element describes a particular library used for sequencing. The exact name of the sample is created with the attributes specified at run time.

  • project code - a short string of characters used to distinguish samples from a particular project, e.g. PCSI or AOE
  • individual number - an integer or other short string that is used in combination with the project code to distinguish all samples with a single origin (e.g. donor)
  • tissue type - the type of tissue that it originated from
  • tissue origin - XXXX
  • library size code - the fragment size of the library
  • organism - the species of the organism
  • library source template type - XXXX
  • library type - XXXX
  • paired end - whether or not it was sequenced with paired end sequencing
  • tissue preparation -XXXX
  • targeted resequencing - if the sample is being resequenced, the kit that was used

Requirements

In order to run the WorkflowRunReporter plugin, you must have the following available to you:

  • SeqWare distribution jar (1.0+)
  • SeqWare settings file with connection to SeqWare Web service (contact your local SeqWare admin to get the path) and write privileges.
  • The local path to a sequencer run directory and all of the important information about the samples contained therein
  • Optionally: a sample sheet describing the samples

Command line parameters

Command-line option Description
–miseq-sample-sheet Optional: The absolute or relative path to the sample sheet. Use instead of –new
–interactive Enable interactive mode
–new Optional: Create a new sequencer run and study. Use instead of –miseq-sample-sheet

Examples

Here a number of examples of how to use the plugin.

Creating new metadata in interactive mode

java -jar $SEQWARE_HOME/seqware-distribution-1.0.0-full.jar --plugin net.sourceforge.seqware.pipeline.plugins.BatchMetadataInjection -- --new --interactive