DAS/2.1/Spec

From BioDAS
Jump to: navigation, search

This DAS/2.1 version is currently under development. Contributions to this new version are invited from both DAS/1 and DAS/2 developers with the goal of merging ideas from both camps to create a unified version of the spec suitable for all DAS-based applications.

See also:

Overview

The DAS 2.x specification addresses some shortcomings of the DAS 1.x protocol, including:

  • Better support for hierarchical structures (e.g. gene -> transcripts -> exons)
  • Ontology-based feature annotations
  • Allow multiple formats, including formats only appropriate for some feature types
  • A lock-based editing protocol for curational clients
  • An extensible namespacing system that allows annotations in non-genomic coordinates (e.g. uniprot protein coordinates or PDB structure coordinates)

DAS is a protocol for sharing biological data. This version of the specification, DAS 2.x, describes features located on the genomic sequence. Future versions will add support for sharing annotations of protein sequences, expression data, 3D structures and ontologies. The genomic DAS interface is deliberately designed so there will be a large core shared with the protein sequence DAS.

A DAS 2.x annotation server provides feature information about one or more genome sources. Each source may have one or more versions. Different versions are usually based on different assemblies. As an implementation detail an assembly and corresponding sequence data may be distributed via a different machine, which is called the reference server.

Annotations are located on the genomic sequence with a start and end position. The range may be specified multiple times if there are alternate coordinate systems. An annotation may contain multiple non-continguous parts, making it the parent of those parts. Some parts may have more than one parent. Annotations have a type based on terms in SOFA (Sequence Ontology for Feature Annotation). Stylesheets contain a set of properties used to depict a given type.

Annotations can be searched by range, type, and a properties table associated with each annotation. These are called feature filters.

DAS 2.x is implemented using a ReST architecture. Each document (also called an entity or object) has a name, which is a URL. Fetching the URL gets information about the document. The DAS-specific documents are all in XML. Other data types have existing widely used formats, and sometimes more than one for the same data. A DAS server may provide a distinct document for each of these formats, along with information about which formats are available.

Retrieval of genomic sequence and annotation

DAS/2.1/Spec/Get-Genomic for read-only access to genomic sequence and annotation data sources.

Stylesheet

Writeback

DAS/2.1/Spec/Writeback for write access (add, delete, edit) to annotation data sources.

Region locking

DAS/2.1/Spec/Locking to prevent editing conflicts for clients and servers supporting writeback.

Assay Retrievals

Ontology Retrievals