Difference between revisions of "DAS/1"
(Changed Dazzle link to go to biojava page) |
(→About the DAS Protocol) |
||
(17 intermediate revisions by 4 users not shown) | |||
Line 1: | Line 1: | ||
[[Category:Documentation]] | [[Category:Documentation]] | ||
− | + | == About the DAS Protocol == | |
− | + | The DAS protocol was written by Lincoln Stein, Sean Eddy, and Robin Dowell in 2000, and is the basis for a large number of software implementations. Around 650 public DAS data sources are currently running worldwide, with many more private data sources known to be in use. A number of websites and software applications function as DAS clients. | |
− | + | As of January 2013, the latest version of the DAS protocol is [[DAS1.6|DAS 1.6]]. | |
− | The | + | The specification is being actively supported, and continues to be extended in order to cater for the needs of its existing users and expand its applicability to additional arenas. For example, though originally focussed on genomic annotation, the latest version includes support for protein sequences, structures and annotations. In addition, extensions have enabled DAS to be used to distribute alignment and molecular interaction data. These and other extensions are listed in the unofficial [[DAS1.6E|DAS 1.6E]] specification. |
+ | <b>Note:</b> This protocol should not be confused with that of [[DAS/2]], an entirely separate project. | ||
− | The | + | The [[DAS/1/Overview|DAS Overview]] provides a glossary and list of concepts. The [[DAS Plans]] page describes some details of possible future improvements to DAS. See the [[DAS specification]] page for more information on the different versions of DAS. |
− | + | == Architecture == | |
+ | The Distributed Annotation System comprises three types of component: | ||
− | + | === Data source === | |
− | + | A data source provides programmatic access to data over a network, typically the internet. Each <i>data source</i> contains a set of data from one provider, for example Pfam domains. One or more <i>data sources</i> may be hosted by a single <i>DAS server</i>. | |
− | + | === Registry === | |
− | |||
− | |||
− | |||
− | + | The [[DasRegistry| DAS Registry]] lists and describes public <i>data sources</i> and the types of data that may be communicated using DAS. It is accessible programatically. | |
− | |||
− | |||
− | == | + | === Client === |
− | A more | + | A client consumes and integrates the data contained within one or more <i>data sources</i>. It may also communicate with a <i>server</i> or <i>registry</i> to obtain information about available data sources. |
− | + | == How to set up a DAS server == | |
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | == | + | === Implementation === |
− | + | Theoretically it is quite easy to implement a DAS server (once you know how). However, there are also some well established multi-purpose server implementations designed to be as flexible and easy to set up as possible. It is strongly suggested to use one of these existing packages. Many distributions contain ready made data-adaptors (e.g. for GFF files). For custom data, simple plugins can be written to quickly provide your data via DAS: | |
− | |||
− | For custom data simple plugins can be written to quickly provide your data via DAS | ||
− | DAS server | + | * [http://www.sanger.ac.uk/proserver/ ProServer] is a well established Perl server, supports all DAS extensions and is the most heavily used. |
+ | * [http://biodas.org/servers/LDAS.html LDAS] is an older Perl server which is easy to set up but lacks support for the latest DAS features. | ||
+ | * [http://www.biojava.org/wiki/Dazzle Dazzle] is a well established Java server tied to the BioJava framework and support most extensions. | ||
+ | * [http://code.google.com/p/mydas/ MyDas] is a newer, more streamlined Java server but does not yet support all extensions. | ||
+ | * [http://www.ebi.ac.uk/panda-srv/easydas/ EasyDAS] is a wrapper around ProServer facilitating access to the EBI's compute power for those that don't have access to a server of their own. | ||
− | + | === Validation === | |
− | |||
− | |||
− | + | Servers should endeavour to conform strictly to the DAS data formats as this maximises compatibility across the network and minimises the maintenance required as a result of evolving client software. RelaxNG documents are now available from the [[DasRegistry|DAS Registry]] to help validate the XML produced by your servers. These are used by the registry to help servers conform to the DAS specification before being registered. | |
− | |||
− | |||
− | == Publishing and Discovery of DAS | + | === Publishing and Discovery of DAS sources === |
− | + | Publishing your source in the DAS Registry allows you to advertise its availability, and is described in the [[DasRegistry|DAS Registry]] documentation. Registered sources are also easier for users to visualise in clients that are capable of interacting with the DAS Registry, such as [http://www.ensembl.org/ Ensembl], [http://www.ebi.ac.uk Dasty] and [http://www.efamily.org.uk/software/dasclients/spice/ SPICE]. | |
+ | |||
+ | == Training in DAS == | ||
+ | |||
+ | DAS Tutorials, talks and hackathons take place once a year at the Genome Campus UK almost every year in the [[DASWorkshop2011|DAS Workshop]]. The next workshop will be in March 2011. Events, courses and workshops involving DAS are often listed on the [[Current_events|Current events]] page. |
Latest revision as of 11:35, 25 January 2013
Contents
About the DAS Protocol
The DAS protocol was written by Lincoln Stein, Sean Eddy, and Robin Dowell in 2000, and is the basis for a large number of software implementations. Around 650 public DAS data sources are currently running worldwide, with many more private data sources known to be in use. A number of websites and software applications function as DAS clients.
As of January 2013, the latest version of the DAS protocol is DAS 1.6.
The specification is being actively supported, and continues to be extended in order to cater for the needs of its existing users and expand its applicability to additional arenas. For example, though originally focussed on genomic annotation, the latest version includes support for protein sequences, structures and annotations. In addition, extensions have enabled DAS to be used to distribute alignment and molecular interaction data. These and other extensions are listed in the unofficial DAS 1.6E specification.
Note: This protocol should not be confused with that of DAS/2, an entirely separate project.
The DAS Overview provides a glossary and list of concepts. The DAS Plans page describes some details of possible future improvements to DAS. See the DAS specification page for more information on the different versions of DAS.
Architecture
The Distributed Annotation System comprises three types of component:
Data source
A data source provides programmatic access to data over a network, typically the internet. Each data source contains a set of data from one provider, for example Pfam domains. One or more data sources may be hosted by a single DAS server.
Registry
The DAS Registry lists and describes public data sources and the types of data that may be communicated using DAS. It is accessible programatically.
Client
A client consumes and integrates the data contained within one or more data sources. It may also communicate with a server or registry to obtain information about available data sources.
How to set up a DAS server
Implementation
Theoretically it is quite easy to implement a DAS server (once you know how). However, there are also some well established multi-purpose server implementations designed to be as flexible and easy to set up as possible. It is strongly suggested to use one of these existing packages. Many distributions contain ready made data-adaptors (e.g. for GFF files). For custom data, simple plugins can be written to quickly provide your data via DAS:
- ProServer is a well established Perl server, supports all DAS extensions and is the most heavily used.
- LDAS is an older Perl server which is easy to set up but lacks support for the latest DAS features.
- Dazzle is a well established Java server tied to the BioJava framework and support most extensions.
- MyDas is a newer, more streamlined Java server but does not yet support all extensions.
- EasyDAS is a wrapper around ProServer facilitating access to the EBI's compute power for those that don't have access to a server of their own.
Validation
Servers should endeavour to conform strictly to the DAS data formats as this maximises compatibility across the network and minimises the maintenance required as a result of evolving client software. RelaxNG documents are now available from the DAS Registry to help validate the XML produced by your servers. These are used by the registry to help servers conform to the DAS specification before being registered.
Publishing and Discovery of DAS sources
Publishing your source in the DAS Registry allows you to advertise its availability, and is described in the DAS Registry documentation. Registered sources are also easier for users to visualise in clients that are capable of interacting with the DAS Registry, such as Ensembl, Dasty and SPICE.
Training in DAS
DAS Tutorials, talks and hackathons take place once a year at the Genome Campus UK almost every year in the DAS Workshop. The next workshop will be in March 2011. Events, courses and workshops involving DAS are often listed on the Current events page.