http://biodas.open-bio.org/w/index.php?title=DAS/2.1/Spec/Get-Genomic/Segments&feed=atom&action=historyDAS/2.1/Spec/Get-Genomic/Segments - Revision history2024-03-29T07:32:34ZRevision history for this page on the wikiMediaWiki 1.29.3http://biodas.open-bio.org/w/index.php?title=DAS/2.1/Spec/Get-Genomic/Segments&diff=646&oldid=prevSteveC at 01:29, 31 October 20082008-10-31T01:29:52Z<p></p>
SteveChttp://biodas.open-bio.org/w/index.php?title=DAS/2.1/Spec/Get-Genomic/Segments&diff=645&oldid=prevSteveC: DAS/2/Spec/Get-Genomic/Segments moved to DAS/2.1/Spec/Get-Genomic/Segments: To separate this evolving version of the DAS spec from the frozen DAS/2 version.2008-10-31T00:53:57Z<p><a href="/wiki/DAS/2/Spec/Get-Genomic/Segments" class="mw-redirect" title="DAS/2/Spec/Get-Genomic/Segments">DAS/2/Spec/Get-Genomic/Segments</a> moved to <a href="/wiki/DAS/2.1/Spec/Get-Genomic/Segments" title="DAS/2.1/Spec/Get-Genomic/Segments">DAS/2.1/Spec/Get-Genomic/Segments</a>: To separate this evolving version of the DAS spec from the frozen DAS/2 version.</p>
SteveChttp://biodas.open-bio.org/w/index.php?title=DAS/2.1/Spec/Get-Genomic/Segments&diff=644&oldid=prevSteveC: Importing text from the Segments sections of the original das2_get.html2008-10-30T08:30:57Z<p>Importing text from the Segments sections of the original das2_get.html</p>
<p><b>New page</b></p><div>'''<big>DAS/2 Segments Document Specification</big>'''<br />
<br />
* '''General specification:''' [[DAS/2/Spec/Get-Genomic|Retrieving DAS2 genomic sequence and annotation feature records]]<br />
<br />
<br />
==Overview==<br />
<br />
Features are located on segments. A segment is the largest chunk of<br />
contiguous sequence. For fully sequenced organisms a segment may be a<br />
chromosome. For partially assembled genomes where the distance<br />
between the assembled regions is not known, each assembled region<br />
may be its own segment. If a server provides annotations in contig<br />
space then each contig is a segment. A specific set of segments is<br />
also called a coordinate system.<br />
<br />
There are two ways for a versioned source record to describe which<br />
coordinate system it uses. The first is through a CAPABILITY element<br />
of type "segments". (The CAPABILITY elements list the different DAS<br />
interfaces and extensions supported by a server.) Fetching the<br />
corresponding query_uri returns a document of content-type<br />
'application/x-das-segments+xml' listing information about each<br />
segment.<br />
<br />
The second is through a COORDINATES element that uniquely<br />
characterizes the coordinate system but requires consulting other<br />
sources for details on the segments.<br />
<br />
===Example request and response===<br />
<br />
<P>Request:</P><br />
<blockquote class="url"><br />
http://www.biodas.org/das2/h.sapiens/v3/segments.xml<br />
</blockquote><br />
<br />
<P>Response:</P><br />
<pre class="speclist">Content-Type: application/x-das-segments+xml<br />
<br />
&lt;?xml version="1.0" encoding="UTF-8"?&gt;<br />
&lt;das:SEGMENTS xmlns:das="http://biodas.org/documents/das2"&gt;<br />
&lt;das:SEGMENT uri="http://www.biodas.org/das2/h.sapiens/v37/segment/Chr1.xml"<br />
title="Chr1" length="245522847"<br />
doc_href="http://www.ensembl.org/Homo_sapiens/mapview?chr=1"/&gt;<br />
&lt;das:SEGMENT uri="http://www.biodas.org/das2/h.sapiens/v37/segment/Chr2.xml"<br />
title="Chr2" length="243018229"<br />
doc_href="http://www.ensembl.org/Homo_sapiens/mapview?chr=2"/&gt;<br />
&lt;/das:SEGMENTS&gt;<br />
</pre><br />
</div><br />
<br />
Note that this document defines the namespace prefix "das" instead of<br />
defining a default namespace.<br />
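<br />
To a namespace-aware XML parser, a prefixed document and a default-namespace<br />
document are equivalent. The following is a minimal sketch of that point using<br />
Python's standard library; the helper name segment_titles and the shortened<br />
sample documents are assumptions made for illustration, not part of the<br />
specification.<br />

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the example response above.
DAS2_NS = "http://biodas.org/documents/das2"

# Same content written with the "das" prefix and with a default namespace.
# Both documents are shortened, illustrative copies of the example.
prefixed = (
    '<das:SEGMENTS xmlns:das="http://biodas.org/documents/das2">'
    '<das:SEGMENT uri="http://www.biodas.org/das2/h.sapiens/v37/segment/Chr1.xml" '
    'title="Chr1" length="245522847"/>'
    '</das:SEGMENTS>'
)
default = (
    '<SEGMENTS xmlns="http://biodas.org/documents/das2">'
    '<SEGMENT uri="http://www.biodas.org/das2/h.sapiens/v37/segment/Chr1.xml" '
    'title="Chr1" length="245522847"/>'
    '</SEGMENTS>'
)


def segment_titles(xml_text):
    """Return the title of every SEGMENT, matching on the namespace URI."""
    root = ET.fromstring(xml_text)
    return [seg.get("title") for seg in root.findall(f"{{{DAS2_NS}}}SEGMENT")]


prefixed_titles = segment_titles(prefixed)
default_titles = segment_titles(default)
```

A client that matches elements by namespace URI rather than by prefix reads<br />
both forms identically.<br />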
<br />
==Details==<br />
<br />
A versioned source entry contains two ways to get information about<br />
the top-level segments used as a coordinate system. One is through a<br />
<a href="#global_seqids">global registry of COORDINATES URIs</a>.<br />
This is an abstract identifier scheme. The other is through a<br />
concrete link to a segments document, which lists information about<br />
each segment and how to get the sequence information.<br />
<br />
The segments request is done through the query_uri of the "segments"<br />
CAPABILITY listed in the versioned source entry, as in the following:<br />
<br />
<pre class="speclist"><br />
&lt;CAPABILITY type="segments"<br />
query_uri="http://www.biodas.org/sequence/gallus_gallus/March2004.xml" /&gt;<br />
</pre><br />
<br />
Fetching that URI returns a document of content-type<br />
"application/x-das-segments+xml".<br />
<br />
The versioned source may contain multiple segments CAPABILITIES. This<br />
occurs when there are multiple top-level coordinate systems for the<br />
annotation server, for example, with features annotated in both contig<br />
and chromosome coordinates.<br />
<br />
When that occurs each segments CAPABILITY must have a 'coordinates'<br />
attribute containing a URI linking it to a COORDINATES element, which<br />
must also exist in the versioned source. Note that the coordinates<br />
URIs should be, but are not required to be, in the global registry. You<br />
may make up a URI if none exists. For example:<br />
<br />
<pre class="speclist"><br />
&lt;COORDINATES uri="http://sanger.ac.uk/das-registry/yeast-32-gene"<br />
taxid="4932" source="Gene_ID" authority="SGD32" version="3"/&gt;<br />
&lt;COORDINATES uri="http://sanger.ac.uk/das-registry/yeast-32-chromosome"<br />
taxid="4932" source="Chromosome" authority="SGD32" version="3"/&gt;<br />
<br />
&lt;CAPABILITY type="segments"<br />
query_uri="http://www.biodas.org/sequence/s.cerevisiae/genes.xml"<br />
coordinates="http://sanger.ac.uk/das-registry/yeast-32-gene" /&gt;<br />
<br />
&lt;CAPABILITY type="segments"<br />
query_uri="http://www.biodas.org/sequence/s.cerevisiae/chromosomes.xml"<br />
coordinates="http://sanger.ac.uk/das-registry/yeast-32-chromosome" /&gt;<br />
</pre><br />
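<br />
A client might pair each segments CAPABILITY with its COORDINATES element as<br />
sketched below. This is an illustrative sketch only: the enclosing VERSION<br />
wrapper element, the absence of a namespace, and the function name<br />
segments_query_uris are assumptions, not something this page defines.<br />

```python
import xml.etree.ElementTree as ET

# The COORDINATES and CAPABILITY elements are copied from the example above.
# The enclosing VERSION element is an assumed wrapper for this sketch.
VERSIONED_SOURCE = """
<VERSION>
  <COORDINATES uri="http://sanger.ac.uk/das-registry/yeast-32-gene"
      taxid="4932" source="Gene_ID" authority="SGD32" version="3"/>
  <COORDINATES uri="http://sanger.ac.uk/das-registry/yeast-32-chromosome"
      taxid="4932" source="Chromosome" authority="SGD32" version="3"/>
  <CAPABILITY type="segments"
      query_uri="http://www.biodas.org/sequence/s.cerevisiae/genes.xml"
      coordinates="http://sanger.ac.uk/das-registry/yeast-32-gene"/>
  <CAPABILITY type="segments"
      query_uri="http://www.biodas.org/sequence/s.cerevisiae/chromosomes.xml"
      coordinates="http://sanger.ac.uk/das-registry/yeast-32-chromosome"/>
</VERSION>
"""


def segments_query_uris(xml_text):
    """Map each coordinates URI to the query_uri of its segments CAPABILITY."""
    root = ET.fromstring(xml_text)
    declared = {c.get("uri") for c in root.iter("COORDINATES")}
    mapping = {}
    for cap in root.iter("CAPABILITY"):
        if cap.get("type") != "segments":
            continue
        coords = cap.get("coordinates")
        # Each coordinates attribute must point at a COORDINATES element
        # that also exists in the versioned source.
        if coords not in declared:
            raise ValueError(f"no COORDINATES element for {coords!r}")
        mapping[coords] = cap.get("query_uri")
    return mapping


uris = segments_query_uris(VERSIONED_SOURCE)
```

The resulting mapping lets a client pick the segments document for the<br />
coordinate system it wants, for example chromosome rather than gene coordinates.<br />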
<br />
===Example request and response===<br />
<br />
<P>Request:</P><br />
<blockquote class="url"><br />
http://www.biodas.org/sequence/gallus_gallus/March2004.xml<br />
</blockquote><br />
<br />
<P>Response:</P><br />
<pre class="speclist">Content-Type: application/x-das-segments+xml<br />
<br />
&lt;?xml version="1.0" encoding="UTF-8"?&gt;<br />
&lt;SEGMENTS xmlns="http://biodas.org/documents/das2"<br />
xml:base="http://www.biodas.org/das2/sequence/fly/release4/"&gt;<br />
&lt;FORMAT name="fasta" /&gt;<br />
&lt;FORMAT name="agp" /&gt;<br />
&lt;SEGMENT uri="2L" title="Chromosome 2L" length="186349"<br />
reference="http://www.flybase.org/genome/D_melanogaster/R4.3/dna/2L" /&gt;<br />
&lt;SEGMENT uri="2R" title="Chromosome 2R" length="464030"<br />
reference="http://www.flybase.org/genome/D_melanogaster/R4.3/dna/2R" /&gt;<br />
&lt;SEGMENT uri="3L" title="Chromosome 3L" length="419684"<br />
reference="http://www.flybase.org/genome/D_melanogaster/R4.3/dna/3L" /&gt;<br />
&lt;SEGMENT uri="3R" title="Chromosome 3R" length="1428"<br />
reference="http://www.flybase.org/genome/D_melanogaster/R4.3/dna/3R" /&gt;<br />
&lt;SEGMENT uri="4" title="Chromosome 4" length="43776"<br />
reference="http://www.flybase.org/genome/D_melanogaster/R4.3/dna/4" /&gt;<br />
&lt;SEGMENT uri="X" title="Chromosome X" length="311673"<br />
reference="http://www.flybase.org/genome/D_melanogaster/R4.3/dna/X" /&gt;<br />
&lt;/SEGMENTS&gt;<br />
</pre><br />
<br />
Note that in the example the SEGMENT uri attributes are relative URIs<br />
and are resolved using the xml:base defined in the SEGMENTS element.<br />
<br />
===SEGMENTS element===<br />
<br />
The SEGMENTS element has zero or more FORMAT elements. A FORMAT<br />
element has the single attribute named 'name' describing the supported<br />
format. This specification defines the following format names:<br />
<br />
<table border="1"><br />
<tr><br />
<th>Format name</th><br />
<th>Meaning</th><br />
</tr><br />
<tr><br />
<td>das2xml</td><br />
<td>a segments response of type application/x-das-segments+xml</td><br />
</tr><br />
<tr><br />
<td>fasta</td><br />
<td>sequence data in FASTA format</td><br />
</tr><br />
<tr><br />
<td>raw</td><br />
<td>sequence data with only residue names and newlines</td><br />
</tr><br />
<tr><br />
<td>agp</td><br />
<td>Assembly data in AGP format</td><br />
</tr><br />
<tr><br />
<td>count</td><br />
<td>the total number of segments, as a decimal string<br /><br />
Used to get the segment count before potentially requesting<br />
tens of thousands of segments</td><br />
</tr><br />
<tr><br />
<td>formats</td><br />
<td>the standard das2xml format but without SEGMENT elements<br /><br />
Used to get the format listing even when there are a large number<br />
of segments</td><br />
</tr><br />
</table><br />
<br />
<br />
All versioned sources that have a "segments" capability must support<br />
the "das2xml" format. The "das2xml" format does not need to be<br />
listed in a FORMAT element. For details of use, see the<br />
"Segments query parameters" section below.<br />
<br />
===SEGMENT element===<br />
<br />
The SEGMENTS element has zero or more SEGMENT elements. Each SEGMENT<br />
element has several attributes. The 'uri' attribute is a URI and must<br />
be unique for each SEGMENT. The 'title' attribute is a short title of<br />
at most a few words, meant for people. The 'length' attribute is an<br />
integer count of the total number of residues in the segment. The<br />
optional 'doc_href' is a URL for a web browser to display<br />
human-readable documentation.<br />
<br />
The optional 'reference' attribute is a URI. It connects the given<br />
segment to the <a href="#global_seqids">globally agreed upon standard<br />
identifier</a> for that segment. For example, the following reference<br />
URI is the identifier for Human Chromosome 4, defined by NCBI assembly<br />
34:<br />
<br />
<blockquote><br />
http://www.ncbi.nlm.nih.gov/genome/H_sapiens/B34.3/dna/chr4 <br />
</blockquote><br />
<br />
A client uses the reference identifier to merge features from different DAS<br />
annotation servers into a common view because two segments from<br />
different servers with the same reference identifier must be copies of<br />
the same underlying segment.<br />
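<br />
As an illustration of how a client could use the reference identifier, the<br />
sketch below groups segments from several servers by their 'reference' value.<br />
The server names, the example.org segment URIs, and the function name<br />
group_by_reference are all hypothetical.<br />

```python
from collections import defaultdict


def group_by_reference(servers):
    """Group (server, segment uri) pairs that share a reference identifier.

    servers maps a server name to a list of dicts holding the SEGMENT
    attributes. Segments without a 'reference' attribute are skipped,
    since the attribute is optional.
    """
    groups = defaultdict(list)
    for server_name, segments in servers.items():
        for seg in segments:
            ref = seg.get("reference")
            if ref is not None:
                groups[ref].append((server_name, seg["uri"]))
    return dict(groups)


# Hypothetical servers; only the reference URI comes from the example above.
REF_2L = "http://www.flybase.org/genome/D_melanogaster/R4.3/dna/2L"
servers = {
    "server_a": [{"uri": "http://a.example.org/segment/2L", "reference": REF_2L}],
    "server_b": [
        {"uri": "http://b.example.org/seq/chr2L", "reference": REF_2L},
        {"uri": "http://b.example.org/seq/contig7"},  # no reference attribute
    ],
}
merged = group_by_reference(servers)
```

Segments that end up in the same group can be displayed on a common axis,<br />
while segments without a reference cannot be matched this way.<br />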
<br />
<br />
===Segments query parameters===<br />
<br />
Each segment URI must be fetchable; fetching it is how the sequence<br />
data is retrieved. Each segment URI must support <i><a href="http://www.w3.org/TR/html4/interact/forms.html#h-17.13.4.1">form-urlencoded</a></i><br />
query parameters. The optional 'format' query parameter specifies the data<br />
format to return, using one of the names given in the FORMAT elements<br />
described above. If not given the default format is 'das2xml', which returns<br />
a segments document of content-type 'application/x-das-segments+xml'<br />
containing the details for only that segment. If the format query<br />
parameter is "fasta", then the content-type for the returned FASTA<br />
document should be "text/plain". FASTA records should have a title<br />
containing the segment name but a client may ignore the title.<br />
<br />
The segment URI queries support a 'range' query parameter to limit the<br />
response to a specific range of the segment. The query is of the form<br />
"$start:$end", for example "100:200" or "345:9876". Note that the<br />
colon should be escaped for use in a URL.<br />
<br />
The response must only include the sequence in the specified range.<br />
As usual, the first residue is 0 and the range includes the start<br />
position but not the end position. For example, if the sequence is<br />
"CATAGGTA" then range=1:3 is the subsequence "AT".<br />
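<br />
These zero-based, half-open semantics match Python slicing, which makes the<br />
rule easy to check. The helper name subsequence is an assumption for this<br />
sketch, and validation of negative or out-of-bounds values is deliberately<br />
left out.<br />

```python
def subsequence(sequence, range_param):
    """Apply a "$start:$end" range to a sequence.

    The first residue is position 0; start is included and end is excluded.
    No validation is done here, so callers should reject negative or
    out-of-bounds values before calling this.
    """
    start_text, end_text = range_param.split(":")
    start, end = int(start_text), int(end_text)
    return sequence[start:end]


at = subsequence("CATAGGTA", "1:3")
whole = subsequence("CATAGGTA", "0:8")
```

The number of residues returned is always end minus start, so "1:3" yields<br />
two residues and "0:8" yields the whole eight-residue sequence.<br />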
<br />
Suppose the following is the segment URI for human chromosome 2:<br />
<blockquote class="url"><br />
http://www.biodas.org/h.sapiens/v22/Chr2<br />
</blockquote><br />
<br />
Fetching that URL will return a segment document for chromosome 2.<br />
Fetching the URL with "format=fasta" in the query string, as in<br />
<blockquote class="url"><br />
http://www.biodas.org/h.sapiens/v22/Chr2?format=fasta<br />
</blockquote><br />
<br />
returns the entire sequence for chromosome 2 in FASTA format. Adding<br />
"range=500:900", as in<br />
<blockquote class="url"><br />
http://www.biodas.org/h.sapiens/v22/Chr2?format=fasta&amp;range=500:900<br />
</blockquote><br />
<br />
returns the 400 residues starting from the 501st position of<br />
chromosome 2, still in FASTA format.<br />
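<br />
A client could build such request URLs as sketched below. The function name<br />
segment_request_url is hypothetical, and the sketch assumes the segment URI<br />
carries no query string of its own. Note that urlencode percent-encodes the<br />
colon as %3A, in line with the remark that the colon should be escaped.<br />

```python
from urllib.parse import urlencode


def segment_request_url(segment_uri, fmt=None, start=None, end=None):
    """Append optional 'format' and 'range' query parameters to a segment URI.

    Assumes segment_uri has no existing query string (an assumption of
    this sketch, not a rule from the specification).
    """
    params = {}
    if fmt is not None:
        params["format"] = fmt
    if start is not None and end is not None:
        params["range"] = f"{start}:{end}"
    if not params:
        return segment_uri
    return segment_uri + "?" + urlencode(params)


url = segment_request_url(
    "http://www.biodas.org/h.sapiens/v22/Chr2", fmt="fasta", start=500, end=900
)
```

Calling the function without a format or range returns the bare segment URI,<br />
which fetches the default 'das2xml' document.<br />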
<br />
If the server does not support a requested format name then it MUST <br />
respond with an HTTP error message with status code 400 "Bad Request", <br />
and the message body SHOULD indicate that the requested format is not <br />
supported. If the server considers the response too large then it <br />
MUST respond with an HTTP error message with status code 413 "Request <br />
Entity Too Large", and the message body SHOULD indicate what the <br />
server would consider an acceptable response size. It is up to the <br />
server to determine how large is "too large".<br />
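<br />
The two error cases above could be checked on the server side roughly as<br />
follows. This is a sketch under stated assumptions: the function name<br />
check_request, the max_residues limit, and the wording of the messages are<br />
all hypothetical, since the specification leaves the size limit to the server.<br />

```python
def check_request(fmt, response_residues, supported_formats, max_residues):
    """Return an (HTTP status code, message) pair for a segment request.

    supported_formats holds the names advertised in FORMAT elements;
    "das2xml" is always accepted because every segments capability must
    support it. max_residues is a server-chosen limit (hypothetical).
    """
    if fmt != "das2xml" and fmt not in supported_formats:
        # Unsupported format name: 400 "Bad Request".
        return 400, f"format {fmt!r} is not supported"
    if response_residues > max_residues:
        # Response too large: 413 "Request Entity Too Large", with a hint
        # about what the server would accept.
        return 413, f"response too large; at most {max_residues} residues per request"
    return 200, ""


bad_format = check_request("embl", 10, {"fasta", "agp"}, 1000)
too_large = check_request("fasta", 5000, {"fasta", "agp"}, 1000)
ok = check_request("das2xml", 10, set(), 1000)
```

The format check runs first, so a request that is both unsupported and<br />
oversized is reported as 400 in this sketch; that ordering is a design choice<br />
of the example, not a requirement stated here.<br />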
<br />
If the start or end of the range is negative, or greater than the length <br />
of the segment, or beyond the limits of the integer data type used for <br />
the range then the server MAY respond with an HTTP error 400 "Bad Request". <br />
Servers MUST accept ranges from 0 up to and including the length of the segment, <br />
except for the case where the response is considered too large.</div>SteveC