Background

Rationale for the existence of a web service for bio-ontologies

Bio-ontologies are an excellent tool for describing the complexity of biological systems in a form that can be easily processed by computational means. To this point, the focus of ontologies in biology has largely been on the construction of the ontologies themselves and on ensuring that the semantic model is structured correctly. It is the opinion of the authors that the structure of the ontologies is now sufficiently stable that they are ready to be more tightly integrated into curatorial applications. The groups maintaining the most prominent genome sequence feature and ontology curation tools, namely Apollo and OBO-Edit, are also expressing interest in the integration of ontology and sequence curation software.

The DAS/2 protocol for ontology retrieval and manipulation furthers the aim of greater use of ontologies in the annotation of biological objects by providing a platform-neutral mechanism for ontology access. This opens opportunities to annotate biological objects for which sophisticated editorial and visualization applications do not yet exist.

Status of this document

The general structure for formulating queries described here can be considered stable. It is intentionally designed to be similar to URIs used elsewhere in the DAS/2 specification, specifically the /das/assay and /das/genome domains.

GET Requests on DAS/2 Ontology URLs

The primary way to retrieve information from a DAS/2 server is to perform a GET request on a DAS/2 URL.

The "ontologies" Request: Fetching Information about Ontologies

The document returned contains a list of all ontologies available from the service.

text/xml

REQUEST:

GET http://das.biopackages.net/das/ontology/obo/1/ontology

RESPONSE:

Content-Type: text/xml

<?xml version="1.0" standalone="no"?>
<!DOCTYPE DAS2ONTOLOGIES SYSTEM "http://www.biodas.org/dtd/das2ontologies.dtd">
<ONTOLOGIES>
  <ONTOLOGY
       xmlns="http://www.biodas.org/ns/das/genome/2.00"
       xmlns:xlink="http://www.w3.org/1999/xlink"
       xml:base="http://jugular.ctrl.ucla.edu:8529/das/ontology/obo/1/ontology/"
       id="PATO"
       definition=""
  />
  <ONTOLOGY
       xmlns="http://www.biodas.org/ns/das/genome/2.00"
       xmlns:xlink="http://www.w3.org/1999/xlink"
       xml:base="http://jugular.ctrl.ucla.edu:8529/das/ontology/obo/1/ontology/"
       id="REX"
       definition=""
  />
</ONTOLOGIES>
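
As a minimal sketch of how a client might consume this response, the Python below collects the id of each ONTOLOGY element. The function name list_ontology_ids and the trimmed sample document are assumptions made for illustration; they are not part of the DAS/2 specification.

```python
import xml.etree.ElementTree as ET


def list_ontology_ids(xml_text):
    """Return the id attribute of every ONTOLOGY element in the response."""
    root = ET.fromstring(xml_text)
    ids = []
    for element in root.iter():
        # Compare local names so the sketch works whether or not the
        # DAS/2 namespace is declared on the element.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "ONTOLOGY" and "id" in element.attrib:
            ids.append(element.attrib["id"])
    return ids


# Trimmed copy of the response above (hypothetical test input).
sample = """<ONTOLOGIES>
  <ONTOLOGY xmlns="http://www.biodas.org/ns/das/genome/2.00"
            id="PATO" definition=""/>
  <ONTOLOGY xmlns="http://www.biodas.org/ns/das/genome/2.00"
            id="REX" definition=""/>
</ONTOLOGIES>"""

print(list_ontology_ids(sample))  # ['PATO', 'REX']
```

Each id returned this way could then be appended to the "ontologies" URL to form an "ontology" request.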

The "ontology" Request: Fetching Information about a Specific Ontology

The document returned contains all terms and edge types for the specified ontology.

text/xml

REQUEST:

GET http://das.biopackages.net/das/ontology/obo/1/ontology/CL

RESPONSE:

Content-Type: text/xml

<?xml version="1.0" standalone="no"?>
<obo
     xmlns="http://www.biodas.org/ns/das/genome/2.00"
     xmlns:xlink="http://www.w3.org/1999/xlink"
     xml:base="http://das.biopackages.net/das/ontology/obo/1/ontology/">
  <source>
    <source_type>DAS/2 Server</source_type>
    <source_path/>
    <source_md5/>
    <source_mtime/>
  </source>
  <header>
    <format-version>1.0</format-version>
    <date/>
    <saved-by>DAS/2 Server</saved-by>
    <auto-generated-by>DAS/2 Server</auto-generated-by>
    <default-namespace/>
    <remark/>
  </header>
    
  <term id="CL/0000438">
    <id>CL/0000438</id>
    <name>luteinizing_hormone_secreting_cell</name>
    <namespace>cell.ontology</namespace>
    <is_a>CL/0000167</is_a>
    <synonym scope="exact">
      <synonym_text>delta_basophila</synonym_text>
      <synonym_text>gonadotroph</synonym_text>
    </synonym>
  </term>
  <term id="CL/0000141">
    <id>CL/0000141</id>
    <name>cementocyte</name>
    <namespace>cell.ontology</namespace>
    <develops_from>CL/0000061</develops_from>
    <develops_from>CL/0000134</develops_from>
    <synonym scope="exact"/>
  </term>
  <typedef id="OBO_REL/is_a">
    <id>OBO_REL/is_a</id>
    <name>is_a</name>
    <namespace>relationship</namespace>
  </typedef>
  <typedef id="OBO_REL/develops_from">
    <id>OBO_REL/develops_from</id>
    <name>develops_from</name>
    <namespace>relationship</namespace>
  </typedef>  
</obo>

The format of the response to the "ontology" request can be adjusted using the "format" argument. These formats are described in the next section.
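
The default XML response shown above can be read with any namespace-aware XML parser. The following Python is a sketch under stated assumptions: the function name parse_terms, the dictionary layout, and the shortened sample are all hypothetical, and only the element names seen in the example response are handled.

```python
import xml.etree.ElementTree as ET

# Default namespace declared on the <obo> root in the example response.
NS = {"das": "http://www.biodas.org/ns/das/genome/2.00"}


def parse_terms(xml_text):
    """Map each term id to its name, relationships, and synonyms."""
    root = ET.fromstring(xml_text)
    terms = {}
    for term in root.findall("das:term", NS):
        terms[term.get("id")] = {
            "name": term.findtext("das:name", default="", namespaces=NS),
            "is_a": [e.text for e in term.findall("das:is_a", NS)],
            "develops_from": [
                e.text for e in term.findall("das:develops_from", NS)
            ],
            "synonyms": [
                e.text
                for e in term.findall("das:synonym/das:synonym_text", NS)
            ],
        }
    return terms


# Shortened version of the example response (hypothetical test input).
sample = """<obo xmlns="http://www.biodas.org/ns/das/genome/2.00">
  <term id="CL/0000438">
    <id>CL/0000438</id>
    <name>luteinizing_hormone_secreting_cell</name>
    <is_a>CL/0000167</is_a>
    <synonym scope="exact">
      <synonym_text>delta_basophila</synonym_text>
      <synonym_text>gonadotroph</synonym_text>
    </synonym>
  </term>
  <term id="CL/0000141">
    <id>CL/0000141</id>
    <name>cementocyte</name>
    <develops_from>CL/0000061</develops_from>
    <develops_from>CL/0000134</develops_from>
    <synonym scope="exact"/>
  </term>
</obo>"""

terms = parse_terms(sample)
print(terms["CL/0000141"]["develops_from"])
```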

Ontology Formats

The format argument selects the format of the returned ontology document. Values may include any of those listed in Table 1, or any other format that the client and server agree on. A DAS/2 server is only obliged to provide the das2xml format (synonymous with OBO XML).

Table 1: Ontology formats

Format	Description
das2xml	http://www.godatabase.org/dev/xml/xsl/
compact	a terse tab-delimited format, described below
owl	http://www.w3.org/TR/owl-xmlsyntax/
rdf	http://www.w3.org/TR/rdf-syntax-grammar/

Compact format

The compact response format presents one ontology term per line. Each term is represented by four tab-delimited columns, as described in Table 2.

Table 2: Ontology "compact" format

Column	Description
1	Ontology dbspace
2	Term accession
3	Term name (or possibly a synonym, depending on the request)
4	Term namespace

REQUEST:

GET http://das.biopackages.net/das/ontology/obo/1/ontology/CL?format=compact;term=egg*

RESPONSE:

Content-Type: text/plain

#base = http://das.biopackages.net/das/ontology/obo/1/ontology
CL      CL/0000254      egg_cell        cell.ontology
CL      CL/0000025      egg     cell.ontology
CL      CL/0000659      eggshell_secreting_cell cell.ontology

Ontology Filters

The list of terms returned, as well as their respective namespaces, can be filtered using a selection language based on query string argument=value pairs. Filter arguments are alphanumerics plus the underscore. Filter values are URI-escaped strings but otherwise unconstrained. In addition to the standard characters that must be URI escaped, the characters [:,/*] (colon, comma, slash, asterisk) have significance to the filter processor and must also be URI escaped when they are meant literally.

Returned terms can be filtered by combinations of their name, synonym, and ontology dbspace:

term=string: Match ontology terms with name or synonym matching string. string may contain asterisks '*' as wildcards to match zero or more characters.
ontology=string: Match ontology terms from ontologies matching string. Wildcards can be used, as described above.

Filters can be combined in a limited boolean fashion. Within the URL query string, multiple criterion=value pairs are ANDed together:

ontology=Fb*;term=egg

(Terms named "egg" from ontologies with a dbspace name beginning with "Fb".)

To describe OR relationships, you may separate filter values by commas. For example, this retrieves all terms named "egg" or beginning with "ovu".

term=egg,ovu*

AND and OR relationships can be combined:

ontology=Fb*;term=egg,ovu*
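
The combination rules above can be sketched as a small query-string builder. The function name build_filter_query is hypothetical. The sketch assumes every asterisk in a value is intended as a wildcard and therefore leaves it unescaped, while colons, commas, and slashes inside a value are URI escaped.

```python
from urllib.parse import quote


def build_filter_query(filters):
    """Build a filter query string from {argument: [value, ...]}.

    Values for one argument are ORed (joined with commas); separate
    arguments are ANDed (joined with semicolons).
    """
    parts = []
    for argument, values in filters.items():
        # safe="*" keeps wildcard asterisks; ':', ',' and '/' are escaped.
        escaped = [quote(value, safe="*") for value in values]
        parts.append(f"{argument}={','.join(escaped)}")
    return ";".join(parts)


query = build_filter_query({"ontology": ["Fb*"], "term": ["egg", "ovu*"]})
print(query)  # ontology=Fb*;term=egg,ovu*
```

A value containing a literal asterisk would need separate handling, which this sketch deliberately leaves out.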

The "term" Request: Fetching Information about an Ontology Term

The document returned is formatted the same as the response to the "ontology" request, but contains only the requested fragment of the ontology.

text/xml

REQUEST:

GET http://das.biopackages.net/das/ontology/obo/1/ontology/CL/egg

RESPONSE:

Content-Type: text/xml

<?xml version="1.0" standalone="no"?>
<obo
     xmlns="http://www.biodas.org/ns/das/genome/2.00"
     xmlns:xlink="http://www.w3.org/1999/xlink"
     xml:base="http://das.biopackages.net/das/ontology/obo/1/ontology/">
  <source>
    <source_type>DAS/2 Server</source_type>
    <source_path/>
    <source_md5/>
    <source_mtime/>
  </source>
  <header>
    <format-version>1.0</format-version>
    <date/>
    <saved-by>DAS/2 Server</saved-by>
    <auto-generated-by>DAS/2 Server</auto-generated-by>
    <default-namespace/>
    <remark/>
  </header>
  <term id="CL/0000025">
    <id>CL/0000025</id>
    <name>egg</name>
    <namespace>cell.ontology</namespace>
    <is_a>CL/0000675</is_a>
    <synonym scope="exact">
      <synonym_text>ovum</synonym_text>
    </synonym>
  </term>
  <typedef id="OBO_REL/is_a">
    <id>OBO_REL/is_a</id>
    <name>is_a</name>
    <namespace>relationship</namespace>
  </typedef>
</obo>
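
The request URLs used in the examples of this document share one pattern: a base URL, optionally followed by an ontology dbspace, a term, and a query string. The helper below is an illustrative sketch of that pattern; the name ontology_url and its parameters are assumptions, and the base URL is simply the one used in the examples.

```python
from urllib.parse import quote

# Base URL taken from the examples in this document.
BASE = "http://das.biopackages.net/das/ontology/obo/1/ontology"


def ontology_url(base=BASE, ontology=None, term=None, query=""):
    """Build an "ontologies", "ontology", or "term" request URL."""
    if term is not None and ontology is None:
        raise ValueError("a term request also needs an ontology")
    url = base
    if ontology is not None:
        url += "/" + quote(ontology, safe="")
    if term is not None:
        url += "/" + quote(term, safe="")
    if query:
        # The query is assumed to be already escaped by the caller.
        url += "?" + query
    return url


print(ontology_url())                            # list all ontologies
print(ontology_url(ontology="CL"))               # one ontology
print(ontology_url(ontology="CL", term="egg"))   # one term
```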

Notes and Discussion Points

related isn't in OBO_REL. OBO_REL is for relationships that can hold between biological entities, not between biological entities and information entities - so synonym types wouldn't go in there.

The relationship between a term and its synonym(s) is currently typed as "related". This is the relationship type assigned by the go-db and go-dev tools that are used to parse and load the OBO ontologies into the database. If "related" is an incorrect way to represent term/synonym relationships, it needs to be addressed in the go-db/go-dev API with the chado database.

In some sense, this point will also be moot once we have made the transition to obo-xml. The appearance will be correct and will not use "related", but the backend implementation will still use it and may need to be corrected as described.

The ontology metadata section will become very important.

What does this mean?

I have concerns about the size of the API. Have you seen the go db api? ok a lot of that is for getting associations. and it certainly could have been designed better. but the number of potential canned queries could get huge. will you have some kind of way of doing queries - either primitive or advanced? will you have calls for fetching the transitive closure? what kind of programs will use this api?

The intent is to keep the core API as minimal as possible, and to accommodate canned queries and more advanced query capability via a filtering mechanism. For more information about how filters work, refer to the "term" and "ontology" filters described above (TODO), or to the feature filter section of the /das/genome specification.

Use cases driving the current API
- Include a synopsis of the NINDS sample annotation system proposed by Ball & Sherlock.
- Include a synopsis of the /das/assay client being implemented by the Nelson lab.

Allen Day, allenday@ucla.edu
University of California, Los Angeles

Last modified: Sun Oct 30 17:07:16 PST 2005