Bio-ontologies are an excellent tool for describing the complexity of biological systems in a form that can be easily processed by computational means. To this point, the focus of ontologies in biology has largely been on the construction of the ontologies themselves and on ensuring that the semantic model is structured correctly. It is the opinion of the authors that the structure of the ontologies is now sufficiently stable that they are ready to be more tightly integrated into curatorial applications. The groups maintaining the most prominent genome sequence feature and ontology curation tools, namely Apollo and OBO-Edit, are also expressing interest in the integration of ontology and sequence curation software.
The DAS/2 protocol for ontology retrieval and manipulation furthers the aim of greater use of ontologies in the annotation of biological objects by providing a platform-neutral mechanism for ontology access. This opens opportunities to annotate biological objects for which sophisticated editorial and visualization applications do not yet exist.
The general structure for formulating queries described here can be considered stable. It is intentionally designed to be similar to URIs used elsewhere in the DAS/2 specification, specifically the /das/assay and /das/genome domains.
The primary way to retrieve information from a DAS/2 server is to perform a GET request on a DAS/2 URL.
The document returned contains a list of all ontologies available from the service.
REQUEST:
GET http://das.biopackages.net/das/ontology/obo/1/ontology
RESPONSE:
Content-Type: text/xml

<?xml version="1.0" standalone="no"?>
<!DOCTYPE DAS2ONTOLOGIES SYSTEM "http://www.biodas.org/dtd/das2ontologies.dtd">
<ONTOLOGIES>
  <ONTOLOGY xmlns="http://www.biodas.org/ns/das/genome/2.00"
            xmlns:xlink="http://www.w3.org/1999/xlink"
            xml:base="http://jugular.ctrl.ucla.edu:8529/das/ontology/obo/1/ontology/"
            id="PATO" definition="" />
  <ONTOLOGY xmlns="http://www.biodas.org/ns/das/genome/2.00"
            xmlns:xlink="http://www.w3.org/1999/xlink"
            xml:base="http://jugular.ctrl.ucla.edu:8529/das/ontology/obo/1/ontology/"
            id="REX" definition="" />
</ONTOLOGIES>
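A client might consume this listing along the following lines. This is a minimal sketch in Python and not part of the DAS/2 specification: the function names are hypothetical, the fetch helper assumes the server is reachable at a base URL such as the one in the request above, and the embedded sample is a trimmed copy of the response shown here.

```python
import xml.etree.ElementTree as ET
from urllib.request import urlopen


def fetch_ontology_listing(base_url):
    """Perform the GET request and return the raw response body (bytes).

    base_url is assumed to look like
    "http://das.biopackages.net/das/ontology/obo/1" (no trailing slash).
    """
    with urlopen(base_url + "/ontology") as response:
        return response.read()


def list_ontology_ids(xml_text):
    """Return the id attribute of every ONTOLOGY element in a listing."""
    root = ET.fromstring(xml_text)
    ids = []
    for elem in root.iter():
        # Compare local names so the sketch works whether or not the
        # element carries the DAS/2 namespace.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "ONTOLOGY" and "id" in elem.attrib:
            ids.append(elem.attrib["id"])
    return ids


# Trimmed copy of the example response (assumed to be well-formed).
SAMPLE = (
    "<ONTOLOGIES>"
    '<ONTOLOGY xmlns="http://www.biodas.org/ns/das/genome/2.00" id="PATO" definition="" />'
    '<ONTOLOGY xmlns="http://www.biodas.org/ns/das/genome/2.00" id="REX" definition="" />'
    "</ONTOLOGIES>"
)

if __name__ == "__main__":
    print(list_ontology_ids(SAMPLE))
```

Each returned id can then be appended to the ontology URL to request that ontology's terms.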
The document returned contains all terms and edge types for the specified ontology.
REQUEST:
GET http://das.biopackages.net/das/ontology/obo/1/ontology/CL
RESPONSE:
Content-Type: text/xml

<?xml version="1.0" standalone="no"?>
<obo xmlns="http://www.biodas.org/ns/das/genome/2.00"
     xmlns:xlink="http://www.w3.org/1999/xlink"
     xml:base="http://das.biopackages.net/das/ontology/obo/1/ontology/">
  <source>
    <source_type>DAS/2 Server</source_type>
    <source_path/>
    <source_md5/>
    <source_mtime/>
  </source>
  <header>
    <format-version>1.0</format-version>
    <date/>
    <saved-by>DAS/2 Server</saved-by>
    <auto-generated-by>DAS/2 Server</auto-generated-by>
    <default-namespace/>
    <remark/>
  </header>
  <term id="CL/0000438">
    <id>CL/0000438</id>
    <name>luteinizing_hormone_secreting_cell</name>
    <namespace>cell.ontology</namespace>
    <is_a>CL/0000167</is_a>
    <synonym scope="exact">
      <synonym_text>delta_basophila</synonym_text>
      <synonym_text>gonadotroph</synonym_text>
    </synonym>
  </term>
  <term id="CL/0000141">
    <id>CL/0000141</id>
    <name>cementocyte</name>
    <namespace>cell.ontology</namespace>
    <develops_from>CL/0000061</develops_from>
    <develops_from>CL/0000134</develops_from>
    <synonym scope="exact"/>
  </term>
  <typedef id="OBO_REL/is_a">
    <id>OBO_REL/is_a</id>
    <name>is_a</name>
    <namespace>relationship</namespace>
  </typedef>
  <typedef id="OBO_REL/develops_from">
    <id>OBO_REL/develops_from</id>
    <name>develops_from</name>
    <namespace>relationship</namespace>
  </typedef>
</obo>
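To illustrate how a client could read terms and their edges out of such a document, here is a hedged Python sketch. The function name and the shape of the returned dictionary are illustrative choices, not part of the specification. The sketch assumes every element lives in the namespace declared on the obo root element, and it reads only the is_a and develops_from edges that appear in the example.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the xmlns attribute of the example response.
NS = {"das": "http://www.biodas.org/ns/das/genome/2.00"}


def parse_terms(xml_text):
    """Map each term id to its name, namespace, edges, and synonyms."""
    root = ET.fromstring(xml_text)
    terms = {}
    for term in root.findall("das:term", NS):
        terms[term.get("id")] = {
            "name": term.findtext("das:name", default="", namespaces=NS),
            "namespace": term.findtext("das:namespace", default="", namespaces=NS),
            "is_a": [e.text for e in term.findall("das:is_a", NS)],
            "develops_from": [e.text for e in term.findall("das:develops_from", NS)],
            "synonyms": [
                e.text for e in term.findall("das:synonym/das:synonym_text", NS)
            ],
        }
    return terms


# Trimmed copy of the example response (source, header, typedefs omitted).
SAMPLE = """<obo xmlns="http://www.biodas.org/ns/das/genome/2.00">
  <term id="CL/0000438">
    <id>CL/0000438</id>
    <name>luteinizing_hormone_secreting_cell</name>
    <namespace>cell.ontology</namespace>
    <is_a>CL/0000167</is_a>
    <synonym scope="exact">
      <synonym_text>delta_basophila</synonym_text>
      <synonym_text>gonadotroph</synonym_text>
    </synonym>
  </term>
  <term id="CL/0000141">
    <id>CL/0000141</id>
    <name>cementocyte</name>
    <namespace>cell.ontology</namespace>
    <develops_from>CL/0000061</develops_from>
    <develops_from>CL/0000134</develops_from>
    <synonym scope="exact"/>
  </term>
</obo>"""

if __name__ == "__main__":
    for term_id, info in parse_terms(SAMPLE).items():
        print(term_id, info["name"], info["is_a"], info["develops_from"])
```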
The format of the response to the "ontology" request can be adjusted using the "format" argument. These formats are described in the next section.
The format argument selects the format of the returned document. Values may include any of those listed in Table 1, or any other formats that the client and server agree on. A DAS/2 server is only obliged to provide the das2xml format (synonymous with OBO XML).
Table 1: Ontology formats
Format    Description
das2xml   http://www.godatabase.org/dev/xml/xsl/
compact   a terse tab-delimited format, described below
owl       http://www.w3.org/TR/owl-xmlsyntax/
rdf       http://www.w3.org/TR/rdf-syntax-grammar/
The compact response format presents one ontology term per line. Each term is represented by four tab-delimited columns, as described in Table 2.
Table 2: Ontology "compact" format
Column    Description
1         Ontology dbspace
2         Term accession
3         Term name (or possibly a synonym, depending on the request)
4         Term namespace
REQUEST:
GET http://das.biopackages.net/das/ontology/obo/1/ontology/CL?format=compact;term=egg*
RESPONSE:
Content-Type: text/plain

#base = http://das.biopackages.net/das/ontology/obo/1/ontology
CL	CL/0000254	egg_cell	cell.ontology
CL	CL/0000025	egg	cell.ontology
CL	CL/0000659	eggshell_secreting_cell	cell.ontology
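The four-column layout of Table 2 lends itself to simple line-oriented parsing. The Python sketch below is illustrative only: the names are hypothetical, and treating a leading "#base = ..." line as a header that carries the base URL is an assumption read off the example response rather than a documented rule.

```python
from collections import namedtuple

# One row of the compact format, in the column order given in Table 2.
CompactTerm = namedtuple("CompactTerm", "dbspace accession name namespace")


def parse_compact(text):
    """Return (base_url_or_None, list_of_CompactTerm) for a compact body."""
    base = None
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("#"):
            # Assumed header syntax: "#base = <url>"; other "#" lines are ignored.
            key, sep, value = line[1:].partition("=")
            if sep and key.strip() == "base":
                base = value.strip()
            continue
        fields = line.split("\t")
        if len(fields) != 4:
            raise ValueError("expected 4 tab-delimited columns: %r" % line)
        rows.append(CompactTerm(*fields))
    return base, rows


# The example response body, with tabs written out explicitly.
SAMPLE = (
    "#base = http://das.biopackages.net/das/ontology/obo/1/ontology\n"
    "CL\tCL/0000254\tegg_cell\tcell.ontology\n"
    "CL\tCL/0000025\tegg\tcell.ontology\n"
    "CL\tCL/0000659\teggshell_secreting_cell\tcell.ontology\n"
)

if __name__ == "__main__":
    base, rows = parse_compact(SAMPLE)
    for row in rows:
        print(base + "/" + row.accession, row.name)
```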
The list of terms returned, as well as their respective namespaces, can be filtered using a selection language based on query string argument=value pairs. Filter arguments consist of alphanumeric characters plus the underscore. Filter values are URI-escaped strings but are otherwise unconstrained. In addition to the standard characters that must be URI escaped, the characters [:,/*] (colon, comma, slash, asterisk) have significance to the filter processor and must also be URI escaped.
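As a rough illustration of these escaping rules, the Python sketch below uses the standard library's urllib.parse.quote. The helper names are hypothetical, and the sketch assumes the escaping applies to characters meant literally inside a value, as opposed to a comma or asterisk used as a filter operator.

```python
import re
from urllib.parse import quote


def is_valid_filter_name(name):
    """Filter arguments are alphanumerics plus the underscore."""
    return re.fullmatch(r"[A-Za-z0-9_]+", name) is not None


def escape_filter_value(value):
    """Percent-encode a literal filter value.

    With safe="" the slash is encoded as well; quote() already encodes
    the colon, comma, and asterisk because they are not unreserved
    characters.
    """
    return quote(value, safe="")


if __name__ == "__main__":
    print(escape_filter_value("a:b,c/d*"))
    print(is_valid_filter_name("term"), is_valid_filter_name("bad-name"))
```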
Returned terms can be filtered by combinations of their name, synonym, and ontology dbspace.
Filters can be combined in a limited boolean fashion. Within the URL query string, multiple criterion=value pairs are ANDed together. For example, this retrieves terms named "egg" from ontologies whose dbspace name begins with "Fb":
ontology=Fb*;term=egg
To describe OR relationships, you may separate filter values by commas. For example, this retrieves all terms whose names are "egg" or begin with "ovu":
term=egg,ovu*
AND and OR relationships can be combined:
ontology=Fb*;term=egg,ovu*
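The AND/OR combination rules can be sketched as a small query-string builder. This Python example is illustrative and not part of the specification: the function name is hypothetical, and it assumes every asterisk in a value is intended as a wildcard, so literal asterisks are not supported.

```python
from urllib.parse import quote


def build_filter_query(filters):
    """Build a filter query string from {filter_name: [alternatives]}.

    Alternatives for one filter are ORed (joined with commas); separate
    filters are ANDed (joined with semicolons).
    """
    parts = []
    for name, alternatives in filters.items():
        # Keep "*" unescaped because it is assumed to be a wildcard here;
        # every other significant character in a value is percent-encoded.
        escaped = [quote(value, safe="*") for value in alternatives]
        parts.append(name + "=" + ",".join(escaped))
    return ";".join(parts)


if __name__ == "__main__":
    query = build_filter_query({"ontology": ["Fb*"], "term": ["egg", "ovu*"]})
    print(query)
```

The resulting string would be appended after a "?" to an ontology URL, as in the compact-format request shown earlier.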
The document returned is formatted the same as the response to the "ontology" request, but contains only the requested fragment of the ontology.
REQUEST:
GET http://das.biopackages.net/das/ontology/obo/1/ontology/CL/egg
RESPONSE:
Content-Type: text/xml

<?xml version="1.0" standalone="no"?>
<obo xmlns="http://www.biodas.org/ns/das/genome/2.00"
     xmlns:xlink="http://www.w3.org/1999/xlink"
     xml:base="http://das.biopackages.net/das/ontology/obo/1/ontology/">
  <source>
    <source_type>DAS/2 Server</source_type>
    <source_path/>
    <source_md5/>
    <source_mtime/>
  </source>
  <header>
    <format-version>1.0</format-version>
    <date/>
    <saved-by>DAS/2 Server</saved-by>
    <auto-generated-by>DAS/2 Server</auto-generated-by>
    <default-namespace/>
    <remark/>
  </header>
  <term id="CL/0000025">
    <id>CL/0000025</id>
    <name>egg</name>
    <namespace>cell.ontology</namespace>
    <is_a>CL/0000675</is_a>
    <synonym scope="exact">
      <synonym_text>ovum</synonym_text>
    </synonym>
  </term>
  <typedef id="OBO_REL/is_a">
    <id>OBO_REL/is_a</id>
    <name>is_a</name>
    <namespace>relationship</namespace>
  </typedef>
</obo>
"related" isn't in OBO_REL. OBO_REL is for relationships that can hold between biological entities, not between biological entities and information entities, so synonym types wouldn't go in there.
The relationship between a term and its synonym(s) is currently typed as "related". This is the relationship type assigned by the go-db and go-dev tools that are used to parse the OBO ontologies and load them into the database. If "related" is an incorrect way to represent term/synonym relationships, it needs to be addressed in the go-db/go-dev API with the chado database.
In some sense, this point will also be moot once we have made the transition to use obo-xml. The appearance will be correct and not use "related", but the backend implementation will still use it and may need to be corrected as described.
The ontology metadata section will become very important.
What does this mean?
I have concerns about the size of the API. Have you seen the go db api? OK, a lot of that is for getting associations, and it certainly could have been designed better, but the number of potential canned queries could get huge. Will you have some kind of way of doing queries, either primitive or advanced? Will you have calls for fetching the transitive closure? What kind of programs will use this API?
The intent is to keep the core API as minimal as possible, and to accommodate canned queries and more advanced query capability via a filtering mechanism. For more information about how filters work, refer to the "term" and "ontology" filters described above (TODO), or to the feature filter section of the /das/genome specification.
Use cases driving the current API