Scientific Ramblings: June 2010

(modified from an email that Mark Wilkinson sent)

SADI is a very lightweight "standard" (set of best-practices, really) for modeling and providing Web Services. It uses standards from the W3C Semantic Web initiative - in particular, it uses OWL for types, and RDF for instance data.

SADI is used to expose "resources" to the world in a manner that can be discovered automatically, and accessed automatically, by software. Those resources might be data inside databases (where SADI replaces the traditional Web Query page), or they might be analytical algorithms that consume data, chug away on it, and return output data. In both cases, the interfaces are structurally identical, so from the perspective of the client software, it doesn't have to know or care whether it is trying to get data out of a database or out of an analytical tool - the question/query structure is the same, and moreover, it is completely predictable.

This is critical advantage #1 for SADI over traditional Web Services frameworks - in traditional XML-based Web Services, you still must code your client software to access each service, since the service interfaces cannot be interpreted by the machine. In SADI, we can design ONE piece of software to access all resources exposed as SADI services - "one ring to rule them all!". (and we already have several different "rings" that expose SADI data in different ways)

Critical advantage #2 is a bit more obscure and hard to describe, but is likely to be the more important in the long-run. In SADI, data is "grounded" in explicit semantics. This means that all data in SADI carries with it information about what TYPE of data it is, and how that data relates to other data (e.g. genes transcribed into transcripts translated into proteins which regulate genes: Gene, Transcript, Protein are all data types, and "transcribed", "translated", "regulate" are relationships between them). With this explicit (and extensive!) grounding in semantics, we can start asking our machines to do a lot of the interpretation for us. For example, "what gene regulates gene X" is a nonsensical question biologically, but it's a question that biologists ask all the time! With a solid grounding in semantics, the machine would be able to follow the logical pathway above and say "well, to answer that question, I am going to have to go through transcripts and proteins to get there" and then automatically construct the pipeline of services that get to the answer. This is just one example of how Semantics can be used to facilitate question-answering.

There are several tutorials available.

for what it can do: http://www.slideshare.net/markmoby/sadi-swsip-09

then go to http://sadiframework.org to find the more specific tutorials on how to deploy services.

The current list of available services is at http://sadiframework.org/registry/services/ and that list will be growing rapidly over the next year (we have committed to having at least 400 more services, but I suspect that we'll go far beyond that number!)

Friday, June 4, 2010

SADI