Scientific Ramblings
Friday, October 14, 2011
Google Maps: break in the road bug!
Dear Google Maps,
I'm trying to plan a trip [1] in and around the Atacama Desert while I'm here in Chile, but have found an interesting bug. Basically, there is a break in the road (between A and B) which prevents maps from using that road. To reproduce, see here: http://g.co/maps/zwq8y
There's also a break on the road directly east of that break point.
I'd report the problem to you, if only you had the "Report a Problem" link anywhere on this page, despite your claims [2] to the contrary.
Best,
m.
[1] http://g.co/maps/3zwqt
[2] http://maps.google.com/support/bin/answer.py?hl=en&answer=162873&topic=1687362
Sunday, September 25, 2011
Provenance: what is it and how should we formalize it?
As a testament to the growing recognition of provenance for (e-)science, i'm glad to see that the W3C incubator group worked hard to think about the issues and make it possible to establish a W3C provenance interchange Working Group.
a good starting point:
"provenance is often represented as metadata, but not all metadata is necessarily provenance"
but
"Descriptive metadata of a resource only becomes part of its provenance when one also specifies its relationship to deriving the resource."
does not provide adequate description for identifying the conditions.
and
"Provenance of a resource is a record that describes entities and processes involved in producing and delivering or otherwise influencing that resource"
contains elements that are undefined (record), uncertain (are processes not also entities?), narrow (producing/delivering) and broad (influencing).
Of course, I appreciate the difficulty in crafting a good definition, and I understand that this is a definition from which useful work can be achieved. I will take the opportunity to express my thoughts on the matter.
i think there are two key aspects to provenance (not unlike what is suggested here: http://www.springerlink.com/content/edf0k68ccw3a22hu/)
1. how did the resource come about? (relates to creation and justification)
- important for reproducibility (which is an element of science)
- includes attribution (who created the resource), creation (process that generated the resource), reproduction (process in which a copy was made), derivation (process in which the resource was generated from some resource or portion of a resource), versioning (process of keeping count of sequential derivations)
2. what is the history of the resource (from the point of creation)
- important for authenticity
- includes origin, possession and the acts of transfer
Both have implications for trust, and can be used for accountability, among other things.
I find this part on recommendations of a provenance framework quite nice:
but get less excited when i see the collection of "provenance concepts"
particularly because we need to simplify the discourse such that we consider
an event (for 1 above)
- participants (and their roles; e.g. agents, targets, products)
- locations
- time instants (e.g. action timestamps) and durations (processual attributes)
and a sequence of events (for both 1 and 2 above)
this would certainly help to generate a specification with a minimal set of classes and relations to express this kind of information.
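To make the "minimal set of classes and relations" idea concrete, here is a rough sketch in Python of the event-centred view described above. All of the names (Participant, Event, history, the role strings and the sample data) are hypothetical and chosen only for illustration; they are not taken from the W3C group's documents or from any existing vocabulary.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class Participant:
    """Something that takes part in an event, together with its role."""
    name: str
    role: str  # hypothetical role labels, e.g. "agent", "target", "product"


@dataclass
class Event:
    """A single event: participants, a time instant, and optional location/duration."""
    label: str
    participants: List[Participant]
    start: datetime
    location: Optional[str] = None
    duration: Optional[timedelta] = None

    def with_role(self, role: str) -> List[str]:
        """Names of the participants that play the given role in this event."""
        return [p.name for p in self.participants if p.role == role]


def history(events: List[Event], resource: str) -> List[Event]:
    """The sequence of events that involve a resource, ordered by start time."""
    involved = [
        e for e in events
        if any(p.name == resource for p in e.participants)
    ]
    return sorted(involved, key=lambda e: e.start)


# Illustrative data only: a creation event (aspect 1) and a later transfer (aspect 2).
events = [
    Event(
        "transfer",
        [Participant("alice", "agent"), Participant("dataset-v1", "target")],
        datetime(2011, 9, 26),
        location="Ottawa",
    ),
    Event(
        "creation",
        [Participant("alice", "agent"), Participant("dataset-v1", "product")],
        datetime(2011, 9, 25),
    ),
]

for event in history(events, "dataset-v1"):
    print(event.start.date(), event.label, event.with_role("agent"))
```

In this sketch, attribution and creation fall out of a single event (who was the agent, what was the product), while possession and transfer are read off the ordered sequence of events for the same resource.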
now, i'm writing this late at night, and I appreciate that I may not have considered all the issues that the provenance group has (along with others that have written about the subject), but perhaps there are still some good discussions to be had wrt provenance and how we formally represent it, as it is of strategic importance to the HCLSIG in our current and future efforts.
Labels:
hcls,
provenance,
representation,
semantic web,
w3c
Thursday, September 22, 2011
A letter to gmail: attachments
Dear Gmail team.
First, thank you for making it possible for me to see my unread mail - i sent you this idea some time ago, and i'm glad that you listened.
Second, I'm now somehow at 50% of my allocated capacity, and what i need is a way to filter my mails by attachment (which i can do!), but also order them from largest to smallest (which I can't do). Once i can order attachments by size, i can start deleting the big ones and free up more room! YAY!
m.
Tuesday, September 20, 2011
On the topic of ontology evaluation
With only a cursory look at the literature pertaining to the evaluation of ontologies, I already get the feeling that the current measures completely miss the point. The answer doesn't lie in the syntax (format) or structure of the ontology (the number of classes and properties, subsumed classes, axioms, etc.); rather, the effectiveness of an ontology (a representation of knowledge) lies in whether its semantics can be used for some task. So what we really want is to focus on the nature of the task, and on whether an ontology provides a competitive advantage over other technologies.
Off the top of my head, here are some tasks:
- search/browse (most sites using GO, etc)
- text annotation (gopubmed)
- data normalization and structured queries - bio2rdf
- answering questions that require background knowledge e.g. across a yeast database.
- data integration (heterogeneous types of data; data from different sources, of differing granular detail) (see my translational medicine paper)
- classification e.g. domains or chemicals
- prediction e.g. predicting phenotypes
Perhaps others can suggest some?
Labels:
application,
evaluation,
ontology,
task,
tools,
use case,
utility
Sunday, September 11, 2011
New Charter for the W3C Health Care and Life Sciences Interest Group
Last week, the World Wide Web Consortium (W3C) approved a new charter for its Health Care and Life Sciences Interest Group (HCLSIG), in which I (Carleton University) along with Charlie Mead (NIH CBIIT) and Vijay Bulusu (Pfizer) were selected as co-chairs. This new charter directs us to develop, advocate and support the use of Semantic Web technologies for translational medicine and its three enabling domains: life sciences, clinical research and health care. While the core HCLSIG values - simplicity, pragmatism, effectiveness - remain firmly in place, Charlie, Vijay and I hope to make subtle changes to the operational strategy such that our efforts become increasingly recognized as critical in conferences and boardrooms across the globe.
As always, the HCLSIG will create prototype implementations that demonstrate the value of formalizing and sharing knowledge using Semantic Web technologies. We will marshal our efforts towards fulfilling compelling use cases that have intrinsic value to not just W3C members, but ideally to a larger number of outside beneficiaries. Thus, our experts will now develop these use cases such that a priori we have a clearer picture of the rationale of the project, its resources, milestones and deliverables, and ultimately, which organizations and communities will directly and indirectly benefit. Coupled with an effective dissemination strategy that includes leveraging our combined social networks, we hope to maximize the impact of the work of our members in this emerging area of knowledge management.
As part of our dissemination strategy, we also intend to produce more member contributions that describe methods for basic and advanced tasks, in addition to publishing recommendations arising from consensus among our members. Such recommendations will endorse and specify the use of terminological resources in the long-term context of semantic interoperability across the three core domains. Thus, participation in the HCLSIG will be critical for those wanting to advocate RDF representations of data and OWL representations of ontologies for the purposes of semantic annotation and large-scale semantic integration of biomedical data.
With that, we invite non-members to join the W3C and work with our strong complement of experts in what will surely be an exciting and productive time over the next few years for the W3C HCLSIG.
Labels:
hcls,
health care,
industry,
life sciences,
owl,
pharmaceuticals,
rdf,
semantic web,
w3c