Developing ontologies in decentralised settings

I have placed an e-print of a manuscript on Nature Precedings that I have been working on in collaboration with the authors listed on the manuscript. It reviews the published ontology engineering methodologies and then assesses their suitability when applied to community ontology development (the decentralised setting).

It is a lengthy document. Here is the abstract:

This paper addresses two research questions: “How should a well-engineered methodology facilitate the development of ontologies within communities of practice?” and “What methodology should be used?” If ontologies are to be developed by communities then the ontology development life cycle should be better understood within this context. This paper presents the Melting Point (MP), a proposed new methodology for developing ontologies within decentralized settings. It describes how MP was developed by taking best practices from other methodologies, provides details on recommended steps and recommended processes, and compares MP with alternatives. The methodology presented here is the product of direct first-hand experience and observation of biological communities of practice in which some of the authors have been involved. The Melting Point is a methodology engineered for decentralised communities of practice for which the designers of technology and the users may be the same group. As such, MP provides a potential foundation for the establishment of standard practices for ontology engineering.

Content, Syntax and Semantics

These are the slides I presented at a DCC workshop entitled “Digital curation 101”, which aimed to give an overview of what to consider regarding data curation and management when applying for research funding. The presentation starts with definitions of content, syntax and semantics, and gives an example of how these concepts are being applied in the life sciences, specifically proteomics.

A trip to Cambridge in June

I’m taking a trip to Cambridge between June 7th and June 12th.

A trip to London in May

I’m taking a trip to London on May 14th.

I will be at the Molecular Regulation of Cardiac Disease Symposium, May 14-15 2009, London, UK, hosted by Abcam: http://www.abcam.com/index.html?pageconfig=resource&rid=11503&sc_ql=1595&intGoUser=15

A trip to Manchester in May

I’m taking a trip to Manchester between May 11th and May 13th.

The Semantic Web of Life Science

This summary was born out of a question on Twitter that percolated to FriendFeed: “Who is using RDF and integrating other resources at the minute, and what are those resources?” Several resources were highlighted in response.

UniProt. This comprehensive resource of protein information is available as an RDF distribution, and each protein record has a corresponding RDF download option.

Phil pointed out Semantic Systems Biology. As systems biology is largely concerned with representing networks and interactions at a systems level, a language like RDF would seem an obvious choice for representing this type of knowledge, aiding both semantic description and data integration.

Melanie pointed out further resources, such as Bio2RDF. This project aims to RDF-ize numerous public life-science resources using a three-step approach that its authors have developed. The following image illustrates some of the resources that are included in Bio2RDF.

Bio2RDF Cloud

The NeuroCommons project seeks to make all scientific research materials – research articles, annotations, data, physical materials – as available and as usable as they can be. As a result, they have an RDF triple store, which they encourage you either to contribute to or to download and use.

For a more general overview of resources that are available as RDF, the Linked Open Data cloud provides a graphical summary of the resources that exist and the relationships between them.
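For anyone wondering why RDF keeps coming up for data integration, here is a minimal sketch in plain Python. It is only a toy: it models triples as tuples rather than using a real RDF library, and every identifier in it (the `ex:` prefix, `ex:P1`, `ex:interactsWith` and so on) is made up for illustration and does not come from UniProt, Bio2RDF or any other resource mentioned above.

```python
# Toy illustration of RDF-style triples: (subject, predicate, object).
# All identifiers are hypothetical; they are not real URIs from any resource.

# One imaginary source describing proteins.
protein_source = {
    ("ex:P1", "ex:name", "Protein one"),
    ("ex:P1", "ex:encodedBy", "ex:G1"),
}

# A second imaginary source describing interactions.
interaction_source = {
    ("ex:P1", "ex:interactsWith", "ex:P2"),
    ("ex:P2", "ex:name", "Protein two"),
}

# Both sources use the same triple shape and the same identifier for P1,
# so integrating them is just a set union.
merged = protein_source | interaction_source


def describe(graph, subject):
    """Return the sorted (predicate, object) pairs known about a subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)


facts_about_p1 = describe(merged, "ex:P1")
```

After the union, asking about `ex:P1` returns facts from both sources in one go. That works here only because the two toy sources agree on the identifier, which is the hard part in practice.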

If you know of any more life-science resources or projects using RDF, then please do comment below. Egon has indicated he is working on RDF-ing the NMRShiftDB and ChEMBL’s StARlite, and Andrew Clegg is considering a project proposal involving RDF. As a result, a very interesting discussion ensued on FriendFeed.

HUPO PSI-PAR: standard format for protein affinity reagents

HUPO PSI-PAR: standard format for protein affinity reagents is now available for Public Comment on the PSI Web site for the next 30 days.

The public comment period enables the wider community to provide feedback on a proposed standard before it is formally accepted, and thus is an important step in the standardisation process.

This message is to encourage you to contribute to the standards development activity by commenting on the material that is available online. We invite both positive and negative comments. If negative comments are being made, these could be on the relevance, clarity, correctness, appropriateness, etc., of the proposal as a whole or of specific parts of the proposal.

If you do not feel well placed to comment on this document, but know someone who may be, please consider forwarding this request. There is no requirement that people commenting should have had any prior contact with the PSI. If you have comments that you would like to make but would prefer not to make public, please email the PSI-Editor directly.

The BioSysBio conference 2009

The premise of the BioSysBio conference is to

bring together the best young researchers working in Synthetic Biology, Systems Biology and Bioinformatics, providing a platform to hear and discuss the most recent and scientific advances and applications in these fascinating fields.

This year's BioSysBio 09 has just taken place in Cambridge, UK. The program was slanted more towards synthetic biology than traditional systems biology, which I think reflects the growing momentum that synthetic biology has gained in the past year. I think this is good progress, and I was secretly glad, as I did not want to spend 3 days looking at massive network diagrams squashed onto PowerPoint slides.

This was the first conference I had been to where the organisers actually requested that we use the BioSysBio FriendFeed room and Twitter to communicate, so I did. Halfway through the first day the organisers demonstrated the FF room, which seemed to consist solely of Allyson’s posts, and questions were asked as to whether she was a blogging bot. When we did confirm there was actually a female at an engineering conference, she was thereafter known as the BioSysBio poster girl.

As ever, Ally was monumental in her blogging during the conference, and all her posts can be found here. At one stage Simon did try to blog her talk in the same detail and at the same speed, but he just kept coming up with excuses about the wifi being slow; eventually he got there.

This was the first time I attended BioSysBio and I thoroughly enjoyed the experience. In general, all of the talks were of a high standard. Most notable for me were Allyson Lister’s talk on Saint: a lightweight SBML annotation integration environment, Christina Smolke on Programming RNA Devices to Control Cellular Information Processing, Piers Millet on Why Secure Synthetic Biology? and Drew Endy on Building a new Biology. It was also good to hear from Randy Rettberg about improvements to the Registry of Standard Biological Parts and the wiki-style community building of the product catalogue, or data sheet, for each part.

There is no point in me re-posting coverage that has already been documented, so if you would like to follow what happened you can follow the #biosysbio Twitter stream, the BioSysBio FriendFeed room, or, if you want a more comprehensive overview, Ally’s blog.

This was also the first time I had used Twitter (via TweetDeck) instead of FriendFeed to microblog a conference. This approach certainly generated a lot of noise and random soundbites, and was probably a fast way to make notes. However, although everything is grouped under the #biosysbio tag, the tweets are not grouped around a particular talk or discussion thread. I can’t help thinking that microblogging via FriendFeed would be organised around a specific talk and provide a more focused discussion, as opposed to just covering what was happening second by second.

PSI AnalysisXML Enters Public Comment

The HUPO Proteomics Standards Initiative aims to develop community data standards for proteomics that are developed, accepted and implemented by the proteomics community. To this end, “AnalysisXML: exchange format for peptides and proteins identified from mass spectra” is now available for Public Comment on the PSI Web site.

The public comment period enables the wider community to provide feedback on a proposed standard before it is formally accepted, and thus is an important step in the standardisation process.

This message is to encourage you to contribute to the standards development activity by commenting on the material that is available online. We invite both positive and negative comments. If negative comments are being made, these could be on the relevance, clarity, correctness, appropriateness, etc., of the proposal as a whole or of specific parts of the proposal.

If you do not feel well placed to comment on this document, but know someone who may be, please consider forwarding this request. There is no requirement that people commenting should have had any prior contact with the PSI.

Announcement via the PSI editor, Norman Paton
