ICE as a blog editor

Yesterday I noticed another cry for help from Peter Murray-Rust:

[Note: I will continue to try to format the code – WordPress makes it very difficult]

Also yesterday I wrote about how we are breaking ICE up into more digestible pieces, one of which is the ability to post to a weblog using AtomPub. Daniel de Byl has just posted a demo using OpenOffice.org Writer to publish a nicely formatted blog post to WordPress.

I thought I’d try it out using PMR’s post and see what happens.

Here’s the post (embedded in mine as a blockquote):

CrystalEye: using the harvester

Jim Downing has written a harvester for CrystalEye. I thought I would have a try and see if I could iterate through all the entries and extract the temperature of the experiment. This is where XML really starts to show its value over legacy formats. Jim’s iterator reads each entry and copies it to a file; I decided to read the entry as an XML document, search for the temperature using XPath and announce it. It’s simple enough that I thought I could do it while watching Liverpool (I used to live on Merseyside). Unfortunately (or fortunately) the torrent of goals distracted me so it had to wait till today.

The temperature is described in the IUCr dictionary and held in CML as (example):

<scalar dictRef="iucr:_cell_measurement_temperature">293.0</scalar>

So this is trivially locatable by XPath (with local-name() and @dictRef):

// iterate through all entries
for (DataEntry de : doc.getDataEnclosures()) {
    if (downloaded >= maxHarvest) {
        return downloaded;
    }
    InputStream in = null;
    try {
        in = get(de.url);
        // standard XOM XML parsing; take the root element of the parsed document
        Element rootElement = new Builder().build(in).getRootElement();
        // standard XPath query
        Nodes nodes = rootElement.query(
            ".//*[local-name()='scalar'" +
            " and @dictRef='iucr:_cell_measurement_temperature']");
        // if there is a temperature, extract the value
        String temp = (nodes.size() == 0) ? "no temp given" : nodes.get(0).getValue();
        System.out.println("temperature for " + rootElement.getAttributeValue("id") + ": " + temp);
        downloaded++;
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        IOUtils.closeQuietly(in);
    }
}

and here’s the output:

1625 [main] DEBUG uk.ac.cam.ch.wwmm.crystaleye.client.Harvester  - Getting http://wwmm.ch.cam.ac.uk/crystaleye/summary/rsc/ob/2007/22/data/b712503h/b712503hsup1_pob0401m/b712503hsup1_pob0401m.complete.cml.xml
temperature for rsc_ob_2007_22_b712503hsup1_pob0401m: 115.0
2297 [main] DEBUG uk.ac.cam.ch.wwmm.crystaleye.client.Harvester  - Getting http://wwmm.ch.cam.ac.uk/crystaleye/summary/rsc/ob/2007/22/data/b710487a/b710487asup1_ljf130/b710487asup1_ljf130.complete.cml.xml
temperature for rsc_ob_2007_22_b710487asup1_ljf130: 150.0

etc.
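For anyone who wants to try the lookup without the harvester or XOM on the classpath, the same expression also works with the JDK's built-in javax.xml.xpath API. What follows is only a minimal sketch under stated assumptions: the class name TemperatureLookup, the method findTemperature, and the sample document in main are made up for illustration and are not part of the CrystalEye client.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the CrystalEye harvester.
final class TemperatureLookup {
    // Same expression as in the loop above; local-name() ignores the
    // namespace of the element, so no namespace context is needed.
    static final String TEMPERATURE_XPATH =
        ".//*[local-name()='scalar'"
        + " and @dictRef='iucr:_cell_measurement_temperature']";

    // Returns the text of the first matching element, or a fallback string.
    static String findTemperature(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                TEMPERATURE_XPATH, doc.getDocumentElement(), XPathConstants.NODESET);
            return nodes.getLength() == 0
                ? "no temp given"
                : nodes.item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse entry", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample entry; real CrystalEye entries carry far more content.
        String sample =
            "<cml xmlns='http://www.xml-cml.org/schema' id='sample_entry'>"
            + "<crystal>"
            + "<scalar dictRef='iucr:_cell_measurement_temperature'>293.0</scalar>"
            + "</crystal></cml>";
        System.out.println(findTemperature(sample));              // prints 293.0
        System.out.println(findTemperature("<cml id='empty'/>")); // prints no temp given
    }
}
```

Matching on local-name() rather than a qualified name is what lets one expression work whether or not the entry declares a default namespace.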

It will take the best part of the day to iterate through the entries, but remember that CrystalEye is not a database. We are converting it to RDF (and anyone interested can also do this), at which point it can be searched in a trivial amount of time and with much more complex questions. (Remember that CrystalEye was not originally designed as a public resource.) Until then anyone who wishes to use CrystalEye a lot would do best to download the entries and build their own index.
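As a rough idea of what "build their own index" could look like once the values have been extracted, here is a small in-memory sketch. It is an assumption-laden illustration rather than anything CrystalEye provides: the class TemperatureIndex and its methods add and between are hypothetical, and the two entries in main are simply the ones from the output above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory index from temperature to entry ids.
final class TemperatureIndex {
    // A sorted map makes range questions cheap once the harvest is done.
    private final NavigableMap<Double, List<String>> byTemperature = new TreeMap<>();

    void add(String entryId, double temperature) {
        byTemperature.computeIfAbsent(temperature, t -> new ArrayList<>()).add(entryId);
    }

    // Entry ids whose temperature lies in [low, high], in ascending temperature order.
    List<String> between(double low, double high) {
        List<String> hits = new ArrayList<>();
        for (List<String> ids : byTemperature.subMap(low, true, high, true).values()) {
            hits.addAll(ids);
        }
        return hits;
    }

    public static void main(String[] args) {
        TemperatureIndex index = new TemperatureIndex();
        index.add("rsc_ob_2007_22_b712503hsup1_pob0401m", 115.0);
        index.add("rsc_ob_2007_22_b710487asup1_ljf130", 150.0);
        // prints [rsc_ob_2007_22_b712503hsup1_pob0401m]
        System.out.println(index.between(100.0, 120.0));
    }
}
```

The slow pass over the entries then happens once, and later questions are answered from the map instead of from the network.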

[Note: I will continue to try to format the code – WordPress makes it very difficult]

Easy enough to do in ICE, apart from the work I had to do to get the quote formatted correctly. We really need the ability to import HTML properly formatted as a blockquote. This would be very important for PMR, as he likes to quote big chunks. In HTML all you do is wrap <blockquote> tags around the source for the quote and you’re done. In ICE you have to imply the quote by marking the first paragraph as ‘bq1’ style using our easy-to-click toolbar buttons, then indent the subsequent paragraphs appropriately. We’ll work on automating that.

I used this tip to change my CSS so that stuff in <pre> tags wraps. PMR has used <code> inside a paragraph; I’m not sure what the solution would be there.

You can see a draft version of this post on my test blog.


