Archive for the ‘Blog’ Category
Paragraph Level Search Results on WordPress Using Digress.it and Yahoo Pipes
One of the many RSS-related feature requests I put in when we were working on the JISCPress project was the ability to get a page level RSS feed out where each paragraph was represented as a separate item in the page feed.
WordPress already delivers a single item RSS feed for each page containing just the substantive content of the page (i.e. the content without the header, footer and sidebar fluff), which means you can do things like this, but what I wanted was for the paragraphs on each page to be atomised as separate feed elements.
Eddie implemented support for this, but I didn’t do anything with it at the time, so here’s an example of just why I thought it might be handy – paragraph level search.
At the moment, searching a document on WriteToReply returns page level results – that is, you get a list of search results detailing the pages on which the search term(s) appear. As you might expect with WordPress, we can get access to these results as a feed by shoving feed in the URI, like this:
http://ouseful.wordpress.com/feed?s=test
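As a rough sketch of that URI pattern, the helper below builds a search feed URL from a site address and a query. The function name `search_feed_url` is made up for illustration; the only things taken from the post are the `/feed` path segment and the `s` search parameter.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def search_feed_url(site_url: str, query: str) -> str:
    """Build a WordPress search-results feed URL (hypothetical helper)."""
    parts = urlsplit(site_url)
    # Append "feed" to whatever path the site or document lives under.
    path = parts.path.rstrip("/") + "/feed"
    # urlencode takes care of spaces and other unsafe characters in the query.
    return urlunsplit((parts.scheme, parts.netloc, path, urlencode({"s": query}), ""))


print(search_feed_url("http://ouseful.wordpress.com", "test"))
# http://ouseful.wordpress.com/feed?s=test
```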
Paragraph level feeds, as implemented in the Digress.it WordPress theme we were developing, are keyed by URLs of the form:
http://writetoreply.org/legaldeposit/feed/paragraphlevel/annex-c-online-content-to-be-published/#56
That is:
http://writetoreply.org/DOCNAME/feed/paragraphlevel/PAGENAME/#PARA_NUMBER
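To make the pattern concrete, here is a small sketch that splits a paragraph level feed URL back into its DOCNAME, PAGENAME and PARA_NUMBER parts. The `ParagraphFeedRef` type and `parse_paragraph_feed_url` function are names invented for this example, not part of Digress.it.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit


class ParagraphFeedRef(NamedTuple):
    """The three variable parts of a paragraph level feed URL (illustrative type)."""
    doc_name: str
    page_name: str
    para_number: Optional[int]


def parse_paragraph_feed_url(url: str) -> ParagraphFeedRef:
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # Expect exactly: DOCNAME / feed / paragraphlevel / PAGENAME
    if len(segments) != 4 or segments[1:3] != ["feed", "paragraphlevel"]:
        raise ValueError(f"not a paragraph level feed URL: {url}")
    # The paragraph number, if present, travels in the URL fragment.
    para = int(parts.fragment) if parts.fragment.isdigit() else None
    return ParagraphFeedRef(segments[0], segments[3], para)


ref = parse_paragraph_feed_url(
    "http://writetoreply.org/legaldeposit/feed/paragraphlevel/"
    "annex-c-online-content-to-be-published/#56"
)
print(ref.doc_name, ref.page_name, ref.para_number)
# legaldeposit annex-c-online-content-to-be-published 56
```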
So can you guess what I’m gonna do yet…?
First of all, grab the search feed for a particular query on a particular document into a Yahoo Pipe:
Rewrite the URI of each page linked to in the results feed as the full fat, itemised paragraph feed for the page, and emit those items (that is, replace each original search results item with the set of paragraph items from that page).
The next step is to filter those paragraph feed items for just the paragraphs that contain the original search terms:
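Outside of Pipes, the URI rewrite might look something like the sketch below. It assumes that page URLs take the form `http://writetoreply.org/DOCNAME/PAGENAME/`, which is inferred from the feed pattern rather than stated anywhere; `page_to_paragraph_feed` is a hypothetical name, and this is not the pipe's actual configuration.

```python
import re

# Assumed page URL shape: scheme://host/DOCNAME/PAGENAME/ (trailing slash optional).
PAGE_URL = re.compile(r"^(https?://[^/]+/[^/]+)/([^/]+)/?$")


def page_to_paragraph_feed(page_url: str) -> str:
    """Rewrite a page URL as that page's paragraph level feed URL."""
    match = PAGE_URL.match(page_url)
    if match is None:
        raise ValueError(f"unexpected page URL shape: {page_url}")
    doc_root, page_name = match.groups()
    return f"{doc_root}/feed/paragraphlevel/{page_name}/"


print(page_to_paragraph_feed(
    "http://writetoreply.org/legaldeposit/annex-c-online-content-to-be-published/"
))
# http://writetoreply.org/legaldeposit/feed/paragraphlevel/annex-c-online-content-to-be-published/
```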
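A minimal sketch of that filter step, written against plain dictionaries rather than real feed items. The `title` and `description` keys, the case-insensitive matching, and the choice to require every search term (rather than any one of them) are all assumptions made for the example.

```python
def filter_paragraphs(items, query):
    """Keep only the paragraph items whose text contains every search term."""
    terms = query.lower().split()
    kept = []
    for item in items:
        text = f"{item.get('title', '')} {item.get('description', '')}".lower()
        if all(term in text for term in terms):
            kept.append(item)
    return kept


paragraphs = [
    {"title": "Paragraph 1", "description": "Online content to be published."},
    {"title": "Paragraph 2", "description": "A test harvest of online content."},
]
print([p["title"] for p in filter_paragraphs(paragraphs, "test")])
# ['Paragraph 2']
```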
We need to rewrite the link because (at the time of writing) the page paragraphs feed doesn’t link to each paragraph, it links to the parent page (a bug report has been made;-)
You can find the pipe here: Double dip JISCPress search
Note that at the time of writing, there’s also a problem with the paragraph number reported in the link (again a report has been made), a workaround patch for which is included in this pipe.
What this means is that we now have a workaround for indexing into individual paragraphs using a search term. If we tag content at the paragraph level (e.g. by running a page-level paragraph feed, or double dip search results feed, through OpenCalais), we can generate related search links into the document, or other documents on the platform, at a paragraph level, increasing the relevance, or resolution (in terms of increased focus), of the returned results.
Just by the by, the approach shown above is based on a search, expand and filter pattern (cf. a search within results pattern), in which a search query is used to obtain an initial set of results, which are then expanded to give higher resolution detail over the content and then filtered using the original search query to deliver the final results. If a patent for this doesn’t already exist, then if I worked for Google, Yahoo, etc etc you could imagine it being patented. B*****ds.
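The pattern itself is independent of feeds and pipes. Here is a toy sketch of it in which an in-memory dictionary of pages stands in for the document, and every name (`search_expand_filter`, `PAGES`, and the three helper functions) is invented for illustration.

```python
def search_expand_filter(query, search, expand, matches):
    """Search coarsely, expand each hit into finer parts, then re-filter the parts."""
    results = []
    for hit in search(query):            # 1. coarse (page level) search
        for part in expand(hit):         # 2. expand each hit into paragraphs
            if matches(part, query):     # 3. filter using the original query
                results.append(part)
    return results


# Toy stand-in for a document: page name -> list of paragraphs.
PAGES = {
    "intro": ["Legal deposit covers printed works.", "This page is an overview."],
    "annex-c": ["Online content may be harvested.", "Deposit libraries keep copies."],
}


def search_pages(query):
    q = query.lower()
    return [name for name, paras in PAGES.items() if any(q in p.lower() for p in paras)]


def expand_page(name):
    return [(name, number, text) for number, text in enumerate(PAGES[name], start=1)]


def paragraph_matches(paragraph, query):
    return query.lower() in paragraph[2].lower()


hits = search_expand_filter("deposit", search_pages, expand_page, paragraph_matches)
print([(page, number) for page, number, _ in hits])
# [('intro', 1), ('annex-c', 2)]
```

Passing the three stages in as functions keeps the pattern separate from any particular source, so the same skeleton could sit over feeds, a database, or a search API.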
PS here’s a trick I picked up from Joss’ blog somewhere for reversing the order of feed items published by WordPress:
http://writetoreply.org/legaldeposit/feed/?orderby=ID&order=ASC
I assume these parameters also work?
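For what it's worth, here is a small sketch of bolting those two parameters onto an arbitrary feed URL. `with_feed_order` is a hypothetical helper; `orderby` and `order` are the parameter names from the URL above, and whether a given feed actually honours them is not something this code checks.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_feed_order(feed_url: str, orderby: str = "ID", order: str = "ASC") -> str:
    """Return feed_url with orderby/order query parameters set (hypothetical helper)."""
    parts = urlsplit(feed_url)
    # Keep any existing query parameters, overriding only the two ordering ones.
    params = dict(parse_qsl(parts.query))
    params.update({"orderby": orderby, "order": order})
    return urlunsplit(parts._replace(query=urlencode(params)))


print(with_feed_order("http://writetoreply.org/legaldeposit/feed/"))
# http://writetoreply.org/legaldeposit/feed/?orderby=ID&order=ASC
```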

Inline Comments on WriteToReply?
One of the things that’s still very much on the WriteToReply to-do list is to identify and address the various accessibility issues with the site that might prevent government agencies, and other publicly funded bodies, from adopting the platform for the republication of their own documents.
We face a similar problem in education (the need to conform to quite stringent accessibility guidelines), so I started to wonder whether or not we could reuse tricks and tips from the OU’s Moodle VLE. Now I don’t think that the VLE supports document commenting in the way that we do on WriteToReply, but it does support forums. Which got me thinking: what would WTR look like if we supported inline comments, using a metaphor along the lines of: suppose each paragraph is a forum post, and each comment on a paragraph is like a reply to that post…
Here’s what an OU forum looks like:
And here’s how reply (comment) threads work:
So what sort of layout do we currently have on WriteToReply? Well, the comments are siloed in a floating comment box, with icons associated with each paragraph that allow users to open up the related comments in the comment box, as well as displaying the number of comments associated with the corresponding paragraph.
The question now arises: can we learn anything from the way forums are presented about how we might render inline comments within a document? It’s important to bear in mind that we are exploring the notion of exactly what we mean by commentable documents, particularly atomised commentable documents, so don’t get too hung up on the idea that we might be proposing things that would make a PDF look clunky…
Here’s my first guess at what inline comments might look like, put forward purely as a straw man:
What do you think? Worth exploring further?
PS on the accessibility question, see also Name that tool: forthcoming ‘BBC Accessibility Settings Tool’ needs you








