
Getting rid of the blobs in the Chipwrapper headlines

If you visited the Chipwrapper homepage on a few occasions last week, you may have noticed some very odd characters appearing on the page as blibs'n'blobs in the middle of the headlines.

This is due to the way that the headline list is produced.

First of all, newspapers produce an RSS feed. Then I mash that up using the excellent Yahoo! Pipes service. The RSS output of my pipe is then picked up by the Chipwrapper server, which transforms it from XML into the correctly marked up XHTML required for it to appear on the homepage.
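For the curious, the last step of that chain can be sketched roughly as below. This is an illustration in Python rather than the actual Chipwrapper code: the function name `rss_to_xhtml` and the simple list markup are my assumptions, and it only handles a plain RSS 2.0 feed with `item`, `title` and `link` elements.

```python
import xml.etree.ElementTree as ET
from html import escape


def rss_to_xhtml(rss_bytes):
    """Turn the <item> entries of an RSS feed into an XHTML list of links.

    Hypothetical helper for illustration only. The feed is passed in as
    raw bytes so that the XML parser can honour the encoding declared in
    the feed's own XML declaration.
    """
    root = ET.fromstring(rss_bytes)
    rows = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        # Escape both values so that & < > and quotes stay valid XHTML.
        rows.append(
            '<li><a href="%s">%s</a></li>'
            % (escape(link, quote=True), escape(title))
        )
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"
```

Handing the parser bytes instead of an already decoded string is deliberate: it leaves the decision about character encoding to the one component that can read the feed's declaration.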

Somewhere during that process the character encoding of the contents was occasionally going awry if newspapers were using 'special' characters. This meant that a punctuation mark or pound symbol sometimes emerged on the page as a blobby question mark.
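I never pinned down exactly which step was at fault, but one common way to get this kind of blob is for text to be written in one encoding and read back in another. The Python sketch below shows that general mechanism with a pound sign. It is an assumption about the sort of thing that goes wrong, not a diagnosis of the Chipwrapper pipeline.

```python
headline = "\u00a335m"  # a pound sign followed by "35m"

# Case 1: the text is stored as Latin-1 but read back as UTF-8.
# The lone byte 0xA3 is not valid UTF-8, so the decoder substitutes
# U+FFFD, the replacement character that browsers draw as a blobby
# question mark.
latin1_bytes = headline.encode("latin-1")
garbled = latin1_bytes.decode("utf-8", errors="replace")

# Case 2: the text is stored as UTF-8 but read back as Latin-1.
# The two bytes of the pound sign each become a separate character,
# so a stray capital A with a circumflex appears in front of it.
utf8_bytes = headline.encode("utf-8")
mojibake = utf8_bytes.decode("latin-1")

print(repr(garbled))
print(repr(mojibake))
```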

I've changed the way that the feed is processed on the Chipwrapper server-side, meaning that only alphanumeric characters and a few odds'n'sods of punctuation can appear on the page.
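In outline, that kind of filter looks something like the Python sketch below. The exact set of permitted punctuation here is made up for the example, as is the function name `clean_headline`. The real list on the Chipwrapper server may differ.

```python
import string

# Hypothetical whitelist: ASCII letters, digits, the space character
# and a handful of everyday punctuation marks.
ALLOWED = set(string.ascii_letters + string.digits + " .,:;!?'\"()&-")


def clean_headline(text):
    """Return the headline with every non-whitelisted character dropped."""
    kept = "".join(ch for ch in text if ch in ALLOWED)
    # Removing a character can leave two spaces side by side, so
    # collapse runs of whitespace and trim the ends.
    return " ".join(kept.split())


print(clean_headline("Postal worker wins \u00a335m"))
```

A whitelist is a blunt tool: it guarantees that nothing unexpected reaches the page, at the cost of also throwing away legitimate characters that are not on the list.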

It should mean that you don't see any unknown characters any more. One casualty at the moment is the pound symbol (£), which will be missing from any headline where it should appear. But since Chipwrapper is only for UK newspapers, I think it is safe to assume that any story about a postal worker winning 35m is about a postal worker winning £35m, not dollars, yen or euros.
