
Checking Out Yahoo Pipes

Easy, quick, and useful.

You've probably heard that Yahoo has this new, drag-and-drop tool to easily combine and manipulate RSS and Atom feeds. (Forgive me for omitting the exclamation point from their name—speaking of which, shouldn't the logo for yahoo.es be "¡Yahoo!"?) Tim O'Reilly called Yahoo pipes no less than "a milestone in the history of the internet." Early reports mentioned load problems, and I was extra busy with work, so I waited a bit before trying it.

[yahoo pipes logo]

It's definitely cool. My to-do list has several low-priority entries under "check out the Universal Feed Parser and try to combine feeds X, Y, and Z into one feed and then search/sort/remove redundant entries from the combination". Yahoo pipes makes this simple: you drag modules from a menu on the left of the screen onto a workspace where you hook up inputs, specialized processing modules, and then an output.

A Fetch module lets you specify one or more feeds to grab, using Atom or any RSS flavor. (It took me a while to catch on to the "or more" part; at first I assembled messy combinations of single-URL Fetch modules all combined through Union operator modules.) Of the modules that you can add between your Fetch module(s) and the Pipe Output one, modules like "Content Analysis" and "For Each: Annotate" look interesting, but I couldn't make them work after a few minutes of playing. Instead of worrying about it, I thought I'd just wait until the Yahoo Pipes documentation is less skimpy and the system's teething problems are done, in case my difficulties are not my own fault.

And there's plenty you can do with the simpler modules. For my first non-trivial pipe, I wanted to create something I could show at a talk I'll be giving to a group of law librarians, so I found feed URLs for fourteen Intellectual Property blawgs ("law blogs", get it?) on blawg.com's Intellectual Property blawg listing, then piped the combination of these feeds through a Filter module that only passes along the entries containing the phrase "fair use". Then, the pipeline goes through another module that sorts the entries by published date and finally to the output module. Now we have something that would be valuable to an IP lawyer working on a case where the issue of "fair use" plays an important role: IP blawg entries mentioning fair use.
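For anyone who would rather read that pipeline as code, here is a rough Python sketch of the same three stages (union, filter, sort). It is not how Yahoo Pipes works internally. The entries are made-up dictionaries standing in for parsed feed items, and the function names (`union`, `mentions`, `fair_use_pipe`), the case-insensitive match, the choice to search both title and description, and the newest-first sort order are all assumptions made for illustration.

```python
from datetime import datetime


def union(*feeds):
    """Merge several feeds (each a list of entry dicts) into one list."""
    return [entry for feed in feeds for entry in feed]


def mentions(entry, phrase):
    """True if the phrase appears in the entry's title or description.

    Assumption: case-insensitive, and only these two fields are searched.
    """
    text = f"{entry.get('title', '')} {entry.get('description', '')}"
    return phrase.lower() in text.lower()


def fair_use_pipe(feeds, phrase="fair use"):
    """Union -> Filter -> Sort, mirroring the three stages of the pipe."""
    matching = [e for e in union(*feeds) if mentions(e, phrase)]
    # Assumption: newest first; the real pipe may sort the other way.
    return sorted(matching, key=lambda e: e["published"], reverse=True)


# Two made-up feeds standing in for the fourteen real ones.
feed_a = [
    {"title": "Fair use and parody", "description": "",
     "published": datetime(2007, 2, 20)},
    {"title": "Patent reform news", "description": "",
     "published": datetime(2007, 2, 21)},
]
feed_b = [
    {"title": "Court ruling", "description": "A new fair use defense.",
     "published": datetime(2007, 2, 22)},
]

for entry in fair_use_pipe([feed_a, feed_b]):
    print(entry["published"].date(), entry["title"])
```

Keeping each stage as its own small function echoes the way the visual editor keeps each module separate, so a stage can be swapped out without touching the others.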

One great thing about the modules that use information from specific elements of an RSS feed is that they give you a menu of the available elements in the input you've chosen. For example, the Filter module lets you permit or block items that match any or all of the rules that you specify on that module. The default rule says "title contains [text]", where you enter text such as "fair use". Along with "contains" they offer five other choices such as "does not contain" and "Matches regex". The choices besides "title" depend on your input; when you drag a little blue hose from the output of a module such as Fetch to a Filter module, the Title part briefly says "Updating..." and then becomes a drop-down menu showing available elements in the input. The following shows an example of the elements that show up in the drop-down list for a given set of input.

[screen shot showing dropdown list]
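The Filter module's behavior described above (permit or block, any or all, one operator per rule) can be approximated in a few lines of Python. This is a sketch under stated assumptions, not Yahoo's implementation: it covers only the three operators named here, treats "contains" as case-insensitive, and uses hypothetical names (`OPERATORS`, `available_fields`, `filter_items`) plus invented sample items.

```python
import re

# Only the three operators named in the post; the real menu offers more.
OPERATORS = {
    "contains": lambda value, text: text.lower() in value.lower(),
    "does not contain": lambda value, text: text.lower() not in value.lower(),
    "matches regex": lambda value, text: re.search(text, value) is not None,
}


def available_fields(items):
    """Field names seen in the input, like the choices in the drop-down."""
    return sorted({field for item in items for field in item})


def filter_items(items, rules, mode="permit", match="any"):
    """Permit or block items matching any/all (field, operator, text) rules."""
    combine = any if match == "any" else all

    def matches(item):
        return combine(
            OPERATORS[op](str(item.get(field, "")), text)
            for field, op, text in rules
        )

    keep_matching = mode == "permit"
    return [item for item in items if matches(item) == keep_matching]


# Invented sample input.
items = [
    {"title": "Fair use in the classroom", "author": "A. Lawyer"},
    {"title": "Trademark basics", "author": "B. Counsel"},
]

print(available_fields(items))
print(filter_items(items, [("title", "contains", "fair use")]))
```

Passing `mode="block"` with the same rule would keep everything except the matching item, which is the shape of filter needed for screening out particular authors.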

Once I was comfortable with all this, I set myself a goal of creating a version of the BoingBoing feed that filters out all entries by Cory Doctorow and Xeni Jardin, and I timed myself to see how long this took. I didn't time the creation process down to the second, but the time from clicking "New" to seeing a working test run was literally two minutes. That's nice. I started writing out the details of how I created the pipe here, but if you have a Yahoo ID you can easily see the visual representation for yourself, and it saves me some typing.
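For comparison, here is roughly what the same author filter looks like done by hand in Python on raw RSS. It is a minimal sketch, not a description of the pipe itself: it assumes the feed is RSS 2.0 and that each item names its author in a Dublin Core `dc:creator` element, which may not match the real BoingBoing feed. The `drop_authors` function and the `SAMPLE` feed are invented for illustration.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)  # keep the dc: prefix when re-serializing


def drop_authors(rss_text, blocked):
    """Return the RSS text with items by any blocked author removed.

    Assumption: authors appear in <dc:creator>; adjust for other feeds.
    """
    root = ET.fromstring(rss_text)
    channel = root.find("channel")
    # Copy the list first so removing items doesn't disturb the iteration.
    for item in list(channel.findall("item")):
        creator = item.findtext(f"{{{DC_NS}}}creator", default="")
        if creator.strip() in blocked:
            channel.remove(item)
    return ET.tostring(root, encoding="unicode")


# Invented sample feed, not real BoingBoing content.
SAMPLE = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Example feed</title>
    <item><title>Post one</title><dc:creator>Cory Doctorow</dc:creator></item>
    <item><title>Post two</title><dc:creator>Someone Else</dc:creator></item>
  </channel>
</rss>"""

print(drop_authors(SAMPLE, {"Cory Doctorow", "Xeni Jardin"}))
```

Even this short version needs namespace handling and care about removing elements while looping, which gives some sense of what the two-minute drag-and-drop version saves.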

Of course the results of your pipe creation work are available as a feed that you can add to a feed reader and drop into other Yahoo pipes feeds, but unfortunately there's no Atom output. On the website's suggestion list, you can vote to make that a higher priority. (JSON output is available.)

I look forward to figuring out and trying the more complex modules. And maybe I'll start following BoingBoing again, now that I've found an easy way to improve their signal-to-noise ratio.
