I’ve been coding PHP for 8 years now, and it can be hard to jump into another language when you’re so entrenched in the syntax and quirks of one. However, I still think it’s a useful intellectual exercise to dabble in “competitors” to PHP, like Python and Ruby. It can also provide some real productivity gains when you leverage something another language does more effectively than PHP.
The Web2.0 kids love to chatter about Ruby’s beauty and elegance and super awesomeness. I don’t give a crap about that per se, but their groovy love-in has produced some very nice libraries for handling Web2.0-ish stuff like XML feeds. The FeedTools lib is a very comprehensive, very powerful Ruby library that makes creating, consuming and manipulating feeds much easier, more so than anything I’ve found in PHP.
Ruby existed as a good general-purpose scripting language long before Rails made it a nerd-hipster household name, and that’s how we’re going to use it here: as a good ol’ fashioned CLI script to aggregate a few separate XML feeds into a single combined feed. You’ll need to have Ruby and the RubyGems package manager installed, which is an exercise left up to the reader (it varies from platform to platform). Once you have them working, you’ll need to install the FeedTools gem:
# gem install feedtools
Now that we have that installed, you can use the feed_tools library in your Ruby scripts. Here’s a script I wrote to aggregate three feeds from CERIAS into a single combo feed:
require "rubygems"
require "feed_tools"

# The three source feeds to combine
feedurls = %w(http://www.cerias.purdue.edu/feeds/news http://www.cerias.purdue.edu/weblogs/feed/ http://www.cerias.purdue.edu/feeds/seminars_podcast)

# Fetch and merge them into a single Feed object
combo = FeedTools::build_merged_feed feedurls

# Metadata for the combined feed
combo.feed_type = 'atom'
combo.title = 'CERIAS Super Combined Feed'
combo.copyright = '2006 CERIAS'
combo.author = 'CERIAS <email@example.com>'
combo.id = "http://foo.bar/foobar/combo.xml"

# Write the generated XML to disk
File.open('./combo.xml', 'w') do |file|
  file.puts combo.build_xml()
end

puts "done writing"
To execute this, you’ll want to do something like:
# ruby /path/to/script/generatecombo.rb
The best idea would be to run this as a cron job, as the feed combination process takes maybe a minute even for just these three feeds.
I think most of the script is pretty self-explanatory, but let’s break it down:
require "rubygems"
require "feed_tools"
This works just like you’d imagine. Note that to use gem libs, we need to require the RubyGems lib first, as it does some mucketymuck with the require statement to get it to handle gems properly.
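If you'd rather get a friendly message than a stack trace when the gem is missing, you can ask RubyGems whether it's installed before requiring it. This is just a sketch: gem_installed? is a helper name I made up, not part of RubyGems or FeedTools. (On Ruby 1.9 and later RubyGems is loaded automatically, so the first require is harmless but no longer strictly needed.)

```ruby
require "rubygems" # already loaded automatically on Ruby 1.9+

# Hypothetical helper: true if a gem with this name is installed.
def gem_installed?(name)
  Gem::Specification.find_by_name(name)
  true
rescue Gem::LoadError
  # find_by_name raises a Gem::LoadError subclass when no such gem exists
  false
end

if gem_installed?("feedtools")
  require "feed_tools"
else
  warn "feedtools is missing; install it with: gem install feedtools"
end
```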
feedurls = %w(http://www.cerias.purdue.edu/feeds/news http://www.cerias.purdue.edu/weblogs/feed/ http://www.cerias.purdue.edu/feeds/seminars_podcast)
This section just makes an array called feedurls that contains the three feed URLs.
%w(...) is just a handy little shorthand for “treat the stuff between the parentheses as a string, and explode it into an array with whitespace as the delimiter”. I really wish PHP had something like this, as writing out array data can be pretty tedious.
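To make that concrete, here's a tiny sketch comparing %w with the long-hand ways of building the same array. The variable names and example.com URLs are placeholders, not part of the script above.

```ruby
# %w(...) builds an array of strings, splitting on whitespace.
short = %w(http://example.com/a http://example.com/b http://example.com/c)

# The long-hand equivalent: quote and comma-separate every element yourself...
long = ["http://example.com/a", "http://example.com/b", "http://example.com/c"]

# ...or split a single string on whitespace.
split = "http://example.com/a http://example.com/b http://example.com/c".split

puts short == long   # true
puts short == split  # true
```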
combo = FeedTools::build_merged_feed feedurls
This is the line that does all the work of combining the feeds. One line. Pretty smooth. So we now have a Feed object called combo.
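Conceptually, a merged feed is just every entry from every source feed, with duplicates dropped and the newest entries first. The pure-Ruby sketch below illustrates that idea only; it is not how FeedTools actually implements the merge, and Entry, merge_entries and the sample entries are all made up for the example.

```ruby
# Hypothetical stand-in for a feed entry; FeedTools has its own item class.
Entry = Struct.new(:id, :title, :time)

# Combine entries from several feeds: drop duplicates by id, newest first.
def merge_entries(*feeds)
  feeds.flatten.uniq { |entry| entry.id }.sort_by { |entry| entry.time }.reverse
end

# Made-up sample data standing in for two parsed feeds.
news = [
  Entry.new("news-1", "New lab opens", Time.utc(2006, 3, 1, 9)),
  Entry.new("news-2", "Award announced", Time.utc(2006, 3, 5, 12))
]
blog = [
  Entry.new("blog-1", "Patch Tuesday notes", Time.utc(2006, 3, 3, 8)),
  Entry.new("news-2", "Award announced", Time.utc(2006, 3, 5, 12)) # cross-posted
]

merged = merge_entries(news, blog)
merged.each { |entry| puts "#{entry.time.strftime('%Y-%m-%d')} #{entry.title}" }
```

The real library also has to fetch and parse each feed over HTTP, which is presumably where most of that minute of run time goes.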
combo.feed_type = 'atom'
combo.title = 'CERIAS Super Combined Feed'
combo.copyright = '2006 CERIAS'
combo.author = 'CERIAS <email@example.com>'
combo.id = "http://foo.bar/foobar/combo.xml"
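We need to set a few properties for combo before it can be published, though. A couple might not be self-explanatory: feed_type picks the output format (Atom here), and id is the permanent unique identifier that Atom requires every feed to have, conventionally a URI.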
There are many more attributes for feeds available; read the docs on the Feed class for more info.
File.open('./combo.xml', 'w') do |file|
  file.puts combo.build_xml()
end
The build_xml method generates the XML for the feed object we’ve created. This final piece of code opens (or creates and opens, if necessary) a file called “combo.xml” in the current directory and writes that XML. The file is automatically closed when the block finishes executing.
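Since this is meant to run from cron and the merge takes a while, one optional refinement is to write to a temporary file and rename it into place, so nobody ever fetches a half-written feed. Here's a sketch of that idea; write_atomically is a helper name I made up, and the XML string is a stand-in for the output of combo.build_xml.

```ruby
require "tmpdir"

# Hypothetical helper: write to a temp file next to the target, then rename.
# A rename within the same filesystem is atomic on POSIX systems.
def write_atomically(path, contents)
  tmp_path = "#{path}.tmp"
  File.open(tmp_path, "w") do |file|
    file.puts contents
  end # the block form closes the file here, even if an exception was raised
  File.rename(tmp_path, path)
end

# Demo in a throwaway directory so nothing real gets overwritten.
Dir.mktmpdir do |dir|
  target = File.join(dir, "combo.xml")
  write_atomically(target, "<feed>stand-in for combo.build_xml</feed>")
  puts File.read(target)
end
```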