Towards Web 2.0

Specifying path for tidy rubygem

HTML Tidy is a library that fixes invalid HTML and gives the source code a reasonable layout. It was developed by Dave Raggett of the W3C and is now maintained as a SourceForge project. There are several versions of tidy available for various operating systems. The quickest (though not always the easiest) ways to install it on various Unix systems are given below.

On Debian-based systems such as Ubuntu, use apt-get to install:

apt-get install tidy

On RPM-based systems such as Fedora or CentOS, use yum to install:

yum install tidy

On Mac OS X, use MacPorts to install:

port install tidy

To use tidy from Ruby, a rubygem is available here. Just run gem install tidy to get it installed on your development machine. Good documentation is provided here for reference.

gem install tidy

Usage:

  require 'tidy'
  Tidy.path = '/usr/lib/tidylib.so'
  html = '<html><title>title</title>Body</html>'
  xml = Tidy.open(:show_warnings=>true) do |tidy|
    tidy.options.output_xml = true
    puts tidy.options.show_warnings
    xml = tidy.clean(html)
    puts tidy.errors
    puts tidy.diagnostics
    xml
  end
  puts xml

While working with tidy on my Mac, I noticed that the Tidy.path value shown above did not work for me. I found the equivalent path to use on the Mac:

  Tidy.path = '/usr/lib/libtidy.A.dylib'

The same was true of my production servers running Fedora/CentOS, where I had to change the path to:

  Tidy.path = '/usr/lib/libtidy-0.99.so.0'
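
Since the library's file name and location differ between systems, it can help to list what is actually installed before setting Tidy.path. The following is only a rough sketch: the directory list and the helper name tidy_library_candidates are my own assumptions, not part of the tidy gem, so adjust them for your system.

```ruby
# Directories to search. These are assumptions; adjust them for your system.
LIB_DIRS = ['/usr/lib', '/usr/lib64', '/usr/local/lib', '/opt/local/lib']

# Hypothetical helper (not part of the tidy gem): lists files whose names
# look like a tidy shared library (.so or .dylib) in the given directories.
def tidy_library_candidates(dirs = LIB_DIRS)
  files = dirs.map { |dir| Dir.glob(File.join(dir, '*tidy*')) }.flatten
  files.select { |file| File.basename(file) =~ /\.(so|dylib)/ }.sort
end

# Print each candidate path on its own line (prints nothing if none is found).
puts tidy_library_candidates
```

Any path it prints is a candidate value for Tidy.path.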

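To use both paths across my development and production environments, I replaced line 2 of the example above with the following, which tries the production path first and falls back to the Mac path if loading fails: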

  begin
    Tidy.path = '/usr/lib/libtidy-0.99.so.0'
  rescue LoadError
    Tidy.path = '/usr/lib/libtidy.A.dylib'
  end
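
With more than two environments, the nested rescue gets awkward. One alternative, sketched below under my own assumptions, is to keep a list of known paths and pick the first one that exists on disk. The helper name find_tidy_library is hypothetical and not part of the tidy gem; the candidate paths are simply the ones mentioned in this post.

```ruby
# Candidate locations, taken from the paths mentioned in this post.
TIDY_CANDIDATES = [
  '/usr/lib/libtidy-0.99.so.0', # Fedora/CentOS
  '/usr/lib/libtidy.A.dylib',   # Mac OS X
  '/usr/lib/tidylib.so'         # path from the usage example
]

# Hypothetical helper (not part of the tidy gem): returns the first
# candidate that exists on disk, or nil if none of them do.
def find_tidy_library(candidates = TIDY_CANDIDATES)
  candidates.find { |path| File.exist?(path) }
end

path = find_tidy_library
if path
  puts "Using tidy library at #{path}"
  # Tidy.path = path  # uncomment after require 'tidy'
else
  puts 'No tidy library found; check where your package manager installed it'
end
```

This only checks that the file exists, so a LoadError from Tidy.path is still possible if the file is not a usable library.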

Update:
If you’re getting the error:

/opt/ruby/ruby-1.8.6/lib/ruby/gems/1.8/gems/tidy-1.1.2/lib/tidy/tidybuf.rb:40: [BUG] Segmentation fault

Apply the following patch to fix it.
