unix & Ruby 23 Apr 2008 10:54 am
Specifying path for tidy rubygem
HTML Tidy is a library used to fix invalid HTML and give the source code a reasonable layout. It was developed by Dave Raggett of the W3C and is now maintained as a SourceForge project. There are several versions of tidy available for various operating systems. The quickest (though not always the easiest) ways to install it on various Unix systems are given below.
On Debian-based systems such as Ubuntu, use apt-get to install:

    apt-get install tidy

On RPM-based systems such as Fedora and CentOS, use yum to install:

    yum install tidy

On Mac OS X, use MacPorts to install:

    port install tidy

To use tidy from Ruby, a rubygem is available. Just run gem install tidy to get it installed on your development machine. The gem's documentation is a useful reference.

    gem install tidy

Usage:

    require 'tidy'
    Tidy.path = '/usr/lib/tidylib.so'        # location of the tidy shared library
    html = 'Body'
    xml = Tidy.open(:show_warnings => true) do |tidy|
      tidy.options.output_xml = true         # emit XML rather than HTML
      puts tidy.options.show_warnings
      xml = tidy.clean(html)
      puts tidy.errors
      puts tidy.diagnostics
      xml                                    # block value becomes the result of Tidy.open
    end
    puts xml

While working with tidy on my Mac, I noticed that the Tidy.path value shown above did not work for me. The equivalent path on the Mac turned out to be:

    Tidy.path = '/usr/lib/libtidy.A.dylib'

The same was true of my production servers running Fedora/CentOS, where I had to change the path to:

    Tidy.path = '/usr/lib/libtidy-0.99.so.0'

To support both my development and production environments, I replaced the Tidy.path assignment (line 2 in the example above) with:

    begin
      Tidy.path = '/usr/lib/libtidy-0.99.so.0'
    rescue LoadError
      Tidy.path = '/usr/lib/libtidy.A.dylib'
    end

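With more than two environments, the nested begin/rescue gets awkward. One alternative is to check which library file actually exists before assigning it. The following is only a minimal sketch: first_existing_library and TIDY_CANDIDATES are hypothetical names of my own, not part of the tidy gem, and the list just repeats the three paths mentioned in this post.

```ruby
# Hypothetical helper (not part of the tidy gem): return the first
# candidate path that exists on this machine, or nil if none do.
def first_existing_library(candidates)
  candidates.find { |path| File.exist?(path) }
end

# The paths mentioned in this post; adjust the list for your own systems.
TIDY_CANDIDATES = [
  '/usr/lib/libtidy-0.99.so.0', # Fedora/CentOS
  '/usr/lib/libtidy.A.dylib',   # Mac OS X
  '/usr/lib/tidylib.so'         # path from the usage example
]

tidy_library = first_existing_library(TIDY_CANDIDATES)

# Only assign the path if the tidy gem has been required and a library was found.
if defined?(Tidy) && tidy_library
  Tidy.path = tidy_library
end
```

Note that this only checks that the file is present. It does not guarantee the library will load, so keeping a rescue LoadError around the assignment may still be worthwhile.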
Update:
If you’re getting the error:

    /opt/ruby/ruby-1.8.6/lib/ruby/gems/1.8/gems/tidy-1.1.2/lib/tidy/tidybuf.rb:40: [BUG] Segmentation fault

Apply the following patch to fix it.

