Wednesday, January 2, 2008

Parsing Spreadsheets with the Roo Gem

I've talked at length about using the win32ole library to automate Microsoft Excel. But there are alternatives for accessing data in Excel spreadsheets -- some of which don't even require Excel to be installed. One of these is the roo gem, which allows you to extract data from Excel, OpenOffice and Google spreadsheets. Roo provides read-only access to Excel and OpenOffice spreadsheets, but both read and write access to Google spreadsheets.

To install the roo gem, including its dependencies, open a console window and enter:


gem install roo -y

Require the roo library in your script:

require 'roo'

To parse an Excel worksheet, we first create an instance of the Excel workbook object by calling the Excel.new method and passing it the path and filename:

xl = Excel.new('C:\my_workbook.xls')

The next step is to define which worksheet in the workbook we will be working with. We do this by setting the Excel.default_sheet value to one of the worksheets in the Excel.sheets array:

xl.default_sheet = xl.sheets[0]

To extract the value from a particular cell, call the cell method, passing it the row number and either the column number or column letter. Examples:

val = xl.cell(3, 5)
val = xl.cell(3, 'E')

The row method returns an array of values from the specified row number, so...

values = xl.row(3)

...returns the values from the third row.

Similarly, the column method returns an array of values from a column:

values = xl.column(5)

Get the full details from the roo homepage, and the roo Rdoc page.

That's all for now. As always, feel free to post a comment here or email me with questions, comments, or suggestions.

Thanks for stopping by!

Digg my article

1 comment:

Anonymous said...

Excellent find Dave.

Thanks!