Saturday 28 August 2010

Robust XPath to scrape a specific field with dynamic row and column index

A simple XPath like the one below is easy to read and does its job well, as long as the field we want to scrape is always in the same position (in this case, first row, second column).

//table[@class='mytableclassname']/tr[1]/td[2]

But what should we do if both the row and the column can change position, depending on which group a user is classified in?

Here's what I did to overcome this challenge: I wrote a single XPath that tells nokogiri to look for the row that contains a certain string, and then, in that row, for the column whose position matches that of the 'th' holding another certain string. Like the one below:

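//table[@class='mytableclassname']/tr[td='my_row_contains_some_string']/td[count(//th[.='my_destined_column_th']/preceding-sibling::*)+1]

The path searches for a table with the class 'mytableclassname', then for a row in which at least one field (td) equals 'my_row_contains_some_string'. In that row it picks the field whose position equals the position of the 'th' whose text is 'my_destined_column_th': count(...) counts the siblings that come before that 'th', and adding 1 gives its column index. Note that tr[.='...'] would compare the text of the whole row rather than of a single field, and that //th searches the whole document, so the header text must be unique on the page.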

The columns and rows can move around however they want, but I will still get the exact field whenever I want it (^.^).
