Nokogiri: how to traverse every table row that has one of two classes


I am attempting to parse an HTML table using Nokogiri. The table is marked up well and has no structural issues, except that the table header is embedded as an ordinary row instead of using <thead>. I want every row except the first: I'm not interested in the header, only in the rows that follow it. Here's an example of how the table is structured.

<table id="foo">
<tbody>
  <tr class="headerrow">....</tr>
  <tr class="row">...</tr>
  <tr class="row_alternate">...</tr>
  <tr class="row">...</tr>
  <tr class="row_alternate">...</tr>
</tbody>
</table>

I'm interested in grabbing only the rows with the class row or row_alternate. However, this syntax does not work in Nokogiri as far as I'm aware:

doc.css('.row .row_alternate').each do |a_row|
  # do stuff with a_row
end

What's the best way to solve this with Nokogiri?


Best answer

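I would try this. A comma selects elements matching either class, whereas the space in '.row .row_alternate' is the descendant combinator, so that selector looks for .row_alternate elements nested inside .row elements and finds none: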

doc.css(".row, .row_alternate").each do |a_row|
  # do stuff with a_row
end

Try removing the header row from the parsed document first with doc.at_css(".headerrow").remove, and then iterate over the remaining rows:

doc.css("tr").each do |row|
  # some code
end


A CSS selector can contain multiple components separated by commas. The W3C Selectors Level 3 specification describes this as follows:

A comma-separated list of selectors represents the union of all elements selected by each of the individual selectors in the list. (A comma is U+002C.) For example, in CSS when several selectors share the same declarations, they may be grouped into a comma-separated list. White space may appear before and/or after the comma.

doc.css('.row, .row_alternate').each do |a_row|
  p a_row.to_html
end

# "<tr class=\"row\">...</tr>"
# "<tr class=\"row_alternate\">...</tr>"
# "<tr class=\"row\">...</tr>"
# "<tr class=\"row_alternate\">...</tr>"