I am using Smarter CSV and have encountered a CSV file that has blank lines. Is there any way to ignore these? Smarter CSV is taking the blank line as a header and not processing the file correctly. Is there any way I can bastardize the `comment_regexp`?

    mail.attachments.each do |attachment|
      filename = attachment.filename
      #filedata = attachment.decoded
      puts filename
      begin
        tmp = Tempfile.new(filename)
        tmp.write attachment.decoded
        tmp.close
        puts tmp.path
        f = File.open(tmp.path, "r:bom|utf-8")
        options = {
          :comment_regexp => /^#/
        }
        data = SmarterCSV.process(f, options)
        f.close
        puts data
      end
    end

Sample File:
![test.csv](https://i.stack.imgur.com/P1E2z.png)
output

Let's first construct your file.
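The file in the question is only available as a screenshot, so what follows is a hypothetical stand-in rather than the actual data. It is a minimal sketch that assumes the file contains comment lines, blank lines, fields separated by runs of spaces, and dollar amounts. The constant `SAMPLE`, the helper `write_sample_file`, and all of the values are assumptions made for illustration.

```ruby
require 'tempfile'

# Hypothetical stand-in for the file in the screenshot: a comment line,
# blank lines, variable-width runs of spaces between fields, dollar amounts.
SAMPLE = <<~END
  # Sales report (hypothetical data)

  name    qty    price
  apple   3      $1.50

  pear    10     $0.75
END

# Writes SAMPLE to a temporary file and returns the file's path.
# Tempfile.create (without a block) returns a plain File that is not
# removed automatically, so the path stays valid after this method returns.
def write_sample_file
  file = Tempfile.create(['sample', '.csv'])
  file.write(SAMPLE)
  file.close
  file.path
end

fin_name = write_sample_file
puts File.read(fin_name)
```
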
Two problems must be addressed to read this file using the method `SmarterCSV::process`. The first is that comments (lines beginning with an octothorpe, `'#'`) and blank lines must be skipped. The second is that the field separator is not a fixed-length string.

The first of these problems can be dealt with by setting the value of `process`' `:comment_regexp` option key to the regular expression `/\A#|\A\s*\z/`, which reads, "match an octothorpe at the beginning of the string (`\A` being the beginning-of-string anchor) or (`|`) match a string containing zero or more whitespace characters (`\s` being a whitespace character and `\z` being the end-of-string anchor)".

Unfortunately, `SmarterCSV` is not capable of dealing with variable-length field separators. It does have an option `:col_sep`, but its value must be a string, not a regular expression.

We must therefore pre-process the file before using `SmarterCSV`, though that is not difficult. While we are at it, we may as well remove the dollar signs and use commas for field separators (see note 1). Let's look at the file produced.
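To make the effect of that pre-processing concrete, here is a minimal sketch applied to a few in-memory lines. The input lines, the constant `COMMENT_OR_BLANK` and the helper `clean_line` are hypothetical names introduced for illustration. The two regular expressions are the ones used in this answer.

```ruby
# Matches a line that starts with '#' or consists only of whitespace.
COMMENT_OR_BLANK = /\A#|\A\s*\z/

# Returns nil for a comment or blank line. Otherwise returns the line with
# surrounding whitespace stripped and each run of whitespace (plus an
# optional following dollar sign) replaced by a single comma.
def clean_line(line)
  return nil if line.match?(COMMENT_OR_BLANK)

  line.strip.gsub(/\s+\$?/, ',')
end

# Hypothetical input lines, not the data from the question.
lines = [
  "# a comment\n",
  "\n",
  "name    qty    price\n",
  "apple   3      $1.50\n"
]

puts lines.map { |line| clean_line(line) }.compact
# prints:
# name,qty,price
# apple,3,1.50
```

Note that `/\s+\$?/` only removes a dollar sign that directly follows whitespace, so a dollar sign at the very start of a line would survive this step.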
displays
Now that's what a CSV file should look like! We may now use `SmarterCSV` on this file with no options specified: `SmarterCSV.process(fout_name)`.

The pre-processing code, which reads the file `fin_name` line by line and writes the cleaned lines to `fout_name`, is:

    fout = File.new(fout_name, 'w')
    File.foreach(fin_name) do |line|
      fout.puts(line.strip.gsub(/\s+\$?/, ',')) unless line.match?(/\A#|\A\s*\z/)
    end
    fout.close

1. I used `IO::foreach` to read the file line-by-line and then write each manipulated line that is neither a comment nor a blank line to the output file. If the file is not huge we could instead gulp it into a string, modify the string and then write the resulting string to the output file: `File.write(fout_name, File.read(fin_name).gsub(/^#.*?\n|^[ \t]*\n|^[ \t]+|[ \t]+$|\$/, '').gsub(/[ \t]+/, ','))`. The first regular expression reads, "match lines beginning with an octothorpe or lines containing only spaces and tabs or spaces and tabs at the beginning of a line or spaces and tabs at the end of a line or a dollar sign". The second `gsub` merely converts multiple tabs and spaces to a comma.
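
As a quick check of the string-based variant in the note, the following sketch applies the same two `gsub` calls to an in-memory string instead of a file. The helper name `clean_text` and the sample string are assumptions for illustration only.

```ruby
# Applies the two substitutions from the note to a whole string:
# 1. delete comment lines, blank lines, leading/trailing spaces and tabs,
#    and dollar signs;
# 2. collapse each remaining run of spaces and tabs into a single comma.
def clean_text(text)
  text
    .gsub(/^#.*?\n|^[ \t]*\n|^[ \t]+|[ \t]+$|\$/, '')
    .gsub(/[ \t]+/, ',')
end

# Hypothetical input, not the data from the question.
raw = "# a comment\n\nname    qty    price\n  apple   3      $1.50  \n\npear\t10     $0.75\n"

puts clean_text(raw)
# prints:
# name,qty,price
# apple,3,1.50
# pear,10,0.75
```

Unlike the line-by-line version, this one removes every dollar sign, including one that is not preceded by whitespace.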