How to handle nils in CSV data

Recently I’ve had to work a lot with CSVs, and I’ve learned the hard way that the data in them isn’t always what you expect. Why would there be no number associated with a name or part? Why no name for a number or part? Below is the strategy I’ve adopted to handle these cases.

Fetching the data

We want the headers returned as symbols because I don’t trust the column order to stay the same in the future, so I’d rather look values up by name than by position. On the second line we see the headers are indeed an array of symbols.

require "csv"

parsed_rows = CSV.read("data.csv", headers: true, header_converters: :symbol)
parsed_rows.headers # => [:number, :name, :part]
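
To try this without a file on disk, here is a minimal sketch that parses an inline string instead. The sample rows are made up for illustration (they are not from a real data.csv). It also shows where the nils come from: an empty, unquoted field is parsed as nil by default.

```ruby
require "csv"

# Hypothetical sample data standing in for the contents of data.csv.
csv_text = <<~ROWS
  number,name,part
  1,foo,bar
  2,,baz
  3,qux,
  4,zub,fab
ROWS

# CSV.parse takes the CSV text itself (not a path) and returns a CSV::Table.
parsed_rows = CSV.parse(csv_text, headers: true, header_converters: :symbol)

p parsed_rows.headers    # the header names, converted to symbols
p parsed_rows[1][:name]  # the empty name field on the second data row is nil
```

Because lookups go through the header name, the code keeps working if the columns are reordered.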

Filter out nils

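valid_rows = parsed_rows.reject do |row|
  has_nil = row.to_h.values.any?(&:nil?)
  # Log the skipped row so we can explain later why it wasn't processed.
  puts "number: #{row[:number]} | name: #{row[:name]} | part: #{row[:part]}" if has_nil
  has_nil # reject drops the row when this is true
end
# => [#<CSV::Row number:"1" name:"foo" part:"bar">, #<CSV::Row number:"4" name:"zub" part:"fab">]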

Here we use reject to only return rows that don’t have nil values.

We’re going through each row and converting it to a Hash. We’re then using any? to check whether any of the values in the Hash are nil. If any are, we log the row with the puts statement. We’re doing this because someone will ask “Why didn’t this part get updated?” and we want to tell them it’s because the data in that row was nil for some reason.
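
If you also want to keep the skipped rows around, for example to say exactly which column was empty, one possible variation is to split the rows with partition instead of reject. This is only a sketch: the sample data, the variable names, and the message format are my own assumptions.

```ruby
require "csv"

# Hypothetical sample data; in practice this would come from the CSV file.
csv_text = <<~ROWS
  number,name,part
  1,foo,bar
  2,,baz
  3,qux,
  4,zub,fab
ROWS

parsed_rows = CSV.parse(csv_text, headers: true, header_converters: :symbol)

# partition splits the rows in one pass: complete rows first, incomplete second.
complete_rows, incomplete_rows = parsed_rows.partition do |row|
  row.fields.none?(&:nil?)
end

# Name the missing columns so the log answers "why was this row skipped?"
skipped_messages = incomplete_rows.map do |row|
  missing = row.headers.select { |header| row[header].nil? }
  "skipped number=#{row[:number].inspect}: missing #{missing.join(', ')}"
end

skipped_messages.each { |message| puts message }
```

Here row.fields gives the row’s values directly, so the to_h step isn’t needed, and complete_rows holds the same rows that reject would have returned.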