Error: Importing Data CSV Rails 4 - ruby-on-rails

I'm very new to importing data into a SQL database from CSV. I've followed some Stack Overflow posts, but I'm getting an error after running rake import:data. The error states: Errno::ENOENT: No such file or directory @ rb_sysopen - products.csv. I have required csv in my application.rb, and I have created a CSV file and placed it in tmp. Here is my code so far. I understand I may be asking a lot of the community, but if someone answers this question, could you also provide some more insight into CSV and rake functions? Thanks so much!
import.rake
namespace :import do
  desc "imports data from a csv file"
  task :data => :environment do
    require 'csv'
    CSV.foreach('tmp/products.csv') do |row|
      name = row[0]
      price = row[1].to_i
      Product.create( name: name, price: price )
    end
  end
end

Specify the full path to the CSV file.
For example, if the file is in /tmp/ use:
CSV.foreach('/tmp/products.csv') do |row|
If the products.csv file is in your application's tmp directory use:
CSV.foreach(Rails.root.join('tmp', 'products.csv')) do |row|
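A relative path such as 'tmp/products.csv' is resolved against the directory the process was started from, not against the application root. Here is a minimal sketch of making the lookup explicit, using only the standard library. The helper names csv_path_under and read_products are hypothetical, and the app_root argument stands in for Rails.root:

```ruby
require 'csv'
require 'pathname'

# Hypothetical helper: builds a path under the given root and fails
# early with a message that shows where Ruby actually looked.
def csv_path_under(app_root, *parts)
  path = Pathname.new(app_root).join(*parts)
  raise ArgumentError, "CSV not found at #{path} (cwd: #{Dir.pwd})" unless path.file?

  path
end

# Reads name/price rows from a header-less CSV, as in the question.
def read_products(path)
  CSV.foreach(path.to_s).map { |row| { name: row[0], price: row[1].to_i } }
end
```

Inside the rake task this could be called as read_products(csv_path_under(Rails.root, 'tmp', 'products.csv')), with each resulting hash passed to Product.create.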

I ran into something similar; the cause was forgetting to wrap the arguments in braces inside the parentheses, so you might want to try going from:
Product.create( name: name, price: price )
to:
Product.create({ name: name, price: price })

Check out the smarter_csv gem.
In its simplest form you can do this:
SmarterCSV.process('tmp/products.csv').each do |hash|
Product.create( hash )
end
Add smarter_csv to your Gemfile so that it is auto-loaded when your Rake task loads the environment.
This gives you:
namespace :import do
  desc 'imports data from given csv file'
  task :data, [:filename] => :environment do |t, args|
    # File.exist? replaces File.exists?, which was removed in Ruby 3.2
    fail "File not found" unless File.exist? args[:filename]
    options = {} # add options if needed
    SmarterCSV.process( args[:filename], options).each do |hash|
      Product.create( hash )
    end
  end
end
Call it like this:
rake import:data['/tmp/products.csv']
See also: https://github.com/tilo/smarter_csv
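For comparison, the standard library can also produce hash rows when the file starts with a header line. This is a rough sketch that assumes a header row of name,price; the question does not show the file layout, so that is an assumption, and rows_as_hashes is a hypothetical helper name:

```ruby
require 'csv'

# Parses CSV text whose first line is a header row and returns an array
# of hashes keyed by symbolized header names. Numeric-looking fields are
# converted to Integer or Float by the :numeric converter.
def rows_as_hashes(csv_text)
  CSV.parse(csv_text, headers: true, header_converters: :symbol,
                      converters: :numeric).map(&:to_h)
end
```

Each hash could then be passed to Product.create, provided the header names match the model's column names.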

Related

Writing TestCase for CSV import rake task

I have a simple Rails application where I import data from CSV into my app. The import is functioning properly, but I have no idea where to start with testing this rake task, or where the test belongs in a modular Rails app. Any help would be appreciated. Thanks!
Hint
My Rails structure is a little different from traditional rails structures, as I have written a Modular Rails App. My structure is in the picture below:
engines/csv_importer/lib/tasks/web_import.rake
The rake task that imports from csv..
require 'open-uri'
require 'csv'
namespace :web_import do
  desc 'Import users from csv'
  task users: :environment do
    url = 'http://blablabla.com/content/people.csv'
    # Force the encoding to avoid UndefinedConversionError "\xC3" from ASCII-8BIT to UTF-8
    # URI.open is used because Kernel#open no longer opens URLs on Ruby 3.0+
    csv_string = URI.open(url).read.force_encoding('UTF-8')
    counter = 0
    duplicate_counter = 0
    user = []
    CSV.parse(csv_string, headers: true, header_converters: :symbol) do |row|
      next unless row[:name].present? && row[:email_address].present?
      user = CsvImporter::User.create row.to_h
      if user.persisted?
        counter += 1
      else
        duplicate_counter += 1
      end
    end
    p "Email duplicate record: #{user.email_address} - #{user.errors.full_messages.join(',')}" if user.errors.any?
    p "Imported #{counter} users, #{duplicate_counter} duplicate rows ain't added in total"
  end
end
Mounted csv_importer in my parent structure
This makes the csv_importer engine available in the root of the application.
Rails.application.routes.draw do
mount CsvImporter::Engine => '/', as: 'csv_importer'
end
To migrate correctly from the root of the application, I added an initializer:
/engines/csv_importer/lib/csv_importer/engine.rb
module CsvImporter
class Engine < ::Rails::Engine
isolate_namespace CsvImporter
# This enables me to be able to correctly migrate the database from the parent application.
initializer :append_migrations do |app|
unless app.root.to_s.match(root.to_s)
config.paths['db/migrate'].expanded.each do |p|
app.config.paths['db/migrate'] << p
end
end
end
end
end
With this setup I am able to run the Rails app like any other Rails application. I explained it so that anyone who wants to help understands what I need: how to write tests for the rake task inside the engine.
What I have done so far towards writing a test:
task import: [:environment] do
desc 'Import CSV file'
task test: :environment do
# CSV.import 'people.csv'
Rake::Task['app:test:db'].invoke
end
end
How does someone write a test for a rake task in a modular app? Thanks!
I haven't worked with engines, but is there a way to just put the CSV importing logic into its own class?
namespace :web_import do
  desc 'Import users from csv'
  task users: :environment do
    WebImport.new(url: 'http://blablabla.com/content/people.csv').call
  end
end
class WebImport # (or whatever name you want)
  # Keyword argument, to match the WebImport.new(url: ...) call above
  def initialize(url:)
    @url = url
  end

  def call
    # counter, CSV parse, etc. (the body of the original rake task)
  end
end
That way you can jump into the Rails console and run WebImport directly, and you can also test WebImport in isolation. When you write Rake tasks and jobs (Sidekiq etc.), you want the task to be as thin a wrapper as possible around the actual meat of the code (which in this case is the CSV parsing). Separate the "trigger the CSV parse" code from the "actually parse the CSV" code by putting them in their own classes or files.
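A minimal sketch of what such a class might look like, so that it can be exercised without Rake, a network, or a database. The fetcher and model parameters and the Result struct are assumptions introduced here for illustration; they are not part of the original task:

```ruby
require 'csv'

# Importer that is independent of Rake. `fetcher` is any callable that
# returns CSV text for a URL, and `model` is anything that responds to
# `create` and returns an object with `persisted?` (both hypothetical).
class WebImport
  Result = Struct.new(:imported, :rejected)

  def initialize(url:, fetcher:, model:)
    @url = url
    @fetcher = fetcher
    @model = model
  end

  # Parses the fetched CSV, skips rows missing a name or email address,
  # and returns a tally of saved and unsaved records.
  def call
    result = Result.new(0, 0)
    CSV.parse(@fetcher.call(@url), headers: true, header_converters: :symbol) do |row|
      next if row[:name].to_s.empty? || row[:email_address].to_s.empty?

      record = @model.create(row.to_h)
      if record.persisted?
        result.imported += 1
      else
        result.rejected += 1
      end
    end
    result
  end
end
```

In the rake task, the fetcher could be a lambda such as ->(u) { URI.open(u).read } and the model could be CsvImporter::User; in a test, both can be replaced with simple fakes.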

importing data from csv into rails application

I am trying to import data from CSV into the database using classes, so that I can easily write test cases for the CSV import rake task I created.
However, my solution does not work.
I also feel that it doesn't make sense, and that it is not a good solution that shows Ruby mastery.
Here is what I came up with in my engines/csv_importer/lib/tasks/csv_import.rake:
require 'open-uri'
require 'csv'
namespace :csv_import do
  desc 'Import users from csv'
  task users: :environment do
    WebImport.new(url: 'http://blablabla.com/details/people.csv').call.answers
  end
end
class WebImport
  def initialize(url)
    @csv_string = url
  end
  def call
    CSV.parse(@csv_string, headers: true, header_converters: :symbol) do |row|
      next unless row[:name].present? && row[:email_address].present?
    end
    CsvImporter::User.create row.to_h
  end
  def self.answers
    user = []
    counter = 0
    duplicate_counter = 0
    user.persisted? ? counter + 1 : duplicate_counter + 1
    p "Email duplicate record: #{user.email_address} - #{user.errors.full_messages.join(',')}" if user.errors.any?
    p "Imported #{counter} users, #{duplicate_counter} duplicate rows ain't added in total"
  end
end
Error when I run rake csv_import:users
$ rake csv_import:users
rake aborted!
NoMethodError: private method `gets' called for {:url=>"http://blablabla.com/details/people.csv"}:Hash
How do I make this work, and how do I write tests for it in the long run?
You are getting this error because WebImport.new(url: '...') hands a hash to the positional url parameter, so a hash ends up in CSV.parse, which expects a string.
To fix that, change the argument from a hash to a string: WebImport.new('http://blablabla.com/details/people.csv'), and read the remote CSV file before passing its contents to CSV.parse, for example: CSV.parse(URI.open(url).read). URI.open comes from open-uri; on Ruby versions before 3.0, plain open(url) also works.
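The difference between the two call styles can be seen in isolation. In this small sketch, PositionalImport and KeywordImport are hypothetical class names and the URL is a placeholder:

```ruby
# With a positional parameter, `new(url: '...')` hands the method a Hash.
class PositionalImport
  attr_reader :source

  def initialize(url)
    @source = url
  end
end

# With a keyword parameter, the same call hands the method the String.
class KeywordImport
  attr_reader :source

  def initialize(url:)
    @source = url
  end
end

address = 'http://example.invalid/people.csv' # placeholder URL
positional = PositionalImport.new(url: address).source
keyword = KeywordImport.new(url: address).source
```

Here positional holds the whole hash, while keyword holds only the URL string. Either changing the call to pass a plain string or changing the signature to url: resolves the mismatch.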
You can try using
rake db:seed
to import the data into your database, with a seed file such as:
require 'csv'
puts "Importing data..."
CSV.foreach(Rails.root.join("file_name.csv"), headers: true) do |row|
  Model_name.create! do |model_name|
    model_name.name = row[0]
    model_name.email_address = row[1]
  end
end
The CSV file should be in your project root folder.

Rake task to download and unzip

I would like to update a cities table every week to reflect changes in cities across the world. I am creating a Rake task for the purpose. If possible, I would like to do this without adding another gem dependency.
The zipped file is a publicly available zipped file at geonames.org/15000cities.zip.
My attempt:
require 'net/http'
require 'zip'
namespace :geocities do
desc "Rake task to fetch Geocities city list every 3 days"
task :fetch do
uri = URI('http://download.geonames.org/export/dump/cities15000.zip')
zipped_folder = Net::HTTP.get(uri)
Zip::File.open(zipped_folder) do |unzipped_folder| #erroring here
unzipped_folder.each do |file|
Rails.root.join("", "list_of_cities.txt").write(file)
end
end
end
end
The return from rake geocities:fetch
rake aborted!
ArgumentError: string contains null byte
As detailed, I'm trying to unzip the file and save it to a list_of_cities.txt file. Once I have the methodology down for accomplishing this, I believe I can figure out how to update my db based on the file. (If you have opinions on how best to handle the actual db update, other than my planned way, I'd love to hear them, but that seems like a different post entirely.)
This will save zipped_folder to disk, then unzip it and save its contents:
require 'net/http'
require 'zip'
namespace :geocities do
  desc "Rake task to fetch Geocities city list every 3 days"
  task :fetch do
    uri = URI('http://download.geonames.org/export/dump/cities15000.zip')
    zipped_folder = Net::HTTP.get(uri)
    # Write the downloaded bytes to disk in binary mode
    File.open('cities.zip', 'wb') do |file|
      file.write(zipped_folder)
    end
    # Zip::File.open takes a path; the block form closes the archive afterwards
    Zip::File.open('cities.zip') do |zip_file|
      zip_file.each do |file|
        file.extract
      end
    end
  end
end
This will extract all files inside the zip file, in this case cities15000.txt.
You can then read the contents of cities15000.txt and update your database.
If you want to extract to a different file name, you can pass it to file.extract like this:
zip_file.each do |file|
file.extract('list_of_cities.txt')
end
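The ArgumentError in the question is consistent with the downloaded bytes being handed to Zip::File.open, which expects a file path rather than file contents. Saving the bytes first looks roughly like this in isolation; save_download is a hypothetical helper, and the sample bytes stand in for the HTTP response body:

```ruby
require 'tmpdir'

# Writes raw bytes to disk in binary mode and returns the path, so that
# a path (not the bytes themselves) can be given to the unzip step.
def save_download(bytes, path)
  File.open(path, 'wb') { |file| file.write(bytes) }
  path
end

# Stand-in for the response body: a few bytes including null bytes.
sample = "PK\x03\x04\x00\x00".b
saved_path = save_download(sample, File.join(Dir.mktmpdir, 'cities.zip'))
```

The returned saved_path is what would then be passed to Zip::File.open.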
I think it can be done more easily without Ruby, just using wget and unzip:
namespace :geocities do
  desc "Rake task to fetch Geocities city list every 3 days"
  task :fetch do
    # wget saves the archive to disk; unzip reads it from there (it cannot read from a pipe)
    `wget -c --tries=10 http://download.geonames.org/export/dump/cities15000.zip && unzip -o cities15000.zip`
  end
end

Rails Generate Custom Rakefile

I'm working on a project that migrates data from a customer's old_busted DB into Rails objects to be worked on later. I also need to convert these objects into a CSV and upload it to a neutral FTP server (this is to allow a coworker to build the example pages through Sugar CRM). I've created rake files to do all of this, and it was successful. Now I'm going to repeat this process for each object that I create in Rails (mirroring the previous DB) and, in the best case, want these rake files generated when I run rake generate scaffold <object>.
Here is my import rake:
desc "Import Clients from db"
task :get_busted_clients => [:environment] do
  @old_clients = Busted::Client.all
  @old_clients.each do |row|
    @client = Client.new
    @client.client_id = row.NUMBER
    @client.save
  end
end
Here is my CSV convert/FTP upload rake:
desc "Exports db's to local CSV and uploads them to FTP"
task :export_clients_CSV => [:environment] do
  # Required libraries for CSV read/write and NET/FTP IO #
  require 'csv'
  require 'net/ftp'
  # Pull all Client objects into clients for reading #
  clients = Client.all
  puts "Creating CSV file for <Clients> and updating column names..."
  # Open a new CSV file that uses the column headers from Client #
  CSV.open("clients.csv", "wb",
           :write_headers => true, :headers => Client.column_names) do |csv|
    puts "--Loading each entry..."
    # Load all entries from Client into the CSV file row by row #
    clients.each do |client|
      # This line specifically puts the attributes in the rows WITH RESPECT TO #
      # THE COLUMNS
      csv << client.attributes.values_at(*Client.column_names)
    end
    puts "--Done loading each entry..."
  end
  puts "...Data populated. Finished building CSV. Closing File."
  puts "------------------------"
  # Upload CSV File to FTP server by requesting new FTP connection, assigning credentials
  # and informing the client what file to look for and what to name it
  puts "Uploading <Clients>..."
  ftp = Net::FTP.new('192.168.xxx.xxx')
  ftp.login("user", "passwd")
  ftp.puttextfile("clients.csv", "clients.csv")
  ftp.quit
  puts "...Finished."
end
I ran rake generate g get_busted and put this in my get_busted_generator.rb:
class GetBustedGenerator < Rails::Generators::NamedBase
source_root File.expand_path('../templates', __FILE__)
def generate_get_busted
copy_file "getbusted.rake", "lib/tasks/#{file_name}.rake"
end
end
After that, I got lost. I can't find anything on templating a rake file or the syntax included to do so.
Rails has been a recent endeavor and I may be overlooking something in terms of design of the solution to my problem.
TL;DR: Is templating a rake file a bad thing? Are there alternative solutions? If not, what's the syntax for generating either script customized to the object (or point me in the right direction, please)?
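One possible direction, offered as a sketch rather than a confirmed answer: Rails generators are built on Thor, whose template action renders an ERB file instead of copying it verbatim the way copy_file does. The rendering step itself can be illustrated with the standard library's ERB. The template text, the render_rake_task helper, and the task naming scheme below are all hypothetical:

```ruby
require 'erb'

# Hypothetical template for a per-model export task. Inside a generator
# this text would live under templates/ and be rendered by `template`.
RAKE_TEMPLATE = <<~'TEMPLATE'
  desc "Exports <%= plural %> to a local CSV"
  task :export_<%= plural %>_csv => [:environment] do
    require 'csv'
    CSV.open("<%= plural %>.csv", "wb", write_headers: true, headers: <%= klass %>.column_names) do |csv|
      <%= klass %>.find_each { |record| csv << record.attributes.values_at(*<%= klass %>.column_names) }
    end
  end
TEMPLATE

# Renders the template for a class name such as "Client".
def render_rake_task(klass)
  plural = "#{klass.downcase}s" # naive pluralization, for illustration only
  ERB.new(RAKE_TEMPLATE).result_with_hash(klass: klass, plural: plural)
end

rendered = render_rake_task('Client')
```

In a NamedBase generator, the equivalent would presumably be a templates/export.rake.erb file rendered with template "export.rake.erb", "lib/tasks/#{file_name}.rake", using the generator's own helpers such as class_name inside the ERB tags; the file names here are assumptions.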

Regenerate fixture test files in Rails

How do I regenerate all the YML fixture files? I accidentally deleted them.
@brian,
I'm using the following script to generate fixtures from a given SQL query.
It lives under my lib/tasks directory as a rake task:
namespace :fixture_generator do
  desc "generate fixtures for a given sql query from the current development database"
  task :fixture_generator, [:sql, :file_name] => :environment do |t, args|
    args.with_defaults(:sql => nil, :file_name => nil)
    i = "000"
    p "creating fixture - #{args.file_name}"
    File.open("#{Rails.root}/test/fixtures/#{args.file_name}.yml", 'a+') do |file|
      data = ActiveRecord::Base.connection.select_all(args.sql)
      file.write data.inject({}) { |hash, record|
        number = i.succ!
        hash["#{args.file_name}_#{number}"] = record
        hash
      }.to_yaml
    end
  end
end
Usage: say I want to generate a fixture for the users table:
rake fixture_generator:fixture_generator["select * from users","users"]
Also, if you run another query with the same fixture file name, the result is appended to the existing file.
HTH
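The core of the task above, turning rows into a keyed fixture hash and dumping it as YAML, can be tried without a database. In this minimal sketch, fixtures_yaml is a hypothetical helper and the rows are plain hashes standing in for the result of select_all:

```ruby
require 'yaml'

# Builds a fixture hash keyed "<name>_001", "<name>_002", ... from an
# array of row hashes and returns it as YAML text.
def fixtures_yaml(name, rows)
  keyed = rows.each_with_index.map do |row, index|
    [format('%s_%03d', name, index + 1), row]
  end
  keyed.to_h.to_yaml
end

yaml_text = fixtures_yaml('users', [{ 'name' => 'Ann' }, { 'name' => 'Bob' }])
```

Writing yaml_text to test/fixtures/users.yml would give one labeled fixture per row.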
