XML import to PostgreSQL using Nokogiri - ruby-on-rails

I would like to import an XML file from a URL using Nokogiri and save it to my PostgreSQL database.
In my schema.rb I have the following table:
create_table "centres", force: :cascade do |t|
t.string "name"
t.string "c_type"
t.text "description"
t.float "lat"
t.float "long"
t.datetime "created_at", null: false
t.datetime "updated_at", null: false
end
Below is a sample from the file I am importing:
<facility>
<id>CG432</id>
<facility_name>Cairncry Community Centre</facility_name>
<expiration>2099-12-31T23:59:59Z</expiration>
<type>Community Centre</type>
<brief_description/>
<lat>57.1601027</lat>
<long>-2.1441739</long>
</facility>
I created the following import.rake task in lib/tasks:
require 'rake'
require 'open-uri'
require 'Nokogiri'
namespace :db do
task :xml_parser => :environment do
doc = Nokogiri::XML(open("http://sample.xml"))
doc.css('centre').each do |node|
facility_name = node.xpath("centre").text,
type = node.xpath("centre").text,
brief_description = node.xpath("centre").text,
lat = node.xpath("centre").text,
long = node.xpath("centre").text,
Centre.create(:facility_name => name, :type => c_type, :brief_description => description, :lat => lat, :long => long)
end
end
end
I tried rake db:migrate and also rake -T | grep import.

Your XML does not contain a <centre> element. Also there is no need to create a bunch of variables if you only intend to use them once.
doc.css('facility').each do |f|
  Centre.create do |c|
    # Map each XML element to the matching column in the centres table
    c.name = f.css("facility_name").first.text
    c.c_type = f.css("type").first.text
    c.description = f.css("brief_description").first.text
    c.lat = f.css("lat").first.text
    c.long = f.css("long").first.text
  end
end
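A more elegant way to do this, if the element names match your attribute names exactly, is to build the attributes hash from a list of keys. (With the schema above only lat and long match, so the other keys would have to be renamed first.)
KEYS = ["facility_name", "type", "brief_description", "lat", "long"]
doc.css('facility').each do |f|
  values = KEYS.map { |k| f.css(k).first.text }
  Centre.create(Hash[*KEYS.zip(values).flatten])
end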
An explanation of how this works can be found at: http://andywenk.github.io/programming/2014/06/27/ruby-create-a-hash-from-arrays/
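The XML sample uses the element names facility_name, type and brief_description, while schema.rb in the question names the columns name, c_type and description. One way to bridge that is an explicit mapping table. The following is a minimal sketch under assumptions: FIELD_MAP, centre_attributes and the lookup callable are hypothetical names, and a plain Hash stands in for a parsed <facility> node so the example runs without Nokogiri.

```ruby
# Maps XML element names (from the sample file) to column names (from schema.rb).
FIELD_MAP = {
  'facility_name'     => :name,
  'type'              => :c_type,
  'brief_description' => :description,
  'lat'               => :lat,
  'long'              => :long
}.freeze

# `lookup` is any callable that returns the text of an element by its name.
# With Nokogiri it could be ->(tag) { facility.at_css(tag)&.text } (assumption).
def centre_attributes(lookup)
  FIELD_MAP.each_with_object({}) do |(tag, column), attrs|
    attrs[column] = lookup.call(tag)
  end
end

# Stand-in for one parsed <facility> node, using values from the question.
SAMPLE_FACILITY = {
  'facility_name'     => 'Cairncry Community Centre',
  'type'              => 'Community Centre',
  'brief_description' => '',
  'lat'               => '57.1601027',
  'long'              => '-2.1441739'
}.freeze

CENTRE_ATTRS = centre_attributes(SAMPLE_FACILITY.method(:fetch))
```

In the rake task, the resulting hash could then be handed to Centre.create(CENTRE_ATTRS), since every key is a real column of the centres table.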

Related

How to set attributes with different names than a DB schema

I am a newbie Ruby developer. I cannot figure out how to create an ActiveRecord model whose attribute names differ from those defined in the DB schema.
Consider the following schema
create_table "sync_tasks", force: :cascade do |t|
t.string "name"
t.string "path"
t.string "task_type"
t.string "status"
t.boolean "async", default: false
t.boolean "direct_download", default: true
t.datetime "created_at", null: false
t.datetime "completed_at"
t.datetime "updated_at", null: false
end
And I have the following payload
{
"name" : "Sync /var/www/",
"path" : "/var/www",
"directDownload": true,
"async" : false,
"taskType" : "directory"
}
I am trying to create my model like this:
class SyncTask < ApplicationRecord
  TYPE_DB = 'db'
  TYPE_FILE = 'file'
  TYPE_DIRECTORY = 'directory'

  def initialize(params)
    # super
    @task_type = params[:taskType]
    @direct_download = params[:directDownload]
    @path = params[:path]
    @status = params[:status]
    @async = params[:async]
  end
end
When I try to save it throws an error
<NoMethodError: undefined method `[]' for nil:NilClass>
Also I am not able to access field like that
new_task = SyncTask.new(allowed_task_params)
new_task.task_type
It throws the following error
#<NoMethodError: undefined method `task_type' for #<SyncTask not initialized>>
In case I uncomment the super call it gives another error
#<ActiveModel::UnknownAttributeError: unknown attribute 'taskType' for SyncTask.>
What am I doing wrong? How can I use different attribute names and initialize the model myself?
Thanks
You can transform the keys, for example:
=> payload = { "name": "Sync /var/www/", "path": "/var/www", "directDownload": true, "taskType": "directory" }
=> h = payload.transform_keys { |key| key.to_s.underscore } # only since v4.0.2
=> h = Hash[payload.map { |(k, v)| [k.to_s.underscore, v] }] # before v.4.0.2
#> {"name"=>"Sync /var/www/", "path"=>"/var/www", "direct_download"=>true, "task_type"=>"directory"}
=> new_task = SyncTask.new(h)
You shouldn't override the initialize method on ActiveRecord models. If you need custom setup when an object is instantiated, use the after_initialize callback instead. An overridden initialize has to call super, and super only accepts attributes the model knows, so the callback is the safer option.
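The key transformation can also be tried outside Rails. This is a minimal sketch, assuming Ruby 2.5+ for Hash#transform_keys; the underscore helper below is a hypothetical, simplified stand-in for ActiveSupport's String#underscore and only handles plain camelCase keys like those in the payload.

```ruby
# Simplified camelCase -> snake_case conversion (stand-in for ActiveSupport's
# String#underscore; does not handle acronyms or namespaces).
def underscore(key)
  key.to_s.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
end

# Payload from the question, as it would arrive after JSON parsing.
TASK_PAYLOAD = {
  'name'           => 'Sync /var/www/',
  'path'           => '/var/www',
  'directDownload' => true,
  'async'          => false,
  'taskType'       => 'directory'
}.freeze

# Keys now match the column names of the sync_tasks table.
TASK_ATTRIBUTES = TASK_PAYLOAD.transform_keys { |key| underscore(key) }
```

The resulting hash has the keys name, path, direct_download, async and task_type, so it could be passed to SyncTask.new without an overridden initialize.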

Error Transaction.new into Rails app trying to import CSV data

I am trying to import a CSV file into my database in a Rails app. I followed this gist.
Here is my code:
# db/seeds.rb
require 'csv'
csv_text = File.read(Rails.root.join('lib', 'seeds', 'siren_db.csv'))
csv = CSV.parse(csv_text, :headers => true, :encoding => 'ISO-8859-1')
csv.each do |row|
t = Transaction.new
t.siren = row['siren']
t.nom = row['nom']
t.adresse = row['adresse']
t.complement_adresse = row['complement_adresse']
t.pays = row['pays']
t.region = row['region']
t.departement = row['departement']
t.activite = row['activite']
t.date = row['date']
t.nb_salaries = row['nb_salaries']
t.nom = row['nom']
t.prenom = row['prenom']
t.civilite = row['civilite']
t.adr_mail = row['adr_mail']
t.libele_acti = row['libele_acti']
t.categorie = row['categorie']
t.tel= row['tel']
t.save
puts "#{t.siren}, #{t.nom} saved"
end
puts "There are now #{Transaction.count} rows in the transactions table"
Unfortunately, I get an error and don't know why (I have exactly the same code as the gist):
rake aborted!
NameError: uninitialized constant Transaction
/Users/nicolasleroux/Public/sites/sirenforest/db/seeds.rb:6:in `block in <top (required)>'
/Users/nicolasleroux/Public/sites/sirenforest/db/seeds.rb:5:in `<top (required)>'
Tasks: TOP => db:seed
(See full trace by running task with --trace)
UPDATE
The script now runs, but every column is filled with nil. Here is my code:
#db/migrate/create_transaction
class CreateTransactions < ActiveRecord::Migration[5.0]
def change
create_table :transactions do |t|
t.integer :siren
t.string :nom_ent
t.string :adresse
t.string :complement_adresse
t.string :pays
t.string :region
t.integer :departement
t.string :activite
t.integer :date
t.string :nb_salaries
t.string :nom
t.string :prenom
t.string :civilite
t.string :adr_mail
t.string :libele_acti
t.string :categorie
t.integer :tel
t.timestamps
end
end
end
#model transaction
class Transaction < ApplicationRecord
end
The beginning of the csv file:
SIREN;NOM;ADRESSE;COMPLEMENT_ADRESSE;CP_VILLE;PAYS;REGION;DEPARTEMENT;ACTIVITE;DATE;NB_SALARIES;NOM;PRENOM;CIVILITE;ADR_MAIL;LIBELE_ACTI;CATEGORIE;TEL
38713707;SYND COPR DU 6 AU 8 RUE DE CHARONNE 75;6 RUE DE CHARONNE;;75011 PARIS;FRANCE;Île-de-France;75;Activités combinées de soutien lié aux bâtiments;2008;1 ou 2 salariés;;;;;Syndicat de copropriété ;PME;
38713707;SYND COPR DU 6 AU 8 RUE DE CHARONNE 75;6 RUE DE CHARONNE;;75011 PARIS;FRANCE;Île-de-France;75;Activités combinées de soutien lié aux bâtiments;2008;1 ou 2 salariés;;;;;Syndicat de copropriété ;PME;
38724340;SYND COPR DU 18 BD ARAGO 75013 PARIS;18 BOULEVARD ARAGO;;75013 PARIS;FRANCE;Île-de-France;75;Activités combinées de soutien lié aux bâtiments;2008;1 ou 2 salariés;;;;;Syndicat de copropriété ;PME;
Look at section 1 (Setup) of the gist, which says:
Make sure you've created a resource with the appropriate columns to match your seed data. The names don't have to match up.
You must generate the Transaction model in your Rails application, like this:
$ rails generate model Transaction street:text city:string etc...
See section 5 of the gist for the appropriate columns.
Update:
You need to specify the delimiter for your CSV file, like this:
csv = CSV.parse(csv_text, :headers => true, :encoding => 'ISO-8859-1', :col_sep => ';' )
Also, the hash keys must be uppercase, as in your CSV header row. Note that the header contains NOM twice, and column names should be unique: row['NOM'] returns only the first of the two, so t.nom is assigned the same value twice. Full code:
csv = CSV.parse(csv_text, :headers => true, :encoding => 'ISO-8859-1', :col_sep => ';' )
csv.each do |row|
t = Transaction.new
t.siren = row['SIREN']
t.nom = row['NOM'] # => 2 same columns
t.adresse = row['ADRESSE']
t.complement_adresse = row['COMPLEMENT_ADRESSE']
t.pays = row['PAYS']
t.region = row['REGION']
t.departement = row['DEPARTEMENT']
t.activite = row['ACTIVITE']
t.date = row['DATE']
t.nb_salaries = row['NB_SALARIES']
t.nom = row['NOM'] # => 2 same columns
t.prenom = row['PRENOM']
t.civilite = row['CIVILITE']
t.adr_mail = row['ADR_MAIL']
t.libele_acti = row['LIBELE_ACTI']
t.categorie = row['CATEGORIE']
t.tel= row['TEL']
t.save
puts "#{t.siren}, #{t.nom} saved"
end
puts "There are now #{Transaction.count} rows in the transactions table"
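The duplicated NOM header can be handled by position. The sketch below assumes Ruby's standard csv library, whose CSV::Row#field accepts a minimum index as a second argument. The sample data is a shortened, hypothetical version of the file (the names Dupont and Marie are invented), and mapping the first NOM to nom_ent is an assumption based on the migration in the question.

```ruby
require 'csv'

# Shortened, hypothetical sample with the same layout quirks as the real file:
# semicolon-separated, uppercase headers, and NOM appearing twice.
SIREN_SAMPLE = <<~DATA
  SIREN;NOM;PAYS;NOM;PRENOM
  38713707;SYND COPR DU 6 AU 8 RUE DE CHARONNE 75;FRANCE;Dupont;Marie
DATA

def parse_transactions(text)
  CSV.parse(text, headers: true, col_sep: ';').map do |row|
    {
      siren:   row['SIREN'],
      nom_ent: row['NOM'],           # first NOM column (company name)
      pays:    row['PAYS'],
      nom:     row.field('NOM', 2),  # second NOM: search from column index 2 onwards
      prenom:  row['PRENOM']
    }
  end
end

PARSED_TRANSACTIONS = parse_transactions(SIREN_SAMPLE)
```

Each hash could then be passed to Transaction.create in db/seeds.rb. For the full file, the minimum index would have to be adjusted to the position just after the first NOM column.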

No method error is shown when associating ActiveRecords table [duplicate]

I'm trying out Active Record associations in Rails.
I'm trying to connect two tables, restaurants and restaurant_translations, which are split for multi-language support.
Here are the definitions of the two tables:
create_table "restaurant_translations", id: false, force: :cascade do |t|
t.integer "id", limit: 4, default: 0, null: false
t.integer "restaurant_id", limit: 4
t.string "restaurantname", limit: 255
t.string "address", limit: 255
t.string "tel", limit: 255
t.text "description", limit: 65535
t.string "lang", limit: 255, default: "", null: false
t.datetime "created_at", null: false
t.datetime "updated_at", null: false
end
create_table "restaurants", force: :cascade do |t|
t.string "restaurant_type", limit: 255
t.string "genre", limit: 255
t.string "url", limit: 255
t.string "fb", limit: 255
t.string "mailaddr", limit: 255
t.datetime "created_at", null: false
t.datetime "updated_at", null: false
end
And the models:
class Restaurant < ActiveRecord::Base
has_many :restaurant_translations
end
class RestaurantTranslation < ActiveRecord::Base
self.table_name = 'restaurant_translations'
belongs_to :restaurant
end
And then here's the controller which creates my headache.
class RestaurantController < ApplicationController
  def list
    @restaurants = Restaurant.includes(:restaurant_translations).where('restaurant_translations.lang = ?', "en").references(:restaurant_translations)
    logger.debug @restaurants
  end
end
View file(.slim) is like this.
h1 = t :restraunt_list_title
table
  thead
    tr
      th = t :restraunt_list_type
      th = t :restraunt_list_name
      th = t :restraunt_list_url
      th = t :restraunt_list_genre
      th = t :restraunt_list_addr
  tbody
    - @restaurants.each do |restaurant|
      tr
        td = restaurant.restaurant_type
        td = restaurant.restaurant_translations.first.restaurantname
        td = link_to 'here', restaurant.url
        td = restaurant.genre
        td = restaurant.restaurant_translations.first.address
br
This raises a NoMethodError. How should I access the columns of the associated table? Thanks in advance.
ps. After fixing the view as followed, the result is like this.
cf. restaurant_translation is like this.
I'm guessing you're trying to call the name method defined on the restaurant_translations, in that case you should be calling:
tr
  td = restaurant.restaurant_type
  td = restaurant.restaurant_translations.first.name
  td = link_to 'here', restaurant.url
  td = restaurant.genre
  td = restaurant.restaurant_translations.first.address
However, a few corrections to your code:
You wouldn't need a restaurant_id column on restaurant, because that is already defined as id, unless you also want to tie a restaurant to a restaurant_translation via a belongs_to association, in which case you'd need a restaurant_translation_id column.
I see that you're excluding the id column in restaurant_translation and yet adding it again. That seems redundant; moreover, if you want to take advantage of some advanced ActiveRecord features, you'd need an id column.
You don't need to specify the table_name on the restaurant_translation model, as Rails infers it.
In your restaurants_controller, you're assigning @restaurants and reassigning it immediately to restaurant_translations. I don't know what you intended to do there, but I don't think that's right.
Try to keep names consistent in your application, so that your future self can understand it. An example is the usage of restraunt_list_type; I guess you meant restaurant_list_type.
There could be others, but these are the ones my eyes caught immediately.
UPDATE
You should check your database to ensure that every restaurant has at least one restaurant_translation. The error "... for NilClass" means that restaurant_translations was empty for some restaurant, so .first returned nil. If you want to fetch only restaurants that have at least one restaurant_translation, use joins instead of includes in your controller:
Restaurant.joins(:restaurant_translations).where(restaurant_translations: { lang: "en"}).references(:restaurant_translations)
However, if you want to fetch all restaurants, with/without restaurant_translations, then I'd say you should go with the approach of the previous response to your question, using the Object#try method:
tbody
  - @restaurants.each do |restaurant|
    tr
      td = restaurant.restaurant_type
      td = restaurant.restaurant_translations.first.try(:restaurantname)
      td = link_to 'here', restaurant.url
      td = restaurant.genre
      td = restaurant.restaurant_translations.first.try(:address)
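The nil-safe lookup can be illustrated in plain Ruby, without Rails. This is a minimal sketch under assumptions: ListedRestaurant and ListedTranslation are hypothetical Struct stand-ins for the two models, the sample values are invented, and the safe navigation operator &. (Ruby 2.3+) plays the role that Object#try plays in the view.

```ruby
# Hypothetical stand-ins for the Restaurant and RestaurantTranslation models.
ListedTranslation = Struct.new(:restaurantname, :address, :lang)
ListedRestaurant = Struct.new(:restaurant_type, :translations)

LISTED = [
  ListedRestaurant.new('cafe', [ListedTranslation.new('Blue Door', '1 High St', 'en')]),
  ListedRestaurant.new('bar', []) # a restaurant with no translation rows
].freeze

# Returns the translated name, or a placeholder when no translation exists.
def display_name(restaurant, lang = 'en')
  translation = restaurant.translations.find { |t| t.lang == lang }
  # &. returns nil instead of raising NoMethodError when translation is nil
  translation&.restaurantname || '(no translation)'
end

DISPLAY_NAMES = LISTED.map { |restaurant| display_name(restaurant) }
```

The second restaurant would have raised NoMethodError with a bare .first.restaurantname; here it falls back to the placeholder instead.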

How to group by individual elements in arrays

I have collections of shows with their genres attached, so that Show.first.genres returns ['horror','scifi',etc].
My goal is to calculate a user's mean score per unique genre. The problem is that if I do a Show.group(:genres), I get results grouped by the whole sets of genres:
['horror','scifi']=>[list of entries]
['horror','gore']=>[list of entries]
I would rather get a count of all elements with horror in the genres, all elements with scifi, etc. Any ideas?
Here's some relevant schema information:
create_table "animes", force: :cascade do |t|
end
create_table "animes_genres", id: false, force: :cascade do |t|
t.integer "anime_id", null: false
t.integer "genre_id", null: false
end
create_table "genres", force: :cascade do |t|
t.string "name"
end
create_table "library_entries", force: :cascade do |t|
end
These are all linked back and forth and I can generally access any relationships that exist via ActiveRecord.
Or in a more Railsish way, you should probably start from Genre and do something like:
Genre.all.map{|g| [g, g.shows] }.to_h
If the ActiveRecord Association goes both directions, then you should be able to look at the problem from the Genre Model's perspective.
Genre.find_by(name: 'space opera').shows
I am not sure if this is what you are looking for, but if that Show.group(:genres) returns a Hash of [array of genres] => [array of entries], you can transform that into a Hash of genre => [array of entries], by doing this:
by_genres = Show.group(:genres)
by_genre = {}
by_genre.default = []
by_genres.each{|ks,vs| ks.each{|k| by_genre[k] += vs }}
Or if you only want the count:
by_genres = Show.group(:genres)
count_genre = {}
count_genre.default = 0
by_genres.each{|ks,vs| ks.each{|k| count_genre[k] += vs.size }}
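The same regrouping can be written in plain Ruby and extended to the mean score the question asks about. This is a minimal sketch under assumptions: RatedShow is a hypothetical Struct standing in for a show joined with the user's library entry, and the titles and scores are invented.

```ruby
# Hypothetical stand-in for a show together with the user's score for it.
RatedShow = Struct.new(:title, :genres, :score)

RATED_SHOWS = [
  RatedShow.new('A', %w[horror scifi], 8),
  RatedShow.new('B', %w[horror gore], 6),
  RatedShow.new('C', %w[scifi], 10)
].freeze

# Collects scores under each individual genre rather than under the genre set.
def scores_by_genre(shows)
  shows.each_with_object(Hash.new { |hash, key| hash[key] = [] }) do |show, acc|
    show.genres.each { |genre| acc[genre] << show.score }
  end
end

# Mean score per genre, as a Float.
def mean_by_genre(shows)
  scores_by_genre(shows).transform_values { |scores| scores.sum.to_f / scores.size }
end

GENRE_MEANS = mean_by_genre(RATED_SHOWS)
```

A show with several genres contributes its score to each of them, which is the behaviour the question describes wanting.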

How do I iterate through all records and pass database value to a variable?

I have two tables, "Que" and "Opts".
I want to iterate through all the records in Que and add them to the variables rikt_nr, start_nr and end_nr, because they will go on the end of a URL, which will look like:
api.url.number=8-00001
How do I make it iterate through Que and pass rikt_nr, start_nr and end_nr to the rest of the code?
The Que table has these fields:
create_table "ques", force: true do |t|
t.integer "rikt_nr"
t.integer "start_nr"
t.integer "end_nr"
t.datetime "created_at"
t.datetime "updated_at"
end
and Opts has these fields:
create_table "opts", force: true do |t|
t.string "numbers"
t.string "operator"
t.datetime "created_at"
t.datetime "updated_at"
end
This is the code:
Que.all.each do |que|
  url = "http://api.url.number="
  que.rikt_nr = rikt_nr
  que.start_nr = start_nr
  que.end_nr = end_nr
  st_en = start_nr..end_nr
  st_en.each do |nr|
    full_url = "#{url}" + rikt_nr.to_s + "-" + nr.to_s
    doc = Nokogiri::XML(open(full_url))
    number = doc.at('Number').text
    operator = doc.at('Name').text
    number_operator = "#{number}" + ";" + " #{operator}"
    number_operator
    save = @opts.create(:numbers => number, :operator => operator)
  end
end
I just figured this out.
The problem was here:
que.rikt_nr = rikt_nr
que.start_nr = start_nr
que.end_nr = end_nr
st_en = start_nr..end_nr
so I changed it to:
Que.all.each do |que|
  url = "http://api.url.number="
  # Read the values from the record instead of assigning to it
  st_en = que.start_nr..que.end_nr
  st_en.each do |nr|
    full_url = "#{url}" + que.rikt_nr.to_s + "-" + nr.to_s
    # open comes from open-uri; on Ruby 3.0+ call URI.open(full_url) instead
    doc = Nokogiri::XML(open(full_url))
    number = doc.at('Number').text
    operator = doc.at('Name').text
    number_operator = "#{number}" + ";" + " #{operator}"
    number_operator
    save = @opts.create(:numbers => number, :operator => operator)
  end
end
and it now works.
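The URL-building part of the fix can be isolated and checked without a database or network access. This is a minimal sketch under assumptions: QueRow is a hypothetical Struct standing in for a Que record, urls_for is an invented helper, and the zero-padding width of 5 is inferred from the example URL (8-00001), since integer columns do not keep leading zeros.

```ruby
# Hypothetical stand-in for a row of the ques table.
QueRow = Struct.new(:rikt_nr, :start_nr, :end_nr)

API_BASE = 'http://api.url.number='

# Builds one URL per number in the record's start_nr..end_nr range.
# width: is an assumption taken from the example URL in the question.
def urls_for(que, base: API_BASE, width: 5)
  (que.start_nr..que.end_nr).map do |nr|
    "#{base}#{que.rikt_nr}-#{nr.to_s.rjust(width, '0')}"
  end
end

QUE_ROWS = [QueRow.new(8, 1, 3), QueRow.new(9, 5, 5)].freeze
ALL_URLS = QUE_ROWS.flat_map { |que| urls_for(que) }
```

Each URL could then be fetched and parsed inside the loop, as in the corrected code above.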
