Where to put code that downloads and stores data?

I'm pretty much new to Rails and have a question about how to organize my code. I read about fat models and skinny controllers and it makes a lot of sense (in theory?).
What I'd like to do now is this: Periodically (via Cron and a Rails runner) download some data from the Internet and store parts of it in the database. What I don't understand is where to put the code which speaks with the API from which I get the data. Do I put it into the model and let it look like this:
API data
'--> Model --> Database
What about another case, where the downloaded data has to be split up and stored in two different models / database tables? Which version do I choose?
Version 1:
API data
'--> Model --> Database
'--> Model --> Database
Version 2:
API data
'--> Controller
|--> Model --> Database
'--> Model --> Database
Thanks for your help! :)

As @agmcleod suggested, you should go with a rake task, which you run with rake task_name and then schedule as a cron job.
Start with:
rails g task api_service fetch_data_for_model1 fetch_data_for_model2
now open lib/tasks/api_service.rake
namespace :api_service do
  desc "Update database with new Movies"
  task :fetch_data_for_model1 => :environment do
    puts 'Start fetching data'
    # API and credentials stand for your own client class and settings
    API.new(credentials).fetch_movies.each do |movie|
      puts "Creating movie id=#{movie.id}"
      Movie.find_or_create_by(movie.attributes)
    end
    puts 'Finished!'
  end

  desc "TODO"
  task :fetch_data_for_model2 => :environment do
    # ...
  end
end
Now open your crontab with
crontab -e
and add an entry that changes into the app directory and runs the rake task:
00 00 * * * cd /Users/you/projects/myrailsapp && /usr/local/bin/rake RAILS_ENV=production api_service:fetch_data_for_model1
You may consider using https://github.com/javan/whenever which provides a clear syntax for writing and deploying cron jobs.
With whenever you would create a schedule.rb file with definitions such as:
every 1.day, :at => '4:30 am' do
  rake 'api_service:fetch_data_for_model1'
end
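As for where the API code itself lives: one option that keeps both the rake task and the models thin is a plain Ruby class (for example under app/services or lib) that the task calls. The sketch below is only an illustration of Version 1 from the question. MovieImporter, the client's fetch_movies method, and the two stores are hypothetical names, and MemoryStore is an in-memory stand-in for two ActiveRecord models so that the example runs without Rails.

```ruby
# Hypothetical importer: fetches records from an API client and splits each
# record across two stores (stand-ins for two ActiveRecord models).
class MovieImporter
  def initialize(client:, movies:, directors:)
    @client = client        # anything responding to #fetch_movies
    @movies = movies        # stand-in for a Movie model
    @directors = directors  # stand-in for a Director model
  end

  # Stores every fetched record and returns how many were processed.
  def call
    records = @client.fetch_movies
    records.each do |record|
      director = @directors.find_or_create_by(name: record[:director])
      @movies.find_or_create_by(title: record[:title], director_id: director[:id])
    end
    records.size
  end
end

# Minimal in-memory store imitating find_or_create_by, for demonstration only.
class MemoryStore
  attr_reader :rows

  def initialize
    @rows = []
  end

  def find_or_create_by(attrs)
    found = @rows.find { |row| attrs.all? { |key, value| row[key] == value } }
    return found if found

    row = attrs.merge(id: @rows.size + 1)
    @rows << row
    row
  end
end

# Fake client returning canned data instead of calling a real API.
FakeClient = Struct.new(:payload) do
  def fetch_movies
    payload
  end
end
```

Inside the rake task the body would then shrink to a single call such as MovieImporter.new(client: ..., movies: Movie, directors: Director).call, with the real client and models passed in, and the same class can be exercised in tests with a fake client.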

Related

Ruby on Rails filling database with scraped data daily

I'm trying to setup a process which will scrape web data from a set of websites on a set schedule (maybe monthly, daily...etc). I want it to then fill the database tables. What would be the best way to do this? Would it be best to create a ruby script outside of rails, and then use a cron task on my own schedule to fill the database? Or is there a way I can do this within the rails framework?

Step 1: Create a rake task
e.g. lib/tasks/scrapping.rake
namespace :scrapping do
  desc "Fetches new data from websites"
  task scrap_websites: :environment do
    # Call your scrapping classes/jobs/whatever code here
  end
end
Step 2: Create a CRON task calling your rake task
You can use a gem like whenever for this: https://github.com/javan/whenever
For instance, your config/schedule.rb could look like this:
every 1.day, at: '4:00am' do
  rake 'scrapping:scrap_websites'
end

Where do I put seed data if I have already created my database in my Rails project?

I’m using Rails 4.2.5. I want to create some seed data for a new model, user_images, that I just created in an existing project. However, I already have a db/seeds.rb file that has been run on my database. Where do I put the seed data for this new model? I assume I can’t use db/seeds.rb because it has already been run. Blowing away the database and starting again is not an option.
Thanks, - Dave

You can still use seeds. I use, for example:
Person.find_or_create_by(name: 'Bob')
Add as many of these as required; because each call only creates the record if it is missing, the file can be run as many times as you like. I run seeds on every automated deployment, for example, so I don't forget.
Link to command: http://apidock.com/rails/v4.2.1/ActiveRecord/Relation/find_or_create_by

Create a custom rake task in lib/tasks. The file name should end in .rake. Then you run the task by its name. For example:
task :do_something => :environment do
  p "do something"
end
You'd run this task by calling rake do_something in terminal.

fetch csv feed and place into database in rails

So, from scratch: I have a CSV feed. It's currently 2596 lines long (yay).
This feed gets updated frequently. I want this CSV feed (bearing in mind that when I click the link it instantly downloads as a CSV file) to populate my database daily at a fixed time (e.g. 5am): every morning the database table would be wiped and repopulated from the CSV. (I access the CSV via a URL.)
How would I go about this using Rails? I'm not aware of any gems or anything else I could use for this.
Sam

You wouldn't do it via Rails itself: Rails is a web framework, and this is more of a background task. If populating the database requires knowledge of your application structure, I'd set this up as a rake task in lib/tasks/populate.rake
Your question is much too broad to answer fully without more details, but generally something like the below should work.
Edit: delete users and recreate from an assumed structure
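require 'csv'
require 'open-uri'
namespace :populate do
  desc 'Wipes the users table and recreates it from the CSV feed'
  task reload: :environment do
    # Remove all users
    User.delete_all
    # URI.open comes from open-uri; plain open() no longer accepts URLs on Ruby 3+
    CSV.new(URI.open(YOUR_CSV_URL)).each do |row|
      # row is an array of column values; adjust the indexes to your feed
      User.create(name: row[0], address: row[1])
    end
  end
end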
You could then use Cron or equivalent to call this at 5am
cd /path/to/your/web/app && RAILS_ENV=production bundle exec rake populate:reload
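The row handling can also be kept apart from the download so that it is easy to try on sample text. A minimal sketch, assuming the feed has a header row with name and address columns; the question does not say what the columns are, so both column names and the rows_from_csv helper are made up for illustration:

```ruby
require 'csv'

# Turns CSV text into an array of attribute hashes, skipping rows without a name.
# The "name" and "address" column names are assumptions about the feed.
def rows_from_csv(text)
  CSV.parse(text, headers: true).map do |row|
    name = row['name'].to_s.strip
    next if name.empty? # yields nil, removed by compact below

    { name: name, address: row['address'].to_s.strip }
  end.compact
end
```

In the rake task, the resulting hashes could be passed to User.create inside a User.transaction block together with the delete_all, so that a failed or malformed download does not leave the table empty.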

Rails - How to auto-transfer records from table to table at a specific time?

I am pre-storing records in table A and I want to transfer these records from table A to table B automatically at a specific time, let's say every evening at 08:00 PM.
Any ideas on how to solve this little problem?

You could create a rake task to implement your job and then schedule it with cron, the default *nix job scheduler. Its syntax is difficult to remember, so I prefer to use a Ruby wrapper around it, the whenever gem.

You can use the whenever gem to run cron jobs, for example a job that runs every 5 minutes.
In schedule.rb:
every 5.minutes do
  rake "transfer_data:send_data"
end
In lib/tasks/send_data.rake:
namespace :transfer_data do
  desc "Rake task to transfer data"
  task :send_data => :environment do
    ## code to transfer data from one table to the other table
  end
end
Execute the task manually with bundle exec rake transfer_data:send_data

Ruby scripts with access to Rails Models

Where and how do I run a simple script that uses my Rails environment? Specifically, I have one column that holds multiple pieces of information. I've now added a column for each piece of information and need to run a Ruby script that calls a method on each row of the database to extract the data and save it to the new columns.

Using a migration sounds like the right way to go if I am understanding your use case.
However, if you really do want to write a standalone script that needs access to your Rails application's models, you can require the environment.rb file from inside your standalone script.
Example:
#!/usr/bin/env ruby
ENV['RAILS_ENV'] = "production" # Set to your desired Rails environment name
require '/path/to/railsapp/config/environment.rb'
# After this point you have access to your models and other classes from your Rails application
model_instance = MyModel.find(7)
model_instance.some_attribute = "new value"
model_instance.save

I have to agree with David here. Use a migration for this. I'm not sure what you want to do, but running it from inside your environment is much, much more efficient than loading up the app environment manually. And since your initial post suggests you're only doing this once, a migration is the way to go:
rails g migration MigrateData
.. generates:
class MigrateData < ActiveRecord::Migration
  def self.up
    # Your migration code here
  end

  def self.down
    # Rollback scenario
  end
end
Of course, you will always want to perform this locally first, using some test data.
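Whichever route is used, it helps to keep the parsing itself in a small method that can be tried on sample strings before it touches real rows. A sketch, assuming purely for illustration that the old column holds values like "Jane Doe;Berlin;1985"; the separator, the field names, and split_legacy_value are all hypothetical, since the question does not describe the actual format:

```ruby
# Splits a combined legacy value into named parts.
# The "name;city;year" layout is an assumption made for this example.
def split_legacy_value(value)
  name, city, year = value.to_s.split(';', 3).map(&:strip)
  { name: name, city: city, year: year }
end
```

Missing parts come back as nil, so the migration or script can decide per row whether to save the result or log the row for manual review.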

I agree with everyone: for this specific case a migration sounds like the way to go. However, if you need to do this regularly, or want to write some other task or script that interacts with the Rails app environment, have Rails generate a rake task for you. It gets saved with your Rails app and can be run again and again :)
Here's an example.
Run rails g task my_namespace my_task
This will generate a file called lib/tasks/my_namespace.rake, which looks like this once you fill in the task body:
namespace :my_namespace do
  desc "TODO: Describe your task here"
  task :my_task => :environment do
    # write any Ruby code here; your models are available
    puts User.find(1).name
  end
end
Run this task with rake my_namespace:my_task
and watch your Ruby code interact with your Rails models!

Seeding data:
http://railscasts.com/episodes/179-seed-data
Adding data with migrations
http://railscasts.com/episodes/23-counter-cache-column
Working with Rake Tasks
http://railscasts.com/episodes/66-custom-rake-tasks
I prefer to use migrations for adding some data in your case.

If it's a one-time thing, use a migration.
If this is something that needs to be done multiple times, use a rake task for it.
