Good place for data loading scripts that rely on rails? - ruby-on-rails

In your experience, where is the best place to put scripts that run data loading jobs but rely on Rails? In my project they are in the models folder, but that adds a lot of code to that folder, and won't Rails load it all into memory (unnecessarily) when the server runs? The lib/ folder looks good, but scripts there don't have Rails access unless you manually set that up in the scripts. Is there a clean solution here?

Are you talking about jobs that you fire off via rake? (Then lib/tasks/.)
Or are you talking about putting data into the Rails app? Then maybe you want something like the data_migration plugin.

What do you mean by 'data loading jobs'? If they are scripts that manipulate the database, put them in db/.

rake db:seed would be the best imo
put your script in db/seeds.rb

Related

Where do I put a recurring script that updates database from api in rails

I have a Rails app set up with a model Account that should be updated every morning with data coming from an external API I'm calling (a CRM). Basically, either I create new accounts in my app for those I find in the CRM, with some of the fields mapped to my columns, or I find the account if it already exists and update it.
So far, I've been putting this code into the seeds.rb file and from Heroku, where the app is hosted, I set up a scheduler with the command : rails db:seed that runs periodically.
My issue is that I'm sure there is a better way of doing this. I've read about rake tasks but I did not quite understand how that applied to my case. Otherwise I thought of putting my method in the models/account.rb file as a self method. But I don't really know how I can invoke it in a rake command to allow me to set up a scheduler in Heroku.
Any idea on where would be the best place to put this method, and how to call it from command line?
Thanks in advance.
You can create a script directory in your project, and put your script from db/seeds.rb into this directory, maybe called update_accounts.rb. Then you can run it with
rails runner script/update_accounts.rb
and schedule that task in Heroku. More info about rails runner is in the Rails command line guide.
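A minimal sketch of what such a script could delegate to, assuming the Account model has crm_id and name columns (both hypothetical) and that the CRM records have already been fetched as an array of hashes. The class name AccountSync and the CrmClient mentioned in the comments are made up for illustration; find_or_initialize_by, assign_attributes and save! are the usual ActiveRecord methods.

```ruby
# Hypothetical service class, e.g. lib/account_sync.rb.
class AccountSync
  # model:   anything that responds to ActiveRecord's find_or_initialize_by
  #          (in the app this would be the Account class)
  # records: array of hashes already fetched from the CRM API
  def initialize(model:, records:)
    @model = model
    @records = records
  end

  # Creates accounts that do not exist yet and updates the ones that do.
  # Returns the number of records processed.
  def call
    @records.each do |attrs|
      account = @model.find_or_initialize_by(crm_id: attrs.fetch(:crm_id))
      account.assign_attributes(name: attrs[:name]) # map CRM fields to columns here
      account.save!
    end
    @records.size
  end
end

# script/update_accounts.rb would then only need something like:
#   records = CrmClient.fetch_accounts   # hypothetical API client
#   AccountSync.new(model: Account, records: records).call
```

Keeping the logic in a plain class means the same code can be called from rails runner, from a rake task, or from a background worker without copying it.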
I would suggest using a background processor such as Sidekiq: https://github.com/mperham/sidekiq
Once using Sidekiq, you need a scheduler like https://github.com/moove-it/sidekiq-scheduler to make sure it happens periodically as you require.
This will become easier to maintain as your application grows and you need more workers. It also moves your scheduling into version control.

Location for yaml data files in ruby on rails

I am using yaml files to provide initialisation data for the database and also to provide initialisation data for some of my services models. Where should I store these files in a ruby on rails app?
Based on ruby_newbie's comment and the general lack of other responses, it seems that there is no well defined rails way for this. Reasonable locations are
rails_root/data
rails_root/config/data
rails_root/db/data
You should put any data required for your application to run in the seeds file (db/seeds.rb). http://edgeguides.rubyonrails.org/active_record_migrations.html#migrations-and-seed-data
If you need to create an initial database state, you can use seed files in "db/seeds/". You can then use rake to run them and create the initial state in your database.
In a seeds file you can use Rails models without problems; run the following rake command to create the entries:
rake db:seed
You can check Rails Documentation:
http://edgeguides.rubyonrails.org/active_record_migrations.html#migrations-and-seed-data
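As far as I know, rake db:seed only runs db/seeds.rb itself, so files under db/seeds/ have to be loaded from there. A minimal sketch of one way to do that, assuming the seed files are plain .rb files whose names sort in the order they should run (the helper name load_seed_files is made up):

```ruby
# Loads every .rb file in seeds_dir in filename order and returns the paths.
def load_seed_files(seeds_dir)
  Dir.glob(File.join(seeds_dir, '*.rb')).sort.each { |file| load file }
end

# In db/seeds.rb this could be called as (assumed directory layout):
#   load_seed_files(Rails.root.join('db', 'seeds').to_s)
```

Prefixing the files with numbers (01_users.rb, 02_accounts.rb) is a simple way to control the order.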
There may be a good use case for loading fixed data into a constant without needing to store it in a database. Since this is technically fixed "data" I would suggest putting it in
rails_root/db/yaml/
# and you'll have files like
rails_root/db/yaml/measurements.yml
rails_root/db/yaml/locations.yml
# or if you prefer
rails_root/data/yaml/
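A minimal sketch of loading such a file into a constant, using only Ruby's standard yaml library. The helper names and the LOCATIONS constant are made up for illustration, and the data is frozen so it cannot be modified by accident at runtime.

```ruby
require 'yaml'

# Recursively freezes hashes, arrays and their contents.
def deep_freeze(obj)
  case obj
  when Hash
    obj.each { |key, value| deep_freeze(key); deep_freeze(value) }
  when Array
    obj.each { |value| deep_freeze(value) }
  end
  obj.freeze
end

# Reads a YAML file once and returns a frozen structure suitable for a constant.
def load_fixed_data(path)
  deep_freeze(YAML.safe_load(File.read(path)))
end

# In a Rails initializer this could look like (hypothetical constant and path):
#   LOCATIONS = load_fixed_data(Rails.root.join('db', 'yaml', 'locations.yml'))
```

YAML.safe_load only allows basic types by default, which is usually what you want for plain data files.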

Rails execute script

I am building a script in one of my controllers to fill a database with an Excel file's data. I would build the function, then access it through a route. (That I guess I can protect with cancan.) But I thought about it, and it doesn't seem very ... 'Railsy'.
I know the scripts folder exists, and it is probably for these kinds of tasks. I've tried googling stuff like 'rails execute script' and other stuff, but i can't find any good advice for what to do next.
I'm sorry if this seems kind of stupid, but in my apps i've been kind of hacking around stuff to make it work, so any advice on this task would be appreciated.
If you need to upload the file in the app and process it, it should probably go in the "lib" directory and be accessed like any other Ruby library/module/etc.
If it's something you need to run locally, "on demand", "scripts" is fine. If you need access to your Rails environment when running it, such as your Rails models, you can run it from "rails console" or "rails runner".
As Aln said, there are a variety of ways it could be scheduled as well.
You could simply do
#!/usr/bin/env ruby
require 'rubygems'
# regular ruby code here
and have it running just like any other util. Of course you can always call any *.rb with simply
ruby somescript.rb
If you need some scheduled script, check into rufus-scheduler gem.

Where to put files that will be read in a Rails app?

I'm developing a Rails application and within that application I developed a Rake task that will read entries from a file and store them into the DB. Producing the code was no problem, but I'd like to know, where do I place the file that is read? Is there a convention for that, if yes, what is it?
I know I could have used the seeds.rb file, but is it OK, by the standards, to load and read a file from there?
Thanks in advance!
Yes, put the data you wish to load in the db/seeds.rb file and to load it run rake db:seed. This is what this file was designed to do.
I don't think there's a hard and fast Rails convention for this case. When it comes to seed data I put mine in a subfolder of db.
The yaml_db plugin dumps/loads content from/to the database using the file rails_root/db/data.yml. I'm not sure if this is by convention; however, the content is db related, making the db folder an appropriate choice.
Where to put stuff in Rails is a problem that I've been working around for a while. The question is whether your file can fit into one of the existing concerns. For instance, is it configuration information?
Anyway, perhaps a similar fit would be the schema.rb, which is put in the db directory. Do not modify the schema.rb with your data, of course: I'm merely suggesting that the db directory, or a subdirectory, might be a place to put your file(s).
On the other hand, if you don't see any directories that hold anything similar -- it's not one of the main categories in app, nor any of the main categories above that -- then you can just make up a name and use that.

How to go about writing a standalone Ruby ActiveRecord utility in my current Rails project?

I have a RoR project on my Windows 7 PC.
I want to create some Ruby code that I can execute from the cmd.exe command line that manipulates the development database (via database.yml) of the project. (I don't want to have to run my utility code via a web page.)
What is the best way to go about pulling this off? (I'm a newbie.)
I can't put the code in the test/ directory because that executes against the test database.
I tried just creating a utility.rb file under app/ but when I run it I get this:
utility.rb:5: uninitialized constant ActiveRecord (NameError)
My standalone file obviously doesn't know about the rest of the rails framework.
Any suggestions?
Rails comes with a utility to do exactly this. Instead of using ruby filename, use script/runner filename (rails runner filename in Rails 3 and later) from within the top-level directory of the Rails project; this automatically loads your Rails environment before running the script.
However, if what you're trying to do is manipulate the database, the right answer is probably to create a migration. Most people assume that migrations are only for changing the structure of your database (adding or removing columns or tables) but they can also be a great way to add seed data or manipulate all the data in the database.
You can write your own rake task which depends on :environment and pass RAILS_ENV=development when executing it.
Nice screencast about it: screencast
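A minimal sketch of such a rake task. The file name, namespace and task name are made up; the require and the stand-in :environment task are there only so the file can be loaded outside a Rails project and can be dropped in a real app, where Rails provides :environment itself.

```ruby
# lib/tasks/data.rake (hypothetical file and task names)
require 'rake' # not needed inside a Rails app; lets the sketch run on its own

# Stand-in for running outside Rails only. In a Rails app the real
# :environment task loads the application and its models.
task :environment unless Rake::Task.task_defined?(:environment)

namespace :data do
  desc 'Load data into the database of the current RAILS_ENV'
  task load: :environment do
    env = ENV.fetch('RAILS_ENV', 'development')
    puts "Loading data into the #{env} database"
    # ActiveRecord models are available here because :environment has run.
  end
end
```

It would be run from the project's top-level directory with rake data:load RAILS_ENV=development, which also works from cmd.exe because rake itself reads NAME=value arguments into ENV.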
