I have a Rails application where there are standard cities and areas rows that I want to have initialized in my tables in production and development. I have the data in separate files in array form.
Example cairo_areas.txt file has the following:
["Downtown - Abdin",
"Downtown - Abu El Rish",
"Downtown - Ahmed Helmy",
"Downtown - Ahmed Maher",
"Downtown - Gamea' El Banat"]
This is only part of the array of course to serve as an example.
Now I've tried to look for the best way to initialize in the database, but most of the answers I found seem very old.
In short the two approaches I found are:
Insert the data in seeds.rb. The downside is that most of the data there are for testing purposes, and I'm trying to separate the original data from the Faker data.
Approach 2 is to create a task to seed the city and area data into the database, but most of the answers I found are very old in older versions of Rails.
I want to know what is the best way to insert authentic production data into the database at initializing?
According to what Mark told you on his answer there is a third option which is widely used for adding data to production environments which is creating a migration that creates those records or modifies existing ones.
Two things must be said about the seeds.rb, this file should contain the minimal amount of data, for which according to the business logic of your system, makes the system functional. E.g, creation of roles, permissions and other information which your systems makes no sense without it. Also, this seeds.rb script should be idempotent, so you can re run the seeds.rb at any time without breaking the logic of the data. So you should use Model.find before doing Model.create basically. In that sense you can keep adding information to that seed file and re run it any time.
You might wonder how to choose over a migration or a rake task. I would suggest you to use migrations, because when used correctly you have the possibility to rollback those changes. Option that you do not have when using tasks.
Sorry if I was to verbose, I hope it helps !
Related
I'm quite new to Rails but in my current assignment I have no other choice but use RoR. My problem is that in my app I would like to create, connect and destroy databases automatically on user demand but as far as I understand it is quite hard to accomplish this with ActiveRecord. It would be nice to hear some advice from more experienced RoR developers on this issue.
The problem in details:
I have a main database (which I access with activerecord). In this database I store a list of my active programs (and some template data for creating new programs). I would like to create a separate database for each of this programs (when a user creates a new program in my app).
In the programs' databases I would like to store the state and basic info of the particular program and a huge amount of program related data (which is used to calculate the state and is necessary to have for audit reasons).
My problem is that for example I want a dashboard listing all the active programs and their state data. So first I have to get the list from my main db and after that I have to connect to all the required program databases and get the state data.
My question is what is the best practice to accomplish this? What should I use (ActiveRecord, a particular gem, etc.)?
Hi, thanks for your answers so far, I would like to add a couple of details to make my problem more clear for you:
First of all, I'm not confusing database and table. In my case there is a tool which is processing log files. Its a legacy tool (written in ruby 1.8.6) and before running it, I have to run an SQL script which creates a database with prefilled- and also with empty tables for this tool. The tool then processes the logs and inserts the calculated data into different tables in this database. The catch is that the new system should support running programs parallel which means I have to create different databases for different programs.(this was not an issue so far while the tool was configured by hand before each run, but now the configuration must be automatic by my tool) There is no way of changing the legacy tool while it would be too complicated in the given time frame, also it's a validated tool. So this is the reason I cannot use different tables for different programs, because my solution should be based on an other tool.
Summing my task up:
I have to crate a complex tool using RoR and Ruby 2.0.0 which:
- creates a specific database for a legacy tool every time a user want to start a new program
- configures this old tool on a daily basis to process the required logs and insert the calculated data into the appropriate database
- access these databases and show dashboards based on their data
The database I'm using is MySQL.
I cannot use other framework, because the future owner of my tool won't be able to manage/change/update it. So I have to go with RoR, which is quite painful for me right now and I really hope some of you guys can give me a little guidance.
Ok, this is certainly outside of the typical use case scenario, BUT it is very doable within Rails and ActiveRecord.
First of all, you're going to want to execute some SQL directly, which is fine, but you'll also have to take extra care if you're using user input to determine the name of the new database for instance, and do your own escaping. (Or use one of ActiveRecord's lower-level escaping methods that we normally don't worry about.) The basic idea though is something like:
create_sql = <<SQL
CREATE TABLE foo ...
SQL
ActiveRecord::Base.connection.execute(create_sql)
Although now that I look at ActiveRecord::ConnectionAdapters::Mysql2Adapter, there's a #create method that might help you.
The next step is actually doing different things in the context of different databases. The key there is ActiveRecord::Base.establish_connection. Using that, and passing in the params for the database you just created, you should be able to do what you need to for that particular db. If the db's weren't being created dynamically, I'd put that line at the top of a standard ActiveRecord model so that that model would always connect to that db instead of the main one. If you want to use the same class, and connect it to different db's (one at a time of course), you would probably remove_connection before calling establish_connection to the next one.
I hope this points you in the right direction. Good luck!
Sorry if the question is awkwardly phrased -- still a Mongo/Mongoid/Rails newbie here.
What I'm trying to ask:
In my development, I've been changing around the design of my Models as I go, adding some fields here, removing some fields there (one of the great things about MongoDB/Mongoid is that you can do this very quickly and easily!)
Everything is working, but in browsing through the development database, I've got some "detritus" -- documents with the old fields (and data) that aren't being used. It's no big deal other than to my garbage-collective sensibilities. I could, in theory, drop the DB and start from scratch, but that's messy.
Is there a utility / gem / etc. that will, essentially, look at the current document design and drop any fields in the live DB that don't match up to the data model?
I know this can be done manually, and I know about the mongoid migrations gems that are out there -- those are both good and, ultimately, more thorough solutions (which I'll look at).
For now, though, I'm wondering if there's a simple "quick shot" type of utility to simply sync up the DB and drop any fields that aren't explicitly specified in my models.
Thanks!
I don't know of any tools that do this for you. It's not a common ask because most people see this flexibility as a useful feature.
For your development database, you should probably clear it out and start over.
In production you have a couple choices:
Write your code to be robust against fields being missing and the database documents not matching your mongoid model. In other words, be prepared for the two to get out of sync.
Get in the habit of migrating your data every time you change the model. It's a bit of hassle and not strictly necessary if you follow the first, but if the untidiness bothers you, this is a fine idea. See Managing mongoid migrations for strategies.
Let's say for example I'm managing a Rails application that has static content that's relevant in all of my environments but I still want to be able to modify if needed. Examples: states, questions for a quiz, wine varietals, etc. There's relations between your user content and these static data and I want to be able to modify it live if need be, so it has to be stored in the database.
I've always managed that with migrations, in order to keep my team and all of my environments in sync.
I've had people tell me dogmatically that migrations should only be for structural changes to the database. I see the point.
My counterargument is that this mostly "static" data is essential for the app to function and if I don't keep it up to date automatically (everyone's already trained to run migrations), someone's going to have failures and search around for what the problem is, before they figure out that a new mandatory field has been added to a table and that they need to import something. So I just do it in the migration. This also makes deployments much simpler and safer.
The way I've concretely been doing it is to keep my test fixture files up to date with the good data (which has the side effect of letting me write more realistic tests) and re-importing it whenever necessary. I do it with connection.execute "some SQL" rather than with the models, because I've found that Model.reset_column_information + a bunch of Model.create sometimes worked if everyone immediately updated, but would eventually explode in my face when I pushed to prod let's say a few weeks later, because I'd have newer validations on the model that would conflict with the 2 week old migration.
Anyway, I think this YAML + SQL process works explodes a little less, but I also find it pretty kludgey. I was wondering how people manage that kind of data. Is there other tricks available right in Rails? Are there gems to help manage static data?
In an app I work with, we use a concept we call "DictionaryTerms" that work as look up values. Every term has a category that it belongs to. In our case, it's demographic terms (hence the data in the screenshot), and include terms having to do with gender, race, and location (e.g. State), among others.
You can then use the typical CRUD actions to add/remove/edit dictionary terms. If you need to migrate terms between environments, you could write a rake task to export/import the data from one database to another via a CSV file.
If you don't want to have to import/export, then you might want to host that data separate from the app itself, accessible via something like a JSON request, and have your app pull the terms from that request. That seems like a lot of extra work if your case is a simple one.
I'm trying to figure out the cut-off with respect to when a "text entry" should be stored in the database vs. as a static file. Are there any rules of thumb here? The text entries will be at the most several paragraphs and have links to images and tables (and hyperlinks to other text entries). Some criteria for the text entry:
I'm thinking of using DITA as the content format
The text should be searchable
If the text is revised, a new version will be created
thanks in advance, Chuck
The "rails way" would be using a database.
The solution will be more scalable, therefore faster and probably easier to develop with (using migration and so on). Using the file system, you will have to build lots of functions on your own, that are already implemented for database usage.
You could create a Model (e.g.) Document and easily use existing versioning systems, like paper_trail. When using an indexed search, you can just have an has_many relation enabling you to realise the depencies between the models (destroy a model means to destroy the search index).
Rather than a cut-off, you could look at what databases provide and ask yourself if those features would be useful. Take Isolation (the I in ACID): if you have any worries that multiple people could be trying to edit an entry at the same time, a database would handle that well while you'd have to handle the locks yourself working with files. Or Atomicity: you might want to update two things at once (e.g. an index page and an entry page) and know they will either both succeed or both fail.
Databases do a number of things beyond ACID, such as taking advantage of multiple datatypes, making querying easier, and allowing for scaling. It's a question worth asking since most databases end up having data stored in a bunch of files on disk. Would you end up writing a mini-database if you used files yourself?
Besides, if you're using rails you mind as well take advantage of its ActiveRecord functionality, and make it possible to use the many plugins that expect a database.
I'd use a database for even a small, single user rails app.
I have a couple of scripts that generate & gather large amounts of data that I will need both to seed my database with and in the future to add large amounts more to it. What is the best way to import lots of relational data into a rails database as both seed data and intermittently during production?
I haven't settled on an output format for my script yet but the data's structure largely mirrors my rails model's and contains has_many associations that I would like the import to preserve.
I've googled a fair bit and seen ar-extensions and seed_fu as well as the idea of using fixtures.
With ar-extensions all the examples seem to be staightforward csv imports (likely from table dumps which seems to be its primary use case) with no mention of handling associations or avoiding duplicate updates. In my case I have no id's, foreign keys, or join tables in my script so this seems like it wouldn't work for me unless I was prepared to handle that complexity myself.
With seed_fu, it looks like it could handle the relational aspects of the data creation but would still require me to specify ids (how can you know which ones are available in production?) and mix code with data.
Fixtures also have the same id problem though now it requires objects to be named (I'd probably just end up using numbers for names) and I am not sure how I would avoid accidental duplications of records.
Or would i be better off just putting my data into a local sqlite db first and then using the straight table dumping techniques?
I've done this before with CSV files. I have a cron job that collects the data and puts it in accessible CSV files. Then I have a rake task (also in cron, but on another box) that looks for CSVs, and if there are any, calls model methods to create objects out of the CSV rows.
The models have a csv_create_or_update method that takes a CSV row. This technique sidesteps the id issue, and also allows validations to run (or not) on the data coming in.
I was handed a rails app recently that required some imported data and it was all handled in a rake task. All the data came in form of csv formatted files and was handled per class/model etc as necessary. This worked out relatively well for me being new to the system as it was very easy to see where data was going and how it was being applied. As you import the data you can check for id conflicts/collisions and handle them accordingly.