Rails versioning with point-in-time query - ruby-on-rails

Is there a Rails 3 model versioning plugin that supports point-in-time queries (not just recovery) in an SQL database?
To be concrete: I have a table, documents. I want to be able to say, "as of 9/17/2010, which documents contained the text 'foo'?". This requirement seems to rule out all of the single-table versioning solutions like vestal_versions, and none of the other ones seem to have this feature either. All of the plugins I've looked at are documented as black boxes, so perhaps they store enough data internally to do this sort of query but you would never know it from the docs. In terms of Slowly Changing Dimensions, Type 2 is probably the sort of solution I'm looking for, although ultimately I'll use whatever works.
I also need to keep track of which user made changes, although that's probably possible to do outside of the versioning system too.
Is there one that I'm missing? Or am I using the wrong search term? Or do I get to roll my own?

I think you get to roll your own. If you really need to be able to perform queries on the versions, then the versions should probably be their own model, and making a versioning gem bend to fit your needs will be more painful than doing it yourself.
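As a rough idea of what "versions as their own model" could look like, here is a minimal in-memory sketch of a Type 2 style history, where each version carries a validity interval. All names (DocumentVersion, VersionStore, valid_from, valid_to, changed_by) are made up for illustration and do not come from any gem. In a real app these would be columns on a versions table, and the lookup would presumably become a WHERE clause along the lines of valid_from <= ? AND (valid_to IS NULL OR valid_to > ?).

```ruby
require "date"

# One row per version of a document. valid_to is nil for the current version.
# All names here are illustrative, not taken from any versioning gem.
DocumentVersion = Struct.new(:document_id, :body, :changed_by, :valid_from, :valid_to) do
  # True if this version was the live one at the given moment.
  def live_at?(time)
    valid_from <= time && (valid_to.nil? || time < valid_to)
  end
end

class VersionStore
  def initialize
    @versions = []
  end

  # Close the currently open version (if any) and open a new one.
  def save(document_id, body, changed_by, at)
    current = @versions.find { |v| v.document_id == document_id && v.valid_to.nil? }
    current.valid_to = at if current
    @versions << DocumentVersion.new(document_id, body, changed_by, at, nil)
  end

  # Point-in-time query: versions live at `as_of` whose body contains `text`.
  def containing(text, as_of:)
    @versions.select { |v| v.live_at?(as_of) && v.body.include?(text) }
  end
end

store = VersionStore.new
store.save(1, "foo bar", "alice", Date.new(2010, 9, 1))
store.save(1, "bar only", "bob", Date.new(2010, 9, 20))
store.save(2, "no match", "alice", Date.new(2010, 9, 10))

hits = store.containing("foo", as_of: Date.new(2010, 9, 17))
puts hits.map(&:document_id).inspect
```

Because every version records who saved it, the "which user made the change" requirement falls out of the same table.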
Sorry, and have fun :/

Related

Rails self referential relationship for quasi versioning

I’m looking for the most elegant way to handle a self-referential relationship in my Rails model. I have an “answer” which could be used as a basis for another “answer.” They aren’t necessarily versions of each other, since a significant amount of information could change from one “answer” to the next. I’m looking to maintain a sort of multi-level parent-child relationship so I can trace back the authors, content, etc. I’ve looked at a couple of gems, but I’m hesitant to introduce something like that, as it may be overkill. Is there a preferred “Rails way” of doing this?
Thanks in advance
It sounds like you're wanting to create a tree structure where each answer has a reference to its "parent" answer. This is a common use-case, and there's a whole slew of gems that help solve this problem.
At its most basic, you'll need some sort of parent_id column on your answers that references the parent. At that point it's really easy to create methods and scopes that grab the parent, children, siblings, etc.
Since your use-case seems pretty simple, writing your own code to manage the tree is probably fine. However, there is something to be said for using existing code that's been well tested and had the kinks worked out. If nothing else, it can save you time in writing your own unit tests. ActsAsTree is about as basic as they come. It shipped with earlier versions of Rails, so if you find your needs growing for some reason, just about every other gem in this space has an upgrade path you can follow.
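To give a feel for how little code the parent_id approach needs, here is a small in-memory sketch. Answer and AnswerTree are hypothetical names standing in for a model and its table; in Rails the same lookups would normally be associations and scopes rather than hash scans.

```ruby
# Illustrative in-memory stand-in for an answers table with a parent_id column.
Answer = Struct.new(:id, :parent_id, :author)

class AnswerTree
  def initialize(answers)
    @by_id = answers.to_h { |a| [a.id, a] }
  end

  # The answer this one was based on, or nil for a root.
  def parent(answer)
    @by_id[answer.parent_id]
  end

  # Answers that used this one as their basis.
  def children(answer)
    @by_id.values.select { |a| a.parent_id == answer.id }
  end

  # Other answers based on the same parent.
  def siblings(answer)
    return [] if answer.parent_id.nil?

    @by_id.values.select { |a| a.parent_id == answer.parent_id && a.id != answer.id }
  end

  # Walk up the chain: nearest parent first, root last.
  def ancestors(answer)
    chain = []
    current = parent(answer)
    while current
      chain << current
      current = parent(current)
    end
    chain
  end
end

answers = [
  Answer.new(1, nil, "ann"),
  Answer.new(2, 1, "ben"),
  Answer.new(3, 1, "cat"),
  Answer.new(4, 2, "dan")
]
tree = AnswerTree.new(answers)
puts tree.ancestors(answers[3]).map(&:author).inspect
```

Tracing authors back through the chain is then just a matter of mapping over ancestors.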

Rails: acts_as_tree and acts_as_sane_tree

This is the first time I'm modelling a hierarchy within the same model (product categories).
I found a great post on this topic. Since I use Rails 4 & Postgres, which according to the article supports recursive querying (this is the first time I hear this term), the "Adjacency List With Recursive Query" seems to be the way to go because it's both easy to model and fast to query.
The article suggests the acts_as_sane_tree gem, which supports recursive querying. This repo hasn't been updated for two years and I'm not sure whether it supports Rails 4. The project is a fork of the acts_as_tree gem, which supports Rails 4 and is well maintained.
Which gem should I use? And does the acts_as_tree gem support recursive querying to avoid expensive queries?
If you are in doubt about which gem to use, I always suggest taking a look at the Ruby Toolbox. It helps you evaluate whether a gem project is still active, how many developers are using the gem, and a lot more. Why would you want to do that? You do not want to choose a gem that is no longer maintained. You want to use the tools the community uses and stay as close to the mainstream as possible. If you do not follow the community, you will run into problems when you need a bug fixed, need further documentation, or want to update your Rails version.
In this case, for nested ActiveRecord models, awesome_nested_set and ancestry are good candidates. I would not choose the recursive query implementation, because most databases do not support it. Unless you have a very good reason, it is not worth binding your app to a specific database management system.
Have you considered the ancestry gem?
"It exposes all the standard tree structure relations (ancestors, parent, root, children, siblings, descendants) and all of them can be fetched in a single SQL query."
I'd agree with the accepted answer on one point - it's good to go for gems that are well maintained.
On two points I disagree:
Firstly, just because a gem is popular doesn't make it the right choice, or even a good choice. Take the ancestry gem as an example. It's been around a long time and is popular, but it requires you to add a special column to your tables which it fills with magic voodoo (I'm very uncomfortable with that sort of thing). Whereas a gem like acts_as_recursive_tree does all the same things as ancestry, also using single queries, but it only requires you to make a parent_id column that holds the ID of the parent - probably what you already have before even hunting for a gem.
Another example - there was a gem for linking uploaded files to records. I chose to use it because it seemed the popular choice. But I ditched it as soon as I discovered it was actually modelling a many-to-many relationship, not with a joining table, but by putting a comma-separated list of IDs into a single field (can you believe it?)
Secondly, if the database you have chosen has cool features like a recursive query implementation, then by all means use it - that's part of the reason you chose the superior database in the first place, isn't it? Unless you have a need for your application to be database-agnostic, don't be scared of using the features your database provides. Mitigating against the very unlikely possibility that sometime in the future you'll want to switch to a database that has fewer features than your current one is certainly not worth the cost of avoiding the more powerful gems that use the features of your database.
Anyway, my recommendation is acts_as_recursive_tree. It's very easy to use, powerful, and actively maintained.
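For anyone new to the term: a recursive query over a plain parent_id column roughly means "start from a root, then keep fetching rows whose parent is in the previous level until nothing new turns up", done inside the database in one round trip. The sketch below performs the same computation in plain Ruby over made-up category rows, purely to show the idea. It is not code from any of the gems mentioned, and it assumes the data contains no cycles.

```ruby
# Illustrative rows of a categories table: [id, parent_id, name].
ROWS = [
  [1, nil, "Electronics"],
  [2, 1, "Computers"],
  [3, 2, "Laptops"],
  [4, 1, "Phones"],
  [5, nil, "Books"]
].freeze

# Collect every descendant of root_id, one level at a time.
# Assumes the parent_id links form a tree (no cycles).
def descendant_names(rows, root_id)
  found = []
  frontier = [root_id]
  until frontier.empty?
    # Rows whose parent sits in the previous level.
    level = rows.select { |_id, parent_id, _name| frontier.include?(parent_id) }
    found.concat(level.map { |_id, _parent_id, name| name })
    frontier = level.map { |id, _parent_id, _name| id }
  end
  found
end

puts descendant_names(ROWS, 1).inspect
```

Doing this loop in application code costs one query per level, which is the expense a recursive query in the database avoids.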

Mongoid: Is there a utility to "sync" the DB's fields with the current "schema" as defined in my Models?

Sorry if the question is awkwardly phrased -- still a Mongo/Mongoid/Rails newbie here.
What I'm trying to ask:
In my development, I've been changing around the design of my Models as I go, adding some fields here, removing some fields there (one of the great things about MongoDB/Mongoid is that you can do this very quickly and easily!)
Everything is working, but in browsing through the development database, I've got some "detritus" -- documents with the old fields (and data) that aren't being used. It's no big deal other than to my garbage-collective sensibilities. I could, in theory, drop the DB and start from scratch, but that's messy.
Is there a utility / gem / etc. that will, essentially, look at the current document design and drop any fields in the live DB that don't match up to the data model?
I know this can be done manually, and I know about the mongoid migrations gems that are out there -- those are both good and, ultimately, more thorough solutions (which I'll look at).
For now, though, I'm wondering if there's a simple "quick shot" type of utility to simply sync up the DB and drop any fields that aren't explicitly specified in my models.
Thanks!
I don't know of any tools that do this for you. It's not a common ask because most people see this flexibility as a useful feature.
For your development database, you should probably clear it out and start over.
In production you have a couple choices:
Write your code to be robust against fields being missing and the database documents not matching your mongoid model. In other words, be prepared for the two to get out of sync.
Get in the habit of migrating your data every time you change the model. It's a bit of hassle and not strictly necessary if you follow the first, but if the untidiness bothers you, this is a fine idea. See Managing mongoid migrations for strategies.
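If you do end up writing a one-off cleanup, the core of it is just a set difference between the keys stored on a document and the fields the model declares. A minimal sketch on plain hashes, with DECLARED_FIELDS, stale_fields and pruned as hypothetical names; in a real app the declared list would come from the model and the removal would be an update against the collection:

```ruby
# Field names the model currently declares (hypothetical; in a real app this
# list would come from the model definition).
DECLARED_FIELDS = %w[_id title body].freeze

# Keys present in a stored document but no longer declared on the model.
def stale_fields(document, declared = DECLARED_FIELDS)
  document.keys - declared
end

# Return a copy of the document with the stale keys dropped.
def pruned(document, declared = DECLARED_FIELDS)
  document.slice(*declared)
end

stored = { "_id" => 1, "title" => "Hello", "subtitle" => "old", "rating" => 3 }
puts stale_fields(stored).inspect
puts pruned(stored).inspect
```

Listing the stale fields first and reviewing them before removing anything is a cheap safeguard against dropping data a model still needs.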

The Ruby community values simplicity...what's your argument for simplifying a db schema in a new project?

I'm working on a project with developers who have not worked with Ruby OR Rails before.
They have created a schema that is too complicated, in my opinion. The schema has 117 tables, and obtaining the simplest piece of information would require traversing/joining 7 tables...and of course, there's no "main" table that serves as a sort of key between them. The schema renders many of the Rails tools, like the 'find' method and many of the has_many/belongs_to relationships, almost useless. And coding for all of these relationships will likely take more time than we have the money for.
THE QUESTION:
Assuming you are VERY convinced (IMHO...hehe) that the schema is not ideal, and there are multiple ways to represent the domain, how would you argue FOR simplifying the schema (aside from what I've already said)?
I'll stand up in 2 roles here
DBA: Database admin/designer.
Dev: Application developer.
I assume the DBA is a person who really knows all the database tricks. Reaallyy knows.
DBA:
Database is the key of the application and should have predefined structure in order to serve its purpose well and with best performance.
If you cannot use an arbitrary schema (one that is reasonably normalised and sound), then the tools are wrong.
Dev:
The database is just a data store, so we need to keep it simple and concentrate on the application.
DBA:
Database is not a store it is the core of the application. There is no application without database.
Dev:
No. The application is the core. There is no application without the front-end and the business logic applied to it.
And the war begins...
Both points are valid and it is always a trade-off.
If the database will ONLY be used by RoR, then you can use it more like a simple store.
If the DB can be used by other applications OR it will hold a large amount of data under high traffic, it must enforce some best practices.
Generally there is no way you can disagree with the DBA.
But they can understand your situation and might allow you to loosen the standards a bit so you can be more productive.
So you need to work closely together.
And you need to talk to each other to explain and prove why the database should be like this or that.
Otherwise, the team is broken and the project is likely to fail.
ActiveRecord is a very handy tool. But it cannot do everything for you. Its default database structure will not be exactly what you expect, so it needs to be tuned.
On the one hand, if the DBA can accept that all PKs are auto-incremented integers, that would make the developer's life easier (ActiveRecord does it by default).
On the other hand, if developers would accept some of the DBA's constraints, it would make the DBA's life easier.
Now to answer your question:
how would you argue FOR simplifying the schema
Do not argue. Meet the team and explain WHY it should be done.
Maybe it really shouldn't and you don't know all the things, maybe they are not aware of something.
You could agree on the general structure of the database AND try to describe it using RoR migrations as a meta language.
This way they would see the general picture, and you would use your great ActiveRecords.
And also everybody would be on the same page.
Your DB schema should reflect the domain and its relationships.
De-normalisation should only be done when you have measured that there is a performance problem.
7 joins is not excessive or bad, provided you have good indexes in place.
The general way to make this argument up the chain is based on cost. If you do things simply, there will be less code and fewer bugs. The system will be able to be built more quickly, or with more features, and thus will create more ROI. If you can get the money manager on board with that approach, he or she may let you dictate terms to the team. There is the counterargument that extreme over-normalization prevents bad data, but I have found that this is not the case, as the complexity it engenders tends to lead to more errors and more database code in general.
The architectural and technical argument here is simple. You have decided to use Ruby on Rails. Therefore you have decided to use the ActiveRecord pattern. The ActiveRecord pattern is driven by having the database tables match the object model. That's the pattern in use here, and in many other places, so the best practices they are trying to apply for extreme data normalization simply do not apply. Buy a copy of Patterns of Enterprise Application Architecture and put the little red bookmark at page 160 so they can understand how the pattern works from the architecture perspective.
What the DBA types tend to be unaware of is how much work ActiveRecord does for you, from query generation, cascading deletes, optimistic locking, auto populated columns, versioning (with acts_as_versioned), soft deletes (with acts_as_paranoid), etc. There is a strong argument to use well tested, community supported library functions to perform these operations versus custom code that must be maintained by a DBA.
The real issue with DBAs is then that they need some work to do. Let them focus on monitoring performance, finding slow queries in the code, creating indexes and doing backups.
If you end up losing the political battle for a sane schema, you may want to consider switching to DataMapper. It's the next pattern in PoEAA. The other thing you may be able to get them to do is to create views in the database that correspond to the object model. This way, you could use many of the finding capabilities in the ActiveRecord model based on the views, but have custom insert, update, and delete methods.

MVC Implementation where a Search Engine is the Model

Maybe I am misstating the problem and conflating the answer with the question, but please hear me out. I would like to think (communally, with you) about a site that is based on any of the MVC frameworks (something PHP or ASP.NET MVC, whatever) that would use a search engine (lucene/solr, FAST ESP, whatever) as the back end of the Model. That is to say, there is no database per se in the project. Just a giant index of documents that are semistructured content.
I am looking to understand - and keep in mind the site is primarily read-only - where I am likely to run into trouble. What are the things that make you think this is a bad idea from the get go. Also, please assume that there will be a robust infrastructure with caching surrounding the search engine - so while perf comments are welcomed, we feel they are not the major problem.
Thanks!
In general, I'd use a tool like Lucene for searching content, and a database for retrieving it. That doesn't mean that it won't work. It's more a question of why you don't want to use a database. Yes, it can work, and it probably will work (depending on the functional requirements of the site, read on), but that still doesn't make a tool like Lucene the right tool for the job per se.
That being said, it does also depend on the kind of site. Is it really a site with just a whole bunch of searchable data and nothing else, or is it something much more than that? If the answer is the former, then good! If it is the latter, there are some issues I can think of:
Updates to the data can be troublesome. "Instant updates" are usually a no-go, as Lucene would have to rebuild its index, which is time-consuming. If there aren't many updates to the data that's fine. You can just recreate the index a couple of times per day, or nightly, if that works.
Trying to stuff any data in an index which is not really suited to be indexed is usually not a good idea. If the site lets users register on your site, then that user data should really go in a database. It's not impossible to store it in a lucene index, it's just not the right tool for the job. Use the index as a bunch of indexed documents, but don't use it as a database as well.
