ORM that accepts SQL and simply maps the objects and relations? - ruby-on-rails

I find that the use of ActiveRecord affects the way I design the database schema (though I wish it wouldn't). I'm thinking about the inefficiency of fetching data and how to reduce the overall number of queries. The find :include option can only get you so far. I come from writing stored procs that grab everything you need (for a particular screen or activity) in a single call.
I really don't need an API for writing my SQL. I'm entirely content to write T-SQL in a non-Ruby way. The only thing I want is to have my query results mapped to instantiated models and their associations. Are there any ORMs that take this approach? Ones that can handle multiple selects (stored procs?) and maybe even the use of temp tables...
EDIT:
I rephrased and clarified what I was really after with this question.

I find that the need for a quite complex query is about the 20% of a project, so using an ORM quite helps.
When that 20% arise, I find myself doing something alike what you ask, working excluvely with SQL. ActiveRecord and DataMapper have a find_by_sql method that helps you more, but doesn't instantiate all models (at least on ActiveRecord, if I'm not mistaken, that is).
Have you tried using Sequel? It's an ORM too but let's you have an easier approach and more flexibility on what you have.
Besides that, I can't think of a more focused solution on the ORMs realm. Keep in mind that the ORM tries to abstract the querying interface to simplify. If you are feeling confortable with using raw SQL, maybe you could be more productive with just an SQL facade interface.

look at the ActiveRecord find_by_sql() method.
http://github.com/rails/rails/blob/491f1b5f36a11427decc5d7f3558b5c3f5f243c1/activerecord/lib/active_record/base.rb#L660

Related

Is it a good practice to directly write SQL query in rails?

I am making an application with very huge data and multiple joins. Is it a bad practice to right away use the full sql string in rails? What are the downsides of writing the full sql query in rails?
It's only bad practice if you do it without understanding the alternatives.
That said there is rarely a reason to do this. The framework encapsulates it for you and the benefit is that you have to write less code. The other benefit is database independence. The more direct queries you write, the more likely you'll write something that will break when you switch database engines.
It is easy to test. If you are using the framework properly (i.e. optimizing ActiveRecord as you will find discussed in numerous articles) and still feel like your queries are too slow...you can always benchmark direct queries.
But not knowing how to do something using ActiveRecord associations is not a good reason to resort to direct SQL.
http://guides.rubyonrails.org/association_basics.html
SQL is not a 'bad practice' per se. Database systems have plenty of native SQL ways of doing things that would be much slower to execute and more complex to write and maintain if written in Ruby. Like Oracle's Analytic Functions.
That said, ActiveRecord is pretty easy to write and you probably aren't going to get a performance boost just by using a SQL query. At least not if the query you write resembles the query ActiveRecord would have written anyway! ;)
Perhaps you should try to work with ActiveRecord and only resort to SQL if you hit problems you can't solve another way. That way you keep your code simple until you need to do it another way (i.e. don't 'optimise early').
I generally try to make things work in ActiveRecord (or DataMapper or Sequel or whatever), but I have definitely resorted finder_sql when the job needed doing quickly and I couldn't get where I wanted to go using the ORM's 'sugar'. Other times I have based a rails object on a single massive view in the database.
Hope this helps.
:D
If you need more powerfull syntax than provides standard ActiveRecord module, see meta_where gem.

Model for ASP.NET MVC

I just started with ASP.NET MVC 1.0. I've read through some tutorials but I don't have any good ideas on how to build my model.
I've been experimenting with LINQ to SQL. Can't say I like it that much, but I'll give it a try. I prefer using SQL Stored Procedures, but it doesn't seem to work very good together with the optional parameters in my stored procedures.
The project is a prototype, but I would like to start with a good design right away.
The database will probably contain around 50 tables and stored procedures will be used for all queries and updates. Some procedures will also return multiple results. How would the best way to design this model?
All and any ideas are welcome. Should I even use LINQ to SQL? Should I design my stored procedures to only return one result? I feel a little bit lost.
You can use stored procedures with LINQ. You simply need to replace the autogenerated statements on your entities with your stored procs. Look at the properties for the entity (table) in the designer to change these. Your stored procedures won't be able to return multiple results as they need to map onto a collection of either one of your entities or an autogenerated entity, i.e, onto a collection of a single class.
Having said all that, I'd give it a go without using stored procedures. The LINQ code is fully parameterized so you already have this benefit. You'll find that LINQ allows you to easily build up queries alleviating your need for the optional parameters in your stored procs and, likely, resulting in more efficient queries as you don't need to build extra logic into your query to handle the optional parameters. Once you start using LINQ I think you'll find that the power and expressiveness in the code will make you forget about using your stored procedures.
You may still have a need for stored procs or table-valued functions for complex queries -- that's ok. If they map onto an existing entity, you can simply drag the proc/function onto the entity in the designer and you get a nice method on your data context that runs your SQL and returns collections of that entity. If not, I often create a view for the proc/function schema, use that as an entity, and attach the proc/function to it.
May I ask why you need to use stored procedures for this project? They have almost no value today if you are working with a modern database. Parameterized SQL using a good ORM will give you the same cached execution plans and security benefits that you are looking for with stored procedures.
I would recommend NHibernate because it's a far better way to achieve persistence ignorance. And IMHO this is the goal of anyone designing a model. You should be thinking object first instead of data first.
The best resource around is a screencast series by Steve Bohlen "The summer of nhibernate"
Also LINQ to NHibernate is available and works great for any dynamic query that you might need to write.
The only pain you will notice with NHibernate is the XML, but if you really enjoy the ORM you could write a fluent interface instead using FNH.
I found NHibernate to be the best tool around for modeling a domain because your objects are 100% infrastructure free and this allows you to do anything you want with them without having to worry about the database. Also if you are into unit testing of any kind this will make your life much easier :P
Edit
Part of the reason I replied to your question is that I have built many a system using stored procedures and found to regret that decision later. But in my situation I had a requirement by a dba so I couldn't step away from the norm.
If you are already looking at LINQ to SQL I would simply encourage you to look at NHibernate as another great option to achieve the goal of persistence ignorance.
When I got away from thinking about the problem data first it allowed me to work at a much higher level and take advantage of the .net platform for solving problems. TSQL has its place but if you are writing an application and want to maintain it - try to think about how your domain objects will work together in memory first.

Sequel in conjunction with ActiveRecord any gotchas?

I'm considering using Sequel for some of my hairier SQL that I find too hard to craft in Active Record.
Are there any things I need to be aware of when using Sequel and ActiveRecord on the same project? (Besides the obvious ones like no AR validations in sequel etc...)
Disclaimer: I'm the Sequel maintainer.
Sequel is easy to use along side of or instead of ActiveRecord when using Rails. You do have to setup the database connection manually, but other than that, the usage is similar. Your Sequel model files go in app/models and work similarly to ActiveRecord models.
Setting up the database connections isn't tedious, it's generally one line in environment.rb to require sequel, and a line in each environment file (development.rb, test.rb, production.rb) to do something like:
DB = Sequel.connect(...)
So it's only tedious if you consider 4 lines of setup code tedious.
Using raw SQL generally isn't a problem unless you are targeting multiple databases. The main reason to avoid it is the increased verbosity. Sequel supports using raw SQL at least as easily as ActiveRecord, but the times where you need to use raw SQL are generally fairly rare in Sequel.
BTW, Sequel ships with multiple validation plugins. The validation_class_methods plugin is similar to ActiveRecord validations, using class methods. The validation_helpers plugin has a simpler implementation using instance level methods, but both can do roughly the same thing.
Finally, I'll say that if you already have working ActiveRecord code that does what you want, it's probably not worth the effort to port the code to Sequel unless you plan on adding features.
Personally, I wouldn't do it. Just managing connection more-or-less by hand would be tedious, for a start. I'd be more inclined, if I felt Sequel was the stronger option, to hold off for Rails 3.0 (or perhaps start developing against Edge Rails) where it should be fairly easy to switch ORMs, if Yehuda and co are doing their stuff right. A lot more Merb-like than now, at least.
This was DHH's take on the subject (I'm not saying it should be taken as gospel truth, mind, but it is, so to speak, from the horse's mouth):
But Isn’t Sql Dirty?
Ever since programmers started to
layer object-oriented systems on top
of relational databases, they’ve
struggled with the question of how
deep to run the abstraction. Some
object-relational mappers seek to
eradicate the use of SQL entirely,
striving for object oriented purity by
forcing all queries through another OO
layer.
Active Record does not. It was built
upon the notion that SQL is neither
dirty nor bad, just verbose in the
trivial cases. The focus is on
removing the need to deal with the
verbosity in those trivial cases but
keeping the expressiveness around for
hard queries – the type SQL was
created to deal with elegantly.
Therefore, you shouldn’t feel guilty
when you use find_by_sql() to handle
either performance bottlenecks or hard
queries. Start out using the
object-oriented interface for
productivity and pleasure, and the dip
beneath the surface for a
close-to-the-metal experience when you
need to.
(Quote was found here, original text is on p334 of AWDRWR, the "hammock" book).
I think that's reasonable.
Are we talking about something that find_by_sql can't handle? Or are we talking about complex non-SELECT stuff that execute can't deal with?
Any examples we could look at?

What are alternatives to find_by_sql for computationaly-heavy queries?

Our company loves reports that calculate obscure metrics--metrics that cannot be calculated with ActiveRecord's finders (except find_by_sql) and where ruport's ruby-based capabilities are just too slow.
Is there a plugin or gem or db adapter out there that will do large calculations in the database layer? What's your solution to creating intricate reports?
Although not database agnostic, our solution is plpgsql functions where it becomes really slow to use Ruby and ActiveRecord.
Thoughtbot's Squirrel plugin adds a lot of Ruby-ish functionality to ActiveRecord's find method, with multi-layered conditionals, ranges, and nested model associations:
www.thoughtbot.com/projects/squirrel/
Is there anything inherent about your reports that prevents the use of an SQL view or stored procedure?
In one particular project, a technique I often find useful is to create your SQL query (that may be quite complex) as a named view in the database, and then use
YourModel.connection.select_all(query)
to pull back the data. It's not an optimal approach; I'm keen to explore improvements to it.
Unfortunately, as you suggested, the support for doing computing complex database-based reports within rails seems fairly limited.
It sounds as if your tables could be normalized. At one place I worked, the amount of normalization we did was impacting our reporting needs, so we created some shadow tables that contained a bunch of the aggregate data, and did reporting against that.
I agree with Neil N's comment that the question is a little vague, but perhaps this gets you moving in the right direction?
You might want to look at using DataMapper or Sequel for your ORM if you're finding that ActiveRecord lacks the expressiveness you need for complex queries. Switching away from ActiveRecord wouldn't be a decision to take likely, but it might be worth investigating at least.

Why all the Active Record hate? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
As I learn more and more about OOP, and start to implement various design patterns, I keep coming back to cases where people are hating on Active Record.
Often, people say that it doesn't scale well (citing Twitter as their prime example) -- but nobody actually explains why it doesn't scale well; and / or how to achieve the pros of AR without the cons (via a similar but different pattern?)
Hopefully this won't turn into a holy war about design patterns -- all I want to know is ****specifically**** what's wrong with Active Record.
If it doesn't scale well, why not?
What other problems does it have?
There's ActiveRecord the Design Pattern and ActiveRecord the Rails ORM Library, and there's also a ton of knock-offs for .NET, and other languages.
These are all different things. They mostly follow that design pattern, but extend and modify it in many different ways, so before anyone says "ActiveRecord Sucks" it needs to be qualified by saying "which ActiveRecord, there's heaps?"
I'm only familiar with Rails' ActiveRecord, I'll try address all the complaints which have been raised in context of using it.
#BlaM
The problem that I see with Active Records is, that it's always just about one table
Code:
class Person
belongs_to :company
end
people = Person.find(:all, :include => :company )
This generates SQL with LEFT JOIN companies on companies.id = person.company_id, and automatically generates associated Company objects so you can do people.first.company and it doesn't need to hit the database because the data is already present.
#pix0r
The inherent problem with Active Record is that database queries are automatically generated and executed to populate objects and modify database records
Code:
person = Person.find_by_sql("giant complicated sql query")
This is discouraged as it's ugly, but for the cases where you just plain and simply need to write raw SQL, it's easily done.
#Tim Sullivan
...and you select several instances of the model, you're basically doing a "select * from ..."
Code:
people = Person.find(:all, :select=>'name, id')
This will only select the name and ID columns from the database, all the other 'attributes' in the mapped objects will just be nil, unless you manually reload that object, and so on.
I have always found that ActiveRecord is good for quick CRUD-based applications where the Model is relatively flat (as in, not a lot of class hierarchies). However, for applications with complex OO hierarchies, a DataMapper is probably a better solution. While ActiveRecord assumes a 1:1 ratio between your tables and your data objects, that kind of relationship gets unwieldy with more complex domains. In his book on patterns, Martin Fowler points out that ActiveRecord tends to break down under conditions where your Model is fairly complex, and suggests a DataMapper as the alternative.
I have found this to be true in practice. In cases, where you have a lot inheritance in your domain, it is harder to map inheritance to your RDBMS than it is to map associations or composition.
The way I do it is to have "domain" objects that are accessed by your controllers via these DataMapper (or "service layer") classes. These do not directly mirror the database, but act as your OO representation for some real-world object. Say you have a User class in your domain, and need to have references to, or collections of other objects, already loaded when you retrieve that User object. The data may be coming from many different tables, and an ActiveRecord pattern can make it really hard.
Instead of loading the User object directly and accessing data using an ActiveRecord style API, your controller code retrieves a User object by calling the API of the UserMapper.getUser() method, for instance. It is that mapper that is responsible for loading any associated objects from their respective tables and returning the completed User "domain" object to the caller.
Essentially, you are just adding another layer of abstraction to make the code more managable. Whether your DataMapper classes contain raw custom SQL, or calls to a data abstraction layer API, or even access an ActiveRecord pattern themselves, doesn't really matter to the controller code that is receiving a nice, populated User object.
Anyway, that's how I do it.
I think there is a likely a very different set of reasons between why people are "hating" on ActiveRecord and what is "wrong" with it.
On the hating issue, there is a lot of venom towards anything Rails related. As far as what is wrong with it, it is likely that it is like all technology and there are situations where it is a good choice and situations where there are better choices. The situation where you don't get to take advantage of most of the features of Rails ActiveRecord, in my experience, is where the database is badly structured. If you are accessing data without primary keys, with things that violate first normal form, where there are lots of stored procedures required to access the data, you are better off using something that is more of just a SQL wrapper. If your database is relatively well structured, ActiveRecord lets you take advantage of that.
To add to the theme of replying to commenters who say things are hard in ActiveRecord with a code snippet rejoinder
#Sam McAfee Say you have a User class in your domain, and need to have references to, or collections of other objects, already loaded when you retrieve that User object. The data may be coming from many different tables, and an ActiveRecord pattern can make it really hard.
user = User.find(id, :include => ["posts", "comments"])
first_post = user.posts.first
first_comment = user.comments.first
By using the include option, ActiveRecord lets you override the default lazy-loading behavior.
My long and late answer, not even complete, but a good explanation WHY I hate this pattern, opinions and even some emotions:
1) short version: Active Record creates a "thin layer" of "strong binding" between the database and the application code. Which solves no logical, no whatever-problems, no problems at all. IMHO it does not provide ANY VALUE, except some syntactic sugar for the programmer (which may then use an "object syntax" to access some data, that exists in a relational database). The effort to create some comfort for the programmers should (IMHO...) better be invested in low level database access tools, e.g. some variations of simple, easy, plain hash_map get_record( string id_value, string table_name, string id_column_name="id" ) and similar methods (of course, the concepts and elegance greatly varies with the language used).
2) long version: In any database-driven projects where I had the "conceptual control" of things, I avoided AR, and it was good. I usually build a layered architecture (you sooner or later do divide your software in layers, at least in medium- to large-sized projects):
A1) the database itself, tables, relations, even some logic if the DBMS allows it (MySQL is also grown-up now)
A2) very often, there is more than a data store: file system (blobs in database are not always a good decision...), legacy systems (imagine yourself "how" they will be accessed, many varieties possible.. but thats not the point...)
B) database access layer (at this level, tool methods, helpers to easily access the data in the database are very welcome, but AR does not provide any value here, except some syntactic sugar)
C) application objects layer: "application objects" sometimes are simple rows of a table in the database, but most times they are compound objects anyway, and have some higher logic attached, so investing time in AR objects at this level is just plainly useless, a waste of precious coders time, because the "real value", the "higher logic" of those objects needs to be implemented on top of the AR objects, anyway - with and without AR! And, for example, why would you want to have an abstraction of "Log entry objects"? App logic code writes them, but should that have the ability to update or delete them? sounds silly, and App::Log("I am a log message") is some magnitudes easier to use than le=new LogEntry(); le.time=now(); le.text="I am a log message"; le.Insert();. And for example: using a "Log entry object" in the log view in your application will work for 100, 1000 or even 10000 log lines, but sooner or later you will have to optimize - and I bet in most cases, you will just use that small beautiful SQL SELECT statement in your app logic (which totally breaks the AR idea..), instead of wrapping that small statement in rigid fixed AR idea frames with lots of code wrapping and hiding it. The time you wasted with writing and/or building AR code could have been invested in a much more clever interface for reading lists of log-entries (many, many ways, the sky is the limit). Coders should dare to invent new abstractions to realize their application logic that fit the intended application, and not stupidly re-implement silly patterns, that sound good on first sight!
D) the application logic - implements the logic of interacting objects and creating, deleting and listing(!) of application logic objects (NO, those tasks should rarely be anchored in the application logic objects itself: does the sheet of paper on your desk tell you the names and locations of all other sheets in your office? forget "static" methods for listing objects, thats silly, a bad compromise created to make the human way of thinking fit into [some-not-all-AR-framework-like-]AR thinking)
E) the user interface - well, what I will write in the following lines is very, very, very subjective, but in my experience, projects that built on AR often neglected the UI part of an application - time was wasted on creation obscure abstractions. In the end such applications wasted a lot of coders time and feel like applications from coders for coders, tech-inclined inside and outside. The coders feel good (hard work finally done, everything finished and correct, according to the concept on paper...), and the customers "just have to learn that it needs to be like that", because thats "professional".. ok, sorry, I digress ;-)
Well, admittedly, this all is subjective, but its my experience (Ruby on Rails excluded, it may be different, and I have zero practical experience with that approach).
In paid projects, I often heard the demand to start with creating some "active record" objects as a building block for the higher level application logic. In my experience, this conspicuously often was some kind of excuse for that the customer (a software dev company in most cases) did not have a good concept, a big view, an overview of what the product should finally be. Those customers think in rigid frames ("in the project ten years ago it worked well.."), they may flesh out entities, they may define entities relations, they may break down data relations and define basic application logic, but then they stop and hand it over to you, and think thats all you need... they often lack a complete concept of application logic, user interface, usability and so on and so on... they lack the big view and they lack love for the details, and they want you to follow that AR way of things, because.. well, why, it worked in that project years ago, it keeps people busy and silent? I don't know. But the "details" separate the men from the boys, or .. how was the original advertisement slogan ? ;-)
After many years (ten years of active development experience), whenever a customer mentions an "active record pattern", my alarm bell rings. I learned to try to get them back to that essential conceptional phase, let them think twice, try them to show their conceptional weaknesses or just avoid them at all if they are undiscerning (in the end, you know, a customer that does not yet know what it wants, maybe even thinks it knows but doesn't, or tries to externalize concept work to ME for free, costs me many precious hours, days, weeks and months of my time, live is too short ... ).
So, finally: THIS ALL is why I hate that silly "active record pattern", and I do and will avoid it whenever possible.
EDIT: I would even call this a No-Pattern. It does not solve any problem (patterns are not meant to create syntactic sugar). It creates many problems: the root of all its problems (mentioned in many answers here..) is, that it just hides the good old well-developed and powerful SQL behind an interface that is by the patterns definition extremely limited.
This pattern replaces flexibility with syntactic sugar!
Think about it, which problem does AR solve for you?
Some messages are getting me confused.
Some answers are going to "ORM" vs "SQL" or something like that.
The fact is that AR is just a simplification programming pattern where you take advantage of your domain objects to write there database access code.
These objects usually have business attributes (properties of the bean) and some behaviour (methods that usually work on these properties).
The AR just says "add some methods to these domain objects" to database related tasks.
And I have to say, from my opinion and experience, that I do not like the pattern.
At first sight it can sound pretty good. Some modern Java tools like Spring Roo uses this pattern.
For me, the real problem is just with OOP concern. AR pattern forces you in some way to add a dependency from your object to infraestructure objects. These infraestructure objects let the domain object to query the database through the methods suggested by AR.
I have always said that two layers are key to the success of a project. The service layer (where the bussiness logic resides or can be exported through some kind of remoting technology, as Web Services, for example) and the domain layer. In my opinion, if we add some dependencies (not really needed) to the domain layer objects for resolving the AR pattern, our domain objects will be harder to share with other layers or (rare) external applications.
Spring Roo implementation of AR is interesting, because it does not rely on the object itself, but in some AspectJ files. But if later you do not want to work with Roo and have to refactor the project, the AR methods will be implemented directly in your domain objects.
Another point of view. Imagine we do not use a Relational Database to store our objects. Imagine the application stores our domain objects in a NoSQL Database or just in XML files, for example. Would we implement the methods that do these tasks in our domain objects? I do not think so (for example, in the case of XM, we would add XML related dependencies to our domain objects...Truly sad I think). Why then do we have to implement the relational DB methods in the domain objects, as the Ar pattern says?
To sum up, the AR pattern can sound simpler and good for small and simple applications. But, when we have complex and large apps, I think the classical layered architecture is a better approach.
The question is about the Active
Record design pattern. Not an orm
Tool.
The original question is tagged with rails and refers to Twitter which is built in Ruby on Rails. The ActiveRecord framework within Rails is an implementation of Fowler's Active Record design pattern.
The main thing that I've seen with regards to complaints about Active Record is that when you create a model around a table, and you select several instances of the model, you're basically doing a "select * from ...". This is fine for editing a record or displaying a record, but if you want to, say, display a list of the cities for all the contacts in your database, you could do "select City from ..." and only get the cities. Doing this with Active Record would require that you're selecting all the columns, but only using City.
Of course, varying implementations will handle this differently. Nevertheless, it's one issue.
Now, you can get around this by creating a new model for the specific thing you're trying to do, but some people would argue that it's more effort than the benefit.
Me, I dig Active Record. :-)
HTH
Although all the other comments regarding SQL optimization are certainly valid, my main complaint with the active record pattern is that it usually leads to impedance mismatch. I like keeping my domain clean and properly encapsulated, which the active record pattern usually destroys all hope of doing.
I love the way SubSonic does the one column only thing.
Either
DataBaseTable.GetList(DataBaseTable.Columns.ColumnYouWant)
, or:
Query q = DataBaseTable.CreateQuery()
.WHERE(DataBaseTable.Columns.ColumnToFilterOn,value);
q.SelectList = DataBaseTable.Columns.ColumnYouWant;
q.Load();
But Linq is still king when it comes to lazy loading.
#BlaM:
Sometimes I justed implemented an active record for a result of a join. Doesn't always have to be the relation Table <--> Active Record. Why not "Result of a Join statement" <--> Active Record ?
I'm going to talk about Active Record as a design pattern, I haven't seen ROR.
Some developers hate Active Record, because they read smart books about writing clean and neat code, and these books states that active record violates single resposobility principle, violates DDD rule that domain object should be persistant ignorant, and many other rules from these kind of books.
The second thing domain objects in Active Record tend to be 1-to-1 with database, that may be considered a limitation in some kind of systems (n-tier mostly).
Thats just abstract things, i haven't seen ruby on rails actual implementation of this pattern.
The problem that I see with Active Records is, that it's always just about one table. That's okay, as long as you really work with just that one table, but when you work with data in most cases you'll have some kind of join somewhere.
Yes, join usually is worse than no join at all when it comes to performance, but join usually is better than "fake" join by first reading the whole table A and then using the gained information to read and filter table B.
The problem with ActiveRecord is that the queries it automatically generates for you can cause performance problems.
You end up doing some unintuitive tricks to optimize the queries that leave you wondering if it would have been more time effective to write the query by hand in the first place.
Try doing a many to many polymorphic relationship. Not so easy. Especially when you aren't using STI.

Resources