Is it a good practice to directly write SQL query in rails? - ruby-on-rails

I am making an application with very huge data and multiple joins. Is it a bad practice to right away use the full sql string in rails? What are the downsides of writing the full sql query in rails?

It's only bad practice if you do it without understanding the alternatives.
That said there is rarely a reason to do this. The framework encapsulates it for you and the benefit is that you have to write less code. The other benefit is database independence. The more direct queries you write, the more likely you'll write something that will break when you switch database engines.
It is easy to test. If you are using the framework properly (i.e. optimizing ActiveRecord as you will find discussed in numerous articles) and still feel like your queries are too slow...you can always benchmark direct queries.
But not knowing how to do something using ActiveRecord associations is not a good reason to resort to direct SQL.
http://guides.rubyonrails.org/association_basics.html

SQL is not a 'bad practice' per se. Database systems have plenty of native SQL ways of doing things that would be much slower to execute and more complex to write and maintain if written in Ruby. Like Oracle's Analytic Functions.
That said, ActiveRecord is pretty easy to write and you probably aren't going to get a performance boost just by using a SQL query. At least not if the query you write resembles the query ActiveRecord would have written anyway! ;)
Perhaps you should try to work with ActiveRecord and only resort to SQL if you hit problems you can't solve another way. That way you keep your code simple until you need to do it another way (i.e. don't 'optimise early').
I generally try to make things work in ActiveRecord (or DataMapper or Sequel or whatever), but I have definitely resorted finder_sql when the job needed doing quickly and I couldn't get where I wanted to go using the ORM's 'sugar'. Other times I have based a rails object on a single massive view in the database.
Hope this helps.
:D

If you need more powerfull syntax than provides standard ActiveRecord module, see meta_where gem.

Related

There is no real 'prepared statement' in rails?

When we use ActiveRecord, we can use:
User.find(:first, :conditions=>["name=?", name])
It looks like ActiveRecord are using 'prepared statement', but after looking into the code, I found ActiveRecord just use String.dup and connection.quote() to adjust the content to build a sql, not like Java.
So, there is no real prepared statment in raiils? And why rails doesn't provide it?
If by prepared statement you mean a piece of SQL that is held in a compiled form for more efficient execution, then you're correct: it's not something implemented in Rails. There are several reasons for this:
Rails is the web framework - it doesn't know or care about databases. By default, Rails uses
ActiveRecord for object-relational mapping. You can change this if you want to - it doesn't have to be a RDBMS even.
ActiveRecord doesn't know about specific database platforms. For that it relies on "adapters", which translate AR's requirements into platform-specific SQL.
Some RDBMSs that have AR adapters support prepared statements (or equivalent) but others don't.
Rails 3.1 (or greater) supports prepared statements. Every new query you make will be added to pool of prepared statement. If you are are using MySql DB it still won't support prepared statement because using prepared statements in MySql reduces the performance compare to without prepared statement. Here is the nice article by Pat Shaughnessy on this.
In many (most?) ways, a DB View is a prepared statement. And it is easy to use views with ActiveRecord. Just declare the view in your DB (I use rake tasks) and then access it via ActiveRecord.
Works great for read-only access to complicated SQL. Saves the DB from many of the steps required to parse/compute SQL.

ORM that accepts SQL and simply maps the objects and relations?

I find that the use of ActiveRecord affects the way I design the database schema (though I wish it wouldn't). I'm thinking about the inefficiency of fetching data and how to reduce the overall number of queries. The find :include option can only get you so far. I come from writing stored procs that grab everything you need (for a particular screen or activity) in a single call.
I really don't need an API for writing my SQL. I'm entirely content to write T-SQL in a non-Ruby way. The only thing I want is to have my query results mapped to instantiated models and their associations. Are there any ORMs that take this approach? Ones that can handle multiple selects (stored procs?) and maybe even the use of temp tables...
EDIT:
I rephrased and clarified what I was really after with this question.
I find that the need for a quite complex query is about the 20% of a project, so using an ORM quite helps.
When that 20% arise, I find myself doing something alike what you ask, working excluvely with SQL. ActiveRecord and DataMapper have a find_by_sql method that helps you more, but doesn't instantiate all models (at least on ActiveRecord, if I'm not mistaken, that is).
Have you tried using Sequel? It's an ORM too but let's you have an easier approach and more flexibility on what you have.
Besides that, I can't think of a more focused solution on the ORMs realm. Keep in mind that the ORM tries to abstract the querying interface to simplify. If you are feeling confortable with using raw SQL, maybe you could be more productive with just an SQL facade interface.
look at the ActiveRecord find_by_sql() method.
http://github.com/rails/rails/blob/491f1b5f36a11427decc5d7f3558b5c3f5f243c1/activerecord/lib/active_record/base.rb#L660

Sequel in conjunction with ActiveRecord any gotchas?

I'm considering using Sequel for some of my hairier SQL that I find too hard to craft in Active Record.
Are there any things I need to be aware of when using Sequel and ActiveRecord on the same project? (Besides the obvious ones like no AR validations in sequel etc...)
Disclaimer: I'm the Sequel maintainer.
Sequel is easy to use along side of or instead of ActiveRecord when using Rails. You do have to setup the database connection manually, but other than that, the usage is similar. Your Sequel model files go in app/models and work similarly to ActiveRecord models.
Setting up the database connections isn't tedious, it's generally one line in environment.rb to require sequel, and a line in each environment file (development.rb, test.rb, production.rb) to do something like:
DB = Sequel.connect(...)
So it's only tedious if you consider 4 lines of setup code tedious.
Using raw SQL generally isn't a problem unless you are targeting multiple databases. The main reason to avoid it is the increased verbosity. Sequel supports using raw SQL at least as easily as ActiveRecord, but the times where you need to use raw SQL are generally fairly rare in Sequel.
BTW, Sequel ships with multiple validation plugins. The validation_class_methods plugin is similar to ActiveRecord validations, using class methods. The validation_helpers plugin has a simpler implementation using instance level methods, but both can do roughly the same thing.
Finally, I'll say that if you already have working ActiveRecord code that does what you want, it's probably not worth the effort to port the code to Sequel unless you plan on adding features.
Personally, I wouldn't do it. Just managing connection more-or-less by hand would be tedious, for a start. I'd be more inclined, if I felt Sequel was the stronger option, to hold off for Rails 3.0 (or perhaps start developing against Edge Rails) where it should be fairly easy to switch ORMs, if Yehuda and co are doing their stuff right. A lot more Merb-like than now, at least.
This was DHH's take on the subject (I'm not saying it should be taken as gospel truth, mind, but it is, so to speak, from the horse's mouth):
But Isn’t Sql Dirty?
Ever since programmers started to
layer object-oriented systems on top
of relational databases, they’ve
struggled with the question of how
deep to run the abstraction. Some
object-relational mappers seek to
eradicate the use of SQL entirely,
striving for object oriented purity by
forcing all queries through another OO
layer.
Active Record does not. It was built
upon the notion that SQL is neither
dirty nor bad, just verbose in the
trivial cases. The focus is on
removing the need to deal with the
verbosity in those trivial cases but
keeping the expressiveness around for
hard queries – the type SQL was
created to deal with elegantly.
Therefore, you shouldn’t feel guilty
when you use find_by_sql() to handle
either performance bottlenecks or hard
queries. Start out using the
object-oriented interface for
productivity and pleasure, and the dip
beneath the surface for a
close-to-the-metal experience when you
need to.
(Quote was found here, original text is on p334 of AWDRWR, the "hammock" book).
I think that's reasonable.
Are we talking about something that find_by_sql can't handle? Or are we talking about complex non-SELECT stuff that execute can't deal with?
Any examples we could look at?

If using LINQ to SQL is there any good reason to learn SQL queries/syntax anymore?

I do understand SQL querying and syntax because of previous work using ASP.NET web forms and stored procedures, but I would not call myself an "expert" in it.
Since I have been using ASP.NET MVC and LinqToSql it seems that so much of the heavy lifting is done for me and encapsulated away at the SQL end that I'm questioning whether there is any benefit in continuing to top-up my knowledge of SQL queries or whether I'm better off focusing my "learning time" on other things.
Your thoughts?
You should absolutely know SQL and keep your knowledge up-to-date. ORM is designed to ease the pain of doing something tedious that you know how to do, much like a graphing calculator is designed to do something that you can do by hand (and should know how).
The minute you start letting your ORM do things in the database that you don't fully understand is the minute you've lost control over your model.
In my opinion, knowing SQL is more valuable than any vendor specific technology. There will always be cases when those nice prepackaged frameworks will not be able to solve a particular situation and knowledge of advanced SQL will be required.
It is still important to learn SQL queries/syntax. The reason is you need to at least understand how Linq to SQL translate to the database behind the scenes.
This will help you when you find problems, for example something not updating correctly. Or a query performance needs to increase.
It is the same that you need to understand what assembly language is and how it eventually becomes machine language. However in all you don't have to be an expert, but at least be able to write in it and understand it.
It is still important to know SQL and the paradigm (set-based) behind it to be able to create efficient SQL statements, even if your using LinqToSql or any other OR/M.
There will always be situations where you will want to write the query in native SQL because it is not possible to write it in LinqToSql / HQL / whatever, or LinqToSql is just not able to generate a performant query for it.
There will always be situations where you will want to execute an ad-hoc query on a database using native sql, etc...
I think LinqToSQL (or other Linq to SQL providers) should not prevent you of knowing SQL.
When your query is not returning what you expect, or when it takes 30 minutes to run on the production database, you'd better be able to understand what LTS has generated, and why it is failing.
I know, it's a rehashed topic, and it might not be applicable to what you do ("small" database that will never hit that kind of problem etc), but it pays not to get too oblivious of abstraction layers sometimes.
The other reason is, Linq does not the whole range of what you can do in SQL, so you might have to resort to writing "raw" SQL, even if the result is materialised as objects.
It depends what you're working on, and from what you said it might make more sense to focus on other areas.
Having said that I find knowing SQL allows the following:
The ability to write queries to extract data from systems easily.
For adhoc queries, or for checking things.
The ability to write complex stored procedures, which allows me to group complex data processing in one place, where it should be, in the database.
The ability to fine tune LinqToSql by adding indexes, and understanding the SQL/query plan's it procedures.
Most of these are more of a help on more complex systems, so if you're not working on those it might not be as much of a help.
It may help in your situation to list the technologies which might be of use, and then prioritise them.
In order words make a development plan for yourself, which may encompass more then just learning technical knowledge but allow a more broad focus like design patterns, communication skills and other areas.
SQL is a tool. Linq to SQL is also a tool. Having more tools in your belt is a good thing. It'll give you more perspectives when attacking a problem.
Consider a scenario where you may want to do multiple queries or multiple updates to the db in one operation. If you can write TSQL you can potentially save yourself a lot of roundtrips to the database.
I would say you definately need to know your SQL in depth, because you need to know what code your Linq-expression generates and what effects the code will have if you want high performing queries. Sure you might get the job done in most cases, but sometimes there is a huge difference in performance in very subtle difference in Linq-syntax.
I ran into this this morning actually, where I had done .Any(d => d.Id == (...).First().Id) instead of doing where (...).Any(i => i.Id == d.Id). This resulted in the query executing five times slower.
Sometimes you need to analyze the actual Sql-query to realise the mistakes you make.
Its always a good think to learn the underlying language for stuff like Linq To SQL. SQL is pretty much standardized and it will help you understand a new paradigm in programming.
You may not always be working in .NET.
Doesn't hurt to know the underlying concepts.
LINQ to SQL is not being maintained anymore in favor of the Entity Framework
Sooner or later you will run into problems that need at leat a working knowledge of SQL to solve. And sooner or later you will run into requirements that are best realised in the DB (whether in SP-s or in triggers or views or whaterver).
LINQ To SQL will only work with .NET. IF you happen get another job where you are not working with .NET, then you will have to go back to writing Stored Procs.
Knowing SQL will also give you a better understanding of how the server operates as well as possibly making you a better database designer.

What are alternatives to find_by_sql for computationaly-heavy queries?

Our company loves reports that calculate obscure metrics--metrics that cannot be calculated with ActiveRecord's finders (except find_by_sql) and where ruport's ruby-based capabilities are just too slow.
Is there a plugin or gem or db adapter out there that will do large calculations in the database layer? What's your solution to creating intricate reports?
Although not database agnostic, our solution is plpgsql functions where it becomes really slow to use Ruby and ActiveRecord.
Thoughtbot's Squirrel plugin adds a lot of Ruby-ish functionality to ActiveRecord's find method, with multi-layered conditionals, ranges, and nested model associations:
www.thoughtbot.com/projects/squirrel/
Is there anything inherent about your reports that prevents the use of an SQL view or stored procedure?
In one particular project, a technique I often find useful is to create your SQL query (that may be quite complex) as a named view in the database, and then use
YourModel.connection.select_all(query)
to pull back the data. It's not an optimal approach; I'm keen to explore improvements to it.
Unfortunately, as you suggested, the support for doing computing complex database-based reports within rails seems fairly limited.
It sounds as if your tables could be normalized. At one place I worked, the amount of normalization we did was impacting our reporting needs, so we created some shadow tables that contained a bunch of the aggregate data, and did reporting against that.
I agree with Neil N's comment that the question is a little vague, but perhaps this gets you moving in the right direction?
You might want to look at using DataMapper or Sequel for your ORM if you're finding that ActiveRecord lacks the expressiveness you need for complex queries. Switching away from ActiveRecord wouldn't be a decision to take likely, but it might be worth investigating at least.

Resources