Rails - Large database queries - ruby-on-rails

Let's say I need to implement a search algorithm for a product catalog database. This would include multiple joins across the products table, manufacturers table, inventory table, etc. etc.
In .NET / MSSQL, I would isolate such logic in a DB stored procedure, then write a wrapper method in my data access layer of my .NET app to simply call this stored procedure.
How does something like this work in RoR? From my basic understanding, RoR uses its ORM by default. Does this mean, I have to move my search logic into the application layer, and write it using its ORM? The SQL stored proc is pretty intense... For performance, it needs to be in the stored procedure.
How does this work in RoR?
Edit: From the first two responses, I gather that ActiveRecord is the way to do things in Ruby. Does this mean that applications that require large complex queries with lots of joins, filtering and even dynamic SQL can (should) be re-written using ActiveRecord classes?
Thanks!

While it is possible to run raw SQL statements in Rails, using the execute method on a connection object, by doing so you will forfeit all the benefits of ActiveRecord. If you still want to go down this path, you can use it like so:
ActiveRecord::Base.connection.execute("call stored_procedure_name")
Another option to explore might be to create a "query object" to encapsulate your query logic. Inside, you could still use ActiveRecord query methods. ActiveRecord has become fairly proficient in optimizing your SQL queries, and there is still some manual tweaking you could do.
Below is a simple scaffoold for such an object:
# app/queries/search_products.rb
class SearchProducts
def initialize(params)
#search = search
end
def call
Product.where(...) # Plus additional search logic
end
end
The third option would be to go with something like elasticsearch-rails or sunspot. This will require some additional setup time and added complexity, but might pay off down the line, if your search requirements change.

Stored procedure is one of way to make the apps can be faster sometimes but it will need high costs and time to debug code for developer. Rails is using ActiveRecord ORM so if you want to use stored procedure that will lead to the main function ActiveRecord unused well.
There are some explains about rails
stored-procedures-in-ruby-on-rails and using stored procedure

Related

Using ActiveRecord interface for Models backed by external API in Ruby on Rails

I'm trying to use Models in my Rails application that retrieve information from an external API. What I would like to do is access my data models (which may consist of information resulting from multiple API calls) in a way similar to what an ActiveRecord model would provide (specifically associations, and the same style of chain-able query methods).
My initial instinct was to recreate the parts of ActiveRecord that I wanted and incorporate this API. Not wanting to 'reinvent the wheel' and seeing exactly how much work would be required to add more functionality have made me take a step back and reevaluate how to approach this.
I have found ways to use ActiveRecord without a table (see: Railscast #193 Tableless Model and the blog post here) and looked into ActiveRecord. Because ActiveModel only seems to include Validations I'm not sure that's very helpful in this situation. The workaround to using ActiveRecord without a table seems like the best option, but I suspect there's a cleaner way of doing this that I'm just not seeing.
Here is a gist containing some of the code written when I was trying to recreate the ActiveRecord functionality, borrowing heavily from the ActiveRecord source itself.
My question boils down to: I can get the functionality I want (chaining query methods, relations) by either implementing the workaround to ActiveRecord specified above or recreating the functionality myself, but are these really ideal solutions?
Remember that Rails is still just Ruby underneath.
You could represent the external API as instantiated classes within your application.
class Event
def self.find(id)
#...External http call to get some JSON...#
new(json_from_api)
end
def initialize(json)
#...set up your object here...#
end
def attendees
#...external http call to get some JSON and then assemble it
#...into an array of other objects
end
end
So you end up writing local abstractions to create ruby objects from api calls, you can probably also mix in ActiveModel, or Virtus into it, so you can use hash assignment of attributes, and validations for forms etc.
Take a look at an API abstraction I did for the TfL feed for the tube. service_disruption

Rails App Interface Architecture Design

I have a rails application that serves as an interface to a hybrid of data. Most of the information I require is retrieved from the command-line program using XML-RPC. Aside from this, I require some additional bit of data which I have no option but to store in a database. For this reason, I am having trouble figuring out what would be the best way to design the application.
I have overridden self.all and self.find(id) such that they rely on calls to super and then "enrich" the object by defining its instance variables to the appropriate data retrieved from the program using XML-RPC.
This all seems pretty convoluted though. For example, I imagine I have lost the ability to use the magic finders (find_by_x), and I don't know if anything else will break as a result of this.
My question is if there is a more logical and sensible way of going on about doing this. That is, designing an application that depends on XML-RPC for its data for the most part, but also some data stored in a database.
I did read about after_find. Using this callback, I can implement the "object enriching" process and have it run anytime there is a found record. However, my method of retrieving data associated with an item is different than that of retrieving all item data. The way I do it for retrieving all item data (self.all) is way more efficient, but unfortunately not applicable, to retrieving only one item's data (self.find). This would work well if there were a way I could make the callback not apply to self.all calls.
In my experience, you shouldn't mess with ActiveRecord's finders - there is a lot of magic that they rely on.
after_find is a great direction to start with, but if you're having issues with batching, then what I'd recommend is twofold - use a caching layer and alias_method_chain to implement a version of #all that performs your batched XML-RPC find, caches it, and then pass the call through to the unaliased original all. Then, your after_find would check your cache for data first, and if it's not there, perform the remote find. This would let you successfully batch data for all finds while utilizing the callback.
That said, there is probably an easier way to do this. I would just use models that don't descend from ActiveRecord::Base, but rather, which descend from some XMLRPC base interface, and then have faux associations on them that point to AR instances with your database information. Thus, you might have something like:
class XmlRpcModelBase
...
def find(...)
end
def all(...)
end
def extra_data
#extra_data ||= SomeActiveRecordModel.find(...)
end
end
class Foo < XmlRpcModelBase
end
It's not ideal, and honestly, it's going to depend a lot on how much of this is read, and how much is read/write, but I would try to stay out of ActiveRecord's way where possible, and instead just bolt on the AR-related pieces as necessary.

How to use ActiveRecord Query Cache with Custom SQL

In a stats part of a Rails app, I have some custom SQL calls that are called with ActiveRecord::Base.execute() from the model code. They return various aggregates.
Some of these (identical) queries are run in a loop in the controller, and it seems that they aren't cached by the ActiveRecord query cache.
Is there any way to cache custom SQL queries within a single request?
Not sure if AR supports query caching for #execute, you might want to dig in the documentation.
Anyway, what you can do, is to use memoization, which means that you'll keep the results manually until the current request is over.
do something like this in your model:
def repeating_method_with_execute
#rs ||= ActiveRecord::Base.execute(...)
end
This will basically run the query only on the first time and then save the response to #rs until the entire request is over.
If I am not wrong, Rails 2.x already has a macro named memoization on ActiveRecord that does all that automatically
hope it helps

Rails: Accessing a database not meant for Rails?

I have a standard rails application, that uses a mysql database through Active Record, with data loaded through a separate parsing process from a rather large XML file.
This was all well and good, but now I need to load data from an Oracle database, rather than the XML file.
I have no control how the database looks, and only really need a fraction of the data it contains (maybe one or two columns out of a few tables). As such, what I really want to do is make a call to the database, get data back, and put the data in the appropriate locations in my existing, Rails friendly mysql database.
How would I go about doing this? I've heard* you can (on a model by model basis) specifiy different databases for Rails Models to use, but that sounds like they use them in their entirety, (that is, the database is Rails friendly). Can I make direct Oracle calls? Is there a process that makes this easier? Can Active Record itself handle this?
A toy example:
If I need to know color, price, and location for an Object, then normally I would parse a huge XML file to get this information. Now, with oracle, color, price, and location are all in different tables, indexed by some ID (there isn't actually an "Object" table). I want to pull all this information together into my Rails model.
Edit: Sounds like what I'd heard about was ActiveRecord's "establish_connection" method...and it does indeed seem to assume one model is mapped to one table in the target database, which isn't true in my case.
Edit Edit: Ah, looks like I might be wrong there. "establish_connection" might handle my situation just fine (just gotta get ORACLE working in the first place, and I'll know for sure... If anyone can help, the question is here)
You can create a connection to Oracle directly and then have ActiveRecord execute a raw SQL statement to query your tables (plural). Off the top of my head, something like this:
class OracleModel < ActiveRecord::Base
establish_connection(:oracle_development)
def self.get_objects
self.find_by_sql("SELECT...")
end
end
With this model you can do OracleModel.get_objects which will return a set of records whereby the columns specified in the SELECT SQL statement are attributes of each OracleModel. Obviously you can probably come up with a more meaningful model name than I have!
Create an entry named :oracle_development in your config/database.yml file with your Oracle database connection details.
This may not be exactly what you are looking for, but it seems to cover you situation pretty well: http://pullmonkey.com/2008/4/21/ruby-on-rails-multiple-database-connections/
It looks like you can make an arbitrarily-named database configuration in the the database.yml file, and then have certain models connect to it like so:
class SomeModel < ActiveRecord::Base
establish_connection :arbitrary_database
#other stuff for your model
end
So, the solution would be to make ActiveRecord models for just the tables you want data out of from this other database. Then, if you really want to get into some sql, use ActiveRecord::Base.connection.execute(sql). If you need it as a the actual active_record object, do SomeModel.find_by_sql(sql).
Hope this helps!
I don't have points enough to edit your question, but it sounds like what you really need is to have another "connection pool" available to the second DB -- I don't think Oracle itself will be a problem.
Then, you need to use these alternate connections to "simply" execute a custom query within the appropriate controller method.
If you only need to pull data from your Oracle database, and if you have any ability to add objects to a schema that can see the data you require . . . .
I would simplify things by creating a view on the Oracle table that projects the data you require in a nice friendly shape for ActiveRecord.
This would mean maintaining code to two layers of the application, but I think the gain in clarity on the client-side would outweigh the cost.
You could also use the CREATE OR REPLACE VIEW Object AS SELECT tab1., tab2. FROM tab1,tab2 syntax so the view returned every column in each table.
If you need to Insert or Update changes to your Rails model, then you need to read up on the restrictions for doing Updates through a view.
(Also, you may need to search on getting Oracle to work with Rails as you will potentially need to install the Oracle client software and additional Ruby modules).
Are you talking about an one-time data conversion or some permanent data exchange between your application and the Oracle database? I think you shouldn't involve Rails in. You could just make a SQL query to the Oracle database, extract the data, and then just insert it into the MySQL database.

How to make ActiveRecord work with legacy partitioned/sharded databases/tables?

thanks for your time first...after all the searching on google, github and here, and got more confused about the big words(partition/shard/fedorate),I figure that I have to describe the specific problem I met and ask around.
My company's databases deals with massive users and orders, so we split databases and tables in various ways, some are described below:
way database and table name shard by (maybe it's should be called partitioned by?)
YZ.X db_YZ.tb_X order serial number last three digits
YYYYMMDD. db_YYYYMMDD.tb date
YYYYMM.DD db_YYYYMM.tb_ DD date too
The basic concept is that databases and tables are seperated acording to a field(not nessissarily the primary key), and there are too many databases and too many tables, so that writing or magically generate one database.yml config for each database and one model for each table isn't possible or at least not the best solution.
I looked into drnic's magic solutions, and datafabric, and even the source code of active record, maybe I could use ERB to generate database.yml and do database connection in around filter, and maybe I could use named_scope to dynamically decide the table name for find, but update/create opertions are bounded to "self.class.quoted_table_name" so that I couldn't easily get my problem solved. And even I could generate one model for each table, because its amount is up to 30 most.
But this is just not DRY!
What I need is a clean solution like the following DSL:
class Order < ActiveRecord::Base
shard_by :order_serialno do |key|
[get_db_config_by(key), #because some or all of the databaes might share the same machine in a regular way or can be configed by a hash of regex, and it can also be a const
get_db_name_by(key),
get_tb_name_by(key),
]
end
end
Can anybody enlight me? Any help would be greatly appreciated~~~~
Case two (where only db name changes) is pretty easy to implement with DbCharmer. You need to create your own sharding method in DbCharmer, that would return a connection parameters hash based on the key.
Other two cases are not supported right away, but could be easily added to your system:
You implement sharding method that knows how to deal with database names in your sharded dabatase, this would give you an ability to do shard_for(key) calls to your model to switch db connection.
You add a method like this:
class MyModel < ActiveRecord::Base
db_magic :sharded => { :sharded_connection => :my_sharding_method }
def switch_shard(key)
set_table_name(table_for_key(key)) # switch table
shard_for(key) # switch connection
end
end
Now you could use your model like this:
MyModel.switch_shard(key).first
MyModel.switch_shard(key).count
and, considering you have shard_for(key) call results returned from the switch_shard method, you could use it like this:
m = MyModel.switch_shard(key) # Switch connection and get a connection proxy
m.first # Call any AR methods on the proxy
m.count
If you want that particular DSL, or something that matches the logic behind the legacy sharding you are going to need to dig into ActiveRecord and write a gem to give you that kind of capability. All the existing solutions that you mention were not necessarily written with your situation in mind. You may be able to bend any number of solutions to your will, but in the end you're gonna have to probably write custom code to get what you are looking for.
Sounds like, in this case, you should consider not use SQL.
If the data sets are that big and can be expressed as key/value pairs (with a little de-normalization), you should look into couchDB or other noSQL solutions.
These solutions are fast, fully scalable, and is REST based, so it is easy to grow and backup and replicate.
We all have gotten into solving all our problems with the same tool (Believe me, I try to too).
It would be much easier to switch to a noSQL solution then to rewrite activeRecord.

Resources