mysql2 driver seems to write invalid queries - ruby-on-rails

I'm developing an application layer on top of a rails app developed by someone else.
His application uses a module called request_logger to write to a table, which worked fine under ruby1.8/rails2/mysql gem, but in my ruby1.9/rails3/mysql2 environment, activerecord falls over, suggesting that the generated query is invalid.
It obviously is invalid: all MySQL relation names are wrapped in double quotes instead of backticks.
The call to activerecord itself just sets a bunch of attributes with
log.attributes = {
  :user_id => user_id,
  :controller => controller,
  ...etc
}
and then calls
log.save
So I'm leaning towards this not being a dodgy invocation. Any suggestions?

mysql2 works fine for a lot of people, but it unashamedly sacrifices conformance to the MySQL C API for performance in the common tasks. Perhaps, if request_logger is low-level enough, it's expecting calls to exist which don't.
It's trivial to switch back to using mysql - give it a try, and if it works, stick with it. Remember to change both your Gemfile and your config/database.yml settings.
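For reference, the switch is roughly this (a hedged sketch; the database name and credentials are placeholders):

# Gemfile
gem 'mysql'          # instead of: gem 'mysql2'

# config/database.yml
development:
  adapter: mysql     # instead of: adapter: mysql2
  database: your_app_development
  username: root
  password: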

It turned out to be what seems to be a change in behaviour between Rails 2 and 3 (we have the same setup working fine in Rails 2).
We use database.yml to specify an (empty) "master" database and then feed in our clients as shards with Octopus.
The master db is sqlite for simplicity, and it seems that ActiveRecord was feeding requests formatted for SQLite to the mysql2 shards, regardless of their adapter type.

Related

What is the source of "unknown OID" errors in Rails?

When replicating an app to production, my POSTGIS table columns started misbehaving, with Rails informing me there was an "unknown OID 26865" and that the fields would be treated as String.
Instead of current_pos yielding e.g.
#<RGeo::Geographic::SphericalPointImpl:0x22fabdc "POINT (13.39318248760133 52.52908798020595)">
I would get 0101000020E6100000FFDD958664C92A403619DEE6B2434A40. It looked like the activerecord-postgis-adapter was not installed, or installed badly, but I eliminated that possibility by testing for the existence of the data type RGeo::Feature::Point and by test-assigning
current_pos = "POINT (13.39318248760133 52.52908798020595)"
to the field - which proceeded without error but then yielded another incomprehensible hex string like the above.
Also, strangely enough, POSTGIS was working correctly within the database, e.g. giving correct results for an ST_DISTANCE query. A very limited problem, then: writing, write-parsing (from Point to hex format), manipulating by SQL, and reading all worked; only the parsing upon read didn't.
When I tried to use migrations to ensure the database column would have the correct type, the migrations failed, giving
undefined method `st_point' for #<ActiveRecord::ConnectionAdapters::PostgreSQL::TableDefinition:0x00000005cb80b8>
I spent several hours trying all kinds of solutions: re-installing the server from scratch, double-checking version numbers of everything, installing a slightly newer version of Ruby and a slightly older version of POSTGIS (to match my other environment), exporting the database and starting with a clean one, and so on. After I had run the migrations and arrived at the "undefined method st_point" error, I was finally able to find the solution via Google, way down in a GitHub issue, and it's really simple:
In config/database.yml, swap out postgres:// for postgis:// in the database url. If you're using Heroku, this may require some ugly manipulation:
production:
  url: <%= ENV.fetch('DATABASE_URL', '').sub(/^postgres/, "postgis") %>
So silly...
Do not forget to add activerecord-postgis-adapter to your Gemfile so @Sprachprofi's solution can run.
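That is just a one-line Gemfile addition (followed by bundle install):

# Gemfile
gem 'activerecord-postgis-adapter'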

Rails: soulmate gem pipelining for multiple types

I'm using the soulmate gem for autocompletion in my Rails app, and a big problem I'm encountering is the latency for the query from the client to the Redis server and back.
A quick look at the gem code shows
# in lib/soulmate/server.rb#search
types.each do |type|
  matcher = Matcher.new(type)
  results[type] = matcher.matches_for_term(term, :limit => limit)
end
As you can see, a new instance of Soulmate::Matcher is created for each type (e.g., "location", "user", "venue", etc.) and then queries the Redis server using the matches_for_term method. That means if I want to query 3 types at a time, there will be 6 handshakes. I want Soulmate::Matcher to accept multiple types on creation and pipeline-query the Redis server. How would I go about changing the code/overriding? It seems like I need to rewrite the entire gem.
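For reference, this is the kind of batching pipelining gives you with a recent redis-rb client. This is only a hedged sketch of the concept, not Soulmate's internals: the keys and commands below are made up, and Soulmate's actual queries differ.

require 'redis'

redis = Redis.new
# All three commands go out in a single round trip instead of three.
replies = redis.pipelined do |pipe|
  pipe.zrevrange('soulmate-index:location:par', 0, 4)  # hypothetical keys
  pipe.zrevrange('soulmate-index:user:par', 0, 4)
  pipe.zrevrange('soulmate-index:venue:par', 0, 4)
end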
I ended up just using the parallel gem and parallelizing the queries.
It's tacky and still uses way more handshakes than necessary, just in parallel, but it works...for now.
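Roughly, the parallelized version looks like this (a hedged sketch assuming the Soulmate::Matcher API shown above; term, limit and the type list are placeholders):

require 'parallel'
require 'soulmate'

types = %w[location user venue]   # placeholder types
term  = 'par'                     # placeholder search term
limit = 5

# One thread per type; each query still makes its own Redis round trips,
# but they run concurrently instead of back to back.
results = Hash[
  Parallel.map(types, :in_threads => types.size) { |type|
    [type, Soulmate::Matcher.new(type).matches_for_term(term, :limit => limit)]
  }
]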

ruby load file async?

I have some weird issues going on (for a very weird use case as I'll explain). I'm setting up a multi-tenant application using postgres schemas for data multi-tenancy.
Each company in my system will get its own schema. The way I accomplish this is with an after_commit on the model (on create) that creates a new Postgres schema and then loads schema.rb into it using Ruby's load (copied from the rake db:schema:load code).
You can see the gem here
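The approach, roughly, is something like this (a minimal sketch, not the gem's actual code; the subdomain attribute and schema naming are assumptions):

class Company < ActiveRecord::Base
  after_commit :create_tenant_schema, :on => :create

  private

  def create_tenant_schema
    conn = self.class.connection
    conn.execute("CREATE SCHEMA #{conn.quote_column_name(subdomain)}")
    original_path = conn.schema_search_path
    conn.schema_search_path = subdomain
    load Rails.root.join('db', 'schema.rb').to_s   # same trick rake db:schema:load uses
  ensure
    conn.schema_search_path = original_path if original_path
  end
end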
Anyway, all this works (in the console). Creating a company creates the new schema and I can switch to it, etc. My problem lies in my integration tests. I have an RSpec test that creates two companies like so:
before do
  @c1 = Factory :company
  @c2 = Factory :company
end
What's odd is that I start to get the logs about the db schema loading, but they're truncated. Almost as if they're happening in parallel. Here's a sample output:
>> create: database: unique_name1
-- create_table("first_table_in_schema_rb", {:force=>true})
>> create: database: unique_name2
create: database is my log line, the -- create_table is from schema.rb itself.
As you can see, the second create: database seems to happen while I'm loading schema.rb from the first company creation.
Does anyone know if load is somehow asynchronous? I know Ruby doesn't have real threads, but could it be using fibers or something? This is really messing me up, because when my test comes around, the Postgres schema that was meant to be created doesn't seem to exist.
Rails 3.0.8
Ruby 1.9.2
I'm not 100% sure this is your problem, because I'm sure of what happens with require but not with load. This happened to me once with require: require is not atomic, so loading code from a file with require can cause a race condition. Maybe that is what is happening with load, but I was not able to find any info about whether load is atomic or not.
Never mind... the issue had nothing to do with load. It was the fact that I was already connected to the wrong schema when importing schema.rb.
There was in fact an exception being thrown that was silently caught somewhere.

Rails Console Doesn't Automatically Load Models For 2nd DB

I have a Rails project which has a Postgres database for the actual application but which needs to pull a heck of a lot of data out of an Oracle database.
database.yml looks like
development:
  adapter: postgresql
  database: blah blah
  ...
oracle_db:
  adapter: oracle
  database: blah blah
My models which descend from data on the Oracle DB look something like
class LegacyDataClass < ActiveRecord::Base
  establish_connection "oracle_db"
  set_primary_key :legacy_data_class_id
  has_one :other_legacy_class, :foreign_key => :other_legacy_class_id_with_funny_column_name
  ...
end
Now, by habit I often do a lot of my early development (and this is early development) by coding for a bit and then playing in the Rails console. For example, after defining all the associations for LegacyDataClass I'll start trying things like a = LegacyDataClass.find(:first); puts a.some_association.name. Unexpectedly, this dies with LegacyDataClass not being already loaded.
I can then require 'LegacyDataClass' which fixes the problem until I either need to reload!, which won't actually reload it, or until I open a new instance of the console.
Thus the questions:
Why does this happen? Clearly there is some Rails magic I am not understanding.
What is the convenient Rails workaround?
I believe this might have to do with your model name, rather than your connection. The Rails convention is that model class names are CamelCase, while the files they reside in are lowercase+underscore.
The "LegacyModel" class should therefore be in models/legacy_model.rb. Your statement about "require 'LegacyDataClass'" indicates that this is not the case, and therefore Rails doesn't know how to automagically load that model.
I wrote something for an app at work that handles connections to other databases at runtime; it might be able to help.
http://github.com/cherring/connection_ninja

Database sharding and Rails

What's the best way to deal with a sharded database in Rails? Should the sharding be handled at the application layer, the active record layer, the database driver layer, a proxy layer, or something else altogether? What are the pros and cons of each?
FiveRuns have a gem named DataFabric that does application-level sharding and master/slave replication. It might be worth checking out.
I assume with shards we're talking about horizontal partitioning and not vertical partitioning (here are the differences on Wikipedia).
First off, stretch vertical partitioning as far as you can take it before you consider horizontal partitioning. It's easy in Rails to have different models point to different machines and for most Rails sites, this will bring you far enough.
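For example, pointing a group of models at another machine is just an establish_connection call on a base class. A hedged sketch; the class names and the "reporting" entry in config/database.yml are hypothetical:

# A base class whose subclasses live on a second database/machine.
class ReportingRecord < ActiveRecord::Base
  self.abstract_class = true
  establish_connection :reporting   # "reporting" entry in config/database.yml
end

class PageView < ReportingRecord
end

class ClickEvent < ReportingRecord
end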
For horizontal partitioning, in an ideal world, this would be handled at the application layer in Rails. But while it's not hard, it's not trivial in Rails, and by the time you need it, usually your application has grown beyond the point where this is feasible since you have ActiveRecord calls sprinkled all over the place. And no one, developers or management, likes working on it before you need it since everyone would rather work on features users will use now rather than on partitioning which may not come into play for years after your traffic has exploded.
ActiveRecord layer... not easy from what I can see. Would require lots of monkey patching into Rails internals.
At Spock we ended up handling this using a custom MySQL proxy and open sourced it on SourceForge as Spock Proxy. ActiveRecord thinks it's talking to one MySQL database machine when in reality it's talking to the proxy, which then talks to one or more MySQL databases, merges/sorts the results, and returns them to ActiveRecord. It requires only a few changes to your Rails code. Take a look at the Spock Proxy SourceForge page for more details and for our reasons for going this route.
For those of you like me who hadn't heard of sharding:
http://highscalability.com/unorthodox-approach-database-design-coming-shard
Rails 6.1 provides the ability to switch connections per database, so we can do horizontal partitioning.
Shards are declared in the three-tier config like this:
production:
  primary:
    database: my_primary_database
    adapter: mysql2
  primary_replica:
    database: my_primary_database
    adapter: mysql2
    replica: true
  primary_shard_one:
    database: my_primary_shard_one
    adapter: mysql2
  primary_shard_one_replica:
    database: my_primary_shard_one
    adapter: mysql2
    replica: true
Models are then connected with the connects_to API via the shards key
class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true

  connects_to shards: {
    default: { writing: :primary, reading: :primary_replica },
    shard_one: { writing: :primary_shard_one, reading: :primary_shard_one_replica }
  }
end
Then models can swap connections manually via the connected_to API. If using sharding, both a role and a shard must be passed:
ActiveRecord::Base.connected_to(role: :writing, shard: :default) do
  @id = Person.create! # Creates a record in the default shard
end

ActiveRecord::Base.connected_to(role: :writing, shard: :shard_one) do
  Person.find(@id) # Can't find record, doesn't exist because it was created
                   # in the default shard
end
reference:
https://edgeguides.rubyonrails.org/active_record_multiple_databases.html#horizontal-sharding
https://dev.to/ritikesh/multitenant-architecture-on-rails-6-1-27c7
Connecting Rails to multiple databases is not a big deal: you simply have an ActiveRecord subclass for each shard that overrides the connection property. That makes it pretty simple if you need to make cross-shard calls; you just have to write a little code when you need to make calls between the shards.
I don't like Hank's idea of splitting the rails instances, because it seems challenging to call the code between the instances unless you have a big shared library.
Also you should look at doing something like Masochism before you start sharding.
For Rails to work in a replicated environment, I would suggest using the my_replication plugin, which helps switch the database connection to one of the slaves at run time:
https://github.com/minhnghivn/my_replication
To my mind, the simplest way is to maintain a 1:1 mapping between Rails instances and DB shards.
A proxy layer is better; it can support all programming languages. For example: Apache ShardingSphere's Proxy.
Apache ShardingSphere has two different products: ShardingSphere-JDBC for the application layer, which is for Java only, and ShardingSphere-Proxy for the proxy layer, which works with all programming languages.
FYI: https://shardingsphere.apache.org/document/current/en/user-manual/shardingsphere-proxy/
It depends upon the Rails version. Newer Rails versions provide support for sharding, as @Oshan said. But if you can't update to a newer version, you can use the octopus gem.
Gem Link
https://github.com/thiagopradi/octopus
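For reference, Octopus shard switching looks roughly like this (a hedged sketch; the shard name and attributes are hypothetical, and the shard configuration itself lives in config/shards.yml, so see the gem's README for the exact syntax):

# Per-query shard selection
User.using(:shard_one).where(:enabled => true).to_a

# Or wrap a block of work in a given shard
Octopus.using(:shard_one) do
  User.create!(:name => 'example')
end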

Resources