Using OrientDB's JDBC driver with ActiveRecord - ruby-on-rails

What is the proper way of using OrientDB's JDBC driver with ActiveRecord?
I am trying to connect a Rails 3.2 application to OrientDB 1.4. I installed the gem activerecord-jdbc-adapter, and configured the database.yml as follows:
development:
  adapter: jdbc
  username: admin
  password: admin
  driver: com.orientechnologies.orient.jdbc.OrientJdbcDriver
  url: jdbc:orient:local:db/test_db2
I load the OrientDB's JDBC driver as follows:
# in config/application.rb:
require '/home/myuser/jars/orientdb-jdbc-1.4.0-all.jar'
The following exception is being thrown when the application starts (using rails s):
java.lang.NullPointerException
at arjdbc.jdbc.RubyJdbcConnection.unmarshalResult(RubyJdbcConnection.java:1187)
at arjdbc.jdbc.RubyJdbcConnection.set_native_database_types(RubyJdbcConnection.java:537)
at arjdbc.jdbc.RubyJdbcConnection$INVOKER$i$0$0$set_native_database_types.call(RubyJdbcConnection$INVOKER$i$0$0$set_native_database_types.gen)
...
Is there something missing in my configuration? What is the proper way of using OrientDB's JDBC driver with ActiveRecord?

While activerecord-jdbc-adapter (theoretically) supports any JDBC-compliant driver, it uses APIs and makes a few assumptions that may not work well with some drivers, especially ones that are not fully compliant, such as orientdb-jdbc (at least version 1.4).
In this case AR-JDBC tries to resolve the supported types from the DB metadata: http://git.io/s7g47A but since metadata.getTypeInfo() returns an unexpected null instead of an actual ResultSet object, everything fails badly. This might be improved by handling the "null" types, either by overriding the native_database_types method in Ruby and/or with some additional code on AR-JDBC's side. For OrientDB's "driver", though, that still might not be enough to make it fully functional with AR-JDBC. It sounds like a pretty good fit for an AR-JDBC extension (assuming OrientDB can handle the SQL that ActiveRecord/AREL will generate).
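A minimal sketch of the "override native_database_types in Ruby" idea, under stated assumptions: FakeJdbcAdapter is a hypothetical stand-in that only mimics the null failure, StaticNativeTypes is an invented module name, and the type names in the map are placeholders rather than verified OrientDB types. It shows the override pattern only, not a working OrientDB adapter.

```ruby
# Hypothetical stand-in for an adapter whose metadata lookup yields nil.
class FakeJdbcAdapter
  def native_database_types
    type_info = nil            # mimics getTypeInfo() returning null
    type_info.fetch(:string)   # would raise NoMethodError on nil
  end
end

# Hypothetical override: return a static type map instead of asking the driver.
# The type names are placeholders, not verified OrientDB types.
module StaticNativeTypes
  TYPES = {
    string:  { name: "STRING" },
    integer: { name: "INTEGER" },
    boolean: { name: "BOOLEAN" }
  }.freeze

  def native_database_types
    TYPES
  end
end

# prepend puts the override ahead of the original method in the lookup chain.
FakeJdbcAdapter.prepend(StaticNativeTypes)

puts FakeJdbcAdapter.new.native_database_types[:string][:name]
```

Using prepend rather than reopening the class keeps the original method reachable through super if a fallback to the driver is wanted later.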

Related

Are there Mongo Admin fsync + lock commands available in Mongoid?

If I wanted to call the fsync + lock methods on my database, is there a way to do this with Mongoid in a Rails app? Is there also a way to only specify the replica node that I want to perform this operation on?
I'm trying to create a rake task to perform backups nightly using cron.
Mongoid 2 uses the 10gen supported driver.
Mongoid::Config.master.connection corresponds to the connection object of class Mongo::MongoClient (was Mongo::Connection).
This class has an instance method lock! which does the fsyncLock command, and unlock! is its mate.
http://api.mongodb.org/ruby/current/Mongo/MongoClient.html#lock!-instance_method
http://api.mongodb.org/ruby/current/Mongo/MongoClient.html#unlock!-instance_method
These methods take no options for targeting specific members of a replica set; they accept only a socket, which is essentially for internal use.
So if you need to fsyncLock a specific replica set member, I recommend connecting to it directly, e.g., with Mongo::MongoClient.new(host, port):
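client = Mongo::MongoClient.new(host, port)
client.lock!       # issues fsyncLock on this member
begin
  # ... run the backup ...
ensure
  client.unlock!   # always release the lock, even if the backup fails
  client.close
end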
Mongoid 3 uses Moped and not the 10gen driver.
But you can still use the 10gen driver independently for your rake tasks even if you move to Mongoid 3.
I'm interested in your results and any followup questions.

How to set transaction isolation level using ActiveRecord connection?

I need to manage transaction isolation level on a per-transaction basis in a way portable across databases (SQLite, PostgreSQL, MySQL at least).
I know I can do it manually, like this:
User.connection.execute('SET SESSION TRANSACTION ISOLATION LEVEL SERIALIZABLE')
...but I would expect something like:
User.isolation_level(:serializable) do
  # ...
end
This functionality is supported by ActiveRecord itself:
MyRecord.transaction(isolation: :read_committed) do
  # do your transaction work
end
It supports the ANSI SQL isolation levels:
:read_uncommitted
:read_committed
:repeatable_read
:serializable
This option has been available since Rails 4; it was unavailable when the OP asked the question. For any reasonably modern Rails application, this should be the way to go.
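As a rough illustration of how the four symbols line up with the ANSI level names, here is a small stand-alone sketch. The helper name isolation_statement is hypothetical, and the generated statement is only the generic ANSI form; support for it varies by database, so this is not a portable implementation.

```ruby
# ANSI SQL isolation levels, keyed by the symbols listed above.
ISOLATION_LEVELS = {
  read_uncommitted: "READ UNCOMMITTED",
  read_committed:   "READ COMMITTED",
  repeatable_read:  "REPEATABLE READ",
  serializable:     "SERIALIZABLE"
}.freeze

# Hypothetical helper: builds the ANSI statement for one level and
# rejects anything that is not one of the four known symbols.
def isolation_statement(level)
  sql_level = ISOLATION_LEVELS.fetch(level) do
    raise ArgumentError, "unknown isolation level: #{level.inspect}"
  end
  "SET TRANSACTION ISOLATION LEVEL #{sql_level}"
end

puts isolation_statement(:serializable)
# prints: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
```

Rejecting unknown symbols up front avoids interpolating arbitrary input into SQL.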
There was no gem available so I developed one (MIT): https://github.com/qertoip/transaction_isolation
It looks like Rails 4 will have this feature out of the box:
https://github.com/rails/rails/commit/392eeecc11a291e406db927a18b75f41b2658253

mysql2 driver seems to write invalid queries

I'm developing an application layer on top of a rails app developed by someone else.
His application uses a module called request_logger to write to a table. This worked fine under ruby1.8/rails2/mysql gem, but in my ruby1.9/rails3/mysql2 environment ActiveRecord falls over, reporting that the generated query is invalid.
It obviously is: all MySQL relation names are wrapped in double quotes instead of backticks.
The call to activerecord itself just sets a bunch of attributes with
log.attributes = {
  :user_id => user_id,
  :controller => controller,
  ...etc
}
and then calls
log.save
So I'm leaning towards it not being dodgy invocation. Any suggestions?
mysql2 works fine for a lot of people, but it unashamedly sacrifices conformance to the MySQL C API for performance in the common tasks. Perhaps, if request_logger is low-level enough, it's expecting calls to exist which don't.
It's trivial to switch back to using mysql - give it a try, and if it works, stick with it. Remember to change both your Gemfile and your config/database.yml settings.
It turned out to be what seems to be a change in behaviour between Rails 2 and 3 (we have the same setup working fine in Rails 2).
We use database.yml to specify an (empty) "master" database and then feed in our clients with shards+octopus.
The master db is SQLite for simplicity, and it seems that ActiveRecord was sending queries formatted for SQLite to the mysql2 shards, regardless of their adapter type.
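To make the symptom concrete, the sketch below quotes the same column names in the two styles. The helper names are hypothetical and this is not the adapters' actual code; it only illustrates why SQL built with SQLite-style quoting breaks on MySQL, which by default treats double-quoted text as a string literal rather than an identifier.

```ruby
# Hypothetical helpers illustrating the two identifier-quoting styles.

# MySQL style: backticks, with embedded backticks doubled.
def quote_for_mysql(name)
  "`#{name.to_s.gsub('`', '``')}`"
end

# SQLite / ANSI style: double quotes, with embedded quotes doubled.
def quote_for_sqlite(name)
  %("#{name.to_s.gsub('"', '""')}")
end

columns = [:user_id, :controller]

puts columns.map { |c| quote_for_mysql(c) }.join(", ")
# prints: `user_id`, `controller`
puts columns.map { |c| quote_for_sqlite(c) }.join(", ")
# prints: "user_id", "controller"
```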

Which MongoDB DSL should I learn?

I'm using MongoDB and Ruby.
I have noticed there are different DSLs.
The Javascript DSL used with the MongoDB client (mongo):
show dbs
use my_db
db.person.find({first_name: "Syd"})
The Ruby DSL used with the Ruby driver for MongoDB:
connection = Mongo::Connection.new
connection.database_names.each { |name| puts name }
connection.database_info.each { |info| puts info.inspect}
person.find({"hello" => "world"})
Then the MongoID/MongoMapper DSL for MongoDB:
Person.desc(:last_name).asc(:first_name)
Person.descending(:last_name).ascending(:first_name)
Person.all(:conditions => { :first_name => "Syd" })
Questions:
Is it correct that MongoID/MongoMapper is built on top of the Ruby DSL, which is built on top of the MongoDB client's DSL?
Should I learn all three DSLs or just make my pick depending on the level of abstraction I want?
Are there any reasons I would want to learn/use the MongoDB client DSL? Can I use it in a script, or is it only interactive with its client (mongo)?
Thanks!
Learn all three.
The first one is going to be heavily used when you want to test queries or look up data, especially in production. You would want to use the mongo client for this kind of thing.
The second one is used when the ODM's DSL does not support a MongoDB feature, e.g.:
At one point you could not use the $or operator with MongoMapper, even though mongo 1.5 already supported it.
The last time I used them, Mongoid and MongoMapper did not support mapping to GridFS, so you would use the driver API for this.
The last time I used them, Mongoid and MongoMapper also did not support map-reduce, so again you would have to use the driver API.
MongoMapper and Mongoid are used to map your domain objects to mongo documents. Where the ODM falls short, you need a fallback plan, which is to use the mongo driver API.
Hope that helps.

Database sharding and Rails

What's the best way to deal with a sharded database in Rails? Should the sharding be handled at the application layer, the active record layer, the database driver layer, a proxy layer, or something else altogether? What are the pros and cons of each?
FiveRuns have a gem named DataFabric that does application-level sharding and master/slave replication. It might be worth checking out.
I assume with shards we're talking about horizontal partitioning and not vertical partitioning (here are the differences on Wikipedia).
First off, stretch vertical partitioning as far as you can take it before you consider horizontal partitioning. It's easy in Rails to have different models point to different machines and for most Rails sites, this will bring you far enough.
For horizontal partitioning, in an ideal world, this would be handled at the application layer in Rails. But while it's not hard, it's not trivial in Rails, and by the time you need it, usually your application has grown beyond the point where this is feasible since you have ActiveRecord calls sprinkled all over the place. And no one, developers or management, likes working on it before you need it since everyone would rather work on features users will use now rather than on partitioning which may not come into play for years after your traffic has exploded.
ActiveRecord layer... not easy from what I can see. Would require lots of monkey patching into Rails internals.
At Spock we ended up handling this using a custom MySQL proxy and open sourced it on SourceForge as Spock Proxy. ActiveRecord thinks it's talking to one MySQL database machine when in reality it's talking to the proxy, which then talks to one or more MySQL databases, merges/sorts the results, and returns them to ActiveRecord. It requires only a few changes to your Rails code. Take a look at the Spock Proxy SourceForge page for more details and for our reasons for going this route.
For those of you like me who hadn't heard of sharding:
http://highscalability.com/unorthodox-approach-database-design-coming-shard
Rails 6.1 provides the ability to switch connections per database, so horizontal partitioning (sharding) can be done with Rails itself.
Shards are declared in the three-tier config like this:
production:
  primary:
    database: my_primary_database
    adapter: mysql2
  primary_replica:
    database: my_primary_database
    adapter: mysql2
    replica: true
  primary_shard_one:
    database: my_primary_shard_one
    adapter: mysql2
  primary_shard_one_replica:
    database: my_primary_shard_one
    adapter: mysql2
    replica: true
Models are then connected with the connects_to API via the shards key
class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true
  connects_to shards: {
    default: { writing: :primary, reading: :primary_replica },
    shard_one: { writing: :primary_shard_one, reading: :primary_shard_one_replica }
  }
end
Then models can swap connections manually via the connected_to API. If using sharding, both a role and a shard must be passed:
ActiveRecord::Base.connected_to(role: :writing, shard: :default) do
  @id = Person.create! # Creates a record in the default shard
end
ActiveRecord::Base.connected_to(role: :writing, shard: :shard_one) do
  Person.find(@id) # Can't find record, doesn't exist because it was created
                   # in the default shard
end
reference:
https://edgeguides.rubyonrails.org/active_record_multiple_databases.html#horizontal-sharding
https://dev.to/ritikesh/multitenant-architecture-on-rails-6-1-27c7
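The guide leaves open how to decide which shard a given request should use. One common approach is to derive the shard from a stable key; a minimal sketch follows, assuming a hypothetical shard_for helper and a made-up tenant key, with the shard names taken from the connects_to example above.

```ruby
require "zlib"

# Shard names as declared in the connects_to example above.
SHARDS = [:default, :shard_one].freeze

# Hypothetical helper: map a tenant key to a shard deterministically.
# CRC32 is stable across processes, unlike Ruby's built-in String#hash.
def shard_for(tenant_key)
  SHARDS[Zlib.crc32(tenant_key.to_s) % SHARDS.size]
end

shard = shard_for("tenant-42") # "tenant-42" is a made-up key
puts shard

# In a Rails 6.1+ app the result would be passed on, for example:
#   ActiveRecord::Base.connected_to(role: :writing, shard: shard) { ... }
```

Modulo hashing is the simplest option but remaps most keys when a shard is added, so a lookup table or consistent hashing may suit a growing system better.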
Connecting Rails to multiple databases is not a big deal: you simply have an ActiveRecord subclass for each shard that overrides the connection property. That keeps things pretty simple; you only have to write a little code when you need to make calls between the shards.
I don't like Hank's idea of splitting the rails instances, because it seems challenging to call the code between the instances unless you have a big shared library.
Also you should look at doing something like Masochism before you start sharding.
For Rails to work with a replicated environment, I would suggest the my_replication plugin, which helps switch the database connection to one of the slaves at run time:
https://github.com/minhnghivn/my_replication
To my mind, the simplest way is to maintain a 1:1 mapping between Rails instances and DB shards.
A proxy layer is better because it works with every programming language.
For example: Apache ShardingSphere's proxy.
Apache ShardingSphere has two products: ShardingSphere-JDBC, which works at the application layer and supports Java only, and ShardingSphere-Proxy, which works at the proxy layer and supports all programming languages.
FYI: https://shardingsphere.apache.org/document/current/en/user-manual/shardingsphere-proxy/
It depends on the Rails version. Newer Rails versions support sharding, as @Oshan said. But if you can't upgrade to a newer version, you can use the octopus gem.
Gem Link
https://github.com/thiagopradi/octopus
