How to use CouchRest with Sunspot? - ruby-on-rails

I have a problem integrating CouchRest with Sunspot. When I search the book details, Sunspot returns an empty result. I have googled this for a long time without finding any help.
Started GET "/books/search?utf8=%E2%9C%93&query=Book of Life&commit=Search%21" for 127.0.0.1 at 2011-09-08 11:27:41 +0700
Processing by BooksController#search as HTML
Parameters: {"utf8"=>"?", "query"=>"Book of Life", "commit"=>"Search!"}
Rendered books/index.html.erb within layouts/application (10.7ms)
Completed 200 OK in 145ms (Views: 20.6ms | ActiveRecord: 0.0ms)
[] <-- I got empty result
My System
Ruby 1.9.2p290
Rails 3.0.10
CouchDB 1.1.0
File structure ( https://gist.github.com/1164637/ )
Model (/app/models/book_detail.rb)
Controller (/app/controllers/books_controller.rb)
Sunspot Adapter for CouchRest (/config/initializers/couchdb.rb)
Sunspot Adapter Module (/config/initializers/sunspot_couch.rb)
NOTE: Sorry for linking to the code instead of pasting it. I kept getting "Please indent all code by 4 spaces using the code toolbar button". I tried removing all tabs and following the SO code formatting guidelines, but it still did not work.

Forgive me if I'm missing something, but I can't see how Sunspot is mapping "keywords" to the searchable fields on your CouchRest objects.
To debug, I'd first visit Couch in the browser admin UI to make sure that end is working. Then I'd double-check that Sunspot is receiving anything. If Sunspot contains your records, then the bug is on the search side; if it is empty, then something may be wrong with the object lifecycle management code it injects into your model class.
It's been ages since I did any serious Ruby, wish I could be more helpful. One option is to take advantage of some of the direct CouchDB full text offerings like CouchDB Lucene: https://github.com/rnewson/couchdb-lucene

Related

regexp for rails logs

I never wrote any complex regular expression before, and what I need seems to be (at least) a bit complicated.
I need a Regex to find matches for the following:
The logs are shown below. I need a regexp for them; please help. Thank you in advance.
Started GET \"/\" for 1x2.x6.1xx.2x at 2016-10-20 11:04:00 +0200
Processing by WelcomeController#index as HTML
Current user: anonymous
Redirected to http://example.pro.local/login?back_url=http%xx%xx%2Fexample.pro.local%2F
Filter chain halted as :check_if_login_required rendered or redirected
Completed 302 Found in 3.4ms (ActiveRecord: 1.9ms)"
Extracting information from unstructured logs with regex is tedious and brittle.
Instead, it is preferable to make the application output logs in a structured format (as suggested by @ndn).
Consider using lograge and/or logstasher in your Rails application to output structured logs.
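If changing the log format is not an option, the lines in the question can still be matched one type at a time. The following is a minimal Ruby sketch, not a complete parser: the constant and method names are made up for illustration, it only covers the "Started" and "Completed" lines, and it assumes the quotes around the path are not backslash-escaped.

```ruby
# Hypothetical patterns for two line types of the default Rails request log.
STARTED_RE = /\AStarted (?<verb>[A-Z]+) "(?<path>[^"]*)" for (?<ip>\S+) at (?<time>.+)\z/
COMPLETED_RE = /\ACompleted (?<status>\d{3}) (?<reason>.+?) in (?<duration_ms>\d+(?:\.\d+)?)ms/

# Returns a Hash describing the line, or nil if the line is not recognized.
def parse_rails_log_line(line)
  line = line.strip
  if (m = STARTED_RE.match(line))
    { event: :started, verb: m[:verb], path: m[:path], ip: m[:ip], time: m[:time] }
  elsif (m = COMPLETED_RE.match(line))
    { event: :completed, status: m[:status].to_i, reason: m[:reason],
      duration_ms: m[:duration_ms].to_f }
  end
end
```

Every other line type ("Processing by", "Redirected to", and so on) would need its own pattern, which is the brittleness the answer above warns about.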

What kind of time is measured in the default Rails logging in Rails 3?

When the Rails logs says something like
Completed 200 OK in 454.8ms (Views: 117.9ms | ActiveRecord: 199.7ms | Solr: 0.0ms)
What kind of time is being displayed? CPU time, wall time, or something else?
http://guides.rubyonrails.org/v3.2/performance_testing.html#request-logging mentions that time is measured, but not what kind of time. I haven't found any other documentation in the Rails Guides about logging, apart from how to generate messages in the Rails logger.
Wall-time.
Check the implementation of the Notification Instrumentation:
https://github.com/rails/rails/blob/2746a227fbb7e56bd51ab47fa97919f206972ab2/activesupport/lib/active_support/notifications/instrumenter.rb
and the implementation of the LogSubscriber:
https://github.com/rails/rails/blob/b5eb2423b6e431ba53e3836d58449e7e810096b4/actionpack/lib/action_controller/log_subscriber.rb
and this:
https://github.com/rails/rails/blob/7f18ea14c893cb5c9f04d4fda9661126758332b5/activesupport/lib/active_support/subscriber.rb
It uses Time.now, which is wall-clock time.
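The difference between the two kinds of time can be seen in plain Ruby. The sketch below is not Rails code, and the helper name measure is made up for illustration. It times a block with Time.now (wall-clock time) and with Process.times (CPU time used by the process). A block that only sleeps consumes wall-clock time but almost no CPU time.

```ruby
# Hypothetical helper: measures a block in wall-clock and CPU milliseconds.
def measure
  wall_start = Time.now
  cpu_start = Process.times
  yield
  cpu_end = Process.times
  wall = Time.now - wall_start
  # CPU time is user time plus system time spent by this process.
  cpu = (cpu_end.utime - cpu_start.utime) + (cpu_end.stime - cpu_start.stime)
  { wall_ms: wall * 1000.0, cpu_ms: cpu * 1000.0 }
end

result = measure { sleep 0.2 }
puts result.inspect
```

Since the log figures are wall-clock, time a request spends waiting (on a database, a search server, or the network) is included in them.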

Smart way to disable/enable logs on demand in Rails

I'm running a Rails app (v3.1.10) on the Heroku Cedar stack, and the Papertrail add-on is going crazy because of the size of the logs.
My app is really verbose and the logs are getting huge (really huge):
Sometimes because I serialize a lot of data in one field, which makes for a huge SQL request. In my model I have many of these:
serialize :a_game_data, Hash
serialize :another_game_data, Hash
serialize :a_big_set_of_game_data, Hash
[...]
This is thanks to my AS3 Flash app, which works with big sets of JSON...
Sometimes because there are a lot of partials to render:
Rendered shared/_flash_message.html.erb (0.1ms)
Rendered shared/_header_cart_info.html.erb (2.7ms)
Rendered layouts/_header.html.erb (19.4ms)
[...]
This is not the big issue here, but I've added this case too because jamiew handles it, see below...
Sometimes because there are lots of SQL queries on the same page:
User Load (2.2ms) SELECT "users".* FROM "users" WHERE "users"."id" = 1 LIMIT 1
Course Load (5.3ms) SELECT "courses".* FROM "courses" WHERE (id = '1' OR pass_token = NULL)
Session Load (1.3ms) SELECT "sessions".* FROM "sessions" WHERE "sessions"."id" = 1 LIMIT 1
Training Load (1.3ms) SELECT "trainings".* FROM "trainings" WHERE "trainings"."id" = 1 LIMIT 1
[...]
It's a big and (too) complex app we've got here... yeah...
Sometimes because there are a lot of params:
Parameters: {"_myapp_session"=>"BkkiJTBhYWI1MUVlaVdtbE9Eb1Y2I5BjsAVEkiEF9jc3JmX3Rva2VlYVWZyM2I0dEZaR1YwNXFjZhZTQ1uBjsARkkiUkiD3Nlc3Npb25faWQGOgZFRhcmRlbi51c2yN1poVm8vdWo3YTlrdUZzVTA9BjsARkkiH3dAh7CMTQ0Yzc4ZDJmYzg5ZjZjOGQ5NVyLmFkbWluX3VzZXIua2V5BjsAVFsISSIOQWRtaW5Vc2VyBjsARlsGaQZJIiIkMmEkMTAkcmgvQ2Rwc0lrYzFEbGJFRG9jMnZvdQY7AFRJIhl3YXJkZW4udXNlci51c2VyLmtleQY7AFRbCEkiCVVzZXIGOwBGWwZpBkkiIiQyYSQxMCRBUFBST2w0aWYxQmhHUVd0b0V5TjFPBjsAVA==--e4b53a73f6b622cfe7550b2ee12678712e2973c7", "authenticity_token"=>"EeiWmlODoYXUfr3b4tFZGV05qr7ZhVo/uj7a9kuFsU0=", "utf8"=>"✓", "locale"=>"fr", "id"=>"1", "a"=>1, "a"=>1, "a"=>1, "a"=>1, "a"=>1, "a"=>1, [...] Hey! You've reach the end of the line but it's not the end of the parameters...}
The AS3 Flash app sends big JSON data to the controller...
I didn't mention the (in)famous "Assets pipeline logging problem" because now I'm using the quiet_assets gem to handle this:
https://github.com/evrone/quiet_assets
So... what did I try?
1: Dennis Reimann's middleware solution:
http://dennisreimann.de/blog/silencing-the-rails-log-on-a-per-action-basis/
2: spagalloco's gem (inspired by solution #1):
https://github.com/spagalloco/silencer
3: jamiew's monkeypatches (inspired by solution #1 + a bonus):
https://gist.github.com/1558325
Nothing is really working as expected but it's getting close.
I would rather use a method in my ApplicationController like this:
def custom_logging(opts={}, show_logs=true)
  disable_logging unless show_logs
  remove_sql_requests_from_logs if opts[:remove_sql_requests]
  remove_rendered_from_logs if opts[:remove_rendered]
  remove_params_from_logs if opts[:remove_params]
  [...]
end
...and call it in any controller method: custom_logging({:remove_sql_requests=>1, :remove_rendered=>1})
You get the idea.
So, is there any good resource online for handling this?
Many thanks for your advice...
I'm the author of the silencer gem mentioned above. Are you looking to filter logging in general or for a particular action? The silencer gem handles the latter problem. While you can certainly use it in different ways, it's mostly intended for particular actions.
It sounds like what you are looking for is less verbose logging. I would recommend you take a look at lograge. I use it in production in most of my Rails apps and have found it to be quite useful.
If you need something more specialized, you may want to look at implementing your own LogSubscriber, which is essentially what lograge does.
Set your log level in the Heroku environment.
View your current log level:
heroku config
You most likely have it set to "Info", which produces a lot of noise.
Change it to warn or error:
heroku config:add LOG_LEVEL=WARN
Also, when viewing the logs, specify only the "app" source:
heroku logs --source app
Personally, I append --tail to see the logs live:
heroku logs --source app --tail
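The same log-level idea can also be applied temporarily from inside the application, which is closer to the on-demand switch asked for in the question. The following is a minimal sketch using Ruby's standard Logger; the helper name with_log_level is made up for illustration, and it assumes the application's logger exposes level and level= the way the standard Logger does.

```ruby
require "logger"

# Hypothetical helper: raises the logger's level for the duration of the
# block, then restores the previous level even if the block raises.
def with_log_level(logger, level)
  previous = logger.level
  logger.level = level
  yield
ensure
  logger.level = previous if previous
end
```

In a controller this might be called as with_log_level(Rails.logger, Logger::WARN) { ... } around the noisy part of an action. Note that the level is changed on the shared logger object, so in a multi-threaded server other requests running at the same moment would be quieted as well.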

What is the most performant way to get data of one rails application by another rails application?

I have two Rails applications (both now on Rails 3.1.1), and they work nicely. However, there is a dependency between the two. Application A uses data from application B by linking to it. These links are created automatically, but they have to be computed by looking up data in application B. I'm working on Windows 7 with Ruby 1.9.2 and Thin as the web server, and this will not change :-(
I have tried the following:
Used a plain RESTful resource: I defined a controller, called its action (get_xml_obj with some params), and read the needed values from the XML. This worked, but takes around 0.5s to 1s per call.
Replaced it with ActiveResource#find, which worked as well, but with the same performance as the previous solution.
Installed nginx and configured it to use keepalive connections, so that connection handling should be much faster. But I noticed no difference at all when calling B from A.
When I compare the time spent, these are typical examples (here with 4 references in one web page):
Application A:
Started GET "/tasks/search_task/1803" for 127.0.0.1 at 2011-11-02 14:11:04 +0100
Processing by TasksController#search_task as HTML
Parameters: {"id"=>"1803"}
Rendered tasks/_tooltip.html.haml (4529.5ms)
Completed 200 OK in 4532ms (Views: 4527.5ms | ActiveRecord: 2.0ms)
cache: [GET /tasks/search_task/1865] miss
Application B:
cache: [GET /service/get_xml_obj?key=notice&value=rails] miss
Started GET "/service/get_xml_obj?key=notice&value=rails" for 127.0.0.1 at 2011-11-02 14:11:05 +0100
Processing by ServiceController#get_xml_obj as */*
Parameters: {"key"=>"notice", "value"=>"rails"}
Completed 200 OK in 6ms (Views: 3.0ms | ActiveRecord: 1.0ms)
and 3 other calls with a similar length (< 10ms).
So is there something I can do to tune the retrieval (without accessing the database directly)? Do you know of any good documentation on how to measure and tune the web server and middleware? These are only personal applications, so there is no way of deploying them on a decent server. I cache the retrieved information, so it gets better over time, but 1 second is too long to wait. And there may be more than 1 or 2 links in a page I want to render.
OK, I finally gave up and implemented the following:
Added a file b.rb to the models directory of application A.
Included all the raw models there, where the base models (STI is used) are defined like this:
class Notice < ActiveRecord::Base
  self.establish_connection(
    :adapter => "sqlite3",
    :database => "../b/db/dev.db"
  )
end
...
I am now able to ask Notice.where(:key => 'rails'), which returns a real Rails model object.
The whole thing was implemented in around 20 minutes, and now there is no noticeable difference between a page with no links from application A to B and a page with 5 links.
At some point, I would still like to know which part of using RESTful resources is slow here...
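One way to start narrowing down the slow part is to time the connection setup and the request separately from application A's side. The following is only a sketch using Ruby's standard Net::HTTP and Benchmark; the helper name time_http_get is made up for illustration, and the host and port that application B listens on are not given in the question.

```ruby
require "net/http"
require "benchmark"

# Hypothetical helper: times the TCP connect and the GET request separately.
# Passing nil as the third argument skips any proxy set in the environment.
def time_http_get(host, port, path)
  http = Net::HTTP.new(host, port, nil)
  connect_s = Benchmark.realtime { http.start }
  response = nil
  request_s = Benchmark.realtime { response = http.get(path) }
  { connect_s: connect_s, request_s: request_s, status: response.code.to_i }
ensure
  # Close the connection if it was opened.
  http.finish if http && http.started?
end
```

Calling it with the path from the log above, for example time_http_get("127.0.0.1", 3001, "/service/get_xml_obj?key=notice&value=rails") (port 3001 is an assumption), shows whether the time goes into establishing the connection or into waiting for the response. Since application B reports under 10ms per request, a large connect_s would point at connection setup rather than at B itself.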

Extra time to serve requests using thinking sphinx

I'm trying to figure out why my Rails server is taking a lot of extra time to serve requests now that I have switched from Solr to Thinking Sphinx.
I'm getting things like
Completed 200 OK in 1242ms (Views: 248.8ms | ActiveRecord: 89.3ms | Sphinx: 5.3ms)
and
Completed 200 OK in 881ms (Views: 4.7ms | ActiveRecord: 7.1ms | Sphinx: 29.4ms)
I'm trying to use Thinking Sphinx to serve up JSON for an autocomplete drop-down. I'm trying to figure out where the bottleneck is. I've tried running lots of benchmarks, but I can't seem to find it. When I run things from the console, it seems snappy. Any ideas about why I'm seeing these kinds of render times?
Before I switched, the render times were effectively the sum of the Views and ActiveRecord times, so in the ballpark of <300ms.
I should also note that I'm only seeing the bottlenecks on pages that use search, which is why I think it may be due to Thinking Sphinx.
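To make the gap concrete, the listed components can be subtracted from the total. The following is a minimal Ruby sketch; the constant and method names are made up for illustration, and it assumes the "Completed" line has the shape shown in the question.

```ruby
# Hypothetical helper: reports the part of the total request time that is
# not covered by any component listed in the parentheses.
COMPLETED_LINE_RE = /Completed \d{3} .+? in (?<total>\d+(?:\.\d+)?)ms \((?<parts>[^)]*)\)/

def unaccounted_ms(line)
  m = COMPLETED_LINE_RE.match(line)
  return nil unless m

  # Each component looks like "Views: 248.8ms"; collect the numeric parts.
  parts = m[:parts].scan(/(\w+): (\d+(?:\.\d+)?)ms/).map { |name, ms| [name, ms.to_f] }
  accounted = parts.sum { |_name, ms| ms }
  (m[:total].to_f - accounted).round(1)
end
```

For the two lines in the question this gives 898.6ms and 839.8ms, so most of the request time falls outside Views, ActiveRecord and Sphinx. That suggests looking at the code in the action that runs around the search call rather than at the Sphinx query itself.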
