I read a few tutorials on getting memcached set up with Rails (2.3.5), and I'm a bit lost.
Here's what I need to cache:
I have user-specific settings that are stored in the db. The settings are queried in the ApplicationController meaning that a query is running per-request.
I understand that Rails has built-in support for SQL caching; however, the caching only lasts for the duration of an action.
I want an easy way to persist the settings (which are also ActiveRecord models) for an arbitrary amount of time. Bonus points if I can also easily reset the cache anytime a setting changes.

Gregg Pollack of RailsEnvy did a series of "Scaling Rails" screencasts a while back, which are now free (thanks to sponsorship by NewRelic). You might want to start with episode 1, but episode 8 covers memcached specifically:

Sounds like what you want is an object cache between the DB and ActiveRecord. The only decent one we've found so far is Identity Cache (https://github.com/Shopify/identity_cache). It's brand new so it's a bit rough around the edges, but gets the job done for basic caching.
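To make the idea concrete, here is a minimal sketch of the fetch-with-expiry and explicit-invalidation pattern the question asks for. It is plain Ruby so it runs on its own: TinyCache and SettingsRepo are hypothetical stand-ins, and in a Rails app Rails.cache (backed by memcached) with its fetch and delete methods would take the place of TinyCache.

```ruby
# Minimal in-memory stand-in for a cache store (hypothetical, for illustration).
# In a Rails app, Rails.cache backed by memcached would play this role.
class TinyCache
  Entry = Struct.new(:value, :expires_at)

  def initialize(clock: -> { Time.now })
    @store = {}
    @clock = clock
  end

  # Return the cached value, or run the block, store its result, and return it.
  def fetch(key, expires_in:)
    entry = @store[key]
    return entry.value if entry && entry.expires_at > @clock.call

    value = yield
    @store[key] = Entry.new(value, @clock.call + expires_in)
    value
  end

  # Drop a key so the next fetch recomputes it.
  def delete(key)
    @store.delete(key)
  end
end

# Hypothetical wrapper around the per-user settings lookup.
class SettingsRepo
  attr_reader :db_queries

  def initialize(cache)
    @cache = cache
    @db_queries = 0 # counts simulated database hits
  end

  # Read through the cache; only a miss reaches the "database".
  def settings_for(user_id)
    @cache.fetch("user_settings/#{user_id}", expires_in: 600) do
      load_from_db(user_id)
    end
  end

  # Call this wherever a setting is saved (for example from an after_save hook).
  def expire(user_id)
    @cache.delete("user_settings/#{user_id}")
  end

  private

  # Stand-in for the real ActiveRecord query.
  def load_from_db(user_id)
    @db_queries += 1
    { "user_id" => user_id, "theme" => "dark" }
  end
end

repo = SettingsRepo.new(TinyCache.new)
repo.settings_for(1)
repo.settings_for(1) # served from the cache, no second query
repo.expire(1)       # a setting changed, so force a reload next time
```

The expiry time (600 seconds here) is arbitrary; with explicit invalidation on save it mostly acts as a safety net.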


Multi-schema Postgres on Heroku

I'm extending an existing Rails app, and I have to add multi-tenant support to it. I've done some reading, and seeing how this app is going to be hosted on Heroku, I thought I could take advantage of Postgres' multi-schema functionality.
I've read that there seems to be some performance issues with backups when multiple schemas are in use. This information I felt was a bit outdated. Does anyone know if this is still the case?
Also, are there any other performance issues, or caveats I should take into consideration?
I've already thought about adding a field to every table so I can use a single schema, and having that field reference the tenants table, but given the time window, multiple schemas seem the best solution.
I use postgres schemas for a multi-tenancy site based on some work by Ryan Bigg and the Apartment gem.
I find that having separate schemas for each client is an elegant solution which provides a higher degree of data segregation. Personally, I find the performance improves because Postgres can simply return all results from a table without having to filter on an 'owner_id'.
I also think it makes for simpler migrations and allows you to adjust individual customer data without making global changes. For example, you can add columns to specific customers' schemas and use feature flags to enable custom features.
My main argument relating to performance would be that backup is a periodic process, whereas customer table scoping would be on every access. On that basis, I would take any performance hit on backup over slowing down the customer experience.
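As a rough illustration of how per-tenant schemas are selected, the sketch below builds the PostgreSQL statement that points a connection at one tenant's schema. The helper names are hypothetical and this is not the Apartment gem's actual implementation; in a Rails app the resulting string could be run with ActiveRecord::Base.connection.execute at the start of each request.

```ruby
# Quote a schema name as a PostgreSQL identifier (double any embedded quotes).
def quote_ident(name)
  %("#{name.gsub('"', '""')}")
end

# Build the statement that points a connection at one tenant's schema,
# falling back to "public" for tables shared by all tenants.
def search_path_sql(tenant_schema)
  "SET search_path TO #{quote_ident(tenant_schema)}, #{quote_ident('public')}"
end

puts search_path_sql("acme")
# prints: SET search_path TO "acme", "public"
```

Once the search path is set, unqualified table names resolve to the tenant's own tables, so queries need no 'owner_id' condition.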

Is using Redis right for this situation?

I'm planning on creating an app (Rails) that will have a very large collection of users - it'll start small but I would like it to be able to handle a million or more.
I want to build a system that will be able to handle 2500+ requests per second. Each request will require a write (for logging purposes) as well as a read from the enormous list of users, indexed by username (I was recommended to use MongoDB for this purpose) and the results of the read will be sent back to the user.
I am a little unclear about how mongo will handle both reads and writes, so I had this idea of using Mongo to sort of permanently store the records and then load them up into Redis every time the server starts up for even faster access so that Mongo doesn't have to deal with anything but the writes.
Does that sound reasonable or is that a huge misuse of Mongo and Redis?
The speed of delivery is of utmost importance.
It's possible, actually, to create the entire application using just Redis. What you'd want to do is research design patterns for Redis. A good place to start is this PDF by Karl Seguin called The Little Redis Book.
For example, use Redis's hashes to save all users' information.
Further, if planned well you don't need to have another persistent storage such as Mongo or MySQL in conjunction with Redis as Redis is persistent itself. You just need to pick a good sharding/replication strategy that'll allow you to be flexible enough for future systemic changes.
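A small sketch of the hash-per-user layout mentioned above. The key naming convention ("user:" plus the username) and the helper names are assumptions for illustration; the code only builds the command arrays, which a Redis client would then send to the server.

```ruby
# Key naming convention (an assumption): one hash per user, keyed by username.
def user_key(username)
  "user:#{username}"
end

# Build the Redis command that stores a user's fields in one hash.
# HSET accepts multiple field/value pairs as of Redis 4.0.
def hset_command(username, fields)
  ["HSET", user_key(username), *fields.flat_map { |k, v| [k.to_s, v.to_s] }]
end

# Build the command that reads the whole hash back by username.
def hgetall_command(username)
  ["HGETALL", user_key(username)]
end

p hset_command("alice", email: "alice@example.com", plan: "free")
p hgetall_command("alice")
```

Because the username is part of the key, looking a user up is a direct key access rather than an index scan.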
I think the stack that you are asking about is certainly a very good solution and one that's pretty battle tested for high performance sites. Trello (created by same people who created this very site) uses a similar architecture as well as craigslist.
Trello Tech Stack Writeup
Craigslist also uses this
Redis is fast and has a great pub/sub mechanism in addition to normal invalidation-type features, which makes it a superior cache to most. Mongo is a db I'm very familiar with, and I think it's great for all sorts of data store purposes as well as being a solid enterprise db that scales well, protects data integrity and checks off a bunch of marks in the SLA enterprise jargon checklist.
I think it's a great combination, but really the question should be: do I even need this? For your load I think Mongo itself could handle this quite nicely (and give data integrity), and if you really want, you can run it on a server with enough memory to make sure your dataset fits in memory (denormalizing and good schema design are key). Foursquare runs exclusively on Mongo in memory.
So think if this is necessary but remember simple always wins. Redis/Mongo is super powerful but it will also take a lot more work to master two data stores and administer them.
As others have mentioned, using a single service makes more sense to me. There's no reason to keep the logging data in memory though. I'd try using something simple, a logfile if possible, or Scribe or Flume if you need to distribute the writes.

Measure Rails performance locally like New Relic but without external dependencies

Having used New Relic before, I know it can measure performance, but it's an external dependency. Now that my app contains some faker gem data, I have noticed some slower performance (loading 1000 user profiles with attached images).
Are there any tools/gems to measure performance of a Rails app since this is of course important?
Are there some docs or general guidance on performance and Rails? I know I can use indexes on tables but if you have large app with thousands of users there must be more you can do to improve performance.
Any suggestions on this topic are welcome. The main question would be: are there local New Relic-like gems to measure an app's performance so you can make improvements?
New Relic does have a local Developer Mode for Ruby apps that doesn't report data externally. It traces each request in detail.
There is a Rails guide on Performance testing and benchmarking, with some gem tools at the end of the article (I've never tried them).
For performance improvement, here are the first topics I would look at:
indexes, as you said, on foreign keys and on columns often used for searches
eager loading. Check SQL requests to see if some successive requests could be merged in a single one. Use select when possible.
page and fragment caching (Caching guide)
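Before optimizing, it helps to time the suspect code path locally. The sketch below uses only Ruby's built-in monotonic clock, so it has no external dependencies; the measure helper and the workload inside the block are hypothetical. Ruby's standard Benchmark module offers similar helpers.

```ruby
# Time a block with the monotonic clock and return [result, elapsed_seconds].
def measure
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  [result, elapsed]
end

# Hypothetical stand-in for a slow code path (e.g. loading 1000 profiles).
profiles, seconds = measure { (1..1000).map { |i| "profile-#{i}" } }
puts format("built %d profiles in %.4f s", profiles.size, seconds)
```

Timing the same block before and after a change (an added index, eager loading, a cached fragment) shows whether the change actually helped.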

Switching from SQL to MongoDB in Rails 3

I am considering switching quite a big application (Rails 3.0.10) from our SQL databases (SQLite and Postgres) to MongoDB. I plan to put everything in it, mainly UTF-8 strings, binary files and user data (and maybe also a little full-text search). I have complex relationships (web structure: categories, tags, translations..., also polymorphic ones) and I feel that the MongoDB philosophy is to avoid that and to put everything in one big document. Am I right?
Does anyone have experience with MongoDB in Rails? Particularly with switching an app from ActiveRecord to Mongoid? Do you think it's a good idea? Do you know of a guide or article for learning the MongoDB way to organize complex data?
PS: In MongoDB, I particularly like the freedom offered by its architecture and its performance orientation. These are my main personal motivations for considering the switch.
I am using mongodb with mongoid, for 5-6 months. Have also worked with postgres + AR, MySQL + AR. Have no experience with switching AR to mongoid.
Are you facing any performance issues, or do you expect to face them soon? If not, I would advise against the switch, as the decision seems to be based just on the coolness factor of MongoDB.
They both have their pros and cons. I like the speed of MongoDB, but there are many restrictions on what you can do in order to achieve that (like no joins, no transaction support and slow field vs. field (updated_at > created_at) queries).
If there are performance issues, I would still recommend sticking with your current system, as the switch might be a big task and it would be better to spend half that time optimizing the current system. After reading the question, I get the feeling that you have never worked with MongoDB before; there are many things which can bite you, and you would not be fully aware of how to solve them.
However, if you still insist on switching, you need to carefully evaluate your data structure and the way you query it. In a relational database you have the normal forms, which have the advantage that whatever structure you start with, you will reach roughly the same end result once you do the normalization. In MongoDB, there are virtually unlimited ways in which you can model your documents. You need to model your documents carefully to get the benefits of MongoDB. The queries you need to run play a very important role in your structuring, along with the actual data you want to store.
Keep in mind that you do not have joins in MongoDB (this can be mitigated with good modeling). As of now you cannot have queries like field1 = field2, i.e. you can't compare fields, but need to provide a literal to query against.
Take a look at this question: Efficient way to store data in MongoDB: embedded documents vs individual documents. Somebody points the OP to a discussion where embedded documents are recommended, but in a pretty similar scenario the OP chooses to go with standalone documents, because of the nature of the queries he will be using to fetch the data.
All I want to say is that it should be an informed decision, taken after you have completely modeled your system with MongoDB and run some performance tests with real data to see whether MongoDB will solve your problem. It should not be based on the coolness factor.
You can do field1 = field2 using a $where clause, but it's slow and best avoided.
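To illustrate the embedded vs. standalone choice discussed above, here is a sketch using plain Ruby hashes in place of MongoDB documents. The post/comment data and the comments_for helper are hypothetical examples, not taken from the linked question.

```ruby
# Option A: embed comments inside the post document.
# A single read fetches the post together with all of its comments.
embedded_post = {
  "_id" => 1,
  "title" => "Hello",
  "comments" => [{ "author" => "ann", "body" => "Nice" }]
}

# Option B: keep comments as standalone documents that reference the post.
# Comments can then be queried on their own (e.g. all comments by one author).
post = { "_id" => 1, "title" => "Hello" }
comments = [
  { "_id" => 10, "post_id" => 1, "author" => "ann", "body" => "Nice" },
  { "_id" => 11, "post_id" => 2, "author" => "bob", "body" => "Hmm" }
]

# With no joins, the application combines referenced documents itself,
# typically by issuing a second query filtered on the reference field.
def comments_for(post, comments)
  comments.select { |c| c["post_id"] == post["_id"] }
end

p embedded_post["comments"].size
p comments_for(post, comments).map { |c| c["author"] }
```

Which option fits depends on the queries: if comments are only ever shown with their post, embedding is simpler; if they are searched independently, standalone documents are usually easier to query.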
We are currently switching from PostgreSQL, tsearch, and PostGIS in a production application. It has been a challenging process to say the least. Our data model is a better fit for mongodb because we don't need to do complex joins. We can model our data very easily into the nested document structure mongodb provides.
We have started a mirror site with the mongodb changes in it so we can leave the production site alone, while we stumble through the process. I don't want to scare you, because in the end, we will be happy we made the switch - but it is a lot of work. I would agree with the answer from rubish: be informed, and make the decision you feel is best. Don't base it on the 'coolness' factor.
If you must change, here are some tips from our experience:
ElasticSearch fits well with mongo's document structure to replace PostgreSQL's tsearch full text search extensions.
It also has great support for point based geo indexing. (Points of interest closest to, or within x miles/kilometers)
We are using Mongo's built in GridFS to store files, which works great. It simplifies the sharing of user contributed images, and files across our cluster of servers.
We are using rake tasks to dump data out of postgresql into yaml format. Then, we have another rake task in the mirror site which imports and converts the data into models stored in mongodb.
The data export/import might work using a shared redis database, resque on both sides, and an observer in the production application to log changes as they happen.
We are using Mongoid as our ODM, and there are a lot of scopes within our models that needed to be rewritten to work with Mongoid vs ActiveRecord.
Overall, we are very happy with MongoDB. It offers us much more flexibility in the way we model our data. I just wish we had discovered it before the project was started.
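The YAML export/import step described in the tips can be sketched as follows. This is only an outline under stated assumptions: the helper names and row shape are hypothetical, and in the real setup the rows would come from PostgreSQL inside one rake task and be saved as Mongoid models in another.

```ruby
require "yaml"

# Export: turn rows (plain hashes) into a YAML string.
def dump_rows(rows)
  YAML.dump(rows)
end

# Import: parse the YAML back and convert each row to the target document
# shape. The block holds the per-row conversion, which is specific to the app.
def import_rows(yaml_text)
  YAML.safe_load(yaml_text).map { |row| yield(row) }
end

rows = [{ "id" => 1, "name" => "Cafe" }]
docs = import_rows(dump_rows(rows)) do |row|
  { "_id" => row["id"], "name" => row["name"] }
end
p docs
```

Keeping the dump as plain hashes with string keys means the file can be loaded with YAML.safe_load, without depending on the old ActiveRecord classes being present on the importing side.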
If you are creating a new app, skip Active Record when generating it.
Alternatively, if you’ve already created your app, have a look at config/application.rb
and change the first lines from this:
require "rails/all"
to this:
require "action_controller/railtie"
require "action_mailer/railtie"
require "active_resource/railtie"
require "rails/test_unit/railtie"
It’s also important to make sure that the reference to active_record in the generator block is commented out:
# Configure generators values. Many other options are available, be sure to check the documentation.
# config.generators do |g|
# g.orm :active_record
# g.template_engine :erb
# g.test_framework :test_unit, :fixture => true
# end
As of this writing, it’s commented out by default, so you probably won’t have to change anything here.
I hope this will be helpful to you while switching your app from AR to Mongo.

Caching Rails models between requests - bad idea?

I have a complex query that's executed on every page and whose results rarely change, so I'd like to cache it in memcached and expire it manually when it's time to update it. The simplest way would be to cache the resulting model objects themselves. But I've seen vague warnings that Active Record models shouldn't be persisted between requests, because Bad Things Can Happen.
Is that true? Is there any decent write-up of the behavior of models between requests? And if that's a bad idea, what are some corresponding good ideas?
I know Devise uses ActiveSupport::Dependencies::Reference to cache references to classes, but I can't find any documentation on that anywhere, and I don't know if that's what I want or why.
Caching queries is completely OK. Just keep in mind what you are doing.
One example can be found in Heroku's documentation.
BTW, keep in mind that Rails already does SQL caching.
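One way to sidestep worries about persisting model objects is to cache only plain data extracted from them. The sketch below is an assumption-laden illustration, not something the answer prescribes: Record is a stand-in for an ActiveRecord model, cacheable_rows is a hypothetical helper, and Marshal stands in for the serialization a cache store performs when writing to memcached.

```ruby
# Stand-in for an ActiveRecord model returned by the expensive query.
Record = Struct.new(:id, :name)

# Reduce model objects to plain hashes of the attributes the page needs.
def cacheable_rows(records)
  records.map { |r| { "id" => r.id, "name" => r.name } }
end

rows = cacheable_rows([Record.new(1, "first"), Record.new(2, "second")])

# A cache store serializes the value on write and deserializes it on read.
# Plain hashes, strings and integers survive that round trip without
# depending on any application class being loaded.
payload  = Marshal.dump(rows)
restored = Marshal.load(payload)
p restored
```

The cached value can then be expired manually, by deleting its key, whenever the underlying data changes.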
