Can I write and leverage memcached from different server processes? - ruby-on-rails

Say you have a Rails app, and you're already using queueing (Resque) to offload some slow/non-urgent processing on the server. That queued processing performs some functionality needed for the Rails app - and then saves information into a memcached store... Everything is good.
But is it possible to write a component triggered by queueing that runs a Go application that, in turn, leverages the same underlying database as the Rails app and writes to the same in-memory store?
Is this common? Not so trivial? The database schema would be familiar to both Rails and Go, and while the Go app might have some duplication of business logic, it's pretty siloed. Think of it as a way to gradually migrate some server functionality running in Rails to running in Go. Is this done in practice?

It's pretty common to have multiple encapsulated applications interact with a shared data store like memcached. This is fine to do in practice and which technology each of the apps is written in doesn't matter, just as long as they can access the store. In such an environment you may require some additional business logic to coordinate reads and writes which shouldn't be overlooked because it could become a lot of work.
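One piece of that coordination is agreeing on key names and a value format that both apps can read. Ruby cache clients commonly serialize values with Marshal by default, which a Go process cannot decode, so a language-neutral format such as JSON is the usual choice. A minimal sketch of such a convention follows; the module name, namespace, and key layout are assumptions made for illustration, and a plain Hash stands in for the memcached client.

```ruby
require 'json'

# Hypothetical convention shared by the Rails and Go apps:
# keys look like "<namespace>:v<schema version>:<type>:<id>", values are JSON.
module SharedCache
  NAMESPACE = 'myapp'  # assumed application name
  SCHEMA_VERSION = 1   # bump when the JSON layout changes

  def self.key(type, id)
    "#{NAMESPACE}:v#{SCHEMA_VERSION}:#{type}:#{id}"
  end

  def self.encode(value)
    JSON.generate(value)
  end

  def self.decode(raw)
    raw.nil? ? nil : JSON.parse(raw)
  end
end

# A plain Hash stands in for the memcached client here.
store = {}
store[SharedCache.key('report', 42)] = SharedCache.encode('total' => 17, 'status' => 'done')

report = SharedCache.decode(store[SharedCache.key('report', 42)])
puts report['total'] # prints 17
```

Putting a version in the key means either app can change the value layout without the other one reading data it does not understand; old keys simply expire.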

Related

How to optimize Rails for one kiosk user?

I am playing around with using Rails to underpin a kiosk. This is a terminal where there is only one local user at a time.
Under this system, a browser like Chrome would access the Rails app.
Things I assume would be helpful:
Super-fast, very lightweight Rails server (I'm using Puma).
Eliminating standard processes/assumptions that are meant for internet website contexts (caching, CDNs, middleware, etc.).
In some level of detail preferably, how should one set up a Rails app for maximum performance in a single-user kiosk?
This might sound like a non-answer, but the approach I would take is to use Rails in its default (production) configuration, and optimise performance issues as they arise in your test bed. Running Rails in production mode will likely give you more than enough performance if you have a dedicated machine for a single user (often you'll have many clients to a single Rails instance). Without testing the application, you could sink a considerable amount of time into optimisations that don't impact the user experience.
It may be worth sitting Rails behind Apache/nginx (Passenger is a well understood way to get a Rails app on Apache) to serve your static assets, but from the information provided so far I'd be surprised if performance optimisation was necessary at this stage.
A challenge that might be worth considering at this stage is how you'll deploy changes to your kiosk/set of kiosks. Will they be brought in for updates or need to have changes applied over-the-air? That will likely impact how you deploy it onto the machine, and in my experience is a harder thing to change later on.

Rails App + Background Process Sharing Models

I've been planning out a Rails RSS aggregator lately, and I ran into something I could use some advice on. The part that would handle the polling and parsing of users' subscribed feeds needs to be running constantly, which I assume a daemon is probably the best option for. (I would use the Daemons gem and have the daemon periodically query the database for feeds in need of refreshing, then use Feedzirra to parse and save items.)
My question is: how would the daemon share the models and migrations from Rails, especially if the daemon were running on another server, should the app require it for scalability? (i.e. database server, feed crawler server, and instances of the front-end) I'm probably falling victim to "premature scaling," but as a Ruby newbie I'm interested in what the best way to handle this would be. For the sake of "doing it the right way" the first time.
Or am I going about this the wrong way?
As @house9 pointed out, you should use DelayedJob for this (https://github.com/collectiveidea/delayed_job).
DJ loads the whole Rails environment and can run as a separate process, even on a separate server. That's the easiest way to go.
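Whichever runner is used, the job itself boils down to "find the feeds that are due, then fetch them". A rough sketch of the selection step in plain Ruby follows. The Feed struct and its fields are hypothetical stand-ins for whatever the shared Rails model would define; in a real app this would be a database query inside the job rather than an in-memory filter.

```ruby
# Hypothetical stand-in for the Rails model; field names are assumptions.
Feed = Struct.new(:url, :last_fetched_at, :refresh_every) do
  # A feed is due if it was never fetched or its interval has elapsed.
  def due?(now)
    last_fetched_at.nil? || now - last_fetched_at >= refresh_every
  end
end

def feeds_due(feeds, now)
  feeds.select { |feed| feed.due?(now) }
end

now = Time.at(10_000)
feeds = [
  Feed.new('http://a.example/rss', nil, 900),         # never fetched
  Feed.new('http://b.example/rss', now - 1_000, 900), # overdue
  Feed.new('http://c.example/rss', now - 60, 900)     # fetched a minute ago
]

puts feeds_due(feeds, now).map(&:url)
```

Only the first two feeds are selected. Because the worker loads the same model code as the web app, this logic lives in one place no matter which server runs it.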

Create multiple Rails servers sharing same database

I have a Rails app hosted on Heroku. I have to do long backend calculations and queries against a MySQL database.
My understanding is that using DelayedJob or Whenever gems to invoke backend processes will still have impact on Rails (front-end) server performance. Therefore, I would like to set up two different Rails servers.
The first server is for front-end (responding to users' requests) as in a regular Rails app.
The second server (also a Rails server) is for back-end queries and calculation only. It will only read from MySQL, do calculations, then write results into another Redis server.
My sense is that not a lot of Rails developers do this. They prefer running background jobs on a Rails server and adding more workers as needed. Is my server structure a good design, or is it overkill? Are there any pitfalls I should be aware of?
Thank you.
I don't see any reason why a background job like DelayedJob would cause any more overhead on your main application than another server would. DelayedJob runs in its own process, so the dynos for your main app aren't affected. The only impact could be on the database queries, but that will be the same whether they come from a background job or from another app altogether that is accessing the same database.
I would recommend using DelayedJob and workers on your primary app. It keeps things simple and shouldn't be any worse performance-wise.
One other thing to consider, if you are really worried about performance, is a database "follower". This is effectively a second database that keeps itself up to date with your primary database but can only be used for reads (not writes). There may be better documentation about it, but you can get the idea here: https://devcenter.heroku.com/articles/fast-database-changeovers#create_a_follower. You could then have these lengthy background jobs read data from the follower, leaving your main database completely unaffected.

How is request processing with rails, redis, and node.js asynchronous?

For web development I'd like to mix rails and node.js since I want to get the best out of both worlds (rails for fast web development and node for concurrency). I know that some people choose to just use full ruby stack with eventmachine that is integrated into rails controller so that every request can be nonblocking by using fiber in event-loop model. I have been able to understand how that works in a big picture.
At this moment, however, I want to try doing nonblocking request processing with rails and node.js using the message queue concept. I heard that this can be achieved by using redis as an intermediary. I'm still having trouble figuring out how that works. From what I can understand: we have 2 apps, A (rails) and B (node.js), and redis. The rails app will handle requests from users that go through controllers in a REST manner, and from there rails will pass them to redis; redis will form queues, and the node.js app will pick up that queue and do whatever is necessary afterwards (write to or read from the backend db).
My questions:
1. So how would that improve concurrency and scalability? From what I know, since rails handles the requests through controllers synchronously and then writes to redis, the requests will still be blocking, even though the node.js end can pick up the queue asynchronously. (I have a feeling that it's not asynchronous yet if it's not end-to-end non-blocking.)
2. Would node.js be considered a proxy or an application here if redis is the intermediary?
3. I'm new to redis and still learning it. If I'm using a 100% noSQL solution for my backend database, such as mongoDB or couchDB, is it entirely replaceable by redis, or is redis seen more as a message queue tool like rabbitMQ?
4. Is a message queue a different concurrency concept from threading or the event-loop model, or is it supposed to supplement them?
Those are all my questions. I'm new to the message queue concept. I'd appreciate any help, pointers in the right direction, and articles that help me learn more. Thanks.
You are mixing some things here that don't go together.
Let's first make sure we are on the same page regarding the strengths/weaknesses of the involved technologies:
Rails: Used for its web-development simplicity and perfect for serving database-backed web applications.
Not very performant when having to serve a large number of long-running requests, as you'd run out of threads on your Ruby workers - but well suited for anything that can scale horizontally with more web nodes (multiple web servers - 1 db).
Node.js: Great for high-concurrency scenarios. Not as easy as Rails for writing a regular web application. But it can handle a nearly insane number of long-running low-CPU tasks efficiently.
Redis: A key-value store that supports operations on its data structures (increment/decrement values, append/prepend, push/pop on lists - all operations that make this DB work consistently with multiple clients writing at once).
Now as you can see, there is no benefit in having Rails AND Node serve the same request - communicating through Redis. Going through the Rails stack would not provide any benefit if the request ends up being handled by the Node server.
And even if you only offload some processing to the node server, it's still the Rails webserver that handles the requests and has to wait for a response from node - killing the desired scalability. It simply makes no sense.
Where you would use a setup with Node and Rails together is in certain areas of your app that have drastically different scaling requirements.
If you are for example writing a Website that displays live stats for Football games you can easily see that there are two different concerns in your app: The "normal" Site that contains signup, billing and profile stuff that screams for a quick implementation through rails. And the "live" portion of the site where users see live results and you expect to handle a lot of clients at once - all waiting for something to happen (low cpu - high concurrency).
In such a case it may be beneficial to actually separate the two parts of the site into a Ruby and a Node app, which then share data about the user through a store like Redis (but actually you just need some shared state that both can look at and write to for synchronization purposes).
So you would, for example, use Rails for the signup/login portions. Once the user has signed up, write the session cookie into Redis along with the user's permissions (which games they are allowed to follow) and hand the user off to the Node.js app.
There the Node app can read the session information from Redis and serve the user.
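That handoff can be sketched roughly as follows. Both halves are written in Ruby here for consistency, with a Hash standing in for Redis; the key layout ("session:<token>") and the JSON field names are assumptions made for illustration, not anything the answer prescribes.

```ruby
require 'json'
require 'securerandom'

# A Hash stands in for Redis; key and field names are illustrative.
STORE = {}

# Rails side: after login, record who the user is and what they may follow.
def hand_off(user_id, game_ids)
  token = SecureRandom.hex(16)
  STORE["session:#{token}"] = JSON.generate('user_id' => user_id, 'games' => game_ids)
  token # would be sent to the browser as the session cookie
end

# Node side (shown in Ruby): look the session up and check the permission.
def may_follow?(token, game_id)
  raw = STORE["session:#{token}"]
  return false if raw.nil? # unknown or expired session

  JSON.parse(raw)['games'].include?(game_id)
end

token = hand_off(7, [101, 102])
puts may_follow?(token, 101)   # true
puts may_follow?(token, 999)   # false
puts may_follow?('bogus', 101) # false
```

With a real Redis server the write would normally carry an expiry so that abandoned sessions disappear on their own.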
Word of advice:
You don't get scalability by simply throwing Node.js into your Toolbox. You really have to find out what Node.js is good at (low-cpu high-io concurrent operations) and how you can leverage that to remedy some of the problems your currently chosen technology has.
I can answer 3 for you. Redis does not guarantee that the result of an operation is actually on disk when the operation returns, and its transaction handling is a bit "different". It also requires the whole database to be in memory. Depending on the situation this can be an issue or not. It is, however, incredibly fast. It is not a messaging queue; you can easily make a queue out of it, but that is not its purpose. If you only want a queuing system, you can probably do better with something else.
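To make the queue idea itself concrete (question 4), here is a minimal in-process producer/consumer sketch using Ruby's built-in Queue. It only illustrates the shape of the pattern: in the setup discussed above the queue would live in Redis (for instance a list written with LPUSH and read with BRPOP) and the consumer would be a separate process. The handle_request helper and the job names are made up for the example.

```ruby
jobs = Queue.new    # stands in for a Redis list shared by both apps
results = Queue.new

worker = Thread.new do
  # Consumer: pop blocks until a job arrives; nil is the stop signal.
  while (job = jobs.pop)
    results << "processed #{job}"
  end
end

# Producer: the request handler only enqueues, so it returns right away.
def handle_request(jobs, payload)
  jobs << payload
  :accepted
end

status = handle_request(jobs, 'report-1')
handle_request(jobs, 'report-2')
jobs << nil         # tell the worker to stop
worker.join

puts status         # prints accepted
puts results.size   # prints 2
```

The queue decouples when work is accepted from when it is done. It supplements threads or an event loop rather than replacing them: something still has to run the consumer.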

Rails best practice: background process/thread?

I'm coming from a PHP environment (at least in terms of web dev) and into the beautiful world of Ruby, so I may have some dumb questions. I imagine there are some fundamentally different options available when not using PHP.
In PHP, we use memcache to store alerts we want to display in a bar along the top of the page. When something happens that generates an alert (such as a new blog post being made), a cron script that runs once every 5 minutes or so puts that information into memcache.
Now when a user visits the site, we look in memcache to find any alerts that they haven't already dismissed and we display them.
What I'm guessing I can do differently in Rails, is to by-pass the need for a cron script, and also the need to look in memcache on every request, by using a Singleton and a polling process running in a separate thread to copy from memcache to this singleton. This would, in theory, be more optimized than checking memcache once-per-request and also encapsulate the polling logic into one place, rather than being split between a cron task and the lookup logic.
My question is: are there any caveats to having some sort of runloop in the background while a Rails app is running? I understand the implications of multithreading, from Objective-C/Java, but I'm asking specifically about the Rails (3) environment.
Basically something like:
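require 'singleton'

class SiteAlertsMap < Hash
  include Singleton

  def initialize
    super
    begin_polling
  end

  # ... SNIP, any specific methods etc ...

  private

  def begin_polling
    # Create some other Thread here, which polls at set intervals
    Thread.new do
      loop do
        # (refresh the map from memcache here)
        sleep 300 # roughly the 5-minute interval of the cron script
      end
    end
  end
end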
This leads me into a similar question. We push (encrypted) tasks onto an SQS queue, for things related to e-commerce and for long-running background tasks. We don't use cron for this, but rather we have a worker daemon written in PHP, which runs in the background. Right now when we deploy, we have to shut down this worker and start it again from the new code-base. In Rails, could I somehow have this process start and stop with the rails server (unicorn) itself? I don't think that's something I'd want running on the main process in a separate thread, since we often want to control it as a process by itself, but it would be nice if it just conveniently ran when the web application was running.
Threading for background processes in Ruby would be a terrible mistake, especially since you're using a multi-process server. Using unicorn with, say, 4 worker processes would mean that you'd be polling from each of them, which is not what you want. Ruby doesn't really have real threads: it has green threads in 1.8 and a global interpreter lock in 1.9, IIRC. Many gems and libraries are also obnoxiously unthreadsafe.
Using memcache is still your best option and, if you have it set up correctly, you should only see it adding a millisecond or two to the request time. Another option which would give you the benefit of persisting these alerts while incurring minimal additional overhead would be to store these alerts in redis. This would better protect you against things like memcache crashing or server reboots.
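The per-request lookup can stay very small. A rough sketch, assuming the alerts are stored as one JSON array under a single key and that the user's dismissed alert ids are already at hand; the key name, the field names, and the Hash standing in for the memcache client are all assumptions.

```ruby
require 'json'

# Hypothetical cache layout: one JSON array of alerts under "site_alerts".
# A Hash stands in for the memcache client.
def visible_alerts(cache, dismissed_ids)
  raw = cache['site_alerts']
  return [] if raw.nil? # cache miss: show no alerts rather than fail the page

  JSON.parse(raw).reject { |alert| dismissed_ids.include?(alert['id']) }
end

cache = {
  'site_alerts' => JSON.generate([
    { 'id' => 1, 'text' => 'New blog post' },
    { 'id' => 2, 'text' => 'Maintenance tonight' }
  ])
}

puts visible_alerts(cache, [1]).map { |a| a['text'] } # prints Maintenance tonight
```

That is one cache read per request, with no shared in-process state to keep in sync across unicorn workers.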
For the background jobs you should use a similar approach to what you have now, but there are several off the shelf handlers for this like resque, delayed_job, and a few others. If you absolutely have to use SQS as the backend queue, you might be able to find some code to help you, but otherwise you could write it yourself. This still requires the other daemon to be rebooted whenever there is a code change. In practice this isn't a huge concern as best practices dictate using a deployment system like capistrano where a rule can easily be added to bounce the daemon on deploy. I use monit to watch the daemon process, so restarting it is as easy as telling monit to restart it.
In general, Ruby is not like Java/Objective-C when it comes to threads. It follows the more Unix-like model of process-based isolation, but the community has come up with best practices and ways to make this less painful than in other languages. Ruby does require a bit more attention to setting up its stack, as it is not as simple as enabling mod_php and copying some files around, but once the choices and architecture are understood, it is easier to reason about how your application works. The process model, in my opinion, is much better for web apps as it isolates code and state from the effects of other running operations. The isolation also makes the app easier to work with in a distributed system.
