Advice on communication between rails front-end and scala backend - ruby-on-rails

We are developing the following setup:
A rails webapp allows users to make requests for tasks that are passed on to a Scala backend to complete, which can take up to 10 seconds or more. While this is happening, the page the user used to make the request periodically polls rails using AJAX to see if the task is complete, and if so returns the result.
From the user's point of view the request is synchronous, except their browser doesn't freeze and they get a nice spinny thing.
The input data needed by the backend is large and has a complex structure, as does the output. My initial plan was to simply have the two apps share the same DB (which will be MongoDB), so the rails app could simply write an id to a 'jobs' table which would be picked up by the scala backend running as a daemon, but the more I think about it the more I worry that there may be lots of potential gotchas in this approach.
The two things that worry me the most is the duplication of model code, in two different languages, which would need to be kept in sync, and the added complexity of dealing with this at deployment. What other possible problems should I take into account when assessing this approach?
Some other possibilities I'm looking at are 1) making the Scala backend a RESTful service or 2) implementing a message queue. However, I'm not fully convinced about either option as they will both require more development work and it seems to me that in both cases the model code is effectively duplicated anyway, either as part of the RESTful API or as a message for the message queue - am I wrong about this? If one of these options is better, what is a good way to approach it?

I have used several times resque for similar problems and Ive been always very happy with it, it gives you all you need to implement a job queue and is backed on redis. I would strongly suggest you to take a look at it

Don't take me wrong, but, for this project what does Rails provide that Scala does not?
Put another way, can you do without Scala entirely and do it all in Rails? Obviously yes, but you're developing the back end in Scala for a reason, right? Now, what does Scala provide that Rails does not?
I'm not saying you should do one or the other, but maintaining duplicate code is asking for trouble/complication.
If you have a mountain of Ruby code already invested in the project, then you're committed to Rails, barring a complete Twitter-esque overhaul (doesn't sound like that's an option nor desired).
Regardless, not sure how you are going to get around the model duplication. Hitting a Mongo backend does buy you BSON result sets, which you could incorporate with Spray, their excellent implementation of Akka, out-of-the-box REST, and JSON utilities.
Tough problem, sounds like you're caught between 2 paradigms!

Related

Read-only web apps with Rails/Sinatra

I am working on an administrative web app in Rails. Because of various implementation details that are not really relevant, the database backing this app will have all of the content needed to back another separate website. It seems like there are two obvious options:
Build a web app that somehow reads from the same database in a read-only fashion.
Add a RESTful API to the original app and build the second site in such a way as for it to take its content from the API.
My question is this: are either of these options feasible? If so, which of them seems like the better option? Do Rails, Sinatra, or any of the other Rack-based web frameworks lend themselves particularly well to this sort of project? (I am leaning towards Sinatra because it seems more lightweight than Rails and I think that my Rails experience will carry-over to it nicely.)
Thanks!
Both of those are workable and I have employed both in the past, but I'd go with the API approach.
Quick disclaimer: one thing that's not clear is how different these apps are in function. For example, I can imagine the old one being a CRUD app that works on individual records and the new one being a reporting app that does big complicated aggregation queries. That makes the shared DB (maybe) more attractive because the overlap in how you access the data is so small. I'm assuming below that's not the case.
Anyway, the API approach. First, the bad:
One more dependency (the old app). When it breaks, it takes down both apps.
One more hop to get data, so higher latency.
Working with existing code is less fun than writing new code. Just is.
But on the other hand, the good:
Much more resilient to schema changes. Your "old" app's API can have tests, and you can muck with the database to your heart's content (in the context of the old app) and just keep your API to its spec. Your new app won't know the difference, which is good. Abstraction FTW. This the opposite side of the "one more dependency" coin.
Same point, but from different angle: in the we-share-the-database approach, your schema + all of SQL is effectively your API, and it has two clients, the old app and the new. Unless your two apps are doing very different things with the same data, there's no way that's the best API. It's too poorly defined.
The DB admin/instrumentation is better. Let's say you mess up some query and hose your database. Which app was it? Where are these queries coming from? Basically, the fewer things that can interact with your DB, the better. Related: optimize your read queries in one place, not two.
If you used RESTful routes in your existing app for the non-API actions, I'm guessing your API needs will have a huge overlap with your existing controller code. It may be a matter of just converting your data to JSON instead of passing it to a view. Rails makes it very easy to use an action to respond to both API and user-driven requests. So that's a big DRY win if it's applicable.
What happens if you find out you do want some writability in your new app? Or at least access to some field your old app doesn't care about (maybe you added it with a script)? In the shared DB approach, it's just gross. With the other, it's just a matter of extending the API a bit.
Basically, the only way I'd go for the shared DB approach is that I hated the old code and wanted to start fresh. That's understandable (and I've done exactly that), but it's not the architecturally soundest option.
A third option to consider is sharing code between the two apps. For example, you could gem up the model code. Now your API is really some Ruby classes that know how to talk to your database. Going even further, you could write a Sinatra app and mount it inside of the existing Rails app and reuse big sections it. Then just work out the routing so that they look like separate apps to the outside world. Whether that's practical obviously depends on your specifics.
In terms of specific technologies, both Sinatra and Rails are fine choices. I tend towards Rails for bigger projects and Sinatra for smaller ones, but that's just me. Do what feels good.

Is it sensible to run Backbone on both server (Node.js and Rails) and client for complex validations?

This is verbose, I apologise if it’s not in accordance with local custom.
I’m writing a web replacement for a Windows application used to move firefighters around between fire stations to fill skill requirements, enter sick leave, withdraw firetrucks from service, and so on. Rails was the desired back-end, but I quickly realised I needed a client-side framework and chose Backbone.js.
Only one user will be on it at a time, so I don’t have to consider keeping clients in sync.
I’ve implemented most of the application and it’s working well. I’ve been avoiding facing a significant shortcoming, though: server-side validations. I have various client-side procedures ensuring that invalid updates can’t be made through the interface; for instance, the user can’t move someone who isn’t working today to another station. But nothing is stopping a malicious user from creating a record outside of the UI and saving it to the server, hence the need for server-side validation.
The client loads receives all of today’s relevant records and processes them. When a new record is created, it’s sent to the server, and processed on the client if it saved successfully.
The process of determining who is working today is complex: someone could be scheduled to work, but have gone on holidays, but then been called in, but then been sent home sick. Untangling all this on the server (on each load?!) in Ruby/Rails seems an unfortunate duplication of business logic. It would also have significant overhead in a specific case involving calculating who is to be temporarily promoted to a higher rank based on station shortages and union rules, it could mean reloading and processing almost all today’s data, over and over as each promotion is performed.
So, I thought, I have all this Backbone infrastructure that’s building an object model and constraining what models can be created, why not also use it on the server side?
Here is my uncertainty:
Should I abandon Rails and just use Node.js or some other way of running Backbone on the server?
Or can I run Node.js alongside Rails? When a user opens the application, I could feed the same data to the browser and Node, and Rails would check with the server-side Backbone to make sure the proposed new object was valid before saving it and returning it to the browser.
One factory is how deeply Rails is entrenched in this application. There isn’t that much server-side Ruby for creation/deletion of changes, but I made a sort of adaptation layer for loading the data to compensate for the legacy database model. Rails is mostly just serving JSON, and CSS, Javascript, and template assets. I do have a lot of Cucumber features, but maybe only the data-creation ones would need to be updated?
Whew! So, I’m looking for reassurance: is it reasonable, like suggested in this answer, to be running both Rails and Node on the server, with some kind of inter-process communication? Or has Rails’s usefulness shrunk so much (it is pretty much a single-page application like mentioned in that answer) that I should just get rid of it entirely and suffer some rewriting to a Node environment?
Thanks for reading.
It doesn't sound like you're worried a lot about concurrency as much as being able to do push data, which both platforms are completely capable of performing. If you have a big investment in Ruby code now, and no one is complaining about its use then what might be the concern? If you just want to use Node for push and singularity of using javascript through the stack then it might be worth it to move code over to it. From your comments, I really feel it is more about what is interesting to you, but you'll have to support the language(s) of choice. If you're the only one on the team then its pretty easy to just slip into a refactor to node just because it is interesting. Just goes back to who's complaining more, you or the customer. So to summarize: Node lets you move toward a single language in your code base, but you have to worry about what pitfalls javascript on the server has currently. Ruby on Rails is nice because you have all the ability to quickly generate features and proto them out.

Best practices for accessing data between Rails applications?

I have two Rails applications whose data are pretty interdependent. If I needed to easily access models of another application in Django or otherwise share them, this was trivial, as I would simply include both applications in my project and be able to import the models of the other application. At that point, I could just access objects in the other app's database using Django's ORM and it was consistent and awesome.
In Rails, I'm getting the overwhelming vibe that the way to do this is by creating RESTful API hooks in each application for each thing you want the other application to request. Seems clean and modular enough, but it's starting to get messy because I have a situation as follows:
Make GET request to application A
Application A needs to fetch data from application B, so it makes a GET request to an API hook on application B.
As part of the function in B's controller, it needs data that is part of A's data model, so it makes requests to A.
Steps 2-3 are happening in succession
It's obvious this is creating a scalability nightmare and in our case, requests are timing out because of this race condition in which the new requests are waiting for the original request to complete but it never will until the new ones complete. We can spin up more web server processes, but I feel like there has got to be a better way to do this that doesn't involve making these superfluous GET requests when I maintain both of these applications.
Ultimately, my questions are 1) is there a more direct approach to getting another Rails app's data than issuing a GET request, or am I just trying to use what's actually a great design in a way that is killing it? and 2) If I am fighting against the grain of the design pattern and this really is a great approach, is there any general advice you can offer that will help me keep the data in these apps from having a dependency on one another that causes race conditions like this? I realize I could pass the information that app B will need from A up front (it is just one field right now but could later be more), but is that right?
If there is a large portion of shared data/models, put those models into a gem/plugin and use them in both apps. Keep the DB config per-app, though, IMO.
If they're that tightly interwoven, I'm not 100% convinced that one or both apps are doing it right, though.
I also happened to discover that Mongoid has support for multiple databases, which allows us to do exactly what I was aiming to do: http://mongoid.org/docs/installation/replication.html

How to refactor a Rails model that contains significant amount of back-end code?

I have a Rails 2.X model with 421 lines of code/comments that does a significant amount of work on the backend (opening HTTP Get requests, parsing RSS, parsing HTML, etc.). At the same time, I'm moving to Resque in order to more quickly get through this backend code. I'm wondering what the best way to refactor this would be. Should I move this back-end code to a library that I include in the model? A module? A gem?
Your thoughts would be much appreciated.
I basically have a separate core tasks for each data item i'm processing. i.e. parsing an RSS feed, parsing a HTTP URL, running regex on that html body, as well as a few other tasks and right now i have 500 or lines of code within the model; even though most of the stuff the model does is called through a back-end script run by cron
So to make it more wieldy and to make it easier to move to resque; i was thinking of doing separate classes for each resque queue, and using static methods there
Then I can require those classes by the back-end 'controller' script if you will... Does this approach make sense?
Your best bet from a testability and thought process standpoint is probably to break out those various concerns into their own (non-ARec) models. You might have RssParser, HtmlParser, ServiceRequest, etc., based on your first paragraph.
Depending on where this stuff is used (multiple projects?), it might make sense to make your own gem and version it.
I have written big and small resque classes before, and you'll save yourself a lot of pain if you make the resque classes as thin as possible.
The model in question is doing many things.
In a well-structured application, every class does one thing and does it well. This feature of a well-structured application is part of what makes it highly testable.
You should break apart the many things that the one model does into multiple self-contained classes, keeping the dependencies between them minimal. Then you will be able to test each of those new classes easily, and you will have a number of new stub points to test the overall model.
In my experience, if this code is going to be used in more than one application, maybe a gem is worth considering. Otherwise, adding the layer of indirection by moving the code to a gem doesn't seem to yield much benefit. In fact it can potentially slow down development depending on your deploy situation.
As for how to re-factor the model, look through all the model's code and as yourself the question "What is this model doing?". If you end up with lots of 'and's or 'or's, then you should probably strive to make each item delimited by 'and' or 'or' its own class (see http://en.wikipedia.org/wiki/Single_responsibility_principle). Breaking out the responsibilities in this manner makes the individual concerns much easier to write tests against. Especially when making external HTTP api calls, etc.

How do you read an existing Rails project?

When you start working on an existing Rails project what are the steps that you take to understand the code? Where do you start? What do you use to get a high level view before drilling down into the controllers, models, helpers and views? Do you have any specific techniques, tricks or tools that help speed up the process?
Please don't respond with "Learn Rails & Ruby" (like one of the responses the last guy who asked this got - he also didn't get much response to his question so I thought I would ask again and prompt a bit more). I'm pretty comfortable with my own code. It's sorting other people's that does my head in and takes me a long time to grok.
Look at the models. If the app was written well, this should give you a picture of its domain model, which is where the interesting logic should live. I also look at the tests for the models.
The way that the controllers/views were implemented should be apparent just by using the Rails app and observing the URLs.
Unfortunately, there are many occasions where too much logic lives in controllers and even views. That means you'll have to take a look into those directories too. Doubley-unfortunate, tests for these layers tend to be much less clear.
First I use the app, noting the interesting controller and action names.
Then I start reading the code for these controllers, and for the relevant models when necessary. Views are usually less important.
Unlike a lot of the people so far, I actually don't think tests are the place to start. I think they're too narrow, too focused. It'd be like trying to understand basic physics/mechanics by first zooming into intra-molecular forces and quantum mechanics. I also think you're relying too much on well-written tests, and in my experience, a lot of people don't write sufficient tests or write poor tests (which don't give an accurate sense of what the code should actually do).
1) I think the first thing to do is to understand what the hell the app actually does. Use it, at least long enough to build an idea of what its main purpose is and what the different types of data might be and which actions you can perform, and most importantly, why.
2) You need to step back and see the big picture. I think the best way to do that is by starting with schema.rb. This tells you a few really important things:
What is the vocabulary/concepts of this project. What does "User" actually mean in this app? Why does the app have both "User" and "Account" models and how are they different/related?
You could learn what models there are by looking in app/models but this will actually tell you what data each model holds.
Thanks to *_id fields, you'll learn the associations between the models, which helps you understand how it all fits together.
I'd follow this up by looking at each model's *.rb file for (hopefully) comments, validations, associations, and any additional logic relevant to each. Keep an eye out for regular ol' Ruby classes that might live in lib/.
3) I, personally, would then take a brief glance at routes.rb as it will tell you two key things: a brief survey of all of the actions in the app, and, if the routes and controllers/actions are well named and thought out, a quick sense of where different functionality might live.
At this point you're probably ready to dig into a specific thing you need to learn. Find the controller for the feature you're most interested in and crack it open. Start reading through the relevant actions, see which models are involved, and now maybe start cracking open tests if you want to.
Don't forget to use the rest of your tools: Ruby/Rails debuggers, browser dev tools, logs, etc.
I would say take a look at the tests (or specs if the project uses RSpec) to get an idea at the high-level of what the application is supposed to do. Once you understand from the top level how the models/views/controllers are expected to behave, you can drill into the implementations.
If the Rails project is in a somewhat stable state than I have always been a big fan of using the debugger to help navigate the code base. I'll fire up the browser and begin interacting with the app then target some piece of functionality and set a breakpoint at the beginning of the associated function. With that in place I just study the parameters going into the function and the value being returned to get a better understanding of what's going on. Once you get comfortable you can modify the functionality a little bit to ensure you understand what's going on. Just performing some static analysis on the code can be cumbersome! Good luck!
I can think of two reasons to be looking at an existing app with which I have no previous involvement: I need to make a change or I want to understand one or more aspects because I'm considering using them as input to changes I'm considering making to another app. I include reading-for-education/enlightenment in that second case.
A real benefit of the MVC pattern in particular, and many web apps in general is that they are fairly easily split into request/response pairs, which can to some extent be comprehended in isolation. So you can start with a single interaction and grow your understanding out from that.
When needing to modify or extend existing code, I should have a good idea of what the first change will be - if not then I probably shouldn't be fooling with the code yet! In a Rails app, the change is most likely to involve view, model or a combination of both and I should be able to identify the relevant items fairly quickly. If there are tests, I check that they run, then attempt to write a test that exposes the missing functionality and away we go. If there are no tests then it's a bit trickier - I'm going to worry that I might inadvertently break something: I'd consider adding tests to give myself a more confidence, which will in turn start to build some understanding of the area under study. I should fairly quickly be able to get into a red-green-refactor loop, picking up speed as I learn my way around.
Run the tests. :-)
If you're lucky it'll have been built on RSpec, and that'll describe the behavior regardless of the implementation.
I run rake test in a terminal
If the environment does not load, I take a look at the stack trace to figure out what's going on, and then I fix it so that the environment loads and run the tests again
I boot the server and open the app in a browser. Clicking around.
Start working with the tasks at hand.
If the code rocks, I'm happy. If the code sucks, I hurt it for fun and profit.
Aside from the already posted tips of running specs, and decomposing the MVC, I also like:
rake routes
as another way to get a high-level view of all the routes into the app
./script/console
The rails irb console is still my favorite way to inspect models and model methods. Grab a few records and work with them in irb. I know it helps me during development and test.
Look at the documentation, there is pretty good documentation on some projects.
It's a little bit hard to understand other's code, but try it...Read the code ;-)

Resources