Refactoring legacy code - ruby-on-rails

I'm working on a quite large legacy Rails app. Most of the code is downright horrible, and I'm trying to make it better as I go through it.
The problem is, there are no tests, and almost everything is wrong. So far I've gone through the code and made a lot of handwritten notes on stuff that needs to be refactored, so that I can do it later when I get to adding tests.
But there are things that are just so simple and scream for quick refactoring. For example:
def isValid(valid)
  name = Long::AndUglyModule::UglyClass.getvalid(valid)
  return name
end
The whole class looks like this, which makes me want to just rewrite it to
include Long::AndUglyModule
def is_valid(valid)
  UglyClass.getvalid(valid)
end
The problem is that I'm afraid of introducing some subtle mistakes. On the other hand, working with code that looks like this gives me loads of headaches.
Is it better to just do simple refactorings instantly, or leave the code as it is until I actually have to work with it or change it directly?

I have a lot of experience with huge legacy code bases and wholesale refactoring.
Allow a hybrid of new and old.
Clearly separate both code bases.
Log exceptions.
Making the old code more beautiful can be a 10000% waste of time.
Refactor first where it makes sense: remove HTML framesets, declarative navigation. Small functions for cumbersome constructs.
Write a source-processing tool that translates antipatterns in the code into better code. Multi-file regex find-and-replace is needed too, using backreferences such as \1 to recognise a repetition.
Introduce methods where the old code was too long.
Shrink the code: move copied-and-edited pieces into a business-logic class that keeps them all together.
Work through the code one source file at a time.
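The backreference idea can be sketched in Ruby for the antipattern from the question (assign to a variable, then immediately return that same variable). The constant and method names are made up for illustration, and a regex does not understand Ruby syntax, so any result should be reviewed as a diff under version control:

```ruby
# Matches:
#   <indent><var> = <expression>
#   <indent>return <var>
# Group 1 is the indentation, group 2 the variable, group 3 the expression.
# The backreference \2 in the pattern only matches when the returned name
# repeats the assigned one.
ASSIGN_THEN_RETURN = /^([ \t]*)(\w+) = (.+)\n[ \t]*return \2[ \t]*$/

# Collapse "x = expr; return x" into just "expr", keeping the indentation.
def collapse_assign_then_return(source)
  source.gsub(ASSIGN_THEN_RETURN, '\1\3')
end

# Apply the rewrite to every file matching a glob (hypothetical helper;
# nothing calls it here, so no files are touched by loading this sketch).
def rewrite_files(glob)
  Dir.glob(glob).each do |path|
    original = File.read(path)
    rewritten = collapse_assign_then_return(original)
    next if rewritten == original

    File.write(path, rewritten)
    puts "rewrote #{path}"
  end
end

legacy = "def isValid(valid)\n  name = Long::AndUglyModule::UglyClass.getvalid(valid)\n  return name\nend\n"
puts collapse_assign_then_return(legacy)
```

Something like rewrite_files("app/**/*.rb") would then run the same conversion over many sources at once, which is the point of doing it with a tool rather than by hand.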
Before anything else, start with statistics: size in KB, number of lines, percentage processed, timeline. With a Google spreadsheet you can communicate progress and calculate the end date.
How much time legacy code takes is usually underestimated, so make sure you have good documentation.
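The end-date calculation can be as simple as a linear projection. A minimal sketch in Ruby, assuming the remaining lines take as long per line as the ones already processed (rarely true, but a starting point); all names and numbers are illustrative:

```ruby
require "date"

# Share of the code base already processed, as a percentage with one decimal.
def percent_processed(total_lines:, processed_lines:)
  (100.0 * processed_lines / total_lines).round(1)
end

# Linear projection of the finish date from the pace so far.
def projected_end_date(total_lines:, processed_lines:, started_on:, today:)
  raise ArgumentError, "nothing processed yet" if processed_lines <= 0

  elapsed_days = (today - started_on).to_i
  remaining_lines = total_lines - processed_lines
  remaining_days = (remaining_lines * elapsed_days).fdiv(processed_lines).ceil
  today + remaining_days
end

stats = { total_lines: 300, processed_lines: 100 }
puts percent_processed(**stats)
puts projected_end_date(**stats,
                        started_on: Date.new(2024, 1, 1),
                        today: Date.new(2024, 1, 11))
```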
There is much more to say, but this is what immediately relates to "small refactoring."
Conclusion:
One really cannot and should not do anything against the inherent urge of a programmer to refactor small pieces, but refactoring should either be a conversion over multiple sources and occurrences, or while a single use-case is corrected or redesigned.
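For a small rewrite like the is_valid one in the question, one way to reduce the fear of subtle mistakes is to pin down the current behaviour first and compare old and new on the same inputs. A minimal sketch: the body of getvalid is not shown in the question, so the stand-in below is invented, as are the class names Legacy and Refactored and the sample inputs.

```ruby
# Stand-in for the legacy class. The real getvalid is not shown in the
# question, so this body is a hypothetical placeholder.
module Long
  module AndUglyModule
    class UglyClass
      def self.getvalid(valid)
        text = valid.to_s.strip
        text.empty? ? nil : text
      end
    end
  end
end

# Original method, copied as-is.
class Legacy
  def isValid(valid)
    name = Long::AndUglyModule::UglyClass.getvalid(valid)
    return name
  end
end

# Proposed rewrite.
class Refactored
  include Long::AndUglyModule

  def is_valid(valid)
    UglyClass.getvalid(valid)
  end
end

SAMPLE_INPUTS = [nil, "", "  bob ", 42, :sym].freeze

# Returns the inputs for which old and new code disagree.
def mismatches(inputs)
  inputs.reject do |input|
    Legacy.new.isValid(input) == Refactored.new.is_valid(input)
  end
end

puts "mismatching inputs: #{mismatches(SAMPLE_INPUTS).inspect}"
```

Two things this check does not cover: renaming isValid to is_valid breaks every caller that still uses the old name, and include also mixes the module's instance methods and constants into the class, which is exactly the kind of subtle change worth looking for.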

reading/parsing common lisp files from lisp without all packages available or loading everything

I'm doing a project which involves parsing the histories of common lisp repos. I need to parse them into list-of-lists or something like that. Ideally, I'd like to preserve as much of the original source file syntax as possible, in some way. For example, in the case of the text #+sbcl <something>, which I think means "If our current lisp is sbcl, read <something>, otherwise skip it", I'd like to get something like (#+ 'sbcl <something>).
I originally wrote a LALR parser in Python, which sort of worked, but it's not ideal for many reasons. I'm having a lot of difficulty getting correct output, and I have tons of special cases to add.
I figured that what I should really do is use lisp itself, since it already has a lisp parser built in. If I could just read a file into sexps, I could dump it into something (cl-json would do) for further processing down the line.
Unfortunately, when I attempt to read https://github.com/fukamachi/woo/blob/master/src/woo.lisp, I get the error
There is no package with the name WOO.EV.TCP
which is of course coming from line 80 of that file, since that package is defined in src/ev/tcp.lisp, and we haven't read it.
Basically, is it possible to just read the file into sexps without caring whether the packages are defined or if they contain the relevant symbols? If so, how? I've tried looking at the hyperspec reader documentation, but I don't see anything that sounds relevant.
I'm out of practice with actually writing common lisp, but it seems potentially possible to hack around this by handling the undefined package condition by creating a blank package with that name, and handling the no-symbol-of-that-name-in-package condition by just interning a given symbol. I think. I don't know how to actually do this, I don't know if it would work, I don't know how many special cases would be involved. Offhand, the first condition is called no-such-package, but the second one (at least in sbcl) is called simple-error, so I don't even know how to determine whether this particular simple-error is the no-such-symbol-in-that-package error, let alone how to extract the relevant names from the condition, fix it, and restart. I'd really like to hear from a common lisp expert that this is the right thing to do here before I go down the road of trying to do it this way, because it will involve a lot of learning.
It also occurs to me that I could fix this by just sed-ing the file before reading it. E.g. turning woo.ev.tcp:start-listening-socket into, say, woo.ev.tcp===start-listening-socket. I don't particularly like this solution, and it's not clear that I wouldn't run into tons more ugly special cases, but it might work if there's no better answer.
I am almost sure there is no easy portable way to do this for a number of reasons.
(Just limiting things to the non-existent-package problem for now.)
First of all there is no portable access into the bit of the reader which decides that tokens are going to be symbols and then looks for package markers &c: that just happens according to the rules in 2.3. So you can't easily intervene in this.
Secondly there's not portably enough information in any kind of condition the reader might signal to be able to handle them.
There are several possible ways out of this bit of the problem.
If you felt sufficiently heroic you might be able to teach the reader that all of the token-starting characters are in fact things you control and then write a token-reader that somehow deals with the whole package thing by returning some object which isn't a symbol. But to do that you need to deal with numbers, and if you think that's simple, well, it's not.
If you felt less heroic you could write a more primitive token-reader which just doesn't even try to deal with anything except grabbing all the characters needed and returns some kind of object which wraps a string. This would avoid the whole number problem at the cost of losing a lot of information.
If you don't care about portability, find an implementation, understand how its reader does it, and muck around with it. There are more open source or source-available implementations than I can easily count (perhaps I am not very good at counting) so this is a pretty good approach. It's certainly what I'd do.
But this is only the start of the problems. The CL reader is hairy and, in its standard configuration (the configuration which is used for things like compile-file unless people have arranged otherwise) can run completely arbitrary code at read time, including code which modifies the reader itself, some of which may do so in an implementation-dependent way. And people use this: there's a reason Lisp is called the 'programmable programming language' and it's that people program it.
I've decided to solve this using sed (actually Python's re.sub, but who's counting?) because it'll work for my actual use case, and was easy.
For future readers: The various people saying this is impossible in general are probably right. The other questions posted by @Svante look like good easy ways to solve part of the problem. Other parts of the problem might be solved more elegantly by replacing the reader macros for #., #+, #-, etc. with ones which just make a list, which sounds less heroic than the suggestions from @tfb, but I don't have time for that shit.
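A minimal sketch of that kind of substitution, written in Ruby here rather than the Python re.sub mentioned above. The pattern, the method name and the === placeholder are illustrative assumptions, not the exact code used; it makes no attempt to skip strings, comments or |quoted| symbols, and it collapses : and :: into the same placeholder.

```ruby
# A package-qualified symbol: a package name, one or two colons, then a
# symbol name. Keywords such as :port have nothing before the colon and
# are left alone.
PACKAGE_QUALIFIED = /\b([A-Za-z][A-Za-z0-9.\-]*)::?([A-Za-z*+%][^\s()'"`,;]*)/

# Replace the package marker so the token reads as one plain symbol.
def mask_package_markers(source)
  source.gsub(PACKAGE_QUALIFIED, '\1===\2')
end

puts mask_package_markers("(woo.ev.tcp:start-listening-socket :port 80)")
# prints (woo.ev.tcp===start-listening-socket :port 80)
```

After masking, the reader sees an ordinary symbol in the current package, so no package lookup happens; the original package and symbol names can be recovered later by splitting on the placeholder.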

When to stop DRYing up the code?

So DRYing up code is supposed to be a good thing, right? There was a situation in one of the projects I was working on where certain models/entities were more or less the same except for the context in which they were used. That is, every such entity had a title, descriptions, tags, a user_id etc. and some other attributes. Hence the CRUD actions in their respective controllers looked pretty similar.
My manager argued that it was repetition of code and needed to be DRYed up. Hence he came up with a CRUD Ruby module that, when included, took care of the CRUD actions for the controllers of all these entities. But eventually simplicity was compromised. The code lost readability as every "thing" was named "object". Debugging became difficult and the whole point of DRYing up the code was lost.
This was just one case. There are several of them where DRYing up resulted in complex, hard-to-debug code. So the question is, when do we stop DRYing up the code? You don't always realize that the code has lost simplicity (and often the code's author never realizes it). Also, if we have to choose between simplicity and DRYed code, which should we choose, if a situation ever arises where we can have only one of them?
From what I understand, if DRYing up code is causing loss of simplicity, we are doing something terribly wrong. I think we should be DRYing up code that is repeated and has a single responsibility. If the code's responsibilities are different and/or the abstraction of the entities cannot be named, we are not repeating code. The code pattern might be repeated, but it's different code altogether with a responsibility of its own. If DRYing results in vague code, you are probably trying to DRY up code with different responsibilities that merely share a similar pattern, which is not really a good practice. DRYing should enhance simplicity, not suppress it.
If you are following REST, then yes, the controllers will be very similar and largely boilerplate. I agree with your manager that it's a problem.
It sounds though like he came up with a suboptimal solution. For a better one, check out Jose Valim's inherited_resources plugin that is being incorporated into Rails 3.
Readability and Maintainability are the two most important features of good code. Unfortunately, a compromise has to be made sometimes. It's a balance question and not everyone is going to agree.
Myself, I lean towards your point of view as well. I would rather have some apparent repetition if it means the code is easier to understand.
As for the 'debugging' problem, when I create such a 'base class' I am in the habit of including a supplementary field. This field is a simple string which identifies the most derived class (and is thus passed from constructor to constructor). Then each message prints this field plus the object id, as "realtype[id]", and everything is suddenly much easier to debug.
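A minimal sketch of that habit in Ruby; the class names CrudBase and Article and the method log_message are hypothetical:

```ruby
# Base class that carries a label naming the most derived class.
class CrudBase
  def initialize(real_type, id)
    @real_type = real_type
    @id = id
  end

  # Prefix every message with "realtype[id]".
  def log_message(text)
    "#{@real_type}[#{@id}] #{text}"
  end
end

# Each subclass passes its own name up through the constructor.
class Article < CrudBase
  def initialize(id)
    super("Article", id)
  end
end

puts Article.new(7).log_message("update failed")
# prints Article[7] update failed
```

In Ruby specifically the extra field is not strictly needed, since self.class.name already gives the most derived class, but the explicit field mirrors the approach described above and carries over to languages without that kind of reflection.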
Now on to DRY.
There are two things to DRY:
building a hierarchy
using generic code
The first point should now be well understood. A class hierarchy means an IS-A relationship. If two classes have similar behavior but are otherwise functionally unrelated, then they SHOULD NOT be part of the same hierarchy. It only confuses the poor maintainer and hurts readability.
The second point can be used much more often, especially with scripting languages. For the preceding example, I would argue that instead of having a hierarchy of classes, you could simply define generic methods that take different classes (modeling different business concepts) and treat them uniformly. This way you avoid repetition (DRY) yet you do not sacrifice readability (imho).
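A minimal sketch of the second point in Ruby, with two invented, unrelated record types (BlogPost and Event) and one generic method that depends only on the methods it calls:

```ruby
# Two unrelated record types that happen to share some attributes.
BlogPost = Struct.new(:title, :tags, :user_id)
Event    = Struct.new(:title, :tags, :user_id, :starts_at)

# Generic method: works on anything that responds to title and tags,
# with no common base class required.
def summary_line(record)
  "#{record.title} [#{record.tags.join(', ')}]"
end

puts summary_line(BlogPost.new("Hello", %w[ruby dry], 1))
puts summary_line(Event.new("Meetup", %w[rails], 2, "2024-05-01"))
```

Each type keeps its own name in the code, so nothing has to be called "object", and the shared logic still lives in one place.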
My 2 cts.
If someone ever told me--with a straight face--that my code needed DRYing, I would probably take that as a sign that anything else they were going to do was going to be really far-fetched and for-the-sake-of-it.
That having been said, there is also a difference between simplicity in writing code (laziness) and simplicity in the code itself (elegance). I agree, though, that there is a balance. I had this situation myself one particular time (in PHP, but oh how it reminds me of your dilemma):
$checked = ($somevariable) ? "checked=\"checked\"" : "";
echo "<input type=\"radio\" $checked />";
$checked = ($someothervariable) ? "checked=\"checked\"" : "";
echo "<input type=\"radio\" $checked />";
This isn't even a very good example of what I was dealing with. Essentially, because it's a radio input, both inputs needed some way of knowing which was to be bubbled in. I knew it had what your boss might call "wetness" issues, so I racked my head trying to come up with some solution that would be graceful and to the point. Finally I showed it to a senior developer and he said "No, it's all in order, it does what it needs to. It's only one extra line."
I felt relief at being reminded that I was hurting my project more than helping by worrying over this, but at the same time, I'm still disappointed that he was so casual about a fundamental principle (as though it wasn't one of his, though I'm sure it is).
So while I agree that your manager probably was doing something just for the sake of doing it, it is only when we strive to come up with better methods and approaches that we get better languages like Ruby and Python and cooler libraries like jQuery.
Basically, what if next week you suddenly had 70 things instead of 2? If your boss's objects make that a snap, he was right. If it's the same amount of trouble (in the code or in the execution), he was wrong. But that doesn't mean there isn't a better answer than keeping it simple because it's only a couple of things.
The aim of the DRY principle is to help increase the "quality" of the code.
If the changes aren't improving the quality of the code, it is time to stop.
The ability to judge this comes with experience. As requirements change, the most appropriate way to refactor the code also changes, so it's impossible to have everything ideal - at least you need to freeze the requirements first.
Minimising the size of the code should generally not be a consideration in the quality unless you are codegolfing, so don't DRY when the only purpose is to reduce the size of the code.
Complicated tricks can do more harm than good.
A key reason for applying DRY to improve maintainability is to ensure that when a code change is necessary, that change only needs to be made in one place, thus avoiding the risk that it doesn't get changed everywhere that needs it.
But I'm not telling the whole story:
This interview with Dave Thomas has DT saying:
"DRY says that every piece of system knowledge should have one authoritative, unambiguous representation."
The first time I saw "DRY" was in The Pragmatic Programmer so I'm inclined to go with Dave on this.
There's another article worth reading here
But DRY is a principle, not a rule: the better we understand the principle, the more able we should be to recognise situations where it should be applied.
(And finally, I think I'd want a little more than "more-or-less the same" before I started "DRY"ing that code: if I could see a clear way in which the two things might diverge in the future then I'd be inclined to leave them alone).
For me, duplicated code is a smell that can have multiple origins:
Missing variables (introduce variable).
Missing methods (push expression into a method).
Feature envy (push behaviour down into the envied class).
Over-generalization (break up generic class into specific concrete classes).
Insufficient abstraction (push attributes and behaviour down into new class).
This list is probably incomplete. Consider it a starting point.
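As a small illustration of the first two items (missing variable, missing method), here is a sketch in Ruby. The method names and the 20% rate are invented for the example; amounts are in integer cents to keep the arithmetic exact.

```ruby
# Before: the VAT rule is written out at each call site.
def invoice_total_before(net_cents)
  net_cents + net_cents * 20 / 100
end

def quote_total_before(net_cents)
  net_cents + net_cents * 20 / 100
end

# After: "introduce variable" names the rate, and "push expression into
# a method" gives the rule a single home.
VAT_PERCENT = 20

def with_vat(net_cents)
  net_cents + net_cents * VAT_PERCENT / 100
end

def invoice_total(net_cents)
  with_vat(net_cents)
end

def quote_total(net_cents)
  with_vat(net_cents)
end

puts invoice_total(1000)
```

If the rate changes, only VAT_PERCENT needs editing, and the readability check described below still applies: if with_vat had made the call sites harder to follow, that would be a reason to revert.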
When you find duplication, think about what problem it's symptomatic of. Then take a stab at addressing that problem. When you're done, consider the readability of the new code. If it has deteriorated, you may be in one of these positions:
You misidentified the problem at the root of the duplication (revert, rethink, try again).
The duplication is a necessary trade-off (revert your change and live with it).
Your software is necessarily complex (commit your change and live with it).
Consider posting example code along with questions like this, if possible. They provide something concrete to work around. And remember, a lot of this stuff is very subjective.

How do you convince your manager that your project needs a huge refactoring? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
I have joined a Rails project as a contractor. The project has been going for more than a year. The code was written by about 10 different developers, most of them contractors as well. They have different coding styles. Some of them came from Java. The code has horrible scores with metric_fu. Many functions are very long (100 - 300 lines). Some functions have an insane number of logical branches, loops, and recursions. Each request generates a ton of SQL queries. Performance is very bad. There is a lot of obsolete code that is never used but never got cleaned up. The core architecture is plain wrong or over-engineered. Code coverage is only about 25%. Views and partials are chaotic and terrible to read and understand.
The manager is in a position of trying to satisfy the CEO by continuously adding new features; however, newer features are increasingly hard to implement correctly without breaking something else. He knows the code is bad, but doesn't want to put too much effort into fixing it, as refactoring will take too long.
As a contractor / developer, what is a good way to clear up this situation and convince the manager or CEO to set aside some time for refactoring?
Related Questions
How can I convince skeptical management and colleagues to allow refactoring of awful code?
How to refactor on a budget
Dealing with illogical managers
In my limited experience:
It's impossible to convince a manager that it's necessary to set aside time to refactor. You can make him aware of it, and reinforce the point every time that you run into an issue because of bad code. Then just move on. Hopefully your boss will figure it out.
It's quite common to get in on a running project and think "this is total junk". Give it some time. You might begin to see a pattern in the madness.
I've been in similar situation. There are basically only two options:
You get some relaxed time and you may be granted time to refactor something
Due to the bad code further development of some component comes to a stall. You can't proceed to add anything because every little change causes everything else to stop functioning. In this emergency case you will get a "go" with refactoring.
I have just answered in some other question, my horror story:
https://stackoverflow.com/questions/1333077/dirty-coding-tricks-to-deliver-project-on-time/1333095#1333095
I have worked on a project where dirty tricks were the main driving principle of the development. Needless to say, after some time these tricks started to conflict with each other. In one analytics component, we had to implement another very dirty trick - to hide away those calculated values which, due to the conflicting tricks, were not calculated properly. Afterward, the second-level tricks started to conflict and we had to create tricks to deal with those. Ever since, even the mention of this component fills me with horror that I may have to work on it again.
It's exactly the second situation where refactoring is the only way out.
In general, many managers without a technical background (actually, those who come from bad programmers as well) neither care nor understand the value of quality code and good architecture. You can't make them listen until something interrupts their plans, like a blow of "non-implementable" features, increasing and reoccurring bugs, customer requests that cannot be satisfied and so on. Only then understanding of the code problems may come for the first time. Usually, it's too late by then.
Refactoring code that sucks is part of coding, so you don't need to get anybody's approval unless your manager is watching your code and/or hours VERY closely. The time I save refactoring today is time that I don't have to bill doing mad tricks to get normal code to work tomorrow (so it works out, in the end).
Busting up methods into smaller methods and deleting methods that are not used is part of your job. Reducing DB calls, in code that you call, is also necessary so that your code doesn't suck. Again, not really refactoring, just normal coding.
Convincing your manager depends on other factors, including (but not limited to) their willingness to be convinced, and your ability to convince.
Anyway, what is massive refactoring in RoR? Even if the "core architecture is just plain wrong," it can usually be straightened out a bit at a time. Make sure you break it into chunks and use branches so you don't break anything while you're busy fixing.
If this is impossible, then you come back to the social question of how to convince your manager. That's a simple question of figuring out what his/her buttons are, and pushing them without getting fired or arrested. Shaming, withholding food, giving prizes, being a friend, anonymous kidnapping threats where you step in and save the day... It's pretty simple, really: creativity is the key!
Everyone is missing a point here:
Refactoring is part of the software development life cycle.
This applies not only to RoR or any specific project but to any software development project.
If you can somehow convince your PM why it is important to refactor the existing code base before adding any new feature, you're done. You should clearly tell your PM that adding any further feature without refactoring will take more time than it should. And even if the feature is added somehow, bug-fixing sessions will take even more time since the code is very bloated and unmaintainable.
I really don't understand why people forget the principle of optimise later. Optimising later also includes refactor later IMHO.
One more thing, when taking design decisions, you should tell the consequences, good or bad, to your PM very very clearly.
You can create a separate branch (I assume you are using git) for refactoring and add new features in some other branch if your PM insists on adding new features alongside the refactoring.
A tricky one. I have recently worked in such a company... they were always pushing for new things. Again, they knew it was bad, but no matter how hard I pushed it - I even got external consultants in to verify my findings - they saw it as a waste of time.
Eventually they saw the light... it only took multiple server crashes and at one point almost a full 8 days of no website to convince them.
Even at that they insisted it 'must' be the hosting service.
The key is to try and quantify how long their site will last before it crunches, and get some external verification to back you up - 'they' always trust outsiders who know nothing about your app! Also, try - if you can - to give a plan that involves gradual replacement at worst, and an estimate of how long it would take to do it that way. Also give a plan for how long it might take if 1 or 2 bodies were working on a complete rewrite - but be realistic too or it will bite you in the bum! If you go that route (which is what we did) you can still have some work done on the existing site as long as you incorporate it into the new one.
I would suggest that you put focus on things that they can see for themselves, that is, they will surely notice that the application is slow in some functionalities, so pick up one of them and say something like "I can reduce the waiting time here, can I take some time to improve this specific thing?" (more well said, but you got the point :P).
Also consider that 10 developers before you did not refactor the code base. This may mean that it is a monstrous task, likely to make the situation worse. In that case, if something goes wrong after the refactoring, it will be your fault that the program does not work properly anymore.
Just a thought, but worth considering.
I'd take one small chunk of the application and refactor and optimize it until it shines (and I'd do it in my personal time in order not to annoy my manager). Then you'll be able to show your manager/CEO the good results of refactoring and SQL optimizations.
If there is a need to refactor then the code will speak for itself. Minor refactoring can continue during development. If you can't convince the manager then you should probably rethink whether it's necessary at all.
However if it is absolutely necessary then constructing metrics of development activities and the benefits should convince the manager.
I think one option would be to highlight to the manager how refactoring the code base now will save time (i.e. money) in the long run. If the project is expected to run long term then making the changes now will clearly save you and other developers time in the future.
Best to use an example of a feature you've worked on estimating how long it would have taken if you had the cleaner code to work from in the first place. Good luck!
I am in the same position right now, but with an agreement with the manager that when a new feature has to be implemented in an existing module, we refactor that module too (if it needs it). We are struggling now with code created 4-5 years ago, and I have definitely found that refactoring someone else's code is neither trivial nor amusing, but it is very, very helpful for future reuse.

How Long Do You Keep Your Code?

I took a data structures class in C++ last year, and consequently implemented all the major data structures in templated code. I saved it all on a flash drive because I have a feeling that at some point in my life, I'll use it again. I imagine something I end up programming will need a B-Tree, or is that just delusional? How long do you typically save the code you write for possible reuse?
Forever (or as close as I can get). That's the whole point of a source control system.
-1 to saving everything that's ever produced. I liken that to a proud parent saving every single used nappy ever to grace the cheeks of their little nipper. It's shitty and the world doesn't benefit from its existence.
How many people here go past the first page in google on a regular basis? Having so much crap around only seems to make it difficult to find anything useful.
+1 to keeping code forever. In this day and age, there's just no reason to delete data which could possibly be of value in the future. Even if you don't use the B-Tree as a useful structure, you may want to look at the code to see how you did something. Or even better, you may wish to come back to the code someday for instructional purposes. You'll never know when you might want to look at that one particular snippet of code that accomplished a task in a certain way.
If I use it, it gets stuck in a Bazaar repository and uploaded to Launchpad. If it's a little side project that peters out, I usually move it to a junk/ subdirectory.
I'll use it again. I imagine something I end up programming will need a B-Tree, or is that just delusional?
Something you write will need a B-tree, but you'll be able to use a library for it because the real world values working solutions over extra code.
I keep backups of all of my code for as long as possible. The important things are backed up on my web server and external hdd. You can always delete things later, but if you think you might find a use for it, why not keep it?
I still have (some) code I wrote as far back as college, and that would be 18 years ago :-). As is often the case, it is better to have it and never want it, than to want it and not have it.
Source control, keep it offsite and keep it for life! You'll never have to worry about it.
I have code from many, many years ago. In fact, I think I still have my first php script. If nothing else, it's a good way to see how much you have changed over time.
I agree with the other posters. I've kept my code from school in a personal source code repository. What harm does hanging on to it really do?
I would just put it on a disk for history's sake. Use the Standard Template Library - one mistake people make is assuming that their implementations of moderate to complex data structures are the best. I can't tell you how many times I have found a bug in a home-grown B-tree implementation.
Keep everything! You never know when it will save you some work. About a year ago I needed some C code to parse an expression, tokenize it for storage, and evaluate the results later. Ugly little piece of code.. But it seemed familiar, as it should have- I had to do a post-fix evaluator in college (30 years ago)- and still had the code. Admittedly it needed a little clean-up, but it saved me a couple of days of work.
I implemented a red-black tree in Java while in college. I have always wanted to find that code again and cannot.
Now I do not have the time to recreate it from scratch since I have three kids and do not develop in Java.
I now keep everything so that I can relearn much faster. I also find it fascinating to see how I did something 1, 5, 10 years ago. It makes me feel good because I either did it right or I am better now and would do it differently
If I ever go back to college to give a lecture to future students, it is on the list of things to say:
Save everything...
I'm a code packrat, for better or worse, but I guard it, because sometimes it's client-confidential.
On occasion, this has been really useful, like if a client lost their stuff, or their documentation.
I lost a lot of old code (from 10 years ago) because of computer failure that wasn't backed up but in fact I do not really care because I do not really want to see code that is programmed in very old language. Most of this code was written in VB5...
I agree that now it's easy to keep everything but I think sometime it's good to clean up our backup/computer storage because it's like in the real world, we do not need to keep everything forever.
Forever is the beauty of the electronic medium. That's one of the most attractive aspects for me.
But, the keeping of it depends on your coding style, and what you do with it.
I'd suggest tossing your code if you're the type that...
Never looks back.
Would rather re-write from your memory to improve your craft.
Isn't very organized.
Is bothered by latent storage to no end.
Likes to live on the edge.
Worships efficiency of memory.
Logical reasons for tossing code would be...
It bothers you.
It disrupts your workflow by getting in your way.
You're ashamed of it.
It confuses you and distracts you.
Like anything that takes up physical space in life, its value is weighed against its usefulness.
All my code is kept indefinitely, with plans to return to it at some point, reflect, and refactor. I do that because it's fun to see my progress, and provides very accessible learning experiences. Furthermore, the incorporation of all my code into a consolidated framework is something I work towards all the time.
Forever...
Good code never dies. ;)
I don't own most of the code I develop: my employer does. So I don't keep that code (my employer does - or should).
Since I discovered computing, I have written code for devices that no longer exist, in languages that are no longer worth using. Maybe there is some emulator, but keeping that code and running it would be nostalgia.
You can find B-tree information (and many other subjects) on Wikipedia (and many other places). There is no need to keep that code.
In the end I keep only code that I own and maintain.

How do you read an existing Rails project?

When you start working on an existing Rails project what are the steps that you take to understand the code? Where do you start? What do you use to get a high level view before drilling down into the controllers, models, helpers and views? Do you have any specific techniques, tricks or tools that help speed up the process?
Please don't respond with "Learn Rails & Ruby" (like one of the responses the last guy who asked this got - he also didn't get much response to his question so I thought I would ask again and prompt a bit more). I'm pretty comfortable with my own code. It's sorting other people's that does my head in and takes me a long time to grok.
Look at the models. If the app was written well, this should give you a picture of its domain model, which is where the interesting logic should live. I also look at the tests for the models.
The way that the controllers/views were implemented should be apparent just by using the Rails app and observing the URLs.
Unfortunately, there are many occasions where too much logic lives in controllers and even views. That means you'll have to take a look into those directories too. Doubly unfortunate, tests for these layers tend to be much less clear.
First I use the app, noting the interesting controller and action names.
Then I start reading the code for these controllers, and for the relevant models when necessary. Views are usually less important.
Unlike a lot of the people so far, I actually don't think tests are the place to start. I think they're too narrow, too focused. It'd be like trying to understand basic physics/mechanics by first zooming into intra-molecular forces and quantum mechanics. I also think you're relying too much on well-written tests, and in my experience, a lot of people don't write sufficient tests or write poor tests (which don't give an accurate sense of what the code should actually do).
1) I think the first thing to do is to understand what the hell the app actually does. Use it, at least long enough to build an idea of what its main purpose is and what the different types of data might be and which actions you can perform, and most importantly, why.
2) You need to step back and see the big picture. I think the best way to do that is by starting with schema.rb. This tells you a few really important things:
What are the vocabulary and concepts of this project? What does "User" actually mean in this app? Why does the app have both "User" and "Account" models, and how are they different or related?
You could learn what models there are by looking in app/models but this will actually tell you what data each model holds.
Thanks to *_id fields, you'll learn the associations between the models, which helps you understand how it all fits together.
I'd follow this up by looking at each model's *.rb file for (hopefully) comments, validations, associations, and any additional logic relevant to each. Keep an eye out for regular ol' Ruby classes that might live in lib/.
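To make the schema.rb step concrete, here is a rough sketch that scans a schema dump for *_id columns, which usually hint at belongs_to associations. The sample schema, its table names, and the foreign_key_columns helper are all made up for illustration. A real schema.rb may use other column types or options, and an _id column is only a hint, not proof that the model declares the association.

```ruby
# Hypothetical excerpt of a db/schema.rb file; table and column names are invented.
SAMPLE_SCHEMA = <<~SCHEMA
  create_table "accounts", force: true do |t|
    t.string "name"
  end

  create_table "users", force: true do |t|
    t.string  "email"
    t.integer "account_id"
  end

  create_table "posts", force: true do |t|
    t.string  "title"
    t.integer "user_id"
    t.integer "account_id"
  end
SCHEMA

# Returns a hash mapping each table name to the columns ending in _id
# that appear inside its create_table block.
def foreign_key_columns(schema_text)
  result = {}
  current_table = nil
  schema_text.each_line do |line|
    if (m = line.match(/create_table\s+"([^"]+)"/))
      current_table = m[1]
      result[current_table] = []
    elsif current_table && (m = line.match(/t\.\w+\s+"(\w+_id)"/))
      result[current_table] << m[1]
    end
  end
  result
end

# Print only the tables that point at something else.
foreign_key_columns(SAMPLE_SCHEMA).each do |table, columns|
  next if columns.empty?
  puts "#{table}: #{columns.join(', ')}"
end
```

On a real project you would pass File.read("db/schema.rb") instead of the sample string, then check each hit against the associations declared in the matching model file.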
3) I, personally, would then take a brief glance at routes.rb as it will tell you two key things: a brief survey of all of the actions in the app, and, if the routes and controllers/actions are well named and thought out, a quick sense of where different functionality might live.
At this point you're probably ready to dig into a specific thing you need to learn. Find the controller for the feature you're most interested in and crack it open. Start reading through the relevant actions, see which models are involved, and now maybe start cracking open tests if you want to.
Don't forget to use the rest of your tools: Ruby/Rails debuggers, browser dev tools, logs, etc.
I would say take a look at the tests (or specs if the project uses RSpec) to get a high-level idea of what the application is supposed to do. Once you understand from the top level how the models/views/controllers are expected to behave, you can drill into the implementations.
If the Rails project is in a somewhat stable state, then I have always been a big fan of using the debugger to help navigate the code base. I'll fire up the browser and begin interacting with the app, then target some piece of functionality and set a breakpoint at the beginning of the associated function. With that in place I just study the parameters going into the function and the value being returned to get a better understanding of what's going on. Once you get comfortable you can modify the functionality a little bit to ensure you understand what's going on. Just performing some static analysis on the code can be cumbersome! Good luck!
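If you'd rather not stop at a breakpoint every time, a non-interactive variation on the same idea is to log what goes into a method and what comes out. This is only a sketch using Ruby's built-in TracePoint, not the debugger workflow described above. DiscountCalculator and the trace_calls helper are hypothetical names, and the helper assumes the traced method has only named parameters and is not recursive.

```ruby
# Hypothetical stand-in for a piece of application logic you want to observe.
class DiscountCalculator
  def discounted_price(price, percent)
    price - (price * percent / 100.0)
  end
end

# Records the arguments and return value of every call to `method_name`
# on instances of `klass` made while the given block runs.
def trace_calls(klass, method_name)
  calls = []
  tracer = TracePoint.new(:call, :return) do |tp|
    next unless tp.method_id == method_name && tp.self.is_a?(klass)

    if tp.event == :call
      # Look up the parameter names, then read their values from the
      # method's binding at the moment of the call.
      names = tp.self.method(method_name).parameters.map { |_type, name| name }
      args = names.map { |name| [name, tp.binding.local_variable_get(name)] }.to_h
      calls << { args: args }
    else
      # :return event: attach the result to the most recent call.
      calls.last[:returned] = tp.return_value
    end
  end
  tracer.enable { yield }
  calls
end

calls = trace_calls(DiscountCalculator, :discounted_price) do
  DiscountCalculator.new.discounted_price(200, 25)
end
p calls
```

In a Rails app you could wrap a single model or service call this way from the console. For anything more involved, a real breakpoint is still the better tool.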
I can think of two reasons to be looking at an existing app with which I have no previous involvement: I need to make a change or I want to understand one or more aspects because I'm considering using them as input to changes I'm considering making to another app. I include reading-for-education/enlightenment in that second case.
A real benefit of the MVC pattern in particular, and many web apps in general is that they are fairly easily split into request/response pairs, which can to some extent be comprehended in isolation. So you can start with a single interaction and grow your understanding out from that.
When needing to modify or extend existing code, I should have a good idea of what the first change will be - if not then I probably shouldn't be fooling with the code yet! In a Rails app, the change is most likely to involve view, model or a combination of both and I should be able to identify the relevant items fairly quickly. If there are tests, I check that they run, then attempt to write a test that exposes the missing functionality and away we go. If there are no tests then it's a bit trickier - I'm going to worry that I might inadvertently break something: I'd consider adding tests to give myself more confidence, which will in turn start to build some understanding of the area under study. I should fairly quickly be able to get into a red-green-refactor loop, picking up speed as I learn my way around.
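For the no-tests case, one way to get that confidence is to pin down what the code does right now before touching it. Here is a minimal sketch with Minitest, assuming it is available in your Ruby install. LegacyFormatter and its behavior are invented for illustration and stand in for whatever untested method you are about to change.

```ruby
require "minitest/autorun"

# Hypothetical legacy method with no tests; name and behavior are made up.
module LegacyFormatter
  def self.display_name(first, last)
    name = [last, first].compact.join(", ")
    name.empty? ? "(unknown)" : name
  end
end

# These tests assert what the code does today, not what it ought to do,
# so a refactoring that accidentally changes behavior fails right away.
class LegacyFormatterTest < Minitest::Test
  def test_both_names
    assert_equal "Doe, Jane", LegacyFormatter.display_name("Jane", "Doe")
  end

  def test_missing_first_name
    assert_equal "Doe", LegacyFormatter.display_name(nil, "Doe")
  end

  def test_no_names
    assert_equal "(unknown)", LegacyFormatter.display_name(nil, nil)
  end
end
```

Once these pass against the untouched code, you have a green baseline and can start the red-green-refactor loop from there.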
Run the tests. :-)
If you're lucky it'll have been built on RSpec, and that'll describe the behavior regardless of the implementation.
I run rake test in a terminal.
If the environment does not load, I take a look at the stack trace to figure out what's going on, and then I fix it so that the environment loads and run the tests again.
I boot the server, open the app in a browser, and click around.
Start working with the tasks at hand.
If the code rocks, I'm happy. If the code sucks, I hurt it for fun and profit.
Aside from the already posted tips of running specs, and decomposing the MVC, I also like:
rake routes
as another way to get a high-level view of all the routes into the app (on Rails 5 and later the same listing is available as rails routes)
./script/console
The Rails irb console (rails console in Rails 3 and later) is still my favorite way to inspect models and model methods. Grab a few records and work with them in irb. I know it helps me during development and test.
Look at the documentation; some projects have pretty good documentation.
It's a little bit hard to understand other people's code, but try it... Read the code ;-)

Resources