Ruby on Rails with Repository Pattern? - ruby-on-rails

Working with ASP.NET MVC has me thinking about Rails. I worked with Rails before, but am a little rusty. ASP.NET MVC tutorials recommend hiding the data layer implementation behind the repository pattern. This allows easier dependency injection for unit testing, and a nice decoupling of the controller from the model implementation.
I remember Rails controllers using Active Record objects directly, and unit tests using test databases that could be set up and torn down with ease. That removes the need to swap out the data layer for unit testing, but it still seems like a bad idea to have so much ActiveRecord code exposed in the controller.
So my question is, what is the latest best practice here? Are real (not mocked) databases still used for unit testing? Do Rails developers call ActiveRecord directly, or an abstraction?

My experience has been that Ruby on Rails integrates ActiveRecord so tightly (in most cases, it can become nearly completely transparent) that developers often use it without any Abstraction.
The thing to remember is that the Repository pattern and the Active Record pattern were both described in Patterns of Enterprise Application Architecture by Martin Fowler (which, if you haven't read it yet...you should). Active Record is tightly integrated in Rails. Microsoft .NET doesn't tie you to a pattern, so the Repository pattern was adopted by the majority of developers there.
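To make the contrast concrete, here is a minimal sketch of what a repository injected into a controller could look like in plain Ruby. None of this is Rails API: InMemoryUserRepository, UsersController and their methods are hypothetical names made up for illustration, and a real repository would wrap whatever persistence you actually use.

```ruby
# Hypothetical names throughout; nothing here is Rails API.
User = Struct.new(:id, :name)

# Repository: hides where users are stored behind a small interface.
class InMemoryUserRepository
  def initialize(users = [])
    @users = users.to_h { |u| [u.id, u] }
  end

  def find(id)
    @users.fetch(id) { raise KeyError, "no user with id #{id}" }
  end

  def add(user)
    @users[user.id] = user
    user
  end
end

# A controller-like object that receives its repository from the caller,
# so a test can hand it an in-memory one instead of a database-backed one.
class UsersController
  def initialize(repository)
    @repository = repository
  end

  def show(id)
    user = @repository.find(id)
    "User ##{user.id}: #{user.name}"
  end
end

repo = InMemoryUserRepository.new([User.new(1, "Ada")])
puts UsersController.new(repo).show(1) # prints "User #1: Ada"
```

The controller only depends on the find method, which is the decoupling the question is asking about. Whether that extra layer pays for itself in a Rails app is exactly what the answers here disagree on.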

Does ActiveRecord even really constitute the "data layer", I wonder? After all, its purpose is to abstract (to a fairly reasonable extent) the actual interaction with storage. If I have a model that inherits from ActiveRecord::Base and I reference that model in a controller, am I really interacting with the data layer?
Looking at a brief description of the Repository Pattern, I'd say that the find_by_ family of methods already gives you much of what it describes, which is good, isn't it? OK, the abstraction layer is leaky (one might more generously say "pragmatic") in that we can go a lot closer to the metal if need be, and find_by_sql, for example, makes it pretty obvious that we're dealing with a relational database of some kind.
I'd recommend never (or maybe I should say "rarely and not without extreme justification" - it's always tricky using absolutes) putting code in controllers that makes it possible to infer the data platform being used. It should all be pushed into the models - named_scope can be very useful here. For complex results, consider using "presentation" objects as the interface (Struct and my personal favourite OpenStruct can be very useful here).
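As a rough sketch of that advice in plain Ruby: the query lives on the model class, and the caller gets back a small presentation object instead of raw records. The Order class below is a hypothetical stand-in that keeps its records in memory; in a real app the unpaid method would be a scope or query on an ActiveRecord model, and UnpaidSummary is an invented name.

```ruby
# Presentation object: only the fields a view would need.
UnpaidSummary = Struct.new(:count, :total_due)

# Hypothetical stand-in for a model; records live in memory here,
# where a real model would query the database.
class Order
  attr_reader :customer, :total, :paid

  @records = []

  class << self
    attr_reader :records

    def create(customer:, total:, paid:)
      order = new(customer, total, paid)
      records << order
      order
    end

    # The query lives in the model, so callers never see how rows are fetched.
    def unpaid
      records.reject(&:paid)
    end

    def unpaid_summary
      orders = unpaid
      UnpaidSummary.new(orders.size, orders.sum(&:total))
    end
  end

  def initialize(customer, total, paid)
    @customer = customer
    @total = total
    @paid = paid
  end
end

Order.create(customer: "Ada", total: 40, paid: false)
Order.create(customer: "Grace", total: 25, paid: true)
Order.create(customer: "Linus", total: 10, paid: false)

summary = Order.unpaid_summary
puts "#{summary.count} unpaid orders, #{summary.total_due} due"
```

A controller that only ever calls Order.unpaid_summary cannot tell whether the data came from SQL, a document store, or an array.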
While ActiveRecord is the de facto standard, given that it installs with Rails, it's not the only game in town. For non-SQL databases, something different is necessary, but even in the SQL domain there's Datamapper (is that based on the eponymous PoEAA pattern?)
In Rails 3.0 it's going to be a lot easier to pick and choose components such as the ORM as Yehuda and the boys unpick and clean up the interfaces.

You can do it either way. Most often, Rails functional tests are written to go all the way to the database, where data is populated from fixtures, as you describe.
But it's not uncommon to mock out the service layer calls, for example:
User.expects(:find_by_id).with("1").returns(u);
get :show, :id=>"1"
... or something like that. In fact, I do this all the time to have control of the model object (or mock that out as well).
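The snippet above relies on a mocking library. For anyone wondering what such a stub does under the hood, here is a hand-rolled sketch in plain Ruby with no gems. The with_stub helper, the User class and the UsersController class are all hypothetical stand-ins, not the library's implementation.

```ruby
# Hypothetical stand-in for an ActiveRecord model.
class User
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def self.find_by_id(_id)
    raise NotImplementedError, "a real model would hit the database here"
  end
end

class UsersController
  def show(params)
    user = User.find_by_id(params[:id])
    "Hello, #{user.name}"
  end
end

# Replaces a singleton method for the duration of the block, then restores it.
def with_stub(object, method_name, return_value)
  original = object.method(method_name)
  object.define_singleton_method(method_name) { |*_args| return_value }
  begin
    yield
  ensure
    object.define_singleton_method(method_name, original)
  end
end

u = User.new("Ada")
with_stub(User, :find_by_id, u) do
  puts UsersController.new.show({ id: "1" }) # prints "Hello, Ada"
end
```

Unlike a real mocking library, this sketch does not verify that the method was called or check its arguments; it only swaps the return value.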

Following Rails conventions always leads down the Path of Least Painful Memories, so it is advised.
Depending on your definition of "real"... For unit testing, I've seen that people tend to use the same data schema as their main site and seed the database before or during the tests (using Factory Girl, Machinist or plain ol' fixtures), and then the tests are run against that data.
Rails developers call ActiveRecord directly on this data, as in the real world.

Controllers are supposed to access models in MVC. Rails is all about avoiding some of the needless abstractions that characterise the enterprise world.

Related

Structuring an MVC Application with Entity Framework and building using TDD

Background
I am about to start creating a new application with MVC 5 and EF6 and building it out with TDD. This is my first MVC application, so I have decided to use it as a bit of a learning platform to better understand a whole range of patterns and methodologies that I have been exposed to but have only used in passing up until this point.
I started with this in my head:
EF - Model
Repositories
Services
UI (controllers views)
Removing the Repositories
I shifted this thinking to remove one layer, the repositories, simply because as my understanding has grown I can see that EF (specifically IDbSet) implements a repository pattern of sorts and the context itself is a unit of work, so wrapping it in a further abstraction seems pointless, for this application at least, at that level anyway.
EF will be abstracted at the Service Layer
Removing the repos doesn't mean EF will be directly exposed to the controllers. In most cases I will use the services to expose certain methods and business logic to the controllers. I won't exclude EF entirely, though, as I can use it outside of services to do things like building specific queries that could be used at both the service level and the controller level. The service layer will simply be a simpler way of mapping specifics from the controller to EF and data concerns.
This is where it gets a bit ropey for me
Service Layer
My services feel a little bit like repositories in the way they will map certain functions (getById etc.). I am not sure whether that is just naturally the way they are, or whether my understanding of them is way off and there is more information that I can't find to better my knowledge.
TDD & EF
I have read a ton of stuff about EF and how you can go about unit testing with it, and about how you shouldn't bother: IQueryable is a leaky abstraction, and the differences between Linq-to-Entities and Linq-to-Objects mean that you won't always get the results you intend. All this has simply confused the hell out of me, to the point where I have an empty test file and my head is completely blank because I am now overthinking the process.
Update on TDD: the TDD tag was included because I thought maybe someone would have an idea of how they approach something like this without a repository, because that is an abstraction for abstraction's sake. Would they skip unit tests against it and use other tests, like integration or end-to-end tests, to cover the queryable behavior? From my limited understanding, though, that wouldn't be TDD, as the tests would not be driving my design in this instance.
Finally, To The Point
Is the:
EF
Service
UI
architecture a good way to go, initially at least?
Are there any good examples of a well defined service layer out there so I can learn, and are they in the main a way to map certain business operations that have data connotations to some form of persistence mechanism (in this case an ORM and EF) without having the persistence requirements of, say, a repository?
With the TDD stuff, is it OK to forgo unit tests for service methods that are basically just calling EF and returning data, and just opt for slower integration tests (probably in a separate project so they are not part of the main test flow and can be run on a more ad-hoc basis)?
Having one of those weeks and my head feels like it is about to explode.
Lol I've had one of those weeks myself for sure. ;)
I've had the same kind of internal discussions over how to structure MVC projects, and my conclusion is find what's most comfortable to you.
What I usually do is create the following projects:
Core/Domain - here I have my entities/domain model, and any other thing that may be shared among layers: interfaces, for example, configuration, settings, and so on.
Data/EF - here I have all my EF-dependent code: DataContext and Mappings (EntityTypeConfiguration). Ideally I could create another version of this using, say NHibernate and MySQL, and the rest of the solution will stay the same.
Service - this depends on Core and Data. I agree in the beginning it will look like a simple facade to your Data, but as soon as you start adding features, you'll find this is the place to add your "servicemodels". I'm not saying ViewModel as this is quite Web-ui related. What I mean with "ServiceModel" is creating a simpler version of your domain objects. Real example: hide your CreatedOn, CreatedBy properties, for example. Also, whenever one of your controller's actions grow to anything over quite simplistic, you should refactor and move that logic to the service and return to the controller what you really need.
Web/UI - this will be your webApp. It will depend on Core and Service.
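The "ServiceModel" idea from the layout above can be sketched in a few lines. This is written in Ruby, the language used elsewhere on this page, rather than C#, and every name in it (Product, ProductInfo, ProductStore, ProductService, affordable) is hypothetical. The store stands in for the Data/EF project.

```ruby
# Core/Domain: the full entity, including audit fields.
Product = Struct.new(:id, :name, :price, :created_on, :created_by,
                     keyword_init: true)

# Service model: a trimmed version of the entity for callers above the service.
ProductInfo = Struct.new(:id, :name, :price, keyword_init: true)

# Data: hypothetical store standing in for the EF-dependent project.
class ProductStore
  def initialize(products)
    @products = products
  end

  def all
    @products.dup
  end
end

# Service: depends on Core and Data, and returns only what the UI needs.
class ProductService
  def initialize(store)
    @store = store
  end

  def affordable(max_price)
    @store.all
          .select { |p| p.price <= max_price }
          .map { |p| ProductInfo.new(id: p.id, name: p.name, price: p.price) }
  end
end

store = ProductStore.new([
  Product.new(id: 1, name: "Pen", price: 2, created_on: Time.now, created_by: "admin"),
  Product.new(id: 2, name: "Desk", price: 150, created_on: Time.now, created_by: "admin")
])

ProductService.new(store).affordable(10).each { |info| puts info.name }
```

Because the store is passed in, a test can hand the service an in-memory store like the one above, which is one way to unit test service logic without touching the database.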
You didn't mention dependency injection but you have to definitely look at it.
For testing, you can test your Data using a SqlCompact provider that re-creates the database for each test instead of using a full SqlExpress. This means your DataContext should accept a connectionString parameter. ;)
I've learned a lot seeing big projects source code, like http://www.nopcommerce.com. You could also have a look at http://sharparchitecture.net/ although I bet you already saw that.
Be prepared to have some nightmares with complex object graphs in EntityFramework. ;)
My final advice is: find something specific to do and dive in. Too much abstraction will keep you from starting, and starting is key to practice and understanding.

How would the 'Model' in a Rails-type webapp be implemented in a functional programming language?

In MVC web development frameworks such as Ruby on Rails, Django, and CakePHP, HTTP requests are routed to controllers, which fetch objects which are usually persisted to a backend database store. These objects represent things like users, blog posts, etc., and often contain logic within their methods for permissions, fetching and/or mutating other objects, validation, etc.
These frameworks are all very much object oriented. I've been reading up recently on functional programming and it seems to tout tremendous benefits such as testability, conciseness, modularity, etc. However, most of the examples I've seen for functional programming implement trivial functionality like quicksort or the Fibonacci sequence, not complex webapps. I've looked at a few 'functional' web frameworks, and they all seem to implement the view and controller just fine, but largely skip over the whole 'model' and 'persistence' part. (I'm talking more about frameworks like Compojure which are supposed to be purely functional, versus something like Lift which conveniently seems to use the OO part of Scala for the model -- but correct me if I'm wrong here.)
I haven't seen a good explanation of how functional programming can be used to provide the metaphor that OO programming provides, i.e. tables map to objects, and objects can have methods which provide powerful, encapsulated logic such as permissioning and validation. Also the whole concept of using SQL queries to persist data seems to violate the whole 'side effects' concept. Could someone provide an explanation of how the 'model' layer would be implemented in a functionally programmed web framework?
Without wanting to bash object oriented MVC frameworks -- I don't know Rails, but Django is an excellent piece of software to my eye -- I'm not sure that Object-Relational Mapping is a particularly good metaphor [1].
Of course in an OO language it may seem natural to want to think of tables in terms of objects, but in a functional language it is perfectly natural to think of tables in terms of tables. A single row can be represented easily using an algebraic data type (in Haskell and other statically typed functional languages) or a map (a.k.a. a dictionary; an associative structure mapping keys to values); a table then becomes a sequence of rows, which after all it is even at the DB level. Thus there is no special mapping from the DB construct of a table to some other construct available in the programming language; you can simply use tables on both sides [2].
Now this does not in any way mean that it is necessary to use SQL queries to manipulate the data in the DB, forgoing the benefits of abstraction over various RDBMSs' quirks. Since you're using the Clojure tag, perhaps you might be interested in ClojureQL, an embedded DSL for communicating with various DBs in a generic way. (Note that it's being reworked just now.) You can use some such DSL for extracting data; manipulate the data thus obtained using pure functions; then display some results and maybe persist some data back to the DB (using the same DSL).
[1] If you think comparing a technology to the Vietnam war is a bit extreme, I guess I agree, but that doesn't mean that article doesn't do a very good job of describing why one might not want to sink in the ORM quagmire.
[2] Note that you could use the same approach in an OO language and abstract over DB backends in the same way in which it's done in FP languages (see the next paragraph). Of course then your MVC framework would no longer look quite like Rails.
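To illustrate the "tables as tables" idea from this answer, here is a small sketch in Ruby (used for consistency with the rest of this page; the same shape carries over directly to Clojure maps and sequences). Rows are frozen hashes, the table is an array of rows, and the "model logic" is a handful of pure functions. The POSTS data and all function names are made up for illustration.

```ruby
# A "table" is just a sequence of rows; each row is an immutable map.
POSTS = [
  { id: 1, author: "ada",   title: "Hello",  published: true },
  { id: 2, author: "ada",   title: "Drafty", published: false },
  { id: 3, author: "grace", title: "",       published: false }
].map(&:freeze).freeze

# Queries are pure functions from rows to rows.
def published(rows)
  rows.select { |row| row[:published] }
end

def by_author(rows, author)
  rows.select { |row| row[:author] == author }
end

# Validation is a pure predicate on a row.
def valid_post?(row)
  !row[:title].to_s.strip.empty?
end

# "Updating" returns a new row and leaves the original untouched.
def publish(row)
  row.merge(published: true)
end

ada_drafts = by_author(POSTS, "ada").reject { |row| row[:published] }
newly_published = ada_drafts.select { |row| valid_post?(row) }.map { |row| publish(row) }
puts newly_published.map { |row| row[:title] }.inspect
```

Only the step that writes newly_published back to the database would have a side effect; everything before it can be tested with plain data.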
Have a look at the Conjure web application framework for an example of how one might implement an MVC framework in a functional programming language. Conjure uses clj-record for the model layer, which has support for associations and validations.

Do I really need an ORM?

We're about to begin development on a mid-size ASP.Net MVC 2 web site. For a typical page, we grab data and throw it up on the web page, i.e. there is not much pre-processing of the data before it is sent to the UI.
We're now making the decision whether or not to use an ORM and if yes, which one. We had been looking at EF2 AKA EF4 (ASP.Net Entity Framework in VS 2010) as one possibility.
However, I'm thinking a simple solution in this case may be just to use datatables. The reason being that we don't plan to move the data around or process it a lot once we fetch it, so I'm not sure there is that much value in having strongly-typed objects as DTOs. Also, this way we avoid mapping altogether, thereby I think simplifying the code and allowing for faster development.
I should mention budget is an issue on this project, as well as speed of execution. We are striving for simplicity anywhere we can, both to keep the budget smaller, the schedule shorter, and performance fast.
We haven't fully decided this yet, but are currently leaning towards no ORM. Will we be OK with the no ORM approach or is an ORM worth it?
An ORM-tool isn't mandatory!
Jon's advice is sensible, but I think using DataTables isn't ideal.
If you're using an ORM tool, the object model is far simpler than a full-blown OO domain model. Moreover, Linq2Sql and Subsonic, for example, are straightforward to use, allowing very quick code changes when the database changes.
You say you won't move the data around or process it a lot, but any amount of processing will be far easier in ORM objects rather than in DataTables. Again, if the application changes and more processing is required the DataTable solution will be fragile.
If you're not going to practice full-blown Object Oriented Programming (and by that I mean you should have a very deep understanding of OOP, not just the ability to blurt out principles and design pattern names) then NO, it's not worth going for an ORM.
An ORM is only useful if your organization is fully invested in Object Oriented application design, and thus having the problem of having an Object to Relational model mapping. If you're not fully into OO, ORMs will become some sort of nasty roadblock that your organization will then feel it doesn't need.
If your team/organization's programming style has always leaned to keeping business logic in the DB (e.g., stored procs) or sticking to a more or less functional/procedural/static approach at writing objects and methods, do away with ORMs and stick to ADO.NET.
It sounds as if you only need to show data and don't do CRUD.
I have found that an ORM is not the best way to go when displaying lists that consist of data from various tables. You end up loading large object graphs just to get the one field you need.
A SQL-statement is just so much better at this.
I would still return lists of strongly typed objects from the DAL. By that you have a better chance of getting a compile time error when a change in the DAL is not reflected in other layers.
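A rough illustration of that last point, written in Ruby since that is the language used elsewhere on this page. Ruby has no compile step, so the closest analogue to a compile-time error is failing loudly as soon as the row shape and the typed object disagree. CustomerRow and load_customers are hypothetical names, and raw_rows stands in for whatever a SQL query returns.

```ruby
# A typed row object for the DAL to return instead of loose hashes.
CustomerRow = Struct.new(:id, :name, :city, keyword_init: true)

# Hypothetical DAL function: raw_rows stands in for a SQL result set
# with symbol keys. Unknown columns raise ArgumentError immediately.
def load_customers(raw_rows)
  raw_rows.map { |row| CustomerRow.new(**row) }
end

customers = load_customers([
  { id: 1, name: "Ada", city: "London" },
  { id: 2, name: "Grace", city: "New York" }
])
puts customers.map(&:name).join(", ")

# A renamed column in the query now fails at the DAL boundary,
# not silently as a nil somewhere in a view.
begin
  load_customers([{ id: 3, full_name: "Linus", city: "Helsinki" }])
rescue ArgumentError => e
  puts "DAL mismatch: #{e.message}"
end
```

A misspelled field access such as customer.nmae also raises NoMethodError, where a plain hash lookup would quietly return nil.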
If you already have the stored procedures you need, then there probably isn't that much to gain from an ORM. If not, though, my experience has been that working with Linq to Entities is much faster than the traditional stored procedure/strongly typed dataset approach, assuming you are comfortable with Linq queries.
If you aren't worried about mapping to an object model then Linq to SQL is even simpler to use. You certainly don't need to be using a full OO model to reap the productivity benefits.
I would disagree with Malcolm's point about having to bring back graphs: if the ORM supports Linq, you can use a projection to return a flat result with just the data you want, with the added advantage that the query is usually simpler than the corresponding SQL since you can use relationships rather than joins.
Having made the switch and become comfortable with the technology, I can think of almost no good reason not to use one; they all support falling back to SQL stored procedures if you really need to. There will be a learning curve though, and in this case that may make it not worth your while.
I agree with Joe R's advice - for speed of making changes & also the speed of initial development, LINQ-to-SQL or subsonic will get you up and going in no time.
If your application is really this simple and it's just a straight data out/data in direct mapping to the tables, have you considered looking at ASP.net dynamic data?
I'd also point to a good article about this by Scott Guthrie.
It largely depends on how familiar you are with ORMs.
I personally think that NHibernate, which is the king of ORMs in the .NET world, allows much more rapid development as once set up, you can pretty much forget about how you are getting data out of the database.
However, there is a steep learning curve to it, especially if you try and do things in a non-hacky way (which you should), so if your team doesn't have experience here and time is pressing then it probably won't cut it.
Linq2SQL is way too simple. Don't know about Subsonic, but if you are going to use an ORM it may be a good balance between having rapid development and getting something too powerful and complex.
Ultimately though, as a team, I think you want to learn NHibernate which is not time consuming to set up on a small to medium project once you know what you are doing, but is very powerful.

Do we use Rails ActiveRecord as a Hybrid Structure, i.e. Data Structure+Object?

I have been using Rails for over 4 years so obviously I like Rails and like doing things the Rails Way and sometimes I unknowingly fall to the dark side.
I recently picked up Clean Code by Uncle Bob. I am on Chapter 6 and a bit confused whether we as rails developers break the very fundamental rule of OO design, i.e. Law of Demeter or encapsulation? The Law of Demeter states that an object should not know the innards of another object and it should not invoke methods on objects that are returned by a method because when you do that then it suggests one object knows too much about the other object.
But very often we call methods on another object from a model. For example, when we have a relationship like 'An order belongs to a user'. Then very often we end up doing order.user.name or to prevent it from looking like a train wreck we set up a delegate to do order.name.
Isn't that still like breaking the Law of Demeter or encapsulation ?
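For readers who have not seen the delegation being described, here is a plain-Ruby sketch using the standard library's Forwardable module instead of Rails' delegate macro. The User and Order classes are hypothetical stand-ins for ActiveRecord models, and the delegated method is named user_name here rather than name.

```ruby
require "forwardable"

# Hypothetical stand-ins for two associated models.
User = Struct.new(:name)

class Order
  extend Forwardable

  attr_reader :user

  # Exposes order.user_name so callers need not reach through order.user.
  def_delegator :@user, :name, :user_name

  def initialize(user)
    @user = user
  end
end

order = Order.new(User.new("Ada"))
puts order.user.name # reaches through the association
puts order.user_name # asks the order directly
```

Both calls print the same value. The delegate hides the chain from callers, but Order still knows that a user has a name, which is why the question of whether this really satisfies the Law of Demeter is a fair one.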
The other question is: is ActiveRecord just a Data Structure or Data Transfer Object that interfaces with the database?
If yes, then don't we create a Hybrid Structure, i.e. half object and half data structure by putting our business rules in ActiveRecord Models?
Rails is Rails. What else is there to say. Yes, some of the idioms in Rails violate good design principles. But we tolerate this because it's the Rails way.
Having said that, there is far too much model usage in most rails applications. Far too often I see view code directly accessing models. I see business rules folded into the active record object. A better approach would be to isolate the business rules from the active records and isolate the views from the models. This wouldn't violate any rails idioms, and would make rails applications a lot more flexible and maintainable.
IMHO if you follow the purist approach too much then you end up in a mess like Java where it uses all the right design patterns but no-one can remember the eight lines of code you need just to open a file and read its contents.
Rails' ActiveRecord framework is an implementation of Martin Fowler's Active Record design pattern. Active Records in Rails are certainly not just dumb data structures or DTOs because they have behaviour: they perform validation, they can tell you if their attributes have changed etc. and you're free and indeed encouraged, to add your own business logic in there.
Rails in general encourages good practice e.g. MVC and syntactic vinegar to make doing bad things difficult and/or ugly.
Yes, ActiveRecord deliberately breaks encapsulation. This is not so much a limitation of Rails as it is a limitation of the pattern it's based on. Martin Fowler, whose definition of ActiveRecord was pretty much the template Rails used, says as much in the ActiveRecord chapter of POEAA:
"Another argument against Active Record is the fact that it couples the object design to the database design. This makes it more difficult to refactor either design as a project goes forward."
This is a common criticism of Rails from other frameworks. Fowler himself says ActiveRecord is mainly to be used
"...for domain logic that isn't too complex...if your business logic is complex, you'll soon want to use your object's direct relationships, collections, inheritance and so forth. These don't map easily onto Active Record."
Fowler goes on to say that for more serious applications with complex domain logic the Data Mapper pattern, which does a better job of separating the layers, is preferable. This is one of the reasons that Rails' upcoming merge with Merb has been generally seen as a positive move for Rails, as Merb makes use of the DataMapper pattern in addition to ActiveRecord.
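As a minimal sketch of the separation Fowler is describing, here is the Data Mapper shape in plain Ruby. This is not the DataMapper gem's API; Account, AccountMapper and the column names are hypothetical, and an array of hashes stands in for a database table.

```ruby
# Domain object: knows nothing about rows, columns, or storage.
class Account
  attr_reader :id, :owner, :balance

  def initialize(id:, owner:, balance:)
    @id = id
    @owner = owner
    @balance = balance
  end

  def withdraw(amount)
    raise ArgumentError, "insufficient funds" if amount > @balance

    @balance -= amount
    self
  end
end

# Mapper: the only place that knows the storage layout.
class AccountMapper
  def initialize(table)
    @table = table # array of row hashes standing in for a database table
  end

  def find(id)
    row = @table.find { |r| r["account_id"] == id }
    return nil unless row

    Account.new(id: row["account_id"], owner: row["owner_name"],
                balance: row["balance_cents"])
  end

  def save(account)
    row = { "account_id" => account.id, "owner_name" => account.owner,
            "balance_cents" => account.balance }
    index = @table.index { |r| r["account_id"] == account.id }
    index ? @table[index] = row : @table << row
    account
  end
end

table = [{ "account_id" => 1, "owner_name" => "Ada", "balance_cents" => 500 }]
mapper = AccountMapper.new(table)
mapper.save(mapper.find(1).withdraw(200))
puts table.first["balance_cents"] # prints 300
```

Renaming a column only touches AccountMapper, and changing withdrawal rules only touches Account. With Active Record, both changes land in the same class.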
I'm not sure Demeter is the primary concern with ActiveRecord. Rather I think breaking encapsulation between the data and domain layers breaks Uncle Bob's Single Responsibility Principle. Demeter I think is more a specific example of how to follow the Open/Closed Principle. But I think the broader idea behind all these is the same: classes should do one thing and be robust against future changes, which to some degree ActiveRecord is not.
Concerning the "Law of Demeter", one thing I've not seen mentioned is the concept of distance. By that I mean, "How closely related are the objects involved?" It is my opinion that this makes some difference to whether I care to follow the "Law of Demeter" or not.
In the case of ActiveRecord, the objects involved in most of the LoD violations are inseparably bound together into a close relationship. Changing the internal data structure of these objects requires a change in the database to reflect that new structure. The tables of a database are typically "bound" together into a single database, which even reflects these "associations" through foreign key constraints (or at least contains primary & foreign keys).
So I don't generally concern myself with following LoD between my AR objects. I know that they are tightly bound to each other due to their very nature.
On the other hand I would be more concerned about LoD between more distant objects, especially those that cross MVC boundaries or any other such design device.

Sequel in conjunction with ActiveRecord any gotchas?

I'm considering using Sequel for some of my hairier SQL that I find too hard to craft in Active Record.
Are there any things I need to be aware of when using Sequel and ActiveRecord on the same project? (Besides the obvious ones like no AR validations in sequel etc...)
Disclaimer: I'm the Sequel maintainer.
Sequel is easy to use alongside or instead of ActiveRecord when using Rails. You do have to set up the database connection manually, but other than that, the usage is similar. Your Sequel model files go in app/models and work similarly to ActiveRecord models.
Setting up the database connections isn't tedious, it's generally one line in environment.rb to require sequel, and a line in each environment file (development.rb, test.rb, production.rb) to do something like:
DB = Sequel.connect(...)
So it's only tedious if you consider 4 lines of setup code tedious.
Using raw SQL generally isn't a problem unless you are targeting multiple databases. The main reason to avoid it is the increased verbosity. Sequel supports using raw SQL at least as easily as ActiveRecord, but the times where you need to use raw SQL are generally fairly rare in Sequel.
BTW, Sequel ships with multiple validation plugins. The validation_class_methods plugin is similar to ActiveRecord validations, using class methods. The validation_helpers plugin has a simpler implementation using instance level methods, but both can do roughly the same thing.
Finally, I'll say that if you already have working ActiveRecord code that does what you want, it's probably not worth the effort to port the code to Sequel unless you plan on adding features.
Personally, I wouldn't do it. Just managing connection more-or-less by hand would be tedious, for a start. I'd be more inclined, if I felt Sequel was the stronger option, to hold off for Rails 3.0 (or perhaps start developing against Edge Rails) where it should be fairly easy to switch ORMs, if Yehuda and co are doing their stuff right. A lot more Merb-like than now, at least.
This was DHH's take on the subject (I'm not saying it should be taken as gospel truth, mind, but it is, so to speak, from the horse's mouth):
But Isn’t SQL Dirty?
Ever since programmers started to layer object-oriented systems on top of relational databases, they’ve struggled with the question of how deep to run the abstraction. Some object-relational mappers seek to eradicate the use of SQL entirely, striving for object oriented purity by forcing all queries through another OO layer.
Active Record does not. It was built upon the notion that SQL is neither dirty nor bad, just verbose in the trivial cases. The focus is on removing the need to deal with the verbosity in those trivial cases but keeping the expressiveness around for hard queries – the type SQL was created to deal with elegantly.
Therefore, you shouldn’t feel guilty when you use find_by_sql() to handle either performance bottlenecks or hard queries. Start out using the object-oriented interface for productivity and pleasure, and then dip beneath the surface for a close-to-the-metal experience when you need to.
(Quote was found here, original text is on p334 of AWDRWR, the "hammock" book).
I think that's reasonable.
Are we talking about something that find_by_sql can't handle? Or are we talking about complex non-SELECT stuff that execute can't deal with?
Any examples we could look at?
