Is entity framework a bad choice for multiple websites and large aplications? - asp.net-mvc

Scenario : We currently have a website and are working on building couple of websites with an admin website. We are using asp.net-mvc , SQL Server 2005 and Entity Framework 4. So, currently we have a single solution that has all the websites and all the websites are using the same entity framework model. The Model currently has over 70 tables and will potentially have a lot more in the future... around 400?
Questions : Is Entity Framework model going to be slower when it is going to grow bigger? I have read quite a few articles where they say it is pretty slow due to the additional layers of mapping when as compared to say ado.net? Also , we thought of having multiple models but it seems that it is a bad practice too and is LINQ useful when we are not using any ORM?
So, we are just curious what and how all the large websites using a similiar technology as we have achieve good performance while using an ORM like EF or do they never opt for an ORM ? I have also worked on a LINQ to SQL application that had over 150 tables and we encurred a huge startup penalty, site took 15-20 seconds to respond when first loaded. I am pretty sure this was due to large startup cost of LINQ to SQL ORM. It would be great if someone can share their experience and thoughts regarding this ? What are the best practices to follow and I know it depends on every application but if performance is a concern then what are the best steps to be taken ?

I don't have a definite answer for you, but I have found this SO post: ORM performance cost, it will probably be informative for you, expecially the second highest answer mentioning this site:
http://ormbattle.net/
My personal experience is that for any ORM mapper I have seen so far, Joel's law of leaky abstraction applies heavily. So if you are going to use EF, make sure you have alternatives for optimization at hand.

I think you can certainly get EF4 to work in a performant way with a database with a large number of tables. That said, you will certainly have to overcome a number of hurdles that are specific to EF.
I don't think LinqToSql is a good alternative since Microsoft has stopped enhancing it for the most part.
What other alternatives have you considered? ADO.NET? NHibernate? Stored Procedures?
I know NHibernate may have trouble establishing the SessionFactory for 400 tables quickly, but that only happens once when the website application starts, which should be fairly rare if the application is used heavily. Each web request generally has a new Session and creating sessions from the session factory is very quick and inexpensive.

My biggest concern with EF is the management of the thing, if you have multiple models, then you're suddenly going to have multiple work to do maintaining them, making sure you never update the wrong model for the right database, or vice versa. This is a problem for us at the moment, and looks to only get worse.
Personally, I like to write SQL rather than rely on an abstraction on top of an abstration. The DB knows SQL and so I keep it happy with hand-crafted stored procedures, or hand-crafted SQL in some cases. One huge benefit to this is that I can reply code to see what it was trying to do, and see the resulting data by c&p the sql from the log to the sql query editor. That, in my opinion, makes support so much easier it entirely invalidates any programmer benefit you might have from using an ORM in the first place (especially as EF generates absolutely unreadable SQL).
In fact, come to think of it, the only benefit an ORM gives you is that you can code a bit quicker (once you have everything set up and are not changing the schema, of course), and ultimately, I don't think the benefit is worth the cost, not when you consider that I spend most of my coding time thinking about what I'm going to do as the 'doing it' part is relatively small cmpared to the design, test, support and maintain parts.

Related

Do I really need an ORM?

We're about to begin development on a mid-size ASP.Net MVC 2 web site. For a typical page, we grab data and throw it up on the web page, i.e. there is not much pre-processing of the data before it is sent to the UI.
We're now making the decision whether or not to use an ORM and if yes, which one. We had been looking at EF2 AKA EF4 (ASP.Net Entity Framework in VS 2010) as one possibility.
However, I'm thinking a simple solution in this case may be just to use datatables. The reason being that we don't plan to move the data around or process it a lot once we fetch it, so I'm not sure there is that much value in having strongly-typed objects as DTOs. Also, this way we avoid mapping altogether, thereby I think simplifying the code and allowing for faster development.
I should mention budget is an issue on this project, as well as speed of execution. We are striving for simplicity anywhere we can, both to keep the budget smaller, the schedule shorter, and performance fast.
We haven't fully decided this yet, but are currently leaning towards no ORM. Will we be OK with the no ORM approach or is an ORM worth it?
An ORM-tool isn't mandatory!
Jon's advice is sensible, but I think using DataTables isn't ideal.
If you're using an ORM tool, the object model is far simpler than a full-blown OO domain model. Moreover, Linq2Sql and Subsonic, for example, are straight-forward to use. Allowing very quick code changes when the database changes.
You say you won't move the data around or process it a lot, but any amount of processing will be far easier in ORM objects rather than in DataTables. Again, if the application changes and more processing is required the DataTable solution will be fragile.
If you're not going to practice full-blow Object Oriented Programming (and by that I mean you should have a very deep understanding of OOP, not just the ability to blurt out principles and design pattern names) then NO, it's not worth going for an ORM.
An ORM is only useful if your organization is fully invested in Object Oriented application design, and thus having the problem of having an Object to Relational model mapping. If you're not fully into OO, ORMs will become some sort of nasty roadblock that your organization will then feel it doesn't need.
If your team/organization's programming style has always leaned to keeping business logic in the DB (e.g., stored procs) or sticking to a more or less functional/procedural/static approach at writing objects and methods, do away with ORMs and stick to ADO.NET.
It sound as if you only need to show data and dont do CRUD.
I have found that ORM is not the best way to go when displaying lists that consists of data from various tables. You end up loading large objectgraphs just to get this one needed field.
A SQL-statement is just so much better at this.
I would still return lists of strongly typed objects from the DAL. By that you have a better chance of getting a compile time error when a change in the DAL is not reflected in other layers.
If you already have stored procedures you need then there probably isn't that much to gain from an ORM. If not though my experience has been that working with Linq to Entites has been much faster than the traditional stored procedure/strongly typed dataset approach assuming you are comfortable with Linq queries.
If you aren't worried about mapping to an object model then Linq to SQL is even simpler to use. You certainly don't need to be using a full OO model to reap the productivity benefits.
It would disagree with Malcolm's point about having to bring back graphs, if the ORM supports Linq you can use a projection to return a flat result with just the data you want with the added advantage the query is usually simpler than the corresponding SQL since you can use relationships rather than join.
Having made the switch and become comfortable with the technology I can think of this almost no good reason not to use one, they all support falling back to SQL stored procedures if you really need to. There will be a learning curve though and in this case that may make it not worth your while.
I agree with Joe R's advice - for speed of making changes & also the speed of initial development, LINQ-to-SQL or subsonic will get you up and going in no time.
If your application is really this simple and it's just a straight data out/data in direct mapping to the tables, have you considered looking at ASP.net dynamic data?
I'd also point to a good article about this by Scott Guthrie.
It largely depends on how familiar you are with ORMs.
I personally think that NHibernate, which is the king of ORMs in the .NET world, allows much more rapid development as once set up, you can pretty much forget about how you are getting data out of the database.
However, there is a steep learning curve to it, especially if you try and do things in a non-hacky way (which you should), so if your team doesn't have experience here and time is pressing then it probably won't cut it.
Linq2SQL is way too simple. Don't know about Subsonic, but if you are going to use an ORM it may be a good balance between having rapid development and getting something too powerful and complex.
Ultimately though, as a team, I think you want to learn NHibernate which is not time consuming to set up on a small to medium project once you know what you are doing, but is very powerful.

Is Using Db4o For Web Sites a judicious choice?

Is using Db4o as a backend datastore for a Web site (ASP.NET MVC) a judicious choice as an alternative to MS SQL Server ?
The main issue with DB4o is: Can you cut your object net in some useful manner? If not, then you'll keep too many objects in RAM for too long and your performance will suffer.
For example, in SQL, you can create a cursor and then easily traverse a huge set of results. You can also query for a small set of columns while DB4o always loads the whole objects (and its references and the references of the references). With DB4o, you must make sure that DB4o doesn't try to pull in all objects from the DB at once.
You'll also need to get used to querying things your "DB" by filling out example objects which feels weird in the beginning.
That depends, what kind of site your creating, the traffic your expecting etc...Are you going to handle a million requests a second, or 100 a minute...Does your domain justify using a Object Database? Do you really need it?
In general, most sites are not heavy hitters so they might not require all the scale out functionality (I believe and this is only a belief that traditional RDBMS have been tested and designed to handle extreme loads where as Object DB's might not have been given the same attention).
So then the question is does your domain justify this? Your going to base a core piece of your site on a technology that you will not find a lot of experts in. So how do you handle turn over rate? Are you willing to take the cost associated with training all current and future employees on this?

In terms of performance, which is faster: Linq2SQL or Linq2Entities (in ASP.net MVC)

Comparing the two data models in an asp.net MVC app, which provides better performance, LINQ 2 SQL or LINQ 2 Entities.
They don't always perform the same database operations when you use the same methods, or have the same LINQ methods for that matter....
You should benchmark the specific query yourself. However, LINQ to Entities provides an extra abstraction layer in comparison to LINQ to SQL which is likely to impede performance a little.
Performance issues aside, if your application is expected to work with SQL Server backends only and you need a one to one mapping between entities and tables, LINQ to SQL is simpler to use.
Maintainability over performance should be a goal at the outset of most projects even if performance is a requirement. Performance can be tuned later and with good design the ORM/data mapper can usually be swapped out later if it is necessary.
Don't bog yourself down with decisions like this early in a project. Sort with what is easier or what you already know. Then tune later.
Benchmark yourself for good results. Your environment is invariably always different to what the developer/manufacturer might use for comparisons.
It depends very specifically on what you're doing. There is no single answer to this.
If you are willing to take the performance hit involved in using an ORM, both of them should be suitable.
My advice would is always go with an ORM for as much code as you can. If you have something that's running unacceptably slowly then make a stored procedure to do that specific query.
This article seems to make some comprehensive comparisons:
http://toomanylayers.blogspot.com/2009/01/entity-framework-and-linq-to-sql.html
My general input though when I hear performance being mentioned is to choose a solution that best solves your problem until you discover that your problem is performance.

If using LINQ to SQL is there any good reason to learn SQL queries/syntax anymore?

I do understand SQL querying and syntax because of previous work using ASP.NET web forms and stored procedures, but I would not call myself an "expert" in it.
Since I have been using ASP.NET MVC and LinqToSql it seems that so much of the heavy lifting is done for me and encapsulated away at the SQL end that I'm questioning whether there is any benefit in continuing to top-up my knowledge of SQL queries or whether I'm better off focusing my "learning time" on other things.
Your thoughts?
You should absolutely know SQL and keep your knowledge up-to-date. ORM is designed to ease the pain of doing something tedious that you know how to do, much like a graphing calculator is designed to do something that you can do by hand (and should know how).
The minute you start letting your ORM do things in the database that you don't fully understand is the minute you've lost control over your model.
In my opinion, knowing SQL is more valuable than any vendor specific technology. There will always be cases when those nice prepackaged frameworks will not be able to solve a particular situation and knowledge of advanced SQL will be required.
It is still important to learn SQL queries/syntax. The reason is you need to at least understand how Linq to SQL translate to the database behind the scenes.
This will help you when you find problems, for example something not updating correctly. Or a query performance needs to increase.
It is the same that you need to understand what assembly language is and how it eventually becomes machine language. However in all you don't have to be an expert, but at least be able to write in it and understand it.
It is still important to know SQL and the paradigm (set-based) behind it to be able to create efficient SQL statements, even if your using LinqToSql or any other OR/M.
There will always be situations where you will want to write the query in native SQL because it is not possible to write it in LinqToSql / HQL / whatever, or LinqToSql is just not able to generate a performant query for it.
There will always be situations where you will want to execute an ad-hoc query on a database using native sql, etc...
I think LinqToSQL (or other Linq to SQL providers) should not prevent you of knowing SQL.
When your query is not returning what you expect, or when it takes 30 minutes to run on the production database, you'd better be able to understand what LTS has generated, and why it is failing.
I know, it's a rehashed topic, and it might not be applicable to what you do ("small" database that will never hit that kind of problem etc), but it pays not to get too oblivious of abstraction layers sometimes.
The other reason is, Linq does not the whole range of what you can do in SQL, so you might have to resort to writing "raw" SQL, even if the result is materialised as objects.
It depends what you're working on, and from what you said it might make more sense to focus on other areas.
Having said that I find knowing SQL allows the following:
The ability to write queries to extract data from systems easily.
For adhoc queries, or for checking things.
The ability to write complex stored procedures, which allows me to group complex data processing in one place, where it should be, in the database.
The ability to fine tune LinqToSql by adding indexes, and understanding the SQL/query plan's it procedures.
Most of these are more of a help on more complex systems, so if you're not working on those it might not be as much of a help.
It may help in your situation to list the technologies which might be of use, and then prioritise them.
In order words make a development plan for yourself, which may encompass more then just learning technical knowledge but allow a more broad focus like design patterns, communication skills and other areas.
SQL is a tool. Linq to SQL is also a tool. Having more tools in your belt is a good thing. It'll give you more perspectives when attacking a problem.
Consider a scenario where you may want to do multiple queries or multiple updates to the db in one operation. If you can write TSQL you can potentially save yourself a lot of roundtrips to the database.
I would say you definately need to know your SQL in depth, because you need to know what code your Linq-expression generates and what effects the code will have if you want high performing queries. Sure you might get the job done in most cases, but sometimes there is a huge difference in performance in very subtle difference in Linq-syntax.
I ran into this this morning actually, where I had done .Any(d => d.Id == (...).First().Id) instead of doing where (...).Any(i => i.Id == d.Id). This resulted in the query executing five times slower.
Sometimes you need to analyze the actual Sql-query to realise the mistakes you make.
Its always a good think to learn the underlying language for stuff like Linq To SQL. SQL is pretty much standardized and it will help you understand a new paradigm in programming.
You may not always be working in .NET.
Doesn't hurt to know the underlying concepts.
LINQ to SQL is not being maintained anymore in favor of the Entity Framework
Sooner or later you will run into problems that need at leat a working knowledge of SQL to solve. And sooner or later you will run into requirements that are best realised in the DB (whether in SP-s or in triggers or views or whaterver).
LINQ To SQL will only work with .NET. IF you happen get another job where you are not working with .NET, then you will have to go back to writing Stored Procs.
Knowing SQL will also give you a better understanding of how the server operates as well as possibly making you a better database designer.

Ruby on Rails scalability/performance? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 7 years ago.
Improve this question
I have used PHP for awhile now and have used it well with CodeIgniter, which is a great framework. I am starting on a new personal project and last time I was considering what to use (PHP vs ROR) I used PHP because of the scalability problems I heard ROR had, especially after reading what the Twitter devs had to say about it. Is scalability still an issue in ROR or has there been improvements to it?
I would like to learn a new language, and ROR seems interesting. PHP gets the job done but as everyone knows its syntax and organization are fugly and it feels like one big hack.
To expand on Ryan Doherty's answer a bit...
I work in a statically typed language for my day job (.NET/C#), as well as Ruby as a side thing. Prior to my current day job, I was the lead programmer for a ruby development firm doing work for the New York Times Syndication service. Before that, I worked in PHP as well (though long, long ago).
I say that simply to say this: I've experienced rails (and more generally ruby) performance problems first hand, as well as a few other alternatives. As Ryan says, you aren't going to have it automatically scale for you. It takes work and immense amounts of patience to find your bottlenecks.
A large majority of the performance issues we saw from others and even ourselves were dealing with slow performing queries in our ORM layer. We went from Rails/ActiveRecord to Rails/DataMapper and finally to Merb/DM, each iteration getting more speed simply because of the underlying frameworks.
Caching does amazing wonders for performance. Unfortunately, we couldn't cache our data. Our cache would effectively be invalidated every five minutes at most. Nearly every single bit of our site was dynamic. So if/when you can't do that, perhaps you can learn from our experience.
We had to end up seriously fine tuning our database indexes, making sure our queries weren't doing very stupid things, making sure we weren't executing more queries than was absolutely necessary, etc. When I say "very stupid things", I mean the 1 + N query problem...
# 1 query
Dog.find(:all).each do |dog|
# N queries
dog.owner.siblings.each do |sibling|
# N queries per above N query!
sibling.pets.each do |pet|
# Do something here
end
end
end
DataMapper is an excellent way to handle the above problem (there are no 1 + N problems with it), but an even better way is to use your brain and stop doing queries like that. When you need raw performance, most of the ORM layers won't easily handle extremely custom queries, so you might as well hand write them.
We also did common sense things. We bought a beefy server for our growing database, and moved it off onto it's own dedicated box. We also had to do TONS of processing and data importing constantly. We moved our processing off onto its own box as well. We also stopped loading our entire freaking stack just for our data import utilities. We tastefully loaded only what we absolutely needed (thus reducing memory overhead!).
If you can't tell already... generally, when it comes to ruby/rails/merb, you have to scale out, throwing hardware at the problem. But in the end, hardware is cheap; though that's no excuse for shoddy code!
And even with these difficulties, I personally would never start projects in another framework if I can help it. I'm in love with the language, and continually learn more about it every day. That's something that I don't get from C#, though C# is faster.
I also enjoy the open source tools, the low cost to start working in the language, the low cost to just get something out there and try to see if it's marketable, all the while working in a language that often times can be elegant and beautiful...
In the end, it's all about what you want to live, breathe, eat, and sleep in day in and day out when it comes to choosing your framework. If you like Microsoft's way of thinking, go .NET. If you want open source but still want structure, try Java. If you want to have a dynamic language and still have a bit more structure than ruby, try python. And if you want elegance, try Ruby (I kid, I kid... there are many other elegant languages that fit the bill. Not trying to start a flame war.)
Hell, try them all! I tend to agree with the answers above that worrying about optimizations early isn't the reason you should or shouldn't pick a framework, but I disagree that this is their only answer.
So in short, yes there are difficulties you have to overcome, but the elegance of the language, imho, far outweighs those shortcomings.
Sorry for the novel, but I've been there and back with performance issues. It can be overcome. So don't let that scare you off.
RoR is being used with lots of huge websites, but as with any language or framework, it takes a good architecture (db scaling, caching, tuning, etc) to scale to large numbers of users.
There's been a few minor changes to RoR to make it easier to scale, but don't expect it to scale magically for you. Every website has different scaling issues, so you'll have to put in some work to make it scale.
Develop in the technology that is going to give your project the best chance of success - quick to develop in, easy debugging, easy deployment, good tools, you know it inside out (unless the point is to learn a new language), etc.
If you get tens of million of uniques a month you can always hire in a couple of people and rewrite in a different technology if you need to as ...
... you'll be rake-ing in the cache (sorry - couldn't resist!!)
First of all, it would perhaps make more sense to compare Rails to
Symfony, CodeIgniter or CakePHP, since Ruby on Rails is a complete web application
framework. Compared to PHP or PHP frameworks, Rails applications offer
the advantages that they are small, clean, and readable. PHP is perfect
for small, personal pages (originally it stood for "Personal Home Page"),
while Rails is a full MVC framwork which can be used to build large
sites.
Ruby on Rails has not a larger scalability issue than comparable PHP frameworks.
Both Rails and PHP will scale well if you have only a moderate number
of users (10,000-100,000) which operate on a similar number of objects.
For a few thousand users a classic monolithic architecture will
be sufficient. With a bit of M&M (Memcached and MySQL) you can also
handle millions of objects. The M&M architecture uses a MySQL server to
handle writes and Memcached to handle high read loads. The traditional
storage pattern, a single SQL server using normalized relational tables
(or at best a SQL Master/Multiple Read Slave setup), no longer works
for very large sites.
If you have billions of users like Google, Twitter and Facebook, then
probably a distributed architecture will be better. If you really want to
scale your application without limit, use some kind of cheap commodity hardware
as a foundation, divide your application into a set of services, keep
each component or service scalable itself (design every component as
a scalable service), and adapt the architecture to your application.
Then you will need suitable scalable datastores like NoSQL databases
and distributed hash tables (DHTs), you will need sophisticated map-reduce
algorithms to work with them, you will have to deal with SOA, external
services, and messaging. Neither PHP nor Rails offer a magic bullet here.
What is breaks down to with RoR is that unless you're in Alexa's top 100, you will not have any scalability problems. You'll have more issues with stability on shared hosting unless you can squeeze Phusion, Passenger, or Mongrel out.
Take a little while to look at the problems the Twitter people had to deal with, then ask yourself if your app is going to need to scale to that level.
Then build it in Rails anyway, because you know it makes sense. If you get to Twitter-level volumes then you'll be in the happy position of considering performance optimisaton options. At least you'll be applying them in a nice language!
You can't compare PHP and ROR, PHP is a scripting language as Ruby, and Rails is a framework as CakePHP.
Stated that, I strongly suggest you Rails, because you will have an application strictly organized in MVC pattern, and this is a MUST for your scalability requirement. (Using PHP you had to take care about the project organization on your own).
But for what about scalability, Rails it's not just MVC: For instance, you can start to develop your application with a database, changing it on road without any effort (in the most part of cases), so we can state that a Rails application is (almost) database indipendent because it's ORM (that allow you to avoid database query), you can do a lot of other stuff. (take a look to this video http://www.youtube.com/watch?v=p5EIrSM8dCA )
Just wanted to add some more info to Keith Hanson's smart point about 1 + N problem where he states:
DataMapper is an excellent way to handle the above problem (there are no 1 + N problems with it), but an even better way is to use your brain and stop doing queries like that. When you need raw performance, most of the ORM layers won't easily handle extremely custom queries, so you might as well hand write them.
Doctrine is one of the most popular ORM's for PHP. It addresses this 1 + N complexity problem intrinsic to ORMs by providing a language called Doctrine Query Language (DQL). This allows you to write SQL like statements that use your existing model relationships. e.g
$q = Doctrine_Query::Create()
->select(*)
->from(ModelA m)
->leftJoin(m.ModelB)
->execute()
I'm getting the impression from this thread that the scalability issues of ROR come down primarily to the mess that ORMs are in with regard to loading child objects - ie the '1+N' problem mentioned above. In the above example that Ryan gave with dogs and owners:
Dog.find(:all).each do |dog|
#N queries
dog.owner.siblings.each do |sibling|
#N queries per above N query!!
sibling.pets.each do |pet|
#Do something here
end
end
end
You could actually write a single sql statement to get all that data, and you could also 'stitch' that data up into the Dog.Owner.Siblings.Pets object heirarchy of your custom-written objects. But could someone write an ORM that did that automatically, so that the above example would incur a single round-trip to the DB and a single SQL Statement, instead of potentially hundreds? Totally. Just join those tables into one dataset, then do some logic to stitch it up. It's a bit tricky to make that logic generic so it can handle any set of objects but not the end of the world. In the end, tables and objects only relate to each other in one of three categories (1:1, 1:many, many:many). It's just that no one ever built that ORM.
You need a syntax that tells the system upfront what children you want to load for this particular query. You can sort of do this with the 'eager' loading of LinqToSql (C#), which is not a part of ROR, but even though that results in one round trip to the DB, it's still hundreds of separate SQL statements the way it has currently been set up. It's really more about the history of ORMs. They just got started down the wrong path with that and never really recovered in my opnion. 'Lazy loading' is the default behavior of most ORMs, ie incurring another round trip for every mention of a child object, which is crazy. Then with 'eager' loading - loading the children upfront, that is set up statically in everything I am aware outside of LinqToSql - ie which children always load with certain objects - as if you would always need the same children loaded when you loaded a collection of Dogs.
You need some kind of strongly-typed syntax saying that this time I want to load these children and grandchilren. Ie, something like:
Dog.Owners.Include()
Dog.Owners.Siblings.Include()
Dog.Owners.Siblings.Pets.Include()
then you could issue this command:
Dog.find(:all).each do |dog|
The ORM system would know what tables it needs to join, then stitch up the resulting data into the OM heirarchy. It's true that you can throw hardware at the current problem, which I'm generally in favor of, but it's no reason the ORM (ie Hibernate, Entity Framework, Ruby ActiveRecord) shouldn't just be better written. Hardware really doesn't bail you out of an 8 round-trip, 100-SQL statement query that should have been one round trip and one SQL statement.

Resources