Related
What's the purpose of stored procedures compared to the use of an ORM (nHibernate, EF, etc) to handle some CRUD operations? To call the stored procedure, we're just passing a few parameters and with an ORM we send the entire SQL query, but is it just a matter of performance and security or are there more advantages?
I'm asking this because I've never used stored procedures (I just write all SQL statements with an ORM and execute them), and a customer told me that I'll have to work with stored procedures in my next project, I'm trying to figure out when to use them.
Stored Procedures are often written in a dialect of SQL (T-SQL for SQL Server, PL-SQL Oracle, and so on). That's because they add extra capabilities to SQL to make it more powerful.
On the other hand, you have a ORM, let say NH that generates SQL.
the SQL statements generated by the ORM doesn't have the same speed or power of writing T-SQL Stored Procedures.
Here is where the dilemma enters: Do I need super fast application tied to a SQL Database vendor, hard to maintain or Do I need to be flexible because I need to target to multiple databases and I prefer cutting development time by writing HQL queries than SQL ones?
Stored Procedure are faster than SQL statements because they are pre-compiled in the Database Engine, with execution plans cached. You can't do that in NH, but you have other alternatives, like using Cache Level 1 or 2.
Also, try to do bulk operations with NH. Stored Procedures works very well in those cases. You need to consider that SP talks to the database in a deeper level.
The choice may not be that obvious because all depends of the scenario you are working on.
The main (and I'm tempted to say "only") reason you should use stored procedures is if you really need the performance.
It might seem tempting to just create "functions" in the database that do complex stuff quickly. But it can quickly spiral out of control.
I've worked with applications that encapsulate so much business logic in SQL, that it becomes virtually impossible to refactor anything. Literally hundreds of stored procedures that are black boxes for devs working with the ORM.
Such applications become brittle, hard to debug and hard to understand. By allowing business logic to live in stored procedures you are allowing SQL developers to make design choices that they shouldn't be making, in a tool that is much harder to work in, log and debug than an ORM. I've seen stored procedures that handle payment processing. Truly core stuff. Stuff that becomes so central to an application that nobody dares to touch it, all because some guy with good SQL skills made a script 5 years to fix something quickly, it was never migrated to the ORM and eventually grew into an unmanageable monster, full of tangled logic nobody understands. Devs end up having to blindly trust whatever it's doing. And what's worse, it's almost always outside test coverage, so you may break everything when you deploy, even if your tests pass with mocked data, but some ancient stored procedure suddenly starts acting up.
Abusing stored procedures is one of the worst forms of technical debt you can accumulate. The database, which is the persistence layer, should not be used for business logic. You should keep that distinction as strict as you can.
Of course, there will be cases where an ORM will have horrible performance or simply won't support a feature you need from SQL. If doing things in raw SQL is truly inevitable, only then should you consider stored procedures.
I've seen Stored Procedure Hell. You don't want that.
There are significant performance advantages to stored procedures in some circumstances. Often the queries generated by Linq and other ORMs can be inefficient, but still good enough for your purposes. Some RBDMS (such as SQL Server) will cache the execution plans for stored procedures, saving on query time. For more complex queries that you use frequently, this savings in performance can be critical.
For most normal CRUD, though, I have found that it is usually better for maintainability just to use the ORM if it is available and if its operations serve your needs. Entity Framework works quite well for me in the .NET world most of the time (in combination with Linq), and I like Propel a lot for PHP.
I stumbled over this pretty old question but I am shocked that the most important benefit of Stored Procedures is not even mentioned.
Security and resource protection
Using SPs you are able to grand execution rights for that SP to a user. The user can execute the SP and only that SP. You do not even have to give the user read or write access to the tables used. The user does not even have to know the tables used.
Using ORM you have to give read or/and write access to the tables used and users. The user can read all data from all the tables you granted the rights and even can combine them in queries, if you want it or not, and also can run queries that creates heavy load on the Database server.
This is especially useful in cases where application development and database development is done by different teams and the database is used by more than one application.
The primary use I find for them is to implement an abstraction layer and encapsulate query logic. In the same way that I write functions in a procedural language.
As le dorfier mentions one of the the primary reasons sprocs (and/or views) should be used is to provide an abstraction layer between a database and its clients (web apps, reports, ETLs etc)
This 'DB API' can make it easier to change/refactor your database without necessarily affecting clients.
See - Why use stored procs - for a more in depth discussion
I mainly stick to linq to sql as an ORM and i think its great, but there is still a place for stored procedures. Mainly i use the when the query i want to run is very complex, with many joins (especially outer joins, which suck in Linq), lots of aggregation in subqueries perhaps, recursive CTE's, and other similar scenarios.
For general crud though, there is no need.
hi all
We do know that CommandType property of a SqlCommand object has 3 options: TableDirect, Text and StoredProcedure or "SP".
Knowing that "SP" has benefits over two other options, my question is do you make lots of SP in your own systems?
Or What solution do you have instead of creating SP?
Thank you
Aside of creating Stored Procedures you can use Object Relational Mapping
Such as:
linq to sql
Nhibernate
Entity Framework
Data Access :SP's vs ORMs
Choose the best way that suits you.
In all production system I used SPs and pure ADO.NET Core to access the data. Systems range from having 100-300 tables and about 500-1000 stored procedures.
Most of the Data Access code is generated using a tool. I've posted the source code and sample application on my blog if you're interested in using/modifying it. The tool can generate over 100,000 lines of code in about 20-25 seconds going against a database with about 750 stored procedures.
Data Access Layer - Code Gen
Of course if you're no familiar with Databases, data modeling/design and stored procedures you're probably better off using Linq to SQL or EF4 (Entity Framework version 4) or similar. If you need brute force performance then ADO.NET core along with Stored procedures is the way to go.
Re: your first question
When you go down the path of stored procedures, the number of stored procedures begins to grow continually for the life of the project. Outside of the basic CRUD operations, each stored procedure tends to be tightly bound to a particular problem and not very re-usable. A rule of thumb is that I can expect 8-12 stored procedures for each data table (excluding reference or code tables, such as the list of states or countries).
The very large number of procs makes naming conventions very important so that you can find anything without constantly visually re-scanning the whole list of 400-500 procs.
Re: your second question
There are a lot of ugly things that happen with sql written inside of strings inside of C# or VB.NET -- it's error prone, ugly, etc.
Linq, nHybernate and many others exist, but the "concept count" (the number of things you need to learn to start being productive), is much higher than learning how to write a good stored procedure executer in C#.
I try to make sure that stored procedures are only created for database functionality - not business logic.
It's Database Functionality when I have some database architecture that's a bit obscure and I want to hide that from callers.
It's Business Logic when it is simply the way in which my application adds or updates or how much validation they do, etc., etc.
This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
What are the pros and cons to keeping SQL in Stored Procs versus Code
What would be appropriate scenario when stored procedures should be used?
I stumbled upon implementation where almost whole data manipulation was handled by store procedures, even simplest form of INSERT/DELETE statements were wrapped and used via SP's.
So, what's the rationale for using stored procedures in general?
Sorry for such a beginners question..
Outwith the reasons #Tom has already pointed out which are probably the most important (Speed/Security) I would also say that another good reason to use Stored Procedures is code re-use. If you find yourself writing the same SQL all over the place its usually a sign that it should be a stored procedure. Also, another good reason is it allows not only developers, but DBA's the ability to change/add new procedures if required.
Two reasons I know of:
Security
The stored procedures are secure from attacks such as SQL injection attacks
Speed
Stored procedures are sometimes precompiled which makes execution faster.
for me it is the same question as whether or not to create a function/method while programming.
For example if the functionality is needed to be repeated in many places, or is going to be called more than once then it should be in a function.
It allows you to keep data access near the data. I have worked in systems where all data access was stored procs with server side wrapper functions. It was pretty clean ( but not as 'cool' as an ORM )
When other systems need to access your data and you need to provide an API at the database - Procs would be a way to allow you control over what/how they access it.
I am answering from an enterprise perspective.
two types of designs,
Put all/most of business logic on DB server
Put all/most of Business logic on application server.
In #1, you use Stored procedure to implement your application logic instead of the programming language.
In #2, you use the programming language to implement the logic, and it is easier to debug and allow code reuse and all other features that any programming language provides.
If you are a DB guru (and your app is mid to small), choose the first approach, otherwise choose second approach.
BTW, you will find .NET app uses the first approach, and Java app followed the second.
Given that database is generally the least scalable component (of a web application), are there any situations where one would put logic in procedures/triggers over keeping it in his favorite programming language (ruby...) or her favorite web framework (...rails!).
Server-side logic is often much faster, even with procedural approach.
You can fine-tune your grant options and hide the data you don't want to show
All queries in one places are more convenient than if they were scattered all around the code.
And here's a (very subjective) article in my blog on the reason I prefer stored procedures:
Schema Junk
BTW, triggers (as opposed to functions / stored procedures / packages) I generally dislike.
They are completely other story.
You're keeping the processing in the database, along with the data.
If you process on the server side, then you have to transfer the data out to a server process across the network, process it, and (optionally) send it back. You have the network bandwidth/latency issues, plus memory overheads.
To clarify - if I have 10m rows of data, my two extreme scenarios are to a) pull those 10m rows across the network and process on the server side, or b) process in place in the database using the server and language (SQL) optimised for this purpose. Note that this is a generalisation and not a hard-and-fast rule, but it's the one I follow for most scenarios.
When many heterogeneous applications and various other systems need to access your single database and be sure through their operations data stays consistent without integrity conflicts. So you put your logic into triggers and stored procedures that will offer an interface to external clients.
Maybe not for most web-based systems, but certainly for enterprise databases. Stored procedures and the like allow you much greater control over security and performance, as well as offering a bit of encapsulation for the database itself. You can change the schema all you want as long as the stored procedure interface remains the same.
In (almost) every situation you would keep the processing that is part of the database in the database. Application code cannot substitute for triggers, you won't get very far before you have updated the database and failed to fire the application's equivalent of the triggers (the first time you use the DBMS's management console, for instance).
Let the database do the database work and let the application to the application's work. If you have a specific performance problem with the database, and that performance problem can be addressed by moving processing from the database, in that case you might want to consider doing so.
But worrying about database performance without a database performance problem existing (which is what you seem to be doing here) is both silly and, sadly, apparently a pre-occupation of many Stackoverlow posters.
Least scalable? SQL???
Look up, "federating."
If the database is shared, having logic in the database is better in order to control everything that happens. If it's not it might just make the system overly complicated.
If you have multiple applications that talk to your database, stored procedures and triggers can enforce correctness more pervasively. Accordingly, if correctness is more important than convenience, putting logic in the database is sensible.
Scalability may be a red herring, though. Sometimes it's easier to express the behavior you want in the domain layer of an OO language, but it can be actually more expensive than doing the idiomatic SQL way.
The security mechanism at a previous company was first built in the service layer, then pushed to the db side. The motivation was actually due to some limitations in a data access framework we were using. The solution turned out to be a bit buggy because our security model was complicated, but the upside was that bugs only had to be fixed in the database; we didn't have to worry about different clients following different rules.
Triggers mean 3rd-party apps can modify the database without creating logical inconsistencies.
If you do that, you are tying your business logic to your model. If you code all your business logic in T-SQL, you aren't going to have a lot of fun if later you need to use Oracle or what have you as your database server. Actually, I'm not sure I understand this question exactly. How do you think this would improve scalability? It really shouldn't.
Personally, I'm really not a fan of triggers, particularly in a database dedicated to a single application. I hate trying to track down why some data is inconsistent, to find it's down to a poorly written trigger (and they can be tricky to get exactly correct).
Security is another advantage of using stored procs. You do not have to set the security at the table level if you don't use dynamic code (Including ithe stored proc). This means your users cannot do anything unless they have a proc to to it. This is one way of reducing the possibility of fraud.
Further procs are easier to performance tune than most application code and even better, when one needs to change, that is all you have to put on production, not recomplie the whole application.
Data integrity must be maintained at the database level. That means constraints, defaults values, foreign keys, possibly triggers (if you have very complex rules or ones involving multiple tables). If you do not do this at the database level, you will eventually have integrity issues. Peolpe will write a quick fix for a problem and run the code in the query window and the required rules are missed creating a larger problem. A millino new records will have to be imported through an ETL program that doesn't access the application because going through the application code would take too long running one record at a time.
If you think you are building an application where scalibility will be an issue, you need to hire a database professional and follow his or her suggestions for design based on performance. Databases can scale to terrabytes of data but only if they are originally designed by someone is a specialist in this kind of thing. When you wait until the while application is runnning slower than dirt and you havea new large client coming on board, it is too late. Database design must consider performance from the beginning as it is very hard to redesign when you already have millions of records.
A good way to reduce scalability of your data tier is to interact with it on a procedural basis. (Fetch row..process... update a row, repeat)
This can be done within a stored procedure by use of cursors or within an application (fetch a row, process, update a row) .. The result (poor performance) is the same.
When people say they want to do processing in their application it sometimes implies a procedural interaction.
Sometimes its necessary to treat data procedurally however from my experience developers with limited database experience will tend to design systems in a way that do not leverage the strenght of the platform because they are not comfortable thinking in terms of set based solutions. This can lead to severe performance issues.
For example to add 1 to a count field of all rows in a table the following is all thats needed:
UPDATE table SET cnt = cnt + 1
The procedural treatment of the same is likely to be orders of magnitude slower in execution and developers can easily overlook concurrency issues that make their process inconsistant. For example this kind of code is inconsistant given the avaliable read isolation levels of many RDMBS platforms.
SELECT id,cnt FROM table
...
foreach row
...
UPDATE table SET cnt = row.cnt+1 WHERE id=row.id
...
I think just in terms of abstraction and ease of servicing a running environment utilizing stored procedures can be a useful tool.
Procedure plan cache and reduced number of network round trips in high latency environments can also have significant performance advantages.
It is also true that trying to be too clever or work very complex problems in the RDBMS's half-baked procedural language can easily become a recipe for disaster.
"Given that database is generally the least scalable component (of a web application), are there any situations where one would put logic in procedures/triggers over keeping it in his favorite programming language (ruby...) or her favorite web framework (...rails!)."
What makes you think that "scalability" is the only relevant concern in a system design ? I agree with rexem where he commented that it is very obvious that you are "not" biased ...
Databases are sets of assertions of fact. Those sets become more valuable if they can also be guaranteed to conform to certain integrity rules. Those guarantees are not worth a dime if it is the applications that are expected to enforce such integrity. Triggers and sprocs are the only way SQL systems have to allow such guarantees to be offered by the DBMS itself.
That aspect outweighs "scalability" anytime, anywhere, anyhow.
If I use stored procedures, can I use an ORM?
EDIT:
If I can use a ORM, doesn't that defeat part of the database agnosticity reason for using an ORM? In other words, why else would I want to use an ORM, if I am binding myself to a particular database with stored procedures (or is that assumption wrong)?
Using ORM to access stored procedures is one of the best uses of ORM. It'll give you strongly typed objects, while you still have full control over the SQL.
In my experience I would let the ORM handle the 'CRUD' operations, and leave the specialty work to the stored procedures. Generally, using a stored procedure for 'CRUD' operations is overkill, and to let the ORM handle it, could drastically improve your productivity.
Yes, you can, all main ORMs support stored procedures.
As for your assumption, you are particulary right, when you use stored procedures with ORM you are coupling your project to a particular database. But in practice it is 99% that you will not need to change your database provider, so in this case you use ORM not to abstract from concrete DB provider, but to help yourself with object-relational mapping task - which is a main ORM's task and which ORM was originally made for.
It raises an interesting point.
Once you have ORM, and relatively simple queries, why do you need stored procedures? SP's are intimately bound to the database. ORM frees you from having to maintain a lot of DB-specific code. What is DB-specific can be isolated and managed.
I suggest that an ORM is a golden chance to cut the complexity and put all the processing in the code where it belongs.
Use the database for what it does best -- store data.
Use your application for what it does best -- process data.
You can use both ORM features and stored procedures functionality at once. Particularly use ORM until it fits you, but if you have some trouble with performance or need some low level tune - include stored procedures in your business-logic.
Yes you can but you will want to spend some time investigating what capabilities the ORM provides around stored procedures.
Most will allow you to run a stored procedure that returns a strongly typed object / entity. More advanced ORM's will allow you to plug stored procedures in for performing CRUD actions as well (so your generic querying, deleting etc goes via a stored procedure rather than a dynamic query).
Generally ORM's are great for generating ad-hoc queries and getting strongly typed entities but having strong stored procedure support has the benefit of allowing you to (sometimes) more easily access native capability of your RDMS that may not be exposed as first class citizens in the ORM - especially if the ORM supports many database engines.
Following up from your edit:
Often you will want to use the ad-hoc querying engine provided by the ORM however as I alluded to earlier - sometimes you want to query using a capability not exposed from the ORM.
The benefits of strongly typed entities is invaluable as it means you have domain object usually, rather than data readers, data tables etc. You can cleanly encapsulate behaviors and logic within those entities that you have retrieved.
The list of additional benefits is very long indeed - for example, with the LightSpeed ORM (and most others) your entities will support standard binding interfaces, error reporting interfaces, validation etc. On the querying side you will lose out on lazy loading etc unless you write it yourself.
Database "agnosticity" (?) is not the only reason to use an ORM. However, you could take advantage of being DB independent on 99% of your interactions with the DB and in 1% (or 2% or 10% or whatever) you might need stored procedures for speed/clarity/complexity. If you changed DBs, you would need to rewrite those.
I use netTiers a lot at work and we let it generate our stored procedures for us. These only handle the basic CRUD operations, but they are very fast and save me a TON of time. netTiers will also let us create custom stored procedures and generate our data access code with these procedures.
You can, but many of the more advanced ORM features tend to become more cumbersome to use. Something like iBatis is very easy to integrate with stored procedures, while the more sophisticated features of more complex engines like (N?)Hibernate like generation of dynamic SQL and lazy loading of large fields can become more of a hassle than they're worth.
I believe that any tool that frees you from redoing work and concentrate in solving the problems is valid. ORMs appear to be that tool when it come to basic CRUD operations - even if using SPs to better implement a requirement (like using a hammer on a nail, it's just the right tool for task).
The point is: there's no black or white, just a scale of gray. Very inneficient and badly coded applications use the excuse of being 'DB agnostic' to explain the exagerated use of DB resources. In many cases, being very tied to a database is not good too. The objective is: getting maximum 'DB agnosticism' while not wasting customer IT resources without need.
There's no 'old vs new', just people saying that extreme 'pure' approaches are better. I don't really believe so. I believe that, as with any tool, the 'best' (notice the quotes) approach is using ORM until still is the right tool to make your data access. And use SPs inside your ORM when you reach a point where you're wasting resources and reducing scalability and 'worth life' (I forgot the english expression equivalent for the portuguese 'vida Ăștil') of TI resources. Or, in other words, use SP when it's for the processing at hand what a hammer is for the nail.