What is the difference between Stored Procedures and Prepared Statements... And which one is better and why...!! I was trying to google it but haven't got any better article...
Stored procedures are a sequence of instructions in PL/SQL language. Is a programming language implemented by some DBMS, that lets you store sequences of queries frequently applied to your model, and share the processing load with the application layer.
Prepared statements are queries written with placeholders instead of actual values. You write the query and it is compiled just once by the DBMS, and then you just pass values to place into the placeholders. The advantage of using prepared statements is that you enhance the performance considerably, and protect your applications from SQL Injection.
The difference is you cant store prepared statements. You must "prepare" them every time you need to execute one. Stored procedures, on the other hand, can be stored, associated to a schema, but you need to know PL/SQL to write them.
You must check if your DBMS supports them.
Both are very usefull tools, you might want to combine.
Hope this short explanation to be useful to you!
The other answers have hinted at this, but I'd like to list the Pros and Cons explicitly:
Stored Procedures
PROS:
Each query is processed more rapidly than a straight query, because the server pre-compiles them.
Each query need only be written once. It can be executed as many times as needed, even across different sessions and different connections.
Allows queries to include programming constructs (such as loops, conditional statements, and error handling) that are either impossible or difficult to write in SQL alone.
CONS
Require knowledge of whatever programming language the database server uses.
Can sometimes require special permissions to write them or call them.
Prepared Statements
PROS
Like stored routines, are quick because queries are pre-compiled.
CONS
Need to be re-compiled with each connection or session.
To be worth the overhead, each prepared statement must be executed more than once (such as in a loop). If a query is executed only once, more overhead goes into preparation of the prepared statement than you get back since the server needs to compile the SQL anyway, but also make the prepared statement.
For my money, I'd go with Stored Procedures every time since they only need to be written and compiled once. After that, every call to the procedure leads to saved time, whether you're on a new connection or not, and whether you're calling the procedure in a loop or not. The only downside is needing to spend some time learning the programming language. If I didn't have permissions to write stored procedures, I would use a prepared statement, but only if I had to repeatedly make the same query multiple times in the same session.
This is the conclusion I've come to after several months of off-and-on research into the differences between these two constructs. If anyone is able to correct bad generalizations I'm making, it will be worth any loss to reputation.
A stored Procedure is stored in the DB - depending on which DB (Oracle, MS SQL Server etc.) it is compiled and potentially prepared optimized when you create it on the server...
A prepared statement is a statement which is parsed by the server and an execution plan is created by the server ready for execution whenever you run the statement... usually it makes sense when a statement is run more than once... depending on the DB server (Oracle etc.) and even sometimes configuration options these "preparation" are either session-specific or "global"...
There is no "better" when you compare these two since they have their specific use cases...
Related
What's the purpose of stored procedures compared to the use of an ORM (nHibernate, EF, etc) to handle some CRUD operations? To call the stored procedure, we're just passing a few parameters and with an ORM we send the entire SQL query, but is it just a matter of performance and security or are there more advantages?
I'm asking this because I've never used stored procedures (I just write all SQL statements with an ORM and execute them), and a customer told me that I'll have to work with stored procedures in my next project, I'm trying to figure out when to use them.
Stored Procedures are often written in a dialect of SQL (T-SQL for SQL Server, PL-SQL Oracle, and so on). That's because they add extra capabilities to SQL to make it more powerful.
On the other hand, you have a ORM, let say NH that generates SQL.
the SQL statements generated by the ORM doesn't have the same speed or power of writing T-SQL Stored Procedures.
Here is where the dilemma enters: Do I need super fast application tied to a SQL Database vendor, hard to maintain or Do I need to be flexible because I need to target to multiple databases and I prefer cutting development time by writing HQL queries than SQL ones?
Stored Procedure are faster than SQL statements because they are pre-compiled in the Database Engine, with execution plans cached. You can't do that in NH, but you have other alternatives, like using Cache Level 1 or 2.
Also, try to do bulk operations with NH. Stored Procedures works very well in those cases. You need to consider that SP talks to the database in a deeper level.
The choice may not be that obvious because all depends of the scenario you are working on.
The main (and I'm tempted to say "only") reason you should use stored procedures is if you really need the performance.
It might seem tempting to just create "functions" in the database that do complex stuff quickly. But it can quickly spiral out of control.
I've worked with applications that encapsulate so much business logic in SQL, that it becomes virtually impossible to refactor anything. Literally hundreds of stored procedures that are black boxes for devs working with the ORM.
Such applications become brittle, hard to debug and hard to understand. By allowing business logic to live in stored procedures you are allowing SQL developers to make design choices that they shouldn't be making, in a tool that is much harder to work in, log and debug than an ORM. I've seen stored procedures that handle payment processing. Truly core stuff. Stuff that becomes so central to an application that nobody dares to touch it, all because some guy with good SQL skills made a script 5 years to fix something quickly, it was never migrated to the ORM and eventually grew into an unmanageable monster, full of tangled logic nobody understands. Devs end up having to blindly trust whatever it's doing. And what's worse, it's almost always outside test coverage, so you may break everything when you deploy, even if your tests pass with mocked data, but some ancient stored procedure suddenly starts acting up.
Abusing stored procedures is one of the worst forms of technical debt you can accumulate. The database, which is the persistence layer, should not be used for business logic. You should keep that distinction as strict as you can.
Of course, there will be cases where an ORM will have horrible performance or simply won't support a feature you need from SQL. If doing things in raw SQL is truly inevitable, only then should you consider stored procedures.
I've seen Stored Procedure Hell. You don't want that.
There are significant performance advantages to stored procedures in some circumstances. Often the queries generated by Linq and other ORMs can be inefficient, but still good enough for your purposes. Some RBDMS (such as SQL Server) will cache the execution plans for stored procedures, saving on query time. For more complex queries that you use frequently, this savings in performance can be critical.
For most normal CRUD, though, I have found that it is usually better for maintainability just to use the ORM if it is available and if its operations serve your needs. Entity Framework works quite well for me in the .NET world most of the time (in combination with Linq), and I like Propel a lot for PHP.
I stumbled over this pretty old question but I am shocked that the most important benefit of Stored Procedures is not even mentioned.
Security and resource protection
Using SPs you are able to grand execution rights for that SP to a user. The user can execute the SP and only that SP. You do not even have to give the user read or write access to the tables used. The user does not even have to know the tables used.
Using ORM you have to give read or/and write access to the tables used and users. The user can read all data from all the tables you granted the rights and even can combine them in queries, if you want it or not, and also can run queries that creates heavy load on the Database server.
This is especially useful in cases where application development and database development is done by different teams and the database is used by more than one application.
The primary use I find for them is to implement an abstraction layer and encapsulate query logic. In the same way that I write functions in a procedural language.
As le dorfier mentions one of the the primary reasons sprocs (and/or views) should be used is to provide an abstraction layer between a database and its clients (web apps, reports, ETLs etc)
This 'DB API' can make it easier to change/refactor your database without necessarily affecting clients.
See - Why use stored procs - for a more in depth discussion
I mainly stick to linq to sql as an ORM and i think its great, but there is still a place for stored procedures. Mainly i use the when the query i want to run is very complex, with many joins (especially outer joins, which suck in Linq), lots of aggregation in subqueries perhaps, recursive CTE's, and other similar scenarios.
For general crud though, there is no need.
hi all
We do know that CommandType property of a SqlCommand object has 3 options: TableDirect, Text and StoredProcedure or "SP".
Knowing that "SP" has benefits over two other options, my question is do you make lots of SP in your own systems?
Or What solution do you have instead of creating SP?
Thank you
Aside of creating Stored Procedures you can use Object Relational Mapping
Such as:
linq to sql
Nhibernate
Entity Framework
Data Access :SP's vs ORMs
Choose the best way that suits you.
In all production system I used SPs and pure ADO.NET Core to access the data. Systems range from having 100-300 tables and about 500-1000 stored procedures.
Most of the Data Access code is generated using a tool. I've posted the source code and sample application on my blog if you're interested in using/modifying it. The tool can generate over 100,000 lines of code in about 20-25 seconds going against a database with about 750 stored procedures.
Data Access Layer - Code Gen
Of course if you're no familiar with Databases, data modeling/design and stored procedures you're probably better off using Linq to SQL or EF4 (Entity Framework version 4) or similar. If you need brute force performance then ADO.NET core along with Stored procedures is the way to go.
Re: your first question
When you go down the path of stored procedures, the number of stored procedures begins to grow continually for the life of the project. Outside of the basic CRUD operations, each stored procedure tends to be tightly bound to a particular problem and not very re-usable. A rule of thumb is that I can expect 8-12 stored procedures for each data table (excluding reference or code tables, such as the list of states or countries).
The very large number of procs makes naming conventions very important so that you can find anything without constantly visually re-scanning the whole list of 400-500 procs.
Re: your second question
There are a lot of ugly things that happen with sql written inside of strings inside of C# or VB.NET -- it's error prone, ugly, etc.
Linq, nHybernate and many others exist, but the "concept count" (the number of things you need to learn to start being productive), is much higher than learning how to write a good stored procedure executer in C#.
I try to make sure that stored procedures are only created for database functionality - not business logic.
It's Database Functionality when I have some database architecture that's a bit obscure and I want to hide that from callers.
It's Business Logic when it is simply the way in which my application adds or updates or how much validation they do, etc., etc.
I'm developing a Java EE application (JSF + Richfaces + Oracle 10g), and i wanted to use JPA.
But in the end, i didn't see any advantages of using it, because it's going to complexify my code.
I found that calling my procedures (stored procedures in my orale DB) is better than using JPA (because i can, for example, change some lines in those procedures without the need to re'compile my "WAR" project every time i have some error + i can use PL/SQL which helps me a lot)
So, i wanted to ask you people, when to use JPA ?
Isn't making your own queries (you can choose the right ordering for you selects, the columns that you want to select, and not all the columns: because of ORM and the fact that your entities attributes are mapped to the columns of your table, and that oblige you to select all the attributes present in your entity,....)
Is it my method that i used (stored function)
But in the end, i didn't see any advantages of using it, because it's going to complexify my code.
This might be subjective but, personally, I find JDBC typically more verbose, harder to maintain and thus somehow more complex. With an ORM like JPA, you don't have to write all the CRUD queries, you don't have to handle the mapping of query results to objects, you don't have to handle the low level stuff yourself, etc.
I found that calling my procedures (stored procedures in my orale DB) is better than using JPA (because i can, for example, change some lines in those procedures without the need to re'compile my "WAR" project every time i have some error + I can use PL/SQL which helps me a lot)
This is totally dependent on your definition of "better". In your case, you might prefer SP because of your development workflow (I don't run my code in container to setup the persistence part) and because you are comfortable with PL/SQL. But again, personally, I don't find SP "better":
I don't really enjoy hand writing SQL queries for everything
I find that SP are harder to test, debug, maintain
I find that SP are bad for code reuse
I find that SP lock you in (I consider this as a disadvantage)
I tend to find systems build around SP harder to scale
Read also Who Needs Stored Procedures, Anyways? (amongst other resource) for more opinions.
So, I wanted to ask you people, when to use JPA ?
When you want to speed-up development, to focus on implementing business code instead of plumbing. And of course, when appropriate.
Isn't making your own queries (...)
Did you identify a particular performance problem? Or are you just assuming there will be a problem. To my experience, retrieving more columns than required is most of time not an issue. And if it becomes an issue, there are solutions.
You can define what fields are retrieved in a query, using JPA.
Why is the complexity of your code going up with use of JPA ? You state no example. Before you'd have to do messy JDBC, and now you just call "em.persist(obj)" ... is that really more complex.
You ought to be thinking in terms of what relations there are between objects (if any) and that be the determiner on whether you need an ORM
Given that database is generally the least scalable component (of a web application), are there any situations where one would put logic in procedures/triggers over keeping it in his favorite programming language (ruby...) or her favorite web framework (...rails!).
Server-side logic is often much faster, even with procedural approach.
You can fine-tune your grant options and hide the data you don't want to show
All queries in one places are more convenient than if they were scattered all around the code.
And here's a (very subjective) article in my blog on the reason I prefer stored procedures:
Schema Junk
BTW, triggers (as opposed to functions / stored procedures / packages) I generally dislike.
They are completely other story.
You're keeping the processing in the database, along with the data.
If you process on the server side, then you have to transfer the data out to a server process across the network, process it, and (optionally) send it back. You have the network bandwidth/latency issues, plus memory overheads.
To clarify - if I have 10m rows of data, my two extreme scenarios are to a) pull those 10m rows across the network and process on the server side, or b) process in place in the database using the server and language (SQL) optimised for this purpose. Note that this is a generalisation and not a hard-and-fast rule, but it's the one I follow for most scenarios.
When many heterogeneous applications and various other systems need to access your single database and be sure through their operations data stays consistent without integrity conflicts. So you put your logic into triggers and stored procedures that will offer an interface to external clients.
Maybe not for most web-based systems, but certainly for enterprise databases. Stored procedures and the like allow you much greater control over security and performance, as well as offering a bit of encapsulation for the database itself. You can change the schema all you want as long as the stored procedure interface remains the same.
In (almost) every situation you would keep the processing that is part of the database in the database. Application code cannot substitute for triggers, you won't get very far before you have updated the database and failed to fire the application's equivalent of the triggers (the first time you use the DBMS's management console, for instance).
Let the database do the database work and let the application to the application's work. If you have a specific performance problem with the database, and that performance problem can be addressed by moving processing from the database, in that case you might want to consider doing so.
But worrying about database performance without a database performance problem existing (which is what you seem to be doing here) is both silly and, sadly, apparently a pre-occupation of many Stackoverlow posters.
Least scalable? SQL???
Look up, "federating."
If the database is shared, having logic in the database is better in order to control everything that happens. If it's not it might just make the system overly complicated.
If you have multiple applications that talk to your database, stored procedures and triggers can enforce correctness more pervasively. Accordingly, if correctness is more important than convenience, putting logic in the database is sensible.
Scalability may be a red herring, though. Sometimes it's easier to express the behavior you want in the domain layer of an OO language, but it can be actually more expensive than doing the idiomatic SQL way.
The security mechanism at a previous company was first built in the service layer, then pushed to the db side. The motivation was actually due to some limitations in a data access framework we were using. The solution turned out to be a bit buggy because our security model was complicated, but the upside was that bugs only had to be fixed in the database; we didn't have to worry about different clients following different rules.
Triggers mean 3rd-party apps can modify the database without creating logical inconsistencies.
If you do that, you are tying your business logic to your model. If you code all your business logic in T-SQL, you aren't going to have a lot of fun if later you need to use Oracle or what have you as your database server. Actually, I'm not sure I understand this question exactly. How do you think this would improve scalability? It really shouldn't.
Personally, I'm really not a fan of triggers, particularly in a database dedicated to a single application. I hate trying to track down why some data is inconsistent, to find it's down to a poorly written trigger (and they can be tricky to get exactly correct).
Security is another advantage of using stored procs. You do not have to set the security at the table level if you don't use dynamic code (Including ithe stored proc). This means your users cannot do anything unless they have a proc to to it. This is one way of reducing the possibility of fraud.
Further procs are easier to performance tune than most application code and even better, when one needs to change, that is all you have to put on production, not recomplie the whole application.
Data integrity must be maintained at the database level. That means constraints, defaults values, foreign keys, possibly triggers (if you have very complex rules or ones involving multiple tables). If you do not do this at the database level, you will eventually have integrity issues. Peolpe will write a quick fix for a problem and run the code in the query window and the required rules are missed creating a larger problem. A millino new records will have to be imported through an ETL program that doesn't access the application because going through the application code would take too long running one record at a time.
If you think you are building an application where scalibility will be an issue, you need to hire a database professional and follow his or her suggestions for design based on performance. Databases can scale to terrabytes of data but only if they are originally designed by someone is a specialist in this kind of thing. When you wait until the while application is runnning slower than dirt and you havea new large client coming on board, it is too late. Database design must consider performance from the beginning as it is very hard to redesign when you already have millions of records.
A good way to reduce scalability of your data tier is to interact with it on a procedural basis. (Fetch row..process... update a row, repeat)
This can be done within a stored procedure by use of cursors or within an application (fetch a row, process, update a row) .. The result (poor performance) is the same.
When people say they want to do processing in their application it sometimes implies a procedural interaction.
Sometimes its necessary to treat data procedurally however from my experience developers with limited database experience will tend to design systems in a way that do not leverage the strenght of the platform because they are not comfortable thinking in terms of set based solutions. This can lead to severe performance issues.
For example to add 1 to a count field of all rows in a table the following is all thats needed:
UPDATE table SET cnt = cnt + 1
The procedural treatment of the same is likely to be orders of magnitude slower in execution and developers can easily overlook concurrency issues that make their process inconsistant. For example this kind of code is inconsistant given the avaliable read isolation levels of many RDMBS platforms.
SELECT id,cnt FROM table
...
foreach row
...
UPDATE table SET cnt = row.cnt+1 WHERE id=row.id
...
I think just in terms of abstraction and ease of servicing a running environment utilizing stored procedures can be a useful tool.
Procedure plan cache and reduced number of network round trips in high latency environments can also have significant performance advantages.
It is also true that trying to be too clever or work very complex problems in the RDBMS's half-baked procedural language can easily become a recipe for disaster.
"Given that database is generally the least scalable component (of a web application), are there any situations where one would put logic in procedures/triggers over keeping it in his favorite programming language (ruby...) or her favorite web framework (...rails!)."
What makes you think that "scalability" is the only relevant concern in a system design ? I agree with rexem where he commented that it is very obvious that you are "not" biased ...
Databases are sets of assertions of fact. Those sets become more valuable if they can also be guaranteed to conform to certain integrity rules. Those guarantees are not worth a dime if it is the applications that are expected to enforce such integrity. Triggers and sprocs are the only way SQL systems have to allow such guarantees to be offered by the DBMS itself.
That aspect outweighs "scalability" anytime, anywhere, anyhow.
What is the preferred way to use stored procedures between the following two methods and why:
One general SP such as 'GetOrders' which returns all the columns for the table Order. Several different parts of the application will use the same SP.
OR
Several more specific SPs such as 'GetOrdersForUse1' and 'GetOrdersForUse2' which return a subset of all the columns. Each SP is only used by one part of the application.
In the general case, the application will only use a subset of the columns returned by the SP. I was thinking of using the specific method for performance reasons but is it really going to be worth the extra work? I am developing a web site using ASP.NET and SQL 2005.
Like all great things it depends. How different is the logic in your variations. If for example the only difference is the return columns, then all your saving is some bandwidth over the network and some memory both of which are a lot cheaper then the time its going to take to create the variations test them and maintain them.
Now if there is very significant different selection logic going (joining different tables etc), then you might be better off having specialized SP's.
One last thing don't prematurely optimize. Build it simple and working first, then when you discover you need that extra millisecond then you can look at tweaking.
I would go for your second option simply because you should NOT be extracting data from the database that you don't need (either rows or columns) - it puts unnecessary strain of the DBMS and transmits useless data across the wire, wasting network bandwidth.
Think of them as functions, if you would write a separate function then probably use separate stored procedures. If there is any doubt remaining, use separate stored procedures because:
it will save bandwith
it will save memory
I find that separate stored procedures are easier to maintain than one giant one.