GOCHESTER.COM
Search:
 
 
Membership
 
  Member Login
Become a Member
Reset Password
 
     

 
Site Links
 
  Index
News Articles
View Articles By Category
 
     

 
The Curse and Blessings of Dynamic SQL
Category: Database related articles
By: Chester Zhang - July 14th, 2008
This article has been read: 126 times.




The Curse and Blessings of Dynamic SQL


An SQL text by Erland Sommarskog,
SQL Server MVP.


An earlier version of this article is
also available in
German,
Spanish and
Vietnamese. Translations
provided by SQL Server MVP Frank Kalis, Simon Hayes and
Tam Vu respectively.


If you follow the various newsgroups on Microsoft SQL Server,
you often see people asking why they can't do:


SELECT * FROM @tablename
SELECT @colname FROM tbl
SELECT * FROM tbl WHERE x IN (@list)

For all three examples you can expect someone to answer Use dynamic SQL
and give a quick example on how to do it. Unfortunately, for all three examples
above, dynamic SQL is a poor solution.
On the other hand, there are situations where dynamic SQL
is the best or only way to go.


In this article I will discuss the use of dynamic SQL
in stored procedures and to a minor extent from client languages. To set the
scene, I start with a very quick overview on application
architecture for data access. I then proceed to describe the feature dynamic
SQL as such,
with a quick introduction followed by the gory syntax details. Next, I continue with a discussion on SQL injection, a
security issue that you absolutely must have good understanding of when
you work with dynamic SQL. This is followed by a section where I discuss why
we use stored procedures, and how that is affected by the use of dynamic SQL.
I carryon with a section on good practices and tips for writing
dynamic SQL. I conclude by reviewing a number of
situations where you could use dynamic SQL and
whether it is a good or bad idea to do it.


The article covers all versions of SQL Server from SQL 6.5 to
SQL 2005, with emphasis on SQL 2000 and SQL 2005.



Contents:



   Accessing a Data from an Application

   Introducing Dynamic SQL


      A First Encounter

      sp_executesql

      EXEC()


   SQL Injection – a Serious Security Issue

   Dynamic SQL and Stored Procedures


      The Permission System

      Caching Query Plans

      Reducing Network Traffic

      Encapsulating Logic

      Keeping Track of what Is Used

      Easiness of Writing SQL Code

      Addressing Bugs and Problems


   Good Coding Practices and Tips for Dynamic SQL


      Use Debug Prints!

      Nested Strings

      Spacing and Formatting

      Dealing with Dynamic Table and Column Names

      Quotename, Nested Strings and Quotestring

      QUOTED_IDENTIFIER

      sp_executesql and Long SQL Strings in SQL 2000

      Dynamic SQL in User-Defined Functions

      Cursors and Dynamic SQL


   EXEC() at Linked Server

   Common Cases when to (Not) Use Dynamic SQL


      SELECT * FROM @tablename

      SELECT * FROM sales + @yymm

      UPDATE tbl SET @colname = @value WHERE keycol = @keyval

      SELECT col AS @myname

      SELECT * FROM @dbname + '..tbl'

      SELECT * FROM tbl WHERE col IN (@list)

      SELECT * FROM tbl WHERE @condition

      Dynamic Search Conditions

      SELECT * FROM tbl ORDER BY @col

      SELECT TOP @n FROM tbl

      CREATE TABLE @tbl

      CREATE TABLE with Unknown Columns

      Linked Servers

      OPENQUERY

      Dynamic Column Widths

      Dynamic SQL and Maintenance Tasks


   Acknowledgements and Feedback

   Revision History

Note: many of
the code samples in this text works against the pubs and Northwind databases
that ship with SQL 2000 and SQL 7, but not with SQL 2005. You can download
these databases from

Microsoft's web site
.




Accessing a Data from an Application


Before I describe dynamic SQL, I like to briefly discuss the various ways you can
access data from an application to give an overview of what I'll be
talking about in this article.


(Note: all through this text I will
refer to client as anything that accesses SQL Server from the outside.
In the overall application architecture that may in fact be a middle tier or
a business layer, but as that is of little interest to this article, I use
client
in the sake of brevity.)


There are two main roads to go, and then there are forks and sub-forks.



  1. Send SQL statements from the client to SQL
    Server.

    1. Rely on SQL generated by the client API, using options like
      CommandType.TableDirect and methods like .Update.

    2. Compose the SQL strings in the client code.

      1. Build the entire SQL string with parameter values expanded.

      2. Use parameterised queries.



  2. Perform access through stored procedures.


    1. Stored procedures in T-SQL


      1. Use static SQL only.

      2. Use dynamic SQL together with static SQL.


    2. Stored procedures in a CLR language such as C# or VB .Net. (SQL 2005
      only.)



Fork 1-a may be good for simple tasks, but you are likely to
find that you outgrow it as the complexity of your application increases.
In any case, this approach falls entirely outside the scope of this article.


Many applications are built along the principles of fork 1-b,
and as long as you take the sub-fork 1-b-ii, it does not have to
be bad. (Why 1-b-i is bad, is
something I will come back to. Here I will just drop two keywords:
SQL
Injection
and Query-Plan Reuse.) Nonetheless, in many shops the mandate is
that you should use stored procedures. When you use stored procedures with
only static SQL, users do
not need direct permissions to access the tables, only permissions to execute the stored
procedures, and thus you can use the stored procedure to control what users
may and may not do.


The main focus for this text is sub-fork 2-a-ii. When used
appropriately, dynamic SQL in stored
procedures can be a powerful addition to static SQL. But some of the questions on the newsgroups leads to
dynamic SQL in stored procedures that is so meaningless, that these people
would be better off with fork 1-b instead.


Finally, fork 2-b, stored procedures in the CLR, is in many
regards very similar to fork 1-b, since all data access from CLR
procedures is through generated SQL strings, parameterised or unparameterised. If you have settled on SQL
procedures for your application, there is little point in rewriting them into
the CLR. However, CLR code can be a valuable supplement for tasks that are
difficult to perform in T-SQL, but you yet want to perform server-side.


Introducing Dynamic SQL


In this chapter I will first look at some quick examples of dynamic SQL and
point out some very important implications of using dynamic SQL. I will then
describe sp_executesql and EXEC() in detail, the two commands you can use to
invoke dynamic SQL from T-SQL.


A First Encounter


Understanding dynamic SQL itself is not difficult. Au contraire, it's rather
too easy to use. Understanding the fine details, though, takes a little
longer time. If you start out using dynamic SQL casually, you are bound to face
accidents when things do not work as you have anticipated.


One of the problems
listed in the introduction was how to write a stored procedure that takes a
table name as its input. Here are two examples, based on the two ways to do dynamic SQL in
Transact-SQL:


CREATE PROCEDURE general_select1 @tblname sysname,
@key varchar(10) AS
DECLARE @sql nvarchar(4000)
SELECT @sql = ' SELECT col1, col2, col3 ' +
' FROM dbo.' + quotename(@tblname) +
' WHERE keycol = @key'
EXEC sp_executesql @sql, N'@key varchar(10)', @key

CREATE PROCEDURE general_select2 @tblname nvarchar(127),
@key varchar(10) AS
EXEC('SELECT col1, col2, col3
FROM ' + @tblname + '
WHERE keycol = ''' + @key + '''')

Before I say anything else, permit me to point out that these are examples of
bad usage of dynamic SQL.
Passing a table name as a parameter
is not how you should write stored procedures, and one aim of this article is
to explain this in detail. Also, the two examples are not equivalent. While
both examples are bad, the second
example has several problems that the first does not have. What these
problems are will be apparent as you read this text.


Whereas the above looks very simple and easy, there are some very important things
to observe. The first thing is permissions. You may know that when you
use stored procedures, users do not need permissions to access the tables accessed by the stored procedure. This does not apply when
you use dynamic SQL!
For the procedures above to execute
successfully, the users must have SELECT permission on the table in @tblname. In SQL 2000 and earlier this is an absolute rule with no
way around it. SQL 2005 provides alternative ways, something I will
come
back to in the section The Permission System.


Next thing to observe is that the dynamic SQL is not part of
the stored procedure
, but constitutes its own scope. Invoking a block
of dynamic SQL is akin to call a nameless stored procedure created ad-hoc. This
has a number of consequences:



  • Within the block of dynamic SQL, you cannot access local variables
    (including table variables) or parameters of the calling stored procedure.
    But you can pass parameters – in and out – to a block of dynamic SQL if you
    use sp_executesql.

  • Any USE statement in the dynamic SQL will not affect the calling stored procedure.

  • Temp tables created in the dynamic SQL will not be accessible from the
    calling procedure since they are dropped when the dynamic SQL exits.
    (Compare to how temp tables created in a stored procedure go away when you
    exit the procedure.) The block of
    dynamic SQL can however access temp tables created
    by the calling procedure.

  • If you issue a SET command in the dynamic SQL, the effect of the SET
    command lasts for the duration of the block of dynamic SQL
    only and does not affect the caller.

  • The query plan for the stored procedure does not include the dynamic SQL.
    T
    he block of dynamic SQL has a query plan of its own.


As you've seen there are two ways to invoke dynamic SQL, sp_executesql and
EXEC(). sp_executesql was added in SQL 7, whereas EXEC() has been around
since SQL 6.0. In application code, sp_executesql should be your choice 95%
of the time for reasons that will prevail. For now I will only give two
keywords: SQL Injection and
Query-Plan Reuse. EXEC() is mainly useful for quick throw-away things and DBA tasks, but also
comes to the rescue in SQL 2000 and SQL 7
when the SQL string exceeds 4000 characters. And, obviously, in SQL 6.5, EXEC() is the sole choice. In the next
two sections we will look at these two commands in detail.



sp_executesql


sp_executesql is a built-in stored procedure that takes two
pre-defined parameters and any number of user-defined parameters.


The first parameter @stmt is mandatory, and contains a batch of one or
more SQL statements. The data type of @stmt is ntext in SQL 7 and SQL 2000,
and nvarchar(MAX) in SQL 2005. Beware that you must pass an nvarchar/ntext
value (that is, a Unicode value). A varchar value won't do.


The second parameter @params is optional, but you will use it 90% of the
time. @params declares the parameters that you refer to in @stmt. The syntax
of @params is exactly the same as for the parameter list of a stored procedure. The
parameters can
have default values and they can have the OUTPUT marker. Not all parameters you declare must actually
appear in the SQL string. (Whereas all variables that appear in the SQL
string must be declared, either with a DECLARE inside @stmt, or in
@params.) Just like @stmt, the data
type of @params in SQL 7/2000 is ntext and nvarchar(MAX) in SQL 2005.


The rest of the parameters are simply the parameters that you declared in
@params, and you pass them as you pass parameters to a stored procedure, either
positional or named. To get a value back from your output parameter, you must
specify OUTPUT with the parameter, just like when you call a stored
procedure. Note that @stmt and @params must be specified positionally. You
can provide the parameter names for them, but these names are blissfully ignored.


Let's look at an example. Say that in your database, many tables
have a column LastUpdated, which holds the time a row last was
updated. You want to be able to find out how many rows in each table that were modified at
least once during a period. This is not something you run as part of the application, but
something you run as a DBA from time to time, so you just keep it as a script
that you have a around. Here is how it could look like:


DECLARE @tbl    sysname,
@sql nvarchar(4000),
@params nvarchar(4000)
,
@count int

DECLARE tblcur CURSOR STATIC LOCAL FOR
SELECT object_name(id) FROM syscolumns WHERE name = 'LastUpdated'
ORDER BY 1
OPEN tblcur

WHILE 1 = 1
BEGIN
FETCH tblcur INTO @tbl
IF @@fetch_status <> 0
BREAK

SELECT @sql =
N' SELECT @cnt = COUNT(*) FROM dbo.' + quotename(@tbl) +
N' WHERE LastUpdated BETWEEN @fromdate AND ' +
N' coalesce(@todate, ''99991231'')'
SELECT @params = N'@fromdate datetime, ' +
N'@todate datetime = NULL, ' +
N'@cnt int OUTPUT'
EXEC sp_executesql @sql, @params, '20060101', @cnt = @count OUTPUT

PRINT @tbl + ': ' + convert(varchar(10), @count) + ' modified rows.'
END

DEALLOCATE tblcur

I've put the lines that pertain directly to the dynamic SQL in bold face. You
can see that I have declared the @sql and @params variables to be of the maximum
length for nvarchar variables in SQL 2000. In SQL 2005, you may want to make it a routine to
make @sql nvarchar(MAX), more about this just below.


When I assign the @sql variable, I am careful to format the statement so that
it is easy to read, and I leave in spaces to avoid that two concatenated
parts are glued together without space in between, which could cause a syntax
error. I put the table name in
quotename() in case a table name has any special
characters in it. I also prefix the table name with "dbo.", which is a good habit, as we will see when we look at dynamic SQL and
query plans. Overall, I will cover this sort of
good practices more in detail later in the text. Note also the appearance of
''
around the date literal – the rule in T-SQL is that to include the string
delimiter in a string, you must double it.


In this example, the dynamic SQL has three parameters: one mandatory input
parameter, one optional input parameter, and one
output parameter. I've assumed that this time the DBA wanted to see
all changes made after 2006-01-01, which is why I've left out @todate in the call
to sp_executesql. Since I left out one variable, I must specify the last,
@cnt by name – the same rules as when you call a stored procedure. Note also
that the variable is called @cnt in the dynamic SQL, but @count in the
surrounding script. Normally, you might want to use the same name, but I
wanted to stress that the @cnt in the dynamic SQL is only visible within the
dynamic SQL, whereas @count is not visible there.


You may note that I've prepend the string literals with N to denote that
they are Unicode strings. As @sql and @params are declared as nvarchar,
technically this is not necessary (as long as you stick your 8-bit character
set). However, would you provide any of the strings directly in the call to
sp_executesql, you must specify the N, as in this fairly silly example:


EXEC sp_executesql N'SELECT @x', N'@x int', @x = 2

If you remove any of the Ns, you will get an error message. Since sp_executesql is a built-in stored procedure, there is no implicit
conversion from varchar.


You may wonder why I do not pass @tbl as a parameter as well. The answer is
that you can't. Dynamic SQL is just like any other SQL. You can't specify a
table name through a variable in T-SQL, that's the whole story. Thus, when you
need to specify things like table names, column names etc dynamically,
you must interpolate them into the string.


If you are on SQL 2000 or SQL 7, there is a limitation with sp_executesql
when it comes to the length of the SQL string. While the parameter is ntext,
you cannot use this data type for local variables. Thus, you will have to
stick to nvarchar(4000). In many cases this will do fine, but it is not
uncommon to exceed that limit. In this case, you will need to use EXEC(),
described just below.


On SQL 2005, this is not an issue. Here you can use the new data type
nvarchar(MAX) which can hold as much data as ntext,
but without the many restrictions of ntext.


EXEC()



EXEC() takes one parameter which is an SQL statement to
execute. The parameter can be a concatenation of
string variables and string literals, but cannot include calls to functions
or other operators. For very simple
cases, EXEC() is less hassle than sp_executesql. For instance, say that you
want to run UPDATE STATISTICS WITH FULLSCAN on some selected tables. It could
look like this:


FETCH tblcur INTO @tbl
IF @@fetch_status <> 0 BREAK
EXEC('UPDATE STATISTICS [' + @tbl + '] WITH FULLSCAN')

In the example with sp_executesql, I used qoutename(), but here I've let it
suffice with adding brackets, in case there is a table named Order
Details
(which there is in the Northwind database). Since EXEC only permits
string literals and string variables to be concatenated and not arbitrary
expressions, this is not legal:


EXEC('UPDATE STATISTICS ' + quotename(@tbl) + ' WITH FULLSCAN')

Best practice is to always use a variable to hold the SQL statement, so the
example would better read:


FETCH tblcur INTO @tbl
IF @@fetch_status <> 0 BREAK
SELECT @sql = 'UPDATE STATISTICS ' + quotename(@tbl) + ' WITH FULLSCAN'
EXEC(@sql)

The fact that you can concatenate strings within EXEC() can permit you to
make very quick things, which can be convenient at times, but can lead to
poor habits in application code. However, there are situations where this is an
enormous blessing. As I mentioned, in SQL 7 and SQL 2000, you can in practice
only use 4000 characters in your SQL string with sp_executesql. EXEC does
not have this limitation, since you can say


EXEC(@sql1 + @sql2 + @sql3)

Where all of @sql1, @sql2 and @sql3 can be 4000 characters long – or even
8000 characters as EXEC() permits you to use varchar.


Since you cannot use parameters, you cannot as easily get values out from
EXEC() as you can with sp_executesql. You can, however, use INSERT-EXEC
to insert the result set from EXEC() into a table. I will show you an example
later on, when I also show you how you can
use EXEC() to pass longer strings than 4000 characters to sp_executesql.


In SQL 2005, EXEC() permits impersonation so that you can say:


EXEC(@sql) AS USER = 'mitchell'
EXEC(@sql) AS LOGIN = 'CORDOBA\Miguel'

This is mainly a syntactical shortcut that saves you from embedding the
invocation of dynamic SQL in EXECUTE AS and REVERT. (I discuss these
statements more in detail in my article
Granting Permissions Through Stored
Procedures
.)


SQL 2005 adds a valuable extension to EXEC(): you can use
it to execute
strings on linked servers. I will cover this form
of EXEC() in a separate section
later in this text.


SQL Injection – a Serious Security Issue


Before you start to use dynamic SQL all over town, you need to learn about
SQL injection
and how you protect your application against it. SQL
injection is a technique whereby an intruder enters data that causes your application
to execute SQL statements you did not intend it to. SQL injection is possible as soon there is dynamic SQL which is
handled carelessly, be that SQL statements sent from the client, dynamic SQL
generated in T-SQL stored procedures, or SQL batches executed from CLR stored
procedures. This is not a line of attack that is unique to
MS SQL Server, but all RDBMS are open to it.


Here is an example. The purpose of the procedure below is to permit users to
search for orders by various conditions. A real-life example of such a
procedure would have many more parameters, but I've cut it down to two to be
brief. (This is, by the way, a problem for which dynamic SQL is a very good
solution.) As the procedure is written, it is open for SQL injection:


CREATE PROCEDURE search_orders @custid   nchar(5)     = NULL,
@shipname nvarchar(40) = NULL AS
DECLARE @sql nvarchar(4000)
SELECT @sql = ' SELECT OrderID, OrderDate, CustomerID, ShipName ' +
' FROM dbo.Orders WHERE 1 = 1 '
IF @custid IS NOT NULL
SELECT @sql = @sql + ' AND CustomerID LIKE ''' + @custid + ''''
IF @shipname IS NOT NULL
SELECT @sql = @sql + ' AND ShipName LIKE ''' + @shipname + ''''
EXEC(@sql)

Before we look at a real attack, let's just discuss this from the point of view
of user-friendliness. Assume that the input for the parameters @custid and @shipname comes directly
from the user and a naïve and innocent user wants to look for orders where ShipName is Let's Stop N Shop, so he enters Let's. Do you see
what will happen? Because @shipname includes a single quote, he will get a
syntax error. So even if you think that SQL injection is no issue to you,
because you trust your users, you still need to read this section, so that they
can search for Brian O'Brien and Samuel Eto'o.


So this is the starting point. A delimiter, usually a single quote, affects your dynamic SQL, and
a malicious user
can take benefit of this. For
instance, consider this input for @shipname:


' DROP TABLE Orders --

The resulting SQL becomes:


SELECT * FROM dbo.Orders WHERE 1 = 1  AND ShipName LIKE '' DROP TABLE orders --'

This is a perfectly legal batch of T-SQL, including the text in red. Of
course, since there is something called permissions in SQL Server, this
attack may or may not succeed. A plain
user who runs a Windows application and who logs into SQL Server with his
own login, is not likely to have
permissions to drop a table. But it is not uncommon for web applications to
have a general login that runs SQL queries on behalf of the users. And if this web app logs into SQL Server with sysadmin or db_owner
privileges, the attack succeeds. Mind you, with sysadmin rights, the
attacker can add users and logins as he pleases. And if the service account
for SQL Server has admin privileges in Windows, the attacker has access into
your network far beyond SQL Server through xp_cmdshell. (Which is
disabled by default on SQL 2005, but if the attacker has achieved
sysadmin rights on the server, he can change that.)


Typically, an attacker first tests what happens
if he enters a single quote (') in an input field or a URL. If this
yields a syntax error, the attacker knows that there is a vulnerability. He
then finds out if he needs any extra tokens to terminate the query, and then
he can add his own SQL statement. Finally he adds a comment character to kill
the rest of the SQL string to avoid syntax errors. Single quote is the most
common character to reveal openings for SQL injection, but if you have
dynamic table and column names, there are more options an attacker could
succeed with.
Take this dreadful version of general_select:


CREATE PROCEDURE general_select2 @tblname nvarchar(127),
@key varchar(10) AS
EXEC('SELECT col1, col2, col3
FROM ' + @tblname + '
WHERE keycol = ''' + @key + '''')

and assume that @tblname comes from a URL. There are quite some options that
an attacker could use to take benefit of this hole.


And don't overlook numeric values: they can very well be used for SQL
injection. Of course, in a T-SQL procedure where the value is passed as an
int parameter there is no risk, but if a supposedly numeric value is directly
interpolated into an SQL string in client code, there is a huge potential for
SQL injection.


Keep in mind that user input comes from more places than just input fields on
a form. The most commonly used area for injection attacks on the Internet is
probably parameters in URLs and cookies. Thus, be very careful how you handle
anything that comes from the user.


You may think that it takes not only skill, but also luck for someone to find
and exploit a hole for SQL injection. But remember that there are too many hackers out there
with too much time on their hands. SQL injection is a serious security issue, and you
must take precautions to protect your applications against it.


Thankfully, it is not difficult at all. I've seen mentioning of various ways
to validate input data, but all that is a joke. There are three steadfast
principles you need to follow:



  • Never run with more privileges than necessary. Users that log into an
    application with their own login should normally only have EXEC
    permissions on stored procedures. If you use dynamic SQL, it should be
    confined to reading operations so that users only need SELECT permissions.
    A web site that logs into a database should not have any elevated
    privileges, preferably only EXEC and
    (maybe) SELECT permissions. Never let the web site log in as sa!

  • For web applications: never expose error messages from SQL Server to the
    end user.

  • Always used
    parameterised statements.
    That is, in a T-SQL procedure use sp_executesql,
    not EXEC().


The first point is mainly a safeguard, so that if there is a injection hole,
the intruder will not be able to do that much harm. The second point makes
the task for the attacker more difficult as he cannot get feedback from his
attempts.


But it is the third point that is the
actual protection, and that we will look a little closer at. The procedure search_orders above should be coded as:


CREATE PROCEDURE search_orders @custid   nchar(5) = NULL,
@shipname nvarchar(40) = NULL AS
DECLARE @sql nvarchar(4000)
SELECT @sql = ' SELECT OrderID, OrderDate, CustomerID, ShipName ' +
' FROM dbo.Orders WHERE 1 = 1 '
IF @custid IS NOT NULL
SELECT @sql = @sql + ' AND CustomerID LIKE @custid '
IF @shipname IS NOT NULL
SELECT @sql = @sql + ' AND ShipName LIKE @shipname '
EXEC sp_executesql @sql, N'@custid nchar(5), @shipname nvarchar(40)',
@custid, @shipname

Since the SQL string does include any user input, there is
no opening for SQL
injection. It's as simple as that. By the way, note that since we can include
parameters in the parameter list, even if they don't actually appear in the
SQL string, we don't need any complicated logic to build the parameter list,
but can keep it static. In the same vein, we can always pass all input
parameters to the SQL string.


As you may recall, you cannot pass everything as parameters to dynamic SQL,
for instance table and column names. In this case you must enclose all such
object names in quotename(), that I will return to in the section
Good Coding Practices and Tips for Dynamic SQL
.


The example above was for dynamic SQL in a T-SQL stored procedure. The same advice
applies to SQL generated in client code or in a CLR stored procedure. Since
this is so important, here is an example of coding the above in VB6 and ADO:


Set cmd = CreateObject("ADODB.Command")
Set cmd.ActiveConnection = cnn

cmd.CommandType = adCmdText
cmd.CommandText = " SELECT OrderID, OrderDate, CustomerID, ShipName " & _
" FROM dbo.Orders WHERE 1 = 1 "
If custid <> "" Then
cmd.CommandText = cmd.CommandText & " AND CustomerID LIKE ? "
cmd.Parameters.Append
cmd.CreateParameter("@custid", adWChar, adParamInput, 5, custid)
End If

If shipname <> "" Then
cmd.CommandText = cmd.CommandText & " AND ShipName LIKE ? "
cmd.Parameters.Append cmd.CreateParameter("@shipname", _
adVarWChar, adParamInput, 40, shipname)
End If

Set rs = cmd.Execute

Since the main focus of this text is dynamic SQL in T-SQL procedures, I will
explain this example only briefly. In ADO you use ? as a parameter
marker, and you can only pass parameters that
actually appear in the SQL string. (If you
specify too many parameters, you will get a completely incomprehensible error
message.) If you use the SQL Profiler to see what ADO
sends to SQL Server, you will find that it invokes – sp_executesql.


Protection against SQL injection is not the only advantage of using
parameterised queries. In the section Caching Query
Plans
, we will look more in detail on parameterised queries and at a
second very important reason to use them. This section also includes an example of composing and sending a parameterised SQL statement for SqlClient
in VB .Net.


You may think that an even better protection against SQL injection is to use
stored procedures with static SQL only. Yes, this is true,
but!
It
depends on how you call your stored procedures from the client. If you
compose an EXEC command into which you interpolate the input values, you are
back on square one and you are as open to SQL injection as ever.
In ADO, you need to call
your procedure with the command type adCmdStoredProc and use .CreateParameter to specify the parameters. By specifying adCmdStoredProc, you call the stored procedure through RPC,
Remote Procedure Call
, which not only protects you against SQL
injection, but it is also more efficient. Similar measures apply to other client APIs;
all APIs I know of supply a way to call a stored procedure through RPC.


Dynamic SQL and Stored Procedures


In the introduction, I presented various strategies for
data-access for an application, and I said that in many shops all data access
is through stored procedures. In this section, I will look a little closer at
the advantages with using stored procedures over sending SQL statements from
the client. I will also look at what happens when you use dynamic SQL in a
stored procedure, and show that you lose some of the advantages with stored
procedures, whereas other are unaffected.


The Permission System


Historically, using stored procedures has been the way to give users
access to data. In a locked-down database, users do not have permissions to
access tables directly. Instead, the application performs all
access through stored procedures that retrieve and update data in a
controlled way, so that users only get to see data they have access to, and
they cannot perform updates that violate business rules. This works as long as the
procedure and the tables have the same owner, typically dbo (the
database owner), through a mechanism known as ownership chaining.


As I have already mentioned, ownership chaining does not work when you
use dynamic SQL
. The reason for this is very simple: the block of
dynamic SQL is not a procedure and does not have any owner.
Thus the chain
is broken.


SQL 2005


In SQL 2005 this can be addressed by signing a procedure that uses dynamic
SQL with a certificate. You associate the certificate with a user, and grant
that user (which is a user that cannot log in) the rights needed for the
dynamic SQL to execute successfully. A second method in SQL 2005 is to use
the EXECUTE AS clause to impersonate a user that has been granted the
necessary permissions. This method is easier to use, but has side effects
that can have unacceptable consequences for auditing, row-level security
schemes and system monitoring. For this reason, my strong recommendation is
to use certificates.


Describing these methods more closely, would take up too much space here.
Instead I've written a separate article about them, Giving Permissions through Stored
Procedures
, where I discusses both certificates and impersonation in
detail, and I also take a closer look on ownership chaining.


If you write CLR procedures that perform data access, the same is true
for them.
Ownership chaining never applies since all data access in a CLR procedure is
through dynamic SQL. But you can use certificates or
impersonation to avoid having to give users direct permissions on the
tables.


SQL 2000 and earlier


On SQL 2000 there is no way
to combine dynamic SQL with the encapsulation of permissions that you can get
through stored procedures. Any use of dynamic
SQL requires that the users have direct permissions on the accessed tables. If your security
scheme precludes giving users permissions to access tables directly, you cannot
use dynamic SQL
. It is that plain and simple. Depending on the
sensitivity of the data in the application, it may be acceptable to give the
users SELECT permissions on the tables (or on some tables) to permit the use
of dynamic SQL. I strongly recommend against granting users INSERT, UPDATE
and DELETE rights on tables only to permit dynamic SQL
in some occasional procedure.


There are nevertheless two alternatives, application roles and "application
proxies", but they require you to change the application architecture, so it
is nothing you introduce at whim.


Application roles were introduced in SQL 7. Users log into SQL Server but have no permissions on their own beyond
the database access. Instead, the application activates the application role by
sending a password somehow embedded into it, and this application
role has permissions to read and update tables. With application roles, it
does not really matter if you use stored procedures or not. The same is true
for "application proxies" where the application authenticates the users outside SQL Server and logs into SQL
Server on their behalf with a proxy login. This proxy login impersonates the users in SQL Server, and
thus their permissions apply. However, since the users do not have any login, they cannot
log into SQL Server outside the application. In Giving Permissions...,
I discuss these two methods a little further.


For both ofr these methods, keep in mind about SQL injection, and don't grant your application role or the
login-less users anything beyond SELECT, INSERT, UPDATE and DELETE permissions on tables.
(And preferably only SELECT, and stick all updates into stored procedures
with static SQL).


Caching Query Plans


Every query you run in SQL Server requires a query plan. When you run a query
the first time, SQL Server builds a query plan for it – or as the terminology
goes – it compiles the query. SQL Server saves the plan in cache, and next time you run
the query, the plan is reused. The query plan stays in cache
until it's aged out because it has not been used for a while, or it is
invalidated for some reason. (Why this happens falls outside the scope of
this article.)


The reuse of cached query plans is very important for the performance
of queries where the compilation time is in par with the execution time or
exceeds it. If
a query needs to run for four minutes, it does not matter much if the query
is recompiled for an extra second each time. On the other hand, if the execution time of the
query is 40 ms but it takes one second to compile the query, there is a
huge gain with the cached plan, particularly if the query is executed over and
over again.


Up to SQL 6.5 the only plans there were put
into the cache were plans for stored
procedures. Loose batches of SQL were compiled each time. And since the
query plan for dynamic SQL is not part of the stored procedure, that includes
dynamic SQL as well. Thus in SQL 6.5, the use of dynamic SQL nullified the
benefit with stored procedures in this regard.


Starting with SQL 7, SQL Server also caches the plans for bare statements
sent from a client or generated through dynamic SQL. Say that you send this
query from the client, or execute it with EXEC():


SELECT O.OrderID, SUM(OD.UnitPrice * OD.Quantity)
FROM Orders O
JOIN [Order Details] OD ON O.OrderID = OD.OrderID
WHERE O.OrderDate BETWEEN '19980201' AND '19980228'
AND EXISTS (SELECT *
FROM [Order Details] OD2
WHERE O.OrderID = OD2.OrderID
AND OD2.ProductID = 76)
GROUP BY O.OrderID

The query returns the total order amount for the orders in February 1998 that
contained the product Lakkalikööri. SQL Server will put
the plan into the cache,
and next time you run this query, the plan will be reused. But only if it is exactly the same query.
Since the cache lookup is by a hash value computed from the query text, the cache is space- and case-sensitive.
Thus, if you add a
single space somewhere, the plan is not reused. More importantly, it is not
unlikely that next time you want to run the query for a different product, or a
different period.


All this changes, if you instead use sp_executesql to run your query
with parameters:


DECLARE @sql nvarchar(2000)
SELECT @sql = 'SELECT O.OrderID, SUM(OD.UnitPrice * OD.Quantity)
FROM dbo.Orders O
JOIN dbo.[Order Details] OD ON O.OrderID = OD.OrderID
WHERE O.OrderDate BETWEEN @from AND @to
AND EXISTS (SELECT *
FROM dbo.[Order Details] OD2
WHERE O.OrderID = OD2.OrderID
AND OD2.ProductID = @prodid)
GROUP BY O.OrderID'
EXEC sp_executesql @sql, N'@from datetime, @to datetime, @prodid int',
'19980201', '19980228', 76

The principle for cache lookup is the same as for a non-parameterised query:
SQL Server hashes the query text and looks up the hash value in the cache,
still in a case- and space-sensitive fashion. But since the parameter values
are
not part of the query text, the same plan can be reused even when the input
changes.


To make this really efficient there is one more thing you need to observe.
Do you see that I've prefixed all tables in the query with dbo? There
is a very important reason for this. On SQL 2000,
this is an absolute must for effecient use of the query-plan cache. If you leave out dbo from a single
table, each user will get his own copy of the plan in the cache. This is
because on SQL 2000, each user has a default schema which is equal to the
username. So when user1 runs a query that goes "SELECT ... FROM
Orders", SQL Server must first check if there is a table user1.Orders,
before it looks for dbo.Orders. Since user1.Orders could appear
on the scene at any time, SQL Server needs to have a separate plan for each user.


The recommendation to include dbo applies very much to SQL 2005 as well, but
it is not an absolute must. This is because on SQL 2005, owner and schema has
been separated from each other so that users can have dbo as their
default schema, in which case they can share a plan even when tables are not prefixed
with dbo. (Since only Orders will be unambiguous.) Note here that for
this to happen, users must have been created with the new command CREATE USER.
If you – or the DBA
use sp_adduser out of habit, users still have their own default
schema, and the situation is the same as on SQL 2000.
Thus, to be safe, always prefix your tables with dbo in your dynamic SQL on SQL 2005 as
well. (Unless, of course, you are actually using different schemas and not
only the dbo schema, something which is a lot easier to do now. If you want to read more about owner/schema separation, there is a
section on it in my article Granting Permissions through
Stored Procedures
.)


If you instead use stored procedures, it is not equally important to prefix
tables with dbo. Microsoft still recommends that you do, but even if
you don't, users with different default schema can share the same query
plan.


From what I have said here, it follows that if you use dynamic SQL with
EXEC() you lose an important benefit of stored procedures
whereas with sp_executesql you don't. At least in
theory. It's easy to forget that dbo, and if you leave it out in just a
single place in the query, you will get as
many entries in the cache for the query as there are users running it. Recall
also that the cache is space-
and case-sensitive, so if you generate the same query in several places, you
may inadvertently have different spacing or inconsistent use of case.
Also, since the cache lookup is by a hash value computed from the query text, I
would assume that this is somewhat more expensive than looking up a stored
procedure. In fact, under extreme circumstances, heavy use of dynamic SQL, can lead to serious
performance degradation. Some of my MVP colleagues have observed systems with
lots of memory (> 20 GB) when the plan cache has been so filled with plans
for SQL statements, that there have been hash collisions galore, and the
cache lookup alone could take several seconds. Presumably, the applications in
question either did not use parameterised queries at all, or they failed to
prefix tables with dbo.


So far, I've only talked about dynamic SQL in stored procedures. But in this
regard there is very little difference to SQL statements sent from
the client, or SQL statements generated in CLR procedures. The same rules
apply: unparameterised statements are cached but with little probability for
reuse, whereas parameterised queries can be as efficient as stored
procedures if you remember to always prefix the tables with dbo. (And still
with the caveat that the cache lookup is space- and case-sensitive.) Most client APIs implement
parameterised queries by calling sp_executesql under the covers.


In the section on SQL Injection, I included an example on how to do
parameterised queries with ADO and VB6.
Here is an example with VB .Net and SqlClient:


cmd.CommandType = System.Data.CommandType.Text
cmd.CommandText = _
" SELECT O.OrderID, SUM(OD.UnitPrice * OD.Quantity)" & _
" FROM dbo.Orders O " & _
" JOIN dbo.[Order Details] OD ON O.OrderID = OD.OrderID" & _
" WHERE O.OrderDate BETWEEN @from AND @to" & _
" AND EXISTS (SELECT *" & _
" FROM dbo.[Order Details] OD2" & _
" WHERE O.OrderID = OD2.OrderID" & _
" AND OD2.ProductID = @prodid)" & _
" GROUP BY O.OrderID"

cmd.Parameters.Add("@from", SqlDbType.Datetime)
cmd.Parameters("@from").Value = "1998-02-01"

cmd.Parameters.Add("@to", SqlDbType.Datetime)
cmd.Parameters("@to").Value = "1998-02-28"

cmd.Parameters.Add("@prodid", SqlDbType.Int)
cmd.Parameters("@prodid").Value = 76

In difference to ADO, SqlClient uses names with @ for parameters. The syntax
for defining parameters is similar to ADO, but not identical. This article is
long enough, so I will not go into details on how the Parameters
collection works. Instead, I refer you to MSDN where both SqlClient and ADO
are documented in detail. Whatever client API you are using,
please
learn how to use parameterised commands with it. Yes, there is a tone of
desperation in my voice. I don't know how many posts I've seen on the
newsgroups over the years where people build their SQL strings by
interpolating the values from input fields into the SQL string, and thereby
degrading the performance of their application, and worst of all opening
their database to SQL injection.


... and just when you thought you were safe, I need to turn this upside down. Recall what I said in the
beginning of this section, that if the query is going to run for four minutes, one
second extra for compilation is not a big deal. And if that recompilation
slashes the execution time from forty minutes to four, there is a huge gain.
Most queries benefit from cached parameterised plans, but not all do. Say
that you have a query where the user can ask for data for some time span. If the user asks for a summary for a single day, there is a
good non-clustered index that can be used for a sub-second response time. But
if the request is for the entire year, the same index would be a disaster, and
a table scan is to prefer. On SQL 2005 you can force a
query to be recompiled each
time it is executed by adding OPTION (RECOMPILE)
to the end of the query, and thus you can still use sp_executesql to get the
best protection against SQL injection. On SQL 2000
and earlier, it may in fact be better to interpolate critical parameters into the
query string when you need to force recompilation each time.


For the sake of completeness, I should mention that SQL
Server is able to auto-parameterise queries. If you submit:


SELECT OrderID, OrderDate FROM dbo.Orders WHERE CustomerID = N'ALFKI'

SQL Server may recast this as


SELECT OrderID, OrderDate FROM dbo.Orders WHERE CustomerID = @P1

so if next time you submit BERGS instead of ALFKI, the query plan will be reused.
Auto-parameterisation comes in two flavours: simple and forced. Simple is the
default and is the only option on SQL 2000 and
earlier. With simple parameterisation, auto-parameterisation happens only with very simple
queries, and, it seems, with some inconsistency. With forced
parameterisation, SQL Server parameteries all queries that comes its way
(with some exceptions documented in Books Online). Forced parameterisation
is, in my opinion, mainly a setting to cover up for poorly designed
third-party application that uses unparameterised dynamic
SQL. For your own development you should not
rely on any form of auto-parameterisation. (But in the situation you really a want a new query
plan each time, you may have to verify that it doesn't happen when you don't
want to.)


They say seeing is believing. Here is a demo that you can try on yourself, if
you have SQL 2005. First create this database:


CREATE DATABASE many_sps 
go
USE many_sps
go
DECLARE @sql nvarchar(4000),
@x int
SELECT @x = 200
WHILE @x > 0
BEGIN
SELECT @sql = 'CREATE PROCEDURE abc_' + ltrim(str(@x)) +
'_sp @orderid int AS
SELECT O.OrderID, O.OrderDate, O.CustomerID, C.CompanyName,
Prodcnt = OD.cnt, Totalsum = OD.total
FROM Northwind..Orders O
JOIN Northwind..Customers C ON O.CustomerID = C.CustomerID
JOIN (SELECT OrderID, cnt = COUNT(*), total = SUM(Quantity * UnitPrice)
FROM Northwind..[Order Details]
GROUP BY OrderID) AS OD ON OD.OrderID = O.OrderID
WHERE O.OrderID = @orderid'
EXEC(@sql)
SELECT @x = @x - 1
END

Then in SQL Server Management Studio, press F7
navigate down to the list of stored procedures. Select all procedures. Then
from the context menu select to script them as CREATE
TO
to a new query window. How long time this takes depends on your
hardware, but on my machine it took 90 seconds and at the same time SQL
Server grabbed over 250 MB of memory. If you
use the Profiler to see what Mgmt Studio is up to, you will see that for each
procedure, Mgmt Studio emits a couple of queries with the procedure name
embedded. That is, no parameterised statements. Once scripting is complete,
issue this command:


ALTER DATABASE many_sps SET PARAMETERIZATION FORCED

and redo the operation. On my machine scripting now completed in five
seconds!. This demonstrates that the difference between parameterised and
unparameterised can be dramatic. (And that Microsoft can not use their own
products properly.) If you run SQL Server on
your local machine, you can see this from one more angle, you can stop and restart
SQL Server before the two scripting operations, and then use Task Manager to
see how much physical memory SQL Server uses
in the two cases. That difference lies entirely in the plan cache.


Reducing Network Traffic



Another advantage with stored procedures over SQL sent from the client is that less bytes travel the network. Rather than sending a
50-line query over the network, you only need to pass the name of a stored procedure
and a few parameters. This gets more significant if the computation requires
several queries, possibly with logic in between. If all logic is outside the
database, this could mean that data has to travel up to the client, only to travel back in the next moment. With stored procedures you can
use temp tables to hold intermediate results. (You can use temp tables
from outer layers as well, although it may require some careful use of your
client API.)


In this case, the dividing line goes between sending SQL from the client or
running stored procedures. If the stored procedures use static SQL only, or
invoke dynamic SQL does not matter, nor does it matter if it is a CLR procedure.
You still get the gains of reduced network traffic.


Encapsulating Logic



This is not a question of security or performance, but one of
good programming practice and modularising your code. By using stored procedures, you don't have to bog down
your client code with the construction of SQL statements. Then again, it depends
a little on what you put into those stored procedure. Myself, I am of the
school that the business logic should be where the data is, and in this case
there is no dispute that you should use stored procedures to encapsulate your
logic.


But there are also people
who prefer to see the database as a unintelligent container of data, and who
prefer to have the business logic
elsewhere. In this case, the arguments for using stored procedures
for encapsulation may not be equally compelling. You could just as well employ careful programming practices in
your client language and send SQL strings.


Nothing of this changes if you use dynamic SQL in your stored procedures. The
stored procedure is still a container for some piece of logic, and how it
looks on the inside does not matter. I'm here assuming that most of your
procedures use static SQL only. If all your stored procedures
generate dynamic SQL, then you are probably better off in this regard to do it all in client code. Then again, sometimes there is no other application
than Query Analyzer or SQL Server Management Studio. (Typically this would be
tasks that are run by an admin.) In this case, the only container of logic
available is stored procedures, and it's immaterial whether they use dynamic
SQL or not.


Keeping Track of what Is Used



In a complex system with hundreds of tables, you may need to know where a
certain table or column is referenced, because you are considering changing
or dropping it. If all access to tables is from static SQL in stored
procedures, you may be able find all references by using the system
stored procedure sp_depends or query a system table directly. (sysdepends
in SQL 2000, sys.sql_dependencies in SQL 2005.) I say may, because it is very difficult to maintain complete dependency
information in SQL Server. If you drop and recreate a table, all dependency
information for the table is lost. What I do myself is to regularly build an empty database
from our version-control system, and since our build tool
loads all tables before any stored procedure or trigger, I know that I can
trust the dependency information in that database.


If you throw dynamic SQL into the mix – be that SQL sent from client,
dynamic SQL in T-SQL procedures, or SQL generated by CLR stored procedures
- you lose this opportunity. The alternative is to employ brute-force search,
and if the construction of dynamic SQL is confined to some well-defined set
of modules, this may work. If not, you may end up with a database where no
one ever dares to drop or change a column or a table, and which eventually
becomes unbearable complex and inefficient because of all the legacy baggage
it's carrying around.


While the main dividing line here is between static SQL and any form of
dynamic SQL, dynamic SQL in T-SQL stored procedures is probably the least
harmful, as there is less code to search. In SQL 2005, you can even search
the column sys.sql_modules.definition using SQL. (And in SQL 2000 you
can search syscomments, but as the procedure text there is chopped into
4000-char slices, this is less reliable.)


In any case, an occasional stored procedure that uses dynamic SQL is not
likely cause the Armageddon I pictured above. But it is
a good argument for being restrictive with dynamic SQL in any form.


Easiness of Writing SQL Code


One distinct advantage of writing stored T-SQL procedures is that you get a
syntax check directly. With dynamic SQL, a trivial syntax error may not show up
until run time. Even if you test your code carefully, there may be some query, or
some variation of a query, that is only run in odd cases and not covered in
your test suite.


It has to be admitted that the strength of this argument is somewhat reduced by the fact
that T-SQL is not too industrious on reporting semantic errors.
Because of deferred name resolution, SQL Server will not examine queries in
stored procedures, where one or more tables are missing, be that misspellings
or temp tables created within the procedure. Nevertheless, SQL Server
does report sufficiently many errors, for this to be a very important reason
to use stored procedures.


Another side of this coin is that when you write dynamic SQL, you embed the
SQL code into strings, which makes programming far more complex. Your SQL
code is a string delimited by single quotes('), and this string
may include strings itself, and to include a single quote into the string you
need to double it. You can easily get lost in a maze of quotes if you don't
watch out. (In the section Good Coding Practices
and Tips for Dynamic SQL
, we will look a little closer
on how to deal
with this problem.) The most commonly used client languages with T-SQL -
Visual Basic, C#, C++, VBScript – all use the double quote (")
as their string delimiter, so dynamic SQL in client code or CLR stored
procedures is less prone to that particular problem. Then again, in VB you
don't have multi-line strings, so at the end of each line you have to have a double
quote, an ampersand and an underscore for continuation. It sure does not
serve to make coding easier. You are relieved from all this hassle, if you
use stored procedures with static SQL only.


Addressing Bugs and Problems


Somewhat surprisingly, one of the strongest arguments for stored procedures today may
be
that they permit you to quickly address bugs and performance problems in the
application.


Say that you generate SQL statements in your application, and that there is
an error in it. Or that it simply performs unbearably slow. To fix it, you need to
build a new executable or DLL, which is likely to contain other code that also
has changed since the module was shipped. This
means that before the fix can be put into production, the module will have to go
through QA and testing.


On the other hand, if the problem is in a stored procedure, and the fix is
trivial, you may be able to deploy a fix into production within an hour after
the problem was reported.


This difference is even more emphasised, if you are an ISV and you ship a
product that the customer is supposed administer himself. If your application
uses stored procedures, a DBA may be able to address problems directly
without opening support cases. For instance, if a procedure runs unacceptably
slow, he may be able to fix that by adding an index hint. In contrast,
with an application that generates SQL in the
client, his hands will be tied. Of course, as an ISV you may not want your
customers to poke around in your code, even less to change it. You may also prefer
to ship your procedures WITH ENCRYPTION to protect
your intellectual property, but this is best controlled
through license agreements. (If you encrypt your procedures, the DBA can still
change them, as long as he is able to find a way to decrypt them. Which any
DBA that knows how to use Google can do.)


In this case, it does not matter whether the stored procedure uses static SQL
only, or if it also uses dynamic SQL. For CLR procedures it depends on many objects
you have in your assemblies. If you have one assembly per object, installing a new version of a CLR procedure
is as simple as replacing a T-SQL procedure.


(I should add that SQL 2005 offers a new feature that permits the DBA to
change the plan for a query without altering the code, by adding a plan guide.
This is quite an advanced feature, and I refer to Books Online for details.)


Good Coding Practices and Tips for Dynamic SQL


Writing
dynamic SQL is a task that requires discipline to avoid that you lose control
over your code. If you
just go ahead, your code can become very messy, and be difficult to read, troubleshoot
and maintain. In this section, we will look at how to avoid this. I will also
discuss some special cases: how you can use sp_executesql for input longer
than 4000 chars in SQL 2000, and how to use dynamic SQL with cursors, and the
combination of dynamic SQL and user-defined functions.


Use Debug Prints!


When you write a stored procedure that generates dynamic SQL, you should
always include a @debug parameter:


CREATE PROCEDURE dynsql_sp @par1 int,
...
@debug bit = 0 AS
...
IF @debug = 1 PRINT @sql

When you get a syntax error from the dynamic SQL, it can be very confusing, and
you may not even discern where it comes from. And even when you do, it can be
very difficult to spot the error only by looking at the code that constructs the SQL.
Once the SQL code is slapped in your face, the error is much more likely to be apparent to you.
So always include a @debug parameter and a PRINT!


Nested Strings


As I've already mentioned, one problem with dynamic SQL is that you often need to deal with nested
string delimiters. For instance, in the beginning of this article, I showed
you the procedure general_select2. Here it is again:


CREATE PROCEDURE general_select2 @tblname nvarchar(127),
@key varchar(10) AS
EXEC('SELECT col1, col2, col3
FROM ' + @tblname + '
WHERE keycol = ''' + @key + '''')

(Again, I like to emphasise that this sort of procedure is poor use of
dynamic SQL.)



SQL is one of those language where the method to include a string
delimiter itself in a string literal is to double it. So those four consecutive
single quotes ('''') is a string literal with the value of a one
single quote (').
This is a fairly simple example, it can get a lot worse. If you work with
dynamic SQL, you must learn to master nested strings. Obviously, in this case you
can easily escape the mess by using sp_executesql instead – yet another reason
to use parameterised statements. However, there are situations when you need to
deal with nested quotes even with sp_executesql. For instance, earlier in this
article, I had this code:


N' WHERE LastUpdated BETWEEN @fromdate AND '
N' coalesce(@todate, ''99991231'')'

We will look at some tips of dealing with nested strings later in this
section.


Spacing and Formatting


Another thing to be careful with is the spacing as you concatenate the parts
of a query.
Here is an example where it goes wrong:


EXEC('SELECT col1, col2, col3
FROM' + @tblname + '
WHERE keycol = ''' + @key + '''')


See that there is a space missing after FROM? When you compile the stored procedure
you will get no error, but when you run it, you will be told that the columns
keycol
, col1, col2, col3 are missing. And since you know that the
table you passed to the procedure has these columns you will be mighty confused. But this is
the actual code generated, assuming the parameters foo and abc:


SELECT col1, col2, col3
FROMfoo
WHERE keycol = 'abc'

This is not a syntax error, because FROMfoo is a column alias to col3.
And, yes, it's legal to use a WHERE clause, even if there is no FROM clause. But
since the columns cannot exist out of the blue, you get an error for that.


This is also a good example why you should use debug prints. If the code
looks like this:


SELECT @sql =' SELECT col1, col2, col3
FROM' + @tblname + '
WHERE keycol = ''' + @key + ''''
IF @debug = 1 PRINT @sql
EXEC(@sql)

It would be much easier to find the error by running the procedure with
@debug = 1. (Obviously, had we included the dbo prefix, this error
could not occur at all.)


Overall, good formatting is essential when working with dynamic SQL. Try to
write the query as you would have written it in static SQL, and then add the
string delimiters outside of that. T-SQL permits you to embed newlines in
string literals (as testified by the example above), so in difference to VB,
you don't need a string delimiter on each line. An advantage of this is that
your debug PRINT is easier to read, and in case of a syntax error, the line
number in the error message may guide you.


You may prefer, though, to
have a string terminator on each line. A tip in such case is to do something
like this:


EXEC(' SELECT col1, col2, col3 ' +
' FROM ' + @tblname +
' WHERE keycol = ''' + @key + '''')

As you see, I have a space after the opening single quote on each line to avoid syntax problems due to missing spaces.


Dealing with Dynamic Table and Column Names


Passing table and column names as parameters to a procedure with dynamic SQL
is rarely a good idea for application code. (It can make perfectly sense for
admin tasks). As I've said, you cannot pass a table or a column name as a
parameter to sp_executesql, but you must interpolate it into the SQL string.
Still you should protect it against SQL
injection, as a matter of routine. It could be that bad it comes from user
input.


To this end, you should use the built-in function quotename() (added in
SQL 7). quotename() takes two parameters: the first is a string, and the second
is a pair of delimiters to wrap the string in. The default for the second
parameter is []. Thus, quotename('Orders') returns
[Orders]
. quotename() takes care of nested delimiters, so if you have
a really crazy table name like Left]Bracket, quotename() will
return [Left]]Bracket].


Note that when you work with names with several components, each component
should be quoted separately. quotename('dbo.Orders') returns
[dbo.Orders], but that is a table in an unknown
schema of which the first four characters are d, b, o and
dot. As long as you only work with the dbo schema, best practice is to
add dbo in the dynamic SQL and only pass the table name. If you work
with different schemas, pass the schema as a separate parameter. (Although
you could use the built-in function parsename() to split up a
@tblname
parameter in parts.)


While general_select still is a poor idea as a stored procedure, here
is nevertheless a version that summarises some good coding
virtues for dynamic SQL:


CREATE PROCEDURE general_select @tblname nvarchar(128),
@key varchar(10),
@debug bit = 0 AS
DECLARE @sql nvarchar(4000)
SET @sql = 'SELECT col1, col2, col3
FROM dbo.' + quotename(@tblname) + '
WHERE keycol = @key'
IF @debug = 1 PRINT @sql
EXEC sp_executesql @sql, N'@key varchar(10)', @key = @key


  • I'm using sp_executesql rather than EXEC().

  • I'm prefixing the table name with dbo.

  • I'm wrapping @tblname in quotename().

  • There is a @debug parameter.


Quotename, Nested Strings and Quotestring



The main purpose of quotename() is to quote object names, which is why the
default for the second parameter is brackets. But you can specify other
delimiters as well, including single quotes, which means that any single quote
in the input is doubled. Thus, if you for some reason prefer to use
EXEC(), you can use quotename() to protect yourself against SQL
injection by help of this function. Here is an example.


IF @custname IS NOT NULL
SELECT @sql = @sql + ' AND custname = ' + quotename(@custname, '''')

Say that @custname has the value D'Artagnan. This part of the dynamic SQL
becomes:


AND custname = 'D''Artagnan'

There is a limitation with quotename(): its input parameter
is nvarchar(128), so it does not handle long strings. A remedy is this user-defined function:


CREATE FUNCTION quotestring(@str nvarchar(1998)) RETURNS nvarchar(4000) AS
BEGIN
DECLARE @ret nvarchar(4000),
@sq char(1)
SELECT @sq = ''''
SELECT @ret = replace(@str, @sq, @sq + @sq)
RETURN(@sq + @ret + @sq)
END

This version is for SQL 2000. On SQL 2005 replace 1998 and 4000 with MAX,
so that it works for any string length.
Here is an example of using this function:


IF @custname IS NOT NULL
SELECT @sql = @sql + ' AND custname = ' + dbo.quotestring(@custname)


The result is the same as above.



On SQL 7, you would have to implement quotestring as a stored procedure.
SQL 6.5 does not have replace(), so you are a bit out of luck there.



So with quotename() and quotestring(),
do we have as good protection against SQL
injection as we have with parameterised commands? Maybe. I don't know of any way to
inject SQL that slips through quotename() or quotestring(). Nevertheless, you
are interpolating user input into the SQL string, whereas with parameterised
commands, you don't.



(I
should add that I got the suggestion to use quotename() or a user-defined
function from SQL Server MVP Steve Kass.)


QUOTED_IDENTIFIER


Another alternative to
escape the mess of nested quotes, is make use
of the fact that T-SQL actually has two string delimiters. To wit, if the
setting QUOTED_IDENTIFIER is OFF, you can also use double quotes(")
as a string delimiter. The default
for this setting depends on context, but the preferred setting is
ON, and it
must be ON in order to use XQuery, indexed views and indexes on computed columns.
Thus, this is not a first-rate alternative, but if you are aware of the caveats,
you can do this:


CREATE PROCEDURE general_select @tblname nvarchar(127),
@key key_type,
@debug bit = 0 AS
DECLARE @sql nvarchar(4000)

SET @sql = 'SET QUOTED_IDENTIFIER OFF
SELECT col1, col2, col3
FROM dbo.' + quotename(@tblname) + '
WHERE keycol = "' + @key + '"'
IF @debug = 1 PRINT @sql
EXEC(@sql)


Since there are two different quote characters, the code is much easier to
read. The single quotes are for the SQL string and the double quotes
are for
the embedded string literals.


All and all, this is an inferior method to both sp_executesql and quotestring(), since you are not protected against SQL injection
(what if @key includes a double quote?). But it
would be OK to do for some sysadmin task (where SQL injection is not likely
to be an issue), and it may be the best way to go on SQL 6.5.


sp_executesql and Long SQL Strings in SQL 2000


There is a limitation with sp_executesql on SQL 2000
and SQL 7, since you cannot use longer SQL
strings than 4000 characters. (On SQL 2005,
use nvarchar(MAX) to avoid this
problem.) If you
want to use sp_executesql when your query string exceeds this limit to make use of parameterised query plans, there is actually a
workaround. To wit, you can wrap sp_executesql in EXEC():


DECLARE @sql1 nvarchar(4000),
@sql2 nvarchar(4000),
@state char(2)
SELECT @state = 'CA'
SELECT @sql1 = N'SELECT COUNT(*)'
SELECT @sql2 = N'FROM dbo.authors WHERE state = @state'
EXEC('EXEC sp_executesql N''' + @sql1 + @sql2 + ''',
N''@state char(2)'',
@state = ''' + @state + '''')

This works, because the @stmt parameter to sp_executesql is ntext, so by
itself, it does not have any limitation in size.


You can even use output parameters by using
INSERT-EXEC, as
in this example:


CREATE TABLE #result (cnt int NOT NULL)
DECLARE @sql1 nvarchar(4000),
@sql2 nvarchar(4000),
@state char(2),
@mycnt int
SELECT @state = 'CA'
SELECT @sql1 = N'SELECT @cnt = COUNT(*)'
SELECT @sql2 = N'FROM dbo.authors WHERE state = @state'
INSERT #result (cnt)
EXEC('DECLARE @cnt int
EXEC sp_executesql N''' + @sql1 + @sql2 + ''',
N''@state char(2),
@cnt int OUTPUT'',
@state = ''' + @state + ''',
@cnt = @cnt OUTPUT
SELECT @cnt')
SELECT @mycnt = cnt FROM #result

You have my understanding if you think this is too messy to be worth it.


Dynamic SQL in User-Defined Functions


This very simple: you cannot use dynamic SQL from used-defined functions
written in T-SQL. This is because you are not permitted do anything in a UDF
that could change the database state (as the UDF may be invoked as part of a
query). Since you can do anything from dynamic SQL, including updates, it is
obvious why dynamic SQL is not permitted.


I've seen more than one post on the newsgroups where people have
been banging their head against this. But if you want to use dynamic SQL in a
UDF, back out
and redo your design. You have hit a roadblock, and in SQL 2000 there is no
way out.


In SQL 2005, you could implement your function as a CLR function. Recall that
all data access from the CLR is dynamic SQL. (You are safe-guarded, so that if
you perform an update operation from your function, you will get caught.) A
word of warning though: data access from scalar UDFs can often give performance
problems. If you say


SELECT ... FROM tbl WHERE dbo.MyUdf(somecol) = @value

and MyUdf performs data access, you have more or less created a hidden
cursor.


Cursors and Dynamic SQL


Not that cursors are something you should use very frequently, but people often
ask about using dynamic SQL with cursors, so I give an example for the sake
of completeness. You cannot say DECLARE CURSOR EXEC(); you have to put the
entire DECLARE CURSOR statement in dynamic SQL:


SELECT @sql = 'DECLARE my_cur INSENSITIVE CURSOR FOR ' +
'SELECT col1, col2, col3 FROM ' + @table
EXEC sp_executesql @sql

You may be used to using the LOCAL keyword with your cursors. However, it is
important to understand that you must use a global cursor, as a local cursor
will disappear when the dynamic SQL exits. (Because, as you know by now, the
dynamic SQL is its own scope.) Once you have declared the
cursor in this way, you can use the cursor in a normal fashion. You must be
extra careful with error-handling though, so that you don't exit the
procedure without deallocating the cursor.


There is however a way to use locally-scoped cursors with dynamic SQL.
Anthony Faull pointed out to me that you can achieve this with cursor variables, as in this example:


DECLARE @my_cur CURSOR
EXEC sp_executesql
N'SET @my_cur = CURSOR STATIC FOR
SELECT name FROM dbo.sysobjects;
OPEN @my_cur',
N'@my_cur cursor OUTPUT', @my_cur OUTPUT
FETCH NEXT FROM @my_cur


You refer to a cursor variable, just like named cursors, but there is an @ in front,
and, as you see from the example, you can pass them as a parameters. (I have to confess
I have never seen any use for cursor variables until Anthony Faull was kind to send
me this example.)


EXEC() at Linked Server


A special feature added in SQL 2005 is that you can use EXEC() to run
pass-through queries on a linked server. This could be another instance of
SQL Server, but it could also be an Oracle server, an Access database, Active
directory or whatever. The SQL could be a single query or a sequence of
statements, and could it be composed dynamically or be entirely static. The syntax
is simple, as seen by this example:


EXEC('SELECT COUNT(*) FROM ' + @db + '.dbo.sysobjects') AT SQL2K

SQL2K is here a linked server that has been defined with
sp_addlinkedserver
.


There is one thing that you can do with EXEC() at a linked server, that you
cannot do with EXEC() on a local server: you can use parameters, both for
input and output. The confuse matters, you don't use parameters with names
starting with @, instead you use question marks (?) as parameter
holders. Say that you are on an SQL 2005 box, and you are dying to know how
many orders VINET had in the Northwind database. Unfortunately, SQL 2005 does
not ship with Northwind, but you have a linked server set up to an instance
of SQL 2000 with Northwind. You can run this:


DECLARE @cnt int
EXEC('SELECT ? = COUNT(*) FROM Northwind.dbo.Orders WHERE CustomerID = ?',
@cnt OUTPUT, N'VINET') AT SQL2K
SELECT @cnt

Note here that the parameter values must appear in the order the parameter
markers appear in the query. When passing a parameter, you can either specify a
constant value or a variable.


You may ask why the inconsistency with a different parameter marker from
sp_executesql? Recall that linked servers in SQL Server are always accessed
through an OLE DB provider, and OLE DB uses ? as
the parameter marker, a convention inherited from ODBC. OLE DB translates
that parameter marker as is appropriate for the data source on the other end.
(Not all RDBMS use @ for variables.)


As with regular EXEC(), you can specify AS USER/LOGIN to use impersonation:


EXEC('SELECT COUNT(*) FROM ' + @db + '.dbo.sysobjects')
AS USER = 'davidson' AT SQL2K

This begs the question: is davidson here a local user or a remote
user at SQL2K? Books Online is not very clear
about this, but I did some
quick experimenting, and found that what you are impersonating is a local user or login,
not a login on the remote server. (The login to use on the remote server can be
defined with sp_addlinkedsrvlogin.)


Common Cases when to (Not) Use Dynamic SQL


When you read the various newsgroups on SQL Server, there is almost every day
someone who asks a question that is answered with use dynamic SQL with a quick example
to illustrate, but ever so often the person answering forgets to tell
about the implications on permissions or SQL injection. On top of
that, far too
many examples uses EXEC() without any thought of query plans. And while many of these
questions taken by the letter have no other answer than dynamic SQL, there is
often a real business problem which has a completely different solution
without dynamic SQL – and
a much better one.


So, in this section I will explore some situations where you could use dynamic
SQL. You will see that sometimes dynamic SQL is a
good choice, but also that in many cases that it is an outright bad idea.


SELECT * FROM @tablename


A common question is why the following does not work:


CREATE PROCEDURE my_proc @tablename sysname AS
SELECT * FROM @tablename


As we have seen, we can make this procedure work with help of dynamic SQL, but
it should also be clear that we gain none of the advantages with generating
that dynamic SQL in a stored procedure. You could just as well send the
dynamic SQL from the client. So, OK: 1) if the
SQL statement is very complex, you save some network
traffic and you do
encapsulation. 2) As we have seen, on SQL 2005 there are methods to deal with
permissions. Nevertheless, this is a bad idea.


There seems to be several reasons why people want to parameterise the table
name. One camp
appears to be people who are new to SQL programming, but have experience
from other
languages such as C++, VB etc where parameterisation is a good thing. Parameterising
the table name to achieve generic code and to increase
maintainability seems like good programmer virtue.


But it is just that when it comes to database objects, the old truth does not
hold. In a proper database design, each table is unique, as it describes a
unique entity. (Or at least it should!) Of course, it is not uncommon to end
up with a dozen or more look-up tables that all have an id, a name
column and some auditing columns. But they do describe different entities,
and their semblance should be regarded as mere chance, and future
requirements may make the tables more dissimilar.


Furthermore, when it comes to building a query plan, each table has its set
of statistics and
presumptions that are by no means interchangeable, as far as SQL Server is
concerned. Finally, in
a complex data model, it is important to get a grip of what's being used. When you start to pass table and column names as parameters, you definitely
lose control.


So if you want to do the above (save the fact that SELECT * should not be
used in production code), to save some typing, you are on the wrong path. It is
much better to write ten or twenty stored procedures, even if they are similar
to each other.


(If your SQL statements are complex, so that there actually is a considerable
gain in maintainability to only have them in one place, despite different
tables being used, you could consider using a
pre-processor like the one in C/C++. You would still have one set of
procedures per table, but the code would be in one single include file.)


SELECT * FROM sales + @yymm


This is a variation of the previous case, where there is a suite of tables
that actually do describe the same entity. All tables have the same columns, and the name includes some partitioning
component, typically year and sometimes also month. New tables are created as
a new year/month begins.


In this case, writing one stored procedure per table is not really feasible.
Not the least, because the user may want to specify a date range for a search, so even
with one procedure per table you would still need a dynamic dispatcher.


Now, let's make this very clear: this is a flawed
table design. You should not have one sales table per month, you should
have one single sales table, and the month that appear in the table
name, should be the first column of the primary key in the united sales table. At least
logically. Sometimes, when you have huge tables (say over 10 GB
in size), partitioning can be a good idea, but you should do it right and use
partitioned views, that we will look at in this section. I like
to make the point that since SQL Server is a enterprise RDBMS, it can
handle very large tables very efficiently, as long as you keep in mind that
good indexing is essential. A few million rows is no cause for concern.


If you have a legacy application, it may be prohibitively
expensive to make a redesign. Then again, the complexity of dynamic SQL also
comes with a cost. Fortunately, there are alternatives. A simple approach is
to define a view like this:


CREATE VIEW sales AS
SELECT year = '2006', * FROM dbo.sales2006
UNION ALL
SELECT year = '2005', * FROM dbo.sales2005
UNION ALL
...

(For a view like this, SELECT * could be
considered OK.) Instead of composing
the table name dynamically, you can now say:


SELECT ... FROM sales WHERE year = '2006' AND ...

Unfortunately, this view is not terribly efficient, as the query will access
all three tables. Furthermore, the view is not updateable. While this is a
partitioned view in some sense, the view does not fulfil the rules for partitioned views
in the SQL Server sense. Such views can be very efficient, because for
queries that include the partitioning column in the WHERE clause, SQL Server
will only access the relevant table(s). And such a view is updatable, so you
can insert data into it, and the data will end up in the right table.


Here is a
quick example/demo on how to properly set up a partitioned view. Assume that
as legacy of a poor design we have these three tables:


SELECT * INTO Orders96 FROM Northwind..Orders WHERE year(OrderDate) = 1996
ALTER TABLE Orders96 ADD CONSTRAINT pk96 PRIMARY KEY (OrderID)

SELECT * INTO Orders97 FROM Northwind..Orders WHERE year(OrderDate) = 1997
ALTER TABLE Orders97 ADD CONSTRAINT pk97 PRIMARY KEY (OrderID)

SELECT * INTO Orders98 FROM Northwind..Orders WHERE year(OrderDate) = 1998
ALTER TABLE Orders98 ADD CONSTRAINT pk98 PRIMARY KEY (OrderID)

First step is to a add Year column to each table. These columns need a
default (so that processes that insert directly into these tables are
unaffected) and a CHECK constraint. Here is how it looks for Orders96:


ALTER TABLE Orders96 ADD Year char(4) NOT NULL
CONSTRAINT def96 DEFAULT '1996'
CONSTRAINT check96 CHECK (Year = '1996')

This column must be the first column in the primary key, so we need to drop
the current primary key and recreate it:


ALTER TABLE Orders DROP CONSTRAINT pk96
ALTER TABLE Orders96 ADD CONSTRAINT u96 UNIQUE (Year, OrderID)

Again, this must be performed for all three tables. Finally, you can create
the view:


CREATE VIEW Orders AS
SELECT * FROM dbo.Orders96
UNION ALL
SELECT * FROM dbo.Orders97
UNION ALL
SELECT * FROM dbo.Orders98

You now have a proper partitioned view that you can perform inserts and updates through. And if you run a query like:


SELECT OrderID, OrderDate, EmployeeID
FROM Orders
WHERE Year = @year
AND CustomerID = N'BERGS'

SQL Server will at run-time only access the OrdersNN table that maps to
@year. If you look at the query plan casually, it may seem that all three
tables are
accessed, but if you check the Filter operators you will find something
called STARTUP EXPR. This means that SQL Server determines at
run-time
whether to access the table or not.


For your real-world case you may find it prohibitive to change the primary
key. In this case you could add a UNIQUE constraint with the partitioning
column + the real primary key. This will not be a proper partitioned view,
and the view will not be updatable,
but with some luck SQL Server may still apply startup expressions, and access only one of the base tables.
At least I got it to work, when I ran a quick test. You
should verify that it works for your situation.


When a new table is added with a new year, the view needs to be redefined. If
this happens frequently, for instance by each month, you should probably set
up a job for this. I leave out example code, but it requires running
a cursor over sysobjects to compose a CREATE VIEW statement that you then
execute with sp_executesql or EXEC(). That would be an example of good use of
dynamic SQL.


This was a very concentrated introduction to partitioned views, a feature which
was introduced in SQL 2000. What Microsoft had in mind was truly big tables where partitioning is desired not only for
performance but also manageability. You can find the full rules for
partitioned views under the topic for CREATE VIEW in Books Online. Good
reading is also Stefan
Delmarco's detailed article
SQL
Server 2000 Partitioned Views
.


For completeness sake, I should mention that it is also possible to define
distributed partitioned views with tables spread out over several servers.
Furthermore, SQL 2005 adds another partitioning feature, partitioned tables.
I'm not detailing any of them here.


UPDATE tbl SET @colname = @value WHERE keycol = @keyval


In this case people want to update a column which they select at run time.
The above is actually legal in T-SQL, but what happens is simply that the
variable @colname
is assigned the value in @value for each affected row in the table.


In this case dynamic SQL would call for the user to have UPDATE permissions
on the table, something which is not to take lightly. So there is all reason to
avoid it. Here is a fairly simple workaround:


UPDATE tbl
SET col1 = CASE @colname WHEN 'col1' THEN @value ELSE col1 END,
col2 = CASE @colname WHEN 'col2' THEN @value ELSE col2 END,
...


If you don't know about the CASE expression, please look it up in Books Online.
It's a very powerful SQL feature.


Then again, one would wonder why people want to do this. Maybe it's because their
tables look like this:


CREATE TABLE products (prodid   prodid_type NOT NULL,
prodname name_type NOT NULL,
...
sales_1 money NULL,
sales_2 money NULL,
...
sales_12 money NULL,
PRIMARY KEY (prodid))


It could make more sense to move these sales_n columns to a second table:


CREATE TABLE product_sales (prodid prodid_type NOT NULL,
month tinyint NOT NULL,
sales money NOT NULL,
PRIMARY KEY (prodid, month))

SELECT col AS @myname


The request here is to determine the name for a column in a result set at
run-time.My gut reaction, is that this should be handled
client-side. But if your client is a query window is Management Studio or
similar, this is kind of difficult. In any case, this is simple to do without any
dynamic SQL on SQL 2005:


DECLARE @mycolalias sysname
SELECT @mycolalias = 'This week''s alias'

CREATE TABLE #temp (a int NOT NULL,
b int NOT NULL)

INSERT #temp(a, b) SELECT 12, 17

EXEC tempdb..sp_rename '#temp.b', @mycolalias, 'COLUMN'

SELECT * FROM #temp

That is, you first get the data into a temp table, and then you use
sp_rename
to rename the column along your needs. (You need to qualify sp_rename with tempdb to have sp_rename operate in that database.) You will get
an informational message Caution: Changing any part of an object name could
break scripts and stored procedures
, but you may be able to live with that.


This trick works on SQL 2000 too, although not entirely without dynamic
SQL:
you need put the SELECT from the temp table in
EXEC():


EXEC('SELECT * FROM #temp')

This is because on SQL 2000, sp_rename apparently does not trigger a recompile,
so if the the SELECT is in the same batch, the statement fails with Invalid
column name 'b'
. There is yet one thing to be aware of on SQL 2000: you
canont use sp_rename in a stored procedure that is to be run by plain users, as sp_rename thinks
you need to be a member of the db_owner or db_ddladmin database roles,
even if this is only a temp table. This issue has been addressed in
SQL 2005.


SELECT * FROM @dbname + '..tbl'


In this case the table is in another database which is somehow determined
dynamically. There seems to be several reasons why people want to do this, and
depending on your underlying reason, the solution is different.


Get Data from another Database


If you for some reason have your
application spread over two databases, what you absolutely not should do is
to have code that says:


SELECT ... FROM otherdb.dbo.tbl JOIN ...

This is bad, because if someone asks for a second environment on the same
server, you have a lot of code to change.


The best solution for this particular problem on SQL 2005, is to use
synonyms:


CREATE SYNONYM otherdbtbl FOR otherdb.dbo.tbl

You can then refer to otherdb.dbo.tbl as just otherdbtbl. If
there is a need for a second set of databases, you only have to update the
synonyms, and there is no need to use dynamic SQL.


Yet a way to avoid dynamic SQL is to use stored procedures for all
inter-database communication. That is, if you are in db1 and need to get data from
db2
, you call a stored procedure in db2. This can be dynamic,
because EXEC permits you to specify a variable that holds the name of the
procedure to execute.


SELECT @dbname = quotename(dbname) FROM ...
SELECT @sp = @dbname + '..some_sp'
EXEC @ret = @sp @par1, @par2...

If you want to get result sets back from db2, look at my article
How to Share Data between Stored Procedures

for suggestions.


There may still be cases you may find that dynamic SQL is the only feasible
situation. This can be done in two ways. The most obvious
is:


SELECT @dbname = quotename(dbname) FROM ...
SELECT @sql = ' SELECT ... FROM ' + @dbname + ' .dbo.otherdbtbl ' +
' JOIN dbo.localtbl ... '
EXEC sp_executesql @sql, @params, ...

But, if the query is complex, and most of the tables are in the remote
database you can also do:


SELECT @sql = ' SELECT ... FROM dbo.othertbl ' +
' JOIN ' + quotename(db_name()) + '.dbo.localtbl ... '
SELECT @dbname = quotename(dbname) FROM ...
SELECT @sp_executesql = @dbname + '..sp_executesql'
EXEC @sp_executesql @sql, @params, ...

As above, I make use of that you can specify the procedure name dynamically
with EXEC. The trick here is that when you specify a system stored procedure
in three-part notation with the database name, the procedure executes in the
context of that database. Thus, the dynamic SQL in this example runs in
@dbname, not the current database.


Do Something in Every Database


This sounds to me like some sysadmin
venture, and for sysadmin tasks dynamic SQL is
usually a fair game, because neither caching nor permissions are issues.
Nevertheless there is an kind of alternative: sp_MSforeachdb, demonstrated by this example:


sp_MSforeachdb 'SELECT ''?'', COUNT(*) FROM sysobjects'

As you might guess, sp_MSforeachdb uses dynamic SQL internally, so
what you win is that you don't have to write the control loop yourself. I should
hasten to add that sp_MSforeachdb is not documented in Books Online,
which also means that use of it is not supported by Microsoft and it could be
changed or withdrawn from SQL Server without notice.


A "Master" Database


The scenario here is that you have a suite of databases with identical
schema. The typical reason they are different databases and not one, is that every
database serves a different customer, and each customer can access his
database (but of course no one else's). Some people
see a problem with the same stored procedures in fifty databases,
and believe that they face a maintenance nightmare. So they get the idea
that they should put the procedures in a "master" database. Yes, you can do that. It
will give you a much bigger maintenance problem, because your code will
entirely littered with dynamic SQL.
In fact, if you feel that this is the only alternative, you are better off
skipping stored procedures altogther and do all acecss from client code
instead. In such case there is only one place you need to specify the
database: the connection string.


What else can you do? Some people might suggest that you should collapse the
databases into one, and employ a strict
row-level security scheme. Personally, I would never accept such a solution
as a potential customer. In a complex application, bugs can easily lead to
that information is exposed to people who should not see it. Besides,
row-level security cannot be implemented entirely waterproof in SQL Server.
Whereas queries only would return the data they should, query plans and error
messages may indirectly disclose information to users who are not authorised
to see it.


Another wild approach is to use SQL Server's own master database and install the application procedures
as system procedures. This works at least in SQL 2000 and earlier. I have not
verified that it still does in SQL 2005 where Microsoft's
own system procedures
now reside in the invisible resource database. In any case, this is entirely
unsupported. So while I mention the possibility, I don't give you the details
on how to do it and I strongly recommend that you don't go there.


What then is the real solution? Install the stored procedures in each database and develop
rollout routines for your SQL objects. You need this anyway, the day you want
to update the table definitions. This also permits you to have some
flexibility. Some customers may prefer to skip an upgrade. Other customers
may be prepared to pay for extra functions that only they have access to. Even more importantly, it permits you to easily scale out and move some
databases to a second server. I mentioned that as a customer, I would not
accept to share database with other customers. In fact, a security-aware
customer would not even accept to share the same instance of SQL Server, but
require his own instance.


(You may ask whether not synonyms could be used to implement the "master"
database. I have not been able to think of anything useful, but if you find
out something, please drop me a line.)


Creating an Object in Another Database


This question sometimes comes up. Most often people have problems with the
USE command. The correct solution is to avoid USE altogether in this case. In
fact, we have already seen how to do this:


SELECT @sql = 'CREATE VIEW ...'
SELECT @sp_executesql = quotename(@dbname) + '..sp_executesql'
EXEC @sp_executesql @sql

That is, make use of that you can set the database context by calling sp_executesql with three-part notation.


SELECT * FROM tbl WHERE col IN (@list)


It is fascinating how may people who put '1,2,3,4' in @list, and then are
puzzled why the query above does not return any rows. Well, if there is a row
where col has the value '1,2,3,4', you will get a match. These two
conditions are the same:


col IN (@list)
col = @list

IN does not mean "parse whatever data there is at runtime as a
comma-separated list". It's a compile-time shortcut for
col =
@a OR col = @b OR
...


This is a very common question on the newsgroups, and Use dynamic SQL is a far too common answer.
Yes, you can do this with dynamic SQL, but it is an extremely poor solution.
You cannot pass the list as a parameter to sp_executesql, so you would have
to use EXEC() and be open to SQL injection. On
top of that, for long lists, IN has extremely poor performance – in some
tests I did, it took SQL Server 15 seconds to build the query plan for a list
with 10000 elements.


The correct method is to unpack the list into a table with a user-defined
function or a stored procedure. In my article, Arrays and Lists in
SQL Server
, I describe a whole range of ways to do this. I also present performance data for the various methods. (Dynamic SQL is at
the bottom of that list!) This is a long article, but there are jump-start
links in the beginning of the article, depending on which version of SQL
Server you are using.


SELECT * FROM tbl WHERE @condition


If you are considering to write the procedure


CREATE PROCEDURE search_sp @condition varchar(8000) AS
SELECT * FROM tbl WHERE @condition

Just forget it. If you are doing this, you have not completed the transition
to use stored procedure and you are still assembling your SQL code in the client.
But this example lapses into


Dynamic Search Conditions


A not too uncommon case is that the users should be able to select data from a broad set of
parameters. The procedure search_orders in the section on
SQL injection
was a very simple example of this.


Any programmer that tackles this realises that writing a static solution
with a tailor-made query for each combination of input parameters is
impossible. There are a number of ways to attack this problem, both with dynamic and
static SQL, and I have a separate article, Dynamic Search Conditions, where I
present several methods for this type of searches, both with dynamic SQL and
static SQL. A very brief summary is that
dynamic SQL is often the best solution, both for performance and
maintainability, as long as you can accept the permission consequences.


SELECT * FROM tbl ORDER BY @col


This can easily be handled without dynamic SQL in this way:


SELECT col1, col2, col3
FROM dbo.tbl
ORDER BY CASE @col1
WHEN 'col1' THEN col1
WHEN 'col2' THEN col2
WHEN 'col3' THEN col3
END

Again, review the CASE expression in Books Online, if you are not acquainted
with it.


Note that if the columns have different data types you cannot lump them into
the same CASE expression, as the data type of a CASE
expression is always one and the same. Instead, you can do this:


SELECT col1, col2, col3
FROM dbo.tbl
ORDER BY CASE @col1 WHEN 'col1' THEN col1 ELSE NULL END,
CASE @col1 WHEN 'col2' THEN col2 ELSE NULL END,
CASE @col1 WHEN 'col3' THEN col3 ELSE NULL END


If you also want to make it dynamic whether the order should be ascending or
descending, add one more CASE:


SELECT col1, col2, col3
FROM dbo.tbl
ORDER BY CASE @sortorder
WHEN 'ASC' THEN CASE @col1
WHEN 'col1' THEN col1
WHEN 'col2' THEN col2
WHEN 'col3' THEN col3
END
ELSE NULL
END ASC,
CASE @sortorder
WHEN 'DESC' THEN CASE @col1
WHEN 'col1' THEN col1
WHEN 'col2' THEN col2
WHEN 'col3' THEN col3
END
ELSE NULL
END

Or use the form in the second example to deal with diffeent data types.


SQL Server MVP Itzik Ben-Gan had a good article on this topic in the March
2001 issue of SQL Server Magazine,
where he offers other suggestions.


SELECT TOP @n FROM tbl


On SQL 2005 this is straightforward in static SQL, as with a slight syntax
change, TOP accepts expressions for the argument:


SELECT TOP(@n) col1, col2 FROM tbl

On SQL 2000, you can achieve this without dynamic SQL as in this example:


CREATE PROCEDURE get_first_n @n int AS
SET ROWCOUNT @n
SELECT au_id, au_lname, au_fname
FROM authors
ORDER BY au_id
SET ROWCOUNT 0

You may wonder whether SQL Server will consider the value of @n when building
the query plan, and my testing indicates that it does, but:



  • @n must be a parameter. If @n is a local variable, the optimizer has no
    clue.

  • If you first call get_first_n with @n = 10, and then @n = 10000
    you will get the same plan, although different plans may be the best
    choice for different values of @n. You can use the WITH RECOMPILE option to
    address this problem, either with the CREATE PROCEDURE statement or when you
    execute the procedure.

  • This applies to SQL 7 and later. Then again, on SQL 6.5, you don't
    have any other choice, as it does not support SELECT TOP at all.

  • You should remember to say SET ROWCOUNT 0 after the SELECT.


It can be disputed whether SET ROWCOUNT @n is really a better solution than
running a dynamic SQL statement with TOP. A dynamic TOP is probably a
better choice, as long as you can accept the security implications. (But it's
not worth to change the permissions only for this.)


I guess a common reason for wanting to do this is to implement paging in web
applications. SQL Server MVP Aaron Bertrand has an article which is the
standard reference on
this topic.


CREATE TABLE @tbl


The desire here is to create a table of which the name is determined at
run-time.


If we just look at the arguments against using dynamic SQL in stored
procedures, few of them are really applicable here. If a stored procedure has a
static CREATE TABLE in it, the user who runs the procedure must have
permissions to create tables, so dynamic SQL
will not change anything. Plan caching obviously has nothing to do with
it. Etc.


Nevertheless: Why? Why would you want to do this? If you are creating tables on the fly in your
application, you have missed some fundamentals about database design. In a
relational database, the set of tables and columns are supposed to be
constant. They may change with the installation of new versions, but not during
run-time.


Sometimes when people are doing this, it appears that they want to construct
unique names for temporary tables. This is completely unnecessary, as this is a
built-in feature in SQL Server. If you say:


CREATE TABLE #nisse (a int NOT NULL)

then the actual name behind the scenes will be something much longer, and no
other connections will be able to see this instance of #nisse.


If you want to create a permanent table which is unique to a user, but you
don't want to stay connected and therefore cannot use temp tables, it may be
better to create one table that all clients can share, but where the first
column is a key which is private to the client. I discuss this method a little
more closely in my article How to
Share Data between Stored Procedures
.


CREATE TABLE with Unknown Columns


Sometimes I see persons on the newsgroups that are unhappy, because they
create a temp table from dynamic SQL, and then they can't access it, because it
disappeared when the dynamic SQL exited. When told that they have to create the
table outside the dynamic SQL, they respond that they can't, because they don't
know the structure of the table until run-time.


One solution is to create a global temp table, one with two # in the name,
for instance ##temp. Such a table is visible to all processes (so you may have
to take precautions to make the name unique), and unless you explicitly drop it, it exists
until your process exits.


But the real question is: what are these guys up to? If you are
working with a relational database, and you don't know the structure of your
data until run-time, then there is something fundamentally wrong. As I have
never been able to fully understand what the underlying business requirements
are, I can't really provide any alternatives. But I would suggest that if you
need to go this road, you should seriously consider to run your SQL from a client
program. Because, all access
to that table would have to be through dynamic SQL, and composing
dynamic SQL strings is easier in languages with better string capabilities,
be that C#, VB or Perl.


OK, so there is one case where I can see people end up here, and that is if you
want to run a dynamic crosstab – which definitely isn't a very relational
operation, but neverthless is a very common user requirement. You may want to look at
RAC, a third-party product. I have not
used it myself, but I've heard reports from satisfied users.


Linked Servers


This is similar to parameterising the database name,
but in this case we want to access a linked server of which the name is
determined at run-time.


Two of the solutions for dynamic database names apply here as well:



  • On SQL 2005, the best solution is probably to use synonyms:
    CREATE SYNONYM myremotetbl FOR Server.db.dbo.remotetbl.

  • If you can confine the access to the linked server to a stored procedure
    call, you can build the SP name dynamically:
    SET @sp = @server + 'db.dbo.some_sp'
    EXEC @ret = @sp @par1, @par2...



If you want to join a local table with a remote table on some remote server,
determined in the flux of the moment, dynamic SQL is probably the best way if
you are on SQL 2000.
There exists however an alternative, although it's only usable in some
situations. You can use sp_addlinkedserver to define the linked server at
run-time,
as demonstrated by this snippet:


EXEC sp_addlinkedserver MYSRV, @srvproduct='Any',
@provider='SQLOLEDB', @datasrc=@@SERVERNAME
go
CREATE PROCEDURE linksrv_demo_inner WITH RECOMPILE AS
SELECT * FROM MYSRV.master.dbo.sysdatabases
go
EXEC sp_dropserver MYSRV
go
CREATE PROCEDURE linksrv_demo @server sysname AS
IF EXISTS (SELECT * FROM master..sysservers WHERE srvname = 'MYSRV')
EXEC sp_dropserver MYSRV
EXEC sp_addlinkedserver MYSRV, @srvproduct='Any',
@provider='SQLOLEDB', @datasrc=@server
EXEC linksrv_demo_inner
EXEC sp_dropserver MYSRV
go
EXEC linksrv_demo 'Server1'
EXEC linksrv_demo 'Server2'


There are two procedures. linksrv_demo_inner is the procedure where we
actually access the linked server. As the linked server must exist when the
procedure is created, I first create a dummy entry for MYSRV, which I subsequently
drop once the procedure has been created. (Not only must the linked server exist, it must also have the database and
tables that you access.) linksrv_demo is the outside interface which takes a
server name as a parameter, and then at run-time defines MYSRV to point to
@server.



The above is only possible under certain conditions:



  • The procedure must be run by someone who has privileges to set up
    linked servers, normally only the roles sysadmin and setupadmin
    have these permissions. Thus, plain users do not apply.

  • Since you change a
    server-wide definition, you cannot have several instances of the procedure
    running. (It goes without saying, that you should use the alias in this
    procedure only.)



As you can see in the example, I've added WITH RECOMPILE to linksrv_demo_inner.
This is a safety precaution, to prevent that a cached plan does not access a
different server. I don't think this is really necessary, as SQL Server should sense the
changed definition. In fact, you may not even have to split the code over two
procedures, but as they say, better safe than sorry.


OPENQUERY


The rowset functions OPENQUERY and OPENROWSET often calls for dynamic SQL. Their second argument
is an SQL string, and they do no accept variables.
(This is because the optimizer builds a plan for the distributed query when
the procedure is compiled.) So any single parameter you want to pass to the
SQL statement for that remote server requires you to use dynamic SQL. Since the
remote SQL string can include string literals, you
may have to deal with up to three
levels of nested quotes. If you don't watch out, you can spend a full day
looking at things like:


DECLARE @sql varchar(8000)
SELECT @sql = 'SELECT * FROM OPENQUERY(MYSRV, ' +
'''SELECT * FROM Northwind.dbo.Orders ' +
'WHERE CustomerID = N''''VINET'''''')'
PRINT @sql
EXEC(@sql)

and then try to find out if you might you have one ' too many or too
few.


Strict discipline is absolutely necessary when working with dynamic SQL for
OPENQUERY. The function quotestring()
that I showed you earlier can be of great help:


DECLARE @remotesql nvarchar(4000),
@localsql nvarchar(4000),
@state char(2)

SELECT @state = 'CA'
SELECT @remotesql = 'SELECT * FROM pubs.dbo.authors WHERE state = ' +
dbo.quotestring(@state)
SELECT @localsql = 'SELECT * FROM OPENQUERY(MYSRV, ' +
dbo.quotestring(@remotesql) + ')'

PRINT @localsql
EXEC (@localsql)

The built-in function quotename() is usually not useful here, as the SQL statement easily
can exceed the limit of 129 characters for the input parameter to
quotename()
.


On SQL 2005, you can use EXEC() to run an SQL statement on a
linked
server
. Since EXEC() at linked servers can take parameters, this can make
things considerably easier. Then again, you can join OPENQUERY with local
tables, so that only rows of interest are brought across the wire. This you
cannot do with EXEC().


Dynamic Column Widths


Say that you write a stored
procedure that is to present some data, and the GUI it is to be run from is
Query Analyzer or SQL Server Management Studio (presumably because it is a sysadmin procedure). To make the
output easy to digest, you want the column width to be so wide that no data
is truncated, but neither do you want any extraneous spaces. This is
something you can achieve with dynamic SQL. Typically you would use a temp table
to hold the data, in which case there are no permission issues.


Rather than giving an example, I refer you to the source code for the popular (but undocumented)
system procedure sp_who2. You can find the code by entering exec
master..sp_helptext sp_who2
.


Dynamic SQL and Maintenance Tasks


I've written this text with a main focus on application code, because it is
mainly in application tasks, bad usage of dynamic SQL can cause serious harm
by opening for SQL injection, poor
query-plan reuse, and result in code that is
difficult to read and maintain.


Here, I like to briefly discuss code that is for maintenance jobs, code that
run once a
night or once a week or even less frequently. Generally, for this sort of
code, dynamic SQL is almost always a fair game. Query
plans are rarely an issue. And if the code is to be run by users with
sysadmin
privileges, there are no permissions issues. The same applies to
code that does not require permissions outside the database, and is to be run
by users with db_owner privileges.


There are however, two points about SQL injection I like to make.



  1. If you are a DBA that writes some stored procedure to be run by junior
    operators that do not have sysadmin privilege themselves, you must of
    course take precaution against SQL injection, so that they don't outsmart
    you.

  2. If you write a job that performs operations on tables in
    every database, be careful to use quotename() when you build the SQL strings.
    This is particularly important if there are non-sysadmin users that own
    databases. A user could create a table name that injects an SQL command
    into
    your maintenance script when you run it. If you are the DBA at a hosting
    company, this is a risk that you definitely should not neglect.


Acknowledgements and Feedback


I like to thank the following persons who have provided valuable suggestions
and input for this article: SQL Server MVPs Tibor Karaszi, Keith Kratochvil,
Steve Kass, Umachandar Jaychandran, Hal Berenson and Aaron Bertrand, as well as Pankul Verma,
Anthony Faull, Karl Jones, Marcus
Hansfeldt, Jeremy Lubich and Simon Hayes.


I also like to thank the people who have made the effort to provide
translations of older versions of the article: SQL Server MVP Frank Kalis translated into
German
and Simon Hayes into
Spanish
. Tam Vu translated the article into
Vietnamese.


And not the least I like to thank all you people who have pointed out typos
and spelling errors. Just keep those letters and cards coming!


If you have suggestions for improvements or corrections on contents, language or
formatting, please mail me at
esquel@sommarskog.se
.
If you have technical questions that any knowledgeable person could answer, I encourage you to
post to any of the newsgroups microsoft.public.sqlserver.programming or comp.databases.ms-sqlserver.


Revision History


2008-06-06 – Added an example on how to deal with dynamic sort
order in the section on ORDER BY.


2008-03-18 – Added a section on how to handle a
dynamic column alias.


2006-12-27 – In the section Caching
Query Plans
, added a note
on forced parameterisation and a demo of the performance penalty for failing
to use parameterised queries.


2006-07-25 – Corrected syntax in example with
cursor variable
after comment from Anthony Faull.


2006-04-28 – A
Vietnamese translation
(of the old version) is now available.


2006-04-23 – Thoroughly reworked the article to cover SQL 2005 in
full, resulting in lots of new text, lots of old text dropped, and many
sections rearranged. I'm now
more strongly favouring sp_executesql over EXEC(), and
I put more stress on SQL
injection. I also stress the importance of using parameterised statements for
query-plan reuse, and I note that prefixing with dbo is essential for
query-plan reuse. The examples of cases where (not) to use dynamic SQL have
had an overhaul as well, if not equally drastic. I'm now giving a very quick
example of partitioned views for the sales + @yymm case. The article now also
includes snippets for
parameterised commands from VB6 and VB .Net.


2005-04-17 – Added example of EXEC +
sp_executesql
with OUTPUT parameter. Added use of nvarchar(max) on
SQL 2005 for quotestring and elsewhere.


2004-05-30
Spanish
translation now available.


2004-02-08
German translation now available. Minor language corrections.


2003-12-02
Korean
translation now available. Added example of using
cursor variable with dynamic SQL. Modified description
of first parameter to sp_executesql.


Back to my home page.


Back to News Articles
View Articles From Database related articles Category