Java Practices->Keep SQL out of code

Keep SQL out of code

Structured Query Language (SQL) is text. Its natural home is a text file of some sort. Storing SQL as text, outside of compiled code, eases maintenance significantly:

it lets those who know SQL to contribute directly to a project, without having to know anything else about all the other tools used in the project.
for developers new to a project, having SQL separate from code makes it significantly easier to browse through SQL statements and learn the database structure.
it minimizes the ripple effects caused by changes to the database. For example, if a table name or column name changes, the fix is usually an easy search and replace in a single text file.
if an older application needs to be migrated to a new framework, then the SQL is readily available for porting to the new application.
if necessary, changes to SQL can even be made after deployment, by changing the text file and doing a restart or a refresh. This can occasionally be useful. For example, if users ask for a change in the sort order of some data, then the developer very likely has the option of implementing that change without a redeployment (and maybe even without a restart, if the tools support a refresh). However, some organizations would object to performing such changes directly in production.
adding support for a new relational database may become simply a matter of translating only those statements that happen to be non-portable.

The Pragmatic Programmer, by Hunt and Thomas, emphasizes this type of metadata:

"It forces you to create a more robust, abstract design by deferring details - deferring them all the way out of the program."

There's a possible objection to keeping SQL as text in a production environment: it can be examined for vulnerabilities by a malicious user with access to the machine where it resides. In that case, there is the option of translating the SQL-as-text (used in development) into some binary form (used in production).

Many SQL statements come in these three forms, all of which may be removed from compiled code and stored outside the program as text:

Fixed text:

SELECT * FROM Board ORDER BY Id;

Here, simply fetch a String from some textual source (such as a properties file), and pass it to the constructor of a Statement or PreparedStatement. This form is likely relatively rare, since most SQL statements seem to have at least one parameter.

Parameterized, with a single set of values :

INSERT INTO Board
(Id, Title, TimeZone, Language, CreationDate)
VALUES ('quark','High Energy Physics','GMT','english',NOW());

Here, fetch a String having '?' as place holders from some textual source :

INSERT INTO Board
(Id, Title, TimeZone, Language, CreationDate)
VALUES (?,?,?,?,NOW());

This String is then passed to the constructor of a PreparedStatement, and parameters are inserted using the methods of the PreparedStatement class. Note that PreparedStatement has significant advantages over an ordinary Statement.

Parameterized, with multiple sets of parameters :

INSERT INTO Vote
VALUES
('RASB', 3, 'Y', NOW()),
('RASB', 2, 'Y', NOW()),
('RASB', 1, 'Y', NOW()),
('RASB', 4, '?', NOW());

Using such a statement to add many records is advantageous, since it involves only one trip to the database. However, the PreparedStatement parameter mechanism cannot do this in one statement, since the number of added records is arbitrary. The usual style in this case is to use batched statements, in which a number of individual INSERT statements are collected together in a batch, and then executed as a unit.

Search Operations

There are cases in which it remains preferable to construct an SQL statement (or at least parts of it) in code. This is particularly true for many search operations, where the end user enters various search criteria into a form used to construct WHERE and ORDER BY clauses. It's often the case that a very large number of possible WHERE and ORDER BY clauses can be built from such user input. Manually enumerating all such possibilities is often undesirable. In such cases, building the SQL statement dynamically from user input seems preferable.

When constructing statements dynamically, however, you must guard against SQL injection attacks. Here's a simple technique for doing so:

start with a base SQL statement defined in a text file, as above.
build the dynamic parts of the SQL statement in code (usually the WHERE and ORDER BY clauses), but, at this stage, do not insert the data parameter values yet. That is, build a String containing '?' placeholders for data, in the usual form expected by a PreparedStatement.
next, create a PreparedStatement with the given SQL string. This will verify that the SQL statement is of valid form. As always, using PreparedStatement in this way will protect against SQL injection attacks.
finally, inject the data for the '?' placeholders in the usual way.

With the above technique, user input of search criteria is used for two distinct purposes: first dynamically building the core SQL with the desired WHERE and ORDER BY clauses (containing '?' placeholders), and secondly populating the '?' placeholders with data in the usual way.

Example

Here's an example of a .sql file which contains the SQL statements related to a single feature. All parameterized statements are expressed in a style suitable for a PreparedStatement, to avoid using any quoting conventions. It's taken from one of the WEB4J example applications.

LIST_RESTOS {
 SELECT Id, Name, Location, Price, Comment
 FROM Resto
 ORDER BY Name
}

FETCH_RESTO {
 SELECT Id, Name, Location, Price, Comment
 FROM Resto
 WHERE Id =? 
}

ADD_RESTO  {
  -- Id is an autoincrement field, populated automagically by the database.
 INSERT INTO Resto (Name, Location, Price, Comment) VALUES (?,?,?,?)
}

CHANGE_RESTO {
  UPDATE Resto SET Name=?, Location=?, Price=?, Comment=? WHERE Id=?
}

DELETE_RESTO {
  DELETE FROM RESTO WHERE Id=?
}

In the case of web applications, such files are safely placed in the WEB-INF directory, where they are not accessible to the user, and can be retrieved using ServletContext.getResourceAsStream.

The WEB4J framework places all SQL statements in one or more such .sql files, located anywhere under the WEB-INF directory. (Please see the javadoc for its data layer for more information.)