
The Essence of Command Injection Attacks in Web Applications

Zhendong Su
University of California, Davis
su@cs.ucdavis.edu

Gary Wassermann
University of California, Davis
wassermg@cs.ucdavis.edu

Abstract
Web applications typically interact with a back-end database to retrieve persistent data and then present the data to the user as dynamically generated output, such as HTML web pages. However, this interaction is commonly done through a low-level API by dynamically constructing query strings within a general-purpose programming language, such as Java. This low-level interaction is ad hoc because it does not take into account the structure of the output language. Accordingly, user inputs are treated as isolated lexical entities which, if not properly sanitized, can cause the web application to generate unintended output. This is called a command injection attack, which poses a serious threat to web application security. This paper presents the first formal definition of command injection attacks in the context of web applications, and gives a sound and complete algorithm for preventing them based on context-free grammars and compiler parsing techniques. Our key observation is that, for an attack to succeed, the input that gets propagated into the database query or the output document must change the intended syntactic structure of the query or document. Our definition and algorithm are general and apply to many forms of command injection attacks. We validate our approach with SQLCHECK, an implementation for the setting of SQL command injection attacks. We evaluated SQLCHECK on real-world web applications with systematically compiled real-world attack data as input. SQLCHECK produced no false positives or false negatives, incurred low runtime overhead, and applied straightforwardly to web applications written in different languages.

Categories and Subject Descriptors D.2.4 [Software Engineering]: Software/Program Verification—Reliability, Validation; D.3.1 [Programming Languages]: Formal Definitions and Theory—Syntax; F.4.2 [Mathematical Logic and Formal Languages]: Grammars and Other Rewriting Systems—Parsing, Grammar Types

General Terms Algorithms, Experimentation, Languages, Reliability, Security, Verification

Keywords command injection attacks, web applications, grammars, parsing, runtime verification

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
POPL '06 January 11–13, 2006, Charleston, South Carolina, USA.
Copyright © 2006 ACM 1-59593-02702/06/0001...$5.00.

1. Introduction

Web applications are designed to present to any user with a web browser a system-independent interface to some dynamically generated content. They are ubiquitous. For example, when a user logs on to his bank account through a web browser, he is using a web database application. These applications normally interact with databases to access persistent data. This interaction is commonly done within a general-purpose programming language, such as Java, through an application programming interface (API), such as JDBC. A typical system architecture for applications is shown in Figure 1. It is normally a three-tiered architecture, consisting of a web-browser, an application server, and a back-end database server. Within the underlying general-purpose language, such an application constructs database queries, often dynamically, and dispatches these queries over an API to appropriate databases for execution. In such a way, a web application retrieves and presents data to the user based on the user's input as part of the application's functionality; it is not intended to be simply an interface for arbitrary interaction with the database.

Figure 1. A typical system architecture for web applications.

However, if the user's input is not handled properly, serious security problems can occur. This is because queries are constructed dynamically in an ad hoc manner through low-level string manipulations. This is ad hoc because databases interpret query strings as structured, meaningful commands, while web applications often view query strings simply as unstructured sequences of characters. This semantic gap, combined with improper handling of user input, makes web applications susceptible to a large class of malicious attacks known as command injection attacks.

We use one common kind of such attacks to illustrate the problem, namely SQL command injection attacks (SQLCIAs). An SQLCIA occurs when a malicious user, through specifically crafted input, causes a web application to generate and send a query that functions differently than the programmer intended. For example, if a database contains user names and passwords, the application may contain code such as the following:

    query = "SELECT * FROM accounts WHERE name='"
        + request.getParameter("name")
        + "' AND password='"
        + request.getParameter("pass") + "'";

This code generates a query intended to be used to authenticate a user who tries to log in to a web site. However, if a malicious user enters “badguy” into the name field and “' OR 'a'='a” into the password field, the query string becomes:

    SELECT * FROM accounts WHERE
    name='badguy' AND password='' OR 'a'='a'

whose condition always evaluates to true, and the user will bypass the authentication logic.

Command injection vulnerabilities continue to be discovered on large, real-world web applications [37], and the effects can be severe. A recent news article [23] described a major university whose student-application login page had a vulnerability much like the example shown above. Using appropriate input, an attacker could retrieve personal information about any of the hundreds of thousands of that school's applicants. The university had to notify every applicant whose records were in the database about the possibility that the applicant was now the victim of identity theft. This consequence was both an expense and a blow to public relations for the university.

The problem goes beyond simply failing to check input that is incorporated into a query. Even web applications that perform some checks on every input may be vulnerable. For example, if the application forbids the use of the single-quote in input (which may prevent legitimate inputs such as “O’Brian”), SQLCIAs may still be possible because numeric literals are not delimited with quotes. The problem is that web applications generally treat input strings as isolated lexical entities. Input strings and constant strings are combined to produce structured output (SQL queries) without regard to the structure of the output language (SQL).

A number of approaches to dealing with the SQLCIA problem have been proposed, but to the best of our knowledge, no formal definition for SQLCIAs has been given. Consequently, the effectiveness of these approaches can only be evaluated based on examples, empirical results, and informal arguments. This paper fills that gap by formally defining SQLCIAs and presenting a sound and complete algorithm to detect SQLCIAs using this definition combined with parsing techniques [1].

This paper makes the following contributions:
• A formal definition of a web application, and in that context the first formal definition of an SQLCIA.
• An algorithm for preventing SQLCIAs, along with proofs of its soundness and completeness. Both the definition and the algorithm apply directly to other settings that generate interpreted output (see Section 4).
• An implementation, SQLCHECK, which is generated from lexer- and parser-generator input files. Thus SQLCHECK can be modified for different dialects of SQL or different choices of security policy (see Section 3.1) with minimal effort.
• An empirical evaluation of SQLCHECK on real-world web applications written in PHP and JSP. Web applications of different languages were used to evaluate SQLCHECK's applicability across different languages. In our evaluation, we used lists of real-world attack- and legitimate-data provided by an independent research group [13], in addition to our own test data. These lists were systematically compiled and generated from sources such as CERT/CC advisories. SQLCHECK produced no false positives or false negatives. It checked in roughly 3ms per query, and thus incurred low runtime overhead.

The rest of this paper is organized as follows: Section 2 gives an overview of our approach and Section 3 formalizes it with definitions, algorithms, and correctness proofs. Section 4 discusses other settings to which our approach applies. Sections 5 and 6 present our implementation and an evaluation of the implementation, respectively. Section 7 discusses related work, and finally, Section 8 concludes.

2. Overview of Approach

Web applications have injection vulnerabilities because they do not constrain syntactically the inputs they use to construct structured output. Consider, for example, the JSP page in Figure 2. The context of this page is an online store. The website allows users to store credit card information so that they can retrieve it for future purchases. This page returns a list of a user's credit card numbers of a selected credit card type (e.g., Visa). In the code to construct a query, the quotes are “escaped” with the replace method so that any single quote characters in the input will be interpreted as literal characters and not string delimiters. This is intended to block attacks by preventing a user from ending the string and adding SQL code. However, cardtype is a numeric column, so if a user passes “2 OR 1=1” as the card type, all account numbers in the database will be returned and displayed.

    <%!
    // database connection info
    String dbDriver = "com.mysql.jdbc.Driver";
    String strConn = "jdbc:mysql://"
        + "sport4sale.com/sport";
    String dbUser = "manager";
    String dbPassword = "athltpass";

    // generate query to send
    String sanitizedName =
        replace(request.getParameter("name"),"'","''");
    String sanitizedCardType =
        replace(request.getParameter("cardtype"),
            "'","''");
    String query = "SELECT cardnum FROM accounts"
        + " WHERE uname='" + sanitizedName + "'"
        + " AND cardtype=" + sanitizedCardType + ";";

    try {
        // connect to database and send query
        java.sql.DriverManager.registerDriver(
            (java.sql.Driver)
            (Class.forName(dbDriver).newInstance()));
        java.sql.Connection conn =
            java.sql.DriverManager.getConnection(
                strConn, dbUser, dbPassword);
        java.sql.Statement stmt =
            conn.createStatement();
        java.sql.ResultSet rs =
            stmt.executeQuery(query);

        // generate html output
        out.println("<html><body><table>");
        while(rs.next()) {
            out.println("<tr> <td>");
            out.println(rs.getString(1));
            out.println("</td> </tr>");
        }
        if (rs != null) {
            rs.close();
        }
        out.println("</table> </body> </html>");
    } catch (Exception e)
        { out.println(e.toString()); }
    %>

Figure 2. A JSP page for retrieving credit card numbers.

We approach the problem by addressing its cause: we track through the program the substrings from user input and constrain those substrings syntactically. The idea is to block queries in which the input substrings change the syntactic structure of the rest of the query. Such queries are command injection attacks (SQLCIAs, in the context of database back-ends). We track the user's input by using meta-data, displayed as ‘⦇’ and ‘⦈,’ to mark the beginning and end of each input string. This meta-data follows the string through assignments, concatenations, etc., so that when a query is ready to be sent to the database, it has matching pairs of markers identifying the substrings from input. We call this annotated query an augmented query.

We want to forbid input substrings from modifying the syntactic structure of the rest of the query. To do this we construct an augmented grammar for augmented queries based on the standard grammar for SQL queries. In the augmented grammar, the only productions in which ‘⦇’ and ‘⦈’ occur have the following form:

    nonterm ::= ⦇ symbol ⦈

where symbol is either a terminal or a non-terminal. For an augmented query to be in the language of this grammar, the substrings surrounded by ‘⦇’ and ‘⦈’ must be syntactically confined. By selecting only certain symbols to be on the rhs of such productions, we can specify the syntactic forms permitted for input substrings in a query.

One reason to allow input to take syntactic forms other than literals is for stored queries. Some web applications read queries or query fragments in from a file or database. For example, Bugzilla, a widely used bug tracking system, allows the conditional clauses of queries to be stored in a database for later use. In this context, a tautology is not an attack, since the conditional clause simply serves to filter out uninteresting bug reports. Persistent storage can be a medium for second order attacks [2], so input from it should be constrained, but if stored queries are forbidden, applications that use them will break. For example, in an application that allows conditional clauses to be stored along with associated labels, a malicious user may store “val = 1; DROP TABLE users” and associate a benign-sounding label so that an unsuspecting user will retrieve and execute it.

We use a parser generator to build a parser for the augmented grammar and attempt to parse each augmented query. If the query parses successfully, it meets the syntactic constraints and is legitimate. Otherwise, it fails the syntactic constraints and either is a command injection attack or is meaningless to the interpreter that would receive it.

Figure 3 shows the architecture of our runtime checking system. After SQLCHECK is built using the grammar of the output language and a policy specifying permitted syntactic forms, it resides on the web server and intercepts generated queries. Each input that is to be propagated into some query, regardless of the input's source, gets augmented with the meta-characters ‘⦇’ and ‘⦈.’ The application then generates augmented queries, which SQLCHECK attempts to parse. If a query parses successfully, SQLCHECK sends it sans the meta-data to the database. Otherwise, the query is blocked.

Figure 3. System architecture of SQLCHECK.

3. Formal Descriptions

This section formalizes the notion of a web application, and, in that context, formally defines an SQLCIA.

3.1 Problem Formalization

A web application has the following characteristics relevant to SQLCIAs:
• It takes input strings, which it may modify;
• It generates a string (i.e., a query) by combining filtered inputs and constant strings. For example, in Figure 2, sanitizedName is a filtered input, and "SELECT cardnum FROM accounts" is a constant string for building dynamic queries;
• The query is generated without respect to the SQL grammar, even though in practice programmers write web applications with the intent that the queries be grammatical; and
• The generated query provides no information about the source of its characters/substrings.

In order to capture the above intuition, we define a web application as follows:

Definition 3.1 (Web Application). We abstract a web application $P : \langle \Sigma^*, \ldots, \Sigma^* \rangle \to \Sigma^*$ as a mapping from user inputs (over an alphabet $\Sigma$) to query strings (over $\Sigma$). In particular, $P$ is given by $\{\langle f_1, \ldots, f_n \rangle, \langle s_1, \ldots, s_m \rangle\}$ where
• $f_i : \Sigma^* \to \Sigma^*$ is an input filter;
• $s_i : \Sigma^*$ is a constant string.
The argument to $P$ is an $n$-tuple of input strings $\langle i_1, \ldots, i_n \rangle$, and $P$ returns a query $q = q_1 + \ldots + q_\ell$ where, for $1 \le j \le \ell$,

$$q_j = \begin{cases} s & \text{where } s \in \{s_1, \ldots, s_m\} \\ f(i) & \text{where } f \in \{f_1, \ldots, f_n\} \wedge i \in \{i_1, \ldots, i_n\} \end{cases}$$

That is, each $q_j$ is either a static string or a filtered input.

Definition 3.1 says nothing about control-flow paths or any other execution model, so it is not tied to any particular programming paradigm.

In order to motivate our definition of an SQLCIA, we return to the example JSP code shown in Figure 2. If the user inputs “John” as his user name and perhaps through a dropdown box selects credit card type “2” (both expected inputs), the generated query will be:

    SELECT cardnum FROM accounts WHERE uname='John'
    AND cardtype=2

As stated in Section 2, a malicious user may replace the credit card type in the input with “2 OR 1=1” in order to return all stored credit card numbers:

    SELECT cardnum FROM accounts WHERE uname='John'
    AND cardtype=2 OR 1=1
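
As a minimal sketch of the web application abstraction in Definition 3.1 applied to Figure 2, the Java class below builds the query from constant strings and filtered inputs. The class name Figure2QuerySketch and the helpers escapeQuotes and buildQuery are hypothetical names introduced here for illustration; escapeQuotes only assumes what the text says about the replace call (it doubles single quotes). The trailing semicolon that Figure 2 appends is left out so the output matches the queries displayed in Section 3.1.

```java
final class Figure2QuerySketch {
    /** Hypothetical stand-in for the replace(...) filter of Figure 2: doubles single quotes. */
    static String escapeQuotes(String input) {
        return input.replace("'", "''");
    }

    /** Plays the role of P in Definition 3.1: constant strings joined with filtered inputs. */
    static String buildQuery(String name, String cardType) {
        return "SELECT cardnum FROM accounts"
                + " WHERE uname='" + escapeQuotes(name) + "'"
                + " AND cardtype=" + escapeQuotes(cardType);
    }

    public static void main(String[] args) {
        // Expected input: the card type remains a single numeric literal.
        System.out.println(buildQuery("John", "2"));
        // Malicious input: it contains no quote, so the filter leaves it unchanged.
        System.out.println(buildQuery("John", "2 OR 1=1"));
        // A quote inside the name is doubled and therefore stays inside the string literal.
        System.out.println(buildQuery("O'Brian", "2"));
    }
}
```

The second call shows why quote escaping alone is not enough here: the filter has nothing to escape, yet the input contributes an OR operator to the query.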
Figure 4. Parse trees (panels a and b) for WHERE clauses of generated queries. Substrings from user input are underlined.

Figure 4 shows a parse tree for each query. Note that in Figure 4a, for each substring from input there exists a node in the parse tree whose descendant leaves comprise the entire input substring and no more: lit for the first substring and num_lit/value for the second, as shown with shading. No such parse tree node exists for the second input substring in Figure 4b. This distinction is common to all examples of legitimate vs. malicious queries that we have seen. The intuition behind this distinction is that the malicious user attempts to cause the execution of a query beyond the constraints intended by the programmer, while the normal user does not attempt to break any such constraints. We use this distinction as our definition of an SQLCIA. The definition relies on the notion of a parse tree node having an input substring as its descendants, and we formalize this notion as a valid syntactic form:

Definition 3.2 (Valid Syntactic Form). Let $G = \langle V, \Sigma, S, P \rangle$ be a context-free grammar with non-terminals $V$, terminals $\Sigma$, a start symbol $S$, and productions $P$. Let $U \subseteq V \cup \Sigma$. Strings in the sub-language $L$ generated by $U$ are called valid syntactic forms w.r.t. $U$. More formally, $L$ is given by:

$$L = (U \cap \Sigma) \cup \bigcup_{u \in U \cap V} L(\langle V, \Sigma, u, P \rangle)$$

where $L(G)$ denotes the language generated by the grammar $G$.

Definition 3.2 allows for a modifiable security policy: The set $U$ can be assigned such that $L$ includes only the syntactic forms that the application programmer wants to allow the user to supply. Having a definition for valid syntactic forms, we define SQL command injection attacks as follows:

Definition 3.3 (SQL Command Injection Attack). Given a web application $P$ and an input vector $\langle i_1, \ldots, i_n \rangle$, the following SQL query:

$$q = P(i_1, \ldots, i_n)$$

constructed by $P$ is an SQL command injection attack (SQLCIA) if the following conditions hold:
• The query string $q$ has a valid parse tree $T_q$;
• There exists $k$ such that $1 \le k \le n$ and $f_k(i_k)$ is a substring in $q$ and is not a valid syntactic form in $T_q$.

The first condition, that $q$ have a valid parse tree, prevents query strings that would fail to execute from being considered attacks. The second condition includes a clause specifying that valid syntactic forms are only considered within the context of the query's parse tree. This is necessary because the same substring may have multiple syntactic forms when considered in isolation. For example, “DROP TABLE employee” could be viewed either as a DROP statement or as string literal data if not viewed in the context of a whole query.

Note that these definitions do not include all forms of dangerous or unexpected behavior. Definition 3.1 provides no means of altering the behavior of the web application (e.g., through a buffer overflow). Definition 3.3 assumes that the portions of the query from constant strings represent the programmer's intentions. If a programmer mistakenly includes in the web application a query to drop a needed table, that query would not be considered an SQLCIA. Additionally, Definition 3.3 constrains the web application to use input only where a valid syntactic form is permitted. By Definition 3.2, a valid syntactic form has a unique root in the query's parse tree. Consider, for example, the following query construction:

    query = "SELECT * FROM tbl WHERE col " + input;

If the variable input had the value "> 5", the query would be syntactically correct. However, if the grammar uses a rule such as “e → e opr e” for relational expressions, then the input cannot have a unique root, and this construction will only generate SQLCIAs.

However, we believe these limitations are appropriate in this setting. By projecting away the possibility of the application server getting hacked, we can focus on the essence of the SQLCIA problem. Regarding the programmer's intentions, none of the literature we have seen on this topic ever calls into question the correctness of the constant portions of queries (except Fugue [9], Gould et al.'s work on static type checking of dynamically generated query strings [12], and our earlier work on static analysis for web application security [41], which consider this question to some limited degree). Additionally, programmers generally do not provide formal specifications for their code, so taking the code as the specification directs us to a solution that is fitting for the current practice. Finally, we have not encountered any examples either in the literature or in the wild of constructed queries where the input cannot possibly be a valid syntactic form.

    select_stmt  ::= SELECT select_list from_clause
                   | SELECT select_list from_clause where_clause
    select_list  ::= id_list
                   | *
    id_list      ::= id
                   | id , id_list
    from_clause  ::= FROM tbl_list
    tbl_list     ::= id_list
    where_clause ::= WHERE bcond
    bcond        ::= bcond OR bterm
                   | bterm
    bterm        ::= bterm AND bfactor
                   | bfactor
    bfactor      ::= NOT cond
                   | cond
    cond         ::= value comp value
    value        ::= id
                   | str_lit
                   | num
    str_lit      ::= ' lit '
    comp         ::= = | < | > | <= | >= | !=

Figure 5. Simplified grammar for the SELECT statement.

    select_stmt  ::= SELECT select_list from_clause
                   | SELECT select_list from_clause where_clause
    select_list  ::= id_list
                   | *
    id^a         ::= id
                   | ⦇ id ⦈
    id_list      ::= id^a
                   | id^a , id_list
    from_clause  ::= FROM tbl_list
    tbl_list     ::= id_list
    where_clause ::= WHERE bcond
    bcond        ::= bcond OR bterm
                   | bterm
    bterm        ::= bterm AND bfactor
                   | bfactor
    bfactor      ::= NOT cond^a
                   | cond^a
    cond^a       ::= cond
                   | ⦇ cond ⦈
    cond         ::= value comp value
    value        ::= id^a
                   | str_lit
                   | num^a
    num^a        ::= num
                   | ⦇ num ⦈
    lit^a        ::= lit
                   | ⦇ lit ⦈
    str_lit      ::= ' lit^a '
    comp         ::= = | < | > | <= | >= | !=

Figure 6. Augmented grammar for grammar shown in Figure 5. New/modified productions are shaded.

3.2 Algorithm

Given a web application $P$ and query string $q$ generated by $P$ and input $\langle i_1, \ldots, i_n \rangle$, we need an algorithm $A$ to decide whether $q$ is an SQLCIA, i.e., $A(q)$ is true iff $q$ is an SQLCIA. The algorithm $A$ must check whether the substrings $f_j(i_j)$ in $q$ are valid syntactic forms, but the web application does not automatically provide information about the source of a generated query's substrings. Since the internals of the web application are not accessible directly, we need a means of tracking the input through the web application to the constructed query. For this purpose we use meta-characters ‘⦇’ and ‘⦈,’ which are not in $\Sigma$. We modify the definition of the filters such that for all filters $f$,
• $f : (\Sigma \cup \{\llparenthesis, \rrparenthesis\})^* \to (\Sigma \cup \{\llparenthesis, \rrparenthesis\})^*$; and
• for all strings $\sigma \in \Sigma^*$, $f(\llparenthesis \sigma \rrparenthesis) = \llparenthesis f(\sigma) \rrparenthesis$.
By augmenting the input to $\langle \llparenthesis i_1 \rrparenthesis, \ldots, \llparenthesis i_n \rrparenthesis \rangle$, we can determine which substrings of the constructed query come from the input.

Definition 3.4 (Augmented Query). A query $q^a$ is an augmented query if it was generated from augmented input, i.e., $q^a = P(\llparenthesis i_1 \rrparenthesis, \ldots, \llparenthesis i_n \rrparenthesis)$.

We now describe an algorithm for checking whether a query is an SQLCIA. This algorithm is initialized once with the SQL grammar and a policy stating the valid syntactic forms, with which it constructs an augmented grammar.

Definition 3.5 (Augmented Grammar). Given a grammar $G = \langle V, \Sigma, S, P \rangle$ and a set $U \subseteq V \cup \Sigma$ specifying the valid syntactic forms, an augmented grammar $G^a$ has the property that an augmented query $q^a = P(\llparenthesis i_1 \rrparenthesis, \ldots, \llparenthesis i_n \rrparenthesis)$ is in $L(G^a)$ iff:
• The query $q = P(i_1, \ldots, i_n)$ is in $L(G)$; and
• For each substring $s$ that separates a pair of matching ‘⦇’ and ‘⦈’ in $q^a$, if all meta-characters are removed from $s$, $s$ is a valid syntactic form in $q$'s parse tree.

A natural way to construct an augmented grammar $G^a$ from $G$ and $U$ is to create a new production rule for each $u \in U$ of the form $u^a \to \llparenthesis u \rrparenthesis \mid u$, and replace all other rhs occurrences of $u$ with $u^a$. We give our construction in Algorithm 3.6.

Algorithm 3.6 (Grammar Augmentation). Given a grammar $G = \langle V, \Sigma, S, R \rangle$ and a policy $U \subseteq V \cup \Sigma$, we define $G$'s augmented grammar as:

$$G^a = \langle V \cup \{v^a \mid v \in U\},\ \Sigma \cup \{\llparenthesis, \rrparenthesis\},\ S,\ R^a \rangle$$

where $v^a$ denotes a fresh non-terminal. Given $rhs = v_1 \ldots v_n$ where $v_i \in V \cup \Sigma$, let $rhs^a = w_1 \ldots w_n$ where

$$w_i = \begin{cases} v_i^a & \text{if } v_i \in U \\ v_i & \text{otherwise} \end{cases}$$

$R^a$ is given by:

$$R^a = \{v \to rhs^a \mid v \to rhs \in R\} \cup \{v^a \to v \mid v \in U\} \cup \{v^a \to \llparenthesis v \rrparenthesis \mid v \in U\}$$
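
The construction in Algorithm 3.6 can be sketched in a few lines of Java. This is an illustration under stated assumptions, not the SQLCHECK implementation: the class name GrammarAugmentationSketch, the map-of-lists grammar representation, the "^a" suffix for fresh non-terminals, and the use of U+2987 and U+2988 as marker symbols are all choices made here for readability. The policy is passed as a list only so that the output order is predictable.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class GrammarAugmentationSketch {
    static final String OPEN = "\u2987";   // stands for the opening meta-character
    static final String CLOSE = "\u2988";  // stands for the closing meta-character

    /**
     * rules maps each non-terminal to its alternatives; each alternative is a list of symbols.
     * policy is the set U of symbols that user input is allowed to form.
     */
    static Map<String, List<List<String>>> augment(
            Map<String, List<List<String>>> rules, List<String> policy) {
        Map<String, List<List<String>>> augmented = new LinkedHashMap<>();
        // First part of R^a: copy every rule, replacing each rhs symbol v in U with v^a.
        for (Map.Entry<String, List<List<String>>> rule : rules.entrySet()) {
            List<List<String>> alternatives = new ArrayList<>();
            for (List<String> rhs : rule.getValue()) {
                List<String> rhsA = new ArrayList<>();
                for (String symbol : rhs) {
                    rhsA.add(policy.contains(symbol) ? symbol + "^a" : symbol);
                }
                alternatives.add(rhsA);
            }
            augmented.put(rule.getKey(), alternatives);
        }
        // Remaining parts of R^a: add v^a -> v and v^a -> OPEN v CLOSE for each v in U.
        for (String v : policy) {
            augmented.put(v + "^a", List.of(List.of(v), List.of(OPEN, v, CLOSE)));
        }
        return augmented;
    }

    public static void main(String[] args) {
        // A two-rule fragment of the grammar in Figure 5.
        Map<String, List<List<String>>> rules = new LinkedHashMap<>();
        rules.put("value", List.of(List.of("id"), List.of("str_lit"), List.of("num")));
        rules.put("str_lit", List.of(List.of("'", "lit", "'")));
        augment(rules, List.of("id", "num", "lit"))
                .forEach((lhs, alts) -> System.out.println(lhs + " ::= " + alts));
    }
}
```

Run on the fragment in main, the sketch rewrites value and str_lit to use id^a, num^a, and lit^a and adds the three new non-terminals, which mirrors the corresponding productions of Figure 6.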
Figure 7. Parse tree fragments (panels a and b) for an augmented query.

To demonstrate this algorithm, consider the simplified grammar for SQL's SELECT statement in Figure 5. This is the grammar used to generate the parse trees in Figure 4. If a security policy of $U = \{cond, id, num, lit\}$ is chosen, the result of Algorithm 3.6 is shown in Figure 6. Suppose the queries shown in Figure 4 were augmented. Using the augmented grammar, the parse tree for the first query would look the same as Figure 4a, except that the subtrees shown in Figures 7a and 7b would be substituted in for the first and second input strings, respectively. No parse tree could be constructed for the second augmented query.

A GLR parser generator [28] can be used to generate a parser for an augmented grammar $G^a$.

Algorithm 3.7 (SQLCIA Prevention). Here are the steps of our algorithm $A$ to prevent SQLCIAs and invalid queries:
1. Intercept augmented query $q^a$;
2. Attempt to parse $q^a$ using the parser generated from $G^a$;
3. If $q^a$ fails to parse, raise an error;
4. Otherwise, if $q^a$ parses, strip all occurrences of ‘⦇’ and ‘⦈’ out of $q^a$ to produce $q$ and output $q$.

3.3 Correctness

We now argue that the algorithms given in Section 3.2 are correct with respect to the definitions given in Section 3.1. Lemmas 3.8 and 3.9 prove the soundness and completeness respectively of Algorithm 3.6 for constructing augmented grammars. Using these lemmas, Theorem 3.10 proves the soundness and completeness of Algorithm 3.7 for preventing SQLCIAs.

Lemma 3.8 (Grammar Construction: Sound). Let $G^a$ be the augmented grammar constructed from grammar $G$ and set $U$. For all $\langle i_1, \ldots, i_n \rangle$, if $P(i_1, \ldots, i_n) \in L(G)$ and $P(i_1, \ldots, i_n)$ is not an SQLCIA, then

$$P(\llparenthesis i_1 \rrparenthesis, \ldots, \llparenthesis i_n \rrparenthesis) \in L(G^a)$$

Proof. Consider an arbitrary query $q = P(i_1, \ldots, i_n)$ for some $\langle i_1, \ldots, i_n \rangle$ such that $q \in L(G)$ and $q$ is not an SQLCIA. Because $q \in L(G)$, there exists a parse tree $T_q$ for $q$ from $G$'s productions $R$. For each parse tree node $v$ in $T_q$ with children $v_1, \ldots, v_m$, there exists a rule $v \to v_1 \ldots v_m \in R$. For each rule $v \to v_1 \ldots v_m \in R$, Algorithm 3.6 specifies a rule $v \to w_1 \ldots w_m \in R^a$ where $w_i = v_i^a$ if $v_i \in U$ and $w_i = v_i$ otherwise. For each $v_i \in U$, Algorithm 3.6 specifies a rule $v_i^a \to v_i$. Consequently, there exists a parse tree $T_q^a$ for $q$ from $G^a$'s productions $R^a$. By assumption, $q$ is not an SQLCIA, which by Definitions 3.2 and 3.3 means that for each $\sigma = f_j(i_j)$ used in the construction of $q$, there exists some parse tree node $v$ in $T_q$ such that $v \in U$ and $v$'s leaf-descendants are exactly $\sigma$. Find such a parse tree node $v$ for each $\sigma$. The construction of $T_q^a$ from $T_q$ creates a mapping from the parse tree nodes in $T_q$ to the parse tree nodes in $T_q^a$. Using that mapping, find the corresponding node $v$ in $T_q^a$. By the definition of meta-characters and of augmented queries, $P(\llparenthesis i_1 \rrparenthesis, \ldots, \llparenthesis i_n \rrparenthesis)$ produces a query identical to $q$, except that each $\sigma$ is replaced with $\llparenthesis \sigma \rrparenthesis$. Algorithm 3.6 specifies that there is a rule $v_i^a \to \llparenthesis v_i \rrparenthesis$ for each $v_i \in U$. These rules allow each $v$ identified above in $T_q^a$ to be replaced with $\llparenthesis v \rrparenthesis$. This results in the parse tree $T_{q^a}^a$, which proves that $q^a \in L(G^a)$.

Lemma 3.9 (Grammar Construction: Complete). Let $G^a$ be the augmented grammar constructed from grammar $G$ and set $U$. For all $P(\llparenthesis i_1 \rrparenthesis, \ldots, \llparenthesis i_n \rrparenthesis) = q^a \in L(G^a)$, $P(i_1, \ldots, i_n) = q \in L(G)$ and $q$ is not an SQLCIA.

Proof. Suppose for contradiction that there exists some $\langle i_1, \ldots, i_n \rangle$ such that $P(\llparenthesis i_1 \rrparenthesis, \ldots, \llparenthesis i_n \rrparenthesis) = q^a \in L(G^a)$, but $P(i_1, \ldots, i_n) = q$ is an SQLCIA. This implies that $q^a$ has a parse tree $T_{q^a}^a$ from $R^a$ in $G^a$. Because $q$ is an SQLCIA, there exists some substring $\llparenthesis \sigma \rrparenthesis$ in $q^a$ where $\sigma$ is not a valid syntactic form, i.e., no node $v$ in $T_{q^a}^a$ both has $\sigma$ as its descendant leaves and has $v \in U$. If $v \notin U$, then Algorithm 3.6 specifies no rule of the form $v^a \to \llparenthesis v \rrparenthesis \in R^a$. Consequently $T_{q^a}^a$ cannot exist, and this contradicts our initial assumption.

Theorem 3.10 (Soundness and Completeness). For all $\langle i_1, \ldots, i_n \rangle$, Algorithm 3.7 will permit query $q = P(i_1, \ldots, i_n)$ iff $q \in L(G)$ and $q$ is not an SQLCIA.

Proof. Step 2 of Algorithm 3.7 attempts to parse the augmented query $q^a = P(\llparenthesis i_1 \rrparenthesis, \ldots, \llparenthesis i_n \rrparenthesis)$. By Lemma 3.9 (in contrapositive form), if $q$ is an SQLCIA or if $q$ is not a syntactically correct SQL query, $q^a$ will fail to parse. If $q^a$ fails to parse, step 3 will prevent $q$ from being executed. By Lemma 3.8, if $q$ is syntactically correct and is not an SQLCIA, $q^a$ will parse. Step 4 causes the query to be executed.

3.4 Complexity

Theorem 3.11 (Time Complexity). The worst-case time bound on Algorithm 3.7 is:

$$\begin{cases} O(|q|) & \text{if } G^a \text{ is LALR} \\ O(|q|^2) & \text{if } G^a \text{ is not LALR but is deterministic} \\ O(|q|^3) & \text{if } G^a \text{ is non-deterministic} \end{cases}$$

Proof. These time bounds follow from known time-bounds for classes of grammars [1]. Achieving them for Algorithm 3.7 is contingent on the parser generator being able to handle each case without using an algorithm for a more expressive class of grammar.

4. Applications

Although we have so far focused on examples of SQL command injections, our definition and algorithm are general and apply to other settings that generate structured, meaningful output from user-provided input. We discuss three other common forms of command injections.
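
Before turning to those settings, it may help to see the check of Algorithm 3.7 in concrete form, since each setting below would reuse the same steps with a different grammar. The Java sketch below is a hand-written recursive-descent checker for only the WHERE-condition fragment of the augmented grammar in Figure 6, with the policy U = {cond, id, num, lit}. It is not SQLCHECK, which is generated from lexer and parser generator input files; the class name AugmentedWhereCheckSketch, the method permit, the token kinds, and the use of U+2987 and U+2988 as meta-characters are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AugmentedWhereCheckSketch {
    private static final String OPEN = "\u2987", CLOSE = "\u2988";
    // Token kinds: markers, comparison operators, numbers, quoted literals, words.
    // A quoted literal may carry markers just inside its quotes; '' is an escaped quote.
    private static final Pattern TOKEN = Pattern.compile(
            "(?<L>" + OPEN + ")|(?<R>" + CLOSE + ")|(?<OP><=|>=|!=|=|<|>)|(?<NUM>\\d+)"
            + "|(?<STR>'(?:" + OPEN + "(?:[^'" + OPEN + CLOSE + "]|'')*" + CLOSE
            + "|(?:[^'" + OPEN + CLOSE + "]|'')*)')"
            + "|(?<WORD>[A-Za-z_]\\w*)");

    private final List<String> kinds;
    private AugmentedWhereCheckSketch(List<String> kinds) { this.kinds = kinds; }

    /** Steps 2 to 4 of Algorithm 3.7 for a WHERE condition; an empty result means "blocked". */
    static Optional<String> permit(String augmentedCondition) {
        List<String> kinds = tokenize(augmentedCondition);
        if (kinds == null || new AugmentedWhereCheckSketch(kinds).bcond(0) != kinds.size()) {
            return Optional.empty();                                   // step 3: parse failure
        }
        return Optional.of(augmentedCondition.replace(OPEN, "").replace(CLOSE, "")); // step 4
    }

    private static List<String> tokenize(String s) {
        List<String> kinds = new ArrayList<>();
        Matcher m = TOKEN.matcher(s);
        for (int pos = 0; pos < s.length(); ) {
            if (Character.isWhitespace(s.charAt(pos))) { pos++; continue; }
            if (!m.region(pos, s.length()).lookingAt()) return null;   // unrecognized text
            if (m.group("WORD") != null) {
                String w = m.group("WORD").toUpperCase();
                kinds.add(w.equals("AND") || w.equals("OR") || w.equals("NOT") ? w : "ID");
            } else {
                for (String k : new String[] {"L", "R", "OP", "NUM", "STR"}) {
                    if (m.group(k) != null) kinds.add(k);
                }
            }
            pos = m.end();
        }
        return kinds;
    }

    private boolean is(int i, String kind) {
        return i >= 0 && i < kinds.size() && kinds.get(i).equals(kind);
    }

    // Each method returns the index just past the construct it parsed, or -1 on failure.
    private int bcond(int i) { int j = bterm(i); while (is(j, "OR")) j = bterm(j + 1); return j; }
    private int bterm(int i) { int j = bfactor(i); while (is(j, "AND")) j = bfactor(j + 1); return j; }
    private int bfactor(int i) { return condA(is(i, "NOT") ? i + 1 : i); }

    private int condA(int i) {                       // cond^a ::= cond | OPEN cond CLOSE
        int j = cond(i);
        if (j >= 0) return j;
        if (is(i, "L")) { j = cond(i + 1); if (is(j, "R")) return j + 1; }
        return -1;
    }

    private int cond(int i) {                        // cond ::= value comp value
        int j = value(i);
        return is(j, "OP") ? value(j + 1) : -1;
    }

    private int value(int i) {                       // value ::= id^a | str_lit | num^a
        if (is(i, "STR") || is(i, "ID") || is(i, "NUM")) return i + 1;
        if (is(i, "L") && (is(i + 1, "ID") || is(i + 1, "NUM")) && is(i + 2, "R")) return i + 3;
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(permit("uname='" + OPEN + "John" + CLOSE + "' AND cardtype=" + OPEN + "2" + CLOSE));
        System.out.println(permit("uname='" + OPEN + "John" + CLOSE + "' AND cardtype=" + OPEN + "2 OR 1=1" + CLOSE));
    }
}
```

With the two augmented conditions from the running example, the first call yields the stripped condition and the second yields an empty result, because the marked substring "2 OR 1=1" is neither a single num nor a single cond. The sketch uses greedy descent with one level of backtracking, which is enough for this fragment but is not a substitute for the generated GLR parser mentioned in Section 3.2.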
4.1 Cross Site Scripting
Web sites that display input data are subject to cross site scripting (XSS) attacks. This vulnerability is perhaps more widespread than SQLCIAs because the web application need not access a back-end database. As an example of XSS, consider an auction website that allows users to put items up for bid. The site then displays a list of item numbers where each item number is a link to a URL to bid on the corresponding item. Suppose that an attacker enters as the item to add:

    ><script>document.location=
    'http://www.xss.com/cgi-bin/cookie.cgi?
    '%20+document.cookie</script

When a user clicks on the attacker's item number, the text in the URL will be parsed and interpreted as JavaScript. This script sends the user's cookie to http://www.xss.com/, the attacker's website. Note that the string provided by the attacker is not a valid syntactic form, since the first character completes a preceding tag.

4.2 XPath Injection
A web application that uses an XML document/database for its back-end storage and accesses the data through dynamically constructed XPath expressions may be vulnerable to XPath injection attacks [19]. This is closely related to the problem of SQLCIAs, but the vulnerability is more severe because:

• XPath allows one to query all items of the database, while an SQL DBMS may not provide a "table of tables," for example; and
• XPath provides no mechanism to restrict access on parts of the XML document, whereas most SQL DBMSs provide a facility to make parts of the database inaccessible.

The following piece of ASP code is vulnerable to XPath injection:

    XPathExpression expr =
      nav.Compile("string(//user[name/text()='"
        +TextBox1.Text+"' and password/text()='"
        +TextBox2.Text+"']/account/text())");

Entering a tautology as in Figure 4b would allow an attacker to log in, but given knowledge of the XML document's node-set, an attacker could enter:

    NoUser'] | P | //user[name/text()='NoUser

where P is a node-set. The surrounding predicates would always be false, so the constructed XPath expression would return the string value of the node-set P. Such attacks can also be prevented with our technique.

4.3 Shell Injection
Shell injection attacks occur when input is incorporated into a string to be interpreted by the shell. For example, if the string variable filename is insufficiently sanitized, the PHP code fragment:

    exec("open(".$filename.")");

allows an attacker to execute arbitrary shell commands if filename is not a valid syntactic form in the shell's grammar. This vulnerability is not confined to web applications. A setuid program with this vulnerability allows a user with restricted privileges to execute arbitrary shell commands as root. Checking the string to ensure that each substring from input is a valid syntactic form would prevent these attacks.

5. Implementation
We implemented the query checking algorithm as SQLCHECK. SQLCHECK is generated using an input file to flex and an input file to bison. For meta-characters, we use two randomly generated strings of four alphabetic characters: one to represent '⦇' and the other to represent '⦈.' We made this design decision based on two considerations: (1) the meta-characters should not be removed by input filters, and (2) the probability of a user entering a meta-character should be low.

First, we selected alphabetic characters because some input filtering functions restrict or remove certain characters, but generally alphabetic characters are permitted. The common exceptions are filters for numeric fields which allow only numeric characters. In this case either the meta-characters can be added after applying a filter, or they can be stripped off leaving only numeric data which cannot change the syntactic structure of the generated query. We added them after the filter, where applicable.

Second, ignoring case, 26^4 = 456,976 different four-character strings are possible. To avoid using meaningful words as meta-characters, we forbid meta-characters from being represented as strings that occur in the dictionary. The default dictionary for ispell contains 72,421 words, and if these are forbidden for use as meta-characters, 384,555 unique strings remain. It is difficult to quantify precisely the probability of a user accidentally entering a substring identical to one of the strings used for meta-characters because of several unknown factors. If we assume (1) that 90% of the words that a user enters occur in the dictionary and the remaining 10% are chosen uniformly at random from non-dictionary words, (2) that the average number of words entered on a web session is 100, and (3) that the word length is 4, the probability of a user accidentally entering a meta-character string in one web session is:

\[ 1 - \left(1 - \frac{2}{26^4 - 72{,}421}\right)^{(0.1 \times 100)} \approx 0.000052 \]

This probability can be further reduced by using longer encodings of meta-characters. We expect that the actual probability is less than what is shown above, since the numbers chosen for the calculation were intended to be conservative. Also, in settings where input is least expected to occur in a dictionary (e.g., passwords), sequences of alphabetic characters are often broken up by numeric and "special" characters, and the same non-dictionary words are repeatedly entered. Section 6.3 addresses the possibility of a user guessing the meta-character encodings.

The input to flex requires roughly 70 lines of manually written C code to distinguish meta-characters from string literals, column/table names, and numeric literals when they are not separated by the usual token delimiters.

The algorithm allows for a policy to be defined in terms of which non-terminals in the SQL grammar are permitted to be at the root of a valid syntactic form. For the evaluation we selected literals, names, and arithmetic expressions to be valid syntactic forms. Additional forms can be added to the policy at the cost of one line in the bison input file per form, a find-and-replace on the added symbol, and a token declaration. Additionally, if the DBMS allows SQL constructs not recognized by SQLCHECK, they can be added straightforwardly by updating the bison input file. The bison utility includes a glr mode, which can be used if the augmented grammar is not LALR. For the policy choice used here, the augmented grammar is LALR.

6. Evaluation
This section presents our evaluation of SQLCHECK.

6.1 Evaluation Setup
To evaluate our implementation, we selected five real-world web applications that have been used for previous evaluations in the literature [13]. Each of these web applications is provided in mul-
tiple web-programming languages, so we used the PHP and JSP version of each to evaluate the applicability of our implementation across different languages. Although the notion of "applicability across languages" is somewhat qualitative, it is significant: the more language-specific an approach is, the less it is able to address the broad problem of SQLCIAs (and command injections in general). For example, an approach that involves using a modified interpreter [32, 31] is not easily applicable to a language like Java (i.e., JSP and servlets) because Sun is unlikely to modify its Java interpreter for the sake of web applications. To the best of our knowledge, this is the first evaluation in the literature run on web applications written in different languages.

Subject             Description                                   LOC    LOC   Query  Query  Metachar  External
                                                                  PHP    JSP  checks  sites     pairs     query
                                                                               added            added      data
Employee Directory  Online employee directory                   2,801  3,114       5     16         4        39
Events              Event tracking system                       2,819  3,894       7     20         4        47
Classifieds         Online management system for classifieds    5,540  5,819      10     41         4        67
Portal              Portal for a club                           8,745  8,870      13     42         7       149
Bookstore           Online bookstore                            9,224  9,649      18     56         9       121

Table 1. Subject programs used in our empirical evaluation.

Table 1 lists the subjects, giving for each subject its name, a brief description of its function, the number of lines of code in the PHP and JSP versions, the number of pairs of meta-characters added, the number of input sites, the number of calls to SQLCHECK added, and the number of points at which complete queries are generated. The number of pairs of meta-characters added was less than the number of input sites because in these applications, most input parameters were passed through a particular function, and by adding a single pair of meta-characters in this function, many inputs did not need to be instrumented individually. For a similar reason, the number of added calls to SQLCHECK is less than the number of points at which completed queries are generated: In order to make switching DBMSs easy, a wrapper function was added around the database's SELECT query function. Adding a call to SQLCHECK within that wrapper ensures that all SELECT queries will be checked. Calling SQLCHECK from the JSP versions requires a Java Native Interface (JNI) wrapper. We report both figures to indicate approximately the numbers of checks that need to be added for web applications of this size that are less cleanly designed. For this evaluation, we added the meta-characters and the calls to SQLCHECK manually; in the future, we plan to automate this task using a static flow analysis.

In addition to real-world web applications, the evaluation needed real-world inputs. To this end we used a set of URLs provided by Halfond and Orso. These URLs were generated by first compiling one list of attack inputs, which were gleaned from CERT/CC advisories and other sources that list vulnerabilities and exploits, and one list of legitimate inputs. The data type of each input was also recorded. Then each parameter in each URL was annotated with its type. Two lists of URLs were then generated, one ATTACK list and one LEGIT list, by substituting inputs from the respective lists into the URLs in a type consistent way. Each URL in the ATTACK list had at least one parameter from the list of attack inputs, while each URL in the LEGIT list had only legitimate parameters. Finally, the URLs were tested on unprotected versions of the web applications to ensure that the ATTACK URLs did, in fact, execute attacks and the LEGIT URLs resulted in normal, expected behavior.

The machine used to perform the evaluation runs Linux kernel 2.4.27 and has a 2 GHz Pentium M processor and 1 GB of memory.

6.2 Results

Language  Subject             Legitimate      Attacks         Timing (ms)
                              (attempted/     (attempted/     Mean    Std Dev
                              allowed)        prevented)
PHP       Employee Directory  660 / 660       3937 / 3937     3.230   2.080
PHP       Events              900 / 900       3605 / 3605     2.613   0.961
PHP       Classifieds         576 / 576       3724 / 3724     2.478   1.049
PHP       Portal              1080 / 1080     3685 / 3685     3.788   3.233
PHP       Bookstore           608 / 608       3473 / 3473     2.806   1.625
JSP       Employee Directory  660 / 660       3937 / 3937     3.186   0.652
JSP       Events              900 / 900       3605 / 3605     3.368   0.710
JSP       Classifieds         576 / 576       3724 / 3724     3.134   0.548
JSP       Portal              1080 / 1080     3685 / 3685     3.063   0.441
JSP       Bookstore           608 / 608       3473 / 3473     2.897   0.257

Table 2. Precision and timing results for SQLCHECK.

Table 2 shows, for each web application, the number of attacks attempted (using URLs from the ATTACK list) and prevented, the number of legitimate uses attempted and allowed, and the mean and standard deviation of times across all runs of SQLCHECK for that application. SQLCHECK successfully prevented all attacks and allowed all legitimate uses. Theorem 3.10 predicted this, but these results provide some assurance that SQLCHECK was implemented without significant oversight. Additionally, the timing results show that SQLCHECK is quite efficient. Round trip time over the Internet varies widely, but 80–100ms is typical. Consequently, SQLCHECK's overhead is imperceptible to the user, and is also reasonable for servers with heavier traffic.

In addition to the figures shown in Table 2, our experience using SQLCHECK provides experimental results. Even in the absence of an automated tool for inserting meta-characters and calls to SQLCHECK, this technique could be applied straightforwardly. Most existing techniques for preventing SQLCIAs either cannot make syntactic guarantees (e.g., regular expression filters) or require a tool with knowledge of the source language. For example, a type-system based approach requires typing rules in some form for each
construct in the source language. As another example, a technique that generates automata for use in dynamic checking requires a string analyzer designed for the source language. Forgoing the use of the string analyzer would require an appropriate automaton for each query site to be generated manually, which most web application programmers cannot/will not do. In contrast, a programmer without a tool designed for the source language of his choice can still use SQLCHECK to prevent SQLCIAs.

6.3 Discussions
We now discuss some of our design decisions and limitations of the current implementation.

First, we used a single policy U for all test cases. In practice we expect that a simple policy will suffice for most uses. In general, a unique policy can be defined for each pair of input site (by choosing a different pair of strings to serve as delimiters) and query site (by generating an augmented grammar according to the desired policy for each pair of delimiters). However, even if U were always chosen to be V ∪ Σ, SQLCHECK would restrict the user input to syntactic forms in the SQL language. In the case where user input is used in a comparison expression, the best an attacker can hope to do is to change the number of tuples returned; no statements that modify the database, execute external code, or return columns other than those in the column list will be allowed.

Second, because the check is based on parsing, it would be possible to integrate it into the DBMS's own parser. From a software engineering standpoint, this does not seem to be a good decision. Web applications are often ported to different environments and interface with different backend DBMSs, so the security guarantees could be lost without the programmer realizing it.

Finally, the test cases used for the evaluation were generated by an independent research group from real-world exploits. However, they were not written by attackers attempting to defeat the particular security mechanism we used. In its current implementation, our technique is vulnerable to an exhaustive search of the character strings used as delimiters. This vulnerability can be removed by modifying the augmenting step: in addition to adding delimiters, it must check for the presence of the delimiters within the input string. If the delimiters occur, it must "escape" them by prefixing them with some designated character. SQLCHECK must also be modified so that first, its lexer will not interpret escaped delimiters as delimiters, and second, it will remove the escaping character after parsing.

7. Related Work

7.1 Input Filtering Techniques
Improper input validation accounts for most security problems in database and web applications. Many suggested techniques for input validation are signature-based, including enumerating known "bad" strings necessary for injection attacks, limiting the length of input strings, or more generally, using regular expressions for filtering. An alternative is to alter inputs, perhaps by adding slashes in front of quotes to prevent the quotes that surround literals from being closed within the input (e.g., with PHP's addslashes function and PHP's magic quotes setting). Recent research efforts provide ways of systematically specifying and enforcing constraints on user inputs [5, 35, 36]. A number of commercial products, such as Sanctum's AppShield [34] and Kavado's InterDo [17], offer similar strategies. All of these techniques are an improvement over unregulated input, but they all have weaknesses. None of them can say anything about the syntactic structure of the generated queries, and all may still admit bad input; for example, regular expression filters may be under-restrictive. More significantly, escaping quotes can also be circumvented when subtle assumptions do not hold, as in the case of the second order attacks [2]. In the absence of a principled analysis to check these methods, security cannot be guaranteed.

7.2 Syntactic Structure Enforcement
Other techniques deal with input validation by enforcing that all input will take the syntactic position of literals. Bind variables and parameters in stored procedures can be used as placeholders for literals within queries, so that whatever they hold will be treated as literals and not as arbitrary code. SQLrand, a recently proposed instruction set randomization for SQL in web applications, has a similar effect [4]. It relies on a proxy to translate instructions dynamically, so SQL keywords entered as input will not reach the SQL server as keywords. The main disadvantages of such a system are its complex setup and security of the randomization key. Halfond and Orso address SQL injection attacks through first building a model of legal queries and then ensuring that generated queries conform to this model via runtime monitoring [13], following a similar approach to Wagner and Dean's work on Intrusion Detection Via Static Analysis [8]. The precision of this technique is subject to both the precision of the statically constructed model and the tokenizing technique used. Because of how their model is generated, user inputs are confined to statically defined syntactic positions.

These techniques for enforcing syntactic structure do not extend to applications that accept or retrieve queries or query-fragments, such as those that retrieve stored queries from persistent storage (e.g., a file or a database).

7.3 Static and Runtime Checking
Many real-world web applications have vulnerabilities, even though measures such as those mentioned above are used. Vulnerabilities exist because of insufficiency of the technique, improper usage, incomplete usage, or some combination of these. Therefore, black-box testing tools have been built for web database applications. One from the research community is called WAVES (Web Application Vulnerability and Error Scanner) [14]. Several commercial products also exist, such as AppScan [33], WebInspect [38], and ScanDo [17]. While testing can be useful in practice for finding vulnerabilities, it cannot be used to make security guarantees. Thus, several techniques based on static analysis or runtime checking have been proposed, most of which are based on the notion of "taintedness," similar to Perl's "tainted mode" [40]. In particular, there are two recent techniques using static analysis to track the flow of untrusted input through a program: one based on a type system [15] (similar to CQual [10]) and one based on a points-to analysis [24] (using a precise points-to analysis for Java [43] and policies specified in PQL [22, 26]). Both systems trust user filters, so they do not provide strong security guarantees. There is also recent work on runtime taint tracking [31, 32]. Pietraszek et al. suggest the use of meta-data for tracking the flow of input through filters [32]. The closest work to ours is by Buehrer et al. [6]. They bound user input, and at the point where queries are sent, they replace input by dummy literals and compare the parse trees of the original query and the substituted query. In this case, a lexer would suffice for the check, since input substrings must be literals. We do not address the question of completeness of usage (i.e., that all input and query sites in the application source code are augmented and checked, respectively). However, a web application programmer using SQLCHECK need not make the false-positive/false-negative tradeoffs that come with less rigorous approaches. Consequently, a guarantee of completeness of usage for SQLCHECK implies that SQLCIAs will not occur.

This work also relates to some recent work on security analysis for Java applications. Naumovich and Centonze propose a static analysis technique to validate role-based access control policies
in J2EE applications [30]. They use a points-to analysis to determine which object fields are accessed by which EJB methods to discover potential inconsistencies with the policy that may lead to security holes. Koved et al. study the complementary problem of statically determining the access rights required for a program or a component to run on a client machine [21] using a dataflow analysis [16, 18].

7.4 Meta-Programming
To be put in a broader context, our research can be viewed as an instance of providing runtime safety guarantee for meta-programming [39]. Macros are a very old and established meta-programming technique; this was perhaps the first setting where the issue of correctness of generated code arose. Powerful macro languages comprise a complete programming facility, which enable macro programmers to create complex meta-programs that control macro-expansion and generate code in the target language. Here, basic syntactic correctness, let alone semantic properties, of the generated code cannot be taken for granted, and only limited static checking of such meta-programs is available. The levels of static checking available include none, syntactic, hygienic, and type checking. The widely used cpp macro pre-processor allows programmers to manipulate and generate arbitrary textual strings, and it provides no checking. The programmable syntax macros of Weise & Crew [42] work at the level of correct abstract-syntax tree (AST) fragments, and guarantee that generated code is syntactically correct with respect (specifically) to the C language. Weise & Crew macros are validated via standard type checking: static type checking guarantees that AST fragments (e.g., Expressions, Statements, etc.) are used appropriately in macro meta-programs. Because macros insert program fragments into new locations, they risk "capturing" variable names unexpectedly. Preventing variable capture is called hygiene. Hygienic macro expansion algorithms, beginning with Kohlbecker et al. [20] provide hygiene guarantees. Recent work, such as that of Taha & Sheard [39], focuses on designing type checking of object-programs into functional meta-programming languages. There are also a number of proposals to provide type-safe APIs for dynamic SQL, including, for example Safe Query Objects [7], SQL DOM [27], and Xen [3, 29]. These proposals suggest better programming models, but require programmers to learn a new API. In contrast, our approach does not introduce a new API, and it is suited to address the problems in the enormous number of programs that use existing database APIs. There are also research efforts on type-checking polylingual systems [11, 25], but they do not deal with applications interfacing with databases such as web applications.

8. Conclusions and Future Work
In this paper, we have presented the first formal definition of command injection attacks in web applications. Based on this definition, we have developed a sound and complete runtime checking algorithm for preventing command injection attacks and produced a working implementation of the algorithm. The implementation proved effective under testing; it identified SQLCIAs precisely and incurred low runtime overhead. Our definition and algorithm are general and apply directly to other settings that produce structured, interpreted output.

Here are a few interesting directions for future work:

• First, we plan to experiment with other ways to evaluate SQLCHECK. A natural choice will be to use SQLCHECK in some online web applications to expose SQLCHECK to the real world. By logging the blocked and permitted queries, we hope to validate that it does not disrupt normal use and does not allow attacks. A more novel approach to evaluating SQLCHECK will be to generate queries with "place-holder" user inputs. Then, using a modified top-down parser, we will generate random inputs that, when put in place of the place-holder inputs, form syntactically correct queries. By feeding these randomly generated inputs to the web application, we will test SQLCHECK on randomly generated yet meaningful queries.
• Second, we plan to explore static analysis techniques to help insert meta-characters and calls to SQLCHECK automatically. The challenge will be to insert the meta-characters such that no constant strings are captured and the control-flow of the application will not be altered.
• Third, we plan to adapt our technique to other settings, for example, to prevent cross-site scripting and XPath injection attacks.

Acknowledgments
We thank William Halfond and Alex Orso for providing legitimate and attack data for use in our evaluation. We are also grateful to Earl Barr, Christian Bird, Prem Devanbu, Kyung-Goo Doh, Alex Groce, Ghassan Misherghi, Nicolas Rouquette, Bronis Supinski, Jeff Wu, and the anonymous POPL reviewers for useful feedback on drafts of this paper.

References
[1] A. Aho, R. Sethi, and J. Ullman. Compilers, Principles, Techniques and Tools. Addison-Wesley, 1986.
[2] C. Anley. Advanced SQL Injection in SQL Server Applications. An NGSSoftware Insight Security Research (NISR) publication, 2002. URL: http://www.nextgenss.com/papers/advanced_sql_injection.pdf.
[3] G. Bierman, E. Meijer, and W. Schulte. The essence of data access in Cω. In The 19th European Conference on Object-Oriented Programming (ECOOP), 2005. To appear.
[4] S. W. Boyd and A. D. Keromytis. SQLrand: Preventing SQL injection attacks. In International Conference on Applied Cryptography and Network Security (ACNS), LNCS, volume 2, 2004.
[5] C. Brabrand, A. Møller, M. Ricky, and M. I. Schwartzbach. Powerforms: Declarative client-side form field validation. World Wide Web, 3(4), 2000.
[6] G. T. Buehrer, B. W. Weide, and P. A. Sivilotti. Using parse tree validation to prevent SQL injection attacks. In Proceedings of the International Workshop on Software Engineering and Middleware (SEM) at Joint FSE and ESEC, Sept. 2005.
[7] W. R. Cook and S. Rai. Safe Query Objects: Statically Typed Objects as Remotely Executable Queries. In Proceedings of the 27th International Conference on Software Engineering (ICSE), 2005.
[8] D. Dean and D. Wagner. Intrusion detection via static analysis. In Proceedings of the IEEE Symposium on Research in Security and Privacy, Oakland, CA, May 2001. IEEE Computer Society, Technical Committee on Security and Privacy, IEEE Computer Society Press.
[9] R. DeLine and M. Fähndrich. The Fugue protocol checker: Is your software baroque? Technical Report MSR-TR-2004-07, Microsoft Research, Jan. 2004. http://research.microsoft.com/~maf/Papers/tr-2004-07.pdf.
[10] J. Foster, M. Fähndrich, and A. Aiken. A theory of type qualifiers. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 192–203, Atlanta, Georgia, May 1–4, 1999.
[11] M. Furr and J. S. Foster. Checking type safety of foreign function calls. In Proceedings of the ACM SIGPLAN 2005 Conference on Programming Language Design and Implementation, pages 62–72, 2005.
[12] C. Gould, Z. Su, and P. Devanbu. Static checking of dynamically generated queries in database applications. In Proceedings of the 25th International Conference on Software Engineering (ICSE), pages 645–654, May 2004.
[13] W. G. Halfond and A. Orso. AMNESIA: Analysis and Monitoring for NEutralizing SQL-Injection Attacks. In Proceedings of 20th ACM International Conference on Automated Software Engineering (ASE), Nov. 2005.
[14] Y.-W. Huang, S.-K. Huang, T.-P. Lin, and C.-H. Tsai. Web application security assessment by fault injection and behavior monitoring. In World Wide Web, 2003.
[15] Y.-W. Huang, F. Yu, C. Hang, C.-H. Tsai, D.-T. Lee, and S.-Y. Kuo. Securing web application code by static analysis and runtime protection. In World Wide Web, pages 40–52, 2004.
[16] J. B. Kam and J. D. Ullman. Global data flow analysis and iterative algorithms. Journal of the ACM, 23(1):158–171, 1976.
[17] Kavado, Inc. InterDo Vers. 3.0, 2003.
[18] G. A. Kildall. A unified approach to global program optimization. In Proceedings of the 1st Annual Symposium on Principles of Programming Languages (POPL), pages 194–206, Oct. 1973.
[19] A. Klein. Blind XPath Injection. Whitepaper from Watchfire, 2005.
[20] E. Kohlbecker, D. P. Friedman, M. Felleisen, and B. Duba. Hygienic macro expansion. In Conference on LISP and Functional Programming, 1986.
[21] L. Koved, M. Pistoia, and A. Kershenbaum. Access rights analysis for Java. In Proceedings of the 17th Annual Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA), pages 359–372, Nov. 2002.
[22] M. S. Lam, J. Whaley, V. B. Livshits, M. Martin, D. Avots, M. Carbin, and C. Unkel. Context-sensitive program analysis as database queries. In Proceedings of the ACM Conference on Principles of Database Systems (PODS), June 2005.
[23] R. Lemos. Flawed USC admissions site allowed access to applicant data, July 2005. http://www.securityfocus.com/news/11239.
[24] V. B. Livshits and M. S. Lam. Finding security vulnerabilities in Java applications with static analysis. In Usenix Security Symposium, Aug. 2005. To appear.
[25] K. J. L. Mark Grechanik, William R. Cook. Static checking of object-oriented polylingual systems. http://www.cs.utexas.edu/users/wcook/Drafts/FOREL.pdf, Mar. 2005.
[26] M. Martin, V. B. Livshits, and M. S. Lam. Finding application errors using PQL: a program query language. In 20th Annual ACM Conference on Object-Oriented Programming, Systems, Languages, Oct. 2005. To appear.
[27] R. A. McClure and I. H. Krüger. SQL DOM: compile time checking of dynamic SQL statements. In Proceedings of the 27th International Conference on Software Engineering, pages 88–96, 2005.
[28] S. McPeak. Elsa: An Elkhound-based C++ Parser, May 2005. http://www.cs.berkeley.edu/~smcpeak/elkhound/.
[29] E. Meijer, W. Schulte, and G. Bierman. Unifying tables, objects and documents, 2003.
[30] G. Naumovich and P. Centonze. Static analysis of role-based access control in J2EE applications. SIGSOFT Software Engineering Notes, 29(5):1–10, 2004.
[31] A. Nguyen-Tuong, S. Guarnieri, D. Greene, J. Shirley, and D. Evans. Automatically hardening web applications using precise tainting. In Twentieth IFIP International Information Security Conference (SEC'05), 2005.
[32] T. Pietraszek and C. V. Berghe. Defending against Injection Attacks through Context-Sensitive String Evaluation. In Proceedings of the 8th International Symposium on Recent Advances in Intrusion Detection (RAID), Sept. 2005.
[33] Sanctum Inc. Web Application Security Testing-Appscan 3.5. URL: http://www.sanctuminc.com.
[34] Sanctum Inc. AppShield 4.0 Whitepaper, 2002. URL: http://www.sanctuminc.com.
[35] D. Scott and R. Sharp. Abstracting application-level web security. In World Wide Web, 2002.
[36] D. Scott and R. Sharp. Specifying and enforcing application-level web security policies. IEEE Transactions on Knowledge and Data Engineering, 15(4):771–783, 2003.
[37] Security Focus. http://www.securityfocus.com.
[38] SPI Dynamics. Web Application Security Assessment. SPI Dynamics Whitepaper, 2003.
[39] W. Taha and T. Sheard. Multi-stage programming with explicit annotations. In Proceedings of the ACM SIGPLAN Symposium on Partial Evaluation and Semantics-Based Program Manipulation (PEPM), 1997.
[40] L. Wall, T. Christiansen, and R. L. Schwartz. Programming Perl (3rd Edition). O'Reilly, 2000.
[41] G. Wassermann and Z. Su. An Analysis Framework for Security in Web Applications. In Proceedings of the FSE Workshop on Specification and Verification of Component-Based Systems (SAVCBS 2004), pages 70–78, 2004.
[42] D. Weise and R. Crew. Programmable syntax macros. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 156–165, 1993.
[43] J. Whaley and M. S. Lam. Cloning-based context-sensitive pointer alias analysis using binary decision diagrams. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 131–144, June 2004.
