Just like in natural languages, SQL statements have a standard order of words. In English, a sentence uses the order subject + verb + object. For SQL statements, it depends on the action but directionally looks like1:
The relevance and risk of SQL injection.
The SQL programming language is almost 50 years old but still one of the most relevant technologies used today. Injection attacks rank third (3rd) on OWASP’s Top 10 categories and occupy the second (2nd) most occurrences of common weakness enumerations (CWE). Anything from XPATH to Hibernate2 may be impacted. SQL is one of the more common vulnerabilities within the broader category of injection-based attacks.
How SQL injection attacks work.
Injection attacks attempt to embed a malicious SQL statement inside a legitimate payload submitted to the web server. The effects of a successful injection attack may include data exfiltration, deletion, or modification. This inevitably leads to monetary losses from reputational damage, regulatory fines, litigation costs, court-ordered settlements, and an additional marketing/PR budget to repair the brand’s image.
The role of SQL keywords and false positives.
Like all other computer languages, SQL implementations (e.g. ANSI SQL 2003) have keywords. The total number reserved words depend on the implementation. For example, SQLite has almost 150 words that are reserved. The keywords are not to be used to name tables, indices, columns, etc. Examples: LIKE, SELECT, FROM, TABLE, UNION.
These keywords can be legitimately used within data posted to a web server and this is what drives false positives.
A real-world example: The furniture store scenario.
Imagine a webpage displaying types of furniture. The GET query string to populate a category page may be referenced by: “?furniture=table&extension=drop%20leaf”. Ostensibly, the values (table, drop leaf) will be inserted in a SQL statement returning all pieces of furniture of the selected style.
The injection attack attempts to insert characters or keywords to semantically change the statement. The malicious input may use a condition that always returns true (and 1=1) and/or in-line comments (double hyphen) to ignore the remaining portion of the text. Not commenting out the remainder of the string input may break the newly created malicious input. There are a lot of basic tutorials with examples of a simple injection attack so no need to recreate here.
The furniture store example is contrived but based on a real-world issue we solved. Many reserved words will be present in legitimate transactions posted to a web server. Simple pattern matching will throw too many false positives – legitimate requests are incorrectly blocked.
Relaxation rules: Addressing false positives.
Relaxation rules are a way to let the WAF know of a false-positive that can be safely ignored. Relaxation Rules override the default countermeasure. A violation that was initially logged or blocked, can be inspected by the user and allowed within the policy. These rules inspect a URL, Cookie, Form Field, or Header searching for the specific keyword, special character, or pattern that should be allowed in context.
An example of a relaxation rule.
One common Relaxation Rule would be to override the blocking of all occurrences of “like” to only allow “like” in a specific HTTP Request Header:
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36
Automating relaxation rules with UltraWAF.
For convenience, the relaxation rules can be created from the logs dashboard within UltraWAF. Having to manually configure all the relaxation rules may be unreasonable and a grammar would be more useful. The objective of the grammar is to distinguish between legal and illegal sequences of characters – SQL characters/words in this case.
UltraWAF offers a Boolean option to enable “SQL Injection Grammar,” allowing transactions to pass even when keywords are present but not in a valid order. The following sentence would not be blocked even though there are at least two reserved words:
3 drop leaf table in stock
The idea is to analyze the string recursively to determine if the keywords are in a syntactically correct statement. A simple string for a grammar to reject would be:
drop table leaf
Words used more frequently in English have a higher chance of being a false-positive in a WAF. Some words commonly found in false positives would be “and”, “like”, or, “null”. Instead of naive word-matching and counting, the position of the words within the sentence matters. Just like natural languages, SQL statements also have a grammar.
Graphically, these are often represented as a finite state machine. The SQLite drop table command looks like3:
The grammar can be evaluated using regular expressions. The first half of this regex may look like:
(?i)/drop table?:\s*(.*?)\s*(?:if exists?:\s*(.*))?$/
The impact of grammar on false positives and negatives.
The takeaway is that grammar can dramatically increase the accuracy of the SQL Injection rules by reducing the false positives (requests that are detected as an attack when they are not) and the false negatives (requests that are not detected as an attack when they are an attack). A calculation for this is the Matthews correlation Coefficient (MCC), which is the ratio between these measurements to find the “sweet spot” in tuning where you minimize false positives and false negatives4. The more accurate the WAF countermeasures and rules are, the lower the level of effort to tune and the lower the Total Cost of Ownership.
1. https://www.sqlite.org/lang_expr.html
2. https://owasp.org/www-project-top-ten/
3. https://sqlite.org/lang_droptable.html
4. https://en.wikipedia.org/wiki/Phi_coefficient