Why do SQL databases take strings as input?

Leros

Lifer
Jul 11, 2004
21,867
7
81
It seems that many of the vulnerabilities out there are due to SQL taking strings as its query (SQL injections). Why is there not a more structured query format that wouldn't allow such vulnerabilities?

The place that I just started working at has a database utility library that has functions for any kind of SQL query you could think of. Not only do these functions make writing queries easier (not having to build a string), they make it really easy to sanitize input.

So, can anyone explain to me the reason that the interface to SQL databases is string queries? I'm sure I'm missing something, but I don't see the point.

<--- SQL noob
 

Aikouka

Lifer
Nov 27, 2001
30,383
912
126
What way would you do it instead? Databases are meant for data access... you essentially tell the database what you want and it provides said data to you ( whether it be actual cell data or information about the database and/or its contents ). The only other thing could be trying to use OO to create an object that you call functions through to access the database, but this becomes really messy really fast.
 

boran

Golden Member
Jun 17, 2001
1,526
0
76
I presume most SQL database products have something akin to preared statements.

basically these go like this in older systems in pseudocode:
setStatement("select count(*) from users where username = ? and password = ?");
setParam(0, name);
setParam(1, pass);
executeStatement;

or a more modern variety in pseudo C#:
SqlCommand cmd = new SqlCommand("select count(*) from users where username = @user and password = @pass");
cmd.Parameters.Add(new SqlParameter("@pass", pass)); //deliberately out of order as example
cmd.Parameters.Add(new SqlParameter("@user", user));

this negates sql injection attacks completely. (unless the values are used somewhere else to build a sql statement with strings)
 

Emulex

Diamond Member
Jan 28, 2001
9,759
1
71
clean your stuff up before hand - best practice. i just store everything in strings in the first ETL - you'd be surprised what i get. data like linenumber being FOR JOE - you have a choice to kick it or accept it and flag the document for review. i'd rather not lose data. human error is so prevalent even in the strictest specs since middleware abound it translating this to that to this to that - specs get abused when humans put their touch into them.

sanitize string (remember unicode!) and shove it on an ETL table, then sort it out. can't stall the queue for one bad egg but you can stall the queue for one document thread.
 

esun

Platinum Member
Nov 12, 2001
2,214
0
0
What doesn't take strings as input? Any human-readable interface uses strings. Whether they look like "functions" or "SQL queries" really is irrelevant. And prepared statements, as mentioned by another poster, is the solution to security vulnerabilities, not making the command strings look different.
 

degibson

Golden Member
Mar 21, 2008
1,389
0
0
Databases intentionally take arbitrary data: byte-array-plus-length sense. Some features of databases allow data to take the form of byte-array-with-implicit-length-from-null-termination, i.e., a human-readable string, but that's just a convenience. Besides, if databases didn't allow arbitrary input, they couldn't store arbitrary data, defeating the point.

But the data storage format isn't usually the problem. Rather, the injection problems start showing up when folks use NULL-terminated features without proper sanitation.
You know, something like NAME="Robert'); DROP TABLE Students;" (http://xkcd.com/327/)
 

Markbnj

Elite Member <br>Moderator Emeritus
Moderator
Sep 16, 2005
15,682
14
81
www.markbetz.net
Databases intentionally take arbitrary data: byte-array-plus-length sense. Some features of databases allow data to take the form of byte-array-with-implicit-length-from-null-termination, i.e., a human-readable string, but that's just a convenience. Besides, if databases didn't allow arbitrary input, they couldn't store arbitrary data, defeating the point.

But the data storage format isn't usually the problem. Rather, the injection problems start showing up when folks use NULL-terminated features without proper sanitation.
You know, something like NAME="Robert'); DROP TABLE Students;" (http://xkcd.com/327/)

Ah, it's little Bobby Tables!
 

Train

Lifer
Jun 22, 2000
13,572
66
91
www.bing.com
Injection attacks can occur in any scenario where a context switch happens. This is not limited to databases, you could also "inject" malicious javascript exploiting the switch from a server language like C# or PHP to client side JavaScript.

OOP is no exception. Switching DB access to an OOP only API would not eliminate the possiblity of an injection type of attack. Especially nowadays with the wider use of dynamic languages and reflection... programmers try to avoid hard coding a bunch of columns to a UI, so they load/reflect a list of fields, then call the API dynamically from that. A malicious user could theoretically exploit that to cause the API to update a field/table it is not supposed to.
 

KIAman

Diamond Member
Mar 7, 2001
3,342
23
81
BTW, your database utility library functions send strings to the underlying RDMS.

You aren't asking the right question.
 

Ross Ridge

Senior member
Dec 21, 2009
830
0
0
BTW, your database utility library functions send strings to the underlying RDMS.

Which then has to deconstruct (parse) the string back into its component actions, modifiers and parameters.

You aren't asking the right question.

Oh, he's asking the right question, just to the wrong people. Most people who work with databases think of everything in terms of SQL queries and aren't able to consider working with a database in any other terms.

To answer the original poster's question, it's just simply how most relational databases (and all SQL databases by defintion) work. Ultimately any query has to be provided to the database as an SQL string. SQL queries aren't easy to parse, so they'll all immediately compile that SQL query into a more efficient form so it can executed and cached. Depsite this, there's usually no option to sumbit a query in more direct or efficient manner.

There is the one allowance most SQL databases have for injection attacks these days, and that's an API that supports parameterized statements, like boran already mentioned. Depending on the circumstances the query may just end up being sent as a single SQL string with the parameters properly escaped to avoid injection attacks. In some cases though the parameters can be sent seperately as binary data, avoiding both injection attacks and the need for them escaped, parsed and converted.
 

Cogman

Lifer
Sep 19, 2000
10,283
134
106
To answer the original poster's question, it's just simply how most relational databases (and all SQL databases by defintion) work. Ultimately any query has to be provided to the database as an SQL string. SQL queries aren't easy to parse, so they'll all immediately compile that SQL query into a more efficient form so it can executed and cached. Depsite this, there's usually no option to sumbit a query in more direct or efficient manner.

This, IMO, is the biggest issue. there really is little to no distinctions between SQL queries and the strings they are putting in. I think this could be done in a better way (MS's Linq expressions do a decent job in separating the two.)
 

Train

Lifer
Jun 22, 2000
13,572
66
91
www.bing.com
This, IMO, is the biggest issue. there really is little to no distinctions between SQL queries and the strings they are putting in. I think this could be done in a better way (MS's Linq expressions do a decent job in separating the two.)

I'm not sure what you mean? Linq to SQL is just syntactic sugar, as far as the database is concerned, it's still just getting a parameterized query.