More techniques
Properties of SQL
In SQL there are many same elements as in programming: Types, statements and functions. We have seen many examples of SQL commands, but let's loon into the language a bit deeper.
Types
The types and their properties depend on the database system we are using. Usually there are several options even for integers and strings. In practice though, types INTEGER
for integers and TEXT
for strings go a long way.
TEXT vs. VARCHAR
Traditional SQL type for saving a string is VARCHAR
, where we give the maximum length of the string. For example VARCHAR(100)
means a string, which can have at most 100 characters.
This is one remainder of the programming from old times: back then the strings were often saved as an array, with a fixed amount of characters. In practice TEXT
is more versatile as we do not have to come up with the maximum amount of characters.
DATE, DATETIME, TIME, TIMESTAMP...
Very useful types of data are the types used for saving DATE
and TIME
. The naming and usage vary from database system to another, so it is best to check the details from the documentation of the databasey system we are using.
Statements
Statements are a part of SQL command, with a certain value. For example in query
SELECT price FROM Products WHERE name='radish';
Has four statements: name
, price
, 'radish'
and name='radish'
. In this the statement price
and price
get their values from the column of the row, statement 'radish'
is a string and the statement name='radish'
is a conditional statement to limit the query.
We can also use calculations and other operators the same way as in programming. For example the query
SELECT price*2 FROM Products WHERE name='radish';
returns the price of the radish doubled.
A good way to test how SQL statements work is to discuss with the database by doing queries which do not search information from any tables, but only calculate values for statements. This could be something like
sqlite> SELECT 2*(1+3);8sqlite> SELECT 'tes' || 'ts';testssqlite> SELECT 3 < 5;1
The first query calculates the value for statement 2*(1+3)
. The second query combines with the operator ||
the strings 'tes' and 'ts'. into 'tests'. Third query defines the truth value for statement 3 < 5
. The practice in SQL is that the truth value is given as an integer: 1 is true
, and 0 is false
.
Many items for SQL statements are already familar from programming:
- calculations:
+
,-
,*
,/
,%
- comparison:
=
,<>
,<
,<=
,>
,>=
- combining conditions:
AND
,OR
,NOT
In addition to these SQL has some more special features, whose knowledge is useful. Next we will look into some of them:
BETWEEN
Statement X BETWEEN a AND b
is true, if X
is at least a
and at most b
. For example the query
SELECT * FROM Products WHERE price BETWEEN 4 AND 6;
Returns the products, whose price is at least 6 and at maximum 6. We can also write the same query like this:
SELECT * FROM Products WHERE price >= 4 AND price <= 6;
CASE
A CASE
structure enables conditional statements. It can have one or more WHEN
and a possible ELSE
. For example the query
SELECT name,CASE WHEN price>5 THEN 'expensive'ELSE 'cheap'ENDFROM Products;
Retrieves the name for all the products and the information about them, if they are expensive or cheap. In this query the product is expensive if the price is over 5, otherwise it is cheap.
IN
Statement x IN (...)
is true, if x
is some of the given values. For example the query
SELECT * FROM Products WHERE name IN ('turnip','cucumber','celery');
Returns the products whose name is turnip, cucumber or celery.
LIKE
Statement s LIKE p
is true, if the string s
matches to the description p
. In the description we can use special characters _
(any single character) and %
(any amount of any characters). For example
SELECT * FROM Products WHERE name LIKE '%er%';
Returns the products whose name contain the string er
(such as cucumber and celery).
NULL
Handling NULL
values we have a separate syntax. For example
SELECT * FROM Products WHERE price IS NULL;
retrieves the products to which no price has been set, and the query
SELECT * FROM Products WHERE price IS NOT NULL;
retrieves the products to which the price has been set.
Functions
As a part of statements we can have functions, just like in programming. As with types, the functions available and their usage are dependant on the used database system and additional information should be checked from the database system documentation.
Here are some useful SQLite functions:
name function-------- ----------ABS(x) returns the absolute value for xCOALESCE(...) returns the first value from the list, which is not NULLLENGTH(s) returns the length of the string sLOWER(s) changes the characters in string s to lower caseMAX(x,y) returns the greater of integers x and yMIN(x,y) returns the smaller of integers x and yRANDOM() returns a random numberROUND(x,d) returns x rounded to d decimalsUPPER(s) changers the characters in string s to upper case
For example the query
SELECT * FROM Products WHERE LENGTH(name)=6;
Returns the products whose name are six characters long (such as carrot, radish, turnip and celery). The query
SELECT * FROM Products ORDER BY RANDOM();
Returns the all the rows in random order, since the order is not based on any column but on a random number.
Subqueries
Subqueries are a statement used as a part of a SQL command, whose value is determined by a query. We can build subqueries in the same manner as main queries and produce queries with them, which could be difficult to achieve otherwise.
Example
Let's look at a situation, where the database has a table for players' scores in the table Results
. We assume the table is following:
id name score---------- ---------- ----------1 Uolevi 1202 Maija 803 Liisa 1204 Aapeli 455 Kaaleppi 115
Now we want to know the players who have achieved the top score, and the query should return Uolevi and Liisa. We can achieve this by subquery like this:
SELECT name, score FROM ResultsWHERE score = (SELECT MAX(score) FROM Results);
And we get:
name score---------- ----------Uolevi 120Liisa 120
In this scenario the subquery is SELECT MAX(score) FROM Results
, which gives the largest score in the table, in this case 120. Notice, that the subquery has to be enclosed brackets, so that it does not mix up with the outer query.
Here's a bit more comlex query:
SELECT name, score FROM ResultsWHERE score >= 0.9*(SELECT MAX(score) FROM Results);
This query shows, that we can use the value from the subquery as part of a statement, just like any other value. The query retrieves the players, whose score is at most 10 percent lower than the the best score:
name score---------- ----------Uolevi 120Liisa 120Kaaleppi 115
Correlated or synchronized subquery
A subquery is also possible to create so, that its content is dependant or a row in the outer query. For example:
SELECT name, score,(SELECT COUNT(*) FROM Results WHERE score > R.score) betterFROM Results R;
The idea for this query is to calculate for each player, how many players have a better score than them. For example for Maija the answer is 3, since Uolevi, Liisa and Kaaleppi have better scores. We get the following result set:
name score better---------- ---------- ----------Uolevi 120 0Maija 80 3Liisa 120 0Aapeli 45 4Kaaleppi 115 2
Because the table Results is in two roles in the subquery, we have given the table Results additional name R. With this in the subquery it is clear that we want to count rows, whose score is better than the row score from the outer query.
Multiple values in subquery
Subquery can also return multiple values, as long as the result from the subquery are used in a location where this is allowed. This works in for example:
SELECT name FROM ProductsWHERE id IN (SELECT product_id FROM Purchases WHERE customer_id = 1);
This query retrieves all the names for products in customer 1's shopping cart. The subquery returns the id values for the products, which can be combined with the IN syntax.
Notice, that we could have done the query also like this:
SELECT P.nameFROM Products P, Purchases OWHERE P.id = O.product_id AND O.customer_id = 1;
Often a subquery is an alternative way to produce a query, which could just as well be done with for example properly designed multiple table query.
Limiting results
SQL query by default returns all the rows mathcing its conditions, but we can ask for only a part of the rows when needed. This is useful for example in applications, where we only want to show a part of the results per page.
Ways of limiting
When we add LIMIT x
to the end of a query, the query only returns x
first rows. For example LIMIT 3
means, that the query only shows the first three rows of the result set.
A more common form is LIMIT x OFFSET y
, which means that we want x
rows starting from position y
(with 0 indexing, of course). For example LIMIT 3 OFFSET 1
means that the result set contains second, third and fourth rows.
Example
Let's see an example query, which returns the products from cheapest to most expensive:
SELECT * FROM Products ORDER BY price;
We get the following result set:
id name price---------- ---------- ----------3 cucumber 25 celery 42 carrot 51 radish 74 turnip 8
We can get the cheapest three as follows:
SELECT * FROM Products ORDER BY price LIMIT 3;
And the result is:
id name price---------- ---------- ----------3 cucumber 25 celery 42 carrot 5
The next query in turn gets the three cheapest products, starting from the second cheapest:
SELECT * FROM Products ORDER BY price LIMIT 3 OFFSET 1;
And the result is:
id name price---------- ---------- ----------5 celery 42 carrot 51 radish 7
Limiting subquery
Let's look at a situation, where we want the combined price of the three cheapest products. The following query does not work like we would want:
SELECT SUM(price) FROM Products ORDER BY price LIMIT 3;
This returns the price of all the products in the table:
SUM(price)----------26
The problem is that the query forms the return set, where only one row containing the value 26 (sum of all products), after which we select the first three rows of the result set (in practice, the only row there is).
We can solve this with a subquery, where we get the three cheapest prices, and calculate the sum of these:
SELECT SUM(price) FROM (SELECT price FROM Products ORDER BY price LIMIT 3);
Now we get the desired result:
SUM(price)----------11