SQL - 0 Quering

Database

A collection of data for Creating, Reading, Updating and Deleting.

Every row in a table stores one item in that set. (Each book is a row) Every column has some attribute of that item. (title, author)

Limitations of Spreadsheets and Advantages of databases

  • Scale: Database can store items to millions and billions.
  • Update Capacity: Databases are able to handle multiple updates of data in a second.
  • Speed: Databases allow fast look-up of information. As they provide access to different algorithms to retrieve information.

Database Management Systems

DBMS is a way to interact with a database using a graphical interface or textual language. Ex: MySQL, Oracle, PostgreSQL, SQLite, Microsoft Access, MongoDB etc.

SQL

Structured Query Language, is a language used to interact with databases using CRUD operations. It has keywords that can be used to interact with the database. It is a query language that can be used to ask questions of data inside a database.

sqlite3 longlist.db to enter ctrl+L to clear terminal screen .quit to exit.

SQLite

SQLite uses a sub set of SQL, if code has to be ported to different system like MySQL, it is likely some syntax has to change.

SELECT

Allows for selecting some rows or all of it from a table inside a database.


SELECT * FROM "longlist";

SELECT "title" FROM "longlist";

SELECT "title", "author" FROM "longlist";

Double quotes " " are called SQL identifiers used for table and column names. SQL also has strings which are contents in table which uses ' ' to differentiate them from identifiers.

SQLite is case-insensitive but Case is used as a style convention to help in readability.

LIMIT

Used to limit the number of data contained in a database. LIMIT is used to specify the number of rows in the query output.

SELECT "title"
FROM "longlist"
LIMIT 10;

Taking just the first 10 titles in the database. Order is as per the entry in database.

WHERE

WHERE is used to select rows based on a condition; it will output the rows for which the specified condition is true.

SELECT "title", "author"
FROM "longlist"
WHERE "year" = 2023;

= shows equals and

!= <> both represent Not equal to. NOT is a used to negate a sign.

To select all books which are not hardcover books.

SELECT "title"
FROM "longlist"
WHERE "format" <> 'hardcover';

WHERE "format" != 'hardcover';

WHERE NOT "format" = 'hardcover'; 

'hardcover' is in single quotes to represent it is a string entry in a row of an item.

OR AND

AND and OR can be used to combine conditionals. () parentheses is used to indicate how to combine the conditions in a compound conditional statements.

To select titles and authors of books in 2022 or 2023.

SELECT "title", "author"
From "longlist"
WHERE "year" = 2022 OR "year" = 2023;

To select titles and authors of books in 2022 or 2023 that are not hardcovers.

SELECT "title", "format"
From "longlist"
WHERE ("year" = 2022 OR "year" = 2023) AND "format" != 'hardcover';

Here OR should be evaluated before AND

NULL

If the table has missing data or no data, then NULL is the type used to indicate that data does not have a value or does not exist in the table. Conditions used with NULL are IS NULL and IS NOT NULL

SELECT "title", "translator"
From "longlist"
WHERE "translator" IS NULL;

WHERE "translator" IS NOT NULL;

LIKE

This keyword is used to select data that roughly matches the specified string. Like finding strings that have specific keywords. LIKE can be combined with % which matches any character around a string and _ which matches a single character.

SELECT "title"
FROM "longlist"
WHERE "title" LIKE '%love%';

WHERE "title" LIKE 'Love%';

'%love%' To find any title that has the word love anywhere in it. % matches 0 or more characters before and after love.

'Love%' selects books that only begins with Love

SELECT "title"
FROM "longlist"
WHERE "title" LIKE 'The%';

WHERE "title" LIKE 'The %';

The% finds any title that begins with the but it might also include Them, they also as first three characters are same.

The % finds only title which has The in the beginning with space after it.

To find a title with one letter unknown like pyre pire

SELECT "title"
FROM "longlist"
WHERE "title" LIKE 'P_re';

_ matches to any single character. This can also get pore pure if they exist.

Querying for multiple words

SELECT "title"
FROM "longlist"
WHERE "title" LIKE 'The%love%';

WHERE "title" LIKE 'T____';

In SQLite, comparison of string with LIKE is by default case-insensitive, whereas comparison with = is case-sensitive. (Other DBMS or configuration of database might change this)

Ranges

Using comparison operators < > <= >= in our condition to match the range values.

Books between 2019 to 2022

SELECT "title", "autor", "year"
FROM "longlist"
WHERE "year" >= 2019 AND "year" <= 2022;

BETWEEN and AND also can specify inclusive range.

SELECT "title", "author"
FROM "longlist"
WHERE "year" BETWEEN 2019 AND 2022;

Selecting books based on number of votes and ratings

SELECT "title", "rating", "votes"
FROM "longlist"
WHERE "rating" > 4.0 AND "votes" > 10000;

WHERE "pages" < 300;

< > works on integers, floating points. These data types for columns are set while creating the database.

ORDER BY

ORDER BY allows or organizing / sorting the returned rows in some specified order. The default order will be ascending for ORDER BY, so this gets the lowest ranking books.

SELECT "title", "rating"
FROM "longlist"
ORDER BY "rating" LIMIT 10;

ASC and DESC are keywords that specify the order explicitly This gets the top 10 books by rating

SELECT "title", "rating"
FROM "longlist"
ORDER BY "rating" DESC LIMIT 10;

Using votes as tie breaker when there are similar ratings

SELECT "title", "rating", "votes"
FROM "longlist"
ORDER BY "rating" DESC, "votes" DESC
lIMIT 10;

DESC is mentioned for each column of ORDER BY

Aggregate Functions

COUNT, AVG, MIN, MAX, SUM are called aggregate functions and allow us to perform corresponding operations over multiple rows of data. By their vary nature they return only a single output - the aggregate value.

Average rating of all books

SELECT AVG("rating")
FROM "longlist";

Average rounded to two decimal points using ROUND ( value, decimal points)

SELECT ROUND(AVG("rating"), 2)
FROM "longlist";

To rename the column in which the result gets displayed by using AS

SELECT ROUND(AVG("rating"), 2) AS "average rating"
FROM "longlist";
SELECT MAX("rating"), MIN("rating")
FROM "longlist";

Counting total number of Books in Database

SELECT SUM("votes")
FROM "longlist";

Counting number of books in database (number of rows)

SELECT COUNT(*)
FROM "longlist";
SELECT COUNT("translator")
FROM "longlist";

this might give less number if few the rows has NULL value. COUNT does not count NULL values.

Using DISTINCT to count the unique entries only.

SELECT COUNT(DISTINCT "publishers")
FROM "longlist";

SELECT DISTINCT "publishers"
FROM "longlist";