Mastering SQL Syntax: A Comprehensive Guide for 2024 and Beyond

Welcome to a journey of SQL mastery! In the ever-evolving landscape of data management, having a strong command over SQL syntax is crucial. As you embark on this comprehensive guide for 2024 and beyond, you are about to explore the intricate elements of SQL, unraveling its grammar and nuances. Whether you’re a seasoned SQL developer or a curious learner, this article will be your companion in navigating through keywords, identifiers, expressions, predicates, and the myriad clauses that define the language. Get ready to enhance your SQL skills and unlock new possibilities in database management. Let’s dive into the world of SQL syntax mastery!

The SQL syntax like English grammar. Just as an English sentence has multiple components, A SQL query can also be subdivided into several language elements: Keywords, Identifiers, Expressions, Predicates, Clauses, Queries and Statements.

The complete SQL language grammar is defined and maintained by ISO/IEC SC 32 as part of ISO/IEC 9075. Despite the existence of the standard, SQL code is not completely portable among different database systems without adjustments. You may check this article to see Why does SQL differ among various RDBMS when it is said to be an ANSI standard?

SQL syntax elements

1. Keywrods

Keywords in SQL play a crucial role, falling into two categories: reserved words and non-reserved keywords.

Reserved words: These are SQL’s powerhouse, representing special words like SELECT, INSERT, and DELETE that directly initiate SQL operations.

Non-reserved keywords: In specific contexts, non-reserved keywords hold unique significance and can double as identifiers in other scenarios. Often, these keywords align with built-in table and function names, such as BEFORE, COMMENT, and DEFERRABLE.

For example, in Oracle, reserved words include CONNECT BY and START WITH for hierarchical queries, while Snowflake introduces keywords like SNAPSHOT and COPY for unique data warehousing operations. This diversity underscores how different databases infuse their own flavor into SQL’s universal language.

Different RDBMS has different Reserved words and Non-reserved keywords list.

2. Identifiers

Here is the explantion about the identifiers on wikipedia:

Identifiers are names on database objects, like tables, columns and schemas. An identifier may not be equal to a reserved keyword, unless it is a delimited identifier. Delimited identifiers means identifiers enclosed in double quotation marks. They can contain characters normally not supported in SQL identifiers, and they can be identical to a reserved word, e.g. a column named YEAR is specified as “YEAR”

https://en.wikipedia.org/wiki/SQL_syntax
Regular identifier

A regular identifier is not case-sensitive; that is, if you create a database named SALES, you can call it by using any case combination of letters. For example, SALES, sales, SaLeS, and SALEs all match the database named SALES. The ANSI SQL standard specifies that systems convert all regular SQL identifiers to the corresponding uppercase characters, so the database system converts any regular identifier that you specify into uppercase characters when it is saved in the database, and also when the regular identifiers are used in query processing.

Regular identifiers can contain only letters (in any alphabet, not just the Latin alphabet), syllables (as in the Japanese Hiragana syllabary), ideographs, decimal digits, underscores, and dollar sign ($). Regular identifiers must begin with a letter; they cannot begin with a digit, underscore, or dollar sign. Regular identifiers also cannot be a SQL reserved word (as described in SQL reserved words and keywords).

Delimited identifer

A delimited identifier is also a name of a database object, but it is enclosed in double quotation marks and has special considerations. A delimited identifier is case-sensitive, so a database named “Sales” is not the same database as one named SALES, for example. The Oracle database does not convert delimited identifiers to the default system case. The Oracle also does not save the enclosing double quotation marks in the database.

Within the double quotation marks, a delimited identifier can include the same letters, syllables, ideographs, decimal digits, and underscores as a regular identifier, but it can also include spaces, special characters such as hyphens and percent signs, and SQL reserved keywords. Except for underscores, which are not allowed, a delimited identifier can begin with any of these letters, digits, or symbols.

In actual implementations by various database vendors, identifier is more complex than described above.

  • In addition to using double quotation marks to enclose the delimited identifiers, some databases use other characters to enclose the delimited identifiers. Microsoft SQL Server use [ and ] to enclose the delimited identifiers, while MySQL and Google BigQuery use ` to enclose the delimited identifiers.
  • In order to use the delimited identifier correctly, we need to know 2 things: 1) how database store the regular identifier and delimited identifier internally. 2) When comparing identifiers, is it case-sensitive? Let’s take Oracle database for example, regular identifier will be saved in UPPERCASE, while the delimited identifier will keep the case as original. Oracle compare identifer in a case-sensitive way. So, column1 and “column1” in Oracle is treated as 2 different column name.
  • The max length of an identifier also various from different database vendors.
  • The characters allowed in the identifier.

SQL Server delimited identifiers:

  1. [First Name]
  2. [Order Date]
  3. [Total Sales]

DB2 delimited identifiers:

  1. “Employee ID”
  2. “Shipping Address”
  3. “Product Code”

3. Expressions and Predicates

In the context of SQL, both expressions and predicates are fundamental components used to define conditions and criteria in queries. However, they serve slightly different purposes:

1. SQL Expression

An expression in SQL is a combination of one or more values, operators, and functions that yields a single value. Expressions are used to compute values, perform calculations, or concatenate strings. They can appear in various parts of a SQL statement, such as the SELECT list, WHERE clause, or the SET clause in an UPDATE statement.

SELECT salary * 1.1 AS increased_salary FROM employees WHERE department = 'IT';

In this example, salary * 1.1 is an expression that calculates an increased salary, and department = ‘IT’ is a predicate used in the WHERE clause.

2. SQL Predicate

A predicate in SQL is a condition that evaluates to either true, false, or unknown. Predicates are typically used in the WHERE clause of SELECT, UPDATE, DELETE, and other SQL statements to filter rows based on certain criteria. Predicates involve comparisons, logical operations, and other conditions that determine the inclusion or exclusion of rows in the result set.

SELECT product_name FROM products WHERE price > 100;

In this example, price > 100 is a predicate that filters rows where the price is greater than 100.

In summary, while an expression computes a value, a predicate defines a condition that determines which rows will be included in the result set. Expressions can be part of predicates, but not all expressions are predicates. Predicates play a crucial role in filtering and selecting data, contributing to the logic of SQL queries.

Databases such as Oracle, SQL Server don’t allow predicate used in select list, but Google BigQuery allow the predicte used in select list. Remember that the SQL standard allows for certain flexibility, and database vendors may introduce variations or extensions. Always refer to the specific documentation of the database system you are working with to understand its supported syntax.

Here are some typical expressions

Arithmetic and Mathematical Expressions:

  • Arithmetic Expressions
  • Mathematical Functions

String and Text Manipulation:

  • String Concatenation

Comparison and Logical Operations:

  • Comparison Expressions
  • Logical Expressions

Conditional Logic:

  • CASE Expressions

NULL Handling:

  • COALESCE Function
  • NULLIF Function

Type Conversion:

  • CAST Function (Type Conversion)

Set Operations and Predicates:

  • LIKE Predicate
  • IN Predicate
  • BETWEEN Predicate
  • EXISTS Predicate
  • UNIQUE Predicate

NULL and MISSING Values:

  • IS NULL Predicate
  • IS NOT NULL Predicate

Window Functions:

  • Window Functions

This is a sample SQL that applying various arithmetic and mathematical expressions to employee salary and bonus data, give you an idea how expression and predicate used in SQL statement.

SELECT
  employee_id,
  base_salary,
  bonus_percentage,
  base_salary + (base_salary * bonus_percentage / 100) AS total_salary_with_bonus,
  base_salary * 1.1 AS increased_salary,
  CASE WHEN bonus_percentage > 5 THEN 'High Bonus' ELSE 'Standard Bonus' END AS bonus_category,
  ROUND(base_salary * 1.05, 2) AS rounded_bonus,
  CEIL(base_salary * 1.05) AS rounded_up_bonus,
  FLOOR(base_salary * 1.05) AS rounded_down_bonus
FROM
  employees
WHERE
  bonus_percentage > 3;

4. Clauses

SQL clauses are components of SQL statements that define various aspects of the query or operation. These clauses are essential for constructing SQL statements to retrieve, manipulate, or manage data within a database. Let’s discuss how SQL clauses are defined in the ANSI SQL standards:

  1. SELECT Clause:
    • Defines the columns or expressions to retrieve.
  2. FROM Clause:
    • Specifies the tables or views from which to retrieve data.
  3. WHERE Clause:
    • Filters rows based on specified conditions.
  4. GROUP BY Clause:
    • Groups rows based on specified columns.
  5. HAVING Clause:
    • Filters groups based on specified conditions.
  6. ORDER BY Clause:
    • Sorts the result set based on specified columns or expressions.
  7. JOIN Clause:
    • Combines rows from two or more tables based on a related column.
  8. UPDATE Clause:
    • Modifies existing rows in a table.
  9. DELETE Clause:
    • Removes rows from a table.
  10. INSERT Clause:
    • Adds new rows to a table.
-- SELECT Clause: Retrieving specific columns
SELECT employee_id, first_name, last_name, salary
FROM employees
WHERE department = 'IT'
ORDER BY last_name DESC
LIMIT 5;

-- UPDATE Clause: Modifying data
UPDATE employees
SET salary = salary * 1.05
WHERE department = 'HR';

-- INSERT Clause: Adding new data
INSERT INTO employees (first_name, last_name, department, salary, hire_date)
VALUES ('Alice', 'Johnson', 'Marketing', 60000, '2023-01-15');

-- DELETE Clause: Removing data
DELETE FROM employees
WHERE department = 'Finance' AND salary < 50000;

Most databases implemention adheres to ANSI SQL standards for these clauses, ensuring compatibility and allowing users to write SQL statements that are portable across different database systems. While those databases may have specific features or optimizations, the core SQL clauses align with ANSI SQL standards, promoting consistency in SQL development practices. Always refer to the specific documentation for the version of the databases you are using for precise details and syntax.

5. Queries

A SQL query is a request for data or information from a database. It is a specific type of SQL statement that retrieves data or performs a query operation. In the context of most relational database management systems (RDBMS), including DB2, Oracle, and PostgreSQL, a SQL query typically refers to a SELECT statement.

A SQL statement is a complete, self-contained unit of execution in SQL. It can encompass various operations, including queries (SELECT), data modifications (INSERT, UPDATE, DELETE), and schema operations (CREATE, ALTER, DROP). A SQL statement represents a single action to be performed on the database.

While a SQL query is a specific type of SQL statement (typically SELECT), not all SQL statements are queries. SQL statements can include queries, data modifications, or schema operations. The term “SQL statement” is a more encompassing concept that refers to any complete SQL instruction.

In summary:

  • A SQL query is a type of SQL statement that specifically retrieves data.
  • A SQL statement is a broader term that includes queries and other SQL operations.

In a SELECT statement, you can use various SQL clauses to shape the result set. Here are several clauses that can be used in a typical SELECT statement:

SELECT
  column1,
  column2,
  -- Add more columns as needed

FROM
  your_table

-- Optional Clauses
WHERE
  condition1
GROUP BY
  column1
HAVING
  condition2
ORDER BY
  column1 ASC, column2 DESC
LIMIT
  10;
5.1 CTE

A Common Table Expression (CTE) in SQL is a named temporary result set that you can reference within the context of a SELECT, INSERT, UPDATE, or DELETE statement. The main usage of CTEs is to simplify complex queries, enhance readability, and facilitate the reuse of subqueries. Here are the main use cases for CTEs:

Improve Readability:

  • CTEs make SQL queries more readable by allowing you to break down a complex query into smaller, named, and more manageable parts. Each CTE serves as a self-contained, named query segment.
WITH EmployeeCTE AS (
  SELECT employee_id, first_name, last_name
  FROM employees
  WHERE department_id = 1
)
SELECT *
FROM EmployeeCTE;

Reuse Subqueries:

  • CTEs allow you to define a subquery once and reference it multiple times within the same query. This avoids duplicating the same complex logic and promotes code reuse.
WITH HighSalaryEmployees AS (
  SELECT employee_id, first_name, last_name
  FROM employees
  WHERE salary > 80000
)
SELECT * FROM HighSalaryEmployees
UNION
SELECT * FROM HighSalaryEmployees WHERE department_id = 2;

Recursive Queries:

  • CTEs support recursive queries, allowing you to perform operations on hierarchical or graph-like data structures, such as organizational charts or bill of materials.
WITH RecursiveCTE AS (
  SELECT employee_id, manager_id
  FROM employees
  WHERE manager_id IS NULL
  UNION ALL
  SELECT e.employee_id, e.manager_id
  FROM employees e
  JOIN RecursiveCTE r ON e.manager_id = r.employee_id
)
SELECT * FROM RecursiveCTE;

Window Functions:

  • CTEs are often used in conjunction with window functions to perform complex analytical calculations on a specific window of rows in a result set.
WITH SalaryRanking AS (
  SELECT employee_id, salary, 
         ROW_NUMBER() OVER (ORDER BY salary DESC) AS salary_rank
  FROM employees
)
SELECT * FROM SalaryRanking WHERE salary_rank <= 10;

In summary, the main usage of CTEs is to enhance the readability of SQL queries, enable the reuse of subqueries, support recursive operations, and facilitate the use of window functions for analytical purposes. They provide a cleaner and more modular approach to writing complex SQL queries.

5.2 Joins

In ANSI SQL, there are several types of joins, including INNER JOIN, LEFT JOIN (or LEFT OUTER JOIN), RIGHT JOIN (or RIGHT OUTER JOIN) and FULL JOIN (or FULL OUTER JOIN), CROSS JOIN, and NATURAL JOIN . Here are examples of each join type using practical SQL:

Assume we have two tables: employees and departments.

INNER JOIN:

An INNER JOIN returns only the rows where there is a match in both tables.

SELECT employees.employee_id, employees.first_name, departments.department_name
FROM employees
INNER JOIN departments ON employees.department_id = departments.department_id;
LEFT JOIN (LEFT OUTER JOIN):

A LEFT JOIN returns all rows from the left table and the matched rows from the right table. If there is no match, NULL values are returned for columns from the right table.

SELECT employees.employee_id, employees.first_name, departments.department_name
FROM employees
LEFT JOIN departments ON employees.department_id = departments.department_id;
RIGHT JOIN (RIGHT OUTER JOIN):

A RIGHT JOIN returns all rows from the right table and the matched rows from the left table. If there is no match, NULL values are returned for columns from the left table.

SELECT employees.employee_id, employees.first_name, departments.department_name
FROM employees
RIGHT JOIN departments ON employees.department_id = departments.department_id;
FULL JOIN (FULL OUTER JOIN):

A FULL JOIN returns all rows when there is a match in either the left or the right table. If there is no match, NULL values are returned for columns from the table without a match.

SELECT employees.employee_id, employees.first_name, departments.department_name
FROM employees
FULL JOIN departments ON employees.department_id = departments.department_id;
CROSS JOIN:

A CROSS JOIN returns the Cartesian product of two tables, combining each row from the first table with every row from the second table.

SELECT * FROM employees CROSS JOIN departments;
NATURAL JOIN:

A Natural Join is a type of INNER JOIN where the join condition is implicitly defined based on columns with the same name in both tables. It automatically matches columns with identical names and returns rows where these columns have equal values.

SELECT * FROM employees NATURAL JOIN departments;

In additon to the join type in ANSI SQL, Various database vendors introduce additional join types or optimizations beyond the ANSI SQL standard. Here are some of those joins: MERGE JOIN (Oracle), HASH JOIN (SQL Server), NESTED LOOPS JOIN (PostgreSQL), Semi-Join (EXISTS) (PostgreSQL, MySQL), Anti-Join (NOT EXISTS) (PostgreSQL, MySQL) and etc.

5.3 Oracle’s old-style join syntax

In Oracle, the traditional way of specifying joins involves using the WHERE clause with the (+) syntax, known as Oracle’s old-style join syntax. This syntax is still supported, but Oracle strongly recommends using the ANSI SQL JOIN syntax for clarity and maintainability. However, I’ll provide an example of both for better understanding.

Oracle’s old-style join syntax uses the (+) symbol to indicate the columns involved in the join.

SELECT e.employee_id, e.first_name, d.department_name
FROM employees e, departments d
WHERE e.department_id = d.department_id (+);

In this example, (+) on the right side of the equals sign indicates a right outer join. If (+) were on the left side, it would indicate a left outer join.

6. Statements

SQL Statements are categorized into Data Definition Language (DDL), Data Manipulation Language (DML), Data Control Language (DCL), Transaction Control Language (TCL), and Data Query Language (DQL).

Data Definition Language (DDL):

Responsible for defining and managing the structure of a database. DDL statements are used to create, modify, and delete database objects such as tables and indexes.

  1. CREATE TABLE
  2. ALTER TABLE
  3. DROP TABLE
  4. CREATE INDEX
  5. DROP INDEX
Data Manipulation Language (DML):

Concerned with the manipulation and retrieval of data within the database. DML statements are used to insert, update, delete, and query data stored in tables.

  1. SELECT
  2. INSERT
  3. UPDATE
  4. DELETE
Data Control Language (DCL):

Handles the permissions and access controls within a database. DCL statements grant or revoke privileges to users, controlling their ability to perform certain operations on database objects.

  1. GRANT
  2. REVOKE
Transaction Control Language (TCL):

Manages transactions within a database. TCL statements are used to control the beginning and ending of transactions, as well as to set savepoints within transactions for rollback purposes.

  1. COMMIT
  2. ROLLBACK
  3. SAVEPOINT
  4. ROLLBACK TO SAVEPOINT
Data Query Language (DQL):

Primarily focused on querying and retrieving data from the database. DQL consists mainly of the SELECT statement, which allows users to retrieve data based on specified conditions.

  1. SELECT

In summary, the article serves as a comprehensive guide to SQL syntax, covering fundamental elements, clauses, queries, joins, and statements, offering insights into the nuances and variations in SQL usage across different database systems.

Free and Online SQL Tools

There are several online and free SQL syntax-related tools that can help with writing, testing, and optimizing SQL queries. Here are some popular ones:

  1. SQLFiddle:
    • Website: SQLFiddle
    • Description: SQLFiddle allows you to experiment with SQL queries online. It supports various database systems, including MySQL, PostgreSQL, Oracle, and SQL Server.
  2. DB-Fiddle:
    • Website: DB-Fiddle
    • Description: Similar to SQLFiddle, DB-Fiddle supports multiple database systems, and it provides a collaborative environment for sharing and testing SQL code.
  3. W3Schools SQL Editor:
    • Website: W3Schools SQL Editor
    • Description: W3Schools offers an online SQL editor where you can practice SQL queries with a sample database. It’s a great resource for learning and testing SQL syntax.
  4. Mode Analytics:
    • Website: Mode Analytics
    • Description: Mode Analytics provides an SQL editor that allows you to write and run SQL queries. It’s a powerful tool with features for collaboration and data visualization.
  5. SQLite Online:
    • Website: SQLite Online
    • Description: This tool is specifically designed for SQLite. It allows you to execute SQLite queries online and visualize the results.
  6. Codecademy SQL Courses:
    • Website: Codecademy
    • Description: While not an editor per se, Codecademy offers interactive SQL courses where you can learn SQL syntax in a hands-on manner.
  7. DBeaver:
    • Website: DBeaver
    • Description: DBeaver is a free and open-source database tool that supports various databases. It provides a SQL editor with syntax highlighting and code completion features.
  8. SQLZoo:
    • Website: SQLZoo
    • Description: SQLZoo is an interactive platform that offers a series of challenges to practice SQL. It covers a variety of SQL concepts and is suitable for both beginners and experienced users.

These tools cater to different needs, from learning SQL to testing queries on specific database systems. Depending on your requirements, you can choose the tool that best fits your needs and preferences.

Newsletter Updates

Enter your email address below to subscribe to our newsletter