Get columns and tables in SQL script (.NET version)

The logic to find all tables and columns in SQL script is quite simple like this:

1. Parse SQL script.

2. Iterate all statements via TGSqlParser.sqlstatements and TCustomSqlStatement.ChildNodes().

3. Once a statement was found, get all table tokens belong to this statement via TCustomSqlStatement.TableTokens.

4. Once a table token was found, get all columns belong to this table via various TSourceToken properties explained belowed.

Input SQL:

SELECT e.last_name,
       e.department_id,
       d.department_name
FROM   employees e
       LEFT OUTER JOIN department d
         ON ( e.department_id = d.department_id ); 

Tables and columns:

Tables:
department
employees

Columns
department.department_id
department.department_name
employees.department_id
employees.last_name

If you want to print columns in the same order they appear in sql statement, just append “-s” parameter at command line of this demo, then it will generate following result:

pos        column          table         t-alias  schema   DB  Loc                                     
[  1, 10]  last_name       employees     e        							columnResult/select list                
[  2, 10]  department_id   employees     e        							columnResult/select list                
[  3, 10]  department_name department    d        							columnResult/select list                
[  6, 17]  department_id   employees     e        							join condition (e.department_id = d.department_id)
[  6, 35]  department_id   department    d        							join condition (e.department_id = d.department_id)

Since .NET version 2.2.1, user can set a callback function to TGSqlParser.OnMetaDatabaseTableColumn to provide detailed meta information from database to help SQL parser to determine relationship between column and tables.

SELECT
   Quantity,b.Time,c.Description
FROM
  (SELECT ID,Time FROM bTab) b
  INNER JOIN aTab a on a.ID=b.ID
  INNER JOIN cTab c on a.ID=c.ID

General SQL Parser can link column: Quantity to table: cTab automatically if your callback function says table: cTab includes column: Quantity. Without information provided by this callback function, General SQL Parser will link column: Quantity to table: aTab which is the first possible table.

Download this demo: C# demo

Java version of this document

ANSI C demo

In order to understand this demo listed, here are some conceptions you should know.

select table1.f1, t1.f2
from table1 t1
where t1.f3 = 1

Token ‘table1′ in from clause is a declared table token. Token ‘table1′ in select list is a referenced table token. Token ‘t1′ in from clause is a declared table alias token. Token ‘t1′ in select list and where clause is a referenced table alias token.

Only declared table token will be listed in TCustomSqlStatement.TableTokens, referenced table token, declared table alias token and referenced table alias token can be found in RelatedToken/RelatedTokens property of declared table token.

Temp table, table variable and declare CTE table will also be listed in TCustomSqlStatement.TableTokens. DBObjType property of TSourceToken cab be used to find out those tokens.

Following properties in TSourceToken Class had different meanings according to what’s kind of table type this token is.

  • table token is a declared table token
    • ParentToken: schema token if any
    • ChildToken is null
    • RelatedToken: table alias of this declared table token
    • RelatedTokens:
      • 1. referenced table token of this declared table
      • 2. columns attached to this declared table in-directly. (such as field in create table)
      • 3. columns not linked explicit with table like column f in this sql: select f from t
      • we can distinguish column token(2,3) and referenced token(1) of this declared table in RelatedTokens by using TSourceToken.DBObjType property, column token has vlaue: TDBObjType.ttObjField

    • RelatedSubquery is null.
  • table token is a referenced table token
    • ParentToken: schema token if any.
    • ChildToken: column attached to this referenced table token
  • table token is a declared table alias token
    • RelatedTokens: referenced tokens of this declared table alias.
    • RelatedSubquery, subquery if this table alias is linked to a subquery in from clause
  • table token is a referenced table alias token
    • ChildToken: column attached to this referenced table alias token.
  • table token is a temp table token
    • RelatedTokens, column attached to this table
  • a table variable token:(DBObjType = ttobjtableTemp)
    • RelatedTokens, columns attach to this table.
  • a declared CTE table token:(DBObjType = ttobjtablecte)
    • RelatedTokens: referenced table of this cte table
    • fields attach to this cte table
  • a referenced CTE table token
    • RelatedToken, alias of this refenced cte table
    • RelatedTokens: columns attached to this referened cte table
  • an alias of referenced cte table token
    • RelatedTokens: columns attach to this ref cte table

Download this demo: C# demo, Java demo , ANSI C demo