The logic to find all tables and columns in SQL script is quite simple like this:
1. Parse SQL script.
2. Iterate all statements via TGSqlParser.sqlstatements and TCustomSqlStatement.ChildNodes().
3. Once a statement was found, get all table tokens belong to this statement via TCustomSqlStatement.TableTokens.
4. Once a table token was found, get all columns belong to this table via various TSourceToken properties explained belowed.
SELECT e.last_name, e.department_id, d.department_name FROM employees e LEFT OUTER JOIN department d ON ( e.department_id = d.department_id );
Tables and columns:
Tables: department employees Columns department.department_id department.department_name employees.department_id employees.last_name
If you want to print columns in the same order they appear in sql statement, just append “-s” parameter at command line of this demo, then it will generate following result:
pos column table t-alias schema DB Loc [ 1, 10] last_name employees e columnResult/select list [ 2, 10] department_id employees e columnResult/select list [ 3, 10] department_name department d columnResult/select list [ 6, 17] department_id employees e join condition (e.department_id = d.department_id) [ 6, 35] department_id department d join condition (e.department_id = d.department_id)
Since .NET version 2.2.1, user can set a callback function to TGSqlParser.OnMetaDatabaseTableColumn to provide detailed meta information from database to help SQL parser to determine relationship between column and tables.
SELECT Quantity,b.Time,c.Description FROM (SELECT ID,Time FROM bTab) b INNER JOIN aTab a on a.ID=b.ID INNER JOIN cTab c on a.ID=c.ID
General SQL Parser can link column: Quantity to table: cTab automatically if your callback function says table: cTab includes column: Quantity. Without information provided by this callback function, General SQL Parser will link column: Quantity to table: aTab which is the first possible table.
Download this demo: C# demo
In order to understand this demo listed, here are some conceptions you should know.
select table1.f1, t1.f2 from table1 t1 where t1.f3 = 1
Token ‘table1′ in from clause is a declared table token. Token ‘table1′ in select list is a referenced table token. Token ‘t1′ in from clause is a declared table alias token. Token ‘t1′ in select list and where clause is a referenced table alias token.
Only declared table token will be listed in TCustomSqlStatement.TableTokens, referenced table token, declared table alias token and referenced table alias token can be found in RelatedToken/RelatedTokens property of declared table token.
Temp table, table variable and declare CTE table will also be listed in TCustomSqlStatement.TableTokens. DBObjType property of TSourceToken cab be used to find out those tokens.
Following properties in TSourceToken Class had different meanings according to what’s kind of table type this token is.
we can distinguish column token(2,3) and referenced token(1) of this declared table in RelatedTokens by using TSourceToken.DBObjType property, column token has vlaue: TDBObjType.ttObjField