{"id":3216,"date":"2026-05-02T17:42:36","date_gmt":"2026-05-02T09:42:36","guid":{"rendered":"https:\/\/www.dpriver.com\/blog\/?p=3216"},"modified":"2026-05-03T00:00:49","modified_gmt":"2026-05-02T16:00:49","slug":"bound-ast-logical-plan-and-relational-algebra-explained","status":"publish","type":"post","link":"https:\/\/www.dpriver.com\/blog\/bound-ast-logical-plan-and-relational-algebra-explained\/","title":{"rendered":"Bound AST, Logical Plan, and Relational Algebra Explained"},"content":{"rendered":"<p>For SQL parser engineers, the path from <strong>bound AST to logical plan to relational algebra<\/strong> is more than a sequence of compiler phases. It marks the boundary between syntax and semantics, between recognizing a valid SQL string and understanding what that SQL means.<\/p>\n<p>A raw parser AST can tell you that a query contains a <code>SELECT<\/code>, a <code>JOIN<\/code>, a <code>WHERE<\/code>, and a column reference named <code>id<\/code>. It cannot, by itself, tell you whether <code>id<\/code> means <code>orders.id<\/code>, <code>customers.id<\/code>, a select-list alias, a correlated outer reference, or an invalid ambiguous column. That requires binding. And once SQL has been bound, most serious analysis tasks (optimization, lineage, policy enforcement, dialect translation, impact analysis) benefit from an intermediate representation closer to relational algebra than to the surface grammar of SQL.<\/p>\n<p>This article explains the layers in depth:<\/p>\n<pre><code class=\"language-text\">SQL text\n  -&gt; parser AST\n  -&gt; bound \/ resolved AST\n  -&gt; Semantic IR for lineage and governance questions\n  -&gt; Logical IR \/ logical plan for relational semantics\n  -&gt; optimized logical plan\n  -&gt; physical plan, execution, or governance output\n<\/code><\/pre>\n<p>The exact names differ across systems. PostgreSQL speaks of parse trees, query trees, rewrite, and plans. 
Apache Calcite uses <code>SqlNode<\/code>, validation, and <code>RelNode<\/code>. Spark Catalyst uses unresolved and analyzed logical plans. DuckDB has parsed statements, binding, logical operators, optimization, and physical operators. GoogleSQL\/ZetaSQL distinguishes parse AST from resolved AST. But the underlying engineering problem is the same: SQL syntax must be converted into a semantically grounded representation before the system can reason about it reliably.<\/p>\n<h2>Executive summary<\/h2>\n<ul>\n<li>A <strong>parser AST<\/strong> is a syntactic tree. It preserves SQL grammar structure but usually does not resolve names, scopes, aliases, catalog objects, data types, or expression dependencies.<\/li>\n<li>A <strong>bound AST<\/strong> (also called resolved AST, analyzed tree, or semantic AST) attaches meaning: table bindings, column bindings, function\/operator resolution, types, scopes, aliases, and sometimes catalog metadata.<\/li>\n<li>A <strong>Logical IR<\/strong> represents a query using normalized operators such as scan, project, filter, join, aggregate, window, sort, union, and limit. In database engines this is often called a logical plan; in parser SDKs it may be exposed as a reusable intermediate representation.<\/li>\n<li>A <strong>Semantic IR<\/strong> is a task-oriented semantic representation for questions such as lineage, policy, risk, and catalog validation. It may use logical operators, but it does not have to be a full optimizer-grade logical plan.<\/li>\n<li><strong>Relational algebra<\/strong> is the mathematical normalization layer that lets a system reason about equivalence, transformation, optimization, lineage, and policy enforcement.<\/li>\n<li>For data lineage and SQL governance, you usually do not need a full database optimizer. 
You do need enough semantic binding plus a fit-for-purpose Logical IR or Semantic IR to answer: \u201cWhich output column depends on which input columns, under which operations and scopes?\u201d<\/li>\n<\/ul>\n<h2>1. Raw AST: necessary, but not sufficient<\/h2>\n<p>A SQL parser\u2019s first job is to decide whether text belongs to the language grammar and to produce a structured representation. For example:<\/p>\n<pre><code class=\"language-sql\">SELECT c.id, SUM(o.amount) AS revenue\nFROM customers c\nJOIN orders o ON o.customer_id = c.id\nWHERE o.status = 'paid'\nGROUP BY c.id;\n<\/code><\/pre>\n<p>A grammar-level AST might contain nodes roughly like:<\/p>\n<pre><code class=\"language-text\">SelectStmt\n  SelectList\n    ColumnRef(c.id)\n    Alias(FunctionCall(SUM, ColumnRef(o.amount)), revenue)\n  From\n    Join\n      TableRef(customers, alias=c)\n      TableRef(orders, alias=o)\n      On Equals(ColumnRef(o.customer_id), ColumnRef(c.id))\n  Where Equals(ColumnRef(o.status), Literal('paid'))\n  GroupBy ColumnRef(c.id)\n<\/code><\/pre>\n<p>This tree is useful. It preserves the surface structure of the query and gives visitor-based tools a reliable way to traverse expressions. Formatting, simple table extraction, statement classification, and many lint rules can start here.<\/p>\n<p>But for deeper analysis, this AST is under-specified. 
It does not necessarily know:<\/p>\n<ul>\n<li>whether <code>customers<\/code> resolves to a table, view, CTE, synonym, or temporary table;<\/li>\n<li>whether <code>c.id<\/code> resolves to <code>catalog.schema.customers.id<\/code>;<\/li>\n<li>whether <code>SUM<\/code> resolves to a built-in aggregate, an overloaded function, or a dialect-specific construct;<\/li>\n<li>whether <code>revenue<\/code> is visible to <code>ORDER BY<\/code>, <code>GROUP BY<\/code>, or another clause in this dialect;<\/li>\n<li>whether <code>id<\/code> would be ambiguous if written unqualified;<\/li>\n<li>what type <code>SUM(o.amount)<\/code> returns;<\/li>\n<li>whether an expression is aggregate, windowed, scalar, or invalid in its scope.<\/li>\n<\/ul>\n<p>This is why AST-only SQL tooling often fails on column-level lineage, permission checks, and Text-to-SQL validation. Syntax is not semantics.<\/p>\n<h2>2. Bound AST: where names acquire meaning<\/h2>\n<p>A <strong>bound AST<\/strong> is a syntax tree whose identifiers have been resolved against scopes and catalog metadata. Different systems use different terms: resolved AST, analyzed AST, query tree, semantic tree, or analyzer output. 
The essential work is the same.<\/p>\n<p>For each relation reference, the binder\/analyzer determines what it denotes:<\/p>\n<pre><code class=\"language-text\">TableRef(customers AS c)\n  -&gt; RelationBinding(\n       relation_id = catalog.sales.customers,\n       alias = c,\n       columns = [id, name, email, created_at, ...]\n     )\n<\/code><\/pre>\n<p>For each column reference, it determines the exact source column or derived expression:<\/p>\n<pre><code class=\"language-text\">ColumnRef(c.id)\n  -&gt; ColumnBinding(\n       relation = catalog.sales.customers,\n       column = id,\n       type = BIGINT,\n       scope = join_input_left\n     )\n<\/code><\/pre>\n<p>For functions and operators, binding may resolve overloads and infer types:<\/p>\n<pre><code class=\"language-text\">SUM(o.amount)\n  -&gt; FunctionBinding(\n       function = aggregate.sum(decimal) -&gt; decimal,\n       arguments = [orders.amount],\n       aggregate = true\n     )\n<\/code><\/pre>\n<p>For query blocks, binding builds scopes:<\/p>\n<pre><code class=\"language-text\">QueryBlock\n  input scope: c.*, o.*\n  group scope: c.id\n  output scope:\n    column 1 = c.id\n    column 2 = SUM(o.amount) AS revenue\n<\/code><\/pre>\n<p>Binding is where many \u201csimple\u201d SQL cases become non-trivial. 
A robust binder must handle at least:<\/p>\n<ul>\n<li>table aliases and column aliases;<\/li>\n<li>CTEs and recursive CTEs;<\/li>\n<li>derived tables and lateral references;<\/li>\n<li>correlated subqueries;<\/li>\n<li>view expansion or view references;<\/li>\n<li>star expansion (<code>*<\/code>, <code>t.*<\/code>, <code>EXCEPT<\/code>, <code>REPLACE<\/code> variants in some dialects);<\/li>\n<li>ambiguous and unqualified columns;<\/li>\n<li>dialect-specific visibility rules for aliases;<\/li>\n<li>aggregate and window scope rules;<\/li>\n<li>case folding, quoting, and identifier normalization;<\/li>\n<li>temporary tables, session state, search path, database\/schema resolution;<\/li>\n<li>function\/operator overloads and type coercion.<\/li>\n<\/ul>\n<p>This layer is the difference between saying \u201cthe query mentions <code>amount<\/code>\u201d and saying \u201cthe output column <code>revenue<\/code> is derived from <code>sales.orders.amount<\/code> through an aggregate <code>SUM<\/code> after filtering <code>sales.orders.status = 'paid'<\/code>.\u201d<\/p>\n<h2>3. Logical plan: from SQL grammar to relational operators<\/h2>\n<p>A bound AST is still shaped like SQL syntax. 
A logical plan is shaped like query semantics.<\/p>\n<p>The earlier SQL query can be represented as a tree of logical operators:<\/p>\n<pre><code class=\"language-text\">Aggregate(group=[c.id], output=[c.id, SUM(o.amount) AS revenue])\n  Filter(condition=(o.status = 'paid'))\n    Join(type=inner, condition=(o.customer_id = c.id))\n      Scan(table=sales.customers AS c)\n      Scan(table=sales.orders AS o)\n<\/code><\/pre>\n<p>This form is deliberately less concerned with the spelling of the SQL and more concerned with the meaning of the dataflow:<\/p>\n<ul>\n<li><code>Scan<\/code> introduces base relations.<\/li>\n<li><code>Join<\/code> combines relations under a predicate.<\/li>\n<li><code>Filter<\/code> removes rows.<\/li>\n<li><code>Aggregate<\/code> groups rows and computes aggregate expressions.<\/li>\n<li><code>Project<\/code> chooses and computes output expressions.<\/li>\n<li><code>Window<\/code> computes analytic functions over partitions and orderings.<\/li>\n<li><code>Union<\/code>, <code>Intersect<\/code>, and <code>Except<\/code> combine compatible relation outputs.<\/li>\n<li><code>Sort<\/code>, <code>Limit<\/code>, and <code>Offset<\/code> affect ordering and cardinality.<\/li>\n<\/ul>\n<p>A parser AST may distinguish many syntactic forms that are semantically equivalent. A logical plan attempts to normalize them into a smaller operator vocabulary. For example, a <code>WHERE<\/code> predicate, a join predicate, and a subquery predicate may remain distinct in the AST but become filter\/join conditions in a plan. This makes transformations easier.<\/p>\n<p>At this point it is useful to distinguish two terms that are often mixed together:<\/p>\n<ul>\n<li><strong>Logical IR<\/strong> is the normalized relational representation: scans, projections, filters, joins, aggregates, windows, set operations, and expression nodes. 
It is the representation a database optimizer or SQL analysis engine can transform systematically.<\/li>\n<li><strong>Semantic IR<\/strong> is a purpose-built semantic representation for answering product-level questions: column lineage, filter influence, join influence, sensitive-field access, policy violations, risk scoring, and catalog-aware validation.<\/li>\n<\/ul>\n<p>A Semantic IR may be derived from a bound AST, from a Logical IR, or from both. It may also be intentionally flatter than a full logical plan. For example, a governance-oriented Semantic IR might store a <code>StatementGraph<\/code> with <code>RelationSource<\/code>, <code>Projection<\/code>, <code>Filter<\/code>, <code>Join<\/code>, <code>Aggregation<\/code>, <code>OutputColumnMapping<\/code>, and <code>ExpressionDependency<\/code> records. That is not enough to choose an index or produce a physical execution strategy, but it is enough to answer whether <code>revenue<\/code> depends on <code>orders.amount<\/code>, whether the value passed through <code>SUM<\/code>, whether the row set was filtered by <code>orders.status<\/code>, and whether the result depends on a join predicate.<\/p>\n<p>This distinction matters for SQL parser and lineage products. A database engine usually needs a complete logical plan because it must execute the query. A SQL governance engine may instead need a lightweight Semantic IR that preserves evidence, confidence, diagnostics, and source locations while avoiding the complexity of a cost-based optimizer.<\/p>\n<h2>4. Relational algebra: the normalization contract<\/h2>\n<p>Relational algebra matters because it gives query systems a language of equivalence. Codd\u2019s relational model introduced the mathematical foundation: relations, projections, selections, joins, set operations, and transformations over them. 
Modern SQL is much larger than classical relational algebra, but the optimizer and analyzer still rely on algebraic ideas.<\/p>\n<p>Typical relational operators map naturally to SQL constructs:<\/p>\n<table>\n<thead>\n<tr>\n<th>SQL concept<\/th>\n<th>Relational\/logical operator<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td><code>FROM table<\/code><\/td>\n<td>scan \/ relation variable<\/td>\n<\/tr>\n<tr>\n<td><code>SELECT a, b<\/code><\/td>\n<td>projection<\/td>\n<\/tr>\n<tr>\n<td><code>WHERE p<\/code><\/td>\n<td>selection \/ filter<\/td>\n<\/tr>\n<tr>\n<td><code>JOIN ... ON p<\/code><\/td>\n<td>join<\/td>\n<\/tr>\n<tr>\n<td><code>GROUP BY<\/code><\/td>\n<td>aggregation \/ grouping<\/td>\n<\/tr>\n<tr>\n<td><code>HAVING<\/code><\/td>\n<td>filter after aggregation<\/td>\n<\/tr>\n<tr>\n<td><code>UNION<\/code><\/td>\n<td>union<\/td>\n<\/tr>\n<tr>\n<td><code>EXCEPT<\/code><\/td>\n<td>difference<\/td>\n<\/tr>\n<tr>\n<td><code>INTERSECT<\/code><\/td>\n<td>intersection<\/td>\n<\/tr>\n<tr>\n<td>subquery<\/td>\n<td>nested logical expression, semi-join, anti-join, apply, or scalar subplan<\/td>\n<\/tr>\n<tr>\n<td>window function<\/td>\n<td>window operator over partition\/order frame<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The algebraic representation enables rules such as:<\/p>\n<ul>\n<li>push filters below joins when safe;<\/li>\n<li>combine adjacent projections;<\/li>\n<li>remove unused columns;<\/li>\n<li>reorder inner joins under associativity and commutativity assumptions;<\/li>\n<li>convert subqueries into semi-joins or anti-joins;<\/li>\n<li>push projections into scans;<\/li>\n<li>recognize common subexpressions;<\/li>\n<li>reason about output column dependencies.<\/li>\n<\/ul>\n<p>For an optimizer, these transformations improve performance. For a lineage or governance engine, the same representation improves correctness. 
If a column is removed by projection, introduced by aggregation, or derived through a window function, the plan representation should make that explicit.<\/p>\n<h2>5. What mature systems do<\/h2>\n<p>The architecture appears repeatedly in mature SQL engines and frameworks, although each system names the phases differently.<\/p>\n<h3>PostgreSQL: raw parse tree, parse analysis, rewrite, planner<\/h3>\n<p>PostgreSQL is a useful example because it does <strong>not<\/strong> normally describe its internal pipeline using the product-neutral terms \u201cLogical IR\u201d and \u201cSemantic IR.\u201d Instead, PostgreSQL uses its own internal names: raw parse tree, query tree, rewritten query tree, and plan tree. The mapping is still useful for SQL parser and lineage engineers.<\/p>\n<p>PostgreSQL documents the parser stage as consisting of two parts: grammar\/lexer parsing and a transformation process. The raw parser checks syntax and builds a <strong>parse tree<\/strong>. The transformation process performs parse analysis and builds a <strong>query tree<\/strong>. PostgreSQL describes this query tree as the structure that represents information about the tables, columns, and other objects referenced by the query.<\/p>\n<p>In the terminology of this article, PostgreSQL\u2019s <strong>query tree after parse analysis<\/strong> is closest to a <strong>bound AST \/ semantic tree<\/strong>:<\/p>\n<pre><code class=\"language-text\">SQL text\n  -&gt; raw parse tree                -- grammar-shaped syntax\n  -&gt; query tree after analysis     -- names, scopes, relation references, target lists\n<\/code><\/pre>\n<p>This is where PostgreSQL has moved beyond syntax. Relation names have been looked up, column references have been analyzed, target lists have been built, expression nodes have type and function\/operator information, and the tree is suitable for rewrite and planning. 
PostgreSQL does not call this a \u201cSemantic IR,\u201d but it plays the same role as the semantically resolved representation that a lineage or governance tool needs before it can trust column references.<\/p>\n<p>The rewrite system then takes query trees as input and produces query trees as output. PostgreSQL\u2019s documentation describes rewrite as a module between the parser stage and the planner\/optimizer; it is often easiest to think of it as semantic-preserving expansion over the analyzed query tree, for example for views and rules:<\/p>\n<pre><code class=\"language-text\">analyzed query tree\n  -&gt; rewrite system\n  -&gt; rewritten query tree\n<\/code><\/pre>\n<p>The planner\/optimizer then takes the rewritten query tree and produces executable plan alternatives and, eventually, a selected execution plan. PostgreSQL\u2019s plan nodes are closer to what database engineers usually mean by a <strong>logical\/physical planning representation<\/strong>, although PostgreSQL\u2019s internal plan tree is designed for optimization and execution rather than for an external SDK-style \u201cLogical IR.\u201d<\/p>\n<p>A practical mapping is:<\/p>\n<table>\n<thead>\n<tr>\n<th>Article term<\/th>\n<th>PostgreSQL internal concept<\/th>\n<th>Notes<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td>Parser AST \/ raw AST<\/td>\n<td>Raw parse tree<\/td>\n<td>Grammar-shaped syntax produced before catalog lookups.<\/td>\n<\/tr>\n<tr>\n<td>Bound AST \/ semantic tree<\/td>\n<td>Query tree after parse analysis<\/td>\n<td>Names, relation references, target entries, expression semantics, and types are resolved enough for rewrite\/planning.<\/td>\n<\/tr>\n<tr>\n<td>Semantic IR<\/td>\n<td>Not a separate PostgreSQL public layer<\/td>\n<td>A lineage\/governance product could derive one from the analyzed or rewritten query tree. 
PostgreSQL itself does not expose a task-oriented lineage\/policy IR as a product API.<\/td>\n<\/tr>\n<tr>\n<td>Logical IR \/ logical plan<\/td>\n<td>Planner representation \/ plan trees<\/td>\n<td>PostgreSQL plans are optimizer\/execution-oriented. They are not the same as a portable SDK <code>RelNode<\/code>-style IR, but they perform the role of normalized planning structures inside the engine.<\/td>\n<\/tr>\n<tr>\n<td>Physical plan<\/td>\n<td>Selected executable plan<\/td>\n<td>Contains concrete execution operators such as scans, joins, sorts, aggregates, and access paths.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>So the precise statement is: <strong>PostgreSQL has semantically analyzed query trees and planner plan trees; it does not have a public layer literally named Logical IR or Semantic IR.<\/strong> If we use the terms from this article, PostgreSQL\u2019s analyzed query tree corresponds most closely to the bound\/semantic representation, while its planner structures correspond to the logical\/physical planning side. A governance-oriented Semantic IR would be an additional product-facing layer derived from PostgreSQL-like semantic information, not something PostgreSQL itself exposes as a ready-made lineage API.<\/p>\n<h3>Apache Calcite: <code>SqlNode<\/code>, validation, <code>RelNode<\/code><\/h3>\n<p>Apache Calcite is one of the clearest examples of SQL-to-relational-algebra architecture. 
Its documentation states that relational algebra is at the heart of Calcite: every query is represented as a tree of relational operators, and planner rules transform expression trees using semantics-preserving identities.<\/p>\n<p>A typical Calcite pipeline is:<\/p>\n<pre><code class=\"language-text\">SQL string\n  -&gt; SqlNode parse tree\n  -&gt; validated SqlNode with names and types\n  -&gt; RelNode relational expression tree\n  -&gt; rule-based\/cost-based optimization\n<\/code><\/pre>\n<p>Calcite is widely embedded because the relational algebra layer is not tied to one storage engine. Adapters can expose different data sources as schemas, while rules and planners operate on a common relational representation.<\/p>\n<h3>Spark Catalyst: unresolved plan to analyzed logical plan<\/h3>\n<p>Spark SQL\u2019s Catalyst optimizer is another influential design. Catalyst represents computations as trees and applies rules to transform them. A parsed query begins with unresolved attributes and relations. The analyzer uses catalog information to resolve them, producing an analyzed logical plan. Optimization rules then transform the plan before physical planning.<\/p>\n<p>The key idea for parser engineers is the explicit distinction between unresolved syntax and analyzed semantics. Until an attribute is resolved, it is unsafe to treat it as a real column dependency.<\/p>\n<h3>DuckDB: parser, binder, logical operators, optimizer, physical operators<\/h3>\n<p>DuckDB follows the conventional database pipeline: parse SQL, bind names and types, produce logical operators, optimize, then produce physical operators. The binder is where table and column names become catalog-bound references. 
The logical operator tree is then a compact representation for optimization and execution.<\/p>\n<p>This separation is one reason DuckDB can be both SQL-rich and embeddable: surface syntax is separated from semantic binding and execution planning.<\/p>\n<h3>Trino\/Presto: analysis and plan nodes<\/h3>\n<p>Trino analyzes SQL statements with scopes, relation types, and expression analysis before producing and optimizing plan nodes. Its optimizer includes cost-based and rule-based optimizations, predicate pushdown, projection pushdown, join-related transformations, and connector pushdown.<\/p>\n<p>For federation engines like Trino, a logical plan is also a negotiation layer: some operations can be pushed into connectors, while others must be executed by the engine.<\/p>\n<h3>Apache DataFusion: SQL parser to logical plan and optimizer rules<\/h3>\n<p>Apache DataFusion, built on Apache Arrow, exposes logical plans and optimizer rules as reusable Rust components. Its optimizer documentation describes a modular optimizer for logical plans and physical plans, with analyzer, logical optimizer, and physical optimizer rules.<\/p>\n<p>This is a good example of logical plans as a library interface, not only an internal database implementation detail.<\/p>\n<h3>GoogleSQL \/ ZetaSQL: parse AST vs resolved AST<\/h3>\n<p>GoogleSQL\u2019s ZetaSQL documentation explicitly distinguishes the parser AST from the analyzer\u2019s resolved AST. The resolved AST contains nodes such as resolved columns and resolved scans. This is close to what many parser vendors mean by a bound AST: an AST-like representation after names, types, and catalog references have been resolved.<\/p>\n<h3>SQLGlot: AST-first, with optimizer and lineage utilities<\/h3>\n<p>SQLGlot is a practical open-source SQL parser and transpiler. Its AST is expression-oriented and convenient for traversal and transformation. SQLGlot also includes scope, qualification, optimization, and lineage utilities. 
It demonstrates a common engineering path for SQL tooling: start with a strong AST, then add semantic layers when transpilation, qualification, or lineage requires more than syntax.<\/p>\n<h2>6. Bound AST vs logical plan: which one should a parser expose?<\/h2>\n<p>For a parser or SQL analysis SDK, this is a product and API design question.<\/p>\n<p>A <strong>bound AST<\/strong> is often better when users care about preserving source-level structure:<\/p>\n<ul>\n<li>IDE features;<\/li>\n<li>formatting with semantic awareness;<\/li>\n<li>precise error messages;<\/li>\n<li>source-to-source SQL rewriting;<\/li>\n<li>dialect transpilation that must preserve user intent;<\/li>\n<li>clause-level explanations;<\/li>\n<li>mapping findings back to source spans.<\/li>\n<\/ul>\n<p>A <strong>logical plan<\/strong> is better when users care about normalized semantics:<\/p>\n<ul>\n<li>column-level lineage;<\/li>\n<li>impact analysis;<\/li>\n<li>policy checks;<\/li>\n<li>query risk scoring;<\/li>\n<li>optimization;<\/li>\n<li>equivalence reasoning;<\/li>\n<li>dependency graphs;<\/li>\n<li>cross-dialect normalization.<\/li>\n<\/ul>\n<p>In practice, mature tooling often needs both. The bound AST keeps a connection to the original SQL. The logical plan gives a cleaner semantic graph.<\/p>\n<p>A useful internal design is:<\/p>\n<pre><code class=\"language-text\">Parser AST\n  - source spans\n  - grammar nodes\n  - dialect-specific syntax\n\nBound AST\n  - all AST nodes plus relation\/column\/function\/type bindings\n  - scopes and aliases\n  - catalog links\n\nLogical IR\n  - normalized relational operators\n  - expression trees\n  - output schema\n  - relation provenance\n\nSemantic IR\n  - output column mapping\n  - value\/filter\/join\/aggregation dependencies\n  - evidence, confidence, diagnostics\n  - task-specific governance facts\n<\/code><\/pre>\n<p>This does not require every layer to be a full database optimizer. 
For governance workloads, the Semantic IR can be intentionally lightweight, while the Logical IR can grow over time toward a fuller relational representation.<\/p>\n<h2>7. Logical IR and Semantic IR for lineage and governance<\/h2>\n<p>A lineage engine, SQL Guard, or catalog-aware validator does not necessarily need to estimate costs, choose indexes, or generate executable physical plans. But it does need a faithful semantic representation. In practice, there are two useful levels.<\/p>\n<p>A practical <strong>Logical IR<\/strong> should model:<\/p>\n<ul>\n<li>relation sources: table, view, CTE, subquery, function table, temporary table;<\/li>\n<li>relational operators: project, filter, join, aggregate, window, union, except, intersect, sort, limit;<\/li>\n<li>expression trees: column references, literals, function calls, casts, CASE expressions, predicates, and subqueries;<\/li>\n<li>scope boundaries: query blocks, CTEs, lateral references, correlated subqueries;<\/li>\n<li>catalog binding: normalized table and column identifiers;<\/li>\n<li>dialect annotations: syntax or behavior that affects semantics.<\/li>\n<\/ul>\n<p>A practical <strong>Semantic IR<\/strong> for governance should model the answers users actually ask for:<\/p>\n<ul>\n<li>relation sources with table, view, CTE, and subquery identity;<\/li>\n<li>projections and output column mappings;<\/li>\n<li>expression dependencies for value lineage;<\/li>\n<li>filter dependencies for row influence;<\/li>\n<li>join-condition dependencies for row matching and multiplicity;<\/li>\n<li>aggregation and group-key transit;<\/li>\n<li>confidence, evidence, diagnostics, and source anchors.<\/li>\n<\/ul>\n<p>For the sample query, a governance-oriented Semantic IR might produce:<\/p>\n<pre><code class=\"language-json\">{\n  &quot;outputs&quot;: [\n    {\n      &quot;name&quot;: &quot;id&quot;,\n      &quot;expression&quot;: &quot;c.id&quot;,\n      &quot;depends_on&quot;: [&quot;sales.customers.id&quot;]\n    },\n    
{\n      &quot;name&quot;: &quot;revenue&quot;,\n      &quot;expression&quot;: &quot;SUM(o.amount)&quot;,\n      &quot;depends_on&quot;: [&quot;sales.orders.amount&quot;],\n      &quot;operation&quot;: &quot;aggregate:sum&quot;\n    }\n  ],\n  &quot;filters&quot;: [\n    {\n      &quot;expression&quot;: &quot;o.status = 'paid'&quot;,\n      &quot;depends_on&quot;: [&quot;sales.orders.status&quot;]\n    }\n  ],\n  &quot;joins&quot;: [\n    {\n      &quot;type&quot;: &quot;inner&quot;,\n      &quot;condition&quot;: &quot;o.customer_id = c.id&quot;,\n      &quot;depends_on&quot;: [\n        &quot;sales.orders.customer_id&quot;,\n        &quot;sales.customers.id&quot;\n      ]\n    }\n  ]\n}\n<\/code><\/pre>\n<p>This is not an execution plan. It is a semantic dependency plan. It is sufficient for many data governance tasks, and it avoids the complexity of pretending to be a full DBMS.<\/p>\n<p>One useful way to think about the relationship is:<\/p>\n<pre><code class=\"language-text\">Bound AST\n  -&gt; resolves what every identifier means\n\nLogical IR\n  -&gt; normalizes the query into relational operators\n\nSemantic IR\n  -&gt; packages the semantic facts needed by lineage, validation, policy, risk, and integration APIs\n<\/code><\/pre>\n<p>A system may build Semantic IR directly from a bound AST for a first implementation. Over time, it can derive more of that Semantic IR from a richer Logical IR. The key is not the class name; the key is preserving enough binding, dependency, scope, and evidence information to make the answer trustworthy.<\/p>\n<h2>8. Why AST-only lineage fails<\/h2>\n<p>Column-level lineage is a good stress test. 
Consider:<\/p>\n<pre><code class=\"language-sql\">WITH paid_orders AS (\n  SELECT customer_id, amount\n  FROM orders\n  WHERE status = 'paid'\n), ranked AS (\n  SELECT\n    customer_id,\n    amount,\n    ROW_NUMBER() OVER (\n      PARTITION BY customer_id\n      ORDER BY amount DESC\n    ) AS rn\n  FROM paid_orders\n)\nSELECT c.email, r.amount AS top_order_amount\nFROM customers c\nJOIN ranked r ON r.customer_id = c.id\nWHERE r.rn = 1;\n<\/code><\/pre>\n<p>An AST can list column references: <code>customer_id<\/code>, <code>amount<\/code>, <code>status<\/code>, <code>email<\/code>, <code>rn<\/code>, <code>id<\/code>. But lineage needs more:<\/p>\n<ul>\n<li><code>r.amount<\/code> comes from CTE <code>ranked.amount<\/code>;<\/li>\n<li><code>ranked.amount<\/code> comes from CTE <code>paid_orders.amount<\/code>;<\/li>\n<li><code>paid_orders.amount<\/code> comes from <code>orders.amount<\/code>;<\/li>\n<li><code>r.rn<\/code> is derived from a window function over <code>paid_orders.customer_id<\/code> and <code>paid_orders.amount<\/code>;<\/li>\n<li><code>c.email<\/code> comes from <code>customers.email<\/code>;<\/li>\n<li>the join condition depends on <code>orders.customer_id<\/code> and <code>customers.id<\/code> through CTE propagation.<\/li>\n<\/ul>\n<p>Without scopes and bindings, the tool can only guess. With a logical IR, dependency propagation becomes systematic.<\/p>\n<h2>9. Common pitfalls for parser and lineage engineers<\/h2>\n<h3>Pitfall 1: treating every <code>ColumnRef<\/code> as globally resolvable<\/h3>\n<p>A column reference is meaningful only inside a scope. CTEs, subqueries, aliases, lateral joins, and correlated references can all change resolution.<\/p>\n<h3>Pitfall 2: ignoring dialect-specific alias rules<\/h3>\n<p>Some dialects allow select-list aliases in <code>GROUP BY<\/code> or <code>ORDER BY<\/code>; others differ in subtle ways. 
A binder must implement dialect semantics, not generic SQL intuition: PostgreSQL and MySQL, for example, accept a select-list alias in <code>GROUP BY<\/code>, while SQL Server rejects it.<\/p>\n<h3>Pitfall 3: expanding <code>*<\/code> too late or too early<\/h3>\n<p>Star expansion requires catalog knowledge and scope. But preserving <code>*<\/code> may be necessary for source mapping or for dynamic schemas. Decide whether the bound layer, logical layer, or final lineage output owns expansion.<\/p>\n<h3>Pitfall 4: confusing syntactic dependency with data dependency<\/h3>\n<p>A filter column may not appear in the output, but it still influences which rows appear. Governance systems may need both projection lineage and predicate lineage.<\/p>\n<h3>Pitfall 5: flattening aggregates and windows into ordinary functions<\/h3>\n<p>Aggregates and windows change cardinality, scope, and dependency semantics. Treating <code>SUM(x)<\/code> and <code>ROW_NUMBER()<\/code> like scalar functions loses important meaning.<\/p>\n<h3>Pitfall 6: treating Logical IR and Semantic IR as the same thing<\/h3>\n<p>A Logical IR is usually operator-shaped. A Semantic IR is answer-shaped. If a governance product exposes only a logical plan, users still have to compute lineage, policy, risk, and diagnostics themselves. If it exposes only a task-specific semantic summary, engineers may lose a reusable relational layer for deeper analysis. Keep the distinction explicit.<\/p>\n<h3>Pitfall 7: losing source locations<\/h3>\n<p>Logical and Semantic IRs are normalized, but users still need diagnostics that point back to SQL text. Keep source spans or back-pointers from IR nodes to AST nodes when possible.<\/p>\n<h2>10. How this maps to SQL parser products<\/h2>\n<p>For a SQL parser library, the raw AST is the entry point, but the higher-value layer is usually semantic analysis. 
In practical SQL tooling, customers often ask questions such as:<\/p>\n<ul>\n<li>Which table does this column belong to?<\/li>\n<li>Which source columns produce this output field?<\/li>\n<li>Is this generated SQL safe to execute?<\/li>\n<li>Does this query access a restricted or sensitive column?<\/li>\n<li>What dashboards, reports, or pipelines are affected if this column changes?<\/li>\n<li>Can this SQL be translated to another dialect without changing meaning?<\/li>\n<\/ul>\n<p>These questions require binding and often both a normalized Logical IR and a task-oriented Semantic IR.<\/p>\n<p>In Gudu\u2019s product family, this is the role of the shared SQL analysis and lineage engine. General SQL Parser (GSP) is the embeddable Java library for parsing, semantic resolution, column-to-table resolution, and lineage extraction across many SQL dialects. SQLFlow packages the same lineage capability as an application platform with APIs, visualization, widgets, batch processing, and deployment options. SQL Omni brings offline lineage and SQL inspection into VS Code.<\/p>\n<p>The product distinction matters less than the architectural principle: parser AST, semantic binding, Logical IR, and Semantic IR are separate layers. Exposing those layers clearly makes the engine more useful to platform teams, data governance teams, and AI application teams. It also lets a mature lineage engine coexist with newer semantic APIs: the production lineage path can remain stable while a Semantic IR layer adds evidence, confidence, diagnostics, and modern integration surfaces.<\/p>\n<h2>11. 
Design checklist: what to expose in a serious SQL analyzer<\/h2>\n<p>If you are designing a SQL parser or lineage SDK, consider exposing these artifacts explicitly:<\/p>\n<ol>\n<li>\n<p><strong>Raw AST<\/strong><br \/>\n   &#8211; source spans;<br \/>\n   &#8211; dialect-specific nodes;<br \/>\n   &#8211; comments and formatting if needed.<\/p>\n<\/li>\n<li>\n<p><strong>Diagnostic stream<\/strong><br \/>\n   &#8211; syntax errors;<br \/>\n   &#8211; semantic errors;<br \/>\n   &#8211; ambiguous references;<br \/>\n   &#8211; unsupported dialect features;<br \/>\n   &#8211; partial-analysis warnings.<\/p>\n<\/li>\n<li>\n<p><strong>Catalog interface<\/strong><br \/>\n   &#8211; table lookup;<br \/>\n   &#8211; column lookup;<br \/>\n   &#8211; function\/operator lookup;<br \/>\n   &#8211; view expansion policy;<br \/>\n   &#8211; case sensitivity and search path rules.<\/p>\n<\/li>\n<li>\n<p><strong>Bound AST or resolved tree<\/strong><br \/>\n   &#8211; relation bindings;<br \/>\n   &#8211; column bindings;<br \/>\n   &#8211; function bindings;<br \/>\n   &#8211; type information;<br \/>\n   &#8211; query-block scopes;<br \/>\n   &#8211; alias resolution.<\/p>\n<\/li>\n<li>\n<p><strong>Logical IR<\/strong><br \/>\n   &#8211; scan\/project\/filter\/join\/aggregate\/window\/set operators;<br \/>\n   &#8211; expression trees;<br \/>\n   &#8211; output schema;<br \/>\n   &#8211; normalized relation flow;<br \/>\n   &#8211; source-to-IR links.<\/p>\n<\/li>\n<li>\n<p><strong>Semantic IR<\/strong><br \/>\n   &#8211; output column mappings;<br \/>\n   &#8211; value dependencies;<br \/>\n   &#8211; filter dependencies;<br \/>\n   &#8211; join-condition dependencies;<br \/>\n   &#8211; aggregation and group-key dependencies;<br \/>\n   &#8211; confidence, evidence, and diagnostics.<\/p>\n<\/li>\n<li>\n<p><strong>Lineage and governance outputs<\/strong><br \/>\n   &#8211; column-level lineage;<br \/>\n   &#8211; table-level dependencies;<br \/>\n   &#8211; sensitive field 
access;<br \/>\n   &#8211; policy violations;<br \/>\n   &#8211; risk signals;<br \/>\n   &#8211; explainable diagnostics.<\/p>\n<\/li>\n<\/ol>\n<p>This interface is more useful than a monolithic \u201cparse result\u201d object because different users need different levels of abstraction.<\/p>\n<h2>12. Conclusion<\/h2>\n<p>A SQL parser AST tells you what the query says. A bound AST tells you what the identifiers mean. A Logical IR or logical plan tells you how relations flow through operations. A Semantic IR packages the semantic facts that lineage, validation, policy, risk, and integration APIs need. Relational algebra gives the system a normalization contract for reasoning about equivalence, transformation, lineage, and governance.<\/p>\n<p>For database engines, these layers lead to execution plans. For SQL parser vendors and data governance tools, they lead to semantic validation, column-level lineage, impact analysis, SQL risk scoring, and safer AI-generated SQL.<\/p>\n<p>The most practical architecture is not \u201cAST versus logical plan.\u201d It is a layered pipeline:<\/p>\n<pre><code class=\"language-text\">syntax -&gt; binding -&gt; Logical IR -&gt; Semantic IR -&gt; task-specific output\n<\/code><\/pre>\n<p>If your tool needs to answer semantic questions, do not stop at the raw AST. Build or expose the binding layer. Then decide which intermediate representation your users need: a Logical IR for normalized relational semantics, a Semantic IR for explainable governance answers, or both. For many SQL lineage and SQL Guard products, a lightweight Semantic IR on top of reliable binding is the fastest practical step toward a full SQL semantic governance engine.<\/p>\n<h2>References and further reading<\/h2>\n<ul>\n<li>E. F. Codd, \u201cA Relational Model of Data for Large Shared Data Banks,\u201d Communications of the ACM, 1970.<\/li>\n<li>P. 
Griffiths Selinger et al., \u201cAccess Path Selection in a Relational Database Management System,\u201d SIGMOD 1979.<\/li>\n<li>Goetz Graefe, \u201cThe Volcano Optimizer Generator: Extensibility and Efficient Search,\u201d ICDE 1993.<\/li>\n<li>Goetz Graefe, \u201cThe Cascades Framework for Query Optimization,\u201d IEEE Data Engineering Bulletin, 1995.<\/li>\n<li>PostgreSQL documentation: \u201cThe Parser Stage,\u201d \u201cThe Rule System,\u201d and \u201cPlanner\/Optimizer.\u201d<\/li>\n<li>Apache Calcite documentation: \u201cAlgebra\u201d and planner rules over <code>RelNode<\/code> relational expressions.<\/li>\n<li>Michael Armbrust et al., \u201cSpark SQL: Relational Data Processing in Spark,\u201d SIGMOD 2015.<\/li>\n<li>Apache DataFusion documentation: Query Optimizer and logical plan optimizer rules.<\/li>\n<li>GoogleSQL \/ ZetaSQL documentation: parser AST and resolved AST.<\/li>\n<li>SQLGlot documentation and AST primer.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>A deep technical guide to SQL parser ASTs, bound ASTs, Logical IR, Semantic IR, logical plans, and relational algebra for parser, lineage, and governance engineers.<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[66,14,12,93],"tags":[161,159,157,162,160,158,156,155],"blocksy_meta":{"styles_descriptor":{"styles":{"desktop":"","tablet":"","mobile":""},"google_fonts":[],"version":5}},"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v19.4 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Bound AST, Logical Plan, and Relational Algebra Explained<\/title>\n<meta name=\"description\" content=\"Bound AST, Logical Plan, and Relational Algebra Explained\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" 
href=\"https:\/\/www.dpriver.com\/blog\/bound-ast-logical-plan-and-relational-algebra-explained\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Bound AST, Logical Plan, and Relational Algebra Explained\" \/>\n<meta property=\"og:description\" content=\"Bound AST, Logical Plan, and Relational Algebra Explained\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.dpriver.com\/blog\/bound-ast-logical-plan-and-relational-algebra-explained\/\" \/>\n<meta property=\"og:site_name\" content=\"SQL and Data Blog\" \/>\n<meta property=\"article:published_time\" content=\"2026-05-02T09:42:36+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2026-05-02T16:00:49+00:00\" \/>\n<meta name=\"author\" content=\"James\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"James\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"23 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.dpriver.com\/blog\/#organization\",\"name\":\"SQL and Data Blog\",\"url\":\"https:\/\/www.dpriver.com\/blog\/\",\"sameAs\":[],\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.dpriver.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.dpriver.com\/blog\/wp-content\/uploads\/2022\/07\/sqlpp-character.png\",\"contentUrl\":\"https:\/\/www.dpriver.com\/blog\/wp-content\/uploads\/2022\/07\/sqlpp-character.png\",\"width\":251,\"height\":72,\"caption\":\"SQL and Data Blog\"},\"image\":{\"@id\":\"https:\/\/www.dpriver.com\/blog\/#\/schema\/logo\/image\/\"}},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.dpriver.com\/blog\/#website\",\"url\":\"https:\/\/www.dpriver.com\/blog\/\",\"name\":\"SQL and Data Blog\",\"description\":\"SQL related blog for database professional\",\"publisher\":{\"@id\":\"https:\/\/www.dpriver.com\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.dpriver.com\/blog\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.dpriver.com\/blog\/bound-ast-logical-plan-and-relational-algebra-explained\/\",\"url\":\"https:\/\/www.dpriver.com\/blog\/bound-ast-logical-plan-and-relational-algebra-explained\/\",\"name\":\"Bound AST, Logical Plan, and Relational Algebra Explained\",\"isPartOf\":{\"@id\":\"https:\/\/www.dpriver.com\/blog\/#website\"},\"datePublished\":\"2026-05-02T09:42:36+00:00\",\"dateModified\":\"2026-05-02T16:00:49+00:00\",\"description\":\"Bound AST, Logical Plan, and Relational Algebra 
Explained\",\"breadcrumb\":{\"@id\":\"https:\/\/www.dpriver.com\/blog\/bound-ast-logical-plan-and-relational-algebra-explained\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.dpriver.com\/blog\/bound-ast-logical-plan-and-relational-algebra-explained\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.dpriver.com\/blog\/bound-ast-logical-plan-and-relational-algebra-explained\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.dpriver.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Bound AST, Logical Plan, and Relational Algebra Explained\"}]},{\"@type\":\"Article\",\"@id\":\"https:\/\/www.dpriver.com\/blog\/bound-ast-logical-plan-and-relational-algebra-explained\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.dpriver.com\/blog\/bound-ast-logical-plan-and-relational-algebra-explained\/\"},\"author\":{\"name\":\"James\",\"@id\":\"https:\/\/www.dpriver.com\/blog\/#\/schema\/person\/7bbdbb6e79c5dd9747d08c59d5992b04\"},\"headline\":\"Bound AST, Logical Plan, and Relational Algebra Explained\",\"datePublished\":\"2026-05-02T09:42:36+00:00\",\"dateModified\":\"2026-05-02T16:00:49+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.dpriver.com\/blog\/bound-ast-logical-plan-and-relational-algebra-explained\/\"},\"wordCount\":3864,\"publisher\":{\"@id\":\"https:\/\/www.dpriver.com\/blog\/#organization\"},\"keywords\":[\"apache-calcite\",\"column-level-lineage\",\"logical-plan\",\"postgresql\",\"query-optimizer\",\"relational-algebra\",\"semantic-analysis\",\"sql-parser\"],\"articleSection\":[\"Data Governance\",\"gsp\",\"sql 
translate\",\"SQLFlow\"],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.dpriver.com\/blog\/#\/schema\/person\/7bbdbb6e79c5dd9747d08c59d5992b04\",\"name\":\"James\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.dpriver.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/eeddf4ca7bdafa37ab025068efdc7302?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/eeddf4ca7bdafa37ab025068efdc7302?s=96&d=mm&r=g\",\"caption\":\"James\"},\"sameAs\":[\"http:\/\/www.dpriver.com\"],\"url\":\"https:\/\/www.dpriver.com\/blog\/author\/james\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Bound AST, Logical Plan, and Relational Algebra Explained","description":"Bound AST, Logical Plan, and Relational Algebra Explained","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.dpriver.com\/blog\/bound-ast-logical-plan-and-relational-algebra-explained\/","og_locale":"en_US","og_type":"article","og_title":"Bound AST, Logical Plan, and Relational Algebra Explained","og_description":"Bound AST, Logical Plan, and Relational Algebra Explained","og_url":"https:\/\/www.dpriver.com\/blog\/bound-ast-logical-plan-and-relational-algebra-explained\/","og_site_name":"SQL and Data Blog","article_published_time":"2026-05-02T09:42:36+00:00","article_modified_time":"2026-05-02T16:00:49+00:00","author":"James","twitter_card":"summary_large_image","twitter_misc":{"Written by":"James","Est. 
reading time":"23 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Organization","@id":"https:\/\/www.dpriver.com\/blog\/#organization","name":"SQL and Data Blog","url":"https:\/\/www.dpriver.com\/blog\/","sameAs":[],"logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.dpriver.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.dpriver.com\/blog\/wp-content\/uploads\/2022\/07\/sqlpp-character.png","contentUrl":"https:\/\/www.dpriver.com\/blog\/wp-content\/uploads\/2022\/07\/sqlpp-character.png","width":251,"height":72,"caption":"SQL and Data Blog"},"image":{"@id":"https:\/\/www.dpriver.com\/blog\/#\/schema\/logo\/image\/"}},{"@type":"WebSite","@id":"https:\/\/www.dpriver.com\/blog\/#website","url":"https:\/\/www.dpriver.com\/blog\/","name":"SQL and Data Blog","description":"SQL related blog for database professional","publisher":{"@id":"https:\/\/www.dpriver.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.dpriver.com\/blog\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.dpriver.com\/blog\/bound-ast-logical-plan-and-relational-algebra-explained\/","url":"https:\/\/www.dpriver.com\/blog\/bound-ast-logical-plan-and-relational-algebra-explained\/","name":"Bound AST, Logical Plan, and Relational Algebra Explained","isPartOf":{"@id":"https:\/\/www.dpriver.com\/blog\/#website"},"datePublished":"2026-05-02T09:42:36+00:00","dateModified":"2026-05-02T16:00:49+00:00","description":"Bound AST, Logical Plan, and Relational Algebra 
Explained","breadcrumb":{"@id":"https:\/\/www.dpriver.com\/blog\/bound-ast-logical-plan-and-relational-algebra-explained\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.dpriver.com\/blog\/bound-ast-logical-plan-and-relational-algebra-explained\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.dpriver.com\/blog\/bound-ast-logical-plan-and-relational-algebra-explained\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.dpriver.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Bound AST, Logical Plan, and Relational Algebra Explained"}]},{"@type":"Article","@id":"https:\/\/www.dpriver.com\/blog\/bound-ast-logical-plan-and-relational-algebra-explained\/#article","isPartOf":{"@id":"https:\/\/www.dpriver.com\/blog\/bound-ast-logical-plan-and-relational-algebra-explained\/"},"author":{"name":"James","@id":"https:\/\/www.dpriver.com\/blog\/#\/schema\/person\/7bbdbb6e79c5dd9747d08c59d5992b04"},"headline":"Bound AST, Logical Plan, and Relational Algebra Explained","datePublished":"2026-05-02T09:42:36+00:00","dateModified":"2026-05-02T16:00:49+00:00","mainEntityOfPage":{"@id":"https:\/\/www.dpriver.com\/blog\/bound-ast-logical-plan-and-relational-algebra-explained\/"},"wordCount":3864,"publisher":{"@id":"https:\/\/www.dpriver.com\/blog\/#organization"},"keywords":["apache-calcite","column-level-lineage","logical-plan","postgresql","query-optimizer","relational-algebra","semantic-analysis","sql-parser"],"articleSection":["Data Governance","gsp","sql 
translate","SQLFlow"],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/www.dpriver.com\/blog\/#\/schema\/person\/7bbdbb6e79c5dd9747d08c59d5992b04","name":"James","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.dpriver.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/eeddf4ca7bdafa37ab025068efdc7302?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/eeddf4ca7bdafa37ab025068efdc7302?s=96&d=mm&r=g","caption":"James"},"sameAs":["http:\/\/www.dpriver.com"],"url":"https:\/\/www.dpriver.com\/blog\/author\/james\/"}]}},"_links":{"self":[{"href":"https:\/\/www.dpriver.com\/blog\/wp-json\/wp\/v2\/posts\/3216"}],"collection":[{"href":"https:\/\/www.dpriver.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.dpriver.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.dpriver.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.dpriver.com\/blog\/wp-json\/wp\/v2\/comments?post=3216"}],"version-history":[{"count":6,"href":"https:\/\/www.dpriver.com\/blog\/wp-json\/wp\/v2\/posts\/3216\/revisions"}],"predecessor-version":[{"id":3226,"href":"https:\/\/www.dpriver.com\/blog\/wp-json\/wp\/v2\/posts\/3216\/revisions\/3226"}],"wp:attachment":[{"href":"https:\/\/www.dpriver.com\/blog\/wp-json\/wp\/v2\/media?parent=3216"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.dpriver.com\/blog\/wp-json\/wp\/v2\/categories?post=3216"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.dpriver.com\/blog\/wp-json\/wp\/v2\/tags?post=3216"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}