DataHub #15327: Why Power BI M-Language Queries Produce Zero Lineage — 4-Query Deep Analysis
DataHub issue #15327 reports a common Power BI + DataHub problem: four Snowflake-connected Power BI datasets produce zero upstream lineage. No tables. No columns. The lineage graph just stops at the Power BI boundary.
We analyzed all four M-language queries from the issue, identified two distinct failure patterns, and ran real lineage extraction against each one using GSP’s Power Query M-language parser. Here’s what we found — and how we solve both patterns.
Why DataHub Can’t Parse These Queries
The root cause is that Power BI doesn’t generate SQL — it generates M (Power Query Formula Language). M is a functional language (closer to F# than to SQL) that Power BI uses behind the scenes for all “Get Data” and “Transform Data” operations.
DataHub’s ingestion pipeline expects SQL. When it receives M-language, no SQL parser (sqlglot, sqllineage, or any other) can extract lineage from it because it’s not SQL.
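To make the mismatch concrete, here is a toy heuristic for telling an M document apart from SQL. This is purely illustrative — the function name and the detection rules are ours, not DataHub's or GSP's actual logic:

```python
import re

def looks_like_m_language(text: str) -> bool:
    """Toy heuristic (not DataHub's real check): Power Query M documents
    are typically a `let ... in ...` expression, or call M library
    functions such as Value.NativeQuery or a connector's .Databases."""
    stripped = text.lstrip()
    # A top-level let-expression is the most common M document shape.
    if re.match(r"(?s)^let\b.*\bin\b", stripped):
        return True
    # Otherwise look for well-known M library calls.
    return bool(re.search(
        r"\b(?:Value\.NativeQuery|Snowflake\.Databases|Sql\.Database)\s*\(",
        text))

print(looks_like_m_language('let Source = Snowflake.Databases(s, w) in Source'))  # True
print(looks_like_m_language('SELECT * FROM t'))                                   # False
```

A SQL parser handed the first string simply has nothing to chew on, which is exactly why lineage comes back empty.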
Two Patterns, Both Solved
The four queries in #15327 fall into two fundamentally different categories. GSP’s Power Query M-language extractor (dbvpowerquery) handles both:
| Pattern | Queries | What Happens | GSP Result |
|---|---|---|---|
| Pattern A — M Navigation | Q1, Q3 | Table reference encoded in M’s record-access syntax: `{[Name="X", Kind="Database"]}[Data]`. No SQL anywhere. | Table-level lineage recovered via navigation-chain parsing |
| Pattern B — Value.NativeQuery() | Q2, Q4 | M wraps actual SQL inside a `Value.NativeQuery()` call. SQL has M-encoded escapes (`#(lf)`, `""`). | Column-level lineage recovered via SQL extraction + parsing |
Pattern A: Pure M Navigation (Queries 1 & 3)
These queries use M-language’s navigation syntax to reach Snowflake objects — no SQL at all. GSP’s M-language parser walks the navigation chain {[Name="X", Kind="Y"]}[Data] to extract the full database.schema.table path.
Query 1: Navigation to a Snowflake View
Original M-Language:
let
Source = Snowflake.Databases(SnowFlakeConnector, SnowflakeWarehouse),
my_Database = Source{[Name="PROD201_DB_redacted", Kind="Database"]}[Data],
CONSUMPTION_Schema = my_Database{[Name="CONSUMPTION", Kind="Schema"]}[Data],
Details_View = CONSUMPTION_Schema{[Name="details", Kind="View"]}[Data],
#"Changed Type1" = Table.TransformColumnTypes(Details_View, ...),
#"Filtered Rows" = Table.SelectRows(#"Changed Type1", ...)
in #"Filtered Rows"
GSP analysis: The parser identifies Snowflake.Databases as the connector, then walks the navigation chain through three steps:
- `my_Database`: `{[Name="PROD201_DB_redacted", Kind="Database"]}` → Database level
- `CONSUMPTION_Schema`: `{[Name="CONSUMPTION", Kind="Schema"]}` → Schema level
- `Details_View`: `{[Name="details", Kind="View"]}` → View level
GSP result:
Navigation resolved:
Step: Details_View | Vendor: dbvsnowflake
Path: PROD201_DB_redacted.CONSUMPTION.details
Synthetic SQL: SELECT * FROM "PROD201_DB_redacted"."CONSUMPTION"."details"
Upstream table recovered: PROD201_DB_redacted.CONSUMPTION.details (table-level lineage)
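The navigation-chain walk can be sketched in a few lines of Python. This is an illustrative sketch, not GSP's dbvpowerquery implementation; it assumes a single chain whose steps always carry `Name` and `Kind` in that order:

```python
import re

M_SCRIPT = '''
let
    Source = Snowflake.Databases(SnowFlakeConnector, SnowflakeWarehouse),
    my_Database = Source{[Name="PROD201_DB_redacted", Kind="Database"]}[Data],
    CONSUMPTION_Schema = my_Database{[Name="CONSUMPTION", Kind="Schema"]}[Data],
    Details_View = CONSUMPTION_Schema{[Name="details", Kind="View"]}[Data]
in Details_View
'''

# Each navigation step is a record lookup of the form {[Name="...", Kind="..."]}.
STEP_RE = re.compile(r'\{\[\s*Name="([^"]+)"\s*,\s*Kind="([^"]+)"\s*\]\}')

def resolve_navigation_path(m_script: str) -> str:
    """Collect the Name for each Kind along the chain and join them
    into a database.schema.object path (Table or View at the leaf)."""
    parts = {kind: name for name, kind in STEP_RE.findall(m_script)}
    leaf = parts.get("Table") or parts.get("View")
    return ".".join(p for p in (parts.get("Database"), parts.get("Schema"), leaf) if p)

print(resolve_navigation_path(M_SCRIPT))
# PROD201_DB_redacted.CONSUMPTION.details
```

From that resolved path, emitting the synthetic `SELECT * FROM "db"."schema"."object"` shown above is a straightforward string format.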
Query 3: Navigation to a Snowflake Table + Column Renaming
Original M-Language:
let
Source = Snowflake.Databases("redacted.snowflakecomputing.com", #"Warehouse name"),
my_Database = Source{[Name="PROD201_DB_redacted", Kind="Database"]}[Data],
CONSUMPTION_Schema = my_Database{[Name="CONSUMPTION", Kind="Schema"]}[Data],
MY_View = CONSUMPTION_Schema{[Name="MESSAGES", Kind="Table"]}[Data],
#"Renamed Columns" = Table.RenameColumns(MY_View, {{"DATE","Date"}, ...}),
#"Added Custom" = Table.AddColumn(#"Renamed Columns", "Message Type", ...)
in #"Added Custom"
GSP result:
Navigation resolved:
Step: MY_View | Vendor: dbvsnowflake
Path: PROD201_DB_redacted.CONSUMPTION.MESSAGES
Synthetic SQL: SELECT * FROM "PROD201_DB_redacted"."CONSUMPTION"."MESSAGES"
Upstream table recovered: PROD201_DB_redacted.CONSUMPTION.MESSAGES (table-level lineage). The Table.RenameColumns and Table.AddColumn steps are M-level transformations that don’t change the upstream source.
Pattern B: Embedded SQL (Queries 2 & 4)
These queries wrap real Snowflake SQL inside M’s Value.NativeQuery() function. GSP extracts the SQL string, decodes M-specific escapes, infers the SQL dialect from the inline connector call, and parses the SQL for full column-level lineage.
Query 2: Simple SELECT with #(lf) escape
Original M-Language:
let
Source = Value.NativeQuery(
Snowflake.Databases(SnowFlakeServer, SnowFlakeDWH)
{[Name="PROD201_redacted"]}[Data],
"SELECT *#(lf)FROM PROD201_DB_redacted.CONSUMPTION.CV_OFFER_redacted_CITY",
null, [EnableFolding=true])
in Source
- Step 1 — Decode M escapes: `#(lf)` → newline.
- Step 2 — Infer vendor: the inline `Snowflake.Databases()` call tells GSP this is Snowflake SQL.
- Step 3 — Parse SQL: `SELECT *` produces a wildcard column mapping.
GSP result:
NativeQuery resolved:
Step: Source | Vendor: dbvsnowflake | Parse RC: 0
Decoded SQL: SELECT *
FROM PROD201_DB_redacted.CONSUMPTION.CV_OFFER_redacted_CITY
Column lineage: * → * (all columns flow through)
Upstream: PROD201_DB_redacted.CONSUMPTION.CV_OFFER_redacted_CITY — 1 table-level + 1 column-level (* → *) lineage.
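The escape-decoding step can be sketched as a small substitution pass. This covers only the subset of M text-literal escapes that appear in these queries (`#(lf)`, `#(cr)`, `#(tab)`, and `""` for an embedded quote) and is our sketch, not GSP's decoder:

```python
def decode_m_string(raw: str) -> str:
    """Decode a subset of Power Query M text-literal escapes:
    #(lf)/#(cr)/#(tab) control characters, and "" for a literal quote."""
    return (raw.replace("#(lf)", "\n")
               .replace("#(cr)", "\r")
               .replace("#(tab)", "\t")
               .replace('""', '"'))

sql = decode_m_string(
    'SELECT *#(lf)FROM PROD201_DB_redacted.CONSUMPTION.CV_OFFER_redacted_CITY')
print(sql)
# SELECT *
# FROM PROD201_DB_redacted.CONSUMPTION.CV_OFFER_redacted_CITY

print(decode_m_string('SERVICE_START_DATE AS ""Service Date""'))
# SERVICE_START_DATE AS "Service Date"
```

The decoded string is ordinary Snowflake SQL, which any dialect-aware SQL parser can then handle.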
Query 4: Complex Snowflake SQL with GROUP BY ALL, quoted aliases, expressions
Original M-Language (abbreviated):
let
Source = Value.NativeQuery(
Snowflake.Databases("redacted.snowflakecomputing.com",
WAREHOUSE, [Implementation="2.0"])
{[Name="PROD201_DB_redacted"]}[Data],
"
SELECT
SERVICE_START_DATE AS ""Service Date"",
TO_NUMBER(OPERATIVE_OFFICE) || '_' || statement || '_'
|| VENDOR_CODE || '_' || TO_CHAR(SERVICE_START_DATE, 'YYYYMMDD')
AS OFFICE_STATEMENT_VENDOR_DATE,
VENDOR_CODE
FROM PROD201_DB_redacted_DATAMARTS.SALES_MARKETING
.CV_COMMDX_SALES_TRANSFERS
WHERE SERVICE_START_DATE >= '2023-10-01' ...
GROUP BY ALL
HAVING SUM(COST_OF_SALES_CUR) = 0 AND COUNT(*) > 1
", null, [EnableFolding=true]),
#"TimeFilter" = Table.SelectRows(Source, ...)
in #"TimeFilter"
- Step 1 — Decode M escapes: `""Service Date""` → `"Service Date"` (Snowflake quoted identifier).
- Step 2 — Infer vendor: inline `Snowflake.Databases()` → Snowflake.
- Step 3 — Analyze SQL:
  - Column 1: `SERVICE_START_DATE AS "Service Date"` — simple alias (1 mapping)
  - Column 2: `TO_NUMBER(OPERATIVE_OFFICE) || '_' || statement || '_' || VENDOR_CODE || '_' || TO_CHAR(SERVICE_START_DATE, 'YYYYMMDD') AS OFFICE_STATEMENT_VENDOR_DATE` — concatenation with 4 source columns traced through `TO_NUMBER()` and `TO_CHAR()` (4 mappings)
  - Column 3: `VENDOR_CODE` — direct pass-through (1 mapping)
GSP result — 6 column-level lineage mappings:
| Source Column | Target Column | Why |
|---|---|---|
| `SERVICE_START_DATE` | `SERVICE DATE` | Direct alias: `AS "Service Date"` |
| `OPERATIVE_OFFICE` | `OFFICE_STATEMENT_VENDOR_DATE` | Via `TO_NUMBER()` in concat |
| `STATEMENT` | `OFFICE_STATEMENT_VENDOR_DATE` | Direct in `\|\|` concat |
| `VENDOR_CODE` | `OFFICE_STATEMENT_VENDOR_DATE` | Direct in `\|\|` concat |
| `SERVICE_START_DATE` | `OFFICE_STATEMENT_VENDOR_DATE` | Via `TO_CHAR()` in concat |
| `VENDOR_CODE` | `VENDOR_CODE` | Direct pass-through |
Upstream: PROD201_DB_redacted_DATAMARTS.SALES_MARKETING.CV_COMMDX_SALES_TRANSFERS — 1 table-level + 6 column-level lineages.
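To see why the concatenation in Column 2 yields exactly four source columns, here is a deliberately naive extraction pass. It is our sketch (GSP does real expression parsing, not regex): string literals are removed first so `'_'` and `'YYYYMMDD'` are not mistaken for identifiers, and any identifier immediately followed by `(` is treated as a function name rather than a column:

```python
import re

EXPR = ("TO_NUMBER(OPERATIVE_OFFICE) || '_' || statement || '_' "
        "|| VENDOR_CODE || '_' || TO_CHAR(SERVICE_START_DATE, 'YYYYMMDD')")

def source_columns(expr: str) -> list[str]:
    # Drop single-quoted string literals so '_' / 'YYYYMMDD' vanish.
    no_strings = re.sub(r"'[^']*'", "", expr)
    # An identifier followed by "(" is a function call, not a column.
    idents = re.findall(r"\b([A-Za-z_][A-Za-z0-9_]*)\b(?!\s*\()", no_strings)
    seen, out = set(), []
    for name in (i.upper() for i in idents):  # Snowflake folds unquoted names
        if name not in seen:
            seen.add(name)
            out.append(name)
    return out

print(source_columns(EXPR))
# ['OPERATIVE_OFFICE', 'STATEMENT', 'VENDOR_CODE', 'SERVICE_START_DATE']
```

Four distinct source columns, all mapped to the single target `OFFICE_STATEMENT_VENDOR_DATE` — the four middle rows of the table above.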
Summary: All 4 Queries Solved
| Query | Pattern | Upstream Table | DataHub Today | With GSP |
|---|---|---|---|---|
| Q1 | A (M Navigation) | `PROD201_DB_redacted.CONSUMPTION.details` | 0 lineage | Table-level lineage |
| Q2 | B (NativeQuery) | `PROD201_DB_redacted.CONSUMPTION.CV_OFFER_redacted_CITY` | 0 lineage | 1 table + 1 column (`*`) |
| Q3 | A (M Navigation) | `PROD201_DB_redacted.CONSUMPTION.MESSAGES` | 0 lineage | Table-level lineage |
| Q4 | B (NativeQuery) | `...DATAMARTS.SALES_MARKETING.CV_COMMDX_SALES_TRANSFERS` | 0 lineage | 1 table + 6 columns |
How It Works
GSP’s Power Query module (dbvpowerquery) is a lightweight M-language extractor — not a full M parser. It answers exactly 6 questions about any M document:
1. What are the `let` bindings?
2. Which binding is returned by `in`?
3. Does a binding contain `Value.NativeQuery()`? → Extract SQL, decode escapes, delegate to the vendor SQL parser.
4. Does a binding represent a navigation chain? → Extract `database.schema.table`.
5. What connector function is at the root? → Infer the SQL dialect (Snowflake, SQL Server, Oracle, PostgreSQL, MySQL, BigQuery, Redshift, Databricks, Hive).
6. Which bindings reference other bindings? → Follow step references.
Everything else in M (closures, each, Table.* transforms, type system) is parsed permissively — unknown constructs produce warnings, never wrong lineage.
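The first two questions — listing the `let` bindings and finding the one returned by `in` — can be answered with a toy splitter. This is an illustrative sketch, not GSP's extractor: it assumes one binding per line and no nested let-expressions:

```python
import re

M_DOC = '''let
    Source = Snowflake.Databases(Server, Warehouse),
    DB = Source{[Name="PROD", Kind="Database"]}[Data],
    #"Filtered Rows" = Table.SelectRows(DB, each true)
in #"Filtered Rows"'''

def split_let_bindings(doc: str):
    """Toy answer to questions 1 and 2: return ({name: expression}, returned_name).
    Assumes one binding per line and a single top-level let-expression."""
    body, returned = doc.rsplit("\nin ", 1)      # split off the `in` clause
    body = body.split("let", 1)[1]               # drop the leading `let`
    bindings = {}
    for line in body.splitlines():
        # Binding names are either bare identifiers or quoted #"..." names.
        m = re.match(r'\s*(#"[^"]+"|\w+)\s*=\s*(.+?),?\s*$', line)
        if m:
            bindings[m.group(1)] = m.group(2)
    return bindings, returned.strip()

bindings, returned = split_let_bindings(M_DOC)
print(list(bindings))   # ['Source', 'DB', '#"Filtered Rows"']
print(returned)         # #"Filtered Rows"
```

With the bindings in hand, questions 3–6 reduce to inspecting each binding's right-hand side for `Value.NativeQuery(`, navigation steps, a connector call, or references to other binding names.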
Try It
# Install the sidecar
pip install git+https://github.com/gudusoftware/gsp-datahub-sidecar.git
# Test with your Power BI queries
gsp-datahub-sidecar --mode authenticated \
--user-id YOUR_USER_ID \
--secret-key YOUR_SECRET_KEY \
--sql-file your_query.sql \
--db-vendor dbvsnowflake \
--dry-run
Sign up for a free API key at docs.gudusoft.com/sign-up. For a full interactive analysis, see our dedicated analysis page.
Related Issues
- #15327 — No upstream lineage tracked for specific Power BI tables (this analysis)
- #11251 — Power BI SQL comments break lineage (#(lf) encoding) — solved
- #11654 — BigQuery procedural SQL drops lineage — solved
Disclosure: I work at Gudu Software, the company behind GSP SQL Parser, SQLFlow, and the gsp-datahub-sidecar. All lineage results shown were produced by running the M scripts through GSP’s Power Query parser (dbvpowerquery) and verified against the actual M-language queries from DataHub issue #15327.

