Visualize your data under Left-Most format with SQLFlow REST API

Gudu SQLFlow is an analysis software for analyzing SQL statements and discovering data lineage relationships. It is often used with metadata management tools and is a basic tool for enterprise data governance. Check this introduction blog if you don’t know much about Gudu SQLFlow.

SQLFlow REST API

Gudu SQLFlow has various functionalities for data lineage generation and it provides you different choices to visualize your data. SQLFlow REST API is one of the most commonly used features. Your data analysis result will be generated and returned from SQLFlow server by sending standard parameterized RESTFul Api request. Apart from report under Json, XML, CSV format, SQLFlow REST API is also able to return result as image(PNG/JPEG) from your SQL.

The Left-Most format

A common usage scenario is that we would like to have a simple and direct data lineage when retrieving data from database or from multiple SQL statements. One of the features that SQLFlow provides is called the left-most result. The left-most format can simplify the data lineage result.

It generates the selected db elements(such as database/schema/table/column etc…) data lineage without the record set. Given data lineage a->b->c, the returned result under the left-most format is a->c.

The API Endpoint

To visualize the left-most data lineage with SQLFlow REST API, we need to invoke

/sqlflow/generation/sqlflow/leftMostSourceTableGraph/image

This endpoint takes sessionId as input. There are two ways to get sessionId

1. Invoke data lineage API to get sessionId

Sample:

curl --location 'https://<SQLFlow url>/gspLive_backend/sqlflow/generation/sqlflow/graph' \
--header 'accept: application/json;charset=utf-8' \
--form 'sqlfile=@"/home/workspace/test2.sql"' \
--form 'dbvendor="dbvoracle"' \
--form 'userId="gudu|0123456789"'

2. Create SQLFlow Job

The following endpoint is to create a SQLFlow:

/sqlflow/job/submitUserJob

Wait few seconds and If we check our job list, we should have this demoJob2 in our job success list. You can verify that ethiter go to the SQLFlow interface or use another SQLFlow REST API /sqlflow/job/displayUserJobsSummary.

Great! After checking the SQLFlow web, we have confirmed the SQLFlow Job is succeeded.

Now we have our Job created and we need to select our data now. To do that, invoke:

/sqlflow/job/displayUserJobGraph

Give the Job Id as the input and in the response of the above API, we will have the sessionId:

Retrieve left-most data lineage

Now we are good to invoke the left-most SQLFlow Api with the sessionId ready. You can use tableNamePattern to map the desired table. Full parameter table here:

Field nameTypeDescription
dbvendorstringdatabase vendor, support values: – dbvazuresql
– dbvbigquery
– dbvcouchbase
– dbvdb2
– dbvgreenplum
– dbvhana
– dbvhive
– dbvimpala
– dbvinformix
– dbvmdx
– dbvmysql
– dbvnetezza
– dbvopenedge
– dbvoracle
– dbvpostgresql
– dbvredshift
– dbvsnowflake
– dbvmssql
– dbvsparksql
– dbvsybase
– dbvteradata
– dbvvertica
hideColumnbooleanwhether hide column
ignoreFunctionbooleanwhether ignore function
simpleOutputbooleansimple output, ignore the intermediate results, defualt is false.
ignoreRecordSetbooleansame as simpleOutput, but will keep output of the top level select
list, default is false.
jobIdstringgive the job Id if need to use the job settings and display the
job graph
normalizeIdentifierbooleanwhether normalize the Identifier, default is true
showTransformbooleanwhether show transform
tokenstringThe token is only used when connecting to the SQLFlow Cloud server,
It’s not in use when connect to the SQLFlow on-premise version.
treatArgumentsInCountFunctionAsDirectDataflowbooleanWhether treat the arguments in COUNT function as direct Dataflow
userIdstringthe user id of sqlflow web or client
columnLevelstringwhether to show table level or column level data, false or true
isReturnModelbooleanwhether to return model
sessionIdstringthe graph session Id
sharebooleanwhether shareable
showConstantTablebooleanwhether show constant table
showLinkOnlybooleanwhether show link only
showRelationTypestringshow relation type, required false, default value is ‘fdd’, multiple
values seperated by comma like fdd,frd,fdr,join. Availables are ‘fdd’ value
of target column from source column, ‘frd’ the recordset count of target
column which is affected by value of source column, ‘fdr’ value of target
column which is affected by the recordset count of source column, ‘join’
combines rows from two or more tables, based on a related column between
them
tableColumnstringtarget table column
tableNamePatternstringtarget table name, supports regex expression

Sample:

curl --location 'https://<SQLFLOW URL>/gspLive_backend/sqlflow/generation/sqlflow/leftMostSourceTableGraph/image' \
--header 'accept: image/*' \
--form 'userId="gudu|0123456789"' \
--form 'sessionId="85b5119f83fcfefa93136e79dc05f2f61efe3b75e6a2b733568e0879c24f0c08_1680615728918"' \
--form 'dbvendor="dbvmssql"' \
--form 'jobId="6def00e8796e4c20ba216826c0ab5b73"' \
--form 'tableNamePattern="msdb.@event_data.nodes"'

Newsletter Updates

Enter your email address below to subscribe to our newsletter