Visualize your data with SQLFlow REST API

Gudu SQLFlow is an analysis software for analyzing SQL statements and discovering data lineage relationships. It is often used with metadata management tools and is a basic tool for enterprise data governance. Check this introduction blog if you don’t know much about Gudu SQLFlow.

Gudu SQLFlow has various functionalities for data lineage generation and it provides you different choices to visualize your data. SQLFlow REST API is one of the most commonly used features. Your data analysis result will be generated and returned from SQLFlow server by sending standard parameterized RESTFul Api request. Apart from report under Json, XML, CSV format, SQLFlow REST API is also able to return result as image(PNG/JPEG) from your SQL.

The API Endpoint

The related SQLFlow Api to visualize your data lineage is:

/sqlflow/generation/sqlflow/graph/image

The /graph/image accepts either SQL statements text or it can reads from SQL file. Export your database metadata into SQL files if you would like to generate data lineage image for your database with /graph/image. (tips: you can use SQLFlow Ingester to export metadata files from your database)

Let’s use the following SQL statements to see how to sort out the data dependencies between various tables/views with a simple call to SQLFlow REST API:

By analyzing the above INSERT SQL statement, we can know:

The data of the deptsal table comes from the dept and emp tables. Further, the data dependency (data lineage) at the field level is:

  • The data of the deptsal.dept_no field comes from dept.deptno;
  • The data of the deptsal.dept_name field comes from dept.name;
  • The data of the deptsal.salary field comes from emp.sal and emp.comm;

Simply upload the above piece of code to SQLFlow API (or you can put the SQL statements in SQL file in case of the sql statement is complex):

curl --location --request POST 'https://api.gudusoft.com/gspLive_backend/sqlflow/generation/sqlflow/graph/image' \
    --header 'accept: image/*' \
    --form 'sqlfile=@"/home/test.sql"' \
    --form 'dbvendor="dbvoracle"' \
    --form 'userId="<YOUR USER ID>"' \
    --form 'token="<THE GENERATED TOKEN>"'

Result:

Hint: SQLFlow On-Premise is used in the above capture, hence token is not required and only userId is given in param. Please refer to this document to check how to get the userid and the token.

Request Parameters

In the above request, only sqlfile, dbvendor and userId are given as the request parameters. The sqlfile is the file that you are going to upload. Use sqltext If you prefer use SQL text. The dbvendor tells SQLFlow to what database this SQL is based on. Other request parameters are not required and have default value. However, you can apply all SQLFlow settings to customize the output image. Check the following table:

Field nameTypeDescription
dbvendorstringdatabase vendor, support values: – dbvazuresql
– dbvbigquery
– dbvcouchbase
– dbvdb2
– dbvgreenplum
– dbvhana
– dbvhive
– dbvimpala
– dbvinformix
– dbvmdx
– dbvmysql
– dbvnetezza
– dbvopenedge
– dbvoracle
– dbvpostgresql
– dbvredshift
– dbvsnowflake
– dbvmssql
– dbvsparksql
– dbvsybase
– dbvteradata
– dbvvertica
hideColumnbooleanwhether hide column
ignoreFunctionbooleanwhether ignore function
simpleOutputbooleansimple output, ignore the intermediate results, defualt is false.
ignoreRecordSetbooleansame as simpleOutput, but will keep output of the top level select
list, default is false.
jobIdstringgive the job Id if need to use the job settings and display the
job graph
normalizeIdentifierbooleanwhether normalize the Identifier, default is true
showTransformbooleanwhether show transform
sqltextstringsql text from which to generate the sqlflow model
tokenstringThe token is only used when connecting to the SQLFlow Cloud server,
It’s not in use when connect to the SQLFlow on-premise version.
treatArgumentsInCountFunctionAsDirectDataflowbooleanWhether treat the arguments in COUNT function as direct Dataflow
userIdstringthe user id of sqlflow web or client
columnLevelstringwhether to show table level or column level data, false or true

Image Data Lineage Result

Let’s use the following settings and make the output more easier to understand:

Field nameValue
showRelationTypefdd
treatArgumentsInCountFunctionAsDirectDataflowtrue
ignoreRecordSetfalse
ignoreFunctiontrue
showConstantTablefalse
showTransformfalse

All database objects discovered during the data lineage analysis are stored in the $.dbobjs object.

Table

There are four tables founded in the above result:

  • DEPTSAL: you can use $.dbobjs[1].name to return the table name, and $.dbobjs[1].type to return the type of this object which is table in this case. you can also use expression like this to get this table:
$.dbobjs[?(@.name=='deptsal')].name
  • DEPT
$.dbobjs[?(@.name=='dept')].name
  • EMP
$.dbobjs[?(@.name=='emp')].name
  • SQL_CONSTANTS: This is not a real table, but a table generated by the Gudu SQLFlow to store the constant used in the SQL query.
$.dbobjs[?(@.name=='SQL_CONSTANTS')].name

Online Web Tool

You can get more information about SQLFlow with the online demo. The online web tool uses SQLFlow API to generate the result. You can also build your own web tool with the help of SQLFlow REST API. Check this fantastic data visualization tool today!

Newsletter Updates

Enter your email address below to subscribe to our newsletter