I have not managed to find a good enough method to extract the tables and columns used in an SQL query. My final goal is to list every column used, in the format table.column. I am using Python.
I have tried Python libraries such as sql_metadata, but they are not as precise as I would like. I tried Parser(query).tables and Parser(query).columns.
I also have the .sqlite databases (I am using SQLite), in case the SQL needs to be executed.
For example, for the query:
SELECT student_id FROM student_course_attendance WHERE course_id = 301 ORDER BY date_of_attendance DESC LIMIT 1
the expected result is student_course_attendance.student_id, student_course_attendance.course_id and student_course_attendance.date_of_attendance.
I also want to handle SELECT *, in which case I need to get all of the table's columns.
1 Answer
Your ability to get this information out of just an SQL statement is going to be very limited. In your example it is somewhat possible, but some assumptions have to be made.
An example using sqlparse:
import sqlparse
from sqlparse.sql import IdentifierList, Identifier
from sqlparse.tokens import Keyword, DML


def is_subselect(parsed):
    """Return True if the token group contains a nested SELECT."""
    if not parsed.is_group:
        return False
    for item in parsed.tokens:
        if item.ttype is DML and item.value.upper() == 'SELECT':
            return True
    return False


def extract_from_part(parsed):
    """Yield the tokens that follow FROM, descending into subselects."""
    from_seen = False
    for item in parsed.tokens:
        if from_seen:
            if is_subselect(item):
                yield from extract_from_part(item)
            elif item.ttype is Keyword:
                return
            else:
                yield item
        elif item.ttype is Keyword and item.value.upper() == 'FROM':
            from_seen = True


def extract_table_identifiers(token_stream):
    """Yield the table names found in the FROM-clause tokens."""
    for item in token_stream:
        if isinstance(item, IdentifierList):
            for identifier in item.get_identifiers():
                yield identifier.get_name()
        elif isinstance(item, Identifier):
            yield item.get_name()
        elif item.ttype is Keyword:  # needed for a sqlparse bug
            yield item.value


def extract_tables(sql):
    stream = extract_from_part(sqlparse.parse(sql)[0])
    return list(extract_table_identifiers(stream))


sql = 'SELECT student_id FROM student_course_attendance WHERE course_id = 301 ORDER BY date_of_attendance DESC LIMIT 1;'
parsed = sqlparse.parse(sql)
print(parsed[0].tokens)  # inspect the top-level token list

tables = extract_tables(sql)

# Only top-level identifiers are visited, so columns nested inside the
# WHERE clause (course_id here) are not collected by this loop.
columns = []
for token in parsed[0].tokens:
    # The table name is a top-level Identifier too, so skip it.
    if isinstance(token, Identifier) and token.get_name() not in tables:
        columns.append(token.get_name())

# It's only possible to determine which table the unqualified columns came
# from when there is a single table in the FROM clause.
if len(tables) == 1:
    columns = [tables[0] + '.' + column for column in columns]
print(columns)
This all falls apart as soon as you add another table to the FROM clause or use SELECT *, because from the SQL text alone there is simply no way to determine which column came from which table, or what the columns are at all. It also gets very ugly once you add subqueries or CTEs.
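Since the question mentions that the .sqlite files are available, the schema can narrow this down somewhat. The following is a minimal sketch and not part of the original answer: table_columns and qualify are hypothetical helper names, the second table courses is invented for the demo, and names are matched exactly (SQLite itself compares column names case-insensitively). It looks up each table's columns with PRAGMA table_info, qualifies a name only when exactly one table in the FROM clause defines it, and expands * to every column.

```python
import sqlite3


def table_columns(conn, table):
    """Return the column names of `table`, in definition order."""
    # Identifiers cannot be bound as parameters, so quote the name by hand.
    quoted = '"' + table.replace('"', '""') + '"'
    # Each PRAGMA table_info row is (cid, name, type, notnull, dflt_value, pk).
    return [row[1] for row in conn.execute(f"PRAGMA table_info({quoted})")]


def qualify(conn, tables, columns):
    """Prefix each column with its table when exactly one table defines it."""
    schema = {table: table_columns(conn, table) for table in tables}
    result = []
    for column in columns:
        if column == "*":
            # Expand the wildcard to every column of every listed table.
            result.extend(f"{t}.{c}" for t, cols in schema.items() for c in cols)
            continue
        owners = [t for t, cols in schema.items() if column in cols]
        if len(owners) == 1:
            result.append(f"{owners[0]}.{column}")
        else:
            # Unknown or ambiguous name: leave it unqualified.
            result.append(column)
    return result


# Demo schema; the `courses` table is made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student_course_attendance "
    "(student_id INTEGER, course_id INTEGER, date_of_attendance TEXT)"
)
conn.execute("CREATE TABLE courses (course_id INTEGER, course_name TEXT)")

tables = ["student_course_attendance", "courses"]
resolved = qualify(conn, tables, ["student_id", "course_name", "course_id"])
expanded = qualify(conn, ["student_course_attendance"], ["*"])
print(resolved)
print(expanded)
```

Here course_id stays unqualified because both demo tables define it, which is the same ambiguity described above; the schema only helps when a name is unique among the tables in the FROM clause.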
At the end of the day, if your only workable solution to a problem is "parsing SQL", then rethink how badly you need to solve the problem.
select c1, ABS(c2 + c3) from t1 where c4 = ? and not exists (select * from t2 where t1.c5 = t2.xc). – jarlh, commented Mar 10 at 12:37
[...] SELECT * refers to. It doesn't know the table schema, it just parses the SQL string. – Barmar, commented Mar 10 at 15:21