Light-weight Text-2-SQL on Pandas Data Frames
1. Clone repository
   git clone git@github.com:IBM/pandasqlite.git
2. Install PandaSQLite
   cd pandasqlite
   pip install .
3. Choose language model:
   - Get your watsonx.ai API key from https://www.ibm.com/products/watsonx-ai, then continue with step 4.
   - OR use a custom language model (see below)
4. Set environment variables:
   - WXAI_PROJECT_ID: set to your watsonx.ai project ID
   - WXAI_API_KEY: set to your watsonx.ai API key
   - PANDASQLITE_CACHE_DIR: (optional) set the cache directory location
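Before running anything, it can help to confirm that these variables are visible to Python. The following is a minimal sketch using only the standard library. The helper name `missing_watsonx_vars` is hypothetical and not part of PandaSQLite, and treating the first two variables as required applies only to the watsonx.ai route.

```python
import os

# Variable names as listed above; the first two are assumed to be
# required only when watsonx.ai is used as the language model.
REQUIRED_VARS = ("WXAI_PROJECT_ID", "WXAI_API_KEY")
OPTIONAL_VARS = ("PANDASQLITE_CACHE_DIR",)


def missing_watsonx_vars(env=None):
    """Return the required variable names that are unset or empty.

    Hypothetical helper for illustration; not part of PandaSQLite.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_watsonx_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("watsonx.ai variables are set.")
    # Only the optional, non-secret variable is echoed back.
    for name in OPTIONAL_VARS:
        print(f"{name} = {os.environ.get(name, '(not set)')}")
```

Passing a plain dictionary instead of `os.environ` makes the check easy to exercise without touching the real environment.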
import json
import pandas as pd
from pandasqlite import pandasqlite as pdsql

# load CSV as pandas dataframe(s)
df1 = pd.read_csv("my.csv")
df2 = ...

# ingest dataframe(s)
ingestion, db, _ = pdsql.ingest([df1, df2, ...])

# ask some questions
for question in [
    "Show the categories for products sold in Italy.",
    "Return the top 10 customers with highest turnover last month, sorted alphabetically by last name.",
    "What's the average number of items sold per purchase?",
    "Generate an interesting query."
]:
    sql = pdsql.text2sql(question, ingestion)  # generate query
    result = pd.read_sql(sql, db)  # execute query
    print(question)
    print(json.dumps(result.to_json()) + "\n")
You can also take a look at this example.
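The execution step in the loop above is ordinary pandas: `pd.read_sql` runs a SQL string against a database connection and returns a data frame. The sketch below reproduces only that step, with a hand-written query and an in-memory SQLite database, and without PandaSQLite or a language model. The table name `sales`, its columns, and the query are made up for illustration. Treating `db` as a SQLite connection is an assumption based on the project name.

```python
import json
import sqlite3

import pandas as pd

# Hypothetical sample data; table and column names are illustrative only.
sales = pd.DataFrame(
    {
        "country": ["Italy", "Italy", "France"],
        "amount": [10.0, 20.0, 5.0],
    }
)

# Stand-in for the `db` handle: an in-memory SQLite database (assumption).
conn = sqlite3.connect(":memory:")
sales.to_sql("sales", conn, index=False)

# Stand-in for a generated query; written by hand here.
sql = """
    SELECT country, SUM(amount) AS total
    FROM sales
    GROUP BY country
    ORDER BY country
"""

result = pd.read_sql(sql, conn)  # execute the query, get a data frame back

# to_json already returns a JSON string; orient="records" yields one
# object per row, which json.loads turns into a list of dictionaries.
records = json.loads(result.to_json(orient="records"))
print(records)
conn.close()
```

Because `to_json` already returns a JSON string, it can be printed directly or parsed with `json.loads` when Python objects are needed.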
Not ready to use watsonx.ai? You can plug in a custom language model by passing a callback function as a parameter:
def my_model_callback(prompt):
    # resolve the prompt string by calling a local model or an external service
    output = ...
    return output

# ingest with custom model
ingestion, db, _ = pdsql.ingest([df1, df2, ...], my_model_callback)
# generate query with custom model
sql = pdsql.text2sql(question, ingestion, my_model_callback)
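For offline testing it can be useful to wrap a callback so that repeated prompts are not sent to the model twice and every forwarded prompt is recorded. The sketch below assumes only what the snippet above shows, namely that the callback takes one string and returns one string. `make_cached_callback` and `fake_model` are hypothetical names, and `fake_model` returns a fixed query instead of calling a real model.

```python
from typing import Callable, Dict, List, Tuple


def make_cached_callback(
    model: Callable[[str], str],
) -> Tuple[Callable[[str], str], List[str]]:
    """Wrap a string-to-string model function with an in-memory cache.

    Returns the wrapped callback and a list recording every prompt that
    was actually forwarded to the model. Hypothetical helper, not part
    of PandaSQLite.
    """
    cache: Dict[str, str] = {}
    forwarded: List[str] = []

    def callback(prompt: str) -> str:
        if prompt not in cache:
            forwarded.append(prompt)       # record the cache miss
            cache[prompt] = model(prompt)  # call the underlying model once
        return cache[prompt]

    return callback, forwarded


def fake_model(prompt: str) -> str:
    """Stand-in for a local model or external service (assumption)."""
    return "SELECT 1"


my_model_callback, forwarded_prompts = make_cached_callback(fake_model)
first = my_model_callback("How many rows are there?")
second = my_model_callback("How many rows are there?")  # served from cache
print(first, len(forwarded_prompts))
```

The wrapped function has the same shape as a plain callback, so it can be passed wherever the callback parameter is accepted.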
Daniel Karl I. Weidele and Gaetano Rossiello. PandaSQLite: Light-weight Text-2-SQL on Pandas Data Frames in Python. GitHub, https://github.com/IBM/pandasqlite. 2025.