aixplain.modules.agent.tool.sql_tool
__author__
Copyright 2024 The aiXplain SDK authors
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Author: Lucas Pavanelli and Thiago Castro Ferreira Date: May 16th 2024 Description: Agentification Class
SQLToolError Objects
class SQLToolError(Exception)
Base exception for SQL Tool errors
CSVError Objects
class CSVError(SQLToolError)
Exception for CSV-related errors
DatabaseError Objects
class DatabaseError(SQLToolError)
Exception for database-related errors
clean_column_name
def clean_column_name(col: Text) -> Text
Clean column names by replacing spaces and special characters with underscores.
This function makes column names SQLite-compatible by:
- Converting to lowercase
- Replacing special characters with underscores
- Removing duplicate underscores
- Adding 'col_' prefix to names starting with numbers
Arguments:
col
Text - The original column name.
Returns:
Text
- The cleaned, SQLite-compatible column name.
check_duplicate_columns
def check_duplicate_columns(df: pd.DataFrame) -> None
Check for duplicate column names in DataFrame after cleaning.
This function checks if any column names would become duplicates after being cleaned for SQLite compatibility.
Arguments:
df
pd.DataFrame - The DataFrame to check for duplicate column names.
Raises:
CSVError
- If any cleaned column names would be duplicates.
infer_sqlite_type
def infer_sqlite_type(dtype) -> Text
Infer SQLite type from pandas dtype.
This function maps pandas data types to appropriate SQLite types:
- Integer types -> INTEGER
- Float types -> REAL
- Boolean types -> INTEGER
- Datetime types -> TIMESTAMP
- All others -> TEXT
Arguments:
dtype
- The pandas dtype to convert.
Returns:
Text
- The corresponding SQLite type.
Notes:
Issues a warning when falling back to TEXT type.
get_table_schema
def get_table_schema(database_path: str) -> str
Get the schema of all tables in the SQLite database.
This function retrieves the CREATE TABLE statements for all tables in the database.
Arguments:
database_path
str - Path to the SQLite database file.
Returns:
str
- A string containing all table schemas, separated by newlines.
Raises:
DatabaseError
- If the database file doesn't exist or there's an error accessing it.
Notes:
Issues a warning if no tables are found in the database.
create_database_from_csv
def create_database_from_csv(csv_path: str,
database_path: str,
table_name: str = None) -> str
Create SQLite database from CSV file and return the schema.
This function creates or modifies a SQLite database by importing data from a CSV file. It handles column name cleaning, data type inference, and data conversion.
Arguments:
csv_path
str - Path to the CSV file to import.database_path
str - Path where the SQLite database should be created/modified.table_name
str, optional - Name for the table to create. If not provided, uses the CSV filename (cleaned). Defaults to None.
Returns:
str
- The schema of the created database.
Raises:
CSVError
- If there are issues with the CSV file (doesn't exist, empty, parsing error).DatabaseError
- If there are issues with database creation or modification.
Notes:
- Issues warnings for column name changes and existing database/table modifications.
- Automatically cleans column names for SQLite compatibility.
- Handles NULL values, timestamps, and numeric data types appropriately.
get_table_names_from_schema
def get_table_names_from_schema(schema: str) -> List[str]
Extract table names from a database schema string.
This function parses CREATE TABLE statements to extract table names.
Arguments:
schema
str - The database schema string containing CREATE TABLE statements.
Returns:
List[str]
- A list of table names found in the schema. Returns an empty list if no tables are found or if the schema is empty.
SQLTool Objects
class SQLTool(Tool)
A tool for executing SQL commands in an SQLite database.
This tool provides an interface for interacting with SQLite databases, including executing queries, managing schema, and handling table operations. It supports both read-only and write operations based on configuration.
Attributes:
description
Text - A description of what the SQL tool does.database
Text - The database URI or path.schema
Text - The database schema containing table definitions.tables
Optional[Union[List[Text], Text]] - List of table names that can be accessed by this tool. If None, all tables are accessible.enable_commit
bool - Whether write operations (INSERT, UPDATE, DELETE) are allowed. If False, only read operations are permitted.status
AssetStatus - The current status of the tool (DRAFT or ONBOARDED).
__init__
def __init__(name: Text,
description: Text,
database: Text,
schema: Optional[Text] = None,
tables: Optional[Union[List[Text], Text]] = None,
enable_commit: bool = False,
**additional_info) -> None
Initialize a new SQLTool instance.
Arguments:
name
Text - The name of the tool.description
Text - A description of what the SQL tool does.database
Text - The database URI or path. Can be a local file path, S3 URI, or HTTP(S) URL.schema
Optional[Text], optional - The database schema containing table definitions. If not provided, will be inferred from the database. Defaults to None.tables
Optional[Union[List[Text], Text]], optional - List of table names that can be accessed by this tool. If not provided, all tables are accessible. Defaults to None.enable_commit
bool, optional - Whether write operations are allowed. If False, only read operations are permitted. Defaults to False.**additional_info
- Additional keyword arguments for tool configuration.
Raises:
SQLToolError
- If required parameters are missing or invalid.
to_dict
def to_dict() -> Dict[str, Text]
Convert the tool instance to a dictionary representation.
Returns:
Dict[str, Text]: A dictionary containing the tool's configuration with keys:
- name: The tool's name
- description: The tool's description
- parameters: List of parameter dictionaries containing:
- database: The database URI or path
- schema: The database schema
- tables: Comma-separated list of table names or None
- enable_commit: Whether write operations are allowed
- type: Always "sql"
validate
def validate()
Validate the SQL tool's configuration.
This method performs several checks:
- Verifies required fields (description, database) are provided
- Validates database path/URI format
- Infers schema from database if not provided
- Sets table list from schema if not provided
- Uploads local database file to storage
Raises:
SQLToolError
- If any validation check fails or if there are issues with database access or file operations.
deploy
def deploy() -> None
Deploy the SQL tool by downloading and preparing the database.
This method handles the deployment process:
- For HTTP(S) URLs: Downloads the database file
- Creates a unique local filename
- Uploads the database to the aiXplain platform
- Cleans up temporary files
Raises:
requests.exceptions.RequestException
- If downloading the database fails.Exception
- If there are issues with file operations or uploads.