Skip to main content

LangChain

In this article, we will mainly introduce how to use LangChain to connect to the CnosDB database, enabling communication between natural language and the database.

Introduction

LangChain is a framework for developing applications driven by language models.It can achieve the following functions:

  • Data Sensing: Connecting the language model with other data sources.
  • Subjectivity: allowing the language model to interact with its environment.

The main value of LangChain is:

  1. Componentization: provides tools for abstracting the use of language models, while also providing a series of implementations for each abstract tool.These components are modular and easy to use, whether you are using other parts of the LangChain framework.
  2. Ready-made chain structure: A structured combination of components used to complete specific advanced tasks. The ready-made chain structure makes it easy to get started.For more complex applications and nuanced use cases, components make it easy to customize existing chain structures or build new chain structures.

Realize Architecture

Implementation Architecture Diagram

From the architecture diagram, we can see that by using the components of LangChain with existing chains, users do not need to learn how to interact with SQL scripts and databases in advance, saving a lot of time and effort.With the powerful features of LangChain, SQLDatabase, SQL Agent, and OpenAI's large language models, we can now create applications that allow users to communicate with CnosDB using natural language.

Install and deploy LangChain

Execute the following command:

pip install langchain

Install CnosDB Dependency

pip install cnos-connector
# The cnosdb_connector version needs to be greater than 0.1.8

Connect CnosDB

  1. When using cnosdb_connector and SQLDatabase to connect to CnosDB, you need to create the uri required by SQLDatabase:
# Use make_cnosdb_langchain_uri to create a uri
uri = cnosdb_connector.make_cnosdb_langchain_uri()
# Create a DB using SQLDatabase.from_uri
db = SQLDatabase.from_uri(uri)
  1. Alternatively, use the from_cnosdb method of SQLDatabase:
def SQLDatabase.from_cnosdb(url: str = "127.0.0.1:8902",
user: str = "root",
password: str = "",
tenant: str = "cnosdb",
database: str = "public")
Parameter nameDescription
urlHostname and port number for the http connection of CnosDB service, excluding "http://" or "https://", with a default value of "127.0.0.1:8902".
userThe username used to connect to the CnosDB service, with a default value of "root".
passwordThe user password used to connect to the CnosDB service, with a default value of an empty string "".
tenantThe tenant name used to connect to the CnosDB service, with a default value of "cnosdb".
databaseName of the database in the CnosDB tenant.

Usage Examples

# Connect to CnosDB using SQLDatabase
from cnosdb_connector import make_cnosdb_langchain_uri
from langchain import SQLDatabase

uri = cnosdb_connector.make_cnosdb_langchain_uri()
db = SQLDatabase.from_uri(uri)

# Create OpenAI Chat LLM
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(temperature=0, model_name="gpt-3.5-turbo")

SQL Database Chain Example

The following example demonstrates how to use SQL Chain to answer a question through a database:

from langchain import SQLDatabaseChain

db_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True)

db_chain.run(
"What is the average temperature of air at station XiaoMaiDao between October 19, 2022 and October 20, 2022?"
)
> Entering new  chain...
What is the average temperature of air at station XiaoMaiDao between October 19, 2022 and Occtober 20, 2022?
SQLQuery:SELECT AVG(temperature) FROM air WHERE station = 'XiaoMaiDao' AND time >= '2022-10-19' AND time < '2022-10-20'
SQLResult: [(68.0,)]
Answer:The average temperature of air at station XiaoMaiDao between October 19, 2022 and October 20, 2022 is 68.0.
> Finished chain.

SQL Database Agent Example

from langchain.agents import create_sql_agent
from langchain.agents.agent_toolkits import SQLDatabaseToolkit

toolkit = SQLDatabaseToolkit(db=db, llm=llm)
agent = create_sql_agent(llm=llm, toolkit=toolkit, verbose=True)
agent.run(
"What is the average temperature of air at station XiaoMaiDao between October 19, 2022 and Occtober 20, 2022?"
)
> Entering new  chain...
Action: sql_db_list_tables
Action Input: ""
Observation: air
Thought:The "air" table seems relevant to the question. I should query the schema of the "air" table to see what columns are available.
Action: sql_db_schema
Action Input: "air"
Observation:
CREATE TABLE air (
pressure FLOAT,
station STRING,
temperature FLOAT,
time TIMESTAMP,
visibility FLOAT
)

/*
3 rows from air table:
pressure station temperature time visibility
75.0 XiaoMaiDao 67.0 2022-10-19T03:40:00 54.0
77.0 XiaoMaiDao 69.0 2022-10-19T04:40:00 56.0
76.0 XiaoMaiDao 68.0 2022-10-19T05:40:00 55.0
*/
Thought:The "temperature" column in the "air" table is relevant to the question. I can query the average temperature between the specified dates.
Action: sql_db_query
Action Input: "SELECT AVG(temperature) FROM air WHERE station = 'XiaoMaiDao' AND time >= '2022-10-19' AND time <= '2022-10-20'"
Observation: [(68.0,)]
Thought:The average temperature of air at station XiaoMaiDao between October 19, 2022 and October 20, 2022 is 68.0.
Final Answer: 68.0

> Finished chain.