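An agent built on the ReAct paradigm

This project validates the backend module for voice control of a large display screen. After receiving user input, an LLM performs intent recognition and calls the relevant functions through function calling, which drives the screen by voice. The LLM used is QwQ-32B, open-sourced by Alibaba, which has some reasoning ability and runs fast. DeepSeek-R1 was ruled out because it does not natively support function calling.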
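The function-calling step described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the registry, the `open_page` and `set_volume` functions, and the `{"name", "arguments"}` call shape (borrowed from the OpenAI-style tool-call format) are all assumptions made for the example.

```python
import json
from typing import Any, Callable

# Hypothetical registry of screen-control functions; names are illustrative only.
TOOLS: dict[str, Callable[..., str]] = {}


def tool(fn: Callable[..., str]) -> Callable[..., str]:
    """Register a function so it can be looked up by the name the model returns."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def open_page(name: str) -> str:
    return f"opened page: {name}"


@tool
def set_volume(level: int) -> str:
    return f"volume set to {level}"


def dispatch(tool_call: dict[str, Any]) -> str:
    """Run one tool call of the assumed form {"name": ..., "arguments": "<JSON string>"}."""
    fn = TOOLS.get(tool_call["name"])
    if fn is None:
        # Report unknown names back instead of raising, so the model can retry.
        return f"unknown function: {tool_call['name']}"
    args = json.loads(tool_call.get("arguments") or "{}")
    return fn(**args)


result = dispatch({"name": "open_page", "arguments": '{"name": "sales"}'})
```

In a ReAct-style loop, the string returned by `dispatch` would typically be fed back to the model as the observation for its next step.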

Environment setup

1. Clone this project

git clone http://1.14.96.249:3000/old-tom/reActLLMDemo.git

2. Install dependencies. Using uv to create the virtual environment is recommended; Python 3.12 or later is required.

uv sync

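3. Deploy and initialize the vector store (docker)

Note: the vector store is marqo and the embedding model is hf/e5-base-v2; similarity search quality is not very good.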

docker run --name marqo -it --privileged -p 8882:8882 --add-host host.docker.internal:host-gateway marqoai/marqo:latest

Initialize: run the create_and_set_index() method in vector_db.py

Test: run the query_vector_db() method in vector_db.py, passing any string as the argument
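Since the README notes that similarity search quality is weak, it can help to filter results by score before using them. The sketch below assumes the search result has the shape of a marqo search response (a `hits` list whose items carry a `_score`); what query_vector_db() actually returns is not shown here, and the `text` field and sample data are hypothetical. The 0.93 threshold mirrors the similarity_threshold value in env.toml.

```python
from typing import Any


def filter_hits(response: dict[str, Any], threshold: float) -> list[dict[str, Any]]:
    """Keep only hits whose `_score` is at or above the threshold, best first."""
    hits = response.get("hits", [])
    kept = [h for h in hits if h.get("_score", 0.0) >= threshold]
    return sorted(kept, key=lambda h: h["_score"], reverse=True)


# Hand-written sample in the assumed response shape (not real output).
sample = {
    "hits": [
        {"_id": "a", "text": "open the sales dashboard", "_score": 0.95},
        {"_id": "b", "text": "close the window", "_score": 0.81},
    ]
}

strong = filter_hits(sample, threshold=0.93)
```

An empty result from `filter_hits` would then mean that no stored entry is similar enough to the query.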

4. Configuration file env.toml

[base]
# Similarity threshold for the vector store
similarity_threshold = 0.93
# Model provider
model_form = 'siliconflow'
####### Model configuration #######
[siliconflow]
# SiliconFlow
# API key
api_key = ''
# Model name
model = ''
# API base URL
base_url = ''
# Maximum number of tokens
max_tokens = 4096
# Temperature
temperature = 0.6
# Whether to stream the response
streaming = true

Local testing

Once the environment is set up, run local_test.py locally to chat with the agent in the terminal.

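Simple server implementation

The server is implemented with FastAPI. Start server.py to run it; for the full API documentation, visit http://ip:port/docs

Open an SSE connection to receive the model's streamed responses: [GET] http://ip:port/sse/{client_id}

Parameters: client_id: client ID, used to distinguish clients and their chat histories

Send a chat request: [GET] http://ip:port/chat/{client_id}?ask=xxx

Parameters: client_id: client ID, used to distinguish clients and their chat histories; ask: the request content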
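On the client side, the two endpoints above could be used along these lines. The sketch only builds the URLs and parses standard `data:` lines from an SSE stream; the base address `http://localhost:8000` is a placeholder, the helper names are hypothetical, and the exact event format emitted by server.py is an assumption.

```python
from typing import Iterable, Iterator
from urllib.parse import quote, urlencode


def sse_url(base: str, client_id: str) -> str:
    """URL of the SSE stream for one client."""
    return f"{base}/sse/{quote(client_id, safe='')}"


def chat_url(base: str, client_id: str, ask: str) -> str:
    """URL of a chat request; `ask` is URL-encoded as a query parameter."""
    return f"{base}/chat/{quote(client_id, safe='')}?{urlencode({'ask': ask})}"


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete event from SSE text lines."""
    buf: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            # The SSE format strips one optional space after the colon.
            if value.startswith(" "):
                value = value[1:]
            buf.append(value)
        elif line == "" and buf:
            # A blank line ends the event; multi-line data is joined with newlines.
            yield "\n".join(buf)
            buf = []
        # Comment lines (starting with ":") and other fields are ignored here.


url = chat_url("http://localhost:8000", "c1", "open home")
events = list(iter_sse_data(["data: hello\n", "\n", ": ping\n", "data: a\n", "data: b\n", "\n"]))
```

A client would normally open the SSE URL first and keep it open, then send chat requests with the same client_id so that the streamed reply arrives on that connection.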

System prompt

todo

Replace the vector store and upgrade the embedding model to bge-m3
