System Architecture Plan

Overall Architecture

Frontend
  ├── Admin Console
  ├── Study Builder
  ├── Participant Interview UI
  ├── Results Dashboard
  └── Review / Evidence UI

Backend API
  ├── Auth / Workspace / Project API
  ├── Study Config API
  ├── Session Runtime API
  ├── Agent Harness Service
  ├── Interaction Tool Service
  ├── Extraction Service
  ├── Knowledge Base / RAG Service
  ├── Model Gateway
  ├── Export / Integration Service
  └── Observability / Audit Service

Storage
  ├── PostgreSQL
  ├── Vector Store / pgvector
  ├── Object Storage
  ├── Redis / Queue
  └── Event Log

Service Breakdown

1. API Service

Responsible for:

auth;
workspace/project/study CRUD;
session API;
results API;
export API.

2. Agent Harness Service

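Responsible for:

loading the study config;
managing session state;
deciding the next action;
calling the LLM;
calling tools;
generating the assistant message;
deciding whether to end the session.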

3. Interaction Service

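Responsible for:

validating the interaction schema;
recording the tool call;
sending UI events to the frontend;
receiving the tool result;
converting the result into transcript/event records;
notifying the harness to continue.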
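
The schema validation step could look roughly like the sketch below. It is a minimal illustration, not the actual implementation: the interaction types (`single_choice`, `rating_scale`), the `type`/`payload` call shape, and the `INTERACTION_SCHEMAS` registry are all assumptions made for the example.

```python
from typing import Any

# Hypothetical registry: interaction type -> required payload fields and their types.
INTERACTION_SCHEMAS: dict[str, dict[str, type]] = {
    "single_choice": {"prompt": str, "options": list},
    "rating_scale": {"prompt": str, "min": int, "max": int},
}


def validate_interaction(call: dict[str, Any]) -> list[str]:
    """Return a list of validation errors; an empty list means the call is valid."""
    kind = call.get("type")
    schema = INTERACTION_SCHEMAS.get(kind)
    if schema is None:
        # Unknown types are rejected outright, so raw HTML cannot be smuggled in.
        return [f"unknown interaction type: {kind!r}"]

    errors: list[str] = []
    payload = call.get("payload", {})
    for field_name, expected in schema.items():
        if field_name not in payload:
            errors.append(f"missing field: {field_name}")
        elif not isinstance(payload[field_name], expected):
            errors.append(f"field {field_name} must be {expected.__name__}")
    return errors
```

Only calls that come back with an empty error list would be recorded and forwarded to the frontend as UI events; anything else can be returned to the harness as a failed tool call.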

4. Extraction Service

Responsible for:

real-time extraction;
checkpoint extraction;
post-session extraction;
evidence linking;
confidence scoring;
conflict detection;
review queue.
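
To make evidence linking, confidence scoring, conflict detection, and the review queue concrete, here is a small sketch. The `Extraction` record, the `needs_review` helper, and the 0.7 confidence threshold are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extraction:
    field: str
    value: str
    confidence: float                   # assumed range: 0.0 to 1.0
    evidence_turn_ids: tuple[int, ...]  # transcript turns that support the value


def needs_review(extractions: list[Extraction], min_confidence: float = 0.7) -> set[str]:
    """Return the field names that should go to the review queue.

    A field is flagged when any extraction for it has low confidence,
    has no linked evidence, or conflicts with another value for the same field.
    """
    values_by_field: dict[str, set[str]] = {}
    flagged: set[str] = set()
    for item in extractions:
        values_by_field.setdefault(item.field, set()).add(item.value)
        if item.confidence < min_confidence or not item.evidence_turn_ids:
            flagged.add(item.field)
    # Conflict detection: more than one distinct value for the same field.
    flagged.update(name for name, values in values_by_field.items() if len(values) > 1)
    return flagged
```

Because each extraction carries the transcript turns it was derived from, a reviewer can jump straight from a flagged field to its supporting evidence.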

5. Knowledge Base Service

Responsible for:

document upload;
chunking;
embedding;
retrieval;
citation;
KB versioning.

6. Model Gateway

Responsible for:

provider abstraction;
model routing;
credentials;
cost tracking;
retry/fallback;
structured output parsing;
safety filter.
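
The provider abstraction and retry/fallback responsibilities can be sketched as follows. This is an assumption-laden illustration: a provider is modeled as a plain callable, and `ProviderError` and `complete_with_fallback` are hypothetical names, not part of any real SDK.

```python
from typing import Callable, Optional, Sequence


class ProviderError(Exception):
    """Raised by a (hypothetical) provider adapter when a call fails."""


# A provider is modeled as a plain callable: prompt in, completion text out.
Provider = Callable[[str], str]


def complete_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Provider]],
    max_attempts: int = 2,
) -> tuple[str, str]:
    """Try providers in order, retrying each up to max_attempts times.

    Returns (provider_name, completion) from the first successful call.
    """
    last_error: Optional[ProviderError] = None
    for name, call in providers:
        for _ in range(max_attempts):
            try:
                return name, call(prompt)
            except ProviderError as exc:
                last_error = exc  # remember the failure, then retry or fall through
    raise ProviderError(f"all providers failed: {last_error}")
```

Business logic only ever sees the returned completion and provider name, which keeps provider-specific details inside the gateway. A production version would likely add backoff between retries and record token/cost per attempt.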

7. Observability Service

Responsible for recording:

agent decisions;
prompts;
tool calls;
token/cost;
latency;
errors;
extraction quality;
audit log.

Runtime Event Flow

Participant sends message
  -> Session API receives input
  -> Store transcript turn
  -> Agent Harness loads state
  -> Harness checks coverage and next objective
  -> Optional RAG retrieval
  -> LLM decides next action
       ├── assistant message
       ├── render_interaction tool call
       ├── extract_data tool call
       └── conclude_interview tool call
  -> Backend streams response/events
  -> Frontend renders message or interaction UI
  -> User submits interaction result
  -> Store result and continue harness
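
The branching step in the flow above, where the LLM picks one of four actions, might be dispatched as in this sketch. The `Action` type and the exact events emitted per branch are assumptions; only the action names and event names are taken from this document.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    # One of: "assistant_message", "render_interaction",
    # "extract_data", "conclude_interview".
    kind: str
    payload: dict = field(default_factory=dict)


def dispatch(action: Action) -> list[str]:
    """Map one LLM-selected action to the event names the backend would emit.

    The mapping is illustrative; a real harness would also run the side effects.
    """
    if action.kind == "assistant_message":
        return ["agent.message.started", "agent.message.completed"]
    if action.kind == "render_interaction":
        # The harness then waits for interaction.submitted before continuing.
        return ["tool.call.created", "interaction.rendered"]
    if action.kind == "extract_data":
        return ["tool.call.created", "extraction.started"]
    if action.kind == "conclude_interview":
        return ["tool.call.created", "tool.call.completed", "session.completed"]
    raise ValueError(f"unsupported action kind: {action.kind}")
```

Rejecting unknown action kinds explicitly keeps every agent decision visible in the event stream instead of failing silently.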

Event Log

We recommend recording runtime activity in an append-only event log:

session.created
participant.message.created
agent.message.started
agent.message.delta
agent.message.completed
tool.call.created
tool.call.completed
interaction.rendered
interaction.submitted
extraction.started
extraction.completed
session.completed

Advantages:

replayable;
easy to debug;
supports auditing;
well suited to streaming;
makes later analysis of agent behavior easier.
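
A minimal in-memory sketch of an append-only log and a replay function is shown below. The `Event` and `EventLog` classes and the `text` key in the event data are assumptions for illustration; a real deployment would persist events durably rather than hold them in a list.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Event:
    seq: int              # global, monotonically increasing sequence number
    session_id: str
    type: str             # e.g. "participant.message.created"
    data: dict[str, Any]


class EventLog:
    """Append-only: events can be added and read, never updated or deleted."""

    def __init__(self) -> None:
        self._events: list[Event] = []

    def append(self, session_id: str, type: str, data: dict[str, Any]) -> Event:
        event = Event(len(self._events) + 1, session_id, type, dict(data))
        self._events.append(event)
        return event

    def for_session(self, session_id: str) -> list[Event]:
        return [e for e in self._events if e.session_id == session_id]


def replay_transcript(events: list[Event]) -> list[tuple[str, str]]:
    """Rebuild the (speaker, text) transcript purely from logged events."""
    transcript: list[tuple[str, str]] = []
    for event in events:
        if event.type == "participant.message.created":
            transcript.append(("participant", event.data["text"]))
        elif event.type == "agent.message.completed":
            transcript.append(("agent", event.data["text"]))
    return transcript
```

Because the transcript is derived from the log rather than stored separately, replaying a session for debugging or audit is just a matter of folding over its events again.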

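Configuration Versioning

Each session should record:

study_config_version_id;
agent_config_version_id;
output_schema_version_id;
kb_version_ids;
enabled_tool_version_ids;
model_config_version_id.

Otherwise there is no way to explain later why the Agent asked a particular question at the time, or why a particular field was extracted.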
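
One way to pin these versions is an immutable record attached to each session, as in this sketch. The field names come from the list above; the `SessionVersionPins` class, the string ID type, and the `changed_pins` helper are assumptions.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SessionVersionPins:
    """Immutable snapshot of every config version a session ran against."""
    study_config_version_id: str
    agent_config_version_id: str
    output_schema_version_id: str
    kb_version_ids: tuple[str, ...]
    enabled_tool_version_ids: tuple[str, ...]
    model_config_version_id: str


def changed_pins(a: SessionVersionPins, b: SessionVersionPins) -> list[str]:
    """List the pinned fields that differ between two sessions."""
    left, right = asdict(a), asdict(b)
    return [name for name in left if left[name] != right[name]]
```

Comparing the pins of two sessions gives a first answer to "why did the Agent behave differently here": if nothing differs, the cause lies in the conversation itself rather than in configuration.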

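Core Engineering Principles

Keep study configuration separate from runtime state.
Agent decisions must be auditable.
UI tool calls must be structured data, not embedded HTML.
Extraction must be re-runnable.
Sessions must be replayable.
Model provider details should not leak into business logic.
The tool registry should not be hard-bound to a single study.
Evidence is a first-class object.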
