docs: add root AGENTS.md guide for AI assistants
@@ -0,0 +1,78 @@
# 🤖 ONNX VC - Agent Guidelines & Architecture Map

This file serves as a guide for AI agents (Gemini, Claude, Cursor, etc.) working on the ONNX Voice Changer repository. It explains the project architecture, directory structure, core conventions, and how to maintain the codebase.

---

## 🛠️ Technology Stack
1. **Backend:** Python 3.10+, WebSocket (using `websockets`), ONNX Runtime, NumPy, PyTorch (only for RVC export).
2. **Frontend:** Next.js 15 (App Router), React 19, TypeScript, Tailwind CSS, Lucide React, Framer Motion.
3. **Voice Conversion:** Retrieval-based Voice Conversion (RVC) models accelerated via ONNX Runtime (CPU, CUDA, DirectML).
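
As a rough illustration of the three execution targets listed above, the sketch below maps a device name to an ONNX Runtime execution-provider list. The provider strings are ONNX Runtime's own identifiers. The function name `providers_for` and the device keys `cpu` and `directml` are assumptions made for this example (only `cuda` appears elsewhere in this guide), and `server.py` may resolve devices differently.

```python
from typing import List

# Hypothetical mapping, not taken from server.py.
# Each accelerated target falls back to the CPU provider.
_PROVIDERS = {
    "cpu": ["CPUExecutionProvider"],
    "cuda": ["CUDAExecutionProvider", "CPUExecutionProvider"],
    "directml": ["DmlExecutionProvider", "CPUExecutionProvider"],
}


def providers_for(device: str) -> List[str]:
    """Return the provider list for a device name (case-insensitive)."""
    key = device.strip().lower()
    if key not in _PROVIDERS:
        raise ValueError(f"unknown device: {device!r}")
    return list(_PROVIDERS[key])


# Possible use (needs onnxruntime and a model file, so it stays commented out):
# import onnxruntime as ort
# session = ort.InferenceSession("weights/HuTao/HuTao.onnx",
#                                providers=providers_for("cuda"))
```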

---

## 📁 Repository Structure
* [/server.py](file:///M:/Users/ahmad/project/onnx-voice-changer/server.py) - The main WebSocket API server.
* [/frontend/](file:///M:/Users/ahmad/project/onnx-voice-changer/frontend) - Next.js 15 client dashboard app.
* [/frontend-deprecated/](file:///M:/Users/ahmad/project/onnx-voice-changer/frontend-deprecated) - Legacy static single-page web files (HTML/CSS/JS). Do not modify.
* [/docs/](file:///M:/Users/ahmad/project/onnx-voice-changer/docs) - Holds localized README documentation files (Indonesian, Spanish, Japanese, Chinese).
* [/lib/](file:///M:/Users/ahmad/project/onnx-voice-changer/lib) - RVC models and export scripts (e.g., `export_onnx.py`).
* [/weights/](file:///M:/Users/ahmad/project/onnx-voice-changer/weights) - Character voice models (e.g., `weights/HuTao/HuTao.onnx`).
* [/pretrained/](file:///M:/Users/ahmad/project/onnx-voice-changer/pretrained) - Holds the pre-trained `vec-768-layer-12.onnx` ContentVec model.
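
Assuming every voice follows the `weights/<Name>/<Name>.onnx` layout suggested by the HuTao example above, available voices could be discovered with a sketch like the following. The function name `find_voice_models` is hypothetical, and the real server may use a different discovery rule.

```python
from pathlib import Path
from typing import Dict


def find_voice_models(weights_dir: str = "weights") -> Dict[str, Path]:
    """Map each voice name to its ONNX file.

    Assumes the layout weights/<Name>/<Name>.onnx; folders that do not
    contain a matching file are skipped.
    """
    root = Path(weights_dir)
    models: Dict[str, Path] = {}
    if not root.is_dir():
        return models
    for folder in sorted(root.iterdir()):
        candidate = folder / f"{folder.name}.onnx"
        if folder.is_dir() and candidate.is_file():
            models[folder.name] = candidate
    return models
```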

---

## ⚙️ Core Architecture & Conventions

### 1. Pure API Backend (No Static Hosting)
* **Rule:** The Python backend (`server.py`) operates **strictly as a WebSocket API**.
* **Do NOT** configure Python to serve frontend static pages, build files, or index HTML.
* The Next.js frontend client runs independently (via `npm run dev` or a separate production server).

### 2. WebSocket Audio Pipeline
* Audio chunks are sent to and from the server as **binary WebSocket messages** containing raw `Float32` PCM audio data.
* Configuration changes, telemetry, and status controls are handled using **JSON WebSocket messages** sent over the same connection.
* In `server.py`, always check each incoming message's payload type (binary audio vs. JSON text) before handling it.
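
The payload-type check described above can be sketched as a small routing function. This is an illustration under stated assumptions, not the code in `server.py`: the name `route_message` and the returned tags `"audio"` and `"control"` are hypothetical, and the samples are decoded in the machine's native byte order.

```python
import json
from array import array
from typing import Any, Tuple, Union


def route_message(message: Union[bytes, str]) -> Tuple[str, Any]:
    """Classify one WebSocket message as an audio chunk or a JSON control message."""
    if isinstance(message, (bytes, bytearray)):
        samples = array("f")  # 32-bit floats, native byte order (assumption)
        # frombytes raises ValueError if the length is not a multiple of 4 bytes.
        samples.frombytes(bytes(message))
        return "audio", samples
    # Text frames are expected to carry JSON (config, telemetry, status).
    return "control", json.loads(message)


# With the `websockets` library, binary frames arrive as bytes and text frames
# as str, so a handler loop can pass each message straight through:
#     async for message in websocket:
#         kind, data = route_message(message)
```

Keeping the check in one place means a malformed binary chunk or invalid JSON fails early with an exception instead of reaching the inference code.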

### 3. Digital Signal Processing (DSP) Staging
* Audio preprocessing is handled on the server side:
  1. **Low-Cut Filter:** First-order Butterworth high-pass filter at 80 Hz to attenuate AC mains hum.
  2. **Noise Gate:** Threshold-based silence gate that bypasses inference when the user is silent.
  3. **Gain Controls:** Input and output gain staging before and after inference.
* Keep all DSP math vectorized on `numpy` arrays to maintain low latency.
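
A minimal sketch of the gate and gain stages, written with vectorized NumPy operations, might look like the following. The function name `process_chunk`, the RMS-based gate, the default threshold of `0.01`, and measuring the level after input gain are all assumptions for illustration. The low-cut filter is left out because a recursive filter needs state carried across chunks, and `infer` is a stand-in for the RVC model call.

```python
import numpy as np


def process_chunk(chunk, input_gain=1.0, output_gain=1.0,
                  gate_threshold=0.01, infer=None):
    """Apply input gain, gate silent chunks, run inference, apply output gain.

    Returns (audio, gated), where gated is True if inference was skipped.
    """
    x = np.asarray(chunk, dtype=np.float32) * np.float32(input_gain)
    # RMS level of the chunk, computed in float64 to limit rounding error.
    rms = float(np.sqrt(np.mean(x.astype(np.float64) ** 2))) if x.size else 0.0
    if rms < gate_threshold:
        # Gate closed: skip inference and return silence of the same length.
        return np.zeros_like(x), True
    y = infer(x) if infer is not None else x
    y = np.asarray(y, dtype=np.float32) * np.float32(output_gain)
    return y, False
```

Returning the `gated` flag lets the caller report skipped chunks in telemetry without recomputing the level.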

### 4. RVC ONNX Export
* PyTorch RVC models (`.pth`) must be converted to ONNX (`.onnx`) before inference.
* Always use [/lib/export_onnx.py](file:///M:/Users/ahmad/project/onnx-voice-changer/lib/export_onnx.py) for conversion:
```bash
python lib/export_onnx.py --model_name <CharacterFolder>
```

---

## 🎨 Frontend Design Guidelines
* **Responsive Layout:** Must support mobile and desktop views, utilizing a collapsible sidebar.
* **Themes & Accent Colors:** Supports dark/light mode toggling, with a custom accent color system (Purple, Blue, Emerald, Rose, Amber) stored in state.
* **i18n Translation:** Do not hardcode English/Indonesian strings. Ensure all labels, warnings, and messages are registered in [translations.ts](file:///M:/Users/ahmad/project/onnx-voice-changer/frontend/src/utils/translations.ts).

---

## 🏃 Useful Development Commands

### Running Backend
```bash
python server.py --host 127.0.0.1 --port 8765 --device cuda
```

### Running Frontend Dev Server
```bash
cd frontend
npm run dev
```

### Building Frontend Production Server
```bash
cd frontend
npm run build
npm run start
```