mirror of
https://github.com/linyqh/NarratoAI.git
synced 2025-12-13 20:42:48 +00:00
Compare commits
90 Commits
90 commits (author and date columns were not captured by the mirror):

4f964ad98d, dfb96e9b0f, 97bb59220f, 169daac94d, c0e3ff045a, 7b9ef2f244, 854cfab460, 474ebe46e2, 46042d17d6, eb57d2a0fe, d5f089c9a7, 77c0aa47f2, efa02d83ca, eca1fcbe67, d7b1b51a36, 4423195313, 4b0f7c3bb9, bad4a95ced, a99d752069, 6b8082244c, 52f96f9eae, 2c5c7cbd77, 303ba571cc, 067d82885b, a26c07d3dc, 207b49c9cc, f2ba9689e1, 87afe738fe, 74b52eec7b, b3fd32569e, b5548b050d, 95e3b66bc7, b1bcedd5d5, 81d8c55580, c41bd682a9, 9811607756, d8a06cc591, 287cddcc35, bb7362809a, 07da580919, aebd169900, a184662f8b, 787d17a1a9, e7db1668f8, e389412dc2, aff6aca00c, 7ae4263943, cd3a5bc837, 4dc1448154, 33fc3dab10, a15ab4c944, d83863182a, 1c8b526c3c, 4ca7ed9721, c7fdb3fc94, 9132e2b148, 271401af99, f70cfbab46, 5ef9f4a10c, d55754c7fb, e76031832c, eadaf1be6e, 79b0d613e3, 706d73383e, 2e0c492778, 13a87e2a00, 458071d583, 9c4b3338c2, 053212b182, 6f48fa2563, 18d2efd664, 70b8b49e41, c3d855c547, f740e5a4bd, 72165dbcd9, ebdae9998d, 316be8f422, 3537c19f4b, 2ac74132fc, 6bbe4bc14b, a94baee22a, 0d944413ab, a70d396143, 05fb2681d5, ef68697491, f2d652e7a8, ca05440fc0, 57cafaa73f, 97b30e4390, 716b22ef9a
README-ja.md (84 lines removed)

@ -1,84 +0,0 @@
<div align="center">
<h1 align="center" style="font-size: 2cm;"> NarratoAI 😎📽️ </h1>
<h3 align="center">All-in-one AI film commentary and automated video editing tool 🎬🎞️</h3>

<h3>📖 <a href="README-cn.md">简体中文</a> | <a href="README.md">English</a> | 日本語 </h3>
<div align="center">

[//]: # ( <a href="https://trendshift.io/repositories/8731" target="_blank"><img src="https://trendshift.io/api/badge/repositories/8731" alt="harry0703%2FNarratoAI | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>)
</div>
<br>
NarratoAI is an automated video narration tool that uses LLMs to deliver an all-in-one solution for script writing, automated video editing, voice-over, and subtitle generation.
<br>

[](https://github.com/linyqh/NarratoAI)
[](https://github.com/linyqh/NarratoAI/blob/main/LICENSE)
[](https://github.com/linyqh/NarratoAI/issues)
[](https://github.com/linyqh/NarratoAI/stargazers)

<a href="https://discord.gg/uVAJftcm" target="_blank">💬 Join the open-source community on Discord to get the latest project updates.</a>

<h2><a href="https://p9mf6rjv3c.feishu.cn/wiki/SP8swLLZki5WRWkhuFvc2CyInDg?from=from_copylink" target="_blank">🎉🎉🎉 Official documentation 🎉🎉🎉</a> </h2>
<h3>Home</h3>

<h3>Video review interface</h3>

</div>

## Latest news
- 2024.11.24 Discord community opened: https://discord.gg/uVAJftcm
- 2024.11.11 Moved to an open-source community; everyone is welcome to join! [Join the official community](https://github.com/linyqh/NarratoAI/wiki)
- 2024.11.10 Official documentation released; see the [official documentation](https://p9mf6rjv3c.feishu.cn/wiki/SP8swLLZki5WRWkhuFvc2CyInDg) for details
- 2024.11.10 New version v0.3.5 released; optimized the video editing workflow

## Roadmap 🥳
- [x] Windows all-in-one package released
- [x] Optimized the story generation workflow and improved output quality
- [x] Version 0.3.5 all-in-one package released
- [x] Video understanding with Alibaba's Qwen2-VL large model
- [x] Short-drama commentary support
- [x] One-click material merging
- [x] One-click transcription
- [x] One-click cache clearing
- [ ] Export of JianYing (CapCut CN) drafts
- [ ] Main-character face matching
- [ ] Automatic matching based on voice-over, script, and video material
- [ ] More TTS engines
- [ ] ...

## System requirements 📦

- Recommended minimum: 4-core CPU or better, 8 GB RAM or more; a GPU is not required
- Windows 10 or macOS 11.0 and above

## Feedback and suggestions 📢

👏 1. File an [issue](https://github.com/linyqh/NarratoAI/issues) or a [pull request](https://github.com/linyqh/NarratoAI/pulls)

💬 2. [Join the open-source community chat group](https://github.com/linyqh/NarratoAI/wiki)

📷 3. Follow the official account [NarratoAI助手] for the latest updates

## Reference projects 📚
- https://github.com/FujiwaraChoki/MoneyPrinter
- https://github.com/harry0703/MoneyPrinterTurbo

This project was refactored on top of the projects above, with film commentary features added. Many thanks to the original authors 🥳🥳🥳

## Buy the author a coffee ☕️
<div style="display: flex; justify-content: space-between;">
  <img src="https://github.com/user-attachments/assets/5038ccfb-addf-4db1-9966-99415989fd0c" alt="Image 1" style="width: 350px; height: 350px; margin: auto;"/>
  <img src="https://github.com/user-attachments/assets/07d4fd58-02f0-425c-8b59-2ab94b4f09f8" alt="Image 2" style="width: 350px; height: 350px; margin: auto;"/>
</div>

## License 📝

See the [`LICENSE`](LICENSE) file

## Star History

[](https://star-history.com/#linyqh/NarratoAI&Date)
@ -4,7 +4,7 @@

 <h3 align="center">One-stop AI film commentary + automated editing tool 🎬🎞️</h3>

-<h3>📖 <a href="README-en.md">English</a> | 简体中文 | <a href="README-ja.md">日本語</a> </h3>
+<h3>📖 <a href="README-en.md">English</a> | 简体中文 </h3>
 <div align="center">

 [//]: # ( <a href="https://trendshift.io/repositories/8731" target="_blank"><img src="https://trendshift.io/api/badge/repositories/8731" alt="harry0703%2FNarratoAI | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>)
@ -31,6 +31,7 @@ NarratoAI is an automated film commentary tool that uses LLMs for script writing,

 This project is for learning and research use only; commercial use is not permitted. Contact the author for a commercial license.

 ## Latest news
+- 2025.11.20 Released version 0.7.5, adding [IndexTTS2](https://github.com/index-tts/index-tts) voice cloning support
 - 2025.10.15 Released version 0.7.3, managing model vendors with [LiteLLM](https://github.com/BerriAI/litellm)
 - 2025.09.10 Released version 0.7.2, adding Tencent Cloud TTS
 - 2025.08.18 Released version 0.7.1, supporting **voice cloning** and the latest large models
@ -52,6 +52,7 @@ def save_config():

         _cfg["soulvoice"] = soulvoice
         _cfg["ui"] = ui
         _cfg["tts_qwen"] = tts_qwen
+        _cfg["indextts2"] = indextts2
         f.write(toml.dumps(_cfg))

@ -65,6 +66,7 @@ soulvoice = _cfg.get("soulvoice", {})

 ui = _cfg.get("ui", {})
 frames = _cfg.get("frames", {})
 tts_qwen = _cfg.get("tts_qwen", {})
+indextts2 = _cfg.get("indextts2", {})

 hostname = socket.gethostname()
@ -187,8 +187,27 @@ class LiteLLMVisionProvider(VisionModelProvider):

         # Call LiteLLM
         try:
+            # Prepare parameters
+            effective_model_name = self.model_name
+
+            # Special handling for SiliconFlow
+            if self.model_name.lower().startswith("siliconflow/"):
+                # Swap the provider prefix to openai
+                if "/" in self.model_name:
+                    effective_model_name = f"openai/{self.model_name.split('/', 1)[1]}"
+                else:
+                    effective_model_name = f"openai/{self.model_name}"
+
+                # Make sure OPENAI_API_KEY is set (if it is not already)
+                import os
+                if not os.environ.get("OPENAI_API_KEY") and os.environ.get("SILICONFLOW_API_KEY"):
+                    os.environ["OPENAI_API_KEY"] = os.environ.get("SILICONFLOW_API_KEY")
+
+                # Make sure base_url is set (if it is not already)
+                if not hasattr(self, '_api_base'):
+                    self._api_base = "https://api.siliconflow.cn/v1"
+
             completion_kwargs = {
-                "model": self.model_name,
+                "model": effective_model_name,
                 "messages": messages,
                 "temperature": kwargs.get("temperature", 1.0),
                 "max_tokens": kwargs.get("max_tokens", 4000)

@ -198,6 +217,12 @@ class LiteLLMVisionProvider(VisionModelProvider):

             if hasattr(self, '_api_base'):
                 completion_kwargs["api_base"] = self._api_base

+            # Allow api_key and api_base to be passed in dynamically
+            if "api_key" in kwargs:
+                completion_kwargs["api_key"] = kwargs["api_key"]
+            if "api_base" in kwargs:
+                completion_kwargs["api_base"] = kwargs["api_base"]
+
             response = await acompletion(**completion_kwargs)

             if response.choices and len(response.choices) > 0:
@ -346,8 +371,27 @@ class LiteLLMTextProvider(TextModelProvider):

         messages = self._build_messages(prompt, system_prompt)

+        # Prepare parameters
+        effective_model_name = self.model_name
+
+        # Special handling for SiliconFlow
+        if self.model_name.lower().startswith("siliconflow/"):
+            # Swap the provider prefix to openai
+            if "/" in self.model_name:
+                effective_model_name = f"openai/{self.model_name.split('/', 1)[1]}"
+            else:
+                effective_model_name = f"openai/{self.model_name}"
+
+            # Make sure OPENAI_API_KEY is set (if it is not already)
+            import os
+            if not os.environ.get("OPENAI_API_KEY") and os.environ.get("SILICONFLOW_API_KEY"):
+                os.environ["OPENAI_API_KEY"] = os.environ.get("SILICONFLOW_API_KEY")
+
+            # Make sure base_url is set (if it is not already)
+            if not hasattr(self, '_api_base'):
+                self._api_base = "https://api.siliconflow.cn/v1"
+
         completion_kwargs = {
-            "model": self.model_name,
+            "model": effective_model_name,
             "messages": messages,
             "temperature": temperature
         }

@ -369,6 +413,12 @@ class LiteLLMTextProvider(TextModelProvider):

         if hasattr(self, '_api_base'):
             completion_kwargs["api_base"] = self._api_base

+        # Allow api_key and api_base to be passed in dynamically (fixes an auth issue)
+        if "api_key" in kwargs:
+            completion_kwargs["api_key"] = kwargs["api_key"]
+        if "api_base" in kwargs:
+            completion_kwargs["api_base"] = kwargs["api_base"]
+
         try:
             # Call LiteLLM (with automatic retry)
             response = await acompletion(**completion_kwargs)
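The SiliconFlow special-casing added in both providers boils down to a single rewrite of the model route; isolated as a sketch (the function name is illustrative, not from the project):

```python
def to_openai_route(model_name: str) -> str:
    """Rewrite a 'siliconflow/...' model name to LiteLLM's 'openai/...' route,
    so the request is sent through the OpenAI-compatible endpoint instead."""
    if model_name.lower().startswith("siliconflow/"):
        # Keep everything after the first slash, which may itself contain slashes
        return f"openai/{model_name.split('/', 1)[1]}"
    return model_name
```

`to_openai_route("siliconflow/Qwen/Qwen2.5-72B-Instruct")` yields `openai/Qwen/Qwen2.5-72B-Instruct`; any other name passes through unchanged.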
@ -251,7 +251,9 @@ class SubtitleAnalyzerAdapter:

             UnifiedLLMService.analyze_subtitle,
             subtitle_content=subtitle_content,
             provider=self.provider,
-            temperature=1.0
+            temperature=1.0,
+            api_key=self.api_key,
+            api_base=self.base_url
         )

         return {

@ -301,7 +303,9 @@ class SubtitleAnalyzerAdapter:

             system_prompt="你是一位专业的短视频解说脚本撰写专家。",
             provider=self.provider,
             temperature=temperature,
-            response_format="json"
+            response_format="json",
+            api_key=self.api_key,
+            api_base=self.base_url
         )

         # Clean up the JSON output
@ -1107,6 +1107,10 @@ def tts(

     if tts_engine == "edge_tts":
         logger.info("分发到 Edge TTS")
         return azure_tts_v1(text, voice_name, voice_rate, voice_pitch, voice_file)

+    if tts_engine == "indextts2":
+        logger.info("分发到 IndexTTS2")
+        return indextts2_tts(text, voice_name, voice_file, speed=voice_rate)
+
     # Fallback for unknown engine - default to azure v1
     logger.warning(f"未知的 TTS 引擎: '{tts_engine}', 将默认使用 Edge TTS (Azure V1)。")

@ -1541,8 +1545,8 @@ def tts_multiple(task_id: str, list_script: list, voice_name: str, voice_rate: f

                            f"或者使用其他 tts 引擎")
             continue
         else:
-            # The SoulVoice engine does not generate subtitle files
-            if is_soulvoice_voice(voice_name) or is_qwen_engine(tts_engine):
+            # The SoulVoice, Qwen3, and IndexTTS2 engines do not generate subtitle files
+            if is_soulvoice_voice(voice_name) or is_qwen_engine(tts_engine) or tts_engine == "indextts2":
                 # Get the duration of the actual audio file
                 duration = get_audio_duration_from_file(audio_file)
                 if duration <= 0:
@ -1943,4 +1947,127 @@ def parse_soulvoice_voice(voice_name: str) -> str:

     return voice_name


+def parse_indextts2_voice(voice_name: str) -> str:
+    """
+    Parse an IndexTTS2 voice name.
+    Supported format: indextts2:reference_audio_path
+    Returns the path of the reference audio file.
+    """
+    if voice_name.startswith("indextts2:"):
+        return voice_name[10:]  # strip the "indextts2:" prefix
+    return voice_name
+
+
+def indextts2_tts(text: str, voice_name: str, voice_file: str, speed: float = 1.0) -> Union[SubMaker, None]:
+    """
+    Zero-shot voice cloning via the IndexTTS2 API.
+
+    Args:
+        text: text to synthesize
+        voice_name: reference audio path (format: indextts2:path/to/audio.wav)
+        voice_file: output audio file path
+        speed: speech rate (not yet supported by this engine)
+
+    Returns:
+        SubMaker: subtitle maker with timing information, or None on failure
+    """
+    # Read configuration
+    api_url = config.indextts2.get("api_url", "http://192.168.3.6:8081/tts")
+    infer_mode = config.indextts2.get("infer_mode", "普通推理")
+    temperature = config.indextts2.get("temperature", 1.0)
+    top_p = config.indextts2.get("top_p", 0.8)
+    top_k = config.indextts2.get("top_k", 30)
+    do_sample = config.indextts2.get("do_sample", True)
+    num_beams = config.indextts2.get("num_beams", 3)
+    repetition_penalty = config.indextts2.get("repetition_penalty", 10.0)
+
+    # Resolve the reference audio path
+    reference_audio_path = parse_indextts2_voice(voice_name)
+
+    if not reference_audio_path or not os.path.exists(reference_audio_path):
+        logger.error(f"IndexTTS2 参考音频文件不存在: {reference_audio_path}")
+        return None
+
+    # Prepare the request payload
+    files = {
+        'prompt_audio': open(reference_audio_path, 'rb')
+    }
+
+    data = {
+        'text': text.strip(),
+        'infer_mode': infer_mode,
+        'temperature': temperature,
+        'top_p': top_p,
+        'top_k': top_k,
+        'do_sample': do_sample,
+        'num_beams': num_beams,
+        'repetition_penalty': repetition_penalty,
+    }
+
+    # Retry loop
+    for attempt in range(3):
+        try:
+            logger.info(f"第 {attempt + 1} 次调用 IndexTTS2 API")
+
+            # Configure proxies
+            proxies = {}
+            if config.proxy.get("http"):
+                proxies = {
+                    'http': config.proxy.get("http"),
+                    'https': config.proxy.get("https", config.proxy.get("http"))
+                }
+
+            # Call the API
+            response = requests.post(
+                api_url,
+                files=files,
+                data=data,
+                proxies=proxies,
+                timeout=120  # IndexTTS2 inference can take a while
+            )
+
+            if response.status_code == 200:
+                # Save the audio file
+                with open(voice_file, 'wb') as f:
+                    f.write(response.content)
+
+                logger.info(f"IndexTTS2 成功生成音频: {voice_file}, 大小: {len(response.content)} 字节")
+
+                # IndexTTS2 cannot produce precise subtitles, so return a simple SubMaker
+                sub_maker = SubMaker()
+                # Estimate the audio duration from the text length
+                estimated_duration_ms = max(1000, int(len(text) * 200))
+                sub_maker.create_sub((0, estimated_duration_ms * 10000), text)  # ms -> 100 ns units
+
+                return sub_maker
+
+            else:
+                logger.error(f"IndexTTS2 API 调用失败: {response.status_code} - {response.text}")
+
+        except requests.exceptions.Timeout:
+            logger.error(f"IndexTTS2 API 调用超时 (尝试 {attempt + 1}/3)")
+        except requests.exceptions.RequestException as e:
+            logger.error(f"IndexTTS2 API 网络错误: {str(e)} (尝试 {attempt + 1}/3)")
+        except Exception as e:
+            logger.error(f"IndexTTS2 TTS 处理错误: {str(e)} (尝试 {attempt + 1}/3)")
+        finally:
+            # Make sure the file handle is closed
+            try:
+                files['prompt_audio'].close()
+            except Exception:
+                pass
+
+        if attempt < 2:  # not the last attempt
+            time.sleep(2)  # wait 2 seconds before retrying
+            # Reopen the reference audio for the next attempt
+            try:
+                files['prompt_audio'] = open(reference_audio_path, 'rb')
+            except OSError:
+                pass
+
+    logger.error("IndexTTS2 TTS 生成失败,已达到最大重试次数")
+    return None
@ -1,5 +1,5 @@

 [app]
-project_version="0.7.4"
+project_version="0.7.5"

 # LLM API timeout configuration (seconds)
 llm_vision_timeout = 120  # base timeout for vision models

@ -115,10 +115,30 @@

 # Visit https://bailian.console.aliyun.com/?tab=model#/api-key to get your API key
 api_key = ""
 model_name = "qwen3-tts-flash"

+[indextts2]
+# IndexTTS2 voice cloning configuration
+# An open-source zero-shot voice cloning project that you deploy yourself
+# Project: https://github.com/index-tts/index-tts
+# Default API address (local deployment)
+api_url = "http://127.0.0.1:8081/tts"
+
+# Default reference audio path (optional)
+# reference_audio = "/path/to/reference_audio.wav"
+
+# Inference mode: 普通推理 (normal) / 快速推理 (fast)
+infer_mode = "普通推理"
+
+# Advanced parameters
+temperature = 1.0
+top_p = 0.8
+top_k = 30
+do_sample = true
+num_beams = 3
+repetition_penalty = 10.0
+
 [ui]
-# TTS engine selection
-# Options: edge_tts, azure_speech, soulvoice, tencent_tts, tts_qwen
+# TTS engine selection (edge_tts, azure_speech, soulvoice, tencent_tts, tts_qwen)
 tts_engine = "edge_tts"

 # Edge TTS configuration
Dockerfile (31 lines removed)

@ -1,31 +0,0 @@

ARG BASE=nvidia/cuda:12.1.0-devel-ubuntu22.04
FROM ${BASE}

# Environment variables
ENV http_proxy=http://host.docker.internal:7890
ENV https_proxy=http://host.docker.internal:7890
ENV DEBIAN_FRONTEND=noninteractive

# Install system dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    gcc g++ make git python3 python3-dev python3-pip python3-venv python3-wheel \
    espeak-ng libsndfile1-dev nano vim unzip wget xz-utils && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Working directory
WORKDIR /root/MiniCPM-V/

# Install Python dependencies
RUN git clone https://github.com/OpenBMB/MiniCPM-V.git && \
    cd MiniCPM-V && \
    pip3 install decord && \
    pip3 install --no-cache-dir -r requirements.txt && \
    pip3 install flash_attn

# Clear the proxy environment variables
ENV http_proxy=""
ENV https_proxy=""

# Set PYTHONPATH
ENV PYTHONPATH="/root/MiniCPM-V/"
@ -1,174 +0,0 @@

# Audio Volume Balancing Optimization - Summary

## Problem solved

✅ **Fixed**: the original video audio was much quieter than the TTS narration

### Original problem
- Even with the original audio set to 1.0 and the narration to 0.7, the original track was still far quieter than the narration
- Poor user experience; volumes had to be adjusted manually to hear the original audio

### Root causes
1. **Loudness mismatch**: TTS audio typically sits around -24 dB LUFS, while the original video track may be only -33 dB LUFS
2. **No normalization**: a plain volume multiplier cannot fix a loudness mismatch
3. **Bad defaults**: the default original-audio volume of 0.7 was too low

## Solution

### 1. Audio analysis tool ✅
- **File**: `app/services/audio_normalizer.py`
- **Features**: LUFS loudness analysis, RMS computation, audio normalization
- **Test results**:
  - TTS test audio: -24.15 LUFS
  - Original test audio: -32.95 LUFS
  - Suggested smart adjustment: TTS ×1.61, original ×3.00
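The suggested multipliers above follow from the standard decibel-to-amplitude relation (a gain of g dB multiplies amplitude by 10^(g/20)); a minimal sketch, where the -20 LUFS target and the ×3.0 cap are inferred from the reported numbers rather than taken from the project code:

```python
def gain_factor(current_lufs: float, target_lufs: float) -> float:
    """Linear amplitude multiplier that moves a track from its measured
    integrated loudness to the target loudness (both in LUFS, a dB scale)."""
    return 10 ** ((target_lufs - current_lufs) / 20)

# Reproducing the suggestions above with an assumed -20 LUFS target:
tts_gain = gain_factor(-24.15, -20.0)             # ~1.61
orig_gain = min(3.0, gain_factor(-32.95, -20.0))  # raw ~4.44, capped to 3.0
```

With that target, the measured -24.15 LUFS TTS track needs roughly ×1.61 and the -32.95 LUFS original roughly ×4.44, which the clamp reduces to the reported ×3.00.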
### 2. Configuration tweaks ✅
- **File**: `app/models/schema.py`
- **Changes**:
  - Default original-audio volume: 0.7 → 1.2
  - Maximum volume limit: 1.0 → 2.0
  - New switch for smart adjustment

### 3. Smart volume adjustment ✅
- **File**: `app/services/generate_video.py`
- **Feature**: automatically analyzes the loudness gap between tracks and computes suitable adjustment factors
- **Notes**: preserves the relative ratio set by the user and clamps the adjustment range

### 4. Configuration management ✅
- **File**: `app/config/audio_config.py`
- **Features**:
  - Volume configurations for different video types
  - Preset profiles (balanced, voice_focused, etc.)
  - Recommendations by content type

### 5. Task integration ✅
- **File**: `app/services/task.py`
- **Change**: automatically applies the optimized volume configuration
- **Compatibility**: backward compatible with existing settings

## Testing

### Functional tests ✅
```bash
python test_audio_optimization.py
```
- Audio analysis works
- Configuration system works
- Smart adjustment math is correct

### Example demos ✅
```bash
python examples/audio_volume_example.py
```
- Basic configuration usage
- Smart analysis demo
- Real-world scenarios

## Before and after

| Item | Before | After | Improvement |
|------|--------|-------|-------------|
| TTS volume | 0.7 | 0.8 (smart) | better balanced |
| Original volume | 1.0 | 1.3 (smart) | clearly louder |
| Loudness gap | ~9 dB | ~3 dB | much smaller |
| User experience | manual tweaking | automatic balance | noticeably better |

## Recommended configurations

### Mixed content (default)
```python
{
    'tts_volume': 0.8,
    'original_volume': 1.3,
    'bgm_volume': 0.3
}
```

### Original-audio-heavy content
```python
{
    'tts_volume': 0.6,
    'original_volume': 1.6,
    'bgm_volume': 0.1
}
```

### Educational videos
```python
{
    'tts_volume': 0.9,
    'original_volume': 0.8,
    'bgm_volume': 0.2
}
```

## Technical highlights

### Smart analysis
- Uses FFmpeg's loudnorm filter for LUFS analysis
- Falls back to RMS computation
- Automatically computes the best volume adjustment factors
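The FFmpeg-based measurement above usually amounts to running `ffmpeg -i input.wav -af loudnorm=print_format=json -f null -` and parsing the JSON block the filter prints to stderr; a minimal parser sketch (the helper name and sample values are illustrative, not from the project):

```python
import json
import re

def parse_loudnorm_output(stderr_text: str) -> float:
    """Extract the measured integrated loudness ("input_i", in LUFS) from
    the JSON block FFmpeg's loudnorm filter prints at the end of stderr."""
    match = re.search(r"\{[^{}]*\}\s*$", stderr_text.strip())
    if not match:
        raise ValueError("no loudnorm JSON block found in FFmpeg output")
    return float(json.loads(match.group(0))["input_i"])

# A trimmed example of what FFmpeg prints with -af loudnorm=print_format=json
sample = """[Parsed_loudnorm_0 @ 0x55d3]
{
    "input_i" : "-32.95",
    "input_tp" : "-10.20",
    "input_lra" : "6.50",
    "input_thresh" : "-43.10"
}"""
```

`parse_loudnorm_output(sample)` returns -32.95, which can then be fed into the gain computation; note that loudnorm reports the numeric fields as quoted strings, hence the `float(...)` conversion.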
### Flexible configuration
- Supports multiple video types
- Preset profiles
- User settings take precedence

### Performance
- Smart analysis is optional (on by default)
- Temporary files are cleaned up automatically
- Backward compatible with existing code

## File inventory

### Core files
- `app/services/audio_normalizer.py` - audio analysis and normalization
- `app/config/audio_config.py` - audio configuration management
- `app/services/generate_video.py` - smart adjustment integration
- `app/services/task.py` - task pipeline changes
- `app/models/schema.py` - updated configuration parameters

### Tests and docs
- `test_audio_optimization.py` - functional test script
- `examples/audio_volume_example.py` - usage examples
- `docs/audio_optimization_guide.md` - detailed guide
- `AUDIO_OPTIMIZATION_SUMMARY.md` - this summary

## Usage

### Automatic optimization (recommended)
The system applies the optimized configuration automatically; no extra steps are needed.

### Manual configuration
```python
# Apply a preset profile
volumes = AudioConfig.apply_volume_profile('original_focused')

# Get a recommendation for a content type
volumes = get_recommended_volumes_for_content('original_heavy')
```

### Disabling smart analysis
```python
# Set in schema.py
ENABLE_SMART_VOLUME = False
```

## Future improvements

1. **UI integration**: add volume configuration options to the WebUI
2. **Live preview**: preview volume adjustments in real time
3. **Machine learning**: learn the best configuration from user feedback
4. **Batch processing**: support batch audio normalization

## Conclusion

Audio loudness analysis plus smart volume adjustment solved the quiet-original-audio problem. The new system can:

1. **Detect** loudness mismatches automatically
2. **Adjust** the volume balance intelligently
3. **Stay compatible** with existing configuration
4. **Offer flexible** configuration options

Users now get a balanced mix and can hear both the original audio and the TTS narration clearly without manual volume tweaking.
@ -1,367 +0,0 @@

# NarratoAI LLM Service Migration Guide

## 📋 Overview

This guide helps developers migrate existing code from the old LLM call style to the new unified LLM service architecture. The new architecture provides better modularity, error handling, and configuration management.

## 🔄 Migration comparison

### Old vs new call style

#### 1. Creating a vision analyzer

**Old:**
```python
from app.utils import gemini_analyzer, qwenvl_analyzer

if provider == 'gemini':
    analyzer = gemini_analyzer.VisionAnalyzer(
        model_name=model,
        api_key=api_key,
        base_url=base_url
    )
elif provider == 'qwenvl':
    analyzer = qwenvl_analyzer.QwenAnalyzer(
        model_name=model,
        api_key=api_key,
        base_url=base_url
    )
```

**New:**
```python
from app.services.llm.unified_service import UnifiedLLMService

# Option 1: use the unified service directly
results = await UnifiedLLMService.analyze_images(
    images=images,
    prompt=prompt,
    provider=provider  # optional; defaults to the configured value
)

# Option 2: use the migration adapter (backward compatible)
from app.services.llm.migration_adapter import create_vision_analyzer
analyzer = create_vision_analyzer(provider, api_key, model, base_url)
results = await analyzer.analyze_images(images, prompt)
```

#### 2. Text generation

**Old:**
```python
from openai import OpenAI

client = OpenAI(api_key=api_key, base_url=base_url)
response = client.chat.completions.create(
    model=model,
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": prompt}
    ],
    temperature=temperature,
    response_format={"type": "json_object"}
)
result = response.choices[0].message.content
```

**New:**
```python
from app.services.llm.unified_service import UnifiedLLMService

result = await UnifiedLLMService.generate_text(
    prompt=prompt,
    system_prompt=system_prompt,
    temperature=temperature,
    response_format="json"
)
```

#### 3. Narration script generation

**Old:**
```python
from app.services.generate_narration_script import generate_narration

narration = generate_narration(
    markdown_content,
    api_key,
    base_url=base_url,
    model=model
)
# Parse the JSON and validate the format by hand
import json
narration_dict = json.loads(narration)['items']
```

**New:**
```python
from app.services.llm.unified_service import UnifiedLLMService

# Output format is validated automatically
narration_items = await UnifiedLLMService.generate_narration_script(
    prompt=prompt,
    validate_output=True  # validates the JSON format and fields automatically
)
```

## 📝 Migration steps

### Step 1: update the configuration file

**Old format:**
```toml
[app]
llm_provider = "openai"
openai_api_key = "sk-xxx"
openai_model_name = "gpt-4"

vision_llm_provider = "gemini"
gemini_api_key = "xxx"
gemini_model_name = "gemini-1.5-pro"
```

**New format:**
```toml
[app]
# Vision model configuration
vision_llm_provider = "gemini"
vision_gemini_api_key = "xxx"
vision_gemini_model_name = "gemini-2.0-flash-lite"
vision_gemini_base_url = "https://generativelanguage.googleapis.com/v1beta"

# Text model configuration
text_llm_provider = "openai"
text_openai_api_key = "sk-xxx"
text_openai_model_name = "gpt-4o-mini"
text_openai_base_url = "https://api.openai.com/v1"
```

### Step 2: update imports

**Old:**
```python
from app.utils import gemini_analyzer, qwenvl_analyzer
from app.services.generate_narration_script import generate_narration
from app.services.SDE.short_drama_explanation import analyze_subtitle
```

**New:**
```python
from app.services.llm.unified_service import UnifiedLLMService
from app.services.llm.migration_adapter import (
    create_vision_analyzer,
    SubtitleAnalyzerAdapter
)
```

### Step 3: update call sites

#### Image analysis

**Old code:**
```python
def analyze_images_old(provider, api_key, model, base_url, images, prompt):
    if provider == 'gemini':
        analyzer = gemini_analyzer.VisionAnalyzer(
            model_name=model,
            api_key=api_key,
            base_url=base_url
        )
    else:
        analyzer = qwenvl_analyzer.QwenAnalyzer(
            model_name=model,
            api_key=api_key,
            base_url=base_url
        )

    # Synchronous calls
    results = []
    for batch in batches:
        result = analyzer.analyze_batch(batch, prompt)
        results.append(result)
    return results
```

**New code:**
```python
async def analyze_images_new(images, prompt, provider=None):
    # Asynchronous call with automatic batching
    results = await UnifiedLLMService.analyze_images(
        images=images,
        prompt=prompt,
        provider=provider,
        batch_size=10
    )
    return results
```

#### Subtitle analysis

**Old code:**
```python
from app.services.SDE.short_drama_explanation import analyze_subtitle

result = analyze_subtitle(
    subtitle_file_path=subtitle_path,
    api_key=api_key,
    model=model,
    base_url=base_url,
    provider=provider
)
```

**New code:**
```python
# Option 1: use the unified service
with open(subtitle_path, 'r', encoding='utf-8') as f:
    subtitle_content = f.read()

result = await UnifiedLLMService.analyze_subtitle(
    subtitle_content=subtitle_content,
    provider=provider,
    validate_output=True
)

# Option 2: use the adapter
from app.services.llm.migration_adapter import SubtitleAnalyzerAdapter

analyzer = SubtitleAnalyzerAdapter(api_key, model, base_url, provider)
result = analyzer.analyze_subtitle(subtitle_content)
```

## 🔧 Common migration issues

### 1. Synchronous vs asynchronous calls

**Problem:** the new architecture is asynchronous; old code is synchronous.

**Solution:**
```python
# Call an async function from synchronous code
import asyncio

def sync_function():
    result = asyncio.run(UnifiedLLMService.generate_text(prompt))
    return result

# Or make the whole function asynchronous
async def async_function():
    result = await UnifiedLLMService.generate_text(prompt)
    return result
```

### 2. Configuration lookups changed

**Problem:** configuration key names have changed.

**Solution:**
```python
# Old
api_key = config.app.get('openai_api_key')
model = config.app.get('openai_model_name')

# New
provider = config.app.get('text_llm_provider', 'openai')
api_key = config.app.get(f'text_{provider}_api_key')
model = config.app.get(f'text_{provider}_model_name')
```

### 3. Updated error handling

**Old:**
```python
try:
    result = some_llm_call()
except Exception as e:
    print(f"Error: {e}")
```

**New:**
```python
from app.services.llm.exceptions import LLMServiceError, ValidationError

try:
    result = await UnifiedLLMService.generate_text(prompt)
except ValidationError as e:
    print(f"输出验证失败: {e.message}")
except LLMServiceError as e:
    print(f"LLM服务错误: {e.message}")
except Exception as e:
    print(f"未知错误: {e}")
```

## ✅ Migration checklist

### Configuration
- [ ] Update the configuration file format
- [ ] Verify all API keys are configured correctly
- [ ] Run the configuration validator

### Code
- [ ] Update import statements
- [ ] Convert synchronous calls to asynchronous ones
- [ ] Update error handling
- [ ] Use the new unified interface

### Testing
- [ ] Run the LLM service test script
- [ ] Test every feature module
- [ ] Verify output formats
- [ ] Check performance and stability

### Cleanup
- [ ] Remove unused legacy code
- [ ] Update documentation and comments
- [ ] Remove stale dependencies

## 🚀 Migration best practices

### 1. Migrate gradually
- Migrate one module first; move on only after its tests pass
- Keep the old code around as a fallback
- Use the migration adapter for backward compatibility

### 2. Test thoroughly
- Run tests after every migration step
- Compare outputs of the old and new implementations
- Test edge cases and error handling

### 3. Monitoring and logging
- Enable verbose logging
- Monitor API call success rates
- Track performance metrics

### 4. Documentation
- Update code comments
- Update API documentation
- Record problems and solutions found during migration

## 📞 Getting help

If you run into problems during migration:

1. **Check the test script output**:
   ```bash
   python app/services/llm/test_llm_service.py
   ```

2. **Validate the configuration**:
   ```python
   from app.services.llm.config_validator import LLMConfigValidator
   results = LLMConfigValidator.validate_all_configs()
   LLMConfigValidator.print_validation_report(results)
   ```

3. **Check the detailed logs**:
   ```python
   from loguru import logger
   logger.add("migration.log", level="DEBUG")
   ```

4. **See the example code**:
   - Usage examples in `app/services/llm/test_llm_service.py`
   - Already-migrated files such as `webui/tools/base.py`

---

*Last updated: 2025-01-07*
@ -1,294 +0,0 @@
# NarratoAI LLM Service Usage Guide

## 📖 Overview

The NarratoAI project has completed a full refactor of its LLM services, providing a unified, modular, and extensible integration architecture. The new architecture supports multiple LLM providers and features strict output-format validation and robust error handling.

## 🏗️ Architecture Overview

### Core Components

```
app/services/llm/
├── __init__.py              # Module entry point
├── base.py                  # Abstract base classes
├── manager.py               # Service manager
├── unified_service.py       # Unified service interface
├── validators.py            # Output format validators
├── exceptions.py            # Exception classes
├── migration_adapter.py     # Migration adapter
├── config_validator.py      # Configuration validator
├── test_llm_service.py      # Test script
└── providers/               # Provider implementations
    ├── __init__.py
    ├── gemini_provider.py
    ├── gemini_openai_provider.py
    ├── openai_provider.py
    ├── qwen_provider.py
    ├── deepseek_provider.py
    └── siliconflow_provider.py
```

### Supported Providers

#### Vision Model Providers
- **Gemini** (native API + OpenAI-compatible)
- **QwenVL** (Qwen vision)
- **Siliconflow**

#### Text Generation Providers
- **OpenAI** (standard OpenAI API)
- **Gemini** (native API + OpenAI-compatible)
- **DeepSeek**
- **Qwen**
- **Siliconflow**

## ⚙️ Configuration

### Configuration File Format

Configure the LLM services in `config.toml`:

```toml
[app]
# Vision model provider
vision_llm_provider = "gemini"

# Gemini vision model
vision_gemini_api_key = "your_gemini_api_key"
vision_gemini_model_name = "gemini-2.0-flash-lite"
vision_gemini_base_url = "https://generativelanguage.googleapis.com/v1beta"

# QwenVL vision model
vision_qwenvl_api_key = "your_qwen_api_key"
vision_qwenvl_model_name = "qwen2.5-vl-32b-instruct"
vision_qwenvl_base_url = "https://dashscope.aliyuncs.com/compatible-mode/v1"

# Text model provider
text_llm_provider = "openai"

# OpenAI text model
text_openai_api_key = "your_openai_api_key"
text_openai_model_name = "gpt-4o-mini"
text_openai_base_url = "https://api.openai.com/v1"

# DeepSeek text model
text_deepseek_api_key = "your_deepseek_api_key"
text_deepseek_model_name = "deepseek-chat"
text_deepseek_base_url = "https://api.deepseek.com"
```

### Configuration Validation

Use the configuration validator to check that the configuration is correct:

```python
from app.services.llm.config_validator import LLMConfigValidator

# Validate all configurations
results = LLMConfigValidator.validate_all_configs()

# Print a validation report
LLMConfigValidator.print_validation_report(results)

# Get configuration suggestions
suggestions = LLMConfigValidator.get_config_suggestions()
```
## 🚀 Usage

### 1. Unified Service Interface (Recommended)

```python
from app.services.llm.unified_service import UnifiedLLMService

# Image analysis
results = await UnifiedLLMService.analyze_images(
    images=["path/to/image1.jpg", "path/to/image2.jpg"],
    prompt="Please describe the content of these images",
    provider="gemini",  # optional; defaults to the provider in the config
    batch_size=10
)

# Text generation
text = await UnifiedLLMService.generate_text(
    prompt="Please outline the history of artificial intelligence",
    system_prompt="You are a professional AI expert",
    provider="openai",  # optional
    temperature=0.7,
    response_format="json"  # optional; supports JSON output
)

# Narration script generation (with validation)
narration_items = await UnifiedLLMService.generate_narration_script(
    prompt="Generate a narration script from the video content...",
    validate_output=True  # automatically validate the output format
)

# Subtitle analysis
analysis = await UnifiedLLMService.analyze_subtitle(
    subtitle_content="Subtitle content...",
    validate_output=True
)
```

### 2. Using the Service Manager Directly

```python
from app.services.llm.manager import LLMServiceManager

# Get a vision model provider
vision_provider = LLMServiceManager.get_vision_provider("gemini")
results = await vision_provider.analyze_images(images, prompt)

# Get a text model provider
text_provider = LLMServiceManager.get_text_provider("openai")
text = await text_provider.generate_text(prompt)
```

### 3. Migration Adapter (Backward Compatible)

```python
from app.services.llm.migration_adapter import create_vision_analyzer

# Compatible with the old interface
analyzer = create_vision_analyzer("gemini", api_key, model, base_url)
results = await analyzer.analyze_images(images, prompt)
```
## 🔍 Output Format Validation

### Narration Script Validation

```python
from app.services.llm.validators import OutputValidator

# Validate the narration script format
try:
    narration_items = OutputValidator.validate_narration_script(output)
    print(f"Validation succeeded: {len(narration_items)} segments")
except ValidationError as e:
    print(f"Validation failed: {e.message}")
```

### JSON Output Validation

```python
# Validate the JSON format
try:
    data = OutputValidator.validate_json_output(output)
    print("JSON validation succeeded")
except ValidationError as e:
    print(f"JSON validation failed: {e.message}")
```

## 🧪 Testing and Debugging

### Running the Test Script

```bash
# Run the full LLM service test
python app/services/llm/test_llm_service.py
```

The test script verifies:
- Configuration validity
- Provider information retrieval
- Text generation
- JSON-format generation
- Subtitle analysis
- Narration script generation

### Debugging Tips

1. **Enable verbose logging**:
   ```python
   from loguru import logger
   logger.add("llm_service.log", level="DEBUG")
   ```

2. **Clear the provider cache**:
   ```python
   UnifiedLLMService.clear_cache()
   ```

3. **Inspect provider information**:
   ```python
   info = UnifiedLLMService.get_provider_info()
   print(info)
   ```
## ⚠️ Notes

### 1. API Key Security
- Do not hard-code API keys in source code
- Manage keys via environment variables or configuration files
- Rotate API keys regularly

### 2. Error Handling
- Wrap every LLM service call in try/except
- Use the appropriate exception types
- Implement a retry mechanism for transient errors

### 3. Performance
- Choose a sensible batch size
- Use caching to avoid duplicate calls
- Monitor API call frequency and cost

### 4. Model Selection
- Pick a model suited to the task
- Balance cost against performance
- Update to the latest model versions regularly
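As a hedged illustration of the retry advice above (this is not code from the repository; the names and delays are illustrative), a minimal exponential-backoff wrapper could look like:

```python
import time


def with_retries(fn, max_attempts=3, base_delay=0.01, retriable=(TimeoutError,)):
    """Call fn(), retrying transient errors with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retriable:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            time.sleep(base_delay * (2 ** (attempt - 1)))


# Hypothetical flaky call: fails twice, then succeeds.
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("transient")
    return "ok"

print(with_retries(flaky))  # ok
```

A real integration would catch the service's own exception types (from `exceptions.py`) rather than `TimeoutError`.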
## 🔧 Adding a New Provider

### 1. Create the Provider Class

```python
# app/services/llm/providers/new_provider.py
from typing import List

from ..base import TextModelProvider


class NewTextProvider(TextModelProvider):
    @property
    def provider_name(self) -> str:
        return "new_provider"

    @property
    def supported_models(self) -> List[str]:
        return ["model-1", "model-2"]

    async def generate_text(self, prompt: str, **kwargs) -> str:
        # Implement the provider-specific API call here
        ...
```

### 2. Register the Provider

```python
# app/services/llm/providers/__init__.py
from .new_provider import NewTextProvider

LLMServiceManager.register_text_provider('new_provider', NewTextProvider)
```

### 3. Add Configuration Support

```toml
# config.toml
text_new_provider_api_key = "your_api_key"
text_new_provider_model_name = "model-1"
text_new_provider_base_url = "https://api.newprovider.com/v1"
```
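The registration pattern above can be sketched in isolation. The following is a self-contained toy (the class and method names mimic the guide but are not the project's actual implementation):

```python
from abc import ABC, abstractmethod


class TextModelProvider(ABC):
    """Minimal stand-in for the abstract provider base class."""

    @property
    @abstractmethod
    def provider_name(self) -> str: ...

    @abstractmethod
    def generate_text(self, prompt: str) -> str: ...


class ProviderRegistry:
    """Maps provider names to provider classes, like the service manager does."""

    _providers: dict = {}

    @classmethod
    def register(cls, name: str, provider_cls: type) -> None:
        cls._providers[name] = provider_cls

    @classmethod
    def get(cls, name: str) -> TextModelProvider:
        return cls._providers[name]()


class EchoProvider(TextModelProvider):
    @property
    def provider_name(self) -> str:
        return "echo"

    def generate_text(self, prompt: str) -> str:
        return f"echo: {prompt}"


ProviderRegistry.register("echo", EchoProvider)
print(ProviderRegistry.get("echo").generate_text("hi"))  # echo: hi
```

The real manager additionally caches instances and reads per-provider settings from `config.toml`.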
## 📞 Support

If you run into problems:

1. Run the test script first to check the configuration
2. Check the log files for detailed error information
3. Verify your API keys and network connection
4. Consult the troubleshooting section of this document

---

*Last updated: 2025-01-07*
@ -1,162 +0,0 @@
# Audio Volume Balancing Guide

## Problem

In the background video-editing task, the original video audio often ends up much quieter than the TTS-generated narration. Even with the original track set to 1.0 and the narration to 0.7, the original audio still sounds weak.

## Root Causes

1. **Loudness differences**: TTS audio usually has high, consistent loudness, while the original track may be inherently quiet or have a wide dynamic range.

2. **No audio normalization**: The previous code only multiplied volumes by fixed factors, without any loudness analysis or normalization.

3. **Mixing method**: `CompositeAudioClip` preserves the loudness differences between tracks when mixing.

## Solution

### 1. Audio Normalization Utility (`audio_normalizer.py`)

The `AudioNormalizer` class provides:

- **LUFS loudness analysis**: measures LUFS loudness using FFmpeg's loudnorm filter
- **RMS volume calculation**: a fallback when LUFS analysis is unavailable
- **Audio normalization**: normalizes audio to a target loudness
- **Smart volume adjustment**: analyzes the loudness gap between the TTS and original tracks and computes suitable adjustment factors

### 2. Audio Configuration Management (`audio_config.py`)

The `AudioConfig` class provides:

- **Default volume configuration**: optimized default volume settings
- **Per-video-type configuration**: volume settings for different kinds of video
- **Preset profiles**: balanced, voice_focused, original_focused, and others
- **Content-type recommendations**: recommended volumes by content type

### 3. Smart Volume Adjustment

`generate_video.py` integrates smart volume adjustment:

- Automatically analyzes the loudness gap between the TTS and original tracks
- Computes suitable volume adjustment factors
- Preserves the relative ratio set by the user
- Clamps the adjustment range to avoid over-correction

## Configuration Changes

### Default Volume Settings

```python
# Previous setting
ORIGINAL_VOLUME = 0.7

# Optimized settings
ORIGINAL_VOLUME = 1.2  # boost the original track
MAX_VOLUME = 2.0       # allow the original volume to exceed 1.0
```

### Recommended Volume Configurations

```python
# Mixed content (default)
'mixed': {
    'tts_volume': 0.8,
    'original_volume': 1.3,
    'bgm_volume': 0.3,
}

# Content dominated by the original audio
'original_heavy': {
    'tts_volume': 0.6,
    'original_volume': 1.6,
    'bgm_volume': 0.1,
}
```
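The RMS fallback and the clamped gain computation described above can be sketched on raw sample values. This is an illustrative toy, not the project's `AudioNormalizer`; the clamp value is an assumption:

```python
import math


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def gain_to_match(samples, reference, max_gain=3.0):
    """Gain that brings `samples` to the reference RMS, clamped to avoid over-boosting."""
    current = rms(samples)
    if current == 0:
        return 1.0  # silence: leave untouched
    return min(max_gain, rms(reference) / current)


tts = [0.5, -0.5, 0.5, -0.5]       # loud TTS track (RMS 0.5)
original = [0.1, -0.1, 0.1, -0.1]  # quiet original track (RMS 0.1)
print(round(gain_to_match(original, tts), 2))  # 3.0 (clamped from 5.0)
```

The clamp plays the same role as the "avoid over-correction" rule above; a ratio of 5.0 is reduced to the maximum allowed 3.0.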
## Usage

### 1. Automatic Optimization (Recommended)

The system applies the optimized volume configuration automatically:

```python
# Applied automatically in task.py
optimized_volumes = get_recommended_volumes_for_content('mixed')
```

### 2. Manual Configuration

Volumes can also be set via configuration files or parameters:

```python
# Apply a preset profile
volumes = AudioConfig.apply_volume_profile('original_focused')

# Get a configuration for a video type
volumes = AudioConfig.get_optimized_volumes('entertainment')
```

### 3. Smart Analysis

Smart volume analysis is enabled by default:

```python
# Controlled in schema.py
ENABLE_SMART_VOLUME = True
```

## Test Verification

Run the test script to verify the feature:

```bash
source .venv/bin/activate
python test_audio_optimization.py
```

Test results:
- TTS test audio LUFS: -24.15
- Original test audio LUFS: -32.95
- Suggested adjustment factors: TTS 1.61, original 3.00

## Before and After

### Before
- TTS volume: 0.7
- Original volume: 1.0
- Problem: the original audio is noticeably quieter than the TTS

### After
- TTS volume: 0.8 (smart-adjusted)
- Original volume: 1.3 (smart-adjusted)
- Result: balanced volumes and a natural listening experience

## Notes

1. **FFmpeg dependency**: loudness analysis requires FFmpeg with the loudnorm filter
2. **Performance impact**: smart analysis adds a small amount of processing time
3. **Audio quality**: all adjustments preserve audio quality
4. **Compatibility**: backward compatible with existing volume settings

## Troubleshooting

### 1. LUFS analysis fails
- Check that FFmpeg is installed
- Confirm the audio file format is supported
- The system automatically falls back to RMS analysis

### 2. Over-adjusted volume
- Check the volume clamp settings
- Adjust the target LUFS value
- Use a preset profile

### 3. Performance problems
- Disable smart analysis: `ENABLE_SMART_VOLUME = False`
- Use a simple volume configuration
- Reduce the frequency of audio analysis

## Future Improvements

1. **Machine-learning optimization**: learn the best volume configuration from user feedback
2. **Live preview**: preview volume adjustments in the UI
3. **Batch processing**: batch audio normalization
4. **More audio formats**: broaden the supported formats
@ -1,143 +0,0 @@
# Original-Audio Segment Integration Guide (Short Drama Narration)

## 📋 Overview

This update adds detailed rules for using original-audio segments to the short drama narration prompt, ensuring that generated scripts insert original audio at appropriate points to strengthen viewer immersion and emotional impact.

## 🎬 Original-Audio Segment Rules

### 📢 Format Requirements

Original-audio segments must strictly follow this JSON format:

```json
{
    "_id": sequence_number,
    "timestamp": "start_time-end_time",
    "picture": "description of the on-screen content",
    "narration": "播放原片 + sequence number",
    "OST": 1
}
```

### 🎯 Insertion Strategy

#### 1. 🔥 Emotional Peaks
Keep the original audio whenever a character's emotion erupts:
- **Angry outbursts**: moments of rage or loss of control
- **Moved to tears**: moments of crying and emotional release
- **Shock**: disbelieving expressions and lines
- **Despair and collapse**: expressions of hopelessness
- **Celebration**: euphoric emotional highs

#### 2. 💬 Key Dialogue
Keep lines and exchanges that drive the plot:
- **Identity reveals**: lines that expose a character's true identity
- **Truth revealed**: dialogue that resolves a mystery
- **Confessions**: declarations of love and other key emotional lines
- **Threats and warnings**: a villain's key threats
- **Announcements**: a character declaring a major decision

#### 3. 💥 Payoff Moments
Keep the original audio at satisfying "payoff" beats:
- **Underdog comebacks**: the weak fighting back and turning the tables
- **Villains humiliated**: the wicked getting their comeuppance or being exposed
- **Outsmarting opponents**: the protagonist's wit crushing a rival
- **Justice served**: moments where justice is finally done
- **Shows of strength**: the protagonist revealing their true power

#### 4. 🎪 Suspense Points
Keep the original audio at moments that build or resolve suspense:
- **Building suspense**: lines that raise questions
- **Revealing answers**: dialogue that unravels a mystery
- **Foreshadowing**: lines hinting at an upcoming twist
- **Impending crisis**: tense dialogue as danger arrives

## ⚙️ Technical Specifications

### 🔧 Format Rules
- **OST field**: set to 1 for original audio (0 for narration segments)
- **narration format**: strictly "播放原片 + sequence number" (e.g. "播放原片26")
- **picture field**: describe the on-screen content in detail for later editing reference
- **Timestamp precision**: must exactly match the timing of the key lines in the subtitles

### 📊 Ratio Control
- **Original audio to narration ratio**: 3:7 (30% original, 70% narration)
- **Even distribution**: spread original-audio segments evenly across the video
- **Moderate length**: keep each original-audio segment to 3-8 seconds
- **Smooth transitions**: original-audio and narration segments should connect naturally

### 🎯 Selection Principles
- **Emotion first**: prefer emotionally intense lines and dialogue
- **Plot-critical**: only content that drives the story forward
- **Audience resonance**: pick memorable lines viewers will connect with
- **Audio-visual impact**: consider vocal delivery and performance intensity
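The 3:7 ratio rule above is mechanically checkable on a generated script. A hedged sketch (the `OST` field follows the JSON format above; counting segments rather than durations is a simplification):

```python
def ost_ratio(items):
    """Fraction of segments that keep the original sound (OST == 1)."""
    ost = sum(1 for item in items if item.get("OST") == 1)
    return ost / len(items)


# Hypothetical 10-segment script: 3 original-audio segments, 7 narration segments.
script = [
    {"_id": i, "OST": 1 if i in (2, 5, 9) else 0}
    for i in range(1, 11)
]
print(ost_ratio(script))  # 0.3
```

A production check would weight by segment duration and also verify the even-distribution and 3-8 second rules.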
## 📝 Output Example

```json
{
    "items": [
        {
            "_id": 1,
            "timestamp": "00:00:01,000-00:00:05,500",
            "picture": "女主角林小雨慌张地道歉,男主角沈墨轩冷漠地看着她",
            "narration": "一个普通女孩的命运即将因为一杯咖啡彻底改变!她撞到的这个男人,竟然是...",
            "OST": 0
        },
        {
            "_id": 2,
            "timestamp": "00:00:05,500-00:00:08,000",
            "picture": "沈墨轩质问林小雨,语气冷厉威严",
            "narration": "播放原片2",
            "OST": 1
        },
        {
            "_id": 3,
            "timestamp": "00:00:08,000-00:00:12,000",
            "picture": "林小雨惊慌失措,沈墨轩眼中闪过一丝兴趣",
            "narration": "霸道总裁的经典开场!一杯咖啡引发的爱情故事就这样开始了...",
            "OST": 0
        }
    ]
}
```

## 🔄 Usage

Usage is unchanged; no caller code needs to be modified:

```python
from app.services.prompts import PromptManager

prompt = PromptManager.get_prompt(
    category="short_drama_narration",
    name="script_generation",
    parameters={
        "drama_name": "短剧名称",
        "plot_analysis": "剧情分析内容",
        "subtitle_content": "原始字幕内容"
    }
)
```

## 📈 Expected Benefits

Adding these rules is expected to:
- **Strengthen emotional impact**: keeping original audio at key emotional beats draws viewers in
- **Improve viewing quality**: original audio for key dialogue avoids information loss
- **Amplify payoff moments**: original audio at payoff beats heightens satisfaction
- **Improve pacing**: a sensible original-to-narration ratio keeps the rhythm
- **Raise production quality**: disciplined use of original audio reflects professional standards

## ✅ Validation Results

Testing confirms that the updated prompt:
- ✅ Contains the complete original-audio usage rules
- ✅ Provides detailed insertion-strategy guidance
- ✅ Specifies clear technical and format requirements
- ✅ Includes a concrete output example
- ✅ Remains fully code-compatible

## 🎉 Summary

This update adds professional original-audio usage rules to the short drama narration prompt, giving the AI strong technical guidance for generating higher-quality, more watchable narration scripts.
@ -1,267 +0,0 @@
# Prompt Management System Documentation

## Overview

The project implements a unified prompt management system that centralizes the prompts for three core features:
- **Documentary narration** - video frame analysis and narration generation
- **Short drama editing** - subtitle analysis and highlight extraction
- **Short drama narration** - plot analysis and narration script generation

## Architecture

```
app/services/prompts/
├── __init__.py                  # Module initialization
├── base.py                      # Base prompt classes
├── manager.py                   # Prompt manager
├── registry.py                  # Prompt registration mechanism
├── template.py                  # Template rendering engine
├── validators.py                # Output validators
├── exceptions.py                # Exception definitions
├── documentary/                 # Documentary narration prompts
│   ├── __init__.py
│   ├── frame_analysis.py        # Video frame analysis
│   └── narration_generation.py  # Narration generation
├── short_drama_editing/         # Short drama editing prompts
│   ├── __init__.py
│   ├── subtitle_analysis.py     # Subtitle analysis
│   └── plot_extraction.py       # Highlight extraction
└── short_drama_narration/       # Short drama narration prompts
    ├── __init__.py
    ├── plot_analysis.py         # Plot analysis
    └── script_generation.py     # Narration script generation
```

## Core Features

### 1. Unified Management
- All prompts live in the `app/services/prompts/` module
- Organized by feature category
- Supports versioning and rollback

### 2. Model-Type Adaptation
- **TextPrompt**: for text models
- **VisionPrompt**: for vision models
- **ParameterizedPrompt**: supports parameterization

### 3. Parameterization
- Dynamic parameter substitution
- Parameter validation
- Template rendering

### 4. Output Validation
- Strict JSON format validation
- Scenario-specific validation (narration scripts, plot analysis, etc.)
- Custom validation rules
## Usage

### Basic Usage

```python
from app.services.prompts import PromptManager

# Get the documentary frame-analysis prompt
prompt = PromptManager.get_prompt(
    category="documentary",
    name="frame_analysis",
    parameters={
        "video_theme": "荒野建造",
        "custom_instructions": "请特别关注建造过程的细节"
    }
)

# Get the short drama plot-analysis prompt
prompt = PromptManager.get_prompt(
    category="short_drama_narration",
    name="plot_analysis",
    parameters={"subtitle_content": "字幕内容..."}
)
```

### Advanced Features

```python
# Search prompts
results = PromptManager.search_prompts(
    keyword="分析",
    model_type=ModelType.TEXT
)

# Get detailed prompt information
info = PromptManager.get_prompt_info(
    category="documentary",
    name="narration_generation"
)

# Validate output
validated_data = PromptManager.validate_output(
    output=llm_response,
    category="documentary",
    name="narration_generation"
)
```

## Registered Prompts

### Documentary narration (documentary)
- `frame_analysis` - video frame analysis prompt
- `narration_generation` - narration generation prompt

### Short drama editing (short_drama_editing)
- `subtitle_analysis` - subtitle analysis prompt
- `plot_extraction` - highlight extraction prompt

### Short drama narration (short_drama_narration)
- `plot_analysis` - plot analysis prompt
- `script_generation` - narration script generation prompt
## Migration Guide

### Migrating Old Code

**Before:**
```python
from app.services.SDE.prompt import subtitle_plot_analysis_v1
prompt = subtitle_plot_analysis_v1
```

**After:**
```python
from app.services.prompts import PromptManager
prompt = PromptManager.get_prompt(
    category="short_drama_narration",
    name="plot_analysis",
    parameters={"subtitle_content": content}
)
```

### Files Already Updated
- `app/services/SDE/short_drama_explanation.py`
- `app/services/SDP/utils/step1_subtitle_analyzer_openai.py`
- `app/services/generate_narration_script.py`

## Extension Guide

### Adding a New Prompt

1. Create a new prompt class in the appropriate category directory:

```python
from ..base import TextPrompt, PromptMetadata, ModelType, OutputFormat


class NewPrompt(TextPrompt):
    def __init__(self):
        metadata = PromptMetadata(
            name="new_prompt",
            category="your_category",
            version="v1.0",
            description="Prompt description",
            model_type=ModelType.TEXT,
            output_format=OutputFormat.JSON,
            parameters=["param1", "param2"]
        )
        super().__init__(metadata)

    def get_template(self) -> str:
        return "Your prompt template content..."
```

2. Register it in `__init__.py`:

```python
def register_prompts():
    new_prompt = NewPrompt()
    PromptManager.register_prompt(new_prompt, is_default=True)
```

### Adding a New Category

1. Create a new category directory
2. Implement the prompt classes
3. Import and register them in the main module's `__init__.py`
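Template rendering with the `${parameter_name}` syntax described in this document can be sketched with the standard library's `string.Template`. The `render` helper is illustrative, not the project's actual rendering engine:

```python
from string import Template


def render(template_text: str, parameters: dict) -> str:
    """Render a prompt template, failing loudly on missing required parameters."""
    try:
        return Template(template_text).substitute(parameters)
    except KeyError as exc:
        # substitute() raises KeyError for any unsupplied ${placeholder}
        raise ValueError(f"missing required parameter: {exc}") from exc


text = render(
    "Analyze the drama ${drama_name}: ${plot_analysis}",
    {"drama_name": "Demo", "plot_analysis": "a plot"},
)
print(text)  # Analyze the drama Demo: a plot
```

`substitute` (as opposed to `safe_substitute`) gives the strict "missing required parameter" behavior that the validation errors in this document describe.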
## Testing

Run the test script to verify the system:

```bash
python test_prompt_system.py
```

## Notes

1. **Template parameters**: use the `${parameter_name}` format
2. **JSON format**: JSON examples in templates use standard `{` and `}`; do not use double braces
3. **Parameter validation**: required parameters are validated automatically
4. **Version management**: multiple versions can coexist; the latest is used by default
5. **Output validation**: validate LLM output to ensure the format is correct
6. **JSON parsing**: the system ships with robust, lenient JSON parsing that handles many format problems automatically

## JSON Parsing Optimizations

The lenient JSON parser handles the format problems commonly produced by LLMs:

### Supported Fixes

1. **Double-brace repair**: converts `{{` and `}}` to standard `{` and `}`
2. **Code-block extraction**: extracts JSON from fenced `json` code blocks
3. **Surrounding text**: extracts the brace-delimited JSON and ignores text before and after it
4. **Trailing commas**: removes extra commas at the end of objects and arrays
5. **Comment removal**: strips `//` and `#` comments
6. **Quote repair**: fixes single quotes and missing property-name quotes

### Parsing Strategies

Multiple strategies are tried in priority order:

```python
strategies = [
    ("direct parse", lambda s: json.loads(s)),
    ("fix double braces", _fix_double_braces),
    ("extract code block", _extract_code_block),
    ("extract braced content", _extract_braces_content),
    ("fix common JSON issues", _fix_common_json_issues),
    ("fix quote issues", _fix_quote_issues),
    ("fix trailing commas", _fix_trailing_commas),
    ("force fix", _force_fix_json),
]
```

### Example

```python
from webui.tools.generate_short_summary import parse_and_fix_json

# Handle double-brace JSON
json_str = '{{ "items": [{{ "_id": 1, "name": "test" }}] }}'
result = parse_and_fix_json(json_str)  # fixed and parsed automatically

# Handle JSON surrounded by extra text
json_str = 'Some text\n{"items": []}\nMore text'
result = parse_and_fix_json(json_str)  # the JSON part is extracted automatically
```

## Performance

- Prompt templates are cached
- Batch operations are supported
- Async rendering support (planned for a future version)
- Multi-strategy JSON parsing ensures a high success rate

## Troubleshooting

### Common Issues

1. **Template rendering errors**: check the parameter names and format
2. **Prompt not found**: confirm the category, name, and version
3. **Output validation failures**: check that the LLM output matches the required format

### Debug Logging

The system logs in detail via loguru; use the logs to diagnose problems:

```python
from loguru import logger
logger.debug("debug message")
```
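The first few parsing strategies can be sketched standalone. This toy covers only direct parsing, double-brace repair, and braced-content extraction; the real `parse_and_fix_json` handles many more cases:

```python
import json
import re


def parse_lenient_json(text: str):
    """Try direct parsing, then fix double braces, then extract the braced span."""
    for candidate in (
        text,                                          # strategy 1: direct parse
        text.replace("{{", "{").replace("}}", "}"),    # strategy 2: fix double braces
    ):
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            pass
    # strategy 3: extract the outermost brace-delimited span
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match:
        return json.loads(match.group(0))
    raise ValueError("no JSON object found")


print(parse_lenient_json('{{ "items": [] }}'))               # {'items': []}
print(parse_lenient_json('Some text\n{"items": []}\nMore'))  # {'items': []}
```

Ordering matters: the cheap, lossless strategies run first, and each failed attempt falls through to a more aggressive repair.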
@ -1,202 +0,0 @@
# Short Drama Narration Optimization Notes

## Overview

This optimization fixes the loss of original subtitle information in the short drama narration feature, ensuring that generated narration scripts align correctly with the video timestamps.

## Problem Analysis

### Symptoms
1. **Broken parameterized call**: `SubtitleAnalyzer` fetched `PlotAnalysisPrompt` with an empty parameter dict, so the template's placeholders were never substituted
2. **Broken data chain**: the script-generation stage had no direct access to the original subtitles' timestamps
3. **Lost timestamps**: the generated narration did not match the video frames' timestamps

### Root Cause
- The prompt template expected the subtitle content as a parameter, but the code used simple string concatenation instead
- Script generation only saw the plot-analysis result and could not read the accurate timestamps from the original subtitles

## Fixes

### 1. Fix the Parameterized Call

**File**: `app/services/SDE/short_drama_explanation.py`

**Change**:
```python
# Before
self.prompt_template = PromptManager.get_prompt(
    category="short_drama_narration",
    name="plot_analysis",
    parameters={}  # empty parameter dict
)
prompt = f"{self.prompt_template}\n\n{subtitle_content}"  # string concatenation

# After
if self.custom_prompt:
    prompt = f"{self.custom_prompt}\n\n{subtitle_content}"
else:
    prompt = PromptManager.get_prompt(
        category="short_drama_narration",
        name="plot_analysis",
        parameters={"subtitle_content": subtitle_content}  # pass the parameter properly
    )
```

### 2. Give Script Generation Access to the Subtitles

**File**: `app/services/prompts/short_drama_narration/script_generation.py`

**Change**:
```python
# Add subtitle_content as a supported parameter
parameters=["drama_name", "plot_analysis", "subtitle_content"]

# Extend the prompt template with the original subtitle information
template = """
下面<plot>中的内容是短剧的剧情概述:
<plot>
${plot_analysis}
</plot>

下面<subtitles>中的内容是短剧的原始字幕(包含准确的时间戳信息):
<subtitles>
${subtitle_content}
</subtitles>

重要要求:
6. **时间戳必须严格基于<subtitles>中的原始时间戳**,确保与视频画面精确匹配
11. **确保每个解说片段的时间戳都能在原始字幕中找到对应的时间范围**
"""
```

### 3. Update the Method Signature and Call Sites

**Change**:
```python
# Updated method signature
def generate_narration_script(
    self,
    short_name: str,
    plot_analysis: str,
    subtitle_content: str = "",  # new parameter
    temperature: float = 0.7
) -> Dict[str, Any]:

# Pass the original subtitle content when building the prompt
prompt = PromptManager.get_prompt(
    category="short_drama_narration",
    name="script_generation",
    parameters={
        "drama_name": short_name,
        "plot_analysis": plot_analysis,
        "subtitle_content": subtitle_content  # pass the original subtitles
    }
)
```
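The rule that every narration timestamp must fall inside an original subtitle range can be verified with a small helper. A hedged sketch that assumes SRT-style `HH:MM:SS,mmm` timestamps and the `start-end` segment notation used in this project:

```python
def to_ms(ts: str) -> int:
    """Convert '00:00:05,500' to milliseconds."""
    clock, millis = ts.split(",")
    h, m, s = map(int, clock.split(":"))
    return ((h * 60 + m) * 60 + s) * 1000 + int(millis)


def covered(segment: str, subtitle_ranges) -> bool:
    """True if the 'start-end' segment lies inside some subtitle range."""
    start, end = (to_ms(t) for t in segment.split("-"))
    return any(a <= start and end <= b for a, b in subtitle_ranges)


subs = [(to_ms("00:00:01,000"), to_ms("00:00:08,000"))]
print(covered("00:00:05,500-00:00:08,000", subs))  # True
print(covered("00:00:07,000-00:00:09,000", subs))  # False
```

Running a check like this over every generated segment catches the "timestamp not found in original subtitles" failures this document describes.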
## Usage

### Basic Usage

```python
from app.services.SDE.short_drama_explanation import analyze_subtitle, generate_narration_script

# 1. Analyze the subtitles
analysis_result = analyze_subtitle(
    subtitle_file_path="path/to/subtitle.srt",
    api_key="your_api_key",
    model="your_model",
    base_url="your_base_url"
)

# 2. Read the original subtitle content
with open("path/to/subtitle.srt", 'r', encoding='utf-8') as f:
    subtitle_content = f.read()

# 3. Generate the narration script (now with the original subtitles)
narration_result = generate_narration_script(
    short_name="短剧名称",
    plot_analysis=analysis_result["analysis"],
    subtitle_content=subtitle_content,  # pass the original subtitle content
    api_key="your_api_key",
    model="your_model",
    base_url="your_base_url"
)
```

### Full Example

```python
# Complete short drama narration pipeline
subtitle_path = "path/to/your/subtitle.srt"

# Step 1: analyze the subtitles
analysis_result = analyze_subtitle(
    subtitle_file_path=subtitle_path,
    api_key="your_api_key",
    model="gemini-2.0-flash",
    base_url="https://api.narratoai.cn/v1/chat/completions",
    save_result=True
)

if analysis_result["status"] == "success":
    # Step 2: read the original subtitle content
    with open(subtitle_path, 'r', encoding='utf-8') as f:
        subtitle_content = f.read()

    # Step 3: generate the narration script
    narration_result = generate_narration_script(
        short_name="我的短剧",
        plot_analysis=analysis_result["analysis"],
        subtitle_content=subtitle_content,  # key step: pass the original subtitles
        api_key="your_api_key",
        model="gemini-2.0-flash",
        base_url="https://api.narratoai.cn/v1/chat/completions",
        save_result=True
    )

    if narration_result["status"] == "success":
        print("Narration script generated successfully!")
        print(narration_result["narration_script"])
```

## Results

### Before
- ❌ Subtitle content was never embedded into the prompt
- ❌ Script generation lacked the original timestamp information
- ❌ Generated timestamps could be wrong or missing

### After
- ✅ Subtitle content is correctly embedded into the plot-analysis prompt
- ✅ Script generation has access to the complete original subtitles
- ✅ Generated narration timestamps match the video frames precisely
- ✅ Temporal continuity and logical order are preserved
- ✅ Time segments can be split sensibly

## Test Verification

Run the test script to verify the changes:

```bash
python3 test_short_drama_narration.py
```

Test coverage:
1. ✅ Parameterization of the plot-analysis prompt
2. ✅ Parameterization of the script-generation prompt
3. ✅ SubtitleAnalyzer integration

## Notes

1. **Backward compatibility**: the changes keep the existing API backward compatible
2. **Parameter passing**: make sure to pass `subtitle_content` when calling `generate_narration_script`
3. **Timestamp accuracy**: narration timestamps are now strictly based on the original subtitles
4. **Modular design**: the prompt management system's modular architecture is preserved

## Related Files

- `app/services/SDE/short_drama_explanation.py` - main implementation
- `app/services/prompts/short_drama_narration/plot_analysis.py` - plot-analysis prompt
- `app/services/prompts/short_drama_narration/script_generation.py` - script-generation prompt
- `test_short_drama_narration.py` - test script
@ -1,150 +0,0 @@
# Short Drama Narration Prompt Optimization Summary

## 📋 Overview

Based on the core elements of short drama narration writing, this optimization fully restructures the prompt template in `app/services/prompts/short_drama_narration/script_generation.py` to make it more precise and practical.

## 🎯 Goal

Integrate the six core elements of narration writing and the strict technical requirements into the prompt template, so that generated scripts both fit the short drama format and satisfy the technical constraints.

## 🔥 Core Elements

### 1. Golden Opening (3-Second Rule)
- **Suspense**: lead with the central conflict or question
- **Conflict**: show the sharpest confrontation
- **Emotional resonance**: touch universal emotions
- **Twist teaser**: hint at the shocking turn to come

### 2. Main-Plot Distillation (Cut the Clutter)
- Drop side plots and minor characters; focus on the main storyline
- Emphasize the central conflict
- Skip the setup quickly and get to the heart of the plot
- Make sure every segment clearly advances the story

### 3. Payoff Amplification (Ignite Emotion)
- **Underdog comebacks**: highlight the weak growing strong and turning the tables
- **Villains humiliated**: stress the satisfaction of the wicked getting their due
- **Smart moves**: praise characters' wit and strategy
- **Emotional bursts**: amplify strong emotions such as being moved, angry, or shocked

### 4. Personal Commentary (Add Fun)
- Offer sharp commentary from the viewer's perspective
- Analyze character behavior from an omniscient point of view
- Poke fun at plot tropes or foolish character choices where appropriate
- Use humorous, incisive language to make watching more fun

### 5. Planted Suspense (Drive Engagement)
- Tease just before the plot's climax
- Ask leading questions to provoke thought
- Preview upcoming highlights
- Encourage comments, likes, and follows

### 6. Beat Matching (Audio-Visual Coordination)
- Plan BGM beat drops at emotional peaks
- Match the narration's pace to the picture's pace
- Keep the original audio for key lines
- Aim for synergy between script, picture, and music

## ⚙️ Strict Technical Requirements

### 🕐 Timestamp Management
- **Absolutely no overlaps**: no repeated footage after editing
- **Continuous and non-crossing**: strictly in chronological order
- **Exact matches**: every timestamp must map to a range in the original subtitles
- **Temporal continuity**: segments may be split but must stay continuous

### ⏱️ Length Control (One-Third Rule)
- **Narration video length = 1/3 of the original video length**
- Control pacing and density precisely
- Allocate narration and original audio time sensibly

### 🔗 Plot Coherence
- **Keep the story logic intact**
- **Strictly chronological**; no jumping around
- **Respect causality**: A happens first, then B, and A causes B
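The non-overlap and one-third-length rules above are mechanical checks on the generated segment list. A hedged sketch over (start, end) millisecond pairs, independent of the project's actual validators:

```python
def check_script(segments, original_ms):
    """segments: list of (start_ms, end_ms) pairs. Returns (no_overlap, length_ratio)."""
    ordered = sorted(segments)
    # Adjacent segments may touch but must never overlap.
    no_overlap = all(a_end <= b_start
                     for (_, a_end), (b_start, _) in zip(ordered, ordered[1:]))
    total = sum(end - start for start, end in segments)
    return no_overlap, total / original_ms


# Hypothetical 90-second original video, 30 seconds of selected footage.
segments = [(0, 10_000), (10_000, 25_000), (40_000, 45_000)]
ok, ratio = check_script(segments, original_ms=90_000)
print(ok, round(ratio, 2))  # True 0.33
```

A ratio near 1/3 satisfies the length-control rule; a `False` overlap flag means the script would repeat footage after editing.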
## 📊 Before and After

### Before
- A simple task description
- Basic technical requirements
- No concrete writing guidance
- No explicit quality standards

### After
- Detailed guidance on the six core elements
- Strict technical constraints
- Concrete operating guidelines and examples
- Explicit quality standards and judging principles

## 🎯 Quality Standards

### Narration Requirements
- **Length**: 80-150 characters per segment
- **Style**: vivid, engaging, and evocative
- **Emotion**: effectively moves the audience and creates immersion
- **Pacing**: fast but well organized

### Technical Rules
- **Narration-to-original ratio**: 7:3 (70% narration, 30% original footage)
- **Key emotional beats**: must keep the original audio
- **Timestamp precision**: down to the millisecond
- **Logical coherence**: strictly follow the plot's order of events

## 🔧 Implementation

### Version Bump
- Version raised from v1.0 to v2.0
- Fully compatible with the existing code structure
- The parameterization mechanism is unchanged

### Template Structure
- Markdown formatting for readability
- Emoji icons for visual emphasis
- A layered structure that is easy to follow and execute

### Compatibility Guarantees
- Class names and method signatures are unchanged
- The parameter list is unchanged: `drama_name`, `plot_analysis`, `subtitle_content`
- The JSON output format is unchanged

## ✅ Test Verification

Testing confirms that the optimized prompt:
- ✅ Renders all parameters successfully
- ✅ Contains all six core elements
- ✅ Contains all technical requirements
- ✅ Remains code-compatible
- ✅ Produces correctly formatted output

## 🚀 Usage

Usage is identical to before:

```python
from app.services.prompts import PromptManager

prompt = PromptManager.get_prompt(
    category="short_drama_narration",
    name="script_generation",
    parameters={
        "drama_name": "短剧名称",
        "plot_analysis": "剧情分析内容",
        "subtitle_content": "原始字幕内容"
    }
)
```

## 📈 Expected Results

The optimized prompt is expected to produce:
- More compelling opening hooks
- More precise identification and amplification of payoff moments
- A more distinctive narration voice
- Stricter adherence to the technical rules
- Higher-quality narration overall

## 🎉 Summary

This optimization systematically folds professional short drama narration techniques into the prompt template, giving the AI a strong framework for generating high-quality narration. The new template stays technically compatible while greatly improving the professionalism and practicality of its guidance.
@ -1,170 +0,0 @@
|
||||
# WebUI短剧解说功能Bug修复总结
|
||||
|
||||
## 问题描述
|
||||
|
||||
在运行WebUI的短剧解说功能时,出现以下错误:
|
||||
|
||||
```
|
||||
2025-07-11 22:15:29 | ERROR | "./app/services/prompts/manager.py:59": get_prompt - 提示词渲染失败: short_drama_narration.script_generation - 模板渲染失败 'script_generation': 缺少必需参数 (缺少参数: subtitle_content)
|
||||
```
|
||||
|
||||
## 根本原因
|
||||
|
||||
在之前的优化中,我们修改了 `ScriptGenerationPrompt` 类,添加了 `subtitle_content` 作为必需参数,但是在 `app/services/llm/migration_adapter.py` 中的 `SubtitleAnalyzerAdapter.generate_narration_script` 方法没有相应更新,导致调用提示词时缺少必需的参数。
|
||||
|
||||
## 修复内容
|
||||
|
||||
### 1. 修复 migration_adapter.py
|
||||
|
||||
**文件**: `app/services/llm/migration_adapter.py`
|
||||
|
||||
**修改内容**:
|
||||
```python
|
||||
# 修改前
|
||||
def generate_narration_script(self, short_name: str, plot_analysis: str, temperature: float = 0.7) -> Dict[str, Any]:
|
||||
|
||||
# 修改后
|
||||
def generate_narration_script(self, short_name: str, plot_analysis: str, subtitle_content: str = "", temperature: float = 0.7) -> Dict[str, Any]:
|
||||
```
|
||||
|
||||
**参数传递修复**:
|
||||
```python
|
||||
# 修改前
|
||||
prompt = PromptManager.get_prompt(
|
||||
category="short_drama_narration",
|
||||
name="script_generation",
|
||||
parameters={
|
||||
"drama_name": short_name,
|
||||
"plot_analysis": plot_analysis
|
||||
}
|
||||
)
|
||||
|
||||
# 修改后
|
||||
prompt = PromptManager.get_prompt(
|
||||
category="short_drama_narration",
|
||||
name="script_generation",
|
||||
parameters={
|
||||
"drama_name": short_name,
|
||||
"plot_analysis": plot_analysis,
|
||||
"subtitle_content": subtitle_content # 添加缺失的参数
|
||||
}
|
||||
)
|
||||
```
|
||||
|
||||
### 2. 修复 WebUI 调用代码
|
||||
|
||||
**文件**: `webui/tools/generate_short_summary.py`
|
||||
|
||||
**修改内容**:
|
||||
|
||||
1. **确保字幕内容在所有情况下都可用**:
|
||||
```python
|
||||
# 修改前:字幕内容只在新LLM服务架构中读取
|
||||
try:
|
||||
analyzer = SubtitleAnalyzerAdapter(...)
|
||||
with open(subtitle_path, 'r', encoding='utf-8') as f:
|
||||
subtitle_content = f.read()
|
||||
analysis_result = analyzer.analyze_subtitle(subtitle_content)
|
||||
except Exception as e:
|
||||
# 回退时没有subtitle_content变量
|
||||
|
||||
# 修改后:无论使用哪种实现都先读取字幕内容
|
||||
with open(subtitle_path, 'r', encoding='utf-8') as f:
|
||||
subtitle_content = f.read()
|
||||
|
||||
try:
|
||||
analyzer = SubtitleAnalyzerAdapter(...)
|
||||
analysis_result = analyzer.analyze_subtitle(subtitle_content)
|
||||
except Exception as e:
|
||||
# 回退时subtitle_content变量仍然可用
|
||||
```
|
||||
|
||||
2. **修复新LLM服务架构的调用**:
|
||||
```python
|
||||
# 修改前
|
||||
narration_result = analyzer.generate_narration_script(
|
||||
short_name=video_theme,
|
||||
plot_analysis=analysis_result["analysis"],
|
||||
temperature=temperature
|
||||
)
|
||||
|
||||
# 修改后
|
||||
narration_result = analyzer.generate_narration_script(
|
||||
short_name=video_theme,
|
||||
plot_analysis=analysis_result["analysis"],
|
||||
subtitle_content=subtitle_content, # 添加字幕内容参数
|
||||
temperature=temperature
|
||||
)
|
||||
```
|
||||
|
||||
3. **修复回退到旧实现的调用**:
|
||||
```python
|
||||
# 修改前
|
||||
narration_result = generate_narration_script(
|
||||
short_name=video_theme,
|
||||
plot_analysis=analysis_result["analysis"],
|
||||
api_key=text_api_key,
|
||||
model=text_model,
|
||||
base_url=text_base_url,
|
||||
save_result=True,
|
||||
temperature=temperature,
|
||||
provider=text_provider
|
||||
)
|
||||
|
||||
# 修改后
|
||||
narration_result = generate_narration_script(
|
||||
short_name=video_theme,
|
||||
plot_analysis=analysis_result["analysis"],
|
||||
subtitle_content=subtitle_content, # 添加字幕内容参数
|
||||
api_key=text_api_key,
|
||||
model=text_model,
|
||||
base_url=text_base_url,
|
||||
save_result=True,
|
||||
temperature=temperature,
|
||||
provider=text_provider
|
||||
)
|
||||
```

## Verification

A test script was created and run to verify the following:

1. ✅ Prompt parameterization works as expected
2. ✅ All required parameters are passed correctly
3. ✅ The method signature includes every required parameter
4. ✅ The subtitle content is embedded into the prompt correctly

## Effect of the Fix

**Before**:
- ❌ The WebUI failed at runtime with a "missing required parameter" error
- ❌ No narration script could be generated
- ❌ The user flow was interrupted

**After**:
- ✅ The WebUI runs normally with no parameter errors
- ✅ Narration script generation works
- ✅ The original subtitle content is passed through to the prompt
- ✅ The generated narration is based on accurate timestamp information

## Related Files

- `app/services/llm/migration_adapter.py` - fixes the adapter method signature and parameter passing
- `webui/tools/generate_short_summary.py` - fixes the WebUI call sites
- `app/services/prompts/short_drama_narration/script_generation.py` - prompt template (optimized previously)

## Notes

1. **Backward compatibility**: the API stays backward compatible because `subtitle_content` has a default value
2. **Error handling**: the subtitle content is available on every code path
3. **Consistency**: the new and the old implementation pass parameters the same way

## Summary

This fix resolves a critical bug in the WebUI's short-drama narration feature and ensures:
- complete parameters throughout the prompt system
- normal operation of the WebUI
- an uninterrupted user experience
- robust, consistent code

Users can now use the WebUI short-drama narration feature normally and generate high-quality narration scripts based on accurate timestamps.
@@ -1 +1 @@
0.7.4
0.7.5

webui.py (88 lines changed)
@@ -26,7 +26,7 @@ st.set_page_config(

# 设置页面样式
hide_streamlit_style = """
<style>#root > div:nth-child(1) > div > div > div > div > section > div {padding-top: 6px; padding-bottom: 10px; padding-left: 20px; padding-right: 20px;}</style>
<style>#root > div:nth-child(1) > div > div > div > div > section > div {padding-top: 2rem; padding-bottom: 10px; padding-left: 20px; padding-right: 20px;}</style>
"""
st.markdown(hide_streamlit_style, unsafe_allow_html=True)

@@ -131,18 +131,11 @@ def render_generate_button():
    """渲染生成按钮和处理逻辑"""
    if st.button(tr("Generate Video"), use_container_width=True, type="primary"):
        from app.services import task as tm

        # 重置日志容器和记录
        log_container = st.empty()
        log_records = []

        def log_received(msg):
            with log_container:
                log_records.append(msg)
                st.code("\n".join(log_records))

        from loguru import logger
        logger.add(log_received)
        from app.services import state as sm
        from app.models import const
        import threading
        import time
        import uuid

        config.save_config()

@@ -155,9 +148,6 @@ def render_generate_button():
            st.error(tr("视频文件不能为空"))
            return

        st.toast(tr("生成视频"))
        logger.info(tr("开始生成视频"))

        # 获取所有参数
        script_params = script_settings.get_script_params()
        video_params = video_settings.get_video_params()

@@ -175,29 +165,61 @@ def render_generate_button():
        # 创建参数对象
        params = VideoClipParams(**all_params)

        # 使用新的统一裁剪策略,不再需要预裁剪的subclip_videos
        # 生成一个新的task_id用于本次处理
        import uuid
        task_id = str(uuid.uuid4())

        result = tm.start_subclip_unified(
            task_id=task_id,
            params=params
        )
        # 创建进度条
        progress_bar = st.progress(0)
        status_text = st.empty()

        video_files = result.get("videos", [])
        st.success(tr("视生成完成"))
        def run_task():
            try:
                tm.start_subclip_unified(
                    task_id=task_id,
                    params=params
                )
            except Exception as e:
                logger.error(f"任务执行失败: {e}")
                sm.state.update_task(task_id, state=const.TASK_STATE_FAILED, message=str(e))

        try:
            if video_files:
                player_cols = st.columns(len(video_files) * 2 + 1)
                for i, url in enumerate(video_files):
                    player_cols[i * 2 + 1].video(url)
        except Exception as e:
            logger.error(f"播放视频失败: {e}")
        # 在新线程中启动任务
        thread = threading.Thread(target=run_task)
        thread.start()

        # 轮询任务状态
        while True:
            task = sm.state.get_task(task_id)
            if task:
                progress = task.get("progress", 0)
                state = task.get("state")

                # 更新进度条
                progress_bar.progress(progress / 100)
                status_text.text(f"Processing... {progress}%")

                if state == const.TASK_STATE_COMPLETE:
                    status_text.text(tr("视频生成完成"))
                    progress_bar.progress(1.0)

                    # 显示结果
                    video_files = task.get("videos", [])
                    try:
                        if video_files:
                            player_cols = st.columns(len(video_files) * 2 + 1)
                            for i, url in enumerate(video_files):
                                player_cols[i * 2 + 1].video(url)
                    except Exception as e:
                        logger.error(f"播放视频失败: {e}")

                    st.success(tr("视频生成完成"))
                    break

                elif state == const.TASK_STATE_FAILED:
                    st.error(f"任务失败: {task.get('message', 'Unknown error')}")
                    break

            time.sleep(0.5)

        # file_utils.open_task_folder(config.root_dir, task_id)
        logger.info(tr("视频生成完成"))

def main():
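The webui.py diff above replaces a blocking call with a worker thread plus a polling loop. Stripped of Streamlit and the real state manager (a plain dict stands in for `sm.state`), the pattern looks like this:

```python
import threading
import time

state = {"progress": 0, "state": "running"}  # stand-in for sm.state

def run_task():
    for p in (25, 50, 75, 100):
        time.sleep(0.01)            # stand-in for real video work
        state["progress"] = p       # the worker reports progress as it goes
    state["state"] = "complete"

thread = threading.Thread(target=run_task)
thread.start()

# The UI thread polls shared state instead of blocking on the task itself.
while state["state"] == "running":
    time.sleep(0.01)
thread.join()
```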
@@ -1,4 +1,3 @@
from venv import logger
import streamlit as st
import os
from uuid import uuid4

@@ -26,7 +25,8 @@ def get_tts_engine_options():
        "edge_tts": "Edge TTS",
        "azure_speech": "Azure Speech Services",
        "tencent_tts": "腾讯云 TTS",
        "qwen3_tts": "通义千问 Qwen3 TTS"
        "qwen3_tts": "通义千问 Qwen3 TTS",
        "indextts2": "IndexTTS2 语音克隆"
    }

@@ -56,6 +56,12 @@ def get_tts_engine_descriptions():
            "features": "阿里云通义千问语音合成,音质优秀,支持多种音色",
            "use_case": "需要高质量中文语音合成的用户",
            "registration": "https://dashscope.aliyuncs.com/"
        },
        "indextts2": {
            "title": "IndexTTS2 语音克隆",
            "features": "零样本语音克隆,上传参考音频即可合成相同音色的语音,需要本地或私有部署",
            "use_case": "下载地址:https://pan.quark.cn/s/0767c9bcefd5",
            "registration": None
        }
    }

@@ -139,6 +145,8 @@ def render_tts_settings(tr):
        render_tencent_tts_settings(tr)
    elif selected_engine == "qwen3_tts":
        render_qwen3_tts_settings(tr)
    elif selected_engine == "indextts2":
        render_indextts2_tts_settings(tr)

    # 4. 试听功能
    render_voice_preview_new(tr, selected_engine)

@@ -562,6 +570,139 @@ def render_qwen3_tts_settings(tr):
    config.ui["qwen3_rate"] = voice_rate
    config.ui["voice_name"] = voice_type  # 兼容性


def render_indextts2_tts_settings(tr):
    """渲染 IndexTTS2 TTS 设置"""
    import os

    # API 地址配置
    api_url = st.text_input(
        "API 地址",
        value=config.indextts2.get("api_url", "http://127.0.0.1:8081/tts"),
        help="IndexTTS2 API 服务地址"
    )

    # 参考音频文件路径
    reference_audio = st.text_input(
        "参考音频路径",
        value=config.indextts2.get("reference_audio", ""),
        help="用于语音克隆的参考音频文件路径(WAV 格式,建议 3-10 秒)"
    )

    # 文件上传功能
    uploaded_file = st.file_uploader(
        "或上传参考音频文件",
        type=["wav", "mp3"],
        help="上传一段清晰的音频用于语音克隆"
    )

    if uploaded_file is not None:
        # 保存上传的文件
        import tempfile
        temp_dir = tempfile.gettempdir()
        audio_path = os.path.join(temp_dir, f"indextts2_ref_{uploaded_file.name}")
        with open(audio_path, "wb") as f:
            f.write(uploaded_file.getbuffer())
        reference_audio = audio_path
        st.success(f"✅ 音频已上传: {audio_path}")

    # 推理模式
    infer_mode = st.selectbox(
        "推理模式",
        options=["普通推理", "快速推理"],
        index=0 if config.indextts2.get("infer_mode", "普通推理") == "普通推理" else 1,
        help="普通推理质量更高但速度较慢,快速推理速度更快但质量略低"
    )

    # 高级参数折叠面板
    with st.expander("🔧 高级参数", expanded=False):
        col1, col2 = st.columns(2)

        with col1:
            temperature = st.slider(
                "采样温度 (Temperature)",
                min_value=0.1,
                max_value=2.0,
                value=float(config.indextts2.get("temperature", 1.0)),
                step=0.1,
                help="控制随机性,值越高输出越随机,值越低越确定"
            )

            top_p = st.slider(
                "Top P",
                min_value=0.0,
                max_value=1.0,
                value=float(config.indextts2.get("top_p", 0.8)),
                step=0.05,
                help="nucleus 采样的概率阈值,值越小结果越确定"
            )

            top_k = st.slider(
                "Top K",
                min_value=0,
                max_value=100,
                value=int(config.indextts2.get("top_k", 30)),
                step=5,
                help="top-k 采样的 k 值,0 表示不使用 top-k"
            )

        with col2:
            num_beams = st.slider(
                "束搜索 (Num Beams)",
                min_value=1,
                max_value=10,
                value=int(config.indextts2.get("num_beams", 3)),
                step=1,
                help="束搜索的 beam 数量,值越大质量可能越好但速度越慢"
            )

            repetition_penalty = st.slider(
                "重复惩罚 (Repetition Penalty)",
                min_value=1.0,
                max_value=20.0,
                value=float(config.indextts2.get("repetition_penalty", 10.0)),
                step=0.5,
                help="值越大越能避免重复,但过大可能导致不自然"
            )

            do_sample = st.checkbox(
                "启用采样",
                value=config.indextts2.get("do_sample", True),
                help="启用采样可以获得更自然的语音"
            )

    # 显示使用说明
    with st.expander("💡 IndexTTS2 使用说明", expanded=False):
        st.markdown("""
        **零样本语音克隆**

        1. **准备参考音频**:上传或指定一段清晰的音频文件(建议 3-10 秒)
        2. **设置 API 地址**:确保 IndexTTS2 服务正常运行
        3. **开始合成**:系统会自动使用参考音频的音色合成新语音

        **注意事项**:
        - 参考音频质量直接影响合成效果
        - 建议使用无背景噪音的清晰音频
        - 文本长度建议控制在合理范围内
        - 首次合成可能需要较长时间
        """)

    # 保存配置
    config.indextts2["api_url"] = api_url
    config.indextts2["reference_audio"] = reference_audio
    config.indextts2["infer_mode"] = infer_mode
    config.indextts2["temperature"] = temperature
    config.indextts2["top_p"] = top_p
    config.indextts2["top_k"] = top_k
    config.indextts2["num_beams"] = num_beams
    config.indextts2["repetition_penalty"] = repetition_penalty
    config.indextts2["do_sample"] = do_sample

    # 保存 voice_name 用于兼容性
    if reference_audio:
        config.ui["voice_name"] = f"indextts2:{reference_audio}"

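For reference, the advanced parameters collected above would typically be bundled into a single request payload. The field names below are assumptions modelled on the UI settings, not a documented IndexTTS2 schema; check the API of your own deployment before relying on this sketch:

```python
def build_indextts2_payload(text, reference_audio, cfg):
    # Field names here mirror the settings gathered in the UI above;
    # they are assumptions -- verify them against your IndexTTS2 service.
    return {
        "text": text,
        "reference_audio": reference_audio,
        "infer_mode": cfg.get("infer_mode", "普通推理"),
        "temperature": float(cfg.get("temperature", 1.0)),
        "top_p": float(cfg.get("top_p", 0.8)),
        "top_k": int(cfg.get("top_k", 30)),
        "num_beams": int(cfg.get("num_beams", 3)),
        "repetition_penalty": float(cfg.get("repetition_penalty", 10.0)),
        "do_sample": bool(cfg.get("do_sample", True)),
    }
```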

def render_voice_preview_new(tr, selected_engine):
    """渲染新的语音试听功能"""
    if st.button("🎵 试听语音合成", use_container_width=True):

@@ -599,6 +740,12 @@ def render_voice_preview_new(tr, selected_engine):
            voice_name = f"qwen3:{vt}"
            voice_rate = config.ui.get("qwen3_rate", 1.0)
            voice_pitch = 1.0  # Qwen3 TTS 不支持音调调节
        elif selected_engine == "indextts2":
            reference_audio = config.indextts2.get("reference_audio", "")
            if reference_audio:
                voice_name = f"indextts2:{reference_audio}"
            voice_rate = 1.0  # IndexTTS2 不支持速度调节
            voice_pitch = 1.0  # IndexTTS2 不支持音调调节

        if not voice_name:
            st.error("请先配置语音设置")

@@ -5,6 +5,39 @@ import os
from app.config import config
from app.utils import utils
from loguru import logger
from app.services.llm.unified_service import UnifiedLLMService

# 需要用户手动填写 Base URL 的 OpenAI 兼容网关及其默认接口
OPENAI_COMPATIBLE_GATEWAY_BASE_URLS = {
    "siliconflow": "https://api.siliconflow.cn/v1",
    "openrouter": "https://openrouter.ai/api/v1",
    "moonshot": "https://api.moonshot.cn/v1",
    "gemini(openai)": "",
}


def build_base_url_help(provider: str, model_type: str) -> tuple[str, bool, str]:
    """
    根据 provider 返回 Base URL 的帮助文案

    Returns:
        help_text: 显示在输入框的帮助内容
        requires_base: 是否强制提示必须填写 Base URL
        placeholder: 推荐的默认值(可为空字符串)
    """
    default_help = "自定义 API 端点(可选),当使用自建或第三方代理时需要填写"
    provider_key = (provider or "").lower()
    example_url = OPENAI_COMPATIBLE_GATEWAY_BASE_URLS.get(provider_key)

    if example_url is not None:
        extra = f"\n推荐接口地址: {example_url}" if example_url else ""
        help_text = (
            f"{model_type} 选择的提供商基于 OpenAI 兼容网关,必须填写完整的接口地址。"
            f"{extra}"
        )
        return help_text, True, example_url

    return default_help, False, ""

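A miniature version of the helper above, reduced to the lookup that decides whether a Base URL is mandatory (the gateway table is copied from the diff; the function name is a stand-in):

```python
GATEWAY_BASE_URLS = {
    "siliconflow": "https://api.siliconflow.cn/v1",
    "openrouter": "https://openrouter.ai/api/v1",
    "moonshot": "https://api.moonshot.cn/v1",
}

def base_url_requirement(provider):
    # Known OpenAI-compatible gateways require an explicit Base URL;
    # for every other provider it stays optional.
    url = GATEWAY_BASE_URLS.get((provider or "").lower())
    if url is not None:
        return True, url
    return False, ""
```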

def validate_api_key(api_key: str, provider: str) -> tuple[bool, str]:

@@ -316,9 +349,26 @@ def test_litellm_vision_model(api_key: str, base_url: str, model_name: str, tr)
    old_key = os.environ.get(env_var)
    os.environ[env_var] = api_key

    # SiliconFlow 特殊处理:使用 OpenAI 兼容模式
    test_model_name = model_name
    if provider.lower() == "siliconflow":
        # 替换 provider 为 openai
        if "/" in model_name:
            test_model_name = f"openai/{model_name.split('/', 1)[1]}"
        else:
            test_model_name = f"openai/{model_name}"

        # 确保设置了 base_url
        if not base_url:
            base_url = "https://api.siliconflow.cn/v1"

        # 设置 OPENAI_API_KEY (SiliconFlow 使用 OpenAI 协议)
        os.environ["OPENAI_API_KEY"] = api_key
        os.environ["OPENAI_API_BASE"] = base_url

    try:
        # 创建测试图片(1x1 白色像素)
        test_image = Image.new('RGB', (1, 1), color='white')
        # 创建测试图片(64x64 白色像素,避免某些模型对极小图片的限制)
        test_image = Image.new('RGB', (64, 64), color='white')
        img_buffer = io.BytesIO()
        test_image.save(img_buffer, format='JPEG')
        img_bytes = img_buffer.getvalue()

@@ -340,7 +390,7 @@ def test_litellm_vision_model(api_key: str, base_url: str, model_name: str, tr)

        # 准备参数
        completion_kwargs = {
            "model": model_name,
            "model": test_model_name,
            "messages": messages,
            "temperature": 0.1,
            "max_tokens": 50

@@ -363,6 +413,11 @@ def test_litellm_vision_model(api_key: str, base_url: str, model_name: str, tr)
            os.environ[env_var] = old_key
        else:
            os.environ.pop(env_var, None)

        # 清理临时设置的 OpenAI 环境变量
        if provider.lower() == "siliconflow":
            os.environ.pop("OPENAI_API_KEY", None)
            os.environ.pop("OPENAI_API_BASE", None)

    except Exception as e:
        error_msg = str(e)

@@ -415,6 +470,23 @@ def test_litellm_text_model(api_key: str, base_url: str, model_name: str, tr) ->
    old_key = os.environ.get(env_var)
    os.environ[env_var] = api_key

    # SiliconFlow 特殊处理:使用 OpenAI 兼容模式
    test_model_name = model_name
    if provider.lower() == "siliconflow":
        # 替换 provider 为 openai
        if "/" in model_name:
            test_model_name = f"openai/{model_name.split('/', 1)[1]}"
        else:
            test_model_name = f"openai/{model_name}"

        # 确保设置了 base_url
        if not base_url:
            base_url = "https://api.siliconflow.cn/v1"

        # 设置 OPENAI_API_KEY (SiliconFlow 使用 OpenAI 协议)
        os.environ["OPENAI_API_KEY"] = api_key
        os.environ["OPENAI_API_BASE"] = base_url

    try:
        # 构建测试请求
        messages = [

@@ -423,7 +495,7 @@ def test_litellm_text_model(api_key: str, base_url: str, model_name: str, tr) ->

        # 准备参数
        completion_kwargs = {
            "model": model_name,
            "model": test_model_name,
            "messages": messages,
            "temperature": 0.1,
            "max_tokens": 20

@@ -446,6 +518,11 @@ def test_litellm_text_model(api_key: str, base_url: str, model_name: str, tr) ->
            os.environ[env_var] = old_key
        else:
            os.environ.pop(env_var, None)

        # 清理临时设置的 OpenAI 环境变量
        if provider.lower() == "siliconflow":
            os.environ.pop("OPENAI_API_KEY", None)
            os.environ.pop("OPENAI_API_BASE", None)

    except Exception as e:
        error_msg = str(e)

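The SiliconFlow special-casing that appears twice above (once per test function) boils down to a provider-prefix rewrite, sketched here as a standalone helper:

```python
def to_openai_compatible(model_name, base_url="", default_base="https://api.siliconflow.cn/v1"):
    # LiteLLM talks to OpenAI-compatible gateways through the "openai/" prefix,
    # so "siliconflow/org/model" becomes "openai/org/model".
    if "/" in model_name:
        rewritten = f"openai/{model_name.split('/', 1)[1]}"
    else:
        rewritten = f"openai/{model_name}"
    return rewritten, base_url or default_base
```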
@@ -469,23 +546,61 @@ def render_vision_llm_settings(tr):
    config.app["vision_llm_provider"] = "litellm"

    # 获取已保存的 LiteLLM 配置
    vision_model_name = config.app.get("vision_litellm_model_name", "gemini/gemini-2.0-flash-lite")
    full_vision_model_name = config.app.get("vision_litellm_model_name", "gemini/gemini-2.0-flash-lite")
    vision_api_key = config.app.get("vision_litellm_api_key", "")
    vision_base_url = config.app.get("vision_litellm_base_url", "")

    # 解析 provider 和 model
    default_provider = "gemini"
    default_model = "gemini-2.0-flash-lite"

    if "/" in full_vision_model_name:
        parts = full_vision_model_name.split("/", 1)
        current_provider = parts[0]
        current_model = parts[1]
    else:
        current_provider = default_provider
        current_model = full_vision_model_name

    # 定义支持的 provider 列表
    LITELLM_PROVIDERS = [
        "openai", "gemini", "deepseek", "qwen", "siliconflow", "moonshot",
        "anthropic", "azure", "ollama", "vertex_ai", "mistral", "codestral",
        "volcengine", "groq", "cohere", "together_ai", "fireworks_ai",
        "openrouter", "replicate", "huggingface", "xai", "deepgram", "vllm",
        "bedrock", "cloudflare"
    ]

    # 如果当前 provider 不在列表中,添加到列表头部
    if current_provider not in LITELLM_PROVIDERS:
        LITELLM_PROVIDERS.insert(0, current_provider)

    # 渲染配置输入框
    st_vision_model_name = st.text_input(
        tr("Vision Model Name"),
        value=vision_model_name,
        help="LiteLLM 模型格式: provider/model\n\n"
             "常用示例:\n"
             "• gemini/gemini-2.0-flash-lite (推荐,速度快)\n"
             "• gemini/gemini-1.5-pro (高精度)\n"
             "• openai/gpt-4o, openai/gpt-4o-mini\n"
             "• qwen/qwen2.5-vl-32b-instruct\n"
             "• siliconflow/Qwen/Qwen2.5-VL-32B-Instruct\n\n"
             "支持 100+ providers,详见: https://docs.litellm.ai/docs/providers"
    )
    col1, col2 = st.columns([1, 2])
    with col1:
        selected_provider = st.selectbox(
            tr("Vision Model Provider"),
            options=LITELLM_PROVIDERS,
            index=LITELLM_PROVIDERS.index(current_provider) if current_provider in LITELLM_PROVIDERS else 0,
            key="vision_provider_select"
        )

    with col2:
        model_name_input = st.text_input(
            tr("Vision Model Name"),
            value=current_model,
            help="输入模型名称(不包含 provider 前缀)\n\n"
                 "常用示例:\n"
                 "• gemini-2.0-flash-lite\n"
                 "• gpt-4o\n"
                 "• qwen-vl-max\n"
                 "• Qwen/Qwen2.5-VL-32B-Instruct (SiliconFlow)\n\n"
                 "支持 100+ providers,详见: https://docs.litellm.ai/docs/providers",
            key="vision_model_input"
        )

    # 组合完整的模型名称
    st_vision_model_name = f"{selected_provider}/{model_name_input}" if selected_provider and model_name_input else ""

    st_vision_api_key = st.text_input(
        tr("Vision API Key"),

@@ -499,23 +614,25 @@ def render_vision_llm_settings(tr):
             "• SiliconFlow: https://cloud.siliconflow.cn/account/ak"
    )

    vision_base_help, vision_base_required, vision_placeholder = build_base_url_help(
        selected_provider, "视频分析模型"
    )
    st_vision_base_url = st.text_input(
        tr("Vision Base URL"),
        value=vision_base_url,
        help="自定义 API 端点(可选)\n\n"
             "留空使用默认端点。可用于:\n"
             "• 代理地址(如通过 CloudFlare)\n"
             "• 私有部署的模型服务\n"
             "• 自定义网关\n\n"
             "示例: https://your-proxy.com/v1"
        help=vision_base_help,
        placeholder=vision_placeholder or None
    )
    if vision_base_required and not st_vision_base_url:
        info_example = vision_placeholder or "https://your-openai-compatible-endpoint/v1"
        st.info(f"请在上方填写 OpenAI 兼容网关地址,例如:{info_example}")

    # 添加测试连接按钮
    if st.button(tr("Test Connection"), key="test_vision_connection"):
        test_errors = []
        if not st_vision_api_key:
            test_errors.append("请先输入 API 密钥")
        if not st_vision_model_name:
        if not model_name_input:
            test_errors.append("请先输入模型名称")

        if test_errors:

@@ -545,6 +662,7 @@ def render_vision_llm_settings(tr):

    # 验证模型名称
    if st_vision_model_name:
        # 这里的验证逻辑可能需要微调,因为我们现在是自动组合的
        is_valid, error_msg = validate_litellm_model_name(st_vision_model_name, "视频分析")
        if is_valid:
            config.app["vision_litellm_model_name"] = st_vision_model_name

@@ -580,6 +698,8 @@ def render_vision_llm_settings(tr):
    if config_changed and not validation_errors:
        try:
            config.save_config()
            # 清除缓存,确保下次使用新配置
            UnifiedLLMService.clear_cache()
            if st_vision_api_key or st_vision_base_url or st_vision_model_name:
                st.success(f"视频分析模型配置已保存(LiteLLM)")
        except Exception as e:

@@ -698,24 +818,61 @@ def render_text_llm_settings(tr):
    config.app["text_llm_provider"] = "litellm"

    # 获取已保存的 LiteLLM 配置
    text_model_name = config.app.get("text_litellm_model_name", "deepseek/deepseek-chat")
    full_text_model_name = config.app.get("text_litellm_model_name", "deepseek/deepseek-chat")
    text_api_key = config.app.get("text_litellm_api_key", "")
    text_base_url = config.app.get("text_litellm_base_url", "")

    # 解析 provider 和 model
    default_provider = "deepseek"
    default_model = "deepseek-chat"

    if "/" in full_text_model_name:
        parts = full_text_model_name.split("/", 1)
        current_provider = parts[0]
        current_model = parts[1]
    else:
        current_provider = default_provider
        current_model = full_text_model_name

    # 定义支持的 provider 列表
    LITELLM_PROVIDERS = [
        "openai", "gemini", "deepseek", "qwen", "siliconflow", "moonshot",
        "anthropic", "azure", "ollama", "vertex_ai", "mistral", "codestral",
        "volcengine", "groq", "cohere", "together_ai", "fireworks_ai",
        "openrouter", "replicate", "huggingface", "xai", "deepgram", "vllm",
        "bedrock", "cloudflare"
    ]

    # 如果当前 provider 不在列表中,添加到列表头部
    if current_provider not in LITELLM_PROVIDERS:
        LITELLM_PROVIDERS.insert(0, current_provider)

    # 渲染配置输入框
    st_text_model_name = st.text_input(
        tr("Text Model Name"),
        value=text_model_name,
        help="LiteLLM 模型格式: provider/model\n\n"
             "常用示例:\n"
             "• deepseek/deepseek-chat (推荐,性价比高)\n"
             "• gemini/gemini-2.0-flash (速度快)\n"
             "• openai/gpt-4o, openai/gpt-4o-mini\n"
             "• qwen/qwen-plus, qwen/qwen-turbo\n"
             "• siliconflow/deepseek-ai/DeepSeek-R1\n"
             "• moonshot/moonshot-v1-8k\n\n"
             "支持 100+ providers,详见: https://docs.litellm.ai/docs/providers"
    )
    col1, col2 = st.columns([1, 2])
    with col1:
        selected_provider = st.selectbox(
            tr("Text Model Provider"),
            options=LITELLM_PROVIDERS,
            index=LITELLM_PROVIDERS.index(current_provider) if current_provider in LITELLM_PROVIDERS else 0,
            key="text_provider_select"
        )

    with col2:
        model_name_input = st.text_input(
            tr("Text Model Name"),
            value=current_model,
            help="输入模型名称(不包含 provider 前缀)\n\n"
                 "常用示例:\n"
                 "• deepseek-chat\n"
                 "• gpt-4o\n"
                 "• gemini-2.0-flash\n"
                 "• deepseek-ai/DeepSeek-R1 (SiliconFlow)\n\n"
                 "支持 100+ providers,详见: https://docs.litellm.ai/docs/providers",
            key="text_model_input"
        )

    # 组合完整的模型名称
    st_text_model_name = f"{selected_provider}/{model_name_input}" if selected_provider and model_name_input else ""

    st_text_api_key = st.text_input(
        tr("Text API Key"),

@@ -731,23 +888,25 @@ def render_text_llm_settings(tr):
             "• Moonshot: https://platform.moonshot.cn/console/api-keys"
    )

    text_base_help, text_base_required, text_placeholder = build_base_url_help(
        selected_provider, "文案生成模型"
    )
    st_text_base_url = st.text_input(
        tr("Text Base URL"),
        value=text_base_url,
        help="自定义 API 端点(可选)\n\n"
             "留空使用默认端点。可用于:\n"
             "• 代理地址(如通过 CloudFlare)\n"
             "• 私有部署的模型服务\n"
             "• 自定义网关\n\n"
             "示例: https://your-proxy.com/v1"
        help=text_base_help,
        placeholder=text_placeholder or None
    )
    if text_base_required and not st_text_base_url:
        info_example = text_placeholder or "https://your-openai-compatible-endpoint/v1"
        st.info(f"请在上方填写 OpenAI 兼容网关地址,例如:{info_example}")

    # 添加测试连接按钮
    if st.button(tr("Test Connection"), key="test_text_connection"):
        test_errors = []
        if not st_text_api_key:
            test_errors.append("请先输入 API 密钥")
        if not st_text_model_name:
        if not model_name_input:
            test_errors.append("请先输入模型名称")

        if test_errors:

@@ -812,6 +971,8 @@ def render_text_llm_settings(tr):
    if text_config_changed and not text_validation_errors:
        try:
            config.save_config()
            # 清除缓存,确保下次使用新配置
            UnifiedLLMService.clear_cache()
            if st_text_api_key or st_text_base_url or st_text_model_name:
                st.success(f"文案生成模型配置已保存(LiteLLM)")
        except Exception as e:

@@ -49,90 +49,160 @@ def render_script_panel(tr):

def render_script_file(tr, params):
    """渲染脚本文件选择"""
    script_list = [
        (tr("None"), ""),
        (tr("Auto Generate"), "auto"),
        (tr("Short Generate"), "short"),
        (tr("Short Drama Summary"), "summary"),
        (tr("Upload Script"), "upload_script")
    ]
    # 定义功能模式
    MODE_FILE = "file_selection"
    MODE_AUTO = "auto"
    MODE_SHORT = "short"
    MODE_SUMMARY = "summary"

    # 获取已有脚本文件
    suffix = "*.json"
    script_dir = utils.script_dir()
    files = glob.glob(os.path.join(script_dir, suffix))
    file_list = []
    # 模式选项映射
    mode_options = {
        tr("Select/Upload Script"): MODE_FILE,
        tr("Auto Generate"): MODE_AUTO,
        tr("Short Generate"): MODE_SHORT,
        tr("Short Drama Summary"): MODE_SUMMARY,
    }

    # 获取当前状态
    current_path = st.session_state.get('video_clip_json_path', '')

    # 确定当前选中的模式索引
    default_index = 0
    mode_keys = list(mode_options.keys())

    if current_path == "auto":
        default_index = mode_keys.index(tr("Auto Generate"))
    elif current_path == "short":
        default_index = mode_keys.index(tr("Short Generate"))
    elif current_path == "summary":
        default_index = mode_keys.index(tr("Short Drama Summary"))
    else:
        default_index = mode_keys.index(tr("Select/Upload Script"))

    for file in files:
        file_list.append({
            "name": os.path.basename(file),
            "file": file,
            "ctime": os.path.getctime(file)
        })
    # 1. 渲染功能选择下拉框
    # 使用 segmented_control 替代 selectbox,提供更好的视觉体验
    default_mode_label = mode_keys[default_index]

    # 定义回调函数来处理状态更新
    def update_script_mode():
        # 获取当前选中的标签
        selected_label = st.session_state.script_mode_selection
        if selected_label:
            # 更新实际的 path 状态
            new_mode = mode_options[selected_label]
            st.session_state.video_clip_json_path = new_mode
            params.video_clip_json_path = new_mode
        else:
            # 如果用户取消选择(segmented_control 允许取消),恢复到默认或上一个状态
            # 这里我们强制保持当前状态,或者重置为默认
            st.session_state.script_mode_selection = default_mode_label

    file_list.sort(key=lambda x: x["ctime"], reverse=True)
    for file in file_list:
        display_name = file['file'].replace(config.root_dir, "")
        script_list.append((display_name, file['file']))

    # 找到保存的脚本文件在列表中的索引
    saved_script_path = st.session_state.get('video_clip_json_path', '')
    selected_index = 0
    for i, (_, path) in enumerate(script_list):
        if path == saved_script_path:
            selected_index = i
            break

    selected_script_index = st.selectbox(
        tr("Script Files"),
        index=selected_index,
        options=range(len(script_list)),
        format_func=lambda x: script_list[x][0]
    # 渲染组件
    selected_mode_label = st.segmented_control(
        tr("Video Type"),
        options=mode_keys,
        default=default_mode_label,
        key="script_mode_selection",
        on_change=update_script_mode
    )

    # 处理未选择的情况(虽然有default,但在某些交互下可能为空)
    if not selected_mode_label:
        selected_mode_label = default_mode_label

    selected_mode = mode_options[selected_mode_label]
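The label-to-mode resolution above, including the fallback for when `segmented_control` returns no selection, can be sketched without Streamlit (labels are shown untranslated here):

```python
MODE_OPTIONS = {
    "Select/Upload Script": "file_selection",
    "Auto Generate": "auto",
    "Short Generate": "short",
    "Short Drama Summary": "summary",
}

def resolve_mode(selected_label, current_path):
    # Mirror the widget logic: use the selection when present, otherwise
    # fall back to whatever mode the saved path implies.
    if selected_label:
        return MODE_OPTIONS[selected_label]
    if current_path in MODE_OPTIONS.values():
        return current_path
    return "file_selection"  # a concrete file path implies file-selection mode
```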
script_path = script_list[selected_script_index][1]
|
||||
st.session_state['video_clip_json_path'] = script_path
|
||||
params.video_clip_json_path = script_path
|
||||
# 2. 根据选择的模式处理逻辑
|
||||
if selected_mode == MODE_FILE:
|
||||
# --- 文件选择模式 ---
|
||||
script_list = [
|
||||
(tr("None"), ""),
|
||||
(tr("Upload Script"), "upload_script")
|
||||
]
|
||||
|
||||
# 处理脚本上传
|
||||
if script_path == "upload_script":
|
||||
uploaded_file = st.file_uploader(
|
||||
tr("Upload Script File"),
|
||||
type=["json"],
|
||||
accept_multiple_files=False,
|
||||
# 获取已有脚本文件
|
||||
suffix = "*.json"
|
||||
script_dir = utils.script_dir()
|
||||
files = glob.glob(os.path.join(script_dir, suffix))
|
||||
file_list = []
|
||||
|
||||
for file in files:
|
||||
file_list.append({
|
||||
"name": os.path.basename(file),
|
||||
"file": file,
|
||||
"ctime": os.path.getctime(file)
|
||||
})
|
||||
|
||||
file_list.sort(key=lambda x: x["ctime"], reverse=True)
|
||||
for file in file_list:
|
||||
display_name = file['file'].replace(config.root_dir, "")
|
||||
script_list.append((display_name, file['file']))
|
||||
|
||||
# 找到保存的脚本文件在列表中的索引
|
||||
# 如果当前path是特殊值(auto/short/summary),则重置为空
|
||||
saved_script_path = current_path if current_path not in [MODE_AUTO, MODE_SHORT, MODE_SUMMARY] else ""
|
||||
|
||||
selected_index = 0
|
||||
for i, (_, path) in enumerate(script_list):
|
||||
if path == saved_script_path:
|
||||
selected_index = i
|
||||
break
|
||||
|
||||
selected_script_index = st.selectbox(
|
||||
tr("Script Files"),
|
||||
index=selected_index,
|
||||
options=range(len(script_list)),
|
||||
format_func=lambda x: script_list[x][0],
|
||||
key="script_file_selection"
|
||||
)
|
||||
|
||||
if uploaded_file is not None:
|
||||
try:
|
||||
# 读取上传的JSON内容并验证格式
|
||||
script_content = uploaded_file.read().decode('utf-8')
|
||||
json_data = json.loads(script_content)
|
||||
script_path = script_list[selected_script_index][1]
|
||||
st.session_state['video_clip_json_path'] = script_path
|
||||
params.video_clip_json_path = script_path
|
||||
|
||||
# 保存到脚本目录
|
||||
script_file_path = os.path.join(script_dir, uploaded_file.name)
|
||||
file_name, file_extension = os.path.splitext(uploaded_file.name)
|
||||
# 处理脚本上传
|
||||
 if script_path == "upload_script":
     uploaded_file = st.file_uploader(
         tr("Upload Script File"),
         type=["json"],
         accept_multiple_files=False,
     )

-    # 如果文件已存在,添加时间戳
-    if os.path.exists(script_file_path):
-        timestamp = time.strftime("%Y%m%d%H%M%S")
-        file_name_with_timestamp = f"{file_name}_{timestamp}"
-        script_file_path = os.path.join(script_dir, file_name_with_timestamp + file_extension)
     if uploaded_file is not None:
         try:
             # 读取上传的JSON内容并验证格式
             script_content = uploaded_file.read().decode('utf-8')
             json_data = json.loads(script_content)

-            # 写入文件
-            with open(script_file_path, "w", encoding='utf-8') as f:
-                json.dump(json_data, f, ensure_ascii=False, indent=2)
             # 保存到脚本目录
             script_file_path = os.path.join(script_dir, uploaded_file.name)
             file_name, file_extension = os.path.splitext(uploaded_file.name)

-            # 更新状态
-            st.success(tr("Script Uploaded Successfully"))
-            st.session_state['video_clip_json_path'] = script_file_path
-            params.video_clip_json_path = script_file_path
-            time.sleep(1)
-            st.rerun()
+            # 如果文件已存在,添加时间戳
+            if os.path.exists(script_file_path):
+                timestamp = time.strftime("%Y%m%d%H%M%S")
+                file_name_with_timestamp = f"{file_name}_{timestamp}"
+                script_file_path = os.path.join(script_dir, file_name_with_timestamp + file_extension)

-        except json.JSONDecodeError:
-            st.error(tr("Invalid JSON format"))
-        except Exception as e:
-            st.error(f"{tr('Upload failed')}: {str(e)}")
+            # 写入文件
+            with open(script_file_path, "w", encoding='utf-8') as f:
+                json.dump(json_data, f, ensure_ascii=False, indent=2)
+
+            # 更新状态
+            st.success(tr("Script Uploaded Successfully"))
+            st.session_state['video_clip_json_path'] = script_file_path
+            params.video_clip_json_path = script_file_path
+            time.sleep(1)
+            st.rerun()
+
+        except json.JSONDecodeError:
+            st.error(tr("Invalid JSON format"))
+        except Exception as e:
+            st.error(f"{tr('Upload failed')}: {str(e)}")
 else:
     # --- 功能生成模式 ---
     st.session_state['video_clip_json_path'] = selected_mode
     params.video_clip_json_path = selected_mode


 def render_video_file(tr, params):
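The reworked upload path computes the destination filename first, de-duplicates it with a timestamp, and only then writes the file. A minimal standalone sketch of that naming step (the helper name `unique_script_path` is mine, not the repository's):

```python
import os
import time


def unique_script_path(script_dir: str, file_name: str) -> str:
    """Return a save path under script_dir; append a timestamp if a file
    with the same name already exists (mirrors the diff's dedup logic)."""
    name, ext = os.path.splitext(file_name)
    path = os.path.join(script_dir, file_name)
    if os.path.exists(path):
        timestamp = time.strftime("%Y%m%d%H%M%S")
        path = os.path.join(script_dir, f"{name}_{timestamp}{ext}")
    return path
```

Doing this before the `open(..., "w")` call is what fixes the ordering bug visible in the removed lines, where the JSON was written before the final path was known.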
@@ -10,6 +10,7 @@ def render_subtitle_panel(tr):
     """渲染字幕设置面板"""
     with st.container(border=True):
         st.write(tr("Subtitle Settings"))
+        st.info("💡 提示:目前仅 **edge-tts** 引擎支持自动生成字幕,其他 TTS 引擎暂不支持。")

         # 检查是否选择了 SoulVoice qwen3_tts引擎
         from app.services import voice
@@ -150,9 +151,10 @@ def render_style_settings(tr):


 def get_subtitle_params():
     """获取字幕参数"""
+    font_name = st.session_state.get('font_name') or "SimHei"
     return {
         'subtitle_enabled': st.session_state.get('subtitle_enabled', True),
-        'font_name': st.session_state.get('font_name', ''),
+        'font_name': font_name,
         'font_size': st.session_state.get('font_size', 60),
         'text_fore_color': st.session_state.get('text_fore_color', '#FFFFFF'),
         'subtitle_position': st.session_state.get('subtitle_position', 'bottom'),
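The one-line `get_subtitle_params` change replaces an empty-string default with a real fallback font. The `or` form covers both `None` and `''`, which `dict.get('font_name', '')` does not. A tiny illustration (the function name is mine):

```python
def effective_font(session_value):
    """Falsy-coalescing default: '' and None both fall back to "SimHei",
    mirroring the st.session_state.get('font_name') or "SimHei" expression."""
    return session_value or "SimHei"
```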
@@ -152,7 +152,7 @@
     "API rate limit exceeded. Please wait about an hour and try again.": "API 调用次数已达到限制,请等待约一小时后再试。",
     "Resources exhausted. Please try again later.": "资源已耗尽,请稍后再试。",
     "Transcription Failed": "转录失败",
-    "Short Generate": "短剧混剪 (实验)",
+    "Short Generate": "短剧混剪",
     "Generate Short Video Script": "AI生成短剧混剪脚本",
     "Adjust the volume of the original audio": "调整原始音频的音量",
     "Original Volume": "视频音量",
@@ -161,6 +161,8 @@
     "Frame Interval (seconds) (More keyframes consume more tokens)": "帧间隔 (秒) (更多关键帧消耗更多令牌)",
     "Batch Size": "批处理大小",
     "Batch Size (More keyframes consume more tokens)": "批处理大小, 每批处理越少消耗 token 越多",
-    "Short Drama Summary": "短剧解说"
+    "Short Drama Summary": "短剧解说",
+    "Video Type": "视频类型",
+    "Select/Upload Script": "选择/上传脚本"
   }
 }
@@ -144,32 +144,3 @@ def get_batch_files(keyframe_files, result, batch_size=5):
     batch_start = result['batch_index'] * batch_size
     batch_end = min(batch_start + batch_size, len(keyframe_files))
     return keyframe_files[batch_start:batch_end]
-
-
-def chekc_video_config(video_params):
-    """
-    检查视频分析配置
-    """
-    headers = {
-        'accept': 'application/json',
-        'Content-Type': 'application/json'
-    }
-    session = requests.Session()
-    retry_strategy = Retry(
-        total=3,
-        backoff_factor=1,
-        status_forcelist=[500, 502, 503, 504]
-    )
-    adapter = HTTPAdapter(max_retries=retry_strategy)
-    session.mount("https://", adapter)
-    try:
-        session.post(
-            f"https://dev.narratoai.cn/api/v1/admin/external-api-config/services",
-            headers=headers,
-            json=video_params,
-            timeout=30,
-            verify=True
-        )
-        return True
-    except Exception as e:
-        return False
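The deleted `chekc_video_config` helper is mostly a `requests` session configured for automatic retries. For reference, a self-contained sketch of that retry pattern (the factory name is mine, and the remote endpoint is deliberately left out):

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def make_retrying_session() -> requests.Session:
    """Build a session that retries transient 5xx failures with
    exponential backoff, as the removed helper did."""
    retry_strategy = Retry(
        total=3,                               # at most 3 retries
        backoff_factor=1,                      # exponential backoff between attempts
        status_forcelist=[500, 502, 503, 504]  # retry only on these statuses
    )
    session = requests.Session()
    session.mount("https://", HTTPAdapter(max_retries=retry_strategy))
    return session
```

Mounting the adapter on the `https://` prefix means every HTTPS request made through this session inherits the retry policy.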
@@ -10,7 +10,7 @@ from datetime import datetime

 from app.config import config
 from app.utils import utils, video_processor
-from webui.tools.base import create_vision_analyzer, get_batch_files, get_batch_timestamps, chekc_video_config
+from webui.tools.base import create_vision_analyzer, get_batch_files, get_batch_timestamps


 def generate_script_docu(params):
@@ -398,7 +398,6 @@ def generate_script_docu(params):
         "text_model_name": text_model,
         "text_base_url": text_base_url
     })
-    chekc_video_config(llm_params)
     # 整理帧分析数据
     markdown_output = parse_frame_analysis_to_markdown(analysis_json_path)

@@ -8,7 +8,6 @@ import streamlit as st
 from loguru import logger

 from app.config import config
-from webui.tools.base import chekc_video_config


 def generate_script_short(tr, params, custom_clips=5):
@@ -59,7 +58,6 @@ def generate_script_short(tr, params, custom_clips=5):
         "text_model_name": text_model,
         "text_base_url": text_base_url or ""
     }
-    chekc_video_config(api_params)
     from app.services.SDP.generate_script_short import generate_script
     script = generate_script(
         srt_path=srt_path,
@@ -8,7 +8,8 @@ def get_fonts_cache(font_dir):
     fonts = []
     for root, dirs, files in os.walk(font_dir):
         for file in files:
-            if file.endswith(".ttf") or file.endswith(".ttc"):
+            # 支持常见字体格式,少字体时也能被UI识别
+            if file.lower().endswith((".ttf", ".ttc", ".otf")):
                 fonts.append(file)
     fonts.sort()
     st.session_state['fonts_cache'] = fonts
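The widened check in `get_fonts_cache` is case-insensitive and adds `.otf` by passing a tuple to `str.endswith`. A standalone sketch of that filter (names here are illustrative):

```python
# Case-insensitive font-extension filter, matching the new
# file.lower().endswith((".ttf", ".ttc", ".otf")) check.
FONT_EXTS = (".ttf", ".ttc", ".otf")


def filter_fonts(filenames):
    """Keep supported font files and sort them, like get_fonts_cache does."""
    return sorted(f for f in filenames if f.lower().endswith(FONT_EXTS))
```

Lower-casing the filename before the check is what lets files such as `FONT.TTF` pass, which the old `file.endswith(".ttf")` comparison silently skipped.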
@@ -30,4 +31,4 @@ def get_songs_cache(song_dir):
             if file.endswith(".mp3"):
                 songs.append(file)
     st.session_state['songs_cache'] = songs
-    return st.session_state['songs_cache']
\ No newline at end of file
+    return st.session_state['songs_cache']