Compare commits


12 Commits
v0.7.4...main

Author SHA1 Message Date
linyq
92d8c8470f fix: remove leftover development/debug code that caused unintended uploads. Fixed in the current release; server-side logs have been purged, and all users are advised to reset their keys 2025-12-11 16:12:11 +08:00
linyq
56c2e28b94 Update the example config file and remove the Japanese README (resolutely upholding China's 🇨🇳 territorial sovereignty 🔥) 2025-11-20 10:32:33 +08:00
linyq
a3ece54b60 fix: remove unused logger import 2025-11-20 00:18:01 +08:00
linyq
ee36adcc93 fix: remove unused tkinter import 2025-11-20 00:16:14 +08:00
linyq
fcafbe52f4 fix: bump version to 0.7.5 2025-11-20 00:02:33 +08:00
linyq
cda5760e37 feat: add support for the IndexTTS2 zero-shot voice-cloning engine
Adds IndexTTS2 TTS engine configuration and implementation with zero-shot voice cloning: config save/load, API calls, reference-audio upload, advanced parameter settings (temperature, top_p, top_k, beam search, repetition penalty, etc.), plus a complete configuration UI and usage notes in the WebUI.
2025-11-20 00:01:49 +08:00
linyq
d75c2e000f feat: show a hint about which TTS engines support subtitles 2025-11-19 21:05:21 +08:00
linyq
6c8a56a51c feat: add basic settings entries with Chinese translations 2025-11-19 20:59:42 +08:00
linyq
c77b251213 fix: polish title styling 2025-11-19 20:39:43 +08:00
linyq
5254798464 feat: update the WebUI to support the new features 2025-11-19 20:36:14 +08:00
linyq
238c1c13f1 feat: improve LLM service configuration and migration handling; update related UI settings and Chinese translations 2025-11-19 20:00:08 +08:00
linyq
6697535c57 feat: enhance LiteLLM provider configuration and update the basic settings UI 2025-11-19 19:10:07 +08:00
17 changed files with 725 additions and 279 deletions

View File

@@ -1,84 +0,0 @@
<div align="center">
<h1 align="center" style="font-size: 2cm;"> NarratoAI 😎📽️ </h1>
<h3 align="center">All-in-one AI film narration and automated video editing tool 🎬🎞 </h3>
<h3>📖 <a href="README-cn.md">简体中文</a> | <a href="README.md">English</a> | 日本語 </h3>
<div align="center">
[//]: # ( <a href="https://trendshift.io/repositories/8731" target="_blank"><img src="https://trendshift.io/api/badge/repositories/8731" alt="harry0703%2FNarratoAI | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>)
</div>
<br>
NarratoAI is an automated video narration tool that leverages LLMs to provide an all-in-one solution for script writing, automated video editing, voice-over, and subtitle generation.
<br>
[![madewithlove](https://img.shields.io/badge/made_with-%E2%9D%A4-red?style=for-the-badge&labelColor=orange)](https://github.com/linyqh/NarratoAI)
[![GitHub license](https://img.shields.io/github/license/linyqh/NarratoAI?style=for-the-badge)](https://github.com/linyqh/NarratoAI/blob/main/LICENSE)
[![GitHub issues](https://img.shields.io/github/issues/linyqh/NarratoAI?style=for-the-badge)](https://github.com/linyqh/NarratoAI/issues)
[![GitHub stars](https://img.shields.io/github/stars/linyqh/NarratoAI?style=for-the-badge)](https://github.com/linyqh/NarratoAI/stargazers)
<a href="https://discord.gg/uVAJftcm" target="_blank">💬 Discordオープンソースコミュニティに参加して、プロジェクトの最新情報を入手しましょう。</a>
<h2><a href="https://p9mf6rjv3c.feishu.cn/wiki/SP8swLLZki5WRWkhuFvc2CyInDg?from=from_copylink" target="_blank">🎉🎉🎉 公式ドキュメント 🎉🎉🎉</a> </h2>
<h3>ホーム</h3>
![](docs/index-zh.png)
<h3>Video review interface</h3>
![](docs/check-zh.png)
</div>
## Latest News
- 2024.11.24 Discord community launched: https://discord.gg/uVAJftcm
- 2024.11.11 Migrated to an open-source community; everyone is welcome to join! [Join the official community](https://github.com/linyqh/NarratoAI/wiki)
- 2024.11.10 Official documentation released; see the [official docs](https://p9mf6rjv3c.feishu.cn/wiki/SP8swLLZki5WRWkhuFvc2CyInDg) for details
- 2024.11.10 New version v0.3.5 released; video editing workflow optimized
## Roadmap 🥳
- [x] Windows all-in-one package released
- [x] Optimized the story generation workflow for better results
- [x] Version 0.3.5 all-in-one package released
- [x] Video understanding via Alibaba's Qwen2-VL large model
- [x] Short-drama narration support
- [x] One-click material merging
- [x] One-click transcription
- [x] One-click cache clearing
- [ ] Support exporting Jianying (CapCut) drafts
- [ ] Main-character face matching
- [ ] Automatic matching based on voice-over, script, and video material
- [ ] Support more TTS engines
- [ ] ...
## System Requirements 📦
- Recommended minimum: 4-core CPU or better and 8 GB+ RAM; no GPU required
- Windows 10 or macOS 11.0 and above
## Feedback & Suggestions 📢
👏 1. Open an [issue](https://github.com/linyqh/NarratoAI/issues) or a [pull request](https://github.com/linyqh/NarratoAI/pulls)
💬 2. [Join the open-source community chat group](https://github.com/linyqh/NarratoAI/wiki)
📷 3. Follow the official account 【NarratoAI助手】 for the latest updates
## Reference Projects 📚
- https://github.com/FujiwaraChoki/MoneyPrinter
- https://github.com/harry0703/MoneyPrinterTurbo
This project was refactored from the projects above, with film narration features added. Many thanks to the original authors 🥳🥳🥳
## Buy the Author a Coffee ☕️
<div style="display: flex; justify-content: space-between;">
<img src="https://github.com/user-attachments/assets/5038ccfb-addf-4db1-9966-99415989fd0c" alt="Image 1" style="width: 350px; height: 350px; margin: auto;"/>
<img src="https://github.com/user-attachments/assets/07d4fd58-02f0-425c-8b59-2ab94b4f09f8" alt="Image 2" style="width: 350px; height: 350px; margin: auto;"/>
</div>
## License 📝
Click the [`LICENSE`](LICENSE) file to view
## Star History
[![Star History Chart](https://api.star-history.com/svg?repos=linyqh/NarratoAI&type=Date)](https://star-history.com/#linyqh/NarratoAI&Date)

View File

@@ -4,7 +4,7 @@
<h3 align="center">One-stop AI film narration + automated editing tool 🎬🎞️ </h3>
<h3>📖 <a href="README-en.md">English</a> | 简体中文 | <a href="README-ja.md">日本語</a> </h3>
<h3>📖 <a href="README-en.md">English</a> | 简体中文 </h3>
<div align="center">
[//]: # ( <a href="https://trendshift.io/repositories/8731" target="_blank"><img src="https://trendshift.io/api/badge/repositories/8731" alt="harry0703%2FNarratoAI | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>)
@@ -31,6 +31,7 @@ NarratoAI is an automated film narration tool that uses LLMs for script writing, ...
This project is for learning and research only; commercial use is not permitted. For a commercial license, contact the author.
## Latest News
- 2025.11.20 Released v0.7.5: added [IndexTTS2](https://github.com/index-tts/index-tts) voice-cloning support
- 2025.10.15 Released v0.7.3: model providers are now managed via [LiteLLM](https://github.com/BerriAI/litellm)
- 2025.09.10 Released v0.7.2: added Tencent Cloud TTS
- 2025.08.18 Released v0.7.1: support for **voice cloning** and the latest large models

View File

@@ -52,6 +52,7 @@ def save_config():
_cfg["soulvoice"] = soulvoice
_cfg["ui"] = ui
_cfg["tts_qwen"] = tts_qwen
_cfg["indextts2"] = indextts2
f.write(toml.dumps(_cfg))
@@ -65,6 +66,7 @@ soulvoice = _cfg.get("soulvoice", {})
ui = _cfg.get("ui", {})
frames = _cfg.get("frames", {})
tts_qwen = _cfg.get("tts_qwen", {})
indextts2 = _cfg.get("indextts2", {})
hostname = socket.gethostname()

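For orientation, the two hunks above extend a plain read-merge-write TOML round-trip: every config section is a dict that defaults to `{}` when absent. A minimal sketch of that pattern, assuming the `toml` package; the file path and function shapes are illustrative, not the project's actual module layout:

```python
# Sketch of the read-merge-write config pattern the hunks extend.
# Assumes the `toml` package; CONFIG_FILE is a hypothetical path.
import toml

CONFIG_FILE = "config.toml"

def save_section(name: str, section: dict) -> None:
    # Read the whole document, merge in one section, write it back.
    with open(CONFIG_FILE, "r", encoding="utf-8") as f:
        _cfg = toml.loads(f.read())
    _cfg[name] = section
    with open(CONFIG_FILE, "w", encoding="utf-8") as f:
        f.write(toml.dumps(_cfg))

def load_section(name: str) -> dict:
    with open(CONFIG_FILE, "r", encoding="utf-8") as f:
        _cfg = toml.loads(f.read())
    # A missing section falls back to {}, mirroring `_cfg.get("indextts2", {})`.
    return _cfg.get(name, {})
```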
View File

@@ -187,8 +187,27 @@ class LiteLLMVisionProvider(VisionModelProvider):
# Call LiteLLM
try:
# Prepare parameters
effective_model_name = self.model_name
# Special-case SiliconFlow
if self.model_name.lower().startswith("siliconflow/"):
# Swap the provider prefix to openai
if "/" in self.model_name:
effective_model_name = f"openai/{self.model_name.split('/', 1)[1]}"
else:
effective_model_name = f"openai/{self.model_name}"
# Make sure OPENAI_API_KEY is set (if not already)
import os
if not os.environ.get("OPENAI_API_KEY") and os.environ.get("SILICONFLOW_API_KEY"):
os.environ["OPENAI_API_KEY"] = os.environ.get("SILICONFLOW_API_KEY")
# Make sure base_url is set (if not already)
if not hasattr(self, '_api_base'):
self._api_base = "https://api.siliconflow.cn/v1"
completion_kwargs = {
"model": self.model_name,
"model": effective_model_name,
"messages": messages,
"temperature": kwargs.get("temperature", 1.0),
"max_tokens": kwargs.get("max_tokens", 4000)
@@ -198,6 +217,12 @@ class LiteLLMVisionProvider(VisionModelProvider):
if hasattr(self, '_api_base'):
completion_kwargs["api_base"] = self._api_base
# Support passing api_key and api_base dynamically
if "api_key" in kwargs:
completion_kwargs["api_key"] = kwargs["api_key"]
if "api_base" in kwargs:
completion_kwargs["api_base"] = kwargs["api_base"]
response = await acompletion(**completion_kwargs)
if response.choices and len(response.choices) > 0:
@@ -346,8 +371,27 @@ class LiteLLMTextProvider(TextModelProvider):
messages = self._build_messages(prompt, system_prompt)
# Prepare parameters
effective_model_name = self.model_name
# Special-case SiliconFlow
if self.model_name.lower().startswith("siliconflow/"):
# Swap the provider prefix to openai
if "/" in self.model_name:
effective_model_name = f"openai/{self.model_name.split('/', 1)[1]}"
else:
effective_model_name = f"openai/{self.model_name}"
# Make sure OPENAI_API_KEY is set (if not already)
import os
if not os.environ.get("OPENAI_API_KEY") and os.environ.get("SILICONFLOW_API_KEY"):
os.environ["OPENAI_API_KEY"] = os.environ.get("SILICONFLOW_API_KEY")
# Make sure base_url is set (if not already)
if not hasattr(self, '_api_base'):
self._api_base = "https://api.siliconflow.cn/v1"
completion_kwargs = {
"model": self.model_name,
"model": effective_model_name,
"messages": messages,
"temperature": temperature
}
@@ -369,6 +413,12 @@ class LiteLLMTextProvider(TextModelProvider):
if hasattr(self, '_api_base'):
completion_kwargs["api_base"] = self._api_base
# Support passing api_key and api_base dynamically (fixes an auth issue)
if "api_key" in kwargs:
completion_kwargs["api_key"] = kwargs["api_key"]
if "api_base" in kwargs:
completion_kwargs["api_base"] = kwargs["api_base"]
try:
# Call LiteLLM (with automatic retries)
response = await acompletion(**completion_kwargs)

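Both providers now special-case SiliconFlow: because LiteLLM has no native `siliconflow/` provider, the model prefix is rewritten to `openai/` and the request is pointed at SiliconFlow's OpenAI-compatible endpoint. A standalone sketch of that remapping; the helper name and return shape are ours, not the project's:

```python
import os

SILICONFLOW_BASE = "https://api.siliconflow.cn/v1"  # OpenAI-compatible endpoint

def remap_siliconflow(model_name: str, api_base: str | None) -> tuple[str, str | None]:
    """Rewrite "siliconflow/<model>" to "openai/<model>", mirroring the branch in the diff."""
    if not model_name.lower().startswith("siliconflow/"):
        return model_name, api_base
    effective = f"openai/{model_name.split('/', 1)[1]}"
    # Reuse the SiliconFlow key under OPENAI_API_KEY if none is set yet.
    if not os.environ.get("OPENAI_API_KEY") and os.environ.get("SILICONFLOW_API_KEY"):
        os.environ["OPENAI_API_KEY"] = os.environ["SILICONFLOW_API_KEY"]
    return effective, api_base or SILICONFLOW_BASE
```

The explicit `api_key`/`api_base` kwargs added further down serve the same goal from the other direction: credentials entered in the UI reach `acompletion` directly instead of relying on process-wide environment variables.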
View File

@@ -251,7 +251,9 @@ class SubtitleAnalyzerAdapter:
UnifiedLLMService.analyze_subtitle,
subtitle_content=subtitle_content,
provider=self.provider,
temperature=1.0
temperature=1.0,
api_key=self.api_key,
api_base=self.base_url
)
return {
@@ -301,7 +303,9 @@
system_prompt="你是一位专业的短视频解说脚本撰写专家。",
provider=self.provider,
temperature=temperature,
response_format="json"
response_format="json",
api_key=self.api_key,
api_base=self.base_url
)
# Clean up the JSON output

View File

@@ -1107,6 +1107,10 @@ def tts(
if tts_engine == "edge_tts":
logger.info("分发到 Edge TTS")
return azure_tts_v1(text, voice_name, voice_rate, voice_pitch, voice_file)
if tts_engine == "indextts2":
logger.info("分发到 IndexTTS2")
return indextts2_tts(text, voice_name, voice_file, speed=voice_rate)
# Fallback for unknown engine - default to azure v1
logger.warning(f"未知的 TTS 引擎: '{tts_engine}', 将默认使用 Edge TTS (Azure V1)。")
@@ -1541,8 +1545,8 @@ def tts_multiple(task_id: str, list_script: list, voice_name: str, voice_rate: f
f"或者使用其他 tts 引擎")
continue
else:
# The SoulVoice engine does not generate subtitle files
if is_soulvoice_voice(voice_name) or is_qwen_engine(tts_engine):
# The SoulVoice, Qwen3, and IndexTTS2 engines do not generate subtitle files
if is_soulvoice_voice(voice_name) or is_qwen_engine(tts_engine) or tts_engine == "indextts2":
# Get the duration of the actual audio file
duration = get_audio_duration_from_file(audio_file)
if duration <= 0:
@@ -1943,4 +1947,127 @@ def parse_soulvoice_voice(voice_name: str) -> str:
return voice_name
def parse_indextts2_voice(voice_name: str) -> str:
"""
Parse an IndexTTS2 voice name.
Supported format: indextts2:reference_audio_path
Returns the reference audio file path.
"""
if voice_name.startswith("indextts2:"):
return voice_name[10:]  # strip the "indextts2:" prefix
return voice_name
def indextts2_tts(text: str, voice_name: str, voice_file: str, speed: float = 1.0) -> Union[SubMaker, None]:
"""
Zero-shot voice cloning via the IndexTTS2 API.
Args:
text: the text to synthesize
voice_name: reference audio path, format: indextts2:path/to/audio.wav
voice_file: output audio file path
speed: speech speed (this engine does not support speed adjustment yet)
Returns:
SubMaker: a subtitle maker with timestamp info, or None on failure
"""
# Read configuration
api_url = config.indextts2.get("api_url", "http://192.168.3.6:8081/tts")
infer_mode = config.indextts2.get("infer_mode", "普通推理")
temperature = config.indextts2.get("temperature", 1.0)
top_p = config.indextts2.get("top_p", 0.8)
top_k = config.indextts2.get("top_k", 30)
do_sample = config.indextts2.get("do_sample", True)
num_beams = config.indextts2.get("num_beams", 3)
repetition_penalty = config.indextts2.get("repetition_penalty", 10.0)
# Resolve the reference audio path
reference_audio_path = parse_indextts2_voice(voice_name)
if not reference_audio_path or not os.path.exists(reference_audio_path):
logger.error(f"IndexTTS2 参考音频文件不存在: {reference_audio_path}")
return None
# Prepare the request payload
files = {
'prompt_audio': open(reference_audio_path, 'rb')
}
data = {
'text': text.strip(),
'infer_mode': infer_mode,
'temperature': temperature,
'top_p': top_p,
'top_k': top_k,
'do_sample': do_sample,
'num_beams': num_beams,
'repetition_penalty': repetition_penalty,
}
# Retry loop
for attempt in range(3):
try:
logger.info(f"第 {attempt + 1} 次调用 IndexTTS2 API")
# Configure proxies
proxies = {}
if config.proxy.get("http"):
proxies = {
'http': config.proxy.get("http"),
'https': config.proxy.get("https", config.proxy.get("http"))
}
# Call the API
response = requests.post(
api_url,
files=files,
data=data,
proxies=proxies,
timeout=120  # IndexTTS2 inference can take a while
)
if response.status_code == 200:
# Save the audio file
with open(voice_file, 'wb') as f:
f.write(response.content)
logger.info(f"IndexTTS2 成功生成音频: {voice_file}, 大小: {len(response.content)} 字节")
# IndexTTS2 cannot generate precise subtitles; return a simple SubMaker
sub_maker = SubMaker()
# Estimate the audio duration from the text length
estimated_duration_ms = max(1000, int(len(text) * 200))
sub_maker.create_sub((0, estimated_duration_ms * 10000), text)
return sub_maker
else:
logger.error(f"IndexTTS2 API 调用失败: {response.status_code} - {response.text}")
except requests.exceptions.Timeout:
logger.error(f"IndexTTS2 API 调用超时 (尝试 {attempt + 1}/3)")
except requests.exceptions.RequestException as e:
logger.error(f"IndexTTS2 API 网络错误: {str(e)} (尝试 {attempt + 1}/3)")
except Exception as e:
logger.error(f"IndexTTS2 TTS 处理错误: {str(e)} (尝试 {attempt + 1}/3)")
finally:
# Make sure the file handle is closed
try:
files['prompt_audio'].close()
except:
pass
if attempt < 2:  # not the last attempt
time.sleep(2)  # wait 2 seconds before retrying
# Reopen the file for the next attempt
if attempt < 2:
try:
files['prompt_audio'] = open(reference_audio_path, 'rb')
except:
pass
logger.error("IndexTTS2 TTS 生成失败,已达到最大重试次数")
return None

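Stripped of retries, proxying, and SubMaker bookkeeping, the new engine boils down to one multipart POST. A minimal client sketch against the same endpoint contract, with field names taken from the code above and the URL left as a placeholder:

```python
import requests

def indextts2_synthesize(api_url: str, text: str, reference_audio: str, out_path: str) -> bool:
    """One-shot IndexTTS2 call: clone the voice in `reference_audio` for `text`."""
    data = {
        "text": text.strip(),
        "infer_mode": "普通推理",  # "normal inference": the literal value the service expects
        "temperature": 1.0,
        "top_p": 0.8,
        "top_k": 30,
        "do_sample": True,
        "num_beams": 3,
        "repetition_penalty": 10.0,
    }
    with open(reference_audio, "rb") as ref:
        resp = requests.post(api_url, files={"prompt_audio": ref}, data=data, timeout=120)
    if resp.status_code != 200:
        return False
    with open(out_path, "wb") as f:
        f.write(resp.content)  # the response body is the synthesized audio
    return True
```

Note that the duration estimate in the code above (200 ms per character, floored at one second) is only a heuristic; precise subtitle timing is not available from this engine.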
View File

@@ -1,5 +1,5 @@
[app]
project_version="0.7.4"
project_version="0.7.5"
# LLM API timeout configuration (seconds)
llm_vision_timeout = 120 # base timeout for vision models
@@ -115,6 +127,27 @@
# Visit https://bailian.console.aliyun.com/?tab=model#/api-key to get your API key
api_key = ""
model_name = "qwen3-tts-flash"
[indextts2]
# IndexTTS2 voice-cloning configuration
# An open-source zero-shot voice-cloning project that you deploy yourself
# Project: https://github.com/index-tts/index-tts
# Default API endpoint (local deployment)
api_url = "http://127.0.0.1:8081/tts"
# Default reference audio path (optional)
# reference_audio = "/path/to/reference_audio.wav"
# Inference mode: "普通推理" (normal) / "快速推理" (fast)
infer_mode = "普通推理"
# Advanced parameters
temperature = 1.0
top_p = 0.8
top_k = 30
do_sample = true
num_beams = 3
repetition_penalty = 10.0
[ui]
# TTS engine selection

View File

@@ -1 +1 @@
0.7.4
0.7.5

View File

@@ -26,7 +26,7 @@ st.set_page_config(
# Page styling
hide_streamlit_style = """
<style>#root > div:nth-child(1) > div > div > div > div > section > div {padding-top: 6px; padding-bottom: 10px; padding-left: 20px; padding-right: 20px;}</style>
<style>#root > div:nth-child(1) > div > div > div > div > section > div {padding-top: 2rem; padding-bottom: 10px; padding-left: 20px; padding-right: 20px;}</style>
"""
st.markdown(hide_streamlit_style, unsafe_allow_html=True)
@@ -131,18 +131,11 @@ def render_generate_button():
"""Render the generate button and its handling logic."""
if st.button(tr("Generate Video"), use_container_width=True, type="primary"):
from app.services import task as tm
# Reset the log container and records
log_container = st.empty()
log_records = []
def log_received(msg):
with log_container:
log_records.append(msg)
st.code("\n".join(log_records))
from loguru import logger
logger.add(log_received)
from app.services import state as sm
from app.models import const
import threading
import time
import uuid
config.save_config()
@@ -155,9 +148,6 @@ def render_generate_button():
st.error(tr("视频文件不能为空"))
return
st.toast(tr("生成视频"))
logger.info(tr("开始生成视频"))
# Collect all parameters
script_params = script_settings.get_script_params()
video_params = video_settings.get_video_params()
@@ -175,29 +165,61 @@
# Create the params object
params = VideoClipParams(**all_params)
# Use the new unified clipping strategy; pre-clipped subclip_videos are no longer needed
# Generate a new task_id for this run
import uuid
task_id = str(uuid.uuid4())
result = tm.start_subclip_unified(
task_id=task_id,
params=params
)
# Create the progress bar
progress_bar = st.progress(0)
status_text = st.empty()
video_files = result.get("videos", [])
st.success(tr("视生成完成"))
def run_task():
try:
tm.start_subclip_unified(
task_id=task_id,
params=params
)
except Exception as e:
logger.error(f"任务执行失败: {e}")
sm.state.update_task(task_id, state=const.TASK_STATE_FAILED, message=str(e))
try:
if video_files:
player_cols = st.columns(len(video_files) * 2 + 1)
for i, url in enumerate(video_files):
player_cols[i * 2 + 1].video(url)
except Exception as e:
logger.error(f"播放视频失败: {e}")
# Start the task in a new thread
thread = threading.Thread(target=run_task)
thread.start()
# Poll the task state
while True:
task = sm.state.get_task(task_id)
if task:
progress = task.get("progress", 0)
state = task.get("state")
# Update the progress bar
progress_bar.progress(progress / 100)
status_text.text(f"Processing... {progress}%")
if state == const.TASK_STATE_COMPLETE:
status_text.text(tr("视频生成完成"))
progress_bar.progress(1.0)
# Show the results
video_files = task.get("videos", [])
try:
if video_files:
player_cols = st.columns(len(video_files) * 2 + 1)
for i, url in enumerate(video_files):
player_cols[i * 2 + 1].video(url)
except Exception as e:
logger.error(f"播放视频失败: {e}")
st.success(tr("视频生成完成"))
break
elif state == const.TASK_STATE_FAILED:
st.error(f"任务失败: {task.get('message', 'Unknown error')}")
break
time.sleep(0.5)
# file_utils.open_task_folder(config.root_dir, task_id)
logger.info(tr("视频生成完成"))
def main():

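The reworked handler replaces a blocking call with a worker thread plus a polling loop over shared task state, so the progress bar can update while the task runs. The core pattern, reduced to plain Python; the `tasks` dict here stands in for the app's `sm.state` manager:

```python
import threading
import time
import uuid

tasks: dict[str, dict] = {}  # stand-in for sm.state

def run_task(task_id: str) -> None:
    # Simulated work that reports progress into shared state.
    for pct in (20, 40, 60, 80, 100):
        time.sleep(0.1)
        tasks[task_id] = {"progress": pct, "state": "processing"}
    tasks[task_id]["state"] = "complete"

task_id = str(uuid.uuid4())
threading.Thread(target=run_task, args=(task_id,), daemon=True).start()

# The UI thread polls shared state and redraws the progress display.
while True:
    task = tasks.get(task_id)
    if task:
        print(f"Processing... {task['progress']}%")
        if task["state"] == "complete":
            break
    time.sleep(0.5)
```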
View File

@@ -1,4 +1,3 @@
from venv import logger
import streamlit as st
import os
from uuid import uuid4
@@ -26,7 +25,8 @@ def get_tts_engine_options():
"edge_tts": "Edge TTS",
"azure_speech": "Azure Speech Services",
"tencent_tts": "腾讯云 TTS",
"qwen3_tts": "通义千问 Qwen3 TTS"
"qwen3_tts": "通义千问 Qwen3 TTS",
"indextts2": "IndexTTS2 语音克隆"
}
@@ -56,6 +56,12 @@ def get_tts_engine_descriptions():
"features": "阿里云通义千问语音合成,音质优秀,支持多种音色",
"use_case": "需要高质量中文语音合成的用户",
"registration": "https://dashscope.aliyuncs.com/"
},
"indextts2": {
"title": "IndexTTS2 语音克隆",
"features": "零样本语音克隆,上传参考音频即可合成相同音色的语音,需要本地或私有部署",
"use_case": "下载地址https://pan.quark.cn/s/0767c9bcefd5",
"registration": None
}
}
@@ -139,6 +145,8 @@ def render_tts_settings(tr):
render_tencent_tts_settings(tr)
elif selected_engine == "qwen3_tts":
render_qwen3_tts_settings(tr)
elif selected_engine == "indextts2":
render_indextts2_tts_settings(tr)
# 4. Voice preview
render_voice_preview_new(tr, selected_engine)
@@ -562,6 +570,139 @@ def render_qwen3_tts_settings(tr):
config.ui["qwen3_rate"] = voice_rate
config.ui["voice_name"] = voice_type # compatibility
def render_indextts2_tts_settings(tr):
"""Render the IndexTTS2 TTS settings."""
import os
# API endpoint configuration
api_url = st.text_input(
"API 地址",
value=config.indextts2.get("api_url", "http://127.0.0.1:8081/tts"),
help="IndexTTS2 API 服务地址"
)
# Reference audio file path
reference_audio = st.text_input(
"参考音频路径",
value=config.indextts2.get("reference_audio", ""),
help="用于语音克隆的参考音频文件路径WAV 格式,建议 3-10 秒)"
)
# File upload
uploaded_file = st.file_uploader(
"或上传参考音频文件",
type=["wav", "mp3"],
help="上传一段清晰的音频用于语音克隆"
)
if uploaded_file is not None:
# Save the uploaded file
import tempfile
temp_dir = tempfile.gettempdir()
audio_path = os.path.join(temp_dir, f"indextts2_ref_{uploaded_file.name}")
with open(audio_path, "wb") as f:
f.write(uploaded_file.getbuffer())
reference_audio = audio_path
st.success(f"✅ 音频已上传: {audio_path}")
# Inference mode
infer_mode = st.selectbox(
"推理模式",
options=["普通推理", "快速推理"],
index=0 if config.indextts2.get("infer_mode", "普通推理") == "普通推理" else 1,
help="普通推理质量更高但速度较慢,快速推理速度更快但质量略低"
)
# Advanced parameters (collapsible panel)
with st.expander("🔧 高级参数", expanded=False):
col1, col2 = st.columns(2)
with col1:
temperature = st.slider(
"采样温度 (Temperature)",
min_value=0.1,
max_value=2.0,
value=float(config.indextts2.get("temperature", 1.0)),
step=0.1,
help="控制随机性,值越高输出越随机,值越低越确定"
)
top_p = st.slider(
"Top P",
min_value=0.0,
max_value=1.0,
value=float(config.indextts2.get("top_p", 0.8)),
step=0.05,
help="nucleus 采样的概率阈值,值越小结果越确定"
)
top_k = st.slider(
"Top K",
min_value=0,
max_value=100,
value=int(config.indextts2.get("top_k", 30)),
step=5,
help="top-k 采样的 k 值0 表示不使用 top-k"
)
with col2:
num_beams = st.slider(
"束搜索 (Num Beams)",
min_value=1,
max_value=10,
value=int(config.indextts2.get("num_beams", 3)),
step=1,
help="束搜索的 beam 数量,值越大质量可能越好但速度越慢"
)
repetition_penalty = st.slider(
"重复惩罚 (Repetition Penalty)",
min_value=1.0,
max_value=20.0,
value=float(config.indextts2.get("repetition_penalty", 10.0)),
step=0.5,
help="值越大越能避免重复,但过大可能导致不自然"
)
do_sample = st.checkbox(
"启用采样",
value=config.indextts2.get("do_sample", True),
help="启用采样可以获得更自然的语音"
)
# Usage notes
with st.expander("💡 IndexTTS2 使用说明", expanded=False):
st.markdown("""
**零样本语音克隆**
1. **准备参考音频**:上传或指定一段清晰的音频文件(建议 3-10 秒)
2. **设置 API 地址**:确保 IndexTTS2 服务正常运行
3. **开始合成**:系统会自动使用参考音频的音色合成新语音
**注意事项**
- 参考音频质量直接影响合成效果
- 建议使用无背景噪音的清晰音频
- 文本长度建议控制在合理范围内
- 首次合成可能需要较长时间
""")
# Persist the configuration
config.indextts2["api_url"] = api_url
config.indextts2["reference_audio"] = reference_audio
config.indextts2["infer_mode"] = infer_mode
config.indextts2["temperature"] = temperature
config.indextts2["top_p"] = top_p
config.indextts2["top_k"] = top_k
config.indextts2["num_beams"] = num_beams
config.indextts2["repetition_penalty"] = repetition_penalty
config.indextts2["do_sample"] = do_sample
# Store voice_name for compatibility
if reference_audio:
config.ui["voice_name"] = f"indextts2:{reference_audio}"
def render_voice_preview_new(tr, selected_engine):
"""渲染新的语音试听功能"""
if st.button("🎵 试听语音合成", use_container_width=True):
@@ -599,6 +740,12 @@ def render_voice_preview_new(tr, selected_engine):
voice_name = f"qwen3:{vt}"
voice_rate = config.ui.get("qwen3_rate", 1.0)
voice_pitch = 1.0 # Qwen3 TTS does not support pitch adjustment
elif selected_engine == "indextts2":
reference_audio = config.indextts2.get("reference_audio", "")
if reference_audio:
voice_name = f"indextts2:{reference_audio}"
voice_rate = 1.0 # IndexTTS2 does not support speed adjustment
voice_pitch = 1.0 # IndexTTS2 does not support pitch adjustment
if not voice_name:
st.error("请先配置语音设置")

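The settings panel encodes the clone reference into the `indextts2:<path>` naming convention that `voice.py` parses back out; the engine choice rides along in that single `voice_name` string. A round-trip sketch of the convention (the helper names are ours):

```python
PREFIX = "indextts2:"

def build_indextts2_voice(reference_audio: str) -> str:
    # What the UI stores in config.ui["voice_name"].
    return f"{PREFIX}{reference_audio}"

def parse_indextts2_voice(voice_name: str) -> str:
    # Mirrors the parser added in voice.py: strip the engine prefix if present.
    return voice_name[len(PREFIX):] if voice_name.startswith(PREFIX) else voice_name

assert parse_indextts2_voice(build_indextts2_voice("/tmp/ref.wav")) == "/tmp/ref.wav"
```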
View File

@@ -5,6 +5,7 @@ import os
from app.config import config
from app.utils import utils
from loguru import logger
from app.services.llm.unified_service import UnifiedLLMService
def validate_api_key(api_key: str, provider: str) -> tuple[bool, str]:
@@ -316,9 +317,26 @@ def test_litellm_vision_model(api_key: str, base_url: str, model_name: str, tr)
old_key = os.environ.get(env_var)
os.environ[env_var] = api_key
# Special-case SiliconFlow: use OpenAI-compatible mode
test_model_name = model_name
if provider.lower() == "siliconflow":
# Swap the provider prefix to openai
if "/" in model_name:
test_model_name = f"openai/{model_name.split('/', 1)[1]}"
else:
test_model_name = f"openai/{model_name}"
# Make sure base_url is set
if not base_url:
base_url = "https://api.siliconflow.cn/v1"
# Set OPENAI_API_KEY (SiliconFlow speaks the OpenAI protocol)
os.environ["OPENAI_API_KEY"] = api_key
os.environ["OPENAI_API_BASE"] = base_url
try:
# Create a test image (1x1 white pixel)
test_image = Image.new('RGB', (1, 1), color='white')
# Create a test image (64x64 white pixels, to avoid some models' limits on tiny images)
test_image = Image.new('RGB', (64, 64), color='white')
img_buffer = io.BytesIO()
test_image.save(img_buffer, format='JPEG')
img_bytes = img_buffer.getvalue()
@@ -340,7 +358,7 @@
# Prepare parameters
completion_kwargs = {
"model": model_name,
"model": test_model_name,
"messages": messages,
"temperature": 0.1,
"max_tokens": 50
@@ -363,6 +381,11 @@
os.environ[env_var] = old_key
else:
os.environ.pop(env_var, None)
# Clean up the temporarily set OpenAI environment variables
if provider.lower() == "siliconflow":
os.environ.pop("OPENAI_API_KEY", None)
os.environ.pop("OPENAI_API_BASE", None)
except Exception as e:
error_msg = str(e)
@@ -415,6 +438,23 @@ def test_litellm_text_model(api_key: str, base_url: str, model_name: str, tr) ->
old_key = os.environ.get(env_var)
os.environ[env_var] = api_key
# Special-case SiliconFlow: use OpenAI-compatible mode
test_model_name = model_name
if provider.lower() == "siliconflow":
# Swap the provider prefix to openai
if "/" in model_name:
test_model_name = f"openai/{model_name.split('/', 1)[1]}"
else:
test_model_name = f"openai/{model_name}"
# Make sure base_url is set
if not base_url:
base_url = "https://api.siliconflow.cn/v1"
# Set OPENAI_API_KEY (SiliconFlow speaks the OpenAI protocol)
os.environ["OPENAI_API_KEY"] = api_key
os.environ["OPENAI_API_BASE"] = base_url
try:
# Build the test request
messages = [
@@ -423,7 +463,7 @@ def test_litellm_text_model(api_key: str, base_url: str, model_name: str, tr) ->
# Prepare parameters
completion_kwargs = {
"model": model_name,
"model": test_model_name,
"messages": messages,
"temperature": 0.1,
"max_tokens": 20
@@ -446,6 +486,11 @@ def test_litellm_text_model(api_key: str, base_url: str, model_name: str, tr) ->
os.environ[env_var] = old_key
else:
os.environ.pop(env_var, None)
# Clean up the temporarily set OpenAI environment variables
if provider.lower() == "siliconflow":
os.environ.pop("OPENAI_API_KEY", None)
os.environ.pop("OPENAI_API_BASE", None)
except Exception as e:
error_msg = str(e)
@@ -469,23 +514,61 @@
config.app["vision_llm_provider"] = "litellm"
# Load the saved LiteLLM configuration
vision_model_name = config.app.get("vision_litellm_model_name", "gemini/gemini-2.0-flash-lite")
full_vision_model_name = config.app.get("vision_litellm_model_name", "gemini/gemini-2.0-flash-lite")
vision_api_key = config.app.get("vision_litellm_api_key", "")
vision_base_url = config.app.get("vision_litellm_base_url", "")
# Parse provider and model
default_provider = "gemini"
default_model = "gemini-2.0-flash-lite"
if "/" in full_vision_model_name:
parts = full_vision_model_name.split("/", 1)
current_provider = parts[0]
current_model = parts[1]
else:
current_provider = default_provider
current_model = full_vision_model_name
# Supported provider list
LITELLM_PROVIDERS = [
"openai", "gemini", "deepseek", "qwen", "siliconflow", "moonshot",
"anthropic", "azure", "ollama", "vertex_ai", "mistral", "codestral",
"volcengine", "groq", "cohere", "together_ai", "fireworks_ai",
"openrouter", "replicate", "huggingface", "xai", "deepgram", "vllm",
"bedrock", "cloudflare"
]
# If the current provider is not in the list, prepend it
if current_provider not in LITELLM_PROVIDERS:
LITELLM_PROVIDERS.insert(0, current_provider)
# Render the config inputs
st_vision_model_name = st.text_input(
tr("Vision Model Name"),
value=vision_model_name,
help="LiteLLM 模型格式: provider/model\n\n"
"常用示例:\n"
"• gemini/gemini-2.0-flash-lite (推荐,速度快)\n"
"• gemini/gemini-1.5-pro (高精度)\n"
"• openai/gpt-4o, openai/gpt-4o-mini\n"
"• qwen/qwen2.5-vl-32b-instruct\n"
"• siliconflow/Qwen/Qwen2.5-VL-32B-Instruct\n\n"
"支持 100+ providers详见: https://docs.litellm.ai/docs/providers"
)
col1, col2 = st.columns([1, 2])
with col1:
selected_provider = st.selectbox(
tr("Vision Model Provider"),
options=LITELLM_PROVIDERS,
index=LITELLM_PROVIDERS.index(current_provider) if current_provider in LITELLM_PROVIDERS else 0,
key="vision_provider_select"
)
with col2:
model_name_input = st.text_input(
tr("Vision Model Name"),
value=current_model,
help="输入模型名称(不包含 provider 前缀)\n\n"
"常用示例:\n"
"• gemini-2.0-flash-lite\n"
"• gpt-4o\n"
"• qwen-vl-max\n"
"• Qwen/Qwen2.5-VL-32B-Instruct (SiliconFlow)\n\n"
"支持 100+ providers详见: https://docs.litellm.ai/docs/providers",
key="vision_model_input"
)
# Combine the full model name
st_vision_model_name = f"{selected_provider}/{model_name_input}" if selected_provider and model_name_input else ""
st_vision_api_key = st.text_input(
tr("Vision API Key"),
@@ -502,12 +585,7 @@
st_vision_base_url = st.text_input(
tr("Vision Base URL"),
value=vision_base_url,
help="自定义 API 端点(可选)\n\n"
"留空使用默认端点。可用于:\n"
"• 代理地址(如通过 CloudFlare\n"
"• 私有部署的模型服务\n"
"• 自定义网关\n\n"
"示例: https://your-proxy.com/v1"
help="自定义 API 端点(可选)找不到供应商才需要填自定义 url"
)
# Add a test-connection button
@@ -515,7 +593,7 @@
test_errors = []
if not st_vision_api_key:
test_errors.append("请先输入 API 密钥")
if not st_vision_model_name:
if not model_name_input:
test_errors.append("请先输入模型名称")
if test_errors:
@@ -545,6 +623,7 @@
# Validate the model name
if st_vision_model_name:
# This validation logic may need tweaking, since the name is now combined automatically
is_valid, error_msg = validate_litellm_model_name(st_vision_model_name, "视频分析")
if is_valid:
config.app["vision_litellm_model_name"] = st_vision_model_name
@@ -580,6 +659,8 @@
if config_changed and not validation_errors:
try:
config.save_config()
# Clear the cache so the new configuration takes effect next time
UnifiedLLMService.clear_cache()
if st_vision_api_key or st_vision_base_url or st_vision_model_name:
st.success(f"视频分析模型配置已保存LiteLLM")
except Exception as e:
@@ -698,24 +779,61 @@ def render_text_llm_settings(tr):
config.app["text_llm_provider"] = "litellm"
# Load the saved LiteLLM configuration
text_model_name = config.app.get("text_litellm_model_name", "deepseek/deepseek-chat")
full_text_model_name = config.app.get("text_litellm_model_name", "deepseek/deepseek-chat")
text_api_key = config.app.get("text_litellm_api_key", "")
text_base_url = config.app.get("text_litellm_base_url", "")
# Parse provider and model
default_provider = "deepseek"
default_model = "deepseek-chat"
if "/" in full_text_model_name:
parts = full_text_model_name.split("/", 1)
current_provider = parts[0]
current_model = parts[1]
else:
current_provider = default_provider
current_model = full_text_model_name
# Supported provider list
LITELLM_PROVIDERS = [
"openai", "gemini", "deepseek", "qwen", "siliconflow", "moonshot",
"anthropic", "azure", "ollama", "vertex_ai", "mistral", "codestral",
"volcengine", "groq", "cohere", "together_ai", "fireworks_ai",
"openrouter", "replicate", "huggingface", "xai", "deepgram", "vllm",
"bedrock", "cloudflare"
]
# If the current provider is not in the list, prepend it
if current_provider not in LITELLM_PROVIDERS:
LITELLM_PROVIDERS.insert(0, current_provider)
# Render the config inputs
st_text_model_name = st.text_input(
tr("Text Model Name"),
value=text_model_name,
help="LiteLLM 模型格式: provider/model\n\n"
"常用示例:\n"
"• deepseek/deepseek-chat (推荐,性价比高)\n"
"• gemini/gemini-2.0-flash (速度快)\n"
"• openai/gpt-4o, openai/gpt-4o-mini\n"
"• qwen/qwen-plus, qwen/qwen-turbo\n"
"• siliconflow/deepseek-ai/DeepSeek-R1\n"
"• moonshot/moonshot-v1-8k\n\n"
"支持 100+ providers详见: https://docs.litellm.ai/docs/providers"
)
col1, col2 = st.columns([1, 2])
with col1:
selected_provider = st.selectbox(
tr("Text Model Provider"),
options=LITELLM_PROVIDERS,
index=LITELLM_PROVIDERS.index(current_provider) if current_provider in LITELLM_PROVIDERS else 0,
key="text_provider_select"
)
with col2:
model_name_input = st.text_input(
tr("Text Model Name"),
value=current_model,
help="输入模型名称(不包含 provider 前缀)\n\n"
"常用示例:\n"
"• deepseek-chat\n"
"• gpt-4o\n"
"• gemini-2.0-flash\n"
"• deepseek-ai/DeepSeek-R1 (SiliconFlow)\n\n"
"支持 100+ providers详见: https://docs.litellm.ai/docs/providers",
key="text_model_input"
)
# Combine the full model name
st_text_model_name = f"{selected_provider}/{model_name_input}" if selected_provider and model_name_input else ""
st_text_api_key = st.text_input(
tr("Text API Key"),
@@ -734,12 +852,7 @@
st_text_base_url = st.text_input(
tr("Text Base URL"),
value=text_base_url,
help="自定义 API 端点(可选)\n\n"
"留空使用默认端点。可用于:\n"
"• 代理地址(如通过 CloudFlare\n"
"• 私有部署的模型服务\n"
"• 自定义网关\n\n"
"示例: https://your-proxy.com/v1"
help="自定义 API 端点(可选)找不到供应商才需要填自定义 url"
)
# Add a test-connection button
@@ -747,7 +860,7 @@
test_errors = []
if not st_text_api_key:
test_errors.append("请先输入 API 密钥")
if not st_text_model_name:
if not model_name_input:
test_errors.append("请先输入模型名称")
if test_errors:
@@ -812,6 +925,8 @@
if text_config_changed and not text_validation_errors:
try:
config.save_config()
# Clear the cache so the new configuration takes effect next time
UnifiedLLMService.clear_cache()
if st_text_api_key or st_text_base_url or st_text_model_name:
st.success(f"文案生成模型配置已保存LiteLLM")
except Exception as e:

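Under the new two-widget layout, the stored `provider/model` string is split on the first slash for display and recombined on save, which keeps multi-slash names such as SiliconFlow's `siliconflow/deepseek-ai/DeepSeek-R1` intact. A sketch of that split/join using the text panel's defaults (the helper names are ours):

```python
DEFAULT_PROVIDER = "deepseek"

def split_model_name(full_name: str) -> tuple[str, str]:
    # Split "provider/model" on the first slash only; bare names keep the default provider.
    if "/" in full_name:
        provider, model = full_name.split("/", 1)
        return provider, model
    return DEFAULT_PROVIDER, full_name

def join_model_name(provider: str, model: str) -> str:
    # Empty parts yield an empty name, matching the guard in the diff.
    return f"{provider}/{model}" if provider and model else ""

assert split_model_name("siliconflow/deepseek-ai/DeepSeek-R1") == ("siliconflow", "deepseek-ai/DeepSeek-R1")
assert join_model_name("deepseek", "deepseek-chat") == "deepseek/deepseek-chat"
```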
View File

@@ -49,90 +49,160 @@ def render_script_panel(tr):
def render_script_file(tr, params):
"""Render the script file selection."""
script_list = [
(tr("None"), ""),
(tr("Auto Generate"), "auto"),
(tr("Short Generate"), "short"),
(tr("Short Drama Summary"), "summary"),
(tr("Upload Script"), "upload_script")
]
# Define the feature modes
MODE_FILE = "file_selection"
MODE_AUTO = "auto"
MODE_SHORT = "short"
MODE_SUMMARY = "summary"
# Collect existing script files
suffix = "*.json"
script_dir = utils.script_dir()
files = glob.glob(os.path.join(script_dir, suffix))
file_list = []
# Mode option mapping
mode_options = {
tr("Select/Upload Script"): MODE_FILE,
tr("Auto Generate"): MODE_AUTO,
tr("Short Generate"): MODE_SHORT,
tr("Short Drama Summary"): MODE_SUMMARY,
}
# Read the current state
current_path = st.session_state.get('video_clip_json_path', '')
# Determine the index of the currently selected mode
default_index = 0
mode_keys = list(mode_options.keys())
if current_path == "auto":
default_index = mode_keys.index(tr("Auto Generate"))
elif current_path == "short":
default_index = mode_keys.index(tr("Short Generate"))
elif current_path == "summary":
default_index = mode_keys.index(tr("Short Drama Summary"))
else:
default_index = mode_keys.index(tr("Select/Upload Script"))
for file in files:
file_list.append({
"name": os.path.basename(file),
"file": file,
"ctime": os.path.getctime(file)
})
# 1. Render the mode selector
# Use segmented_control instead of selectbox for a better visual experience
default_mode_label = mode_keys[default_index]
# Callback to handle state updates
def update_script_mode():
# Read the currently selected label
selected_label = st.session_state.script_mode_selection
if selected_label:
# Update the actual path state
new_mode = mode_options[selected_label]
st.session_state.video_clip_json_path = new_mode
params.video_clip_json_path = new_mode
else:
# If the user deselects (segmented_control allows deselection), fall back to the default or previous state
# Here we force it back to the default
st.session_state.script_mode_selection = default_mode_label
file_list.sort(key=lambda x: x["ctime"], reverse=True)
for file in file_list:
display_name = file['file'].replace(config.root_dir, "")
script_list.append((display_name, file['file']))
# Find the saved script file's index in the list
saved_script_path = st.session_state.get('video_clip_json_path', '')
selected_index = 0
for i, (_, path) in enumerate(script_list):
if path == saved_script_path:
selected_index = i
break
selected_script_index = st.selectbox(
tr("Script Files"),
index=selected_index,
options=range(len(script_list)),
format_func=lambda x: script_list[x][0]
# Render the control
selected_mode_label = st.segmented_control(
tr("Video Type"),
options=mode_keys,
default=default_mode_label,
key="script_mode_selection",
on_change=update_script_mode
)
# Handle the unselected case (there is a default, but some interactions can still leave it empty)
if not selected_mode_label:
selected_mode_label = default_mode_label
selected_mode = mode_options[selected_mode_label]
script_path = script_list[selected_script_index][1]
st.session_state['video_clip_json_path'] = script_path
params.video_clip_json_path = script_path
# 2. Dispatch on the selected mode
if selected_mode == MODE_FILE:
# --- File selection mode ---
script_list = [
(tr("None"), ""),
(tr("Upload Script"), "upload_script")
]
# Handle script upload
if script_path == "upload_script":
uploaded_file = st.file_uploader(
tr("Upload Script File"),
type=["json"],
accept_multiple_files=False,
# Collect existing script files
suffix = "*.json"
script_dir = utils.script_dir()
files = glob.glob(os.path.join(script_dir, suffix))
file_list = []
for file in files:
file_list.append({
"name": os.path.basename(file),
"file": file,
"ctime": os.path.getctime(file)
})
file_list.sort(key=lambda x: x["ctime"], reverse=True)
for file in file_list:
display_name = file['file'].replace(config.root_dir, "")
script_list.append((display_name, file['file']))
# Find the saved script file's index in the list
# If the current path is a special value (auto/short/summary), reset it to empty
saved_script_path = current_path if current_path not in [MODE_AUTO, MODE_SHORT, MODE_SUMMARY] else ""
selected_index = 0
for i, (_, path) in enumerate(script_list):
if path == saved_script_path:
selected_index = i
break
selected_script_index = st.selectbox(
tr("Script Files"),
index=selected_index,
options=range(len(script_list)),
format_func=lambda x: script_list[x][0],
key="script_file_selection"
)
if uploaded_file is not None:
try:
# Read the uploaded JSON and validate its format
script_content = uploaded_file.read().decode('utf-8')
json_data = json.loads(script_content)
script_path = script_list[selected_script_index][1]
st.session_state['video_clip_json_path'] = script_path
params.video_clip_json_path = script_path
# Save into the script directory
script_file_path = os.path.join(script_dir, uploaded_file.name)
file_name, file_extension = os.path.splitext(uploaded_file.name)
# Handle script upload
if script_path == "upload_script":
uploaded_file = st.file_uploader(
tr("Upload Script File"),
type=["json"],
accept_multiple_files=False,
)
# If the file already exists, append a timestamp
if os.path.exists(script_file_path):
timestamp = time.strftime("%Y%m%d%H%M%S")
file_name_with_timestamp = f"{file_name}_{timestamp}"
script_file_path = os.path.join(script_dir, file_name_with_timestamp + file_extension)
if uploaded_file is not None:
try:
# Read the uploaded JSON and validate its format
script_content = uploaded_file.read().decode('utf-8')
json_data = json.loads(script_content)
# Write the file
with open(script_file_path, "w", encoding='utf-8') as f:
json.dump(json_data, f, ensure_ascii=False, indent=2)
# Save into the script directory
script_file_path = os.path.join(script_dir, uploaded_file.name)
file_name, file_extension = os.path.splitext(uploaded_file.name)
# Update state
st.success(tr("Script Uploaded Successfully"))
st.session_state['video_clip_json_path'] = script_file_path
params.video_clip_json_path = script_file_path
time.sleep(1)
st.rerun()
# If the file already exists, append a timestamp
if os.path.exists(script_file_path):
timestamp = time.strftime("%Y%m%d%H%M%S")
file_name_with_timestamp = f"{file_name}_{timestamp}"
script_file_path = os.path.join(script_dir, file_name_with_timestamp + file_extension)
except json.JSONDecodeError:
st.error(tr("Invalid JSON format"))
except Exception as e:
st.error(f"{tr('Upload failed')}: {str(e)}")
# Write the file
with open(script_file_path, "w", encoding='utf-8') as f:
json.dump(json_data, f, ensure_ascii=False, indent=2)
# Update state
st.success(tr("Script Uploaded Successfully"))
st.session_state['video_clip_json_path'] = script_file_path
params.video_clip_json_path = script_file_path
time.sleep(1)
st.rerun()
except json.JSONDecodeError:
st.error(tr("Invalid JSON format"))
except Exception as e:
st.error(f"{tr('Upload failed')}: {str(e)}")
else:
# --- Generation mode ---
st.session_state['video_clip_json_path'] = selected_mode
params.video_clip_json_path = selected_mode
def render_video_file(tr, params):

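The mode selector is built on `st.segmented_control` with an `on_change` callback that writes the chosen mode back into session state; since the control can return `None` when the user deselects, the callback has to guard against that. A minimal sketch, assuming a Streamlit version that ships `st.segmented_control` (1.40+) and using English stand-ins for the translated labels:

```python
import streamlit as st

MODES = {
    "Select/Upload Script": "file_selection",
    "Auto Generate": "auto",
    "Short Generate": "short",
    "Short Drama Summary": "summary",
}

def on_mode_change() -> None:
    label = st.session_state.get("script_mode_selection")
    if label:  # segmented_control may yield None on deselection
        st.session_state.video_clip_json_path = MODES[label]

st.segmented_control(
    "Video Type",
    options=list(MODES),
    default=next(iter(MODES)),
    key="script_mode_selection",
    on_change=on_mode_change,
)
```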
View File

@@ -10,6 +10,7 @@ def render_subtitle_panel(tr):
"""Render the subtitle settings panel."""
with st.container(border=True):
st.write(tr("Subtitle Settings"))
st.info("💡 提示:目前仅 **edge-tts** 引擎支持自动生成字幕,其他 TTS 引擎暂不支持。")
# Check whether the SoulVoice or qwen3_tts engine is selected
from app.services import voice

View File

@@ -152,7 +152,7 @@
"API rate limit exceeded. Please wait about an hour and try again.": "API 调用次数已达到限制,请等待约一小时后再试。",
"Resources exhausted. Please try again later.": "资源已耗尽,请稍后再试。",
"Transcription Failed": "转录失败",
"Short Generate": "短剧混剪 (实验)",
"Short Generate": "短剧混剪",
"Generate Short Video Script": "AI生成短剧混剪脚本",
"Adjust the volume of the original audio": "调整原始音频的音量",
"Original Volume": "视频音量",
@@ -161,6 +161,8 @@
"Frame Interval (seconds) (More keyframes consume more tokens)": "帧间隔 (秒) (更多关键帧消耗更多令牌)",
"Batch Size": "批处理大小",
"Batch Size (More keyframes consume more tokens)": "批处理大小, 每批处理越少消耗 token 越多",
"Short Drama Summary": "短剧解说"
"Short Drama Summary": "短剧解说",
"Video Type": "视频类型",
"Select/Upload Script": "选择/上传脚本"
}
}

View File

@@ -144,32 +144,3 @@ def get_batch_files(keyframe_files, result, batch_size=5):
batch_start = result['batch_index'] * batch_size
batch_end = min(batch_start + batch_size, len(keyframe_files))
return keyframe_files[batch_start:batch_end]
def chekc_video_config(video_params):
"""
Check the video-analysis configuration.
"""
headers = {
'accept': 'application/json',
'Content-Type': 'application/json'
}
session = requests.Session()
retry_strategy = Retry(
total=3,
backoff_factor=1,
status_forcelist=[500, 502, 503, 504]
)
adapter = HTTPAdapter(max_retries=retry_strategy)
session.mount("https://", adapter)
try:
session.post(
f"https://dev.narratoai.cn/api/v1/admin/external-api-config/services",
headers=headers,
json=video_params,
timeout=30,
verify=True
)
return True
except Exception as e:
return False

View File

@@ -10,7 +10,7 @@ from datetime import datetime
from app.config import config
from app.utils import utils, video_processor
from webui.tools.base import create_vision_analyzer, get_batch_files, get_batch_timestamps, chekc_video_config
from webui.tools.base import create_vision_analyzer, get_batch_files, get_batch_timestamps
def generate_script_docu(params):
@@ -398,7 +398,6 @@ def generate_script_docu(params):
"text_model_name": text_model,
"text_base_url": text_base_url
})
chekc_video_config(llm_params)
# Organize the frame-analysis data
markdown_output = parse_frame_analysis_to_markdown(analysis_json_path)

View File

@@ -8,7 +8,6 @@ import streamlit as st
from loguru import logger
from app.config import config
from webui.tools.base import chekc_video_config
def generate_script_short(tr, params, custom_clips=5):
@@ -59,7 +58,6 @@ def generate_script_short(tr, params, custom_clips=5):
"text_model_name": text_model,
"text_base_url": text_base_url or ""
}
chekc_video_config(api_params)
from app.services.SDP.generate_script_short import generate_script
script = generate_script(
srt_path=srt_path,