Optimize documentation

linyq 2024-08-20 10:53:55 +08:00
parent f2d2fbb2ba
commit d67be7f98d
16 changed files with 344 additions and 281 deletions


@ -1,173 +0,0 @@
<div align="center">
<h1 align="center" style="font-size: 2cm;"> NarratoAI 😎📽️ </h1>
<h3 align="center">An all-in-one AI-powered tool for film commentary and automated video editing.🎬🎞️ </h3>
<h3>📖 <a href="README-en.md">English</a> | 简体中文 </h3>
<div align="center">
[//]: # ( <a href="https://trendshift.io/repositories/8731" target="_blank"><img src="https://trendshift.io/api/badge/repositories/8731" alt="harry0703%2FNarratoAI | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>)
</div>
<br>
NarratoAI is an automated video narration tool that provides an all-in-one solution for script writing, automated video editing, voice-over, and subtitle generation, powered by LLM to enhance efficient content creation.
<br>
[![madewithlove](https://img.shields.io/badge/made_with-%E2%9D%A4-red?style=for-the-badge&labelColor=orange)](https://github.com/linyqh/NarratoAI)
[![GitHub license](https://img.shields.io/github/license/linyqh/NarratoAI?style=for-the-badge)](https://github.com/linyqh/NarratoAI/blob/main/LICENSE)
[![GitHub issues](https://img.shields.io/github/issues/linyqh/NarratoAI?style=for-the-badge)](https://github.com/linyqh/NarratoAI/issues)
[![GitHub stars](https://img.shields.io/github/stars/linyqh/NarratoAI?style=for-the-badge)](https://github.com/linyqh/NarratoAI/stargazers)
[![Discord](https://img.shields.io/discord/1134848537704804432?style=for-the-badge)](https://discord.gg/WBKChhmZ)
<h3>Home</h3>
![](docs/index.png)
<h3>Video Review Interface</h3>
![](docs/check.png)
</div>
## System Requirements 📦
- Recommended minimum: CPU with 4 cores or more, 8GB RAM or more, GPU is not required
- Windows 10 or MacOS 11.0 or above
## Quick Start 🚀
### Apply for Google AI Studio Account
1. Visit https://aistudio.google.com/app/prompts/new_chat to apply for an account.
2. Click `Get API Key` to request an API Key.
3. Enter the obtained API Key into the `gemini_api_key` setting in the `config.example.toml` file.
### Configure Proxy VPN
> The method to configure VPN is not restricted, as long as you can access Google's network. Here, `clash` is used as an example.
1. Note the port of the clash service, usually `http://127.0.0.1:7890`.
2. If the port is not `7890`, modify the `VPN_PROXY_URL` in the `docker-compose.yml` file to your proxy address.
```yaml
environment:
- "VPN_PROXY_URL=http://host.docker.internal:7890" # Change to your proxy port; host.docker.internal represents the IP of the physical machine.
```
3. (Optional) Or modify the `proxy` settings in the `config.example.toml` file.
```toml
[proxy]
### Use a proxy to access the Pexels API
### Format: "http://<username>:<password>@<proxy>:<port>"
### Example: "http://user:pass@proxy:1234"
### Doc: https://requests.readthedocs.io/en/latest/user/advanced/#proxies
http = "http://xx.xx.xx.xx:7890"
https = "http://xx.xx.xx.xx:7890"
```
### Docker Deployment 🐳
#### ① Start Docker
```shell
cd NarratoAI
docker-compose up
```
#### ② Access the Web Interface
Open your browser and go to http://127.0.0.1:8501
#### ③ Access the API Documentation
Open your browser and go to http://127.0.0.1:8080/docs or http://127.0.0.1:8080/redoc
## Usage
#### 1. Basic Configuration: Select the Model and Enter the API Key
> Currently, only the `Gemini` model is supported; other models will be added in future updates. Contributions via [PR](https://github.com/linyqh/NarratoAI/pulls) are welcome 🎉🎉🎉
<div align="center">
<img src="docs/img001.png" alt="001" width="1000"/>
</div>
#### 2. Select the Video for Narration and Click to Generate Video Script
> A demo video is included in the platform. To use your own video, place the mp4 file in the `resource/videos` directory and refresh your browser.
> Note: The filename can be anything, but it must not contain Chinese characters, special characters, spaces, backslashes, etc.
<div align="center">
<img src="docs/img002.png" alt="002" width="400"/>
</div>
#### 3. Save the Script and Start Editing
> After saving the script, refresh the browser, and the newly generated `.json` script file will appear in the script file dropdown. Select the json file and video to start editing.
<div align="center">
<img src="docs/img003.png" alt="003" width="400"/>
</div>
#### 4. Review the Video; if there are segments that don't meet the rules, click to regenerate or manually edit them.
<div align="center">
<img src="docs/img004.png" alt="003" width="1000"/>
</div>
#### 5. Configure Basic Video Parameters
<div align="center">
<img src="docs/img005.png" alt="003" width="700"/>
</div>
#### 6. Start Generating
<div align="center">
<img src="docs/img006.png" alt="003" width="1000"/>
</div>
#### 7. Video Generation Complete
<div align="center">
<img src="docs/img007.png" alt="003" width="1000"/>
</div>
## Development 💻
1. Install Dependencies
```shell
conda create -n narratoai python=3.10
conda activate narratoai
cd narratoai
pip install -r requirements.txt
```
2. Install ImageMagick
###### Windows:
- Download https://imagemagick.org/archive/binaries/ImageMagick-7.1.1-36-Q16-x64-static.exe
- Install the downloaded ImageMagick, ensuring you do not change the installation path
- Update `imagemagick_path` in the `config.toml` file to your actual installation path (typically `C:\Program Files\ImageMagick-7.1.1-Q16\magick.exe`)
###### MacOS:
```shell
brew install imagemagick
```
###### Ubuntu
```shell
sudo apt-get install imagemagick
```
###### CentOS
```shell
sudo yum install ImageMagick
```
3. Start the web UI
```shell
streamlit run ./webui/Main.py --browser.serverAddress=127.0.0.1 --server.enableCORS=True --browser.gatherUsageStats=False
```
4. Access http://127.0.0.1:8501
## Feedback & Suggestions 📢
### 👏👏👏 You can submit [issues](https://github.com/linyqh/NarratoAI/issues) or [pull requests](https://github.com/linyqh/NarratoAI/pulls) 🎉🎉🎉
## Reference Projects 📚
- https://github.com/FujiwaraChoki/MoneyPrinter
- https://github.com/harry0703/MoneyPrinterTurbo
This project was refactored based on the above projects with the addition of video narration features. Thanks to the original authors for their open-source spirit 🥳🥳🥳
## License 📝
Click to view the [`LICENSE`](LICENSE) file
## Star History
[![Star History Chart](https://api.star-history.com/svg?repos=linyqh/NarratoAI&type=Date)](https://star-history.com/#linyqh/NarratoAI&Date)

175
README-zh.md Normal file

@ -0,0 +1,175 @@
<div align="center">
<h1 align="center" style="font-size: 2cm;"> NarratoAI 😎📽️ </h1>
<h3 align="center">一站式 AI 影视解说+自动化剪辑工具🎬🎞️ </h3>
<h3>📖 <a href="README-en.md">English</a> | 简体中文 </h3>
<div align="center">
[//]: # ( <a href="https://trendshift.io/repositories/8731" target="_blank"><img src="https://trendshift.io/api/badge/repositories/8731" alt="harry0703%2FNarratoAI | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>)
</div>
<br>
NarratoAI 是一个自动化影视解说工具，基于 LLM 实现文案撰写、自动化视频剪辑、配音和字幕生成的一站式流程，助力高效内容创作。
<br>
[![madewithlove](https://img.shields.io/badge/made_with-%E2%9D%A4-red?style=for-the-badge&labelColor=orange)](https://github.com/linyqh/NarratoAI)
[![GitHub license](https://img.shields.io/github/license/linyqh/NarratoAI?style=for-the-badge)](https://github.com/linyqh/NarratoAI/blob/main/LICENSE)
[![GitHub issues](https://img.shields.io/github/issues/linyqh/NarratoAI?style=for-the-badge)](https://github.com/linyqh/NarratoAI/issues)
[![GitHub stars](https://img.shields.io/github/stars/linyqh/NarratoAI?style=for-the-badge)](https://github.com/linyqh/NarratoAI/stargazers)
[![Discord](https://img.shields.io/discord/1134848537704804432?style=for-the-badge)](https://discord.gg/WBKChhmZ)
<h3>首页</h3>
![](docs/index-zh.png)
<h3>视频审查界面</h3>
![](docs/check-zh.png)
</div>
## 配置要求 📦
- 建议最低 CPU 4 核或以上，内存 8G 或以上，显卡非必须
- Windows 10 或 MacOS 11.0 以上系统
## 快速开始 🚀
### 申请 Google AI studio 账号
1. 访问 https://aistudio.google.com/app/prompts/new_chat 申请账号
2. 点击 `Get API Key` 申请 API Key
3. 申请的 API Key 填入 `config.example.toml` 文件中的 `gemini_api_key` 配置
### 配置 proxy VPN
> 配置 VPN 的方法不限，只要能正常访问 Google 网络即可，本文采用的是 clash
1. 记住 clash 服务的端口,一般为 `http://127.0.0.1:7890`
2. 若端口不为 `7890`,请修改 `docker-compose.yml` 文件中的 `VPN_PROXY_URL` 为你的代理地址
```yaml
environment:
- "VPN_PROXY_URL=http://host.docker.internal:7890" # 修改为你的代理端口；host.docker.internal 表示物理机的 IP
```
3. (可选)或者修改 `config.example.toml` 文件中的 `proxy` 配置
```toml
[proxy]
### Use a proxy to access the Pexels API
### Format: "http://<username>:<password>@<proxy>:<port>"
### Example: "http://user:pass@proxy:1234"
### Doc: https://requests.readthedocs.io/en/latest/user/advanced/#proxies
http = "http://xx.xx.xx.xx:7890"
https = "http://xx.xx.xx.xx:7890"
```
### docker部署🐳
#### ① 拉取项目启动Docker
```shell
git clone https://github.com/linyqh/NarratoAI.git
cd NarratoAI
docker-compose up
```
#### ② 访问Web界面
打开浏览器,访问 http://127.0.0.1:8501
#### ③ 访问API文档
打开浏览器,访问 http://127.0.0.1:8080/docs 或者 http://127.0.0.1:8080/redoc
## 使用方法
#### 1. 基础配置：选择模型，填入 API Key
> 目前暂时只支持 `Gemini` 模型,其他模式待后续更新,欢迎大家提交 [PR](https://github.com/linyqh/NarratoAI/pulls),参与开发 🎉🎉🎉
<div align="center">
<img src="docs/img001-zh.png" alt="001" width="1000"/>
</div>
#### 2. 选择需要解说的视频,点击生成视频脚本
> 平台内置了一个演示视频。若要使用自己的视频，将 mp4 文件放在 `resource/videos` 目录下，刷新浏览器即可。
> 注意：文件名随意，但不能包含中文、特殊字符、空格、反斜杠等
<div align="center">
<img src="docs/img002-zh.png" alt="002" width="400"/>
</div>
#### 3. 保存脚本,开始剪辑
> 保存脚本后，刷新浏览器，在脚本文件的下拉框就会有新生成的 `.json` 脚本文件，选择 json 文件和视频就可以开始剪辑了。
<div align="center">
<img src="docs/img003-zh.png" alt="003" width="400"/>
</div>
#### 4. 检查视频,若视频存在不符合规则的片段,可以点击重新生成或者手动编辑
<div align="center">
<img src="docs/img004-zh.png" alt="003" width="1000"/>
</div>
#### 5. 配置视频基本参数
<div align="center">
<img src="docs/img005-zh.png" alt="003" width="700"/>
</div>
#### 6. 开始生成
<div align="center">
<img src="docs/img006-zh.png" alt="003" width="1000"/>
</div>
#### 7. 视频生成完成
<div align="center">
<img src="docs/img007-zh.png" alt="003" width="1000"/>
</div>
## 开发 💻
1. 安装依赖
```shell
conda create -n narratoai python=3.10
conda activate narratoai
cd narratoai
pip install -r requirements.txt
```
2. 安装 ImageMagick
###### Windows:
- 下载 https://imagemagick.org/archive/binaries/ImageMagick-7.1.1-36-Q16-x64-static.exe
- 安装下载好的 ImageMagick，注意不要修改安装路径
- 修改配置文件 `config.toml` 中的 `imagemagick_path` 为你的实际安装路径（一般在 `C:\Program Files\ImageMagick-7.1.1-Q16\magick.exe`）
###### MacOS:
```shell
brew install imagemagick
```
###### Ubuntu
```shell
sudo apt-get install imagemagick
```
###### CentOS
```shell
sudo yum install ImageMagick
```
3. 启动 webui
```shell
streamlit run ./webui/Main.py --browser.serverAddress=127.0.0.1 --server.enableCORS=True --browser.gatherUsageStats=False
```
4. 访问 http://127.0.0.1:8501
## 反馈建议 📢
### 👏👏👏 可以提交 [issue](https://github.com/linyqh/NarratoAI/issues)或者 [pull request](https://github.com/linyqh/NarratoAI/pulls) 🎉🎉🎉
## 参考项目 📚
- https://github.com/FujiwaraChoki/MoneyPrinter
- https://github.com/harry0703/MoneyPrinterTurbo
该项目基于以上项目重构而来,增加了影视解说功能,感谢大佬的开源精神 🥳🥳🥳
## 许可证 📝
点击查看 [`LICENSE`](LICENSE) 文件
## Star History
[![Star History Chart](https://api.star-history.com/svg?repos=linyqh/NarratoAI&type=Date)](https://star-history.com/#linyqh/NarratoAI&Date)

127
README.md

@ -1,7 +1,7 @@
<div align="center">
<h1 align="center" style="font-size: 2cm;"> NarratoAI 😎📽️ </h1>
<h3 align="center">一站式 AI 影视解说+自动化剪辑工具🎬🎞️ </h3>
<h3 align="center">An all-in-one AI-powered tool for film commentary and automated video editing.🎬🎞️ </h3>
<h3>📖 <a href="README-en.md">English</a> | 简体中文 </h3>
@ -10,7 +10,7 @@
[//]: # ( <a href="https://trendshift.io/repositories/8731" target="_blank"><img src="https://trendshift.io/api/badge/repositories/8731" alt="harry0703%2FNarratoAI | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>)
</div>
<br>
NarratoAI 是一个自动化影视解说工具基于LLM实现文案撰写、自动化视频剪辑、配音和字幕生成的一站式流程助力高效内容创作。
NarratoAI is an automated video narration tool that provides an all-in-one solution for script writing, automated video editing, voice-over, and subtitle generation, powered by LLM to enhance efficient content creation.
<br>
[![madewithlove](https://img.shields.io/badge/made_with-%E2%9D%A4-red?style=for-the-badge&labelColor=orange)](https://github.com/linyqh/NarratoAI)
@ -19,37 +19,37 @@ NarratoAI 是一个自动化影视解说工具基于LLM实现文案撰写、
[![GitHub stars](https://img.shields.io/github/stars/linyqh/NarratoAI?style=for-the-badge)](https://github.com/linyqh/NarratoAI/stargazers)
[![Discord](https://img.shields.io/discord/1134848537704804432?style=for-the-badge)](https://discord.gg/WBKChhmZ)
<h3>Home</h3>
<h3>首页</h3>
![](docs/index-en.png)
![](docs/index.png)
<h3>Video Review Interface</h3>
<h3>视频审查界面</h3>
![](docs/check.png)
![](docs/check-en.png)
</div>
## 配置要求 📦
## System Requirements 📦
- 建议最低 CPU 4核或以上内存 8G 或以上,显卡非必须
- Windows 10 或 MacOS 11.0 以上系统
- Recommended minimum: CPU with 4 cores or more, 8GB RAM or more, GPU is not required
- Windows 10 or MacOS 11.0 or above
## 快速开始 🚀
### 申请 Google AI studio 账号
1. 访问 https://aistudio.google.com/app/prompts/new_chat 申请账号
2. 点击 `Get API Key` 申请 API Key
3. 申请的 API Key 填入 `config.example.toml` 文件中的 `gemini_api_key` 配置
## Quick Start 🚀
### Apply for Google AI Studio Account
1. Visit https://aistudio.google.com/app/prompts/new_chat to apply for an account.
2. Click `Get API Key` to request an API Key.
3. Enter the obtained API Key into the `gemini_api_key` setting in the `config.example.toml` file.
### 配置 proxy VPN
> 配置vpn的方法不限只要能正常访问 Google 网络即可,本文采用的是 chash
1. 记住 clash 服务的端口,一般为 `http://127.0.0.1:7890`
2. 若端口不为 `7890`,请修改 `docker-compose.yml` 文件中的 `VPN_PROXY_URL` 为你的代理地址
### Configure Proxy VPN
> The method to configure VPN is not restricted, as long as you can access Google's network. Here, `clash` is used as an example.
1. Note the port of the clash service, usually `http://127.0.0.1:7890`.
2. If the port is not `7890`, modify the `VPN_PROXY_URL` in the `docker-compose.yml` file to your proxy address.
```yaml
environment:
- "VPN_PROXY_URL=http://host.docker.internal:7890" # 修改为你的代理端口host.docker.internal表示物理机的IP
```
3. (可选)或者修改 `config.example.toml` 文件中的 `proxy` 配置
- "VPN_PROXY_URL=http://host.docker.internal:7890" # Change to your proxy port; host.docker.internal represents the IP of the physical machine.
```
3. (Optional) Or modify the `proxy` settings in the `config.example.toml` file.
```toml
[proxy]
### Use a proxy to access the Pexels API
@ -60,76 +60,76 @@ NarratoAI 是一个自动化影视解说工具基于LLM实现文案撰写、
http = "http://xx.xx.xx.xx:7890"
https = "http://xx.xx.xx.xx:7890"
```
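For context, the `http` and `https` keys of `[proxy]` follow the `proxies` mapping described in the `requests` documentation linked in the config comments. The sketch below shows one way such a mapping could be built, including the optional credentials from the documented format. `build_proxy_url` and `build_proxies` are hypothetical helper names, not part of NarratoAI, and the host and port are placeholders.

```python
from typing import Dict, Optional
from urllib.parse import quote


def build_proxy_url(host: str, port: int,
                    username: Optional[str] = None,
                    password: Optional[str] = None) -> str:
    """Return a proxy URL in the form http://<username>:<password>@<proxy>:<port>."""
    auth = ""
    if username is not None and password is not None:
        # Percent-encode credentials so characters such as '@' or ':' stay unambiguous.
        auth = f"{quote(username, safe='')}:{quote(password, safe='')}@"
    return f"http://{auth}{host}:{port}"


def build_proxies(proxy_url: str) -> Dict[str, str]:
    """Use the same HTTP proxy for both schemes, as in the config example."""
    return {"http": proxy_url, "https": proxy_url}


# Placeholder values: a local clash instance on its usual port.
proxies = build_proxies(build_proxy_url("127.0.0.1", 7890))
print(proxies)
```

A dictionary of this shape can be passed to `requests` calls through the `proxies` argument, for example `requests.get(url, proxies=proxies)`.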
### docker部署🐳
#### ① 垃取项目启动Docker
### Docker Deployment 🐳
#### ① Clone the Project and Start Docker
```shell
git clone https://github.com/linyqh/NarratoAI.git
cd NarratoAI
docker-compose up
```
#### ② 访问Web界面
#### ② Access the Web Interface
打开浏览器,访问 http://127.0.0.1:8501
Open your browser and go to http://127.0.0.1:8501
#### ③ 访问API文档
#### ③ Access the API Documentation
打开浏览器,访问 http://127.0.0.1:8080/docs 或者 http://127.0.0.1:8080/redoc
Open your browser and go to http://127.0.0.1:8080/docs or http://127.0.0.1:8080/redoc
## 使用方法
#### 1. 基础配置选择模型填入APIKey选择模型
> 目前暂时只支持 `Gemini` 模型,其他模式待后续更新,欢迎大家提交 [PR](https://github.com/linyqh/NarratoAI/pulls),参与开发 🎉🎉🎉
## Usage
#### 1. Basic Configuration: Select the Model and Enter the API Key
> Currently, only the `Gemini` model is supported; other models will be added in future updates. Contributions via [PR](https://github.com/linyqh/NarratoAI/pulls) are welcome 🎉🎉🎉
<div align="center">
<img src="docs/img001.png" alt="001" width="1000"/>
<img src="docs/img001-en.png" alt="001" width="1000"/>
</div>
#### 2. 选择需要解说的视频,点击生成视频脚本
> 平台内置了一个演示视频若要使用自己的视频将mp4文件放在 `resource/videos` 目录下,刷新浏览器即可,
> 注意:文件名随意,但文件名不能包含中文,特殊字符,空格,反斜杠等
#### 2. Select the Video for Narration and Click to Generate Video Script
> A demo video is included in the platform. To use your own video, place the mp4 file in the `resource/videos` directory and refresh your browser.
> Note: The filename can be anything, but it must not contain Chinese characters, special characters, spaces, backslashes, etc.
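
The filename rule in the note above could be checked before a video is placed in `resource/videos`. The following is a minimal sketch under an assumed allow-list (ASCII letters, digits, dot, underscore, hyphen); the README does not define exactly which special characters are rejected, and `is_safe_video_name` is a hypothetical helper rather than a NarratoAI function.

```python
import re

# Assumed allow-list: ASCII letters, digits, dot, underscore and hyphen, ending in .mp4.
SAFE_VIDEO_NAME = re.compile(r"[A-Za-z0-9._-]+\.mp4")


def is_safe_video_name(filename: str) -> bool:
    """Return True if the name avoids non-ASCII characters, spaces and backslashes."""
    return SAFE_VIDEO_NAME.fullmatch(filename) is not None


# One acceptable name followed by three that break the stated rule.
for name in ["demo_01.mp4", "my video.mp4", "电影.mp4", "clips\\a.mp4"]:
    print(name, is_safe_video_name(name))
```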
<div align="center">
<img src="docs/img002.png" alt="002" width="400"/>
<img src="docs/img002-en.png" alt="002" width="400"/>
</div>
#### 3. 保存脚本,开始剪辑
> 保存脚本后,刷新浏览器,在脚本文件的下拉框就会有新生成的 `.json` 脚本文件选择json文件和视频就可以开始剪辑了。
#### 3. Save the Script and Start Editing
> After saving the script, refresh the browser, and the newly generated `.json` script file will appear in the script file dropdown. Select the json file and video to start editing.
<div align="center">
<img src="docs/img003.png" alt="003" width="400"/>
<img src="docs/img003-en.png" alt="003" width="400"/>
</div>
#### 4. 检查视频,若视频存在不符合规则的片段,可以点击重新生成或者手动编辑
#### 4. Review the Video; if there are segments that don't meet the rules, click to regenerate or manually edit them.
<div align="center">
<img src="docs/img004.png" alt="003" width="1000"/>
<img src="docs/img004-en.png" alt="003" width="1000"/>
</div>
#### 5. 配置视频基本参数
#### 5. Configure Basic Video Parameters
<div align="center">
<img src="docs/img005.png" alt="003" width="700"/>
<img src="docs/img005-en.png" alt="003" width="700"/>
</div>
#### 6. 开始生成
#### 6. Start Generating
<div align="center">
<img src="docs/img006.png" alt="003" width="1000"/>
<img src="docs/img006-en.png" alt="003" width="1000"/>
</div>
#### 7. 视频生成完成
#### 7. Video Generation Complete
<div align="center">
<img src="docs/img007.png" alt="003" width="1000"/>
<img src="docs/img007-en.png" alt="003" width="1000"/>
</div>
## 开发 💻
1. 安装依赖
## Development 💻
1. Install Dependencies
```shell
conda create -n narratoai python=3.10
conda activate narratoai
cd narratoai
pip install -r requirements.txt
```
2. 安装 ImageMagick
2. Install ImageMagick
###### Windows:
- 下载 https://imagemagick.org/archive/binaries/ImageMagick-7.1.1-36-Q16-x64-static.exe
- 安装下载好的 ImageMagick注意不要修改安装路径
- 修改 `配置文件 config.toml` 中的 `imagemagick_path` 为你的实际安装路径(一般在 `C:\Program Files\ImageMagick-7.1.1-Q16\magick.exe`
- Download https://imagemagick.org/archive/binaries/ImageMagick-7.1.1-36-Q16-x64-static.exe
- Install the downloaded ImageMagick, ensuring you do not change the installation path
- Update `imagemagick_path` in the `config.toml` file to your actual installation path (typically `C:\Program Files\ImageMagick-7.1.1-Q16\magick.exe`)
###### MacOS:
@ -148,28 +148,27 @@ sudo apt-get install imagemagick
```shell
sudo yum install ImageMagick
```
3. 启动 webui
3. Start the web UI
```shell
streamlit run ./webui/Main.py --browser.serverAddress=127.0.0.1 --server.enableCORS=True --browser.gatherUsageStats=False
```
4. 访问 http://127.0.0.1:8501
4. Access http://127.0.0.1:8501
## Feedback & Suggestions 📢
## 反馈建议 📢
### 👏👏👏 You can submit [issues](https://github.com/linyqh/NarratoAI/issues) or [pull requests](https://github.com/linyqh/NarratoAI/pulls) 🎉🎉🎉
### 👏👏👏 可以提交 [issue](https://github.com/linyqh/NarratoAI/issues)或者 [pull request](https://github.com/linyqh/NarratoAI/pulls) 🎉🎉🎉
## 参考项目 📚
## Reference Projects 📚
- https://github.com/FujiwaraChoki/MoneyPrinter
- https://github.com/harry0703/MoneyPrinterTurbo
该项目基于以上项目重构而来,增加了影视解说功能,感谢大佬的开源精神 🥳🥳🥳
This project was refactored based on the above projects with the addition of video narration features. Thanks to the original authors for their open-source spirit 🥳🥳🥳
## 许可证 📝
## License 📝
点击查看 [`LICENSE`](LICENSE) 文件
Click to view the [`LICENSE`](LICENSE) file
## Star History
[![Star History Chart](https://api.star-history.com/svg?repos=linyqh/NarratoAI&type=Date)](https://star-history.com/#linyqh/NarratoAI&Date)


@ -1,7 +1,7 @@
import logging
import re
import os
import json
import traceback
from typing import List
from loguru import logger
from openai import OpenAI
@ -387,7 +387,7 @@ Please note that you must use English for generating video search terms; Chinese
return search_terms
def gemini_video2json(video_origin_name: str, video_origin_path: str, video_plot: str) -> str:
def gemini_video2json(video_origin_name: str, video_origin_path: str, video_plot: str, language: str) -> str:
'''
Analyze a film or TV video with gemini-1.5-pro
Args:
@ -405,34 +405,46 @@ def gemini_video2json(video_origin_name: str, video_origin_path: str, video_plot
model = gemini.GenerativeModel(model_name=model_name)
prompt = """
# Role: 影视解说专家
# 角色设定:
你是一位影视解说专家擅长根据剧情描述视频的画面和故事生成一段有趣且吸引人的解说文案你特别熟悉 tiktok/抖音 风格的影视解说文案创作
## Background:
擅长根据剧情描述视频的画面和故事能够生成一段非常有趣的解说文案
# 任务目标:
1. 根据给定的剧情描述详细描述视频画面并展开叙述尤其是对重要画面进行细致刻画
2. 生成风格符合 tiktok/抖音 的影视解说文案使其节奏快内容抓人
3. 最终结果以 JSON 格式输出字段包含
"picture"画面描述
"timestamp"时间戳表示画面出现的时间-画面结束的时间
"narration"对应的解说文案
## Goals:
1. 根据剧情描述视频的画面和故事并对重要的画面进行展开叙述
2. 根据剧情内容生成符合 tiktok/抖音 风格的影视解说文案
3. 将结果直接以json格式输出给用户需要包含字段 picture 画面描述 timestamp 时间戳 narration 解说文案
4. 剧情内容如下{%s}
# 输入示例:
```text
在一个黑暗的小巷中主角缓慢走进四周静谧无声只有远处隐隐传来猫的叫声突然背后出现一个神秘的身影
```
## Skills
- 精通 tiktok/抖音 等短视频影视解说文案撰写
- 能够理解视频中的故事和画面表现
- 能精准匹配视频中的画面和时间戳
- 能精准把控旁白和时长
- 精通中文
- 精通JSON数据格式
# 输出格式:
```json
[
{
"picture": "黑暗的小巷中,主角缓慢走进,四周静谧无声,远处有模糊的猫叫声。",
"timestamp": "00:00-00:17",
"narration": "昏暗的小巷里,他独自前行,空气中透着一丝不安,隐约中能听到远处的猫叫声。 "
},
{
"picture": "主角背后突然出现一个神秘的身影,气氛骤然紧张。",
"timestamp": "00:17-00:39",
"narration": "就在他以为安全时,一个身影悄无声息地出现在他身后,危险一步步逼近! "
}
...
]
```
# 提示:
- 生成的解说文案应简洁有力符合短视频平台用户的偏好
- 叙述中应有强烈的代入感和悬念以吸引观众持续观看
- 文案语言为%s
- 剧情内容如下%s (若为空则忽略)
""" % (language, video_plot)
## Constrains
- 解说文案的时长要和时间戳的时长尽量匹配
- 忽略视频中关于广告的内容
- 忽略视频中片头和片尾
- 不得在脚本中包含任何类型的 Markdown 或格式
## Format
- 对应JSON的key为picture timestamp narration
""" % video_plot
logger.debug(f"视频名称: {video_origin_name}")
try:
gemini_video_file = gemini.upload_file(video_origin_path)
@ -444,9 +456,9 @@ def gemini_video2json(video_origin_name: str, video_origin_path: str, video_plot
logger.debug(f"视频当前状态(ACTIVE才可用): {gemini_video_file.state.name}")
if gemini_video_file.state.name == "FAILED":
raise ValueError(gemini_video_file.state.name)
except:
logger.error("上传视频至 Google cloud 失败, 请检查 VPN 配置和 APIKey 是否正确")
raise TimeoutError("上传视频至 Google cloud 失败, 请检查 VPN 配置和 APIKey 是否正确")
except Exception as err:
logger.error(f"上传视频至 Google cloud 失败, 请检查 VPN 配置和 APIKey 是否正确 \n{traceback.format_exc()}")
raise TimeoutError(f"上传视频至 Google cloud 失败, 请检查 VPN 配置和 APIKey 是否正确; {err}")
streams = model.generate_content([prompt, gemini_video_file], stream=True)
response = []
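Review note: the hunk above wraps every upload problem in `TimeoutError`, including the `FAILED` state. One possible way to keep a failed upload distinct from a real timeout is a small polling helper, sketched below. `wait_until_active` and its parameters are hypothetical, and the `"PROCESSING"` state name is only a stand-in for whatever the uploaded file reports before it becomes `ACTIVE`.

```python
import time
from typing import Callable


def wait_until_active(get_state: Callable[[], str],
                      timeout: float = 300.0,
                      interval: float = 2.0) -> None:
    """Poll get_state() until it returns "ACTIVE"; raise on "FAILED" or timeout."""
    deadline = time.monotonic() + timeout
    while True:
        state = get_state()
        if state == "ACTIVE":
            return
        if state == "FAILED":
            # Processing failed on the remote side; waiting longer will not help.
            raise ValueError(f"file processing failed: {state}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"file still in state {state!r} after {timeout} s")
        time.sleep(interval)


# Stand-in for refreshing the uploaded file and reading its state name.
states = iter(["PROCESSING", "PROCESSING", "ACTIVE"])
wait_until_active(lambda: next(states), timeout=5.0, interval=0.01)
print("file is ACTIVE")
```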
@ -460,8 +472,14 @@ def gemini_video2json(video_origin_name: str, video_origin_path: str, video_plot
if __name__ == "__main__":
juqin = ""
res = gemini_video2json("test", "/NarratoAI/resource/videos/test.mp4", juqin)
video_subject = "摔跤吧!爸爸 Dangal"
video_path = "/NarratoAI/resource/videos/test.mp4"
video_plot = '''
马哈维亚阿米尔· Aamir Khan 曾经是一名前途无量的摔跤运动员在放弃了职业生涯后他最大的遗憾就是没有能够替国家赢得金牌马哈维亚将这份希望寄托在了尚未出生的儿子身上哪知道妻子接连给他生了两个女儿取名吉塔法缇玛·萨那·纱卡 Fatima Sana Shaikh 和巴比塔桑亚·玛荷塔 Sanya Malhotra 让马哈维亚没有想到的是两个姑娘展现出了杰出的摔跤天赋让他幡然醒悟就算是女孩也能够昂首挺胸的站在比赛场上为了国家和她们自己赢得荣誉
就这样在马哈维亚的指导下吉塔和巴比塔开始了艰苦的训练两人进步神速很快就因为在比赛中连连获胜而成为了当地的名人为了获得更多的机会吉塔进入了国家体育学院学习在那里她将面对更大的诱惑和更多的选择
'''
language = "zh-CN"
res = gemini_video2json(video_subject, video_path, video_plot, language)
print(res)
# video_subject = "生命的意义是什么"
@ -475,3 +493,38 @@ if __name__ == "__main__":
# )
# print("######################")
# print(search_terms)
# prompt = """
# # Role: 影视解说专家
#
# ## Background:
# 擅长根据剧情描述视频的画面和故事,能够生成一段非常有趣的解说文案。
#
# ## Goals:
# 1. 根据剧情描述视频的画面和故事,并对重要的画面进行展开叙述
# 2. 根据剧情内容,生成符合 tiktok/抖音 风格的影视解说文案
# 3. 将结果直接以json格式输出给用户需要包含字段 picture 画面描述, timestamp 时间戳, narration 解说文案
# 4. 剧情内容如下:{%s}
#
# ## Skills
# - 精通 tiktok/抖音 等短视频影视解说文案撰写
# - 能够理解视频中的故事和画面表现
# - 能精准匹配视频中的画面和时间戳
# - 能精准把控旁白和时长
# - 精通中文
# - 精通JSON数据格式
#
# ## Constrains
# - 解说文案的时长要和时间戳的时长尽量匹配
# - 忽略视频中关于广告的内容
# - 忽略视频中片头和片尾
# - 不得在脚本中包含任何类型的 Markdown 或格式
#
# ## Format
# - 对应JSON的key为picture timestamp narration
#
# # Initialization:
# - video subject: {video_subject}
# - number of paragraphs: {paragraph_number}
# """.strip()
# if language:
# prompt += f"\n- language: {language}"
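Review note: the new prompt asks the model for a JSON list whose items carry `picture`, `timestamp` (for example `"00:17-00:39"`), and `narration`. A caller might want to check that shape before editing starts. The sketch below assumes timestamps always use the `MM:SS-MM:SS` form from the prompt's example; `parse_timestamp` and `validate_script` are hypothetical helpers, not functions from this repository.

```python
import json
from typing import Dict, List, Tuple


def parse_timestamp(value: str) -> Tuple[int, int]:
    """Convert "MM:SS-MM:SS" into (start_seconds, end_seconds)."""
    def to_seconds(part: str) -> int:
        minutes, seconds = part.strip().split(":")
        return int(minutes) * 60 + int(seconds)

    start, end = value.split("-")
    return to_seconds(start), to_seconds(end)


def validate_script(raw: str) -> List[Dict[str, str]]:
    """Parse the model output and check the three expected keys on every item."""
    data = json.loads(raw)
    if not isinstance(data, list):
        raise ValueError("script must be a JSON list")
    for item in data:
        missing = {"picture", "timestamp", "narration"} - set(item)
        if missing:
            raise ValueError(f"missing keys: {sorted(missing)}")
        start, end = parse_timestamp(item["timestamp"])
        if end <= start:
            raise ValueError(f"empty or reversed segment: {item['timestamp']}")
    return data


sample = '[{"picture": "a dark alley", "timestamp": "00:17-00:39", "narration": "..."}]'
script = validate_script(sample)
print(parse_timestamp(script[0]["timestamp"]))  # (17, 39)
```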


@ -20,7 +20,7 @@ services:
dockerfile: Dockerfile
container_name: "api"
ports:
- "8080:8080"
- "8502:8080"
command: [ "python3", "main.py" ]
volumes: *common-volumes
restart: always
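
Review note: in Compose's short `ports` syntax the first number is the host port and the second is the container port, so with this change the API would be reached on host port 8502 while still listening on 8080 inside the container. The README text in this commit still lists `http://127.0.0.1:8080/docs`, which may need a matching update. The sketch below only illustrates the mapping; `host_url` is a hypothetical helper.

```python
def host_url(port_mapping: str, path: str = "/docs", host: str = "127.0.0.1") -> str:
    """Build the host-side URL for a Compose short-syntax "HOST:CONTAINER" mapping."""
    host_port, _container_port = port_mapping.split(":")
    return f"http://{host}:{int(host_port)}{path}"


print(host_url("8080:8080"))  # mapping before this change
print(host_url("8502:8080"))  # mapping after this change
```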

[Image previews omitted: nine binary image files changed in this commit, with reported sizes from 54 KiB to 696 KiB.]


@ -4,6 +4,7 @@ import glob
import json
import time
import datetime
import traceback
# Add the project root directory to the system path so that modules can be imported from the project
root_dir = os.path.dirname(os.path.dirname(os.path.realpath(__file__)))
@ -252,7 +253,7 @@ with left_panel:
video_languages = [
(tr("Auto Detect"), ""),
]
for code in ["zh-CN", "zh-TW", "de-DE", "en-US", "vi-VN"]:
for code in ["zh-CN", "en-US", "zh-TW"]:
video_languages.append((code, code))
selected_index = st.selectbox(tr("Script Language"),
@ -351,7 +352,8 @@ with left_panel:
script = llm.gemini_video2json(
video_origin_name=params.video_origin_path.split("\\")[-1],
video_origin_path=params.video_origin_path,
video_plot=video_plot
video_plot=video_plot,
language=params.video_language,
)
st.session_state['video_clip_json'] = script
cleaned_string = script.strip("```json").strip("```")
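Review note: `params.video_origin_path.split("\\")[-1]` splits only on backslashes, so on Linux or macOS (including inside the Docker container) it returns the whole path rather than the file name. A separator-agnostic alternative is sketched below; `video_file_name` is a hypothetical helper, shown only to illustrate the idea.

```python
from pathlib import PureWindowsPath


def video_file_name(path: str) -> str:
    """Return the final path component for both "/" and "\\" separated paths."""
    # PureWindowsPath accepts both separators, so it also handles POSIX-style paths.
    return PureWindowsPath(path).name


print(video_file_name("C:\\videos\\test.mp4"))
print(video_file_name("/NarratoAI/resource/videos/test.mp4"))
```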
@ -376,19 +378,21 @@ with left_panel:
st.error(tr("请输入视频脚本"))
st.stop()
with st.spinner(tr("保存脚本")):
with st.spinner(tr("Save Script")):
script_dir = utils.script_dir()
# Get the current timestamp, e.g. 2024-0618-171820
timestamp = datetime.datetime.now().strftime("%Y-%m%d-%H%M%S")
save_path = os.path.join(script_dir, f"{timestamp}.json")
# Try to parse the input JSON data
input_json = str(video_clip_json_details).replace("'", '"')
# input_json = str(video_clip_json_details).replace("'", '"')
input_json = str(video_clip_json_details)
logger.debug(input_json)  # diagnostic output, so debug level rather than error
input_json = input_json.strip('```json').strip('```')
try:
data = json.loads(input_json)
except:
raise ValueError("视频脚本格式错误,请检查脚本是否符合 JSON 格式")
except Exception as err:
raise ValueError(f"视频脚本格式错误,请检查脚本是否符合 JSON 格式{err} \n\n{traceback.format_exc()}")
# Check whether the data is a list
if not isinstance(data, list):
@ -682,7 +686,7 @@ with right_panel:
params.stroke_width = st.slider(tr("Stroke Width"), 0.0, 10.0, 1.5)
# Video editing panel
with st.expander(tr("视频审查"), expanded=False):
with st.expander(tr("Video Check"), expanded=False):
try:
video_list = st.session_state['video_script_list']
except KeyError as e:
@ -714,13 +718,13 @@ with st.expander(tr("视频审查"), expanded=False):
# Editable input boxes
text_panels = st.columns(2)
with text_panels[0]:
text1 = st.text_area("时间戳", value=initial_timestamp, height=20)
text1 = st.text_area(tr("timestamp"), value=initial_timestamp, height=20)
with text_panels[1]:
text2 = st.text_area("画面描述", value=initial_picture, height=20)
text3 = st.text_area("解说旁白", value=initial_narration, height=100)
text2 = st.text_area(tr("Picture description"), value=initial_picture, height=20)
text3 = st.text_area(tr("Narration"), value=initial_narration, height=100)
# Regenerate button
if st.button("重新生成", key=f"button_{index}"):
if st.button(tr("Rebuild"), key=f"button_{index}"):
print(123123)
# with st.spinner(tr("大模型生成中...")):
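Review note: both `script.strip("```json").strip("```")` and the matching call on `input_json` rely on `str.strip`, which treats its argument as a set of characters rather than a literal prefix or suffix. It works here largely because the expected payload starts with `[` and ends with `]`. A more explicit way to drop an optional Markdown fence is sketched below; `strip_code_fence` is a hypothetical helper and the regular expression is only one possible approach.

```python
import json
import re

# Optional opening fence (``` or ```json), the payload, and an optional closing fence.
FENCE = re.compile(r"^\s*```(?:json)?\s*\n?(.*?)\n?\s*```\s*$", re.DOTALL)


def strip_code_fence(text: str) -> str:
    """Return the text inside a surrounding code fence, or the stripped text if unfenced."""
    match = FENCE.match(text)
    return match.group(1) if match else text.strip()


fenced = '```json\n[{"picture": "p", "timestamp": "00:00-00:17", "narration": "n"}]\n```'
data = json.loads(strip_code_fence(fenced))
print(data[0]["timestamp"])
```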


@ -85,6 +85,11 @@
"TTS Provider": "语音合成提供商",
"Hide Log": "隐藏日志",
"Upload Local Files": "上传本地文件",
"File Uploaded Successfully": "文件上传成功"
"Video Check": "视频审查",
"File Uploaded Successfully": "文件上传成功",
"timestamp": "时间戳",
"Picture description": "图片描述",
"Narration": "视频文案",
"Rebuild": "重新生成"
}
}
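
Review note: this commit replaces hard-coded Chinese labels in the web UI with `tr("...")` calls backed by the keys added above. A lookup of that kind could work roughly as sketched below. The locale layout (a `"Translation"` object inside the file) and the `make_tr` factory are assumptions for illustration, not the project's actual implementation.

```python
import json
from typing import Callable, Dict

# Assumed locale layout; only two of the keys added in this commit are reproduced.
ZH_LOCALE: Dict = json.loads("""
{
  "Translation": {
    "Video Check": "视频审查",
    "Rebuild": "重新生成"
  }
}
""")


def make_tr(locale: Dict) -> Callable[[str], str]:
    """Return a tr() function bound to one locale dictionary."""
    translations = locale.get("Translation", {})

    def tr(key: str) -> str:
        # Fall back to the key itself so untranslated strings still render.
        return translations.get(key, key)

    return tr


tr = make_tr(ZH_LOCALE)
print(tr("Rebuild"))
print(tr("Some Untranslated Label"))
```

Falling back to the key keeps the interface usable in English when a locale file lacks an entry, which is why English keys such as `"Video Check"` are convenient identifiers.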