Upload files to "/"
README.md (151 lines, normal file)
@@ -0,0 +1,151 @@
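# AI Subtitle & Dubbing Ultimate Toolkit (AI 字幕 & 配音终极工具套件)

[License: MIT](https://opensource.org/licenses/MIT)

This is a suite of desktop applications for video creators, translation groups, and content producers. It uses AI to streamline and automate subtitle processing and voice-over synthesis: refining the text, tuning the pacing for listening, and finally synthesizing an AI voice-over, all in one toolkit.
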
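## 🌟 Core Features

* **Smart text refinement**: Uses a local large language model (via Ollama) to polish and tighten subtitle text, removing colloquialisms and filler words so it reads like written, voice-over-ready copy.
* **Listening-pace optimization**: A custom algorithm built on a "golden speech rate" model compresses, splits, and stitches subtitles to fix speech that is too fast or too slow.
* **Interactive fine editing**: A visual interface for comparing, editing, and polishing subtitles line by line, with AI optimization available for any single line, so you keep control of the final result.
* **One-click AI dubbing**: Integrates Microsoft Edge TTS to generate rate-matched AI voice-over from the optimized subtitle timeline and merge it with the video in one step.
* **Highly configurable**: All core parameters (golden speech rate, pause duration, LLM model, and so on) can be configured.
* **Local and private**: All LLM computation runs locally through Ollama, so the text-refinement steps do not upload your subtitles to the cloud. Note that the dubbing step uses Edge TTS, an online service, and does send subtitle text to it.
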
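## 🛠️ Toolkit Components

The suite consists of the following graphical tools, which run independently but can be chained together:

| File | Tool | Main function |
| :-------------------------------- | :-------------------------- | :----------------------------------------------------------- |
| `srt_refiner_v1.py` | **Batch Subtitle Refiner** | **[Step 1]** Fully automatic batch polishing of the text of an entire SRT file. |
| `srt_interactive_refiner_v1.0.py` | **Interactive Subtitle Editor** | **[Step 2]** Manual refinement and proofreading; individual lines can be AI-polished or edited by hand. |
| `srt_ultimate_optimizer_v4.0.py` | **Ultimate Subtitle Optimizer** | **[Step 3]** The core tool: deep optimization of subtitle **timing and structure**, adjusting speech rate and rhythm. |
| `srt_optimizer_v2.py` | **Subtitle Pacing Optimizer (Lite)** | **[Alternative]** An LLM-free replacement for the `ultimate` optimizer; purely rule-based pacing optimization, and faster. |
| `main_v1.py` | **AI Video Dubbing Tool** | **[Step 4]** Reads the final optimized SRT file, generates the voice-over, and merges it with the video. |
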
## 🚀 Tech Stack

* **GUI framework**: `Python` + `Tkinter`
* **UI theme**: `sv-ttk` (modern dark and light themes)
* **Large language model (LLM) support**: `Ollama` (runs locally; supports models such as Gemma, Llama, and Qwen)
* **Text-to-speech (TTS)**: `edge-tts` (calls the TTS engine used by the Microsoft Edge browser)
* **Audio/video processing**: `FFmpeg` (the industry-standard audio/video processing tool)
* **Asynchronous processing**: `asyncio`, `threading`, `queue` (keeps the GUI responsive)

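The `threading` + `queue` combination listed above can be sketched without any GUI code. This is a minimal illustration under assumptions, not the toolkit's actual implementation: `worker` and `drain` are hypothetical names, and the `upper()` call stands in for slow work such as a TTS request. In a Tkinter app, `drain` would typically be called on a timer from the main thread (for example via `root.after`) instead of joining the thread.

```python
import queue
import threading


def worker(jobs, out_queue):
    """Run slow work off the main thread and report through a thread-safe queue."""
    total = len(jobs)
    for done, job in enumerate(jobs, start=1):
        result = job.upper()  # stand-in for slow work (TTS, FFmpeg, LLM call)
        out_queue.put({"type": "progress", "current": done, "total": total, "data": result})
    out_queue.put({"type": "finish", "success": True})


def drain(out_queue):
    """Collect every message currently queued, without blocking."""
    messages = []
    while True:
        try:
            messages.append(out_queue.get_nowait())
        except queue.Empty:
            return messages


gui_queue = queue.Queue()
thread = threading.Thread(target=worker, args=(["a", "b"], gui_queue), daemon=True)
thread.start()
thread.join()  # a real GUI would poll periodically instead of joining
messages = drain(gui_queue)
print(messages[-1])
```

Only the main thread touches widgets in this pattern; worker threads communicate exclusively through the queue, which is why the interface stays responsive during long jobs.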
## 📦 Installation and Setup

Before you start, make sure your system meets the following requirements.

### 1. Prerequisites

* **Python 3.8+**: [Download Python](https://www.python.org/downloads/)
* **FFmpeg**:
    * Must be installed and available on the system `PATH`.
    * Windows: download a build from [gyan.dev](https://www.gyan.dev/ffmpeg/builds/), extract it, and add the `bin` directory to `PATH`.
    * macOS: install with Homebrew: `brew install ffmpeg`
    * Linux: install with your package manager, for example on Debian/Ubuntu: `sudo apt-get install ffmpeg`
* **Ollama (optional, but strongly recommended)**:
    * Powers text refinement and some of the optimization features.
    * Download and install it from the [Ollama website](https://ollama.com/).
    * After installing, pull at least one model; `qwen:14b` or `llama3` is recommended:
        ```bash
        ollama pull qwen:14b
        ```

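The prerequisites above can be checked from Python before launching any tool. The following is a small sketch, not part of the toolkit; `check_environment` is a hypothetical helper, and it only tests whether the executables are discoverable on `PATH`, not whether they work or whether an Ollama model has been pulled.

```python
import shutil
import sys


def check_environment(min_python=(3, 8), tools=("ffmpeg", "ollama")):
    """Return a dict describing which prerequisites look satisfied."""
    report = {"python": sys.version_info[:2] >= tuple(min_python)}
    for tool in tools:
        # shutil.which returns the executable's path, or None if it is not on PATH
        report[tool] = shutil.which(tool) is not None
    return report


report = check_environment()
for name, ok in report.items():
    print(f"{name}: {'OK' if ok else 'missing'}")
```

Since Ollama is optional, a missing `ollama` entry only means the LLM-based features will be unavailable.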
### 2. Install Python Dependencies

For easier management, save the following as `requirements.txt`:

```
# requirements.txt
sv-ttk
jieba
requests
edge-tts
pysrt
```

Then install everything with pip:

```bash
pip install -r requirements.txt
```

## 📖 Usage Guide and Recommended Workflow

Each tool can run on its own, but following the workflow below gives the best results.

### How to run

In a terminal, run the script you need with the `python` command, for example:

```bash
python srt_refiner_v1.py
```

### Recommended workflow

Suppose you have a raw subtitle file, `raw.srt`, generated automatically from a video.

#### **Step 1: Batch content refinement (optional)**

* **Tool**: `srt_refiner_v1.py` (Batch Subtitle Refiner)
* **Purpose**: Quickly remove colloquial expressions and errors from the raw subtitles.
* **Steps**:
    1. Run `python srt_refiner_v1.py`.
    2. Load `raw.srt`.
    3. Select an Ollama model you have already pulled.
    4. Click "开始精炼" (Start Refining) and wait for processing to finish.
    5. Save the result as `refined.srt`.

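To give a feel for what a refinement request to a local model can look like, here is a hedged sketch that talks to Ollama's local HTTP endpoint (`/api/generate` on the default port 11434). It is not taken from `srt_refiner_v1.py`: the prompt wording and the helper names `build_payload` and `refine_line` are assumptions made for illustration, and the standard library's `urllib` is used so the example has no third-party dependencies.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_payload(line, model="qwen:14b"):
    """Build a non-streaming generate request for one subtitle line."""
    prompt = (
        "Rewrite the following subtitle line so it is concise and free of "
        "filler words. Reply with the rewritten line only.\n\n" + line
    )
    return {"model": model, "prompt": prompt, "stream": False}


def refine_line(line, model="qwen:14b", timeout=60):
    """Send one line to a locally running Ollama server and return its reply."""
    data = json.dumps(build_payload(line, model)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["response"].strip()


payload = build_payload("嗯,那个,我们今天就来讲一下这个问题。")
print(payload["model"], payload["stream"])
```

Calling `refine_line(...)` requires a running Ollama server with the chosen model pulled; `build_payload` can be inspected offline.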
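#### **Step 2: Interactive refinement and proofreading**

* **Tool**: `srt_interactive_refiner_v1.0.py` (Interactive Subtitle Editor)
* **Purpose**: Manually check, edit, and polish the subtitle text so the content is accurate.
* **Steps**:
    1. Run `python srt_interactive_refiner_v1.0.py`.
    2. Load `refined.srt`.
    3. Review the subtitles line by line in the workspace on the right. You can:
        * Edit the text directly in the edit box below.
        * Click "润色" (Polish) to have the AI optimize the current line only.
        * Click "还原文本" (Restore Text) to revert to the original version.
        * Click "删除此行" (Delete Line) to remove an unnecessary subtitle.
    4. When you have reviewed everything, click "另存为..." (Save As...) and save the file as `final_text.srt`.
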
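#### **Step 3: Optimize listening feel and pacing**

* **Tool**: `srt_ultimate_optimizer_v4.0.py` (Ultimate Subtitle Optimizer)
* **Purpose**: This is the most important step. Rather than rewriting content, it reshapes the subtitles' **timing structure** so they sound more natural and fluent. The text itself changes only if LLM text reduction is enabled.
* **Steps**:
    1. Run `python srt_ultimate_optimizer_v4.0.py`.
    2. Load `final_text.srt`.
    3. Optionally check "启用LLM进行文本缩减" (Enable LLM text reduction) and select a model.
    4. Click "开始优化" (Start Optimizing). The tool then performs the following operations automatically:
        * **Compress**: shorten the duration of short lines that are spoken too slowly, bringing them to the golden speech rate.
        * **Borrow time**: when a line is too fast, borrow time from the gap that follows it.
        * **Reduce**: if borrowing is not enough, try to shorten the sentence with the LLM.
        * **Split**: break lines that are too long or too fast into several shorter ones at punctuation or semantic boundaries, adding natural pauses.
        * **Stitch**: merge fragmentary subtitles that are too short in duration and character count into a single line.
    5. When optimization finishes, preview the result on the right, then click "另存为..." (Save As...) and save the file as `optimized.srt`.

> **Lightweight alternative**: If you do not want to use an LLM, or want faster processing, use `srt_optimizer_v2.py`. It includes only the rule-based compress, split, and stitch operations.
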
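The decision between compressing, borrowing time, and falling back to text reduction or splitting can be illustrated with a small sketch. This is not the optimizer's actual algorithm: `plan_timing` is a hypothetical function, and the `golden_cps` and `tolerance` values are assumed for illustration (5.0 characters per second happens to match the `base_cps` default in `main_v1.py`).

```python
def plan_timing(char_count, duration, gap_after, golden_cps=5.0, tolerance=0.2):
    """Suggest a timing action and a new duration (seconds) for one subtitle.

    golden_cps and tolerance are illustrative values, not the optimizer's real defaults.
    """
    ideal = char_count / golden_cps  # duration at the golden speech rate
    cps = char_count / duration      # current characters per second
    if cps < golden_cps * (1 - tolerance):
        # Too slow: shorten the subtitle to its ideal duration
        return ("compress", ideal)
    if cps > golden_cps * (1 + tolerance):
        needed = ideal - duration
        if needed <= gap_after:
            # Too fast, but the following gap is long enough to absorb the overrun
            return ("borrow", ideal)
        # Not enough spare time: the text must be shortened or the line split
        return ("reduce_or_split", duration + gap_after)
    return ("keep", duration)


print(plan_timing(10, 4.0, 0.5))
print(plan_timing(20, 2.0, 3.0))
```

A 10-character line shown for 4 seconds is spoken at 2.5 characters per second, well under the assumed golden rate, so it is compressed; a 20-character line shown for 2 seconds needs 2 more seconds, which it can borrow from a 3-second gap.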
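#### **Step 4: Generate the AI voice-over and merge it with the video**

* **Tool**: `main_v1.py` (AI Video Dubbing Tool)
* **Purpose**: Turn the final subtitles into speech and merge it with the original video.
* **Steps**:
    1. Run `python main_v1.py`.
    2. Select `optimized.srt` as the SRT file.
    3. Select your original video file.
    4. Pick a voice; click "试听" (Audition) to hear a sample.
    5. Configure the other options as needed (keep the original audio, burn in subtitles, and so on).
    6. Click "开始生成" (Start). The tool will:
        * Generate a speech clip for each subtitle concurrently, adjusting the speaking rate to fit the subtitle's duration.
        * Merge all clips into a single audio track, placing each clip at its subtitle's start time.
        * Finally, combine the new track with the video and output the dubbed video.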
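
The rate adjustment in the first bullet can be worked through in isolation. The sketch below mirrors the formula used in `main_v1.py` (characters per second relative to `base_cps`, clamped to the range -90% to +200%); the function name `tts_rate` is hypothetical and exists only for this example.

```python
import re


def tts_rate(text, duration, base_cps=5.0):
    """Return an edge-tts style rate string such as '+20%' for one subtitle."""
    char_count = len(re.sub(r"\s+", "", text))  # ignore whitespace when counting
    percent = round(((char_count / duration) / base_cps - 1) * 100)
    percent = max(-90, min(200, percent))  # same clamp as main_v1.py
    return f"+{percent}%" if percent >= 0 else f"{percent}%"


print(tts_rate("abcdefghij", 2.0))  # 10 chars in 2 s = 5 cps, the base rate
print(tts_rate("abcde", 2.0))       # 2.5 cps, half the base rate
```

A line that exactly matches the base rate yields `+0%`, while one spoken at half the base rate yields `-50%`, so the voice slows down to fill the subtitle's time slot.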
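
Placing every clip at its subtitle's start time comes down to an FFmpeg filter graph. The sketch below builds the same kind of `adelay` + `amix` + `loudnorm` graph that `main_v1.py` passes to `-filter_complex`; the helper name `build_mix_filter` is an assumption for this example, and the default of -16 LUFS matches the `volume` default in the tool's configuration.

```python
def build_mix_filter(start_times_ms, target_lufs=-16.0):
    """Build a filter_complex string that delays each clip and mixes them."""
    filters = []
    labels = ""
    for i, delay in enumerate(start_times_ms):
        # Delay both stereo channels of input i by the subtitle start time
        filters.append(f"[{i}:a]adelay={delay}|{delay}[a{i}]")
        labels += f"[a{i}]"
    # normalize=0 keeps each clip at full volume instead of scaling by input count
    filters.append(f"{labels}amix=inputs={len(start_times_ms)}:normalize=0[merged]")
    filters.append(f"[merged]loudnorm=I={target_lufs}:LRA=11:TP=-1.5")
    return ";".join(filters)


graph = build_mix_filter([0, 2500])
print(graph)
```

Each input clip gets its own delay, so silence between subtitles appears naturally in the mixed track without generating padding files.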
---

main_v1.py (434 lines, normal file)
@@ -0,0 +1,434 @@
import tkinter as tk
from tkinter import filedialog, ttk, scrolledtext, messagebox
import sv_ttk
import asyncio
import edge_tts
import os
import re
import subprocess
import threading
import queue
from datetime import timedelta
import json
import uuid
import platform

# --- Default configuration ---
DEFAULT_CONFIG = { "voice": "zh-CN-XiaoxiaoNeural", "volume": -16.0, "format": "mp3", "base_cps": 5.0, "max_concurrency": 10, "audition_text": "你好,这是一个声音样本试听。", "output_dir": "", "merge_video": False, "keep_original_audio": False, "video_path": "", "keep_intermediate_audio": False, "burn_subtitles": False, "encoder": "CPU (libx264)", "subtitle_fontsize": 14 }
CONFIG_FILE = "config.json"

# --- Custom widgets ---
class CollapsiblePane(ttk.Frame):
    def __init__(self, parent, text="", initial_state='expanded'):
        super().__init__(parent); self.columnconfigure(0, weight=1); self.text = text
        self._variable = tk.BooleanVar(value=(initial_state == 'expanded'))
        self.button = ttk.Button(self, text=f"▼ {self.text}", command=self.toggle, style="TButton")
        self.button.grid(row=0, column=0, sticky="ew")
        self.content_frame = ttk.Frame(self, padding=(10, 5))
        self._variable.trace_add("write", self._update_button_text); self._update_content()
        self._update_button_text()  # show the right arrow when the pane starts collapsed
    def toggle(self): self._variable.set(not self.get()); self._update_content()
    def get(self): return self._variable.get()
    def _update_content(self):
        if self.get(): self.content_frame.grid(row=1, column=0, sticky="ew")
        else: self.content_frame.grid_remove()
    def _update_button_text(self, *args): self.button.config(text=f"{'▼' if self.get() else '▶'} {self.text}")

# --- Helper functions and classes ---
class SrtEntry:
    def __init__(self, index, start, end, text):
        self.index=int(index); self.start=self._to_timedelta(start); self.end=self._to_timedelta(end); self.text=text.strip(); self.duration=(self.end-self.start).total_seconds()
    @staticmethod
    def _to_timedelta(time_str):
        # Accept both "HH:MM:SS,mmm" and "HH:MM:SS.mmm"
        t=time_str.replace(',','.'); p=t.split('.'); m=p[0]; ms=int(p[1]) if len(p)>1 else 0
        h,mi,s=map(int,m.split(':')); return timedelta(hours=h,minutes=mi,seconds=s,milliseconds=ms)

def parse_srt(srt_content):
    entries = []; p = re.compile(r'(\d+)\s*[\r\n]+(\d{2}:\d{2}:\d{2}[,.]\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2}[,.]\d{3})\s*[\r\n]+([\s\S]+?)(?=(?:\r\n?|\n){2,}|$)', re.UNICODE)
    for m in p.finditer(srt_content): entries.append(SrtEntry(*m.groups()))
    return entries

def srt_to_ass(srt_content, style_options):
    style_options['fontsize'] = str(style_options.get('fontsize', 40)) # ensure it is a string
    ass_header = f"""[Script Info]
Title: Generated by AI Video Dubbing Tool; ScriptType: v4.00+; WrapStyle: 0; PlayResX: 1920; PlayResY: 1080
[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: {style_options['name']},{style_options['fontname']},{style_options['fontsize']},{style_options['primary_colour']},{style_options['secondary_colour']},{style_options['outline_colour']},{style_options['back_colour']},{style_options['bold']},{style_options['italic']},{style_options['underline']},{style_options['strikeout']},{style_options['scale_x']},{style_options['scale_y']},{style_options['spacing']},{style_options['angle']},{style_options['border_style']},{style_options['outline']},{style_options['shadow']},{style_options['alignment']},{style_options['margin_l']},{style_options['margin_r']},{style_options['margin_v']},{style_options['encoding']}
[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
"""
    ass_lines = []
    srt_pattern = re.compile(r'(\d+)\n(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})\n([\s\S]*?)(?=\n\n|\Z)', re.MULTILINE)
    for match in srt_pattern.finditer(srt_content):
        start_time, end_time, text = match.group(2), match.group(3), match.group(4)
        # ASS timestamps use centiseconds, so drop the last millisecond digit
        start_ass = start_time.replace(',', '.')[:-1]; end_ass = end_time.replace(',', '.')[:-1]
        text_ass = text.strip().replace('\n', '\\N')
        ass_lines.append(f"Dialogue: 0,{start_ass},{end_ass},{style_options['name']},,0,0,0,,{text_ass}")
    return ass_header + "\n".join(ass_lines)

# --- Core processing logic ---
class Processor:
    def __init__(self, srt_path, config, gui_queue):
        self.srt_path = srt_path
        self.config = config
        self.gui_queue = gui_queue
        self.base_name = os.path.splitext(os.path.basename(srt_path))[0]
        # Fall back to the SRT file's directory when no output directory is configured
        output_dir = self.config['output_dir'] or os.path.dirname(os.path.abspath(srt_path))
        self.output_dir = output_dir
        self.cache_dir = os.path.join(self.output_dir, f"{self.base_name}_cache")
        self.audio_output_path = os.path.join(self.output_dir, f"{self.base_name}.{self.config['format']}")
        self.is_cancelled = threading.Event()

    def log(self, message): self.gui_queue.put({"type": "log", "data": message})
    def update_progress(self, current, total, status_text=""): self.gui_queue.put({"type": "progress", "current": current, "total": total, "status": status_text})

    async def _generate_clip_worker(self, entry, semaphore):
        async with semaphore:
            if self.is_cancelled.is_set():
                return (entry.index, "cancelled")

            clip_path = os.path.join(self.cache_dir, f"{entry.index}.mp3")
            if os.path.exists(clip_path) and os.path.getsize(clip_path) > 0:
                self.log(f"片段 {entry.index} 缓存已存在.")
                return (entry.index, "success")

            clean_text = re.sub(r'\s+', '', entry.text)
            char_count = len(clean_text)
            if char_count < 2 or entry.duration <= 0.2:
                self.log(f"!! 警告: 片段 {entry.index} 因文本过短或时长不足而被跳过.")
                return (entry.index, "skipped")

            # Rate as a percentage offset from base_cps, clamped to [-90%, +200%]
            rate_p = round(((char_count / entry.duration) / self.config['base_cps'] - 1) * 100)
            rate_p = max(-90, min(200, rate_p))
            rate_s = f"+{rate_p}%" if rate_p >= 0 else f"{rate_p}%"

            self.log(f"生成片段 {entry.index}: 时长={entry.duration:.2f}s, 字数={char_count}, 速率={rate_s}")

            max_retries = 3
            for attempt in range(max_retries):
                if self.is_cancelled.is_set():
                    return (entry.index, "cancelled")
                try:
                    comm = edge_tts.Communicate(entry.text, self.config['voice'], rate=rate_s)
                    await comm.save(clip_path)
                    # Even if the await succeeded, confirm the file is valid
                    if not os.path.exists(clip_path) or os.path.getsize(clip_path) == 0:
                        raise ValueError("生成的音频文件为空或无效.")
                    return (entry.index, "success")  # return as soon as the clip succeeds
                except Exception as e:
                    # After an exception, re-check the file: a network hiccup can make
                    # the library raise even though the file was written
                    if os.path.exists(clip_path) and os.path.getsize(clip_path) > 0:
                        self.log(f"片段 {entry.index} 捕获到异常,但文件已成功生成,视为成功。")
                        return (entry.index, "success")

                    # The file really is missing: retry with exponential backoff
                    backoff_time = 2 ** (attempt + 1)  # 2, 4, 8 seconds
                    self.log(f"!! 错误: 片段 {entry.index} (尝试 {attempt + 1}/{max_retries}): {e}")
                    if attempt < max_retries - 1:
                        self.log(f"   将在 {backoff_time} 秒后重试...")
                        await asyncio.sleep(backoff_time)

            self.log(f"!! 严重: 片段 {entry.index} 在多次尝试后仍然失败,将被跳过。")
            return (entry.index, "failed")

    async def _generate_all_clips_async(self, entries):
        self.log(f"\n--- 阶段1: 生成语音 (并发数: {self.config['max_concurrency']}) ---")
        sem = asyncio.Semaphore(self.config['max_concurrency'])
        tasks = [self._generate_clip_worker(e, sem) for e in entries]

        successful_clips = 0
        failed_clips = []

        for i, f in enumerate(asyncio.as_completed(tasks)):
            if self.is_cancelled.is_set():
                raise InterruptedError("任务取消")

            index, status = await f
            if status == "success":
                successful_clips += 1
            elif status == "failed":
                failed_clips.append(index)

            self.update_progress(i + 1, self.total_steps, f"生成片段 {i+1}/{len(entries)}")

        if failed_clips:
            self.log(f"\n!! 警告: 以下语音片段生成失败: {', '.join(map(str, sorted(failed_clips)))}")

        if successful_clips == 0:
            raise RuntimeError("没有任何语音片段成功生成,任务中止。")

        self.log(f"语音生成完成: {successful_clips} 个成功, {len(failed_clips)} 个失败。")

    def _merge_media_sync(self, entries):
        self.log(f"\n--- 阶段2: 合并音频 ---")
        # Only merge clips that exist in the cache and are non-empty
        clips = sorted([e for e in entries if os.path.exists(os.path.join(self.cache_dir, f"{e.index}.mp3")) and os.path.getsize(os.path.join(self.cache_dir, f"{e.index}.mp3")) > 0], key=lambda e: e.index)

        if not clips:
            self.log("!! 错误: 没有可用的音频片段进行合并。")
            raise RuntimeError("没有可用的音频片段。")

        cmd=['ffmpeg','-y']
        filters=[]
        inputs=""
        for i,e in enumerate(clips):
            cmd.extend(['-i', os.path.join(self.cache_dir, f"{e.index}.mp3")])
            # Delay each clip to its subtitle start time (milliseconds, both channels)
            delay=int(e.start.total_seconds()*1000)
            filters.append(f"[{i}:a]adelay={delay}|{delay}[a{i}]")
            inputs += f"[a{i}]"

        # Mix all delayed clips without volume scaling, then normalize loudness
        filters.extend([f"{inputs}amix=inputs={len(clips)}:normalize=0[merged]", f"[merged]loudnorm=I={self.config['volume']}:LRA=11:TP=-1.5"])
        cmd.extend(['-filter_complex', ";".join(filters), self.audio_output_path])
        if self.run_ffmpeg_sync(cmd, "合并音频") != 0: raise RuntimeError("音频合并失败")
        self.log("音频合并成功!")

        if self.is_cancelled.is_set(): raise InterruptedError("任务取消")

        if self.config['merge_video']:
            self.log("\n--- 阶段3: 与视频合成 ---"); self.update_progress(len(entries), self.total_steps, "视频合成中...")
            video_path=self.config['video_path']; v_ext=os.path.splitext(video_path)[1]; v_out_path=os.path.join(self.output_dir,f"{self.base_name}_dubbed{v_ext}"); cmd=['ffmpeg','-y','-i',video_path,'-i',self.audio_output_path]
            video_codec, audio_codec = self.get_codecs()
            vf_options = []
            if self.config['burn_subtitles']:
                # Escape the path for the FFmpeg filter syntax (forward slashes, escaped colons)
                ass_path = self.prepare_ass_subtitle(); escaped_ass_path = ass_path.replace('\\', '/').replace(':', '\\:')
                vf_options.append(f"subtitles='{escaped_ass_path}'"); self.log(f"准备烧录字幕: {os.path.basename(ass_path)}")
            if self.config['keep_original_audio']:
                self.log("模式: 创建双音轨"); locale=self.config['voice'].split('-')[0]
                cmd.extend(['-map','0:v:0','-map','0:a:0','-map','1:a:0', '-metadata:s:a:0','language=und','-metadata:s:a:0','title=Original', '-metadata:s:a:1',f'language={locale}','-metadata:s:a:1','title=AI Dubbing', '-disposition:a:1','default'])
            else: self.log("模式: 替换音轨"); cmd.extend(['-map','0:v:0','-map','1:a:0'])
            cmd.extend(['-c:v', video_codec, '-c:a', audio_codec])
            if vf_options: cmd.extend(['-vf', ",".join(vf_options)])
            cmd.append(v_out_path)
            if self.run_ffmpeg_sync(cmd, "视频合成") != 0: raise RuntimeError("视频合成失败")
            self.log(f"视频合成成功! 输出: {v_out_path}")

        if self.config['merge_video'] and not self.config['keep_intermediate_audio']:
            try: self.log("删除中间音频文件..."); os.remove(self.audio_output_path)
            except OSError: pass

    def get_codecs(self):
        encoder_map = {"CPU (libx264)": "libx264", "NVIDIA (h264_nvenc)": "h264_nvenc", "AMD (h264_amf)": "h264_amf", "Intel (h264_qsv)": "h264_qsv"}
        audio_codec = "aac"
        # Burning subtitles requires re-encoding the video; otherwise streams are copied
        if self.config['burn_subtitles']: return encoder_map.get(self.config['encoder'], "libx264"), audio_codec
        else: return "copy", "copy"
    def prepare_ass_subtitle(self):
        style = {'name': 'Default','fontname': '微软雅黑','fontsize': self.config['subtitle_fontsize'],'primary_colour': '&H00FFFFFF','secondary_colour': '&H000000FF','outline_colour': '&H00000000','back_colour': '&H00000000','bold': '0','italic': '0','underline': '0','strikeout': '0','scale_x': '100','scale_y': '100','spacing': '0','angle': '0','border_style': '1','outline': '2','shadow': '1','alignment': '2','margin_l': '10','margin_r': '10','margin_v': '30','encoding': '1'}
        ass_content = srt_to_ass(self.srt_content, style); ass_path = os.path.join(self.cache_dir, f"{self.base_name}.ass")
        with open(ass_path, 'w', encoding='utf-8') as f: f.write(ass_content)
        return ass_path

    def run_ffmpeg_sync(self, cmd, stage="FFmpeg"):
        self.log(f"执行 {stage} 命令...")
        creationflags = 0
        # Avoid flashing a console window on Windows
        if platform.system() == "Windows": creationflags = subprocess.CREATE_NO_WINDOW
        # errors='replace' keeps non-UTF-8 bytes in FFmpeg output from raising UnicodeDecodeError
        process = subprocess.run(cmd, capture_output=True, text=True, encoding='utf-8', errors='replace', creationflags=creationflags)
        if process.returncode != 0:
            self.log(f"!! {stage} 失败,返回码: {process.returncode}")
            self.log("--- FFmpeg 错误输出 ---"); self.log(process.stderr); self.log("------------------------")
        return process.returncode

    def run(self):
        try:
            self.log("--- 开始处理 ---"); os.makedirs(self.cache_dir, exist_ok=True)
            with open(self.srt_path, 'r', encoding='utf-8-sig') as f: self.srt_content=f.read()
            entries = parse_srt(self.srt_content)
            if not entries: raise RuntimeError("解析SRT失败")
            self.log(f"解析到 {len(entries)} 条字幕.")
            # One step per subtitle, plus one for the optional video merge
            self.total_steps=len(entries) + (1 if self.config['merge_video'] else 0)
            asyncio.run(self._generate_all_clips_async(entries))
            if self.is_cancelled.is_set(): raise InterruptedError("任务取消")
            self._merge_media_sync(entries)
            self.update_progress(self.total_steps, self.total_steps, "全部完成!"); self.log("\n--- 全部完成!---")
            self.gui_queue.put({"type":"finish", "success":True, "output_dir":self.output_dir})
        except Exception as e:
            # Log the reason for every failure, including cancellations and RuntimeError
            if isinstance(e, (InterruptedError, RuntimeError)): self.log(f"!! 任务中止: {e}")
            else: self.log(f"发生未知严重错误: {e}")
            self.gui_queue.put({"type": "finish", "success": False})

# --- GUI application ---
class App:
    def __init__(self, root):
        self.root = root; self.root.title("AI视频配音合成工具 V15.3 (终极优化版)"); self.root.geometry("800x800")
        self.load_config(); self.gui_queue = queue.Queue()
        self.processor_thread = None  # set by start_processing, checked by cancel_processing
        self.vars = { "srt_path": tk.StringVar(), "output_dir": tk.StringVar(value=self.config["output_dir"]), "voice": tk.StringVar(value=self.config["voice"]), "volume": tk.DoubleVar(value=self.config["volume"]), "format": tk.StringVar(value=self.config["format"]), "status": tk.StringVar(value="准备就绪"), "merge_video": tk.BooleanVar(value=self.config["merge_video"]), "keep_original_audio": tk.BooleanVar(value=self.config["keep_original_audio"]), "video_path": tk.StringVar(value=self.config["video_path"]), "keep_intermediate_audio": tk.BooleanVar(value=self.config["keep_intermediate_audio"]), "burn_subtitles": tk.BooleanVar(value=self.config["burn_subtitles"]), "encoder": tk.StringVar(value=self.config["encoder"]), "subtitle_fontsize": tk.IntVar(value=self.config["subtitle_fontsize"]) }
        self.build_ui(); sv_ttk.set_theme("dark")
        self.root.after(100, self.load_voices); self.root.after(100, self.process_queue); self.root.protocol("WM_DELETE_WINDOW", self.on_closing)
    def log(self, message): self.gui_queue.put({"type": "log", "data": message})
    def build_ui(self):
        main_frame = ttk.Frame(self.root, padding=10); main_frame.pack(fill=tk.BOTH, expand=True); main_frame.rowconfigure(4, weight=1); main_frame.columnconfigure(0, weight=1)
        pane1 = CollapsiblePane(main_frame, "基础设置"); pane1.grid(row=0, column=0, sticky="ew", pady=(0, 5))
        f1=ttk.Frame(pane1.content_frame); f1.pack(fill=tk.X,expand=True,pady=(0,5)); ttk.Label(f1,text="SRT文件:").pack(side=tk.LEFT,padx=(0,5)); ttk.Entry(f1,textvariable=self.vars['srt_path']).pack(side=tk.LEFT,fill=tk.X,expand=True,padx=(0,5)); ttk.Button(f1,text="选择...",command=self.select_srt_file).pack(side=tk.LEFT)
        f2=ttk.Frame(pane1.content_frame); f2.pack(fill=tk.X,expand=True); ttk.Label(f2,text="输出目录:").pack(side=tk.LEFT,padx=(0,5)); ttk.Entry(f2,textvariable=self.vars['output_dir']).pack(side=tk.LEFT,fill=tk.X,expand=True,padx=(0,5)); ttk.Button(f2,text="选择...",command=self.select_output_dir).pack(side=tk.LEFT,padx=(0,5)); self.open_dir_button=ttk.Button(f2,text="打开",command=self.open_output_dir,state="disabled"); self.open_dir_button.pack(side=tk.LEFT)
        pane3 = CollapsiblePane(main_frame, "音频参数"); pane3.grid(row=1, column=0, sticky="ew", pady=(0, 5))
        f3=ttk.Frame(pane3.content_frame); f3.pack(fill=tk.X,expand=True,pady=2); ttk.Label(f3,text="配音员:").pack(side=tk.LEFT,padx=(0,10)); self.voice_combo=ttk.Combobox(f3,textvariable=self.vars['voice'],state="readonly",width=30); self.voice_combo.pack(side=tk.LEFT,fill=tk.X,expand=True); self.audition_button=ttk.Button(f3,text="试听",command=self.audition_voice,state="disabled"); self.audition_button.pack(side=tk.LEFT,padx=(5,0))
        f4=ttk.Frame(pane3.content_frame); f4.pack(fill=tk.X,expand=True,pady=2); ttk.Label(f4,text="响度:").pack(side=tk.LEFT,padx=(0,10)); self.volume_scale=ttk.Scale(f4,from_=-24,to=-12,orient=tk.HORIZONTAL,variable=self.vars['volume'],command=lambda v:self.volume_label.config(text=f"{float(v):.1f} LUFS")); self.volume_scale.pack(side=tk.LEFT,fill=tk.X,expand=True); self.volume_label=ttk.Label(f4,text=f"{self.vars['volume'].get():.1f} LUFS",width=10); self.volume_label.pack(side=tk.LEFT,padx=(5,0))
        f5=ttk.Frame(pane3.content_frame); f5.pack(fill=tk.X,expand=True,pady=2); ttk.Label(f5,text="格式:").pack(side=tk.LEFT,padx=(0,10)); self.format_combo=ttk.Combobox(f5,textvariable=self.vars['format'],values=["mp3","wav","aac"],state="readonly"); self.format_combo.pack(side=tk.LEFT,fill=tk.X,expand=True)
        pane2 = CollapsiblePane(main_frame, "视频合成 (可选)", initial_state='collapsed'); pane2.grid(row=2, column=0, sticky="ew", pady=(0, 5))
        ttk.Checkbutton(pane2.content_frame, text="启用视频合成功能", variable=self.vars['merge_video']).pack(anchor='w')
        f6=ttk.Frame(pane2.content_frame); f6.pack(fill=tk.X,expand=True,pady=5); ttk.Label(f6,text="视频文件:").pack(side=tk.LEFT,padx=(0,5)); ttk.Entry(f6,textvariable=self.vars['video_path']).pack(side=tk.LEFT,fill=tk.X,expand=True,padx=(0,5)); ttk.Button(f6,text="选择...",command=self.select_video_file).pack(side=tk.LEFT)
        f7=ttk.Frame(pane2.content_frame); f7.pack(fill=tk.X,expand=True); ttk.Checkbutton(f7,text="保留原始音轨",variable=self.vars['keep_original_audio']).pack(side=tk.LEFT); ttk.Checkbutton(f7,text="保留独立音频",variable=self.vars['keep_intermediate_audio']).pack(side=tk.LEFT,padx=(20,0))
        self.pane4 = CollapsiblePane(main_frame, "字幕烧录 (可选)", initial_state='collapsed'); self.pane4.grid(row=3, column=0, sticky="ew", pady=(0, 10))
        ttk.Checkbutton(self.pane4.content_frame, text="将字幕烧录到视频画面中", variable=self.vars['burn_subtitles']).pack(anchor='w')
        self.encoder_frame = ttk.Frame(self.pane4.content_frame); self.encoder_frame.pack(fill=tk.X, expand=True, pady=5)
        f_enc1=ttk.Frame(self.encoder_frame); f_enc1.pack(fill=tk.X, expand=True); ttk.Label(f_enc1, text="编码器:").pack(side=tk.LEFT, padx=(0,5)); self.encoder_combo = ttk.Combobox(f_enc1, textvariable=self.vars['encoder'], state="readonly"); self.encoder_combo.pack(side=tk.LEFT, fill=tk.X, expand=True)
        f_enc2=ttk.Frame(self.encoder_frame); f_enc2.pack(fill=tk.X, expand=True, pady=5); ttk.Label(f_enc2, text="字幕字号:").pack(side=tk.LEFT, padx=(0,5)); ttk.Spinbox(f_enc2, from_=10, to=100, textvariable=self.vars['subtitle_fontsize'], width=10).pack(side=tk.LEFT); self.preview_button = ttk.Button(f_enc2, text="生成5秒预览", command=self.create_preview); self.preview_button.pack(side=tk.LEFT, padx=(10,0))
        log_pane=ttk.Frame(main_frame); log_pane.grid(row=4,column=0,sticky="nsew"); log_pane.rowconfigure(1,weight=1); log_pane.columnconfigure(0,weight=1)
        f8=ttk.Frame(log_pane); f8.pack(fill=tk.X,pady=(0,5)); self.progress_bar=ttk.Progressbar(f8,orient=tk.HORIZONTAL); self.progress_bar.pack(fill=tk.X,expand=True,side=tk.LEFT,padx=(0,10)); self.status_label=ttk.Label(f8,textvariable=self.vars['status']); self.status_label.pack(side=tk.LEFT)
        self.log_text=scrolledtext.ScrolledText(log_pane,wrap=tk.WORD,state="disabled"); self.log_text.pack(fill=tk.BOTH,expand=True)
        f9=ttk.Frame(main_frame); f9.grid(row=5,column=0,sticky="ew",pady=(10,0)); f9.columnconfigure(0,weight=1)
        self.start_button=ttk.Button(f9,text="开始生成",command=self.start_processing,style="Accent.TButton"); self.start_button.pack(side=tk.RIGHT)
        self.cancel_button=ttk.Button(f9,text="取消任务",command=self.cancel_processing,state="disabled"); self.cancel_button.pack(side=tk.RIGHT,padx=(0,5))
    def load_config(self):
        try:
            with open(CONFIG_FILE,'r',encoding='utf-8') as f: self.config=json.load(f)
        # Copy the defaults so later edits to self.config never mutate DEFAULT_CONFIG
        except(FileNotFoundError,json.JSONDecodeError): self.config=DEFAULT_CONFIG.copy()
        for key,value in DEFAULT_CONFIG.items(): self.config.setdefault(key,value)
    def save_config(self):
        for key, var in self.vars.items(): self.config[key] = var.get().split(' ')[0] if key == "voice" else var.get()
        with open(CONFIG_FILE,'w',encoding='utf-8') as f: json.dump(self.config, f, indent=4, ensure_ascii=False)
    def on_closing(self): self.save_config(); self.root.destroy()
    def select_srt_file(self):
        path=filedialog.askopenfilename(filetypes=[("SRT Subtitles","*.srt")])
        if path:
            self.vars['srt_path'].set(path); srt_dir=os.path.dirname(path); base_name=os.path.splitext(os.path.basename(path))[0]
            if not self.vars['output_dir'].get(): self.vars['output_dir'].set(srt_dir); self.open_dir_button.config(state="normal")
            # Auto-select a video with the same base name next to the SRT file
            for ext in ['.mp4','.mkv','.avi','.mov','.webm']:
                if os.path.exists(os.path.join(srt_dir,base_name+ext)): self.vars['video_path'].set(os.path.join(srt_dir,base_name+ext)); break
    def select_output_dir(self):
        path=filedialog.askdirectory(title="选择输出目录")
        if path: self.vars['output_dir'].set(path); self.open_dir_button.config(state="normal")
    def open_output_dir(self):
        path=self.vars['output_dir'].get()
        if path and os.path.isdir(path):
            if platform.system() == "Windows": os.startfile(path)
            # 'open' exists only on macOS; Linux desktops use 'xdg-open'
            else: subprocess.Popen(['open' if platform.system() == "Darwin" else 'xdg-open', path])
        else: messagebox.showwarning("目录无效", "输出目录不存在或无效.")
    def select_video_file(self):
        path=filedialog.askopenfilename(filetypes=[("Video Files","*.mp4 *.mkv *.avi *.mov"),("All Files", "*.*")])
        if path: self.vars['video_path'].set(path)
    def audition_voice(self):
        voice=self.vars['voice'].get().split(' ')[0]
        if not voice: return
        self.audition_button.config(state="disabled"); self.log(f"正在试听 {voice}...")
        def _audition():
            try:
                out_dir = os.path.dirname(os.path.abspath(__file__))
                cache=os.path.join(out_dir,"_audition_temp_cache"); os.makedirs(cache,exist_ok=True); tmp_file=os.path.join(cache,"_audition_temp.mp3")
                async def save():
                    comm = edge_tts.Communicate(self.config['audition_text'], voice)
                    await comm.save(tmp_file)
                asyncio.run(save())
                if os.path.exists(tmp_file) and os.path.getsize(tmp_file) > 0:
                    if platform.system() == "Windows": os.startfile(tmp_file)
                    # 'open' on macOS, 'xdg-open' on Linux desktops
                    else: subprocess.Popen(['open' if platform.system() == "Darwin" else 'xdg-open', tmp_file])
                else: raise Exception("生成的试听文件为空。")
            except Exception as e: self.gui_queue.put({"type":"log", "data":f"!! 试听失败: {e}"})
            finally: self.gui_queue.put({"type":"audition_done"})
        threading.Thread(target=_audition,daemon=True).start()

    def create_preview(self):
        srt_path = self.vars['srt_path'].get(); video_path = self.vars['video_path'].get()
        if not srt_path or not os.path.exists(srt_path): messagebox.showerror("错误","请先选择有效的SRT文件。"); return
        if not video_path or not os.path.exists(video_path): messagebox.showerror("错误","请先选择有效的视频文件。"); return

        self.preview_button.config(state="disabled"); self.log("正在生成预览切片...")
        def _preview():
            try:
                with open(srt_path, 'r', encoding='utf-8-sig') as f: content=f.read()
                entries = parse_srt(content)
                if not entries: raise Exception("SRT文件为空或格式错误。")

                # Preview the longest subtitle, with one second of padding on each side
                preview_entry = sorted(entries, key=lambda e: e.duration, reverse=True)[0]

                out_dir = self.vars['output_dir'].get() or os.path.dirname(srt_path)
                preview_file = os.path.join(out_dir, "_preview.mp4")

                style = {'name': 'Default','fontname': '微软雅黑','fontsize': self.vars['subtitle_fontsize'].get(),'primary_colour': '&H00FFFFFF','secondary_colour': '&H000000FF','outline_colour': '&H00000000','back_colour': '&H00000000','bold': '0','italic': '0','underline': '0','strikeout': '0','scale_x': '100','scale_y': '100','spacing': '0','angle': '0','border_style': '1','outline': '2','shadow': '1','alignment': '2','margin_l': '10','margin_r': '10','margin_v': '30','encoding': '1'}
                ass_content = srt_to_ass(content, style)
                ass_path = os.path.join(out_dir, "_preview.ass")
                with open(ass_path, 'w', encoding='utf-8') as f: f.write(ass_content)

                start_time = max(0, preview_entry.start.total_seconds() - 1)
                duration = preview_entry.duration + 2

                escaped_ass_path = ass_path.replace('\\', '/').replace(':', '\\:')
                vf = f"subtitles='{escaped_ass_path}'"

                # -ss/-t go after -i (output seeking) so the subtitles filter sees the
                # original timestamps; seeking the input would reset them to zero and
                # burn in the subtitles from the start of the file instead
                cmd = ['ffmpeg', '-y', '-i', video_path, '-ss', str(start_time), '-t', str(duration), '-vf', vf, '-c:v', 'libx264', '-preset', 'ultrafast', '-an', preview_file]

                creationflags = subprocess.CREATE_NO_WINDOW if platform.system() == "Windows" else 0
                result = subprocess.run(cmd, capture_output=True, text=True, encoding='utf-8', errors='replace', creationflags=creationflags)

                if result.returncode != 0: raise Exception(f"FFmpeg预览生成失败:\n{result.stderr}")

                if os.path.exists(preview_file):
                    if platform.system() == "Windows": os.startfile(preview_file)
                    # 'open' on macOS, 'xdg-open' on Linux desktops
                    else: subprocess.Popen(['open' if platform.system() == "Darwin" else 'xdg-open', preview_file])

            except Exception as e: self.gui_queue.put({"type":"log", "data":f"!! 预览失败: {e}"})
            finally: self.gui_queue.put({"type":"preview_done"})
        threading.Thread(target=_preview, daemon=True).start()

    def start_processing(self):
        if not self.vars['srt_path'].get() or not os.path.exists(self.vars['srt_path'].get()): messagebox.showerror("错误","请选择有效的SRT文件."); return
        if not self.vars['output_dir'].get() or not os.path.isdir(self.vars['output_dir'].get()): messagebox.showerror("错误","请选择有效的输出目录."); return
        if self.vars['merge_video'].get() and (not self.vars['video_path'].get() or not os.path.exists(self.vars['video_path'].get())): messagebox.showerror("错误","请选择有效的视频文件进行合并."); return
        self.start_button.config(state="disabled"); self.cancel_button.config(state="normal")
        self.log_text.config(state="normal"); self.log_text.delete(1.0,tk.END); self.log_text.config(state="disabled")
        # Snapshot the current UI values so the worker thread never reads Tk variables
        current_config=self.config.copy()
        for key, var in self.vars.items(): current_config[key] = var.get().split(' ')[0] if key == "voice" else var.get()
        processor=Processor(self.vars['srt_path'].get(),current_config,self.gui_queue); self.processor_thread=threading.Thread(target=processor.run,daemon=True); self.processor_thread.processor_instance=processor; self.processor_thread.start()

    def cancel_processing(self):
        thread = getattr(self, 'processor_thread', None)  # not set until the first run starts
        if thread and thread.is_alive(): self.log("正在发送取消信号..."); thread.processor_instance.is_cancelled.set()

def load_voices(self):
|
||||
self.log("正在获取可用配音员列表...")
|
||||
def _load():
|
||||
try:
|
||||
encoders = ["CPU (libx264)"]
|
||||
try:
|
||||
result = subprocess.run(['ffmpeg', '-hide_banner', '-encoders'], capture_output=True, text=True, encoding='utf-8', errors='replace', timeout=5)
|
||||
if 'h264_nvenc' in result.stdout: encoders.append("NVIDIA (h264_nvenc)")
|
||||
if 'h264_amf' in result.stdout: encoders.append("AMD (h264_amf)")
|
||||
if 'h264_qsv' in result.stdout: encoders.append("Intel (h264_qsv)")
|
||||
except Exception: pass
|
||||
|
||||
voices = asyncio.run(edge_tts.list_voices())
|
||||
self.gui_queue.put({"type":"init_data", "voices":voices, "encoders":encoders})
|
||||
except Exception as e:
|
||||
self.gui_queue.put({"type":"log", "data":f"!! 致命错误: 获取配音员列表失败: {e}"})
|
||||
if "401" in str(e): self.gui_queue.put({"type":"log", "data":"!! [建议]: 认证失败, 请升级 'pip install --upgrade edge-tts'"})
|
||||
threading.Thread(target=_load, daemon=True).start()
|
||||
|
||||
def process_queue(self):
|
||||
try:
|
||||
while True:
|
||||
msg=self.gui_queue.get_nowait()
|
||||
msg_type = msg.get("type")
|
||||
if msg_type=="log": self.log_text.config(state="normal"); self.log_text.insert(tk.END, msg["data"] + "\n"); self.log_text.see(tk.END); self.log_text.config(state="disabled")
|
||||
elif msg_type=="progress": self.progress_bar['maximum']=msg['total']; self.progress_bar['value']=msg['current']; self.vars['status'].set(msg['status'])
|
||||
elif msg_type=="finish":
|
||||
self.start_button.config(state="normal"); self.cancel_button.config(state="disabled")
|
||||
if msg["success"]: self.vars['status'].set("任务完成!"); self.vars['output_dir'].set(msg["output_dir"]); self.open_dir_button.config(state="normal"); messagebox.showinfo("成功", "任务已成功完成!")
|
||||
else: self.vars['status'].set("任务失败或被取消"); messagebox.showwarning("任务中断", "任务失败或被用户取消。")
|
||||
elif msg_type=="init_data":
|
||||
voices_list=msg["voices"]; d_names=[f"{v['ShortName']} - {v['Gender']} ({v['Locale']})" for v in voices_list]
|
||||
self.voice_combo['values']=d_names; saved_voice=self.config['voice']
|
||||
for i,name in enumerate(d_names):
|
||||
if name.startswith(saved_voice): self.voice_combo.current(i); break
|
||||
else:
|
||||
if d_names: self.voice_combo.current(0)
|
||||
self.log("配音员列表加载完毕。"); self.audition_button.config(state="normal")
|
||||
encoders = msg["encoders"]; self.encoder_combo['values'] = encoders
|
||||
encoder_priority = ["NVIDIA", "AMD", "Intel", "CPU"]
|
||||
best_encoder = next((enc for prio in encoder_priority for enc in encoders if prio in enc), encoders[0])
|
||||
if self.vars['encoder'].get() not in encoders: self.vars['encoder'].set(best_encoder); self.log(f"自动选择最佳编码器: {best_encoder}")
|
||||
self.log(f"检测到可用编码器: {', '.join(encoders)}")
|
||||
elif msg_type=="audition_done": self.audition_button.config(state="normal"); self.log("试听完毕。")
|
||||
elif msg_type=="preview_done": self.preview_button.config(state="normal"); self.log("预览生成完毕。")
|
||||
except queue.Empty: pass
|
||||
finally: self.root.after(100, self.process_queue)
|
||||
|
||||
if __name__ == "__main__":
|
||||
if platform.system() == "Windows": asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
|
||||
root = tk.Tk(); app = App(root); root.mainloop()
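
The encoder handling in `load_voices` and `process_queue` above can be condensed into two pure functions. This is a minimal sketch rather than the tool's actual structure: `detect_encoders` and `pick_best_encoder` are hypothetical helper names, and the sample string only stands in for the text that `ffmpeg -hide_banner -encoders` prints.

```python
from typing import List, Sequence

# Same priority order as in process_queue: hardware encoders first, CPU last.
ENCODER_PRIORITY = ("NVIDIA", "AMD", "Intel", "CPU")


def detect_encoders(ffmpeg_encoders_output: str) -> List[str]:
    """Turn the text printed by `ffmpeg -hide_banner -encoders` into display labels."""
    encoders = ["CPU (libx264)"]  # the software encoder is always offered
    for codec, vendor in (("h264_nvenc", "NVIDIA"), ("h264_amf", "AMD"), ("h264_qsv", "Intel")):
        if codec in ffmpeg_encoders_output:
            encoders.append(f"{vendor} ({codec})")
    return encoders


def pick_best_encoder(encoders: Sequence[str], priority: Sequence[str] = ENCODER_PRIORITY) -> str:
    """Return the first label that matches the highest-priority vendor, else the first label."""
    return next((enc for prio in priority for enc in encoders if prio in enc), encoders[0])


# Illustrative input only; real FFmpeg output has many more lines.
sample_output = " V....D h264_qsv   H.264 (Intel Quick Sync Video)\n V....D libx264    H.264 (libx264)"
available = detect_encoders(sample_output)
best = pick_best_encoder(available)
```

Keeping the selection logic free of Tkinter calls makes it easy to check without starting the GUI or having FFmpeg installed.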
|
||||
7
requirements.txt
Normal file
@@ -0,0 +1,7 @@
|
||||
requests
|
||||
edge-tts
|
||||
sv-ttk
|
||||
jieba
|
||||
srt
|
||||
playsound
|
||||
pysrt
|
||||
395
srt_interactive_refiner_v1.0.py
Normal file
@@ -0,0 +1,395 @@
|
||||
# filename: srt_interactive_refiner_v2.0.py
|
||||
|
||||
import tkinter as tk
|
||||
from tkinter import filedialog, ttk, messagebox, scrolledtext
|
||||
import sv_ttk
|
||||
import re
|
||||
from datetime import timedelta
|
||||
import os
|
||||
import requests
|
||||
import json
|
||||
import threading
|
||||
import queue
|
||||
import copy
|
||||
|
||||
# --- Configuration ---
|
||||
OLLAMA_HOST = "http://127.0.0.1:11434"
|
||||
|
||||
# --- SRT core class and parsing function ---
|
||||
class SrtEntry:
|
||||
def __init__(self, index, start_td, end_td, text, original_index=None):
|
||||
self.index = index
|
||||
self.start_td = start_td
|
||||
self.end_td = end_td
|
||||
self.text = text.strip()
|
||||
self.original_index = original_index if original_index is not None else index
|
||||
|
||||
@property
|
||||
def start_str(self): return self._td_to_str(self.start_td)
|
||||
@property
|
||||
def end_str(self): return self._td_to_str(self.end_td)
|
||||
|
||||
@staticmethod
|
||||
def _td_to_str(td):
|
||||
total_ms = td // timedelta(milliseconds=1)  # integer math: float subtraction can truncate e.g. 1.001 s to ",000"
total_seconds, ms = divmod(total_ms, 1000)
h, m, s = total_seconds // 3600, (total_seconds % 3600) // 60, total_seconds % 60
return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"
|
||||
|
||||
def to_srt_block(self):
|
||||
return f"{self.index}\n{self.start_str} --> {self.end_str}\n{self.text}\n\n"
|
||||
|
||||
def parse_srt(content):
|
||||
entries = []
|
||||
pattern = re.compile(r'(\d+)\n(\d{2}:\d{2}:\d{2},\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2},\d{3})\n([\s\S]*?)(?=\n\n|\Z)', re.MULTILINE)
|
||||
def to_td(time_str):
|
||||
h, m, s, ms = map(int, re.split('[:,]', time_str))
|
||||
return timedelta(hours=h, minutes=m, seconds=s, milliseconds=ms)
|
||||
for match in pattern.finditer(content):
|
||||
index = int(match.group(1))
|
||||
entries.append(SrtEntry(index, to_td(match.group(2)), to_td(match.group(3)), match.group(4)))
|
||||
return entries
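
The timestamp round trip performed by `parse_srt` and `SrtEntry._td_to_str` can be sketched on its own. The version below is an illustration under stated assumptions: the function names `srt_time_to_td` and `td_to_srt_time` are hypothetical, and the formatting step uses integer arithmetic on whole milliseconds so that no millisecond is lost to floating-point truncation.

```python
import re
from datetime import timedelta


def srt_time_to_td(time_str: str) -> timedelta:
    """Parse an SRT timestamp such as '00:01:02,345' into a timedelta."""
    h, m, s, ms = map(int, re.split(r"[:,]", time_str))
    return timedelta(hours=h, minutes=m, seconds=s, milliseconds=ms)


def td_to_srt_time(td: timedelta) -> str:
    """Format a non-negative timedelta as an SRT timestamp."""
    total_ms = td // timedelta(milliseconds=1)  # timedelta // timedelta yields an int
    total_seconds, ms = divmod(total_ms, 1000)
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{ms:03d}"


round_trip = td_to_srt_time(srt_time_to_td("00:00:01,001"))
```

Working in integer milliseconds matters because a value such as 1.001 seconds has no exact binary representation, so subtracting the whole seconds and multiplying by 1000 can land just below the intended millisecond.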
|
||||
|
||||
# --- GUI application ---
|
||||
class App:
|
||||
def __init__(self, root):
|
||||
self.root = root
|
||||
self.root.title("交互式字幕编辑器 V2.0 (最终稳定版)")
|
||||
self.root.geometry("1400x800")
|
||||
|
||||
self.srt_path = ""
|
||||
self.original_entries = []
|
||||
self.working_entries = []
|
||||
self.current_selected_work_iid = None
|
||||
self.current_selected_orig_iid = None
|
||||
|
||||
self.gui_queue = queue.Queue()
|
||||
self.is_ollama_available = False
|
||||
self.vars = { "ollama_model": tk.StringVar(), "status": tk.StringVar(value="准备就绪") }
|
||||
|
||||
self.build_ui()
|
||||
sv_ttk.set_theme("dark")
|
||||
|
||||
self.root.after(100, self.process_queue)
|
||||
self.root.after(100, self.load_ollama_models)
|
||||
|
||||
def build_ui(self):
|
||||
main_pane = ttk.PanedWindow(self.root, orient=tk.HORIZONTAL)
|
||||
main_pane.pack(fill=tk.BOTH, expand=True, padx=10, pady=10)
|
||||
|
||||
left_pane = ttk.Frame(main_pane)
|
||||
main_pane.add(left_pane, weight=2)
|
||||
|
||||
control_frame = ttk.Frame(left_pane)
|
||||
control_frame.pack(fill=tk.X, pady=(0, 10))
|
||||
|
||||
self.load_btn = ttk.Button(control_frame, text="加载SRT", command=self.load_srt)
|
||||
self.load_btn.pack(side=tk.LEFT, padx=(0, 5))
|
||||
self.save_btn = ttk.Button(control_frame, text="另存为...", command=self.save_srt, state="disabled")
|
||||
self.save_btn.pack(side=tk.LEFT, padx=(0, 5))
|
||||
self.refine_all_btn = ttk.Button(control_frame, text="🚀 一键润色", command=self.refine_all_lines, state="disabled", style="Accent.TButton")
|
||||
self.refine_all_btn.pack(side=tk.LEFT, padx=(10,5))
|
||||
ttk.Label(control_frame, text="Ollama:").pack(side=tk.LEFT, padx=(10, 5))
|
||||
self.model_combo = ttk.Combobox(control_frame, textvariable=self.vars['ollama_model'], state="readonly", width=25)
|
||||
self.model_combo.pack(side=tk.LEFT, fill=tk.X, expand=True)
|
||||
|
||||
orig_frame = ttk.Labelframe(left_pane, text="原始字幕 (对照区)")
|
||||
orig_frame.pack(fill=tk.BOTH, expand=True)
|
||||
self.tree_orig = self.create_treeview(orig_frame)
|
||||
|
||||
right_pane = ttk.Frame(main_pane)
|
||||
main_pane.add(right_pane, weight=3)
|
||||
|
||||
work_frame = ttk.Labelframe(right_pane, text="工作区 (可编辑)")
|
||||
work_frame.pack(fill=tk.BOTH, expand=True)
|
||||
self.tree_work = self.create_treeview(work_frame)
|
||||
self.tree_work.bind("<<TreeviewSelect>>", self.on_tree_select)
|
||||
self.tree_work.tag_configure("modified", background="#3a6b3a")
|
||||
|
||||
editor_frame = ttk.Labelframe(right_pane, text="单句编辑器")
|
||||
editor_frame.pack(fill=tk.X, pady=(10, 0))
|
||||
|
||||
self.editor_text = scrolledtext.ScrolledText(editor_frame, height=4, wrap=tk.WORD, state="disabled")
|
||||
self.editor_text.pack(fill=tk.X, padx=5, pady=5)
|
||||
|
||||
button_bar = ttk.Frame(editor_frame)
|
||||
button_bar.pack(fill=tk.X, padx=5, pady=(0, 5))
|
||||
button_bar.columnconfigure((0, 1, 2, 3), weight=1)
|
||||
|
||||
self.refine_btn = ttk.Button(button_bar, text="润色", command=self.refine_current_line, state="disabled")
|
||||
self.refine_btn.grid(row=0, column=0, sticky="ew", padx=(0, 2))
|
||||
self.revert_btn = ttk.Button(button_bar, text="还原文本", command=self.revert_current_line_text, state="disabled")
|
||||
self.revert_btn.grid(row=0, column=1, sticky="ew", padx=2)
|
||||
|
||||
self.apply_btn = ttk.Button(button_bar, text="✅ 应用更改", command=self.apply_editor_changes, state="disabled", style="Accent.TButton")
|
||||
self.apply_btn.grid(row=1, column=0, columnspan=2, sticky="ew", padx=(0,2), pady=(5,0))
|
||||
|
||||
self.delete_btn = ttk.Button(button_bar, text="❌ 删除此行", command=self.delete_current_line, state="disabled")
|
||||
self.delete_btn.grid(row=1, column=2, columnspan=2, sticky="ew", padx=2, pady=(5,0))
|
||||
|
||||
status_bar = ttk.Frame(self.root)
|
||||
status_bar.pack(side=tk.BOTTOM, fill=tk.X, padx=10, pady=(0, 5))
|
||||
self.progress_bar = ttk.Progressbar(status_bar, orient=tk.HORIZONTAL, mode='determinate')
|
||||
self.progress_bar.pack(side=tk.LEFT, fill=tk.X, expand=True, padx=(0, 10))
|
||||
ttk.Label(status_bar, textvariable=self.vars['status']).pack(side=tk.LEFT)
|
||||
|
||||
def create_treeview(self, parent):
|
||||
cols = ("#0", "开始", "结束", "文本")
|
||||
tree = ttk.Treeview(parent, columns=cols[1:], show="headings")
|
||||
for col in cols: tree.heading(col, text=col, anchor="w")
|
||||
tree.column("#0", width=40, anchor="center"); tree.column("开始", width=90, anchor="w")
|
||||
tree.column("结束", width=90, anchor="w"); tree.column("文本", width=400, anchor="w")
|
||||
vsb = ttk.Scrollbar(parent, orient="vertical", command=tree.yview); hsb = ttk.Scrollbar(parent, orient="horizontal", command=tree.xview)
|
||||
tree.configure(yscrollcommand=vsb.set, xscrollcommand=hsb.set)
|
||||
vsb.pack(side=tk.RIGHT, fill=tk.Y); hsb.pack(side=tk.BOTTOM, fill=tk.X); tree.pack(side=tk.LEFT, fill=tk.BOTH, expand=True)
|
||||
return tree
|
||||
|
||||
def load_srt(self):
|
||||
path = filedialog.askopenfilename(filetypes=[("SRT Subtitles", "*.srt")])
|
||||
if not path: return
|
||||
self.srt_path = path
|
||||
try:
|
||||
with open(path, 'r', encoding='utf-8-sig') as f: content = f.read()
|
||||
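self.original_entries = parse_srt(content)
for i, entry in enumerate(self.original_entries, start=1): entry.index = entry.original_index = i  # renumber 1..N: later code reads original_entries[original_index - 1], which breaks on gapped or duplicated SRT numbering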
|
||||
self.working_entries = copy.deepcopy(self.original_entries)
|
||||
self.populate_tree(self.tree_orig, self.original_entries)
|
||||
self.repopulate_work_tree()
|
||||
|
||||
self.save_btn.config(state="normal")
|
||||
self.refine_all_btn.config(state="normal" if self.is_ollama_available else "disabled")
|
||||
|
||||
self.current_selected_work_iid = None
|
||||
self.editor_text.config(state="normal"); self.editor_text.delete(1.0, tk.END); self.editor_text.config(state="disabled")
|
||||
self.set_editor_buttons_state(False)
|
||||
|
||||
self.vars['status'].set(f"已加载 {len(self.original_entries)} 条字幕。")
|
||||
except Exception as e: messagebox.showerror("加载失败", f"无法加载或解析文件: {e}")
|
||||
|
||||
def populate_tree(self, tree, entries):
|
||||
tree.delete(*tree.get_children())
|
||||
for entry in entries:
|
||||
values = (entry.start_str, entry.end_str, entry.text.replace('\n', ' '))
|
||||
tree.insert("", "end", text=str(entry.original_index), values=values, iid=str(entry.original_index))
|
||||
|
||||
def repopulate_work_tree(self):
|
||||
last_selected = self.current_selected_work_iid
|
||||
self.tree_work.delete(*self.tree_work.get_children())
|
||||
for i, entry in enumerate(self.working_entries):
|
||||
entry.index = i + 1
|
||||
values = (entry.start_str, entry.end_str, entry.text.replace('\n', ' '))
|
||||
|
||||
is_modified = entry.text != self.original_entries[entry.original_index - 1].text
|
||||
tags = ("modified",) if is_modified else ()
|
||||
self.tree_work.insert("", "end", text=str(entry.index), values=values, iid=str(entry.index), tags=tags)
|
||||
|
||||
if last_selected and self.tree_work.exists(str(last_selected)):
|
||||
self.tree_work.selection_set(str(last_selected))
|
||||
self.tree_work.see(str(last_selected))
|
||||
|
||||
|
||||
def on_tree_select(self, event):
|
||||
selection = self.tree_work.selection()
|
||||
if not selection: return
|
||||
|
||||
work_iid = int(selection[0])
|
||||
if work_iid > len(self.working_entries): return
|
||||
self.current_selected_work_iid = work_iid
|
||||
|
||||
orig_iid = self.working_entries[work_iid - 1].original_index
|
||||
self.current_selected_orig_iid = orig_iid
|
||||
|
||||
entry_text = self.working_entries[work_iid - 1].text
|
||||
self.editor_text.config(state="normal")
|
||||
self.editor_text.delete(1.0, tk.END)
|
||||
self.editor_text.insert(tk.END, entry_text)
|
||||
self.set_editor_buttons_state(True)
|
||||
|
||||
if self.tree_orig.exists(str(orig_iid)):
|
||||
self.tree_orig.selection_set(str(orig_iid))
|
||||
self.tree_orig.see(str(orig_iid))
|
||||
|
||||
def set_editor_buttons_state(self, is_enabled):
|
||||
state = "normal" if is_enabled else "disabled"
|
||||
is_ready = self.is_ollama_available and is_enabled
|
||||
self.refine_btn.config(state="normal" if is_ready else "disabled")
|
||||
self.revert_btn.config(state=state)
|
||||
self.apply_btn.config(state=state)
|
||||
self.delete_btn.config(state=state)
|
||||
|
||||
def save_srt(self):
|
||||
if not self.working_entries: return
|
||||
self.apply_editor_changes()
|
||||
|
||||
original_basename = os.path.splitext(os.path.basename(self.srt_path))[0]
|
||||
save_path = filedialog.asksaveasfilename(defaultextension=".srt", initialfile=f"{original_basename}_edited.srt", filetypes=[("SRT Subtitles", "*.srt")])
|
||||
if not save_path: return
|
||||
try:
|
||||
for i, entry in enumerate(self.working_entries):
|
||||
entry.index = i + 1
|
||||
with open(save_path, 'w', encoding='utf-8') as f:
|
||||
for entry in self.working_entries: f.write(entry.to_srt_block())
|
||||
messagebox.showinfo("保存成功", f"文件已保存至:\n{save_path}")
|
||||
except Exception as e: messagebox.showerror("保存失败", f"无法保存文件: {e}")
|
||||
|
||||
def apply_editor_changes(self):
|
||||
if self.current_selected_work_iid is not None:
|
||||
entry_index_in_list = self.current_selected_work_iid - 1
|
||||
if entry_index_in_list < len(self.working_entries):
|
||||
new_text = self.editor_text.get(1.0, tk.END).strip()
|
||||
if new_text != self.working_entries[entry_index_in_list].text:
|
||||
self.working_entries[entry_index_in_list].text = new_text
|
||||
self.repopulate_work_tree()
|
||||
self.vars['status'].set(f"第 {self.current_selected_work_iid} 行已更新。")
|
||||
|
||||
def delete_current_line(self):
|
||||
if self.current_selected_work_iid is None: return
|
||||
entry_index_in_list = self.current_selected_work_iid - 1
|
||||
entry_to_delete = self.working_entries[entry_index_in_list]
|
||||
|
||||
if messagebox.askyesno("确认删除", f"确定要删除第 {entry_to_delete.index} 行吗?\n'{entry_to_delete.text[:50]}...'"):
|
||||
self.working_entries.pop(entry_index_in_list)
|
||||
self.current_selected_work_iid = None  # clear first so repopulate_work_tree() does not reselect the row that shifts into this slot
self.repopulate_work_tree()
|
||||
self.editor_text.config(state="normal"); self.editor_text.delete(1.0, tk.END); self.editor_text.config(state="disabled")
|
||||
self.set_editor_buttons_state(False)
|
||||
self.vars['status'].set(f"原始行 {entry_to_delete.original_index} 已删除。")
|
||||
|
||||
def refine_current_line(self):
|
||||
if self.current_selected_work_iid is None: return
|
||||
self.apply_editor_changes()
|
||||
original_text = self.original_entries[self.current_selected_orig_iid - 1].text
|
||||
self.set_buttons_state(False)
|
||||
threading.Thread(target=self._call_llm_for_refine, args=(original_text, self.vars['ollama_model'].get(), self.current_selected_work_iid), daemon=True).start()
|
||||
|
||||
def refine_all_lines(self):
|
||||
if not self.working_entries: return
|
||||
if messagebox.askyesno("确认", "即将对全部字幕进行润色,这会覆盖您当前的修改。是否继续?"):
|
||||
self.set_buttons_state(False)
|
||||
self.progress_bar['value'] = 0; self.progress_bar['maximum'] = len(self.original_entries)
|
||||
threading.Thread(target=self._refine_all_worker, args=(self.vars['ollama_model'].get(),), daemon=True).start()
|
||||
|
||||
def _refine_all_worker(self, model_name):
|
||||
for i, entry in enumerate(self.original_entries):
|
||||
self.vars['status'].set(f"正在处理 {i+1}/{len(self.original_entries)}...")
|
||||
self.progress_bar['value'] = i + 1
|
||||
refined_text = self._call_llm_for_refine_sync(entry.text, model_name)
|
||||
|
||||
if refined_text:
|
||||
self.gui_queue.put({"type": "batch_line_refined", "data": {"orig_iid": entry.index, "text": refined_text}})
|
||||
else:
|
||||
self.gui_queue.put({"type": "batch_line_refined", "data": {"orig_iid": entry.index, "text": entry.text, "no_change": True}})
|
||||
self.gui_queue.put({"type": "batch_finish"})
|
||||
|
||||
def revert_current_line_text(self):
|
||||
if self.current_selected_orig_iid is None: return
|
||||
original_text = self.original_entries[self.current_selected_orig_iid - 1].text
|
||||
self.editor_text.delete(1.0, tk.END); self.editor_text.insert(tk.END, original_text)
|
||||
|
||||
def _call_llm_for_refine(self, text, model, work_iid):
|
||||
refined_text = self._call_llm_for_refine_sync(text, model)
|
||||
if refined_text: self.gui_queue.put({"type": "line_refined", "data": {"iid": work_iid, "text": refined_text}})
|
||||
else: self.gui_queue.put({"type": "refine_failed", "data": work_iid})
|
||||
|
||||
def _call_llm_for_refine_sync(self, text, model):
|
||||
prompt = f"""你是一个专业的视频字幕精炼师。任务是优化“待处理字幕”,使其更适合专业配音。
|
||||
规则:
|
||||
1. 改为流畅、专业的书面语,但必须保留所有的核心操作指令和细节。
|
||||
2. 优先去除明显的口语化词汇、重复和不必要的填充词。
|
||||
3. 在不影响信息完整性的前提下,可以适当缩短句子。
|
||||
4. 【重要】只输出精炼后的字幕文本,不要包含任何标签、解释或引号。
|
||||
---
|
||||
[待处理字幕]
|
||||
{text}
|
||||
---
|
||||
[精炼后的文本]:"""
|
||||
payload = {"model": model, "prompt": prompt, "stream": False, "options": {'temperature': 0.3}}
|
||||
try:
|
||||
response = requests.post(f"{OLLAMA_HOST}/api/generate", json=payload, timeout=45)
|
||||
response.raise_for_status()
|
||||
response_data = response.json()
|
||||
refined_text = response_data.get('response', '').strip().replace("\n", " ")
|
||||
return re.sub(r'^["\'“‘]|["\'”’]$', '', refined_text)
|
||||
except Exception as e:
|
||||
self.gui_queue.put({"type": "error", "data": f"API调用失败: {e}"})
|
||||
return None
|
||||
|
||||
def process_queue(self):
|
||||
try:
|
||||
while True:
|
||||
msg = self.gui_queue.get_nowait()
|
||||
msg_type, data = msg.get("type"), msg.get("data")
|
||||
if msg_type == "line_refined":
|
||||
if data["iid"] == self.current_selected_work_iid:
|
||||
self.editor_text.delete(1.0, tk.END)
|
||||
self.editor_text.insert(tk.END, data["text"])
|
||||
self.set_buttons_state(True)
|
||||
elif msg_type == "batch_line_refined":
|
||||
orig_iid = data["orig_iid"]
|
||||
for work_entry in self.working_entries:
|
||||
if work_entry.original_index == orig_iid:
|
||||
work_entry.text = data["text"]
|
||||
values = (work_entry.start_str, work_entry.end_str, work_entry.text.replace('\n', ' '))
|
||||
self.tree_work.item(str(work_entry.index), values=values, tags=("modified",))
|
||||
break
|
||||
elif msg_type == "batch_finish":
|
||||
self.repopulate_work_tree()
|
||||
self.vars['status'].set("批量润色完成!")
|
||||
self.set_buttons_state(True)
|
||||
messagebox.showinfo("完成", "所有字幕已批量润色。请检查并进行微调。")
|
||||
elif msg_type == "refine_failed":
|
||||
self.set_buttons_state(True)  # always re-enable: the selection may have changed while the request was running
|
||||
elif msg_type == "error": messagebox.showerror("Ollama 错误", data)
|
||||
elif msg_type == "models_loaded":
|
||||
if data:
|
||||
self.model_combo['values'] = data; self.model_combo.set(data[0])
|
||||
self.is_ollama_available = True
|
||||
if self.working_entries: self.refine_all_btn.config(state="normal")
|
||||
else:
|
||||
self.gui_queue.put({"type": "error", "data": "Ollama连接成功, 但未检测到任何模型。请确保您已下载模型。"})
|
||||
|
||||
except queue.Empty: pass
|
||||
finally: self.root.after(100, self.process_queue)
|
||||
|
||||
def set_buttons_state(self, is_enabled):
|
||||
state = "normal" if is_enabled else "disabled"
|
||||
self.load_btn.config(state=state); self.save_btn.config(state=state); self.refine_all_btn.config(state=state)
|
||||
self.set_editor_buttons_state(is_enabled and self.current_selected_work_iid is not None)
|
||||
|
||||
def load_ollama_models(self):
|
||||
threading.Thread(target=self._load_models_worker, daemon=True).start()
|
||||
|
||||
    def _load_models_worker(self):
        # Query Ollama for its installed models and report the outcome through gui_queue.
        # A local flag is used because is_ollama_available is only set later, on the GUI thread,
        # so checking it here would always report a failed connection.
        connected = False
        try:
            self.vars['status'].set("正在连接Ollama...")
            response = requests.get(f"{OLLAMA_HOST}/api/tags", timeout=5)
            response.raise_for_status()  # raises for any 4xx/5xx status code

            # Make sure the body is valid JSON before reading the model list
            try:
                models_data = response.json()
            except json.JSONDecodeError:
                self.gui_queue.put({"type": "error", "data": "Ollama返回了无效的数据格式,无法解析模型列表。"})
                return

            models = models_data.get('models')
            if models is not None:
                model_names = [m['name'] for m in models]
                self.gui_queue.put({"type": "models_loaded", "data": model_names})
                connected = bool(model_names)  # an empty model list still leaves refinement unavailable
                if connected: self.vars['status'].set("Ollama连接成功!")
            else:  # the 'models' key is missing
                self.gui_queue.put({"type": "models_loaded", "data": []})

        except requests.exceptions.Timeout:
            self.gui_queue.put({"type": "error", "data": f"连接Ollama超时 ({OLLAMA_HOST})。\n请检查服务是否运行且地址正确。"})
        except requests.exceptions.ConnectionError:
            self.gui_queue.put({"type": "error", "data": f"无法连接到Ollama ({OLLAMA_HOST})。\n请确保Ollama服务正在运行。"})
        except requests.exceptions.RequestException as e:
            self.gui_queue.put({"type": "error", "data": f"连接Ollama时发生网络错误: {e}"})
        finally:
            if not connected: self.vars['status'].set("Ollama连接失败,润色功能不可用。")
|
||||
|
||||
if __name__ == "__main__":
|
||||
root = tk.Tk()
|
||||
app = App(root)
|
||||
root.mainloop()
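
The editor above keeps slow LLM calls off the GUI thread and passes results back through `gui_queue`, which `process_queue` drains every 100 ms. The following is a minimal sketch of that pattern without Tkinter. It rests on assumptions: `refine_worker` and `drain` are hypothetical names, the `"progress"` message type is illustrative, and `line.strip()` is only a placeholder for the real Ollama request.

```python
import queue
import threading


def refine_worker(lines, out_queue):
    """Runs on a background thread and talks to the GUI only through the queue."""
    for i, line in enumerate(lines, start=1):
        refined = line.strip()  # placeholder for the real LLM call
        out_queue.put({"type": "progress", "current": i, "total": len(lines)})
        out_queue.put({"type": "line", "index": i, "text": refined})
    out_queue.put({"type": "finish"})


def drain(q):
    """Non-blocking drain with the same shape as the inner loop of process_queue."""
    messages = []
    try:
        while True:
            messages.append(q.get_nowait())
    except queue.Empty:
        pass
    return messages


gui_queue = queue.Queue()
worker = threading.Thread(target=refine_worker, args=([" a ", "b "], gui_queue), daemon=True)
worker.start()
worker.join()  # a real GUI would not join; it would call drain() from root.after(...)
messages = drain(gui_queue)
```

Sending progress as queue messages, rather than touching widgets or Tk variables from the worker, keeps every widget update on the thread that runs the main loop.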
|
||||
254
srt_optimizer_v2.py
Normal file
@@ -0,0 +1,254 @@
|
||||
# filename: srt_optimizer_v3.4.py
|
||||
|
||||
import tkinter as tk
|
||||
from tkinter import filedialog, ttk, messagebox
|
||||
import sv_ttk
|
||||
import re
|
||||
from datetime import timedelta
|
||||
import os
|
||||
import jieba
|
||||
|
||||
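# --- V3.4 configuration: golden speech rate first ---
# Golden speech rate derived from your reference line "这是一个静止画面" (7 characters / 1.86 s, about 3.8 characters per second)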
|
||||
TARGET_SPEED_STRICT = 3.8
|
||||
|
||||
# Length of a natural, reasonable pause (seconds)
|
||||
NATURAL_PAUSE_DURATION = 0.5
|
||||
|
||||
# Other trigger conditions (speed in characters per second, length in characters)
|
||||
FAST_SPEED_THRESHOLD = 8.0
|
||||
SPLIT_CHARS_THRESHOLD = 25
|
||||
|
||||
# Safety thresholds
|
||||
MIN_DURATION_THRESHOLD = 0.4
|
||||
MIN_CHARS_THRESHOLD = 2
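
The arithmetic behind these constants can be checked directly. This is a small sketch that uses only the figures quoted in the configuration comment (7 characters in 1.86 seconds); `ideal_duration` is a hypothetical helper that mirrors how a slow, short entry is shortened to the target rate while never dropping below the minimum duration.

```python
TARGET_SPEED_STRICT = 3.8      # characters per second
MIN_DURATION_THRESHOLD = 0.4   # seconds

# Reference measurement from the configuration comment: 7 characters in 1.86 seconds.
reference_speed = 7 / 1.86


def ideal_duration(char_count: int,
                   target_speed: float = TARGET_SPEED_STRICT,
                   min_duration: float = MIN_DURATION_THRESHOLD) -> float:
    """Seconds needed to speak char_count characters at the target rate, with a floor."""
    return max(char_count / target_speed, min_duration)


ten_char_duration = ideal_duration(10)
```

The floor matters for very short entries: a single character at 3.8 characters per second would otherwise be on screen for roughly a quarter of a second.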
|
||||
|
||||
# (The SrtEntry class and the parse_srt function are unchanged from the previous version.)
|
||||
class SrtEntry:
|
||||
def __init__(self, index, start_td, end_td, text):
|
||||
self.index = index; self.start_td = start_td; self.end_td = end_td
|
||||
self.text = text.strip(); self.is_new = False
|
||||
@property
|
||||
def duration(self): return (self.end_td - self.start_td).total_seconds()
|
||||
@property
|
||||
def char_count(self): return len(re.sub(r'[\s,.?!。,、?!]', '', self.text))
|
||||
@property
|
||||
def speed(self):
|
||||
count = self.char_count
|
||||
return count / self.duration if self.duration > 0 and count > 0 else 0
|
||||
@property
|
||||
def start_str(self): return self._td_to_str(self.start_td)
|
||||
@property
|
||||
def end_str(self): return self._td_to_str(self.end_td)
|
||||
@staticmethod
|
||||
def _td_to_str(td):
|
||||
total_ms = td // timedelta(milliseconds=1)  # integer math: float subtraction can truncate e.g. 1.001 s to ",000"
total_seconds, ms = divmod(total_ms, 1000)
h, m, s = total_seconds // 3600, (total_seconds % 3600) // 60, total_seconds % 60
return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"
|
||||
def to_srt_block(self): return f"{self.index}\n{self.start_str} --> {self.end_str}\n{self.text}\n\n"
|
||||
|
||||
def parse_srt(content):
|
||||
entries = []
|
||||
pattern = re.compile(r'(\d+)\n(\d{2}:\d{2}:\d{2},\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2},\d{3})\n([\s\S]*?)(?=\n\n|\Z)', re.MULTILINE)
|
||||
def to_td(time_str):
|
||||
h, m, s, ms = map(int, re.split('[:,]', time_str))
|
||||
return timedelta(hours=h, minutes=m, seconds=s, milliseconds=ms)
|
||||
for match in pattern.finditer(content):
|
||||
entries.append(SrtEntry(int(match.group(1)), to_td(match.group(2)), to_td(match.group(3)), match.group(4)))
|
||||
return entries
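
The `char_count` and `speed` properties of `SrtEntry` drive every pacing decision that follows, so it helps to see them in isolation. The sketch below restates them as free functions; the function form and the names `count_spoken_chars` and `speech_rate` are assumptions made for illustration, while the punctuation pattern is the one used in the class.

```python
import re
from datetime import timedelta

# Whitespace and common ASCII/CJK punctuation are not counted as spoken characters.
PUNCT_OR_SPACE = re.compile(r"[\s,.?!。,、?!]")


def count_spoken_chars(text: str) -> int:
    """Number of characters left after removing whitespace and punctuation."""
    return len(PUNCT_OR_SPACE.sub("", text))


def speech_rate(text: str, start: timedelta, end: timedelta) -> float:
    """Characters per second; 0.0 for empty text or a non-positive duration."""
    duration = (end - start).total_seconds()
    count = count_spoken_chars(text)
    return count / duration if duration > 0 and count > 0 else 0.0


rate = speech_rate("你好,世界!", timedelta(seconds=0), timedelta(seconds=2))
```

Returning 0.0 for degenerate entries keeps the later threshold comparisons from dividing by zero.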
|
||||
|
||||
class App:
|
||||
def __init__(self, root):
|
||||
self.root = root
|
||||
self.root.title("字幕听感优化器 V3.4 - 黄金语速版 (最终决定版)")
|
||||
self.root.geometry("1200x800")
|
||||
self.srt_path = ""; self.original_entries = []; self.optimized_entries = []
|
||||
self.build_ui(); sv_ttk.set_theme("dark")
|
||||
|
||||
# (GUI methods are unchanged.)
|
||||
def build_ui(self):
|
||||
main_pane = ttk.PanedWindow(self.root, orient=tk.HORIZONTAL)
|
||||
main_pane.pack(fill=tk.BOTH, expand=True, padx=10, pady=10)
|
||||
control_frame = ttk.Frame(main_pane, width=250)
|
||||
main_pane.add(control_frame, weight=0)
|
||||
load_btn = ttk.Button(control_frame, text="1. 加载SRT文件", command=self.load_srt, style="Accent.TButton")
|
||||
load_btn.pack(fill=tk.X, pady=5)
|
||||
self.optimize_btn = ttk.Button(control_frame, text="2. 优化节奏", command=self.run_optimization, state="disabled")
|
||||
self.optimize_btn.pack(fill=tk.X, pady=5)
|
||||
self.save_btn = ttk.Button(control_frame, text="3. 另存为...", command=self.save_srt, state="disabled")
|
||||
self.save_btn.pack(fill=tk.X, pady=5)
|
||||
separator = ttk.Separator(control_frame, orient=tk.HORIZONTAL)
|
||||
separator.pack(fill=tk.X, pady=15)
|
||||
self.info_label = ttk.Label(control_frame, text="请先加载SRT文件", anchor="w", wraplength=230, justify="left")
|
||||
self.info_label.pack(fill=tk.X, pady=5)
|
||||
table_container = ttk.Frame(main_pane)
|
||||
main_pane.add(table_container, weight=1)
|
||||
left_frame = ttk.Labelframe(table_container, text="原始字幕")
|
||||
left_frame.pack(side=tk.LEFT, fill=tk.BOTH, expand=True, padx=(0, 5))
|
||||
right_frame = ttk.Labelframe(table_container, text="优化后 (绿色为新生成)")
|
||||
right_frame.pack(side=tk.LEFT, fill=tk.BOTH, expand=True, padx=(5, 0))
|
||||
self.tree_orig = self.create_treeview(left_frame)
|
||||
self.tree_optim = self.create_treeview(right_frame)
|
||||
self.tree_optim.tag_configure("new", background="#3a6b3a")
|
||||
def create_treeview(self, parent):
|
||||
cols = ("#0", "开始时间", "结束时间", "时长", "字数", "语速", "文本")
|
||||
tree = ttk.Treeview(parent, columns=cols[1:], show="headings")
|
||||
for col in cols: tree.heading(col, text=col)
|
||||
tree.column("#0", width=40, anchor="center"); tree.column("开始时间", width=90); tree.column("结束时间", width=90)
|
||||
tree.column("时长", width=50, anchor="e"); tree.column("字数", width=40, anchor="e")
|
||||
tree.column("语速", width=50, anchor="e"); tree.column("文本", width=300)
|
||||
vsb = ttk.Scrollbar(parent, orient="vertical", command=tree.yview)
|
||||
tree.configure(yscrollcommand=vsb.set); tree.pack(side=tk.LEFT, fill=tk.BOTH, expand=True); vsb.pack(side=tk.RIGHT, fill=tk.Y)
|
||||
return tree
|
||||
def load_srt(self):
|
||||
path = filedialog.askopenfilename(filetypes=[("SRT Subtitles", "*.srt")])
|
||||
if not path: return
|
||||
self.srt_path = path
|
||||
try:
|
||||
with open(path, 'r', encoding='utf-8-sig') as f: content = f.read()
|
||||
self.original_entries = parse_srt(content)
|
||||
self.populate_tree(self.tree_orig, self.original_entries)
|
||||
self.tree_optim.delete(*self.tree_optim.get_children())
|
||||
self.info_label.config(text=f"已加载: {os.path.basename(path)}\n共 {len(self.original_entries)} 条字幕。")
|
||||
self.optimize_btn.config(state="normal"); self.save_btn.config(state="disabled")
|
||||
except Exception as e: messagebox.showerror("加载失败", f"无法加载或解析文件: {e}")
|
||||
def populate_tree(self, tree, entries, highlight_new=False):
|
||||
tree.delete(*tree.get_children())
|
||||
for entry in entries:
|
||||
values = (entry.start_str, entry.end_str, f"{entry.duration:.2f}s", entry.char_count, f"{entry.speed:.2f}", entry.text)
|
||||
tags = ("new",) if highlight_new and entry.is_new else ()
|
||||
tree.insert("", "end", text=str(entry.index), values=values, tags=tags)
|
||||
def save_srt(self):
|
||||
if not self.optimized_entries: return
|
||||
original_basename = os.path.splitext(os.path.basename(self.srt_path))[0]
|
||||
save_path = filedialog.asksaveasfilename(defaultextension=".srt", initialfile=f"{original_basename}_optimized.srt", filetypes=[("SRT Subtitles", "*.srt")])
|
||||
if not save_path: return
|
||||
try:
|
||||
with open(save_path, 'w', encoding='utf-8') as f:
|
||||
for entry in self.optimized_entries: f.write(entry.to_srt_block())
|
||||
messagebox.showinfo("保存成功", f"优化后的SRT文件已保存至:\n{save_path}")
|
||||
except Exception as e: messagebox.showerror("保存失败", f"无法保存文件: {e}")
|
||||
|
||||
def run_optimization(self):
|
||||
if not self.original_entries: return
|
||||
paced_entries = self._optimize_pacing(self.original_entries)
|
||||
self.optimized_entries = self._perform_suturing(paced_entries)
|
||||
for i, entry in enumerate(self.optimized_entries): entry.index = i + 1
|
||||
self.populate_tree(self.tree_optim, self.optimized_entries, highlight_new=True)
|
||||
self.save_btn.config(state="normal")
|
||||
self.info_label.config(text=f"优化完成!\n原 {len(self.original_entries)} 句 -> 新 {len(self.optimized_entries)} 句")
|
||||
messagebox.showinfo("优化完成", "字幕节奏已优化!请在右侧检查结果。")
|
||||
|
||||
def _optimize_pacing(self, entries):
|
||||
# --- Core algorithm (corrected in V3.4) ---
|
||||
temp_entries = []
|
||||
for entry in entries:
|
||||
entry.text = entry.text.replace('\n', ' ').strip()
|
||||
|
||||
is_slow_and_short = entry.speed < TARGET_SPEED_STRICT and entry.char_count < 20 and entry.char_count > 0
|
||||
is_fast_or_long = entry.speed > FAST_SPEED_THRESHOLD or entry.char_count > SPLIT_CHARS_THRESHOLD
|
||||
|
||||
if is_slow_and_short:
|
||||
ideal_duration_sec = entry.char_count / TARGET_SPEED_STRICT
|
||||
ideal_duration_sec = max(ideal_duration_sec, MIN_DURATION_THRESHOLD)
|
||||
new_end_td = entry.start_td + timedelta(seconds=ideal_duration_sec)
|
||||
new_entry = SrtEntry(0, entry.start_td, new_end_td, entry.text)
|
||||
new_entry.is_new = True
|
||||
temp_entries.append(new_entry)
|
||||
continue
|
||||
|
||||
elif is_fast_or_long:
|
||||
parts = re.split(r'([。,、?!,?!])', entry.text)
|
||||
sub_sentences = []
|
||||
current_part = ""
|
||||
for i, part in enumerate(parts):
|
||||
part = part.strip()
|
||||
if not part: continue
|
||||
if i % 2 == 0: current_part += part
|
||||
else: current_part += part; sub_sentences.append(current_part.strip()); current_part = ""
|
||||
if current_part.strip(): sub_sentences.append(current_part.strip())
|
||||
|
||||
if len(sub_sentences) == 1 and entry.char_count > 12:
|
||||
text = sub_sentences[0]; words = list(jieba.cut(text))
|
||||
if len(words) > 1:
|
||||
mid_len = len(text) / 2; best_split_index = -1; min_diff = float('inf'); current_len = 0
|
||||
for i, word in enumerate(words[:-1]):
|
||||
current_len += len(word); diff = abs(current_len - mid_len)
|
||||
if diff < min_diff: min_diff = diff; best_split_index = i + 1
|
||||
if best_split_index != -1:
|
||||
part1 = "".join(words[:best_split_index]); part2 = "".join(words[best_split_index:])
|
||||
sub_sentences = [part1.strip(), part2.strip()]
|
||||
|
||||
if len(sub_sentences) > 1:
|
||||
current_start_td = entry.start_td
|
||||
|
||||
# <--- BUG 修复处 ---
|
||||
# 之前: sum(entry.char_count for entry in sub_sentences)
|
||||
# 修正: 直接计算每个字符串的字符数
|
||||
total_chars_in_block = sum(len(re.sub(r'[\s,.?!。,、?!]', '', s)) for s in sub_sentences)
|
||||
# --- 修复结束 ---
|
||||
|
||||
if total_chars_in_block == 0:
|
||||
# 如果切分后没有有效字符,直接保留原始条目
|
||||
entry.is_new = False
|
||||
temp_entries.append(entry)
|
||||
continue
|
||||
|
||||
num_gaps = len(sub_sentences) - 1
|
||||
total_natural_pause_sec = num_gaps * NATURAL_PAUSE_DURATION
|
||||
available_speech_time_sec = entry.duration - total_natural_pause_sec
|
||||
|
||||
if available_speech_time_sec > total_chars_in_block / (FAST_SPEED_THRESHOLD * 1.2):
|
||||
pause_td = timedelta(seconds=NATURAL_PAUSE_DURATION)
|
||||
for i, sub_sentence_text in enumerate(sub_sentences):
|
||||
sub_char_count = len(re.sub(r'[\s,.?!。,、?!]', '', sub_sentence_text))
|
||||
|
||||
# 增加一个保护,防止 total_chars_in_block 为0导致除零错误
|
||||
if total_chars_in_block > 0:
|
||||
speech_duration_sec = (sub_char_count / total_chars_in_block) * available_speech_time_sec
|
||||
else:
|
||||
speech_duration_sec = 0
|
||||
|
||||
speech_duration_td = timedelta(seconds=speech_duration_sec)
|
||||
|
||||
new_entry = SrtEntry(0, current_start_td, current_start_td + speech_duration_td, sub_sentence_text)
|
||||
new_entry.is_new = True
|
||||
temp_entries.append(new_entry)
|
||||
|
||||
current_start_td += speech_duration_td
|
||||
if i < num_gaps: current_start_td += pause_td
|
||||
continue
|
||||
|
||||
entry.is_new = False
|
||||
temp_entries.append(entry)
|
||||
return temp_entries
|
||||
|
||||
    def _perform_suturing(self, entries):
        # ... (this function is unchanged)
        if not entries:
            return []
        final_entries = []
        merge_buffer = []

        def flush_buffer():
            # Emit the buffered short entries as one merged entry, then clear the buffer.
            nonlocal merge_buffer
            if not merge_buffer:
                return
            if len(merge_buffer) == 1:
                final_entries.append(merge_buffer[0])
            else:
                start_time = merge_buffer[0].start_td
                end_time = merge_buffer[-1].end_td
                combined_text = " ".join(e.text for e in merge_buffer)
                merged_entry = SrtEntry(0, start_time, end_time, combined_text)
                merged_entry.is_new = True
                final_entries.append(merged_entry)
            merge_buffer = []

        for entry in entries:
            # Entries that are too brief or too short are buffered for merging with their neighbours.
            if entry.duration < MIN_DURATION_THRESHOLD or entry.char_count < MIN_CHARS_THRESHOLD:
                merge_buffer.append(entry)
            else:
                flush_buffer()
                final_entries.append(entry)
        flush_buffer()
        return final_entries


if __name__ == "__main__":
    root = tk.Tk()
    app = App(root)
    root.mainloop()
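
The timing step inside `_optimize_pacing` can be hard to follow in context, so here is a minimal standalone sketch of the idea: when one entry is split into several sub-sentences, a fixed pause is reserved for each gap and the remaining time is shared out in proportion to each sub-sentence's character count. The function name `allocate_split_timings`, its parameters, and the 0.5 s default pause are assumptions made for illustration; they are not taken from the project, whose actual `NATURAL_PAUSE_DURATION` value is not shown here. The sketch also simplifies the feasibility check to "some speech time is left" rather than the speed-based threshold used above.

```python
from datetime import timedelta


def allocate_split_timings(start_td, duration_sec, char_counts, pause_sec=0.5):
    """Return a (start, end) timedelta pair for each sub-sentence.

    Hypothetical helper: pause_sec stands in for NATURAL_PAUSE_DURATION,
    and its default of 0.5 is an assumed value.
    """
    total_chars = sum(char_counts)
    if total_chars == 0:
        # Nothing countable to distribute time over.
        return []

    num_gaps = len(char_counts) - 1
    speech_time = duration_sec - num_gaps * pause_sec
    if speech_time <= 0:
        # The pauses alone would use up the whole entry; do not split.
        return []

    timings = []
    current = start_td
    for i, count in enumerate(char_counts):
        # Each sub-sentence gets a share of speech time proportional to its length.
        span = timedelta(seconds=speech_time * count / total_chars)
        timings.append((current, current + span))
        current += span
        if i < num_gaps:
            # Insert the natural pause between consecutive sub-sentences.
            current += timedelta(seconds=pause_sec)
    return timings


if __name__ == "__main__":
    for start, end in allocate_split_timings(timedelta(0), 4.5, [10, 30]):
        print(start, "->", end)
```

Because the pauses are subtracted before the proportional split, the last sub-sentence ends exactly where the original entry ended, so the split never pushes into the next subtitle.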