Prototype of a Smart Text Segmentation Tool

城作也 | 2024-12-06 12:21:47 | Bahamut coins: 404 | Popularity: 132

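# Smart Text Segmentation Tool

This is a web-based tool designed for processing long texts, particularly when a large amount of Chinese or Japanese text has to be split into smaller sections. It is useful for preparing speech scripts, editing articles, and working with teaching materials.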

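## Key Features

1. **Smart segmentation**
   - Automatically detects sentence endings (marks such as 。, ., !, and ?)
   - Splits the text into sections based on a target character count
   - Breaks only at sentence boundaries, so no sentence is cut in the middle

2. **Bilingual support**
   - Supports Traditional Chinese and Japanese
   - The interface language follows the language selector
   - Counts Chinese and Japanese characters only (punctuation and spaces are excluded)

3. **Customization options**
   - Customizable separator style
   - Adjustable target character count
   - Manual fine-tuning of split positions

4. **Statistics**
   - Shows the total character count
   - Shows the character count of each section
   - Indicates where each split falls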
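
The sentence detection and character counting described above can be sketched roughly as follows. This is a minimal illustration, not the tool's exact code: the names `SENTENCE_END`, `isCountedChar`, `countChars`, and `splitSentences` are hypothetical, and the set of end marks is an assumption based on the feature list.

```javascript
// Hypothetical helpers for illustration; the names are not taken from the tool.

// Assumed sentence-ending marks: ideographic full stop plus ASCII and full-width forms.
const SENTENCE_END = /[。.!?.!?]/;

// True for hiragana, katakana, and CJK Unified Ideographs (basic block only).
function isCountedChar(ch) {
  return /^[\u3040-\u309F\u30A0-\u30FF\u4E00-\u9FFF]$/.test(ch);
}

// Count kana and han characters; punctuation, spaces, digits, and Latin letters are skipped.
function countChars(text) {
  return [...text].filter(isCountedChar).length;
}

// Split text into sentences, keeping each end mark attached to its sentence.
function splitSentences(text) {
  const sentences = [];
  let current = '';
  for (const ch of text) {
    current += ch;
    if (SENTENCE_END.test(ch)) {
      sentences.push(current);
      current = '';
    }
  }
  if (current) sentences.push(current); // trailing text without an end mark
  return sentences;
}
```

For example, `countChars('你好,世界! Hello 123')` counts only the four han characters, and `splitSentences` keeps any unterminated text at the end as its own final piece.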

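## Use Cases

1. **Speech preparation**
   - Split a long speech script into sections of suitable length
   - Add clear section markers
   - Keep the pace of the speech under control

2. **Teaching material editing**
   - Split teaching content into suitable learning units
   - Adjust section length to the reading difficulty
   - Make it easier to plan learning progress

3. **Article editing**
   - Split long articles into sections that are easy to read
   - Keep the text coherent
   - Make it easier to restructure the article

4. **Translation work**
   - Split the source text into units suitable for translation
   - Track the progress of the translation
   - Make it easier for several translators to collaborate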

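## How to Use

1. **Basic operation**
   - Select the interface language (Traditional Chinese or Japanese)
   - Paste the text to process into the input box
   - Set the target character count (default: 1000)
   - Click the "Insert separators" (插入分隔線) button

2. **Custom settings**
   - Change the separator style in the separator settings area
   - Adjust the target character count as needed
   - Use the move up / move down buttons to fine-tune split positions

3. **Reviewing the result**
   - Check each section's character count in the statistics area
   - Confirm that the split positions are appropriate
   - Adjust manually where needed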
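
As a rough sketch of what the result looks like, the function below joins sections the same way the listing's `updateOutput` does: the first section is left as-is, and every later section is preceded by the separator line and a `(position/total)` marker. The name `formatSections` and its default argument are hypothetical and used only for illustration.

```javascript
// Hypothetical standalone version of the output step, for illustration only.
// Every section after the first gets the separator line and a
// "(position/total)" marker placed above it.
function formatSections(sections, separator = '--------') {
  return sections
    .map((section, i) =>
      i === 0 ? section : `${separator}\n(${i + 1}/${sections.length})\n${section}`)
    .join('\n');
}

console.log(formatSections(['First part。', 'Second part。', 'Third part。']));
```

Changing the separator in the settings area corresponds to passing a different `separator` value here.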

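## Recommendations

1. **Target character count**
   - Speeches: 500-800 characters recommended
   - General reading: 1000-1500 characters recommended
   - Teaching materials: adjust to the learners' level; 300-1000 characters recommended

2. **Adjusting sections**
   - Check that adjacent sections read coherently
   - Avoid breaking in the middle of an important concept
   - Make sure each section has a complete main point

3. **Practical tips**
   - Start with a larger target count for a first rough split
   - Then fine-tune individual sections as needed
   - Use the statistics to guide further adjustments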
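
One way to use the statistics for fine-tuning is to flag sections whose length falls outside the recommended range. The sketch below assumes the ranges listed above; `PRESETS` and `sectionsOutOfRange` are hypothetical names and are not part of the tool.

```javascript
// Ranges taken from the recommendations above; the object and function
// names are hypothetical and exist only for this illustration.
const PRESETS = {
  speech: { min: 500, max: 800 },
  reading: { min: 1000, max: 1500 },
  teaching: { min: 300, max: 1000 },
};

// Return the 1-based numbers of sections whose character count is outside the preset range.
function sectionsOutOfRange(sectionCounts, preset) {
  const { min, max } = PRESETS[preset];
  return sectionCounts
    .map((count, i) => ({ count, number: i + 1 }))
    .filter(({ count }) => count < min || count > max)
    .map(({ number }) => number);
}

// Sections 2 and 3 are outside the 500-800 range suggested for speeches.
console.log(sectionsOutOfRange([620, 450, 810, 700], 'speech')); // [ 2, 3 ]
```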

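## Notes

- Punctuation and spaces are not included in the character count
- Sections always end at a sentence boundary
- Actual section lengths may deviate from the target by about ±20%
- Back up the original text before using the tool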
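
The ±20% tolerance can be illustrated with a greedy grouping sketch: a section is closed once it holds at least 80% of the target and the next sentence would push it past 120%. The 0.8 and 1.2 factors match the listing's `insertSeparators`; the function names `countChars` and `groupSentences` are hypothetical, and the input is assumed to be already split into sentences.

```javascript
// Count kana and han characters only (same rule as the tool's character count).
function countChars(text) {
  return [...text].filter(ch => /^[\u3040-\u309F\u30A0-\u30FF\u4E00-\u9FFF]$/.test(ch)).length;
}

// Greedily group sentences into sections near the target length.
// Hypothetical helper: takes an array of sentences rather than raw text.
function groupSentences(sentences, target) {
  const sections = [];
  let current = '';
  let count = 0;
  for (const sentence of sentences) {
    const n = countChars(sentence);
    // Close the section only if it is already at least 80% full and
    // adding this sentence would exceed 120% of the target.
    if (count + n > target * 1.2 && count >= target * 0.8) {
      sections.push(current);
      current = '';
      count = 0;
    }
    current += sentence;
    count += n;
  }
  if (current) sections.push(current); // last, possibly short, section
  return sections;
}
```

Because sentences are never cut, a single sentence longer than 120% of the target still ends up whole in one section, so the tolerance is a guideline rather than a hard limit.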

Because an HTML file cannot be attached here directly, the tool's source code is included below. Save it as an HTML file to use it:

<!DOCTYPE html>
<html lang="zh-TW">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>高級文本分段工具 / 高機能テキスト区切りツール</title>
    <style>
        body {
            font-family: "Microsoft JhengHei", "Hiragino Kaku Gothic Pro", sans-serif;
            max-width: 1000px;
            margin: 0 auto;
            padding: 20px;
            background-color: #f5f5f5;
        }
        .container {
            background-color: white;
            padding: 20px;
            border-radius: 8px;
            box-shadow: 0 2px 4px rgba(0,0,0,0.1);
        }
        .language-select {
            margin-bottom: 20px;
        }
        select {
            padding: 5px;
            border-radius: 4px;
            border: 1px solid #ddd;
        }
        textarea {
            width: 100%;
            min-height: 200px;
            margin: 10px 0;
            padding: 10px;
            border: 1px solid #ddd;
            border-radius: 4px;
            resize: vertical;
            line-height: 1.6;
        }
        .controls {
            margin: 15px 0;
            display: flex;
            gap: 10px;
            align-items: center;
            flex-wrap: wrap;
        }
        .separator-controls {
            margin: 15px 0;
            padding: 15px;
            border: 1px solid #ddd;
            border-radius: 4px;
            background: #f9f9f9;
        }
        input[type="number"], input[type="text"] {
            padding: 5px;
            border: 1px solid #ddd;
            border-radius: 4px;
        }
        input[type="number"] {
            width: 100px;
        }
        input[type="text"].separator-input {
            width: 200px;
        }
        button {
            padding: 8px 16px;
            background-color: #4CAF50;
            color: white;
            border: none;
            border-radius: 4px;
            cursor: pointer;
        }
        button:hover {
            background-color: #45a049;
        }
        button.secondary {
            background-color: #666;
        }
        button.secondary:hover {
            background-color: #555;
        }
        .section-stats {
            margin: 15px 0;
            border: 1px solid #ddd;
            border-radius: 4px;
            padding: 10px;
        }
        .section-stat-item {
            display: flex;
            justify-content: space-between;
            padding: 5px 0;
            border-bottom: 1px solid #eee;
        }
        .section-stat-item:last-child {
            border-bottom: none;
        }
        .split-point {
            background-color: #e9e9e9;
            padding: 5px;
            margin: 5px 0;
            cursor: pointer;
            border-radius: 4px;
        }
        .split-point:hover {
            background-color: #ddd;
        }
        .split-point.active {
            background-color: #4CAF50;
            color: white;
        }
        .adjustment-controls {
            display: none;
            margin-top: 10px;
            padding: 10px;
            background: #f0f0f0;
            border-radius: 4px;
        }
        .adjustment-controls.visible {
            display: block;
        }
        .info-message {
            color: #666;
            font-size: 0.9em;
            margin: 5px 0;
        }
        .stats-container {
            max-height: 300px;
            overflow-y: auto;
            margin: 10px 0;
        }
    </style>
</head>
<body>
    <div class="container">
        <div class="language-select">
            <select id="langSelect" onchange="updateInterface()">
                <option value="zh">繁體中文</option>
                <option value="ja">日本語</option>
            </select>
        </div>

        <h1 id="title">高級文本分段工具</h1>
        
        <!-- Separator customization section -->
        <div class="separator-controls">
            <h3 id="separatorTitle">分隔線設定</h3>
            <div class="controls">
                <input type="text" id="separatorText" class="separator-input" value="--------">
                <button onclick="updateSeparator()" id="updateSeparatorBtn">更新</button>
            </div>
        </div>

        <div>
            <label id="inputLabel" for="inputText">請輸入文字:</label>
            <textarea id="inputText" placeholder="請在此輸入文字..."
                      oninput="updateCharCount()"></textarea>
            <div class="info-message">
                <span id="charCountLabel">字數:</span><span id="charCount">0</span>
            </div>
        </div>
        
        <div class="controls">
            <label id="limitLabel" for="charLimit">目標字數:</label>
            <input type="number" id="charLimit" value="1000" min="1">
            <button onclick="insertSeparators()" id="buttonText">插入分隔線</button>
        </div>

        <!-- Split-position adjustment controls -->
        <div id="adjustmentArea" class="adjustment-controls">
            <h3 id="adjustmentTitle">調整分割位置</h3>
            <div id="splitPoints"></div>
        </div>

        <!-- Section statistics -->
        <div class="section-stats">
            <h3 id="statsTitle">段落統計</h3>
            <div id="sectionStats" class="stats-container"></div>
        </div>

        <div>
            <label id="outputLabel" for="outputText">結果:</label>
            <textarea id="outputText" readonly></textarea>
        </div>
    </div>

    <script>
        const translations = {
            zh: {
                title: "高級文本分段工具",
                separatorTitle: "分隔線設定",
                updateSeparatorBtn: "更新",
                inputLabel: "請輸入文字:",
                placeholder: "請在此輸入文字...",
                charCountLabel: "字數:",
                limitLabel: "目標字數:",
                buttonText: "插入分隔線",
                adjustmentTitle: "調整分割位置",
                statsTitle: "段落統計",
                outputLabel: "結果:",
                sectionPrefix: "第",
                sectionSuffix: "段",
                charCount: "字數",
                moveUp: "向上移動",
                moveDown: "向下移動",
                preview: "預覽",
                apply: "應用"
            },
            ja: {
                title: "高機能テキスト区切りツール",
                separatorTitle: "区切り線設定",
                updateSeparatorBtn: "更新",
                inputLabel: "テキストを入力してください:",
                placeholder: "ここにテキストを入力してください...",
                charCountLabel: "文字数:",
                limitLabel: "目安の文字数:",
                buttonText: "区切り線を挿入",
                adjustmentTitle: "区切り位置の調整",
                statsTitle: "セクション統計",
                outputLabel: "結果:",
                sectionPrefix: "セクション",
                sectionSuffix: "",
                charCount: "文字数",
                moveUp: "上に移動",
                moveDown: "下に移動",
                preview: "プレビュー",
                apply: "適用"
            }
        };

        let currentSeparator = "--------";
        let currentSections = [];
        let splitPoints = [];

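        // True for hiragana, katakana, and CJK Unified Ideographs (U+4E00-U+9FFF).
        // Punctuation, spaces, digits, and Latin letters are not counted.
        function isCJKChar(char) {
            return /^[\u3040-\u309F\u30A0-\u30FF\u4E00-\u9FFF]$/.test(char);
        }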

        // Return the index just past the next sentence-ending mark at or after
        // startIndex, or text.length if no mark remains.
        function findNextSentenceEnd(text, startIndex, lang) {
            const endMarks = lang === 'zh' ?
                ['。', '.', '!', '?', '!', '?'] :
                ['。'];

            let earliest = text.length;
            for (const mark of endMarks) {
                const pos = text.indexOf(mark, startIndex);
                if (pos !== -1 && pos < earliest) {
                    earliest = pos;
                }
            }
            return earliest === text.length ? text.length : earliest + 1;
        }

        function updateSeparator() {
            const newSeparator = document.getElementById('separatorText').value;
            if (newSeparator.trim() !== '') {
                currentSeparator = newSeparator;
                if (currentSections.length > 0) {
                    updateOutput();
                }
            }
        }

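        function updateInterface() {
            const lang = document.getElementById('langSelect').value;
            const t = translations[lang];

            // Update every element whose id matches a translation key.
            // Keys without a matching element (e.g. moveUp) are skipped; the
            // #charCount span is overwritten here and restored by updateCharCount().
            Object.keys(t).forEach(key => {
                const element = document.getElementById(key);
                if (element) {
                    element.textContent = t[key];
                }
            });

            // The placeholder is an attribute, so it needs separate handling.
            document.getElementById('inputText').placeholder = t.placeholder;

            // Re-render language-dependent labels if text has already been split.
            if (currentSections.length > 0) {
                updateSectionStats();
                updateSplitPoints();
            }

            updateCharCount();
        }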

        function updateCharCount() {
            const text = document.getElementById('inputText').value;
            const charCount = [...text].filter(isCJKChar).length;
            document.getElementById('charCount').textContent = charCount;
        }

        function updateSectionStats() {
            const lang = document.getElementById('langSelect').value;
            const t = translations[lang];
            const statsContainer = document.getElementById('sectionStats');
            statsContainer.innerHTML = '';

            currentSections.forEach((section, index) => {
                const charCount = [...section].filter(isCJKChar).length;
                const statItem = document.createElement('div');
                statItem.className = 'section-stat-item';
                statItem.innerHTML = `
                    <span>${t.sectionPrefix}${index + 1}${t.sectionSuffix}</span>
                    <span>${t.charCount}: ${charCount}</span>
                `;
                statsContainer.appendChild(statItem);
            });
        }

        function updateSplitPoints() {
            const lang = document.getElementById('langSelect').value;
            const t = translations[lang];
            const splitPointsContainer = document.getElementById('splitPoints');
            splitPointsContainer.innerHTML = '';

            currentSections.forEach((section, index) => {
                if (index > 0) {
                    const pointDiv = document.createElement('div');
                    pointDiv.className = 'split-point';
                    pointDiv.innerHTML = `
                        ${t.sectionPrefix}${index}${t.sectionSuffix} → ${t.sectionPrefix}${index + 1}${t.sectionSuffix}
                        <button onclick="moveSplitPoint(${index}, -1)">${t.moveUp}</button>
                        <button onclick="moveSplitPoint(${index}, 1)">${t.moveDown}</button>
                    `;
                    splitPointsContainer.appendChild(pointDiv);
                }
            });

            document.getElementById('adjustmentArea').classList.add('visible');
        }

        function moveSplitPoint(index, direction) {
            // The split point sits between currentSections[index - 1] and currentSections[index].
            const lang = document.getElementById('langSelect').value;
            const prev = currentSections[index - 1];
            const next = currentSections[index];
            if (prev === undefined || next === undefined) return;

            if (direction > 0) {
                // Move down: the first sentence of the later section joins the earlier one.
                const firstEnd = findNextSentenceEnd(next, 0, lang);
                if (firstEnd >= next.length) return; // keep at least one sentence in each section
                currentSections[index - 1] = prev + next.substring(0, firstEnd);
                currentSections[index] = next.substring(firstEnd);
            } else {
                // Move up: the last sentence of the earlier section joins the later one.
                let lastStart = 0;
                let pos = findNextSentenceEnd(prev, 0, lang);
                while (pos < prev.length) {
                    lastStart = pos;
                    pos = findNextSentenceEnd(prev, pos, lang);
                }
                if (lastStart === 0) return; // keep at least one sentence in each section
                currentSections[index - 1] = prev.substring(0, lastStart);
                currentSections[index] = prev.substring(lastStart) + next;
            }

            updateOutput();
            updateSectionStats();
            updateSplitPoints();
        }

        function updateOutput() {
            const result = currentSections.map((section, index) => {
                if (index === 0) return section;
                return `${currentSeparator}\n(${index + 1}/${currentSections.length})\n${section}`;
            }).join('\n');

            document.getElementById('outputText').value = result;
        }

        function insertSeparators() {
            const lang = document.getElementById('langSelect').value;
            const inputText = document.getElementById('inputText').value;
            const targetCharLimit = parseInt(document.getElementById('charLimit').value, 10);

            if (!inputText) {
                alert(lang === 'zh' ? '請輸入文字' : 'テキストを入力してください。');
                return;
            }

            // parseInt returns NaN for an empty or non-numeric field, so check it explicitly.
            if (Number.isNaN(targetCharLimit) || targetCharLimit <= 0) {
                alert(lang === 'zh' ? '請輸入正數' : '正の数を入力してください。');
                return;
            }

            let currentPos = 0;
            let cjkCharCount = 0;
            currentSections = [];
            let currentSection = '';

            while (currentPos < inputText.length) {
                const nextEnd = findNextSentenceEnd(inputText, currentPos, lang);
                const segmentText = inputText.substring(currentPos, nextEnd);
                const segmentCJKCount = [...segmentText].filter(isCJKChar).length;
                
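                // Close the current section once it holds at least 80% of the target
                // and the next sentence would push it past 120%.
                if (cjkCharCount + segmentCJKCount > targetCharLimit * 1.2 &&
                        cjkCharCount >= targetCharLimit * 0.8) {
                    currentSections.push(currentSection);
                    currentSection = '';
                    cjkCharCount = 0;
                }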
                
                currentSection += segmentText;
                cjkCharCount += segmentCJKCount;
                currentPos = nextEnd;
            }
            if (currentSection) {
                currentSections.push(currentSection);
            }

            updateOutput();
            updateSectionStats();
            updateSplitPoints();
        }

        // Initialize the interface on load
        updateInterface();
    </script>
</body>
</html>

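Comments

威逼逼
Useful, upvoted!
2024-12-06 12:27:56
城作也
Checking a document with hundreds of thousands of characters by eye and splitting it with the mouse is very tedious, so I tried making this tool.
2024-12-06 12:41:43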
