Initial commit: working RIP/INEX_TM help processing pipeline

- help_processor.py: parses .docx/.html/.pdf/.doc/.txt, extracts images,
  classifies sections via Claude API, writes to SQL Server
- generate_html.py: builds interactive HTML viewer (Home/Editor/Search/Generator)
- save_keywords.py: applies keyword edits back to DB
- Prefix-scoped DB schema (RIP_help_files, RIP_help_sections) so multiple
  projects share the same database without collision
- BAT launchers per project (RIP_load.bat, INEX_TM_load.bat, ...) load
  credentials from gitignored .env via _load_env.bat
- Rich HTML preservation for .html sources (html_text column)
- Image extraction for all formats with MS Word / LibreOffice fallback for .doc

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-20 11:52:11 +03:00
commit 711053b8bd
16 changed files with 2421 additions and 0 deletions

.env.example

@@ -0,0 +1,5 @@
# Copy to .env and fill in real values. .env is gitignored.
# Loaded by the .bat files through _load_env.bat, which skips lines starting with "#".
ANTHROPIC_API_KEY=sk-ant-api03-XXXXXXXXXXXXXXXXXXXXXXXXX
HELP_DB_CONN=DRIVER={ODBC Driver 18 for SQL Server};TrustServerCertificate=yes;SERVER=host,port;DATABASE=db;UID=user;PWD=password

.gitignore

@@ -0,0 +1,30 @@
# Credentials
.env
.env.local
*.local
# Python
__pycache__/
*.pyc
*.pyo
# Logs
*.log
# Generated outputs
help_viewer.html
keywords_changes*.json
# Output processing folders (on a separate disk, not for git)
Output/
output/
# Archives
*.zip
*.tar.gz
# IDE / tools
.vscode/
.idea/
.claude/
*.swp

Bairaci.png
Binary file not shown (70 KiB).

INEX_TM_load.bat

@@ -0,0 +1,9 @@
:@echo off
chcp 65001 > nul
call "%~dp0_load_env.bat" || exit /b 1
set PYTHONIOENCODING=utf-8
echo === INCREMENTAL prefix=INEX_TM ===
echo.
python help_processor.py --prefix=INEX_TM "q:\___Proekti\2022 INEX Технологична модернизация" "q:\___Proekti\2022 INEX Технологична модернизация\Output"
pause

INEX_TM_load_force.bat

@@ -0,0 +1,9 @@
:@echo off
chcp 65001 > nul
call "%~dp0_load_env.bat" || exit /b 1
set PYTHONIOENCODING=utf-8
echo === FORCE + PURGE prefix=INEX_TM ===
echo.
python help_processor.py --prefix=INEX_TM --force --purge-missing "q:\___Proekti\2022 INEX Технологична модернизация" "q:\___Proekti\2022 INEX Технологична модернизация\Output"
pause

INEX_TM_view.bat

@@ -0,0 +1,6 @@
:@echo off
chcp 65001 > nul
call "%~dp0_load_env.bat" || exit /b 1
set PYTHONIOENCODING=utf-8
python generate_html.py --prefix=INEX_TM

README.md

@@ -0,0 +1,124 @@
# RIP Help System: help-file decomposer and viewer

Processes help files (`.html`, `.htm`, `.docx`, `.doc`, `.pdf`, `.txt`), splits them into sections, extracts images, classifies the sections with the Claude API (title + keywords), and writes everything to SQL Server. It then generates an interactive HTML viewer.

## Architecture

```
Input files     →  help_processor.py  →  SQL Server    →  generate_html.py  →  help_viewer.html
(.docx, .html,                           (RIP_help_*)                          (Home / Editor /
 .pdf, .doc)                                                                    Search / Generator)

                   save_keywords.py   ←  keywords_changes.json
                                         (from the viewer's Editor)
```

## Installation

```
pip install -r requirements.txt
```

For the legacy `.doc` format, one of the following is required:
- **LibreOffice** on PATH (cross-platform)
- **MS Word** (Windows, through pywin32 COM; automatic fallback)

## Configuration

Copy `.env.example` to `.env` and fill in:
```
ANTHROPIC_API_KEY=sk-ant-...
HELP_DB_CONN=DRIVER={ODBC Driver 18 for SQL Server};TrustServerCertificate=yes;SERVER=host,port;DATABASE=db;UID=user;PWD=password
```
`.env` is gitignored. The BAT files load it automatically through `_load_env.bat`.
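
The `for /f` line in `_load_env.bat` splits each `.env` line on `=` into a name and the rest of the line, and skips lines that start with `#`. A rough Python equivalent is sketched below for illustration. The function name `parse_env_lines` is hypothetical and not part of this repository, and `for /f` edge cases (for example runs of consecutive `=`) are not reproduced.

```python
def parse_env_lines(lines):
    """Approximate _load_env.bat: one KEY=VALUE per line, '#' lines skipped."""
    env = {}
    for raw in lines:
        line = raw.rstrip("\r\n")
        # for /f ignores empty lines and lines starting with the eol character
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only, so '=' inside the value survives
        key, sep, value = line.partition("=")
        if sep:
            env[key] = value
    return env


sample = [
    "# comment line",
    "ANTHROPIC_API_KEY=sk-ant-...",
    "HELP_DB_CONN=DRIVER={ODBC Driver 18 for SQL Server};SERVER=host,port;UID=user",
]
env = parse_env_lines(sample)
print(env["HELP_DB_CONN"])
```

Splitting on the first `=` only matters here because the ODBC connection string itself contains several `=` characters.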
## Usage (Windows)

### Processing a new project

First create `<PROJECT>_load.bat` and `<PROJECT>_view.bat` (use `RIP_load.bat` and `RIP_view.bat` as templates).

| BAT | What it does |
|---|---|
| `RIP_load.bat` | Incremental: processes only new or changed files, detected by SHA-256 hash |
| `RIP_load_force.bat` | `--force --purge-missing`: reprocesses everything and deletes orphans |
| `RIP_view.bat` | Generates `help_viewer.html` for prefix=RIP and opens it in the browser |
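
The incremental mode can be pictured as a hash comparison per file. The sketch below is an assumption about how such a check might look, not the code in `help_processor.py`. The helper names `sha256_of` and `needs_processing` and the `known_hashes` mapping are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def needs_processing(path: Path, known_hashes: dict, force: bool = False) -> bool:
    """True if the file is new, changed, or force is set.

    known_hashes is a hypothetical {file_path: file_hash} map for one prefix.
    """
    if force:
        return True
    return known_hashes.get(str(path)) != sha256_of(path)


with tempfile.TemporaryDirectory() as tmp:
    doc = Path(tmp) / "a.txt"
    doc.write_text("hello", encoding="utf-8")
    known = {str(doc): sha256_of(doc)}
    digest_len = len(known[str(doc)])
    unchanged = needs_processing(doc, known)      # same content: skip
    doc.write_text("hello, world", encoding="utf-8")
    changed = needs_processing(doc, known)        # content differs: process
    forced = needs_processing(doc, {str(doc): sha256_of(doc)}, force=True)
```

A hex SHA-256 digest is 64 characters long, which fits the `CHAR(64)` `file_hash` column.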

### Directly from the CLI

```
python help_processor.py --prefix=<PREFIX> <input_dir> <output_dir>
python help_processor.py --prefix=<PREFIX> --force --purge-missing <input_dir> <output_dir>
python generate_html.py --prefix=<PREFIX> # without the Home tab
python generate_html.py --prefix=<PREFIX> --home img.png # with the Home tab
```

## Prefix scoping

Each project has its own `--prefix` (e.g. `RIP`, `INEX_TM`). The prefix isolates the following between projects:
- Section codes: `RIP_0001_SEC_0001` vs `INEX_TM_0001_SEC_0001`
- Skip-by-hash (incremental): only within the prefix
- `--purge-missing`: deletes orphans only in the current prefix
- `generate_html.py --prefix=X`: the viewer filters by prefix
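
To illustrate the code format, here is a small sketch that builds and splits codes of the form `<PREFIX>_NNNN_SEC_NNNN`. The helpers `section_code` and `split_code` are hypothetical and written only for this example. The first number is assumed to be a per-file counter and the second a per-section counter.

```python
import re

# Greedy prefix group, so a prefix that itself contains "_" (INEX_TM) still parses
_CODE_RE = re.compile(r"^(?P<prefix>.+)_(?P<file_no>\d{4})_SEC_(?P<section_no>\d{4})$")


def section_code(prefix: str, file_no: int, section_no: int) -> str:
    """Build a code such as RIP_0001_SEC_0001."""
    return f"{prefix}_{file_no:04d}_SEC_{section_no:04d}"


def split_code(code: str):
    """Split a code back into (prefix, file_no, section_no)."""
    m = _CODE_RE.match(code)
    if not m:
        raise ValueError(f"not a section code: {code!r}")
    return m["prefix"], int(m["file_no"]), int(m["section_no"])


print(section_code("INEX_TM", 1, 1))
print(split_code("INEX_TM_0001_SEC_0001"))
```

Because the prefix is part of the code, two projects can both have a "first section of the first file" without colliding on the UNIQUE `code` column.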

## Database structure

### `RIP_help_files`

| Field | Type | Description |
|---|---|---|
| id | INT IDENTITY | PK |
| prefix | NVARCHAR(50) | Project scope |
| file_path | NVARCHAR(1000) | Full path to the file |
| file_hash | CHAR(64) | SHA-256 for incremental runs |
| processed_at | DATETIME2 | Last processed |
| section_count | INT | Number of sections |

UNIQUE constraint: `(prefix, file_path)`

### `RIP_help_sections`

| Field | Type | Description |
|---|---|---|
| id | INT IDENTITY | PK |
| prefix | NVARCHAR(50) | Project scope |
| code | NVARCHAR(80) | `<PREFIX>_NNNN_SEC_NNNN` (UNIQUE) |
| source_file | NVARCHAR(1000) | Source file |
| title | NVARCHAR(500) | AI-generated title |
| keywords | NVARCHAR(300) | Up to 5 keywords |
| char_count | INT | Size of the plain text |
| output_path | NVARCHAR(1000) | Path to the `.txt` file |
| images | NVARCHAR(MAX) | JSON array of relative paths |
| html_text | NVARCHAR(MAX) | Rich HTML with formatting (only for `.html` sources) |
| created_at, updated_at | DATETIME2 | |

## HTML Viewer: 3 or 4 tabs

- **Home** (optional, shown when `--home <image>` is passed): a start screen with an image
- **Editor**: a table of sections with inline keyword editing; ✓ Save → JSON download → `save_keywords.py` → UPDATE in the DB
- **Search**: section cards; multi-keyword queries (space-separated terms = AND, `"phrase"` = literal); preview with images
- **Generator**: drag & drop ordering → export as HTML (self-contained, all images base64-embedded)
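
The search semantics can be sketched in Python as a port of the viewer's JavaScript `tokenizeQuery` and `doSearch` logic. The names `tokenize_query` and `matches` are chosen for this example only. The regular expression and the haystack (keywords, title, text) mirror the JavaScript in `generate_html.py`.

```python
import re

# A quoted phrase becomes one token; otherwise any run of non-space characters
_TOKEN_RE = re.compile(r'"([^"]+)"|(\S+)')


def tokenize_query(query: str):
    """Split a query into lowercase tokens, keeping "quoted phrases" whole."""
    return [(m.group(1) or m.group(2)).lower() for m in _TOKEN_RE.finditer(query)]


def matches(section: dict, query: str) -> bool:
    """AND semantics: every token must occur somewhere in the section."""
    hay = " ".join(section.get(k) or "" for k in ("keywords", "title", "text")).lower()
    return all(token in hay for token in tokenize_query(query))


section = {"title": "Installing the ODBC driver", "keywords": "odbc, sql server", "text": ""}
print(tokenize_query('install "sql server" ODBC'))
print(matches(section, 'odbc "sql server"'))
```

Matching is by substring, so `install` also matches `Installing`.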

## Images

Images are extracted during parsing:
- `.docx`: `<a:blip>` in paragraph drawings → bytes from related_parts
- `.html`: local files and `data:` URLs; HTTP URLs are skipped
- `.pdf`: `pdfplumber.page.crop(bbox).to_image()` as PNG
- `.doc`: after LibreOffice/MS Word conversion to `.docx`

Images smaller than 50×50 px (measured with PIL) are filtered out to drop icons and bullets.

Images are saved to `<output_dir>/images/<code>_img_NN.<ext>`. The text carries a placeholder `[IMG: images/...]`. In the DB, the `images` column holds a JSON array of the paths.
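
The naming scheme and the placeholder can be sketched as follows. The regular expression is the one used in `generate_html.py`; the helpers `image_name`, `keep_image` and `referenced_images` are hypothetical. Two details are assumptions: that `NN` is a two-digit index, and that an image must reach the minimum size in both dimensions to be kept.

```python
import re

IMG_PLACEHOLDER_RE = re.compile(r"\[IMG:\s*([^\]]+?)\s*\]")
MIN_IMAGE_PX = 50


def image_name(code: str, index: int, ext: str) -> str:
    """Relative path of the index-th image of a section (two-digit NN assumed)."""
    return f"images/{code}_img_{index:02d}.{ext}"


def keep_image(width: int, height: int, min_px: int = MIN_IMAGE_PX) -> bool:
    """Assumption: both sides must be at least min_px, which drops icons and bullets."""
    return width >= min_px and height >= min_px


def referenced_images(text: str):
    """Paths named by [IMG: ...] placeholders, in order of appearance."""
    return [m.group(1).replace("\\", "/") for m in IMG_PLACEHOLDER_RE.finditer(text)]


text = "Open the dialog. [IMG: images/RIP_0001_SEC_0001_img_01.png] Then click OK."
print(referenced_images(text))
```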

## Constants (in `help_processor.py`)

| Constant | Default | Description |
|---|---|---|
| `MIN_SECTION_TOKENS` | 60 | Sections below this threshold are merged into the previous one |
| `MAX_AI_CHARS` | 4000 | Characters sent to Claude |
| `AI_MODEL` | claude-sonnet-4-6 | Model used for classification |
| `MIN_IMAGE_PX` | 50 | Images smaller than N×N px are skipped |
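
As an illustration of the `MIN_SECTION_TOKENS` rule, the sketch below merges each undersized section into the one before it. It is not the code from `help_processor.py`: the function names are hypothetical, and counting whitespace-separated words as tokens is an assumption, since the actual token definition is not shown here.

```python
MIN_SECTION_TOKENS = 60


def count_tokens(text: str) -> int:
    """Assumption: a token is a whitespace-separated word."""
    return len(text.split())


def merge_small_sections(sections, min_tokens: int = MIN_SECTION_TOKENS):
    """Append every section below the threshold to the previous section."""
    merged = []
    for text in sections:
        if merged and count_tokens(text) < min_tokens:
            merged[-1] = merged[-1] + "\n\n" + text
        else:
            # The first section has no predecessor, so it is kept even if small
            merged.append(text)
    return merged


long_text = "word " * 80
short_text = "tiny note"
result = merge_small_sections([long_text, short_text, long_text])
print(len(result))
```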

RIP_load.bat

@@ -0,0 +1,9 @@
:@echo off
chcp 65001 > nul
call "%~dp0_load_env.bat" || exit /b 1
set PYTHONIOENCODING=utf-8
echo === INCREMENTAL prefix=RIP ===
echo.
python help_processor.py --prefix=RIP "q:\RIP_Help_Source" "q:\RIP_Help_Source\Output"
pause

RIP_load_force.bat

@@ -0,0 +1,9 @@
:@echo off
chcp 65001 > nul
call "%~dp0_load_env.bat" || exit /b 1
set PYTHONIOENCODING=utf-8
echo === FORCE + PURGE prefix=RIP ===
echo.
python help_processor.py --prefix=RIP --force --purge-missing "q:\RIP_Help_Source" "q:\RIP_Help_Source\Output"
pause

RIP_view.bat

@@ -0,0 +1,6 @@
:@echo off
chcp 65001 > nul
call "%~dp0_load_env.bat" || exit /b 1
set PYTHONIOENCODING=utf-8
python generate_html.py --prefix=RIP --home Bairaci.png

_load_env.bat

@@ -0,0 +1,9 @@
@echo off
REM Loads ANTHROPIC_API_KEY and HELP_DB_CONN from .env into the current cmd environment.
REM Invoke with: call _load_env.bat
if not exist .env (
    echo [ERROR] Missing .env file. Copy .env.example to .env and fill it in.
    exit /b 1
)
for /f "usebackq tokens=1,* delims== eol=#" %%A in (".env") do set "%%A=%%B"
exit /b 0

generate_html.py

@@ -0,0 +1,938 @@
"""
generate_html.py
================
Reads the sections from SQL Server and generates help_viewer.html.
Run with: python generate_html.py
"""
import os, sys, json, re, base64, mimetypes, argparse
from pathlib import Path
from datetime import datetime
from typing import Optional
try:
    import pyodbc
except ImportError:
    sys.exit("Install pyodbc: pip install pyodbc")
CONN_STR = os.getenv(
"HELP_DB_CONN",
"DRIVER={ODBC Driver 18 for SQL Server};"
"TrustServerCertificate=yes;"
"SERVER=94.26.63.238,13151;DATABASE=blondina;"
"UID=blondina_login;PWD=blondina_parola_123"
)
OUT_HTML = Path(__file__).parent / "help_viewer.html"
_IMG_PLACEHOLDER_RE = re.compile(r"\[IMG:\s*([^\]]+?)\s*\]")
def _esc(s: str) -> str:
    return (s.replace("&", "&amp;")
             .replace("<", "&lt;")
             .replace(">", "&gt;")
             .replace('"', "&quot;"))
def _img_src(rel: str, output_dir: Path, embed: bool) -> str:
    """Return a file:// URI or a base64 data URI for an image."""
    abs_path = (output_dir / rel).resolve()
    if not abs_path.exists():
        return _esc(rel)
    if embed:
        try:
            mime = mimetypes.guess_type(str(abs_path))[0] or "image/png"
            b64 = base64.b64encode(abs_path.read_bytes()).decode("ascii")
            return f"data:{mime};base64,{b64}"
        except Exception:
            return abs_path.as_uri()
    return abs_path.as_uri()
def _text_to_html(text: str, output_dir: Path, embed: bool = False) -> str:
    """Convert [IMG: images/foo.png] placeholders to <img>; escape the rest of the text."""
    parts = []
    last = 0
    for m in _IMG_PLACEHOLDER_RE.finditer(text):
        parts.append(_esc(text[last:m.start()]))
        rel = m.group(1).strip().replace("\\", "/")
        src = _img_src(rel, output_dir, embed)
        parts.append(
            f'<img src="{src}" alt="" '
            f'style="max-width:100%;max-height:240px;display:block;margin:8px 0;'
            f'border:1px solid #d8dce3;border-radius:6px">'
        )
        last = m.end()
    parts.append(_esc(text[last:]))
    return "".join(parts).replace("\n", "<br>")
def _rich_html_with_images(html: str, output_dir: Path, embed: bool = False) -> str:
    """Same as _text_to_html, but the input is already HTML, so it is NOT escaped."""
    def sub(m):
        rel = m.group(1).strip().replace("\\", "/")
        src = _img_src(rel, output_dir, embed)
        return (f'<img src="{src}" alt="" '
                f'style="max-width:100%;max-height:240px;display:block;margin:8px 0;'
                f'border:1px solid #d8dce3;border-radius:6px">')
    return _IMG_PLACEHOLDER_RE.sub(sub, html)
def fetch_sections(prefix: Optional[str] = None):
    conn = pyodbc.connect(CONN_STR, autocommit=True)
    cur = conn.cursor()
    if prefix:
        cur.execute("""
            SELECT s.prefix, s.code, s.title, s.keywords, s.char_count,
                   s.source_file, s.output_path, s.updated_at,
                   s.images, s.html_text, f.section_count
            FROM RIP_help_sections s
            LEFT JOIN RIP_help_files f
                   ON f.file_path = s.source_file AND f.prefix = s.prefix
            WHERE s.prefix = ?
            ORDER BY s.code
        """, prefix)
    else:
        cur.execute("""
            SELECT s.prefix, s.code, s.title, s.keywords, s.char_count,
                   s.source_file, s.output_path, s.updated_at,
                   s.images, s.html_text, f.section_count
            FROM RIP_help_sections s
            LEFT JOIN RIP_help_files f
                   ON f.file_path = s.source_file AND f.prefix = s.prefix
            ORDER BY s.prefix, s.code
        """)
    cols = [c[0] for c in cur.description]
    rows = []
    for r in cur.fetchall():
        d = dict(zip(cols, r))
        d["updated_at"] = str(d["updated_at"])[:16] if d["updated_at"] else ""
        # parse the images JSON
        try:
            d["images"] = json.loads(d["images"]) if d.get("images") else []
        except Exception:
            d["images"] = []
        # read the text from the .txt file if it exists
        d["text"] = ""
        d["text_html"] = ""        # file:// URIs, for the viewer
        d["text_html_embed"] = ""  # base64 data: URIs, for export (self-contained)
        out_dir = Path(d["output_path"]).parent if d.get("output_path") else None
        if d.get("output_path") and Path(d["output_path"]).exists():
            try:
                txt_path = Path(d["output_path"])
                raw = txt_path.read_text(encoding="utf-8")
                # The separator character is missing from this listing. An empty
                # separator makes str.split raise ValueError, so fall back to the
                # whole file in that case.
                sep = "" * 60
                parts = raw.split(sep, 1) if sep else [raw]
                body = parts[1].strip() if len(parts) > 1 else raw
                d["text"] = body[:800]
            except Exception:
                pass
        # rich HTML from the DB takes priority; otherwise fall back to plain text
        if d.get("html_text") and out_dir:
            d["text_html"] = _rich_html_with_images(d["html_text"], out_dir, embed=False)
            d["text_html_embed"] = _rich_html_with_images(d["html_text"], out_dir, embed=True)
        elif out_dir and d["text"]:
            d["text_html"] = _text_to_html(d["text"][:1200], out_dir, embed=False)
            d["text_html_embed"] = _text_to_html(d["text"], out_dir, embed=True)
        rows.append(d)
    conn.close()
    return rows
def _home_image_data_uri(home_path: Optional[str]) -> Optional[str]:
    """Return a data: URI if the file exists, otherwise None."""
    if not home_path:
        return None
    p = Path(home_path).expanduser()
    if not p.is_absolute():
        p = (Path(__file__).parent / p).resolve()
    if not p.is_file():
        print(f"  [home] file not found: {p}", file=sys.stderr)
        return None
    mime = mimetypes.guess_type(str(p))[0] or "image/png"
    b64 = base64.b64encode(p.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{b64}"
def build_html(sections, home_image: Optional[str] = None):
    data_json = json.dumps(sections, ensure_ascii=False)
    generated = datetime.now().strftime("%d.%m.%Y %H:%M")
    home_uri = _home_image_data_uri(home_image)
    if home_uri:
        home_tab_html = '<div class="tab active" onclick="switchTab(\'home\')">00 / Home</div>'
        editor_tab_cls = "tab"
        editor_panel_cls = "panel"
        home_panel_html = (
            '<div id="tab-home" class="panel active">'
            f' <div class="home-wrap"><img src="{home_uri}" alt="Home"></div>'
            '</div>'
        )
        tab_index_list = "['home','editor','search','generator']"
        initial_tab = "home"
    else:
        home_tab_html = ""
        editor_tab_cls = "tab active"
        editor_panel_cls = "panel active"
        home_panel_html = ""
        tab_index_list = "['editor','search','generator']"
        initial_tab = "editor"
    return f"""<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Help Viewer</title>
<style>
@import url('https://fonts.googleapis.com/css2?family=IBM+Plex+Mono:wght@400;500&family=IBM+Plex+Sans:wght@300;400;500&display=swap');
*, *::before, *::after {{ box-sizing: border-box; margin: 0; padding: 0; }}
:root {{
--bg: #f5f6f8;
--bg2: #ffffff;
--bg3: #eef0f4;
--border: #d8dce3;
--accent: #3a6fcf;
--accent2: #2ca36f;
--text: #1a1a1f;
--muted: #6a6a78;
--danger: #c84545;
--tag-bg: #e7efff;
--tag-text: #2c5cb8;
--mono: 'IBM Plex Mono', monospace;
--sans: 'IBM Plex Sans', sans-serif;
--radius: 6px;
--radius-lg: 12px;
}}
body {{
background: var(--bg);
color: var(--text);
font-family: var(--sans);
font-size: 14px;
min-height: 100vh;
}}
/* ── Header ── */
header {{
display: flex;
align-items: center;
gap: 16px;
padding: 14px 24px;
border-bottom: 1px solid #1f3a6f;
background: linear-gradient(90deg, #2d5fb0 0%, #3a6fcf 55%, #5589dd 100%);
color: #ffffff;
box-shadow: 0 2px 6px rgba(0,0,0,.10);
position: sticky; top: 0; z-index: 100;
}}
header h1 {{
font-family: var(--mono);
font-size: 16px;
font-weight: 600;
color: #ffffff;
letter-spacing: .05em;
padding: 4px 12px;
background: rgba(255,255,255,.14);
border: 1px solid rgba(255,255,255,.28);
border-radius: var(--radius);
}}
header .sep {{ color: rgba(255,255,255,.35); }}
header #total-count {{ color: rgba(255,255,255,.85) !important; }}
.gen-time {{
font-size: 11px;
color: rgba(255,255,255,.80);
margin-left: auto;
font-family: var(--mono);
}}
/* ── Tabs ── */
.tabs {{ display: flex; gap: 2px; padding: 0 24px; background: var(--bg2); border-bottom: 1px solid var(--border); }}
.tab {{
padding: 10px 20px;
font-size: 13px;
font-family: var(--mono);
color: var(--muted);
cursor: pointer;
border-bottom: 2px solid transparent;
transition: color .15s, border-color .15s;
user-select: none;
}}
.tab.active {{ color: var(--accent); border-color: var(--accent); }}
.tab:hover:not(.active) {{ color: var(--text); }}
/* ── Panels ── */
.panel {{ display: none; padding: 20px 24px; }}
.panel.active {{ display: block; }}
/* ── Toolbar ── */
.toolbar {{ display: flex; gap: 10px; margin-bottom: 16px; flex-wrap: wrap; align-items: center; }}
input[type=text], textarea {{
background: var(--bg3);
border: 1px solid var(--border);
border-radius: var(--radius);
color: var(--text);
font-family: var(--sans);
font-size: 13px;
padding: 7px 12px;
outline: none;
transition: border-color .15s;
}}
input[type=text]:focus, textarea:focus {{ border-color: var(--accent); }}
.search-box {{ flex: 1; min-width: 220px; }}
button {{
padding: 7px 16px;
border-radius: var(--radius);
border: none;
font-family: var(--mono);
font-size: 12px;
cursor: pointer;
transition: opacity .15s;
}}
button:hover {{ opacity: .85; }}
.btn-primary {{ background: var(--accent); color: #fff; }}
.btn-success {{ background: var(--accent2); color: #0a1a12; }}
.btn-danger {{ background: var(--danger); color: #fff; }}
.btn-ghost {{ background: var(--bg3); color: var(--text); border: 1px solid var(--border); }}
/* ── Stats bar ── */
.stats {{ font-size: 11px; color: var(--muted); font-family: var(--mono); }}
/* ── Table ── */
.tbl-wrap {{ overflow-x: auto; border: 1px solid var(--border); border-radius: var(--radius-lg); }}
table {{ width: 100%; border-collapse: collapse; }}
thead th {{
background: var(--bg3);
padding: 10px 12px;
text-align: left;
font-family: var(--mono);
font-size: 11px;
font-weight: 500;
color: var(--muted);
letter-spacing: .06em;
text-transform: uppercase;
border-bottom: 1px solid var(--border);
white-space: nowrap;
}}
tbody tr {{ border-bottom: 1px solid var(--border); transition: background .1s; }}
tbody tr:last-child {{ border-bottom: none; }}
tbody tr:hover {{ background: var(--bg3); }}
td {{ padding: 8px 12px; vertical-align: top; }}
.code-badge {{
font-family: var(--mono);
font-size: 11px;
color: var(--accent);
background: rgba(58,111,207,.10);
padding: 2px 7px;
border-radius: 4px;
white-space: nowrap;
}}
.kw-cell {{ display: flex; flex-direction: column; gap: 6px; min-width: 280px; }}
.kw-tags {{ display: flex; flex-wrap: wrap; gap: 4px; min-height: 18px; }}
.tag {{
font-family: var(--mono);
font-size: 11px;
background: var(--tag-bg);
color: var(--tag-text);
padding: 2px 8px;
border-radius: 20px;
white-space: nowrap;
}}
.kw-edit-row {{ display: flex; gap: 6px; align-items: center; }}
.kw-input {{
flex: 1; min-width: 180px;
background: var(--bg2);
border: 1px solid var(--border);
border-radius: var(--radius);
color: var(--text);
padding: 5px 10px;
font-size: 12px;
font-family: var(--sans);
}}
.kw-input:focus {{
border-color: var(--accent);
background: var(--bg2);
box-shadow: 0 0 0 2px rgba(58,111,207,.15);
outline: none;
}}
.kw-input.changed {{ border-color: var(--accent2); background: rgba(44,163,111,.05); }}
.save-btn {{
padding: 5px 12px;
font-size: 12px;
background: var(--accent2);
color: #fff;
border-radius: var(--radius);
border: none;
cursor: pointer;
display: none;
font-family: var(--mono);
}}
.save-btn.visible {{ display: inline-block; }}
.save-btn:hover {{ opacity: .9; }}
.src-file {{ font-size: 11px; color: var(--muted); font-family: var(--mono); max-width: 200px; overflow: hidden; text-overflow: ellipsis; white-space: nowrap; }}
.title-cell {{ max-width: 220px; }}
/* ── Search results ── */
.results-grid {{ display: grid; grid-template-columns: repeat(auto-fill, minmax(340px, 1fr)); gap: 12px; }}
.card {{
background: var(--bg2);
border: 1px solid var(--border);
border-radius: var(--radius-lg);
padding: 14px 16px;
transition: border-color .15s;
cursor: pointer;
}}
.card:hover {{ border-color: var(--accent); }}
.card.selected {{ border-color: var(--accent2); background: rgba(44,163,111,.10); }}
.card-header {{ display: flex; justify-content: space-between; align-items: flex-start; gap: 8px; margin-bottom: 8px; }}
.card-title {{ font-weight: 500; font-size: 13px; line-height: 1.4; }}
.card-tags {{ display: flex; flex-wrap: wrap; gap: 4px; margin-bottom: 8px; }}
.card-text {{ font-size: 12px; color: var(--muted); line-height: 1.6; max-height: 280px; overflow: hidden; }}
.card-text img {{ max-width: 100%; max-height: 200px; display: block; margin: 6px 0; border-radius: 4px; border: 1px solid var(--border); }}
.card-footer {{ margin-top: 8px; font-size: 11px; color: var(--muted); font-family: var(--mono); }}
.check-icon {{ width: 18px; height: 18px; border-radius: 50%; border: 2px solid var(--border); flex-shrink: 0; margin-top: 2px; transition: all .15s; }}
.card.selected .check-icon {{ background: var(--accent2); border-color: var(--accent2); }}
/* ── Generator ── */
.gen-layout {{ display: grid; grid-template-columns: 1fr 280px; gap: 20px; }}
.selected-list {{ display: flex; flex-direction: column; gap: 8px; }}
.sel-item {{
background: var(--bg2);
border: 1px solid var(--border);
border-radius: var(--radius);
padding: 10px 14px;
display: flex;
align-items: center;
gap: 10px;
cursor: grab;
}}
.sel-item:active {{ cursor: grabbing; }}
.sel-item.drag-over {{ border-color: var(--accent); background: rgba(58,111,207,.10); }}
.drag-handle {{ color: var(--muted); font-size: 16px; user-select: none; }}
.sel-item-info {{ flex: 1; }}
.sel-item-title {{ font-size: 13px; font-weight: 500; }}
.sel-item-code {{ font-family: var(--mono); font-size: 11px; color: var(--muted); }}
.remove-btn {{ background: none; border: none; color: var(--muted); font-size: 16px; cursor: pointer; padding: 0 4px; }}
.remove-btn:hover {{ color: var(--danger); }}
.gen-panel {{
background: var(--bg2);
border: 1px solid var(--border);
border-radius: var(--radius-lg);
padding: 20px;
position: sticky;
top: 80px;
}}
.gen-panel h3 {{ font-family: var(--mono); font-size: 13px; color: var(--muted); margin-bottom: 16px; letter-spacing: .06em; }}
.format-btns {{ display: flex; flex-direction: column; gap: 8px; }}
.format-btn {{
padding: 10px 16px;
border-radius: var(--radius);
border: 1px solid var(--border);
background: var(--bg3);
color: var(--text);
font-family: var(--mono);
font-size: 12px;
cursor: pointer;
text-align: left;
transition: all .15s;
display: flex;
align-items: center;
gap: 10px;
}}
.format-btn:hover {{ border-color: var(--accent); color: var(--accent); }}
.format-dot {{ width: 8px; height: 8px; border-radius: 50%; }}
.dot-word {{ background: #2b7fd4; }}
.dot-html {{ background: #e0854a; }}
.dot-pdf {{ background: #e05c5c; }}
.empty-state {{ text-align: center; padding: 60px 20px; color: var(--muted); }}
.empty-state .big {{ font-size: 40px; margin-bottom: 12px; }}
/* ── Toast ── */
#toast {{
position: fixed; bottom: 24px; right: 24px;
background: var(--accent2); color: #0a1a12;
padding: 10px 20px; border-radius: var(--radius);
font-family: var(--mono); font-size: 13px;
opacity: 0; transform: translateY(10px);
transition: all .3s; pointer-events: none; z-index: 999;
}}
#toast.show {{ opacity: 1; transform: translateY(0); }}
/* ── Save panel ── */
#save-panel {{
position: fixed; bottom: 24px; left: 50%; transform: translateX(-50%);
background: var(--bg2); border: 1px solid var(--accent);
border-radius: var(--radius-lg); padding: 14px 24px;
display: flex; align-items: center; gap: 16px;
box-shadow: 0 6px 24px rgba(0,0,0,.12);
opacity: 0; pointer-events: none; transition: opacity .2s; z-index: 200;
}}
#save-panel.show {{ opacity: 1; pointer-events: auto; }}
#save-count {{ font-family: var(--mono); font-size: 13px; color: var(--accent); }}
.sep {{ color: var(--border); margin: 0 4px; }}
/* ── Home tab ── */
.home-wrap {{
display: flex; justify-content: center; align-items: flex-start;
padding: 24px;
}}
.home-wrap img {{
max-width: 100%; max-height: calc(100vh - 200px);
border: 1px solid var(--border);
border-radius: var(--radius-lg);
box-shadow: 0 4px 16px rgba(0,0,0,.06);
}}
</style>
</head>
<body>
<header>
<h1>BG16RFPR001-1.001-0068</h1>
<span class="sep">|</span>
<span style="font-size:12px;color:var(--muted);font-family:var(--mono)" id="total-count"></span>
<span class="gen-time">generated: {generated}</span>
</header>
<div class="tabs">
{home_tab_html}
<div class="{editor_tab_cls}" onclick="switchTab('editor')">01 / Editor</div>
<div class="tab" onclick="switchTab('search')">02 / Search</div>
<div class="tab" onclick="switchTab('generator')">03 / Generator</div>
</div>
{home_panel_html}
<!-- ══════════════════════════════════════
TAB 1: EDITOR
══════════════════════════════════════ -->
<div id="tab-editor" class="{editor_panel_cls}">
<div class="toolbar">
<input type="text" class="search-box" id="editor-search" placeholder="Filter by code, title, keyword..." oninput="filterEditor()">
<span class="stats" id="editor-stats"></span>
</div>
<div class="tbl-wrap">
<table id="editor-table">
<thead>
<tr>
<th>Code</th>
<th>Title</th>
<th>Keywords</th>
<th>Source file</th>
<th>Updated</th>
</tr>
</thead>
<tbody id="editor-body"></tbody>
</table>
</div>
</div>
<!-- ══════════════════════════════════════
TAB 2: SEARCH
══════════════════════════════════════ -->
<div id="tab-search" class="panel">
<div class="toolbar">
<input type="text" class="search-box" id="search-input"
placeholder='Several words separated by spaces (AND); for a phrase use "quotes"'
oninput="doSearch()">
<span class="stats" id="search-stats"></span>
<button class="btn-primary" onclick="addSelectedToGenerator()">Add selected → Generator</button>
</div>
<div class="results-grid" id="search-results"></div>
</div>
<!-- ══════════════════════════════════════
TAB 3: GENERATOR
══════════════════════════════════════ -->
<div id="tab-generator" class="panel">
<div class="gen-layout">
<div>
<div class="toolbar">
<span class="stats" id="gen-stats">No sections selected</span>
<button class="btn-ghost" onclick="clearGenerator()">Clear</button>
</div>
<div class="selected-list" id="selected-list">
<div class="empty-state">
<div class="big">⬡</div>
<div>Select sections from the Search tab</div>
</div>
</div>
</div>
<div class="gen-panel">
<h3>GENERATE DOCUMENT</h3>
<div class="format-btns">
<button class="format-btn" onclick="generateDoc('docx')">
<span class="format-dot dot-word"></span> Word (.docx)
</button>
<button class="format-btn" onclick="generateDoc('html')">
<span class="format-dot dot-html"></span> HTML file
</button>
<button class="format-btn" onclick="generateDoc('pdf')">
<span class="format-dot dot-pdf"></span> PDF
</button>
</div>
<div style="margin-top:20px;font-size:11px;color:var(--muted);line-height:1.7">
Arrange the sections with drag &amp; drop before generating.
<br><br>
Word and PDF need a Python backend; for now HTML is generated.
</div>
</div>
</div>
</div>
<!-- Save panel -->
<div id="save-panel">
<span id="save-count">0 changes</span>
<button class="btn-success" onclick="saveChanges()">Save to JSON</button>
<button class="btn-ghost" onclick="discardChanges()">Discard</button>
</div>
<div id="toast"></div>
<script>
const ALL = {data_json};
const changes = {{}}; // code -> new keywords
const selected = new Set(); // codes selected for generator
let genOrder = [];
// ── Init ──────────────────────────────
document.getElementById('total-count').textContent = ALL.length + ' sections';
renderEditor(ALL);
doSearch();
renderGenerator();
// ── Tabs ──────────────────────────────
function switchTab(name) {{
const order = {tab_index_list};
document.querySelectorAll('.tab').forEach((t,i) => t.classList.toggle('active', order[i] === name));
document.querySelectorAll('.panel').forEach(p => p.classList.remove('active'));
document.getElementById('tab-' + name).classList.add('active');
if (name === 'generator') renderGenerator();
}}
// ── EDITOR ────────────────────────────
function renderEditor(rows) {{
const tbody = document.getElementById('editor-body');
tbody.innerHTML = rows.map(r => `
<tr>
<td><span class="code-badge">${{r.code}}</span></td>
<td class="title-cell">${{esc(r.title)}}</td>
<td>
<div class="kw-cell">
<div class="kw-tags" id="tags-${{r.code}}">
${{(r.keywords||'').split(',').filter(k=>k.trim()).map(k=>`<span class="tag">${{esc(k.trim())}}</span>`).join('')}}
</div>
<div class="kw-edit-row">
<input class="kw-input" data-code="${{r.code}}" value="${{esc(r.keywords||'')}}"
placeholder="keywords, separated by commas"
oninput="onKwChange(this)"
onkeydown="if(event.key==='Enter'){{saveOne(this);event.preventDefault();}}">
<button class="save-btn" id="sb-${{r.code}}"
onclick="saveOne(document.querySelector(\`[data-code='${{r.code}}']\`))">✓ Save</button>
</div>
</div>
</td>
<td><span class="src-file" title="${{esc(r.source_file)}}">${{esc(shortPath(r.source_file))}}</span></td>
<td style="font-size:11px;color:var(--muted);font-family:var(--mono);white-space:nowrap">${{r.updated_at}}</td>
</tr>
`).join('');
document.getElementById('editor-stats').textContent = rows.length + ' rows';
}}
function filterEditor() {{
const q = document.getElementById('editor-search').value.toLowerCase();
const filtered = q ? ALL.filter(r =>
(r.code||'').toLowerCase().includes(q) ||
(r.title||'').toLowerCase().includes(q) ||
(r.keywords||'').toLowerCase().includes(q) ||
(r.source_file||'').toLowerCase().includes(q)
) : ALL;
renderEditor(filtered);
}}
function onKwChange(inp) {{
const code = inp.dataset.code;
changes[code] = inp.value;
inp.classList.add('changed');
document.getElementById('sb-' + code).classList.add('visible');
updateSavePanel();
// live preview of tags
const tagsHost = document.getElementById('tags-' + code);
if (tagsHost) {{
tagsHost.innerHTML = inp.value.split(',')
.filter(k => k.trim())
.map(k => `<span class="tag">${{esc(k.trim())}}</span>`)
.join('');
}}
}}
function saveOne(inp) {{
const code = inp.dataset.code;
changes[code] = inp.value;
inp.classList.remove('changed');
const btn = document.getElementById('sb-' + code);
if (btn) {{ btn.classList.remove('visible'); }}
const row = ALL.find(r => r.code === code);
if (row) row.keywords = inp.value;
updateSavePanel();
toast('Marked. Click "Save to JSON" below to apply it to the DB');
}}
// ── SEARCH ────────────────────────────
function tokenizeQuery(q) {{
const tokens = [];
const re = /"([^"]+)"|(\S+)/g;
let m;
while ((m = re.exec(q)) !== null) {{
const t = (m[1] || m[2] || '').toLowerCase();
if (t) tokens.push(t);
}}
return tokens;
}}
function doSearch() {{
const q = document.getElementById('search-input').value.trim().toLowerCase();
const tokens = q ? tokenizeQuery(q) : [];
const results = tokens.length === 0 ? ALL : ALL.filter(r => {{
const hay = ((r.keywords||'') + ' ' + (r.title||'') + ' ' + (r.text||'')).toLowerCase();
return tokens.every(t => hay.includes(t));
}});
document.getElementById('search-stats').textContent =
results.length + ' results' + (tokens.length > 1 ? ' (for ' + tokens.length + ' terms)' : '');
document.getElementById('search-results').innerHTML = results.map(r => `
<div class="card ${{selected.has(r.code)?'selected':''}}" onclick="toggleSelect('${{r.code}}', this)">
<div class="card-header">
<div>
<div class="code-badge" style="margin-bottom:6px">${{r.code}}</div>
<div class="card-title">${{esc(r.title)}}</div>
</div>
<div class="check-icon"></div>
</div>
<div class="card-tags">
${{(r.keywords||'').split(',').filter(k=>k.trim()).map(k=>`<span class="tag">${{esc(k.trim())}}</span>`).join('')}}
</div>
<div class="card-text">${{r.text_html || esc(r.text||'(no preview)')}}</div>
<div class="card-footer">${{esc(shortPath(r.source_file))}} &nbsp;·&nbsp; ${{r.char_count}} characters</div>
</div>
`).join('');
}}
function toggleSelect(code, el) {{
if (selected.has(code)) {{
selected.delete(code);
el.classList.remove('selected');
genOrder = genOrder.filter(c => c !== code);
}} else {{
selected.add(code);
el.classList.add('selected');
genOrder.push(code);
}}
}}
function addSelectedToGenerator() {{
if (selected.size === 0) {{ toast('Select at least one section'); return; }}
switchTab('generator');
}}
// ── GENERATOR ─────────────────────────
function renderGenerator() {{
const list = document.getElementById('selected-list');
const stats = document.getElementById('gen-stats');
if (genOrder.length === 0) {{
list.innerHTML = '<div class="empty-state"><div class="big">⬡</div><div>Select sections from the Search tab</div></div>';
stats.textContent = 'No sections selected';
return;
}}
stats.textContent = genOrder.length + ' sections selected';
list.innerHTML = genOrder.map((code, idx) => {{
const r = ALL.find(x => x.code === code);
if (!r) return '';
return `
<div class="sel-item" draggable="true" data-idx="${{idx}}"
ondragstart="dragStart(event,${{idx}})"
ondragover="dragOver(event,${{idx}})"
ondrop="dragDrop(event,${{idx}})"
ondragleave="this.classList.remove('drag-over')">
<span class="drag-handle">⠿</span>
<div class="sel-item-info">
<div class="sel-item-title">${{esc(r.title)}}</div>
<div class="sel-item-code">${{esc(code)}}</div>
</div>
<button class="remove-btn" onclick="removeFromGen('${{code}}')">✕</button>
</div>`;
}}).join('');
}}
function removeFromGen(code) {{
selected.delete(code);
genOrder = genOrder.filter(c => c !== code);
renderGenerator();
}}
function clearGenerator() {{
selected.clear();
genOrder = [];
renderGenerator();
}}
// Drag & drop
let dragIdx = null;
function dragStart(e, idx) {{ dragIdx = idx; e.dataTransfer.effectAllowed = 'move'; }}
function dragOver(e, idx) {{ e.preventDefault(); e.currentTarget.classList.add('drag-over'); }}
function dragDrop(e, idx) {{
e.preventDefault();
e.currentTarget.classList.remove('drag-over');
if (dragIdx === null || dragIdx === idx) return;
const moved = genOrder.splice(dragIdx, 1)[0];
genOrder.splice(idx, 0, moved);
dragIdx = null;
renderGenerator();
}}
// ── GENERATE DOC ──────────────────────
function generateDoc(fmt) {{
if (genOrder.length === 0) {{ toast('Select sections first'); return; }}
const sections = genOrder.map(code => ALL.find(r => r.code === code)).filter(Boolean);
if (fmt === 'html') {{
const html = buildHtmlDoc(sections);
const blob = new Blob([html], {{type: 'text/html;charset=utf-8'}});
const url = URL.createObjectURL(blob);
const win = window.open(url, '_blank');
if (!win) {{
toast('The browser blocked the new tab. Allow pop-ups for this file');
download('help_document.html', html, 'text/html');
}} else {{
toast('The HTML document was opened in a new tab');
}}
return;
}}
toast('Word and PDF export need a Python backend, which will be added later');
}}
function buildHtmlDoc(sections) {{
const body = sections.map(s => `
<section>
<h2>${{esc(s.title)}}</h2>
<p class="meta">${{esc(s.code)}} &nbsp;·&nbsp; ${{esc(s.keywords||'')}}</p>
<div class="content">${{s.text_html_embed || s.text_html || esc(s.text||'').replace(/\\n/g,'<br>')}}</div>
</section>
`).join('<hr>');
return `<!DOCTYPE html><html lang="bg"><head><meta charset="UTF-8">
<title>Help document</title>
<style>
body{{font-family:Georgia,serif;max-width:860px;margin:40px auto;padding:0 20px;color:#222;line-height:1.7}}
h1{{font-size:24px;border-bottom:2px solid #333;padding-bottom:10px}}
h2{{font-size:18px;margin-top:0;color:#1a1a2e}}
.meta{{font-size:11px;color:#888;font-family:monospace;margin-bottom:12px}}
.content{{font-size:14px}}
hr{{border:none;border-top:1px solid #e0e0e0;margin:32px 0}}
section{{margin-bottom:8px}}
</style></head><body>
<h1>Help document</h1>
<p style="font-size:12px;color:#888">Generated: ${{new Date().toLocaleString('bg-BG')}}</p>
<hr>
${{body}}
</body></html>`;
}}
// ── SAVE CHANGES ──────────────────────
function updateSavePanel() {{
const n = Object.keys(changes).length;
const panel = document.getElementById('save-panel');
document.getElementById('save-count').textContent = n + ' changes';
panel.classList.toggle('show', n > 0);
}}
function saveChanges() {{
const data = Object.entries(changes).map(([code, keywords]) => ({{code, keywords}}));
const json = JSON.stringify(data, null, 2);
download('keywords_changes.json', json, 'application/json');
toast('JSON file downloaded. Run save_keywords.py to write the changes to the DB');
Object.keys(changes).forEach(k => delete changes[k]);
updateSavePanel();
}}
function discardChanges() {{
Object.keys(changes).forEach(k => delete changes[k]);
document.querySelectorAll('.save-btn').forEach(b => b.classList.remove('visible'));
updateSavePanel();
toast('Changes discarded');
}}
// ── UTILS ─────────────────────────────
function esc(s) {{
return String(s||'').replace(/&/g,'&amp;').replace(/</g,'&lt;').replace(/>/g,'&gt;').replace(/"/g,'&quot;').replace(/'/g,'&#39;');
}}
function shortPath(p) {{
if (!p) return '';
const parts = p.replace(/\\\\/g,'/').split('/');
return parts.slice(-2).join('/');
}}
function toast(msg) {{
const el = document.getElementById('toast');
el.textContent = msg;
el.classList.add('show');
setTimeout(() => el.classList.remove('show'), 3000);
}}
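function download(filename, content, mime) {{
// Keep the object URL so it can be released once the download has started
const url = URL.createObjectURL(new Blob([content], {{type: mime}}));
const a = document.createElement('a');
a.href = url;
a.download = filename;
document.body.appendChild(a);
a.click();
a.remove();
setTimeout(() => URL.revokeObjectURL(url), 1000);
}}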
</script>
</body>
</html>"""
if __name__ == "__main__":
    ap = argparse.ArgumentParser(description="Generates help_viewer.html from the DB")
    ap.add_argument(
        "--prefix",
        default=os.getenv("HELP_PREFIX"),
        help="Filters the viewer by prefix (for example 'HLP', 'PROJ_X'). "
             "If omitted, all prefixes are shown."
    )
    ap.add_argument(
        "--out",
        default=str(OUT_HTML),
        help=f"Output HTML path (default: {OUT_HTML.name})."
    )
    ap.add_argument(
        "--home",
        default=None,
        help="Path to an image to show as the Home tab (first). "
             "If omitted, there is no Home tab (the three standard tabs remain)."
    )
    args = ap.parse_args()

    print("Reading from the database...")
    if args.prefix:
        print(f" Prefix filter: {args.prefix}")
    if args.home:
        print(f" Home image: {args.home}")
    try:
        sections = fetch_sections(prefix=args.prefix)
    except Exception as e:
        sys.exit(f"Error connecting to the DB: {e}")
    print(f"Found {len(sections)} sections.")

    html = build_html(sections, home_image=args.home)
    # resolve() makes the path absolute; Path.as_uri() raises ValueError on relative paths
    out_path = Path(args.out).resolve()
    out_path.write_text(html, encoding="utf-8")
    print(f"Generated: {out_path}")

    import webbrowser
    webbrowser.open(out_path.as_uri())
    print("Opened in the browser.")
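
The search in the viewer template above splits the query into tokens (a quoted phrase counts as one token) and keeps a section only if every token occurs in its keywords, title, or text. Below is a minimal Python sketch of the same rule, which may help when checking expected matches outside the browser. `tokenize_query` and `search_sections` are illustrative names that are not part of this commit, and the sample rows are invented.

```python
import re

# Mirrors the JS regex /"([^"]+)"|(\S+)/g from the viewer template.
_TOKEN_RE = re.compile(r'"([^"]+)"|(\S+)')


def tokenize_query(query: str) -> list[str]:
    """Split a query into lowercase tokens; a "quoted phrase" stays one token."""
    tokens = []
    for m in _TOKEN_RE.finditer(query):
        token = (m.group(1) or m.group(2) or "").lower()
        if token:
            tokens.append(token)
    return tokens


def search_sections(sections: list[dict], query: str) -> list[dict]:
    """Return the sections that contain every token (AND semantics)."""
    tokens = tokenize_query(query.strip())
    if not tokens:
        return list(sections)  # empty query: everything matches
    results = []
    for row in sections:
        haystack = " ".join(
            (row.get(key) or "") for key in ("keywords", "title", "text")
        ).lower()
        if all(token in haystack for token in tokens):
            results.append(row)
    return results


# Hypothetical sample rows, only for illustration.
SAMPLE = [
    {"code": "RIP-001", "title": "Print settings", "keywords": "print, paper", "text": "Choose paper size."},
    {"code": "RIP-002", "title": "Export", "keywords": "pdf", "text": "Export to PDF file."},
]

print([row["code"] for row in search_sections(SAMPLE, '"paper size" print')])
```

Matching is plain substring containment, so a token such as `print` also matches `printer`; that is the behaviour of `hay.includes(t)` in the template, not a choice made by this sketch.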

1162
help_processor.py Normal file

File diff suppressed because it is too large

7
requirements.txt Normal file

@@ -0,0 +1,7 @@
anthropic>=0.25.0
pyodbc>=5.0.0
python-docx>=1.1.0
beautifulsoup4>=4.12.0
lxml>=5.0.0
pdfplumber>=0.11.0
chardet>=5.0.0

77
save_keywords.py Normal file

@@ -0,0 +1,77 @@
"""
save_keywords.py
================
Reads keywords_changes.json (generated by the browser)
and writes the changes to SQL Server.
Run with: python save_keywords.py
"""
import os, sys, json
from pathlib import Path
from datetime import datetime
try:
    import pyodbc
except ImportError:
    sys.exit("Install pyodbc: pip install pyodbc")
# Credentials come from the environment (.env, see .env.example); never hardcode them here.
CONN_STR = os.getenv("HELP_DB_CONN")
if not CONN_STR:
    sys.exit("HELP_DB_CONN is not set. Copy .env.example to .env and fill in the connection string.")
CHANGES_FILE = Path(__file__).parent / "keywords_changes.json"
def main():
    if not CHANGES_FILE.exists():
        print("keywords_changes.json was not found.")
        print("Save the changes from the browser first.")
        return
    changes = json.loads(CHANGES_FILE.read_text(encoding="utf-8"))
    if not changes:
        print("No changes to write.")
        return
    print(f"Writing {len(changes)} changes to the DB...")
    conn = pyodbc.connect(CONN_STR, autocommit=False)
    cur = conn.cursor()
    ok, err = 0, 0
    try:
        for item in changes:
            # "or" guards against null values in the JSON
            code = (item.get("code") or "").strip()
            keywords = (item.get("keywords") or "").strip()
            if not code:
                continue
            try:
                # NOTE: the table name is fixed to the RIP prefix
                cur.execute(
                    "UPDATE RIP_help_sections SET keywords=?, updated_at=GETDATE() WHERE code=?",
                    keywords, code
                )
                if cur.rowcount > 0:
                    ok += 1
                    print(f" OK {code}")
                else:
                    print(f" ? {code}: not found in the DB")
            except Exception as e:
                print(f" ERR {code}: {e}")
                err += 1
        conn.commit()
    finally:
        # Close the connection even if the loop or the commit fails
        conn.close()
    print(f"\nDone: {ok} written, {err} errors.")
    # Archive the file
    ts = datetime.now().strftime("%Y%m%d_%H%M%S")
    archive = CHANGES_FILE.parent / f"keywords_changes_{ts}.json"
    CHANGES_FILE.rename(archive)
    print(f"File archived as: {archive.name}")


if __name__ == "__main__":
    main()
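
`saveChanges()` in the viewer downloads a JSON array of `{code, keywords}` objects, and `main()` above reads that array back. The following is a small sketch of that contract, assuming nothing beyond those two fields. `parse_changes` is a hypothetical helper, not a function in this repository, and the sample codes are invented.

```python
import json


def parse_changes(raw: str) -> list[tuple[str, str]]:
    """Turn the downloaded JSON text into (code, keywords) pairs.

    Entries without a code are skipped, as in main().
    """
    pairs = []
    for item in json.loads(raw):
        code = (item.get("code") or "").strip()
        keywords = (item.get("keywords") or "").strip()
        if code:
            pairs.append((code, keywords))
    return pairs


# Hypothetical file content, shaped like the array the browser downloads.
SAMPLE_JSON = """
[
  {"code": "RIP-001", "keywords": "print, paper "},
  {"code": "", "keywords": "ignored"},
  {"code": " RIP-002 ", "keywords": ""}
]
"""

print(parse_changes(SAMPLE_JSON))
```

An empty `keywords` string is kept on purpose: in the sketch, as in `main()`, it would clear the keywords of that section rather than skip the row.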

21
view.bat Normal file

@@ -0,0 +1,21 @@
@echo off
chcp 65001 > nul
call "%~dp0_load_env.bat" || exit /b 1
set PYTHONIOENCODING=utf-8
rem Optional: %1 = prefix filter (e.g. RIP, INEX_TM). Empty = show all.
if "%~1"=="" (
    echo Generate help_viewer.html from DB ^(all prefixes^)
    python generate_html.py
) else (
    echo Generate help_viewer.html from DB ^(prefix=%~1^)
    python generate_html.py --prefix=%~1
)
echo.
echo OK. The browser should now be open.
echo.
echo To write keyword changes back to the DB, run:
echo   python save_keywords.py
echo.
pause
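
view.bat relies on `_load_env.bat` to put the `.env` values into the environment before Python starts. For running the scripts without the launcher, a rough Python equivalent could look like the sketch below. It assumes the format shown in `.env.example` (one `NAME=value` per line, with `REM` comment lines). `load_env_text` is a hypothetical helper, skipping `REM` lines is an assumption rather than confirmed behaviour of `_load_env.bat`, and placing `HELP_PREFIX` in the file is only an illustration.

```python
import os


def load_env_text(text: str) -> dict[str, str]:
    """Parse NAME=value lines; skip blank lines and REM comment lines."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.upper().startswith("REM "):
            continue
        # Split on the first '=' only: connection strings contain more of them.
        name, sep, value = line.partition("=")
        if sep and name.strip():
            values[name.strip()] = value
    return values


# Placeholder values only; real credentials stay in the gitignored .env.
EXAMPLE = """REM Copy to .env and fill in real values.
HELP_PREFIX=RIP
HELP_DB_CONN=DRIVER={ODBC Driver 18 for SQL Server};SERVER=host,port;DATABASE=db;UID=user;PWD=password
"""

env = load_env_text(EXAMPLE)
os.environ.update(env)  # make the values visible to os.getenv()
print(sorted(env))
```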