Core Architecture
The TMT tool is structured as a pipeline that transforms source documents into translated outputs through a Parse → Validate → Execute lifecycle. The design deliberately decouples document format handling from the underlying translation service and network concerns.
System Overview
The application follows a modular design where the app module orchestrates the high-level flow, while specialized modules handle CLI parsing, configuration, and format-specific logic.
┌─────────────┐     ┌──────────────┐     ┌──────────────────┐     ┌──────────────┐
│  main.rs    │────▶│  app::run    │────▶│  formats::       │────▶│ Translation  │
│ (CLI parse) │     │ (lifecycle)  │     │  translate_file  │     │ Service      │
└─────────────┘     └──────────────┘     └──────────────────┘     └──────────────┘
                           │                      │
                     RuntimeConfig          Format Handler
                      (validated)       (pdf / docx / csv_tsv)
Data Flow and Lifecycle
1. Parse & Validate
The system converts the raw Cli struct into a RuntimeConfig. During this phase it performs the following safety checks and preparation steps:
- File existence and size: verifies that the input file exists and does not exceed MAX_FILE_SIZE_BYTES (1 MB).
- Format matching: ensures the output file extension matches the input extension.
- Environment preparation: creates any necessary parent directories for the output path.
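The three steps above can be sketched with the standard library alone. This is a minimal illustration, not the tool's actual code: the function name `validate_paths`, the `ValidationError` type, the binary interpretation of "1 MB", and the case-insensitive extension comparison are all assumptions made for the example.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Size limit quoted in the text (1 MB). Whether the real constant is
/// 1024 * 1024 or 1_000_000 is an assumption of this sketch.
const MAX_FILE_SIZE_BYTES: u64 = 1024 * 1024;

/// Hypothetical error type used only by this sketch.
#[derive(Debug)]
enum ValidationError {
    Missing,
    TooLarge { size: u64 },
    ExtensionMismatch,
    Io(io::Error),
}

/// Lowercased extension of a path, if it has one.
fn extension_of(path: &Path) -> Option<String> {
    path.extension()
        .and_then(|e| e.to_str())
        .map(|e| e.to_ascii_lowercase())
}

/// Hypothetical validation routine covering the three steps in order.
fn validate_paths(input: &Path, output: &Path) -> Result<(), ValidationError> {
    // 1. File existence and size.
    let meta = fs::metadata(input).map_err(|e| match e.kind() {
        io::ErrorKind::NotFound => ValidationError::Missing,
        _ => ValidationError::Io(e),
    })?;
    if meta.len() > MAX_FILE_SIZE_BYTES {
        return Err(ValidationError::TooLarge { size: meta.len() });
    }

    // 2. Format matching: input and output extensions must agree.
    if extension_of(input) != extension_of(output) {
        return Err(ValidationError::ExtensionMismatch);
    }

    // 3. Environment preparation: create missing parent directories.
    if let Some(parent) = output.parent() {
        if !parent.as_os_str().is_empty() {
            fs::create_dir_all(parent).map_err(ValidationError::Io)?;
        }
    }
    Ok(())
}
```

Running the cheap checks before touching the filesystem for output means a rejected request leaves no stray directories behind.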
2. Format Dispatch
Once validated, formats::translate_file acts as the central router. It reads the file extension, then initialises a TmtClient and TranslationService before handing off to the correct handler.
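Extension-based routing of this kind might look like the sketch below. The `FormatKind` enum and `detect_format` function are hypothetical names introduced for illustration, and the case-insensitive match is an assumption; the real `formats::translate_file` signature is not shown in this document.

```rust
use std::path::Path;

/// Hypothetical tag for the three handlers named in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FormatKind {
    Pdf,
    Docx,
    CsvTsv,
}

/// Map a file extension to a handler, or None if the format is unsupported.
fn detect_format(path: &Path) -> Option<FormatKind> {
    // Assumption: extensions are compared case-insensitively.
    let ext = path.extension()?.to_str()?.to_ascii_lowercase();
    match ext.as_str() {
        "pdf" => Some(FormatKind::Pdf),
        "docx" => Some(FormatKind::Docx),
        // CSV and TSV share one handler, mirroring the csv_tsv module name.
        "csv" | "tsv" => Some(FormatKind::CsvTsv),
        _ => None,
    }
}
```

Returning `Option` keeps the "unsupported format" decision with the caller, which can then turn it into whatever error the application defines.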
3. Execution
The request is dispatched to one of the three format handlers — PDF, DOCX, or CSV/TSV. Each handler owns the internal structure of its file format and delegates the actual text translation to TranslationService.
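The split between structure and translation can be illustrated with a toy handler. Everything here is a stand-in: the `Translate` trait, the `Uppercase` translator, and `translate_tsv` are hypothetical, and the naive tab split ignores quoting rules that a real CSV/TSV parser would handle.

```rust
/// Stand-in for the translation layer. The real TranslationService API
/// is not shown in this document, so this trait is an assumption.
trait Translate {
    fn translate(&self, text: &str) -> String;
}

/// Toy translator that only exists to make the sketch runnable.
struct Uppercase;

impl Translate for Uppercase {
    fn translate(&self, text: &str) -> String {
        text.to_uppercase()
    }
}

/// Hypothetical TSV handler: it owns the row and column structure and
/// passes only the cell text to the translator.
fn translate_tsv(input: &str, translator: &dyn Translate) -> String {
    input
        .lines()
        .map(|line| {
            line.split('\t')
                .map(|cell| translator.translate(cell))
                .collect::<Vec<_>>()
                .join("\t")
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```

Because the handler sees the translator only through a trait, the same handler code could be exercised in tests with a fake translator and no network access.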
Major Subsystems
| Subsystem | Responsibility | Key Types |
|---|---|---|
| CLI & Config | Parse arguments and environment into a validated runtime state | Cli, RuntimeConfig |
| App Orchestration | Manage the high-level lifecycle and filesystem safety checks | app::run, validate_input_file |
| Format Handlers | Parse document structures and reconstruct translated output | formats::pdf, formats::docx, formats::csv_tsv |
| Translation Layer | Sentence splitting, caching, and concurrency control | TranslationService, TmtClient |
Error Handling Strategy
A centralised AppError enum (defined with the thiserror crate) captures all failure modes, from IO failures and rate limits to format-specific parsing errors. Underlying errors such as std::io::Error (standard library) and csv::Error (from the csv crate) are automatically converted into AppError variants via From implementations. Errors bubble up to main.rs, where they are logged before the process exits with a non-zero code.
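The conversion pattern can be sketched without external crates. The version below hand-writes the `Display` and `From` implementations that a thiserror derive would otherwise generate, so that it compiles with the standard library only. The variant names, messages, and the helpers `read_input` and `exit_code` are hypothetical, not the tool's actual definitions.

```rust
use std::fmt;
use std::io;

/// Hypothetical subset of an application-wide error enum.
#[derive(Debug)]
enum AppError {
    Io(io::Error),
    RateLimited,
    UnsupportedFormat(String),
}

impl fmt::Display for AppError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AppError::Io(e) => write!(f, "io error: {}", e),
            AppError::RateLimited => write!(f, "rate limit exceeded"),
            AppError::UnsupportedFormat(ext) => write!(f, "unsupported format: {}", ext),
        }
    }
}

impl std::error::Error for AppError {}

/// This From impl is what lets `?` convert io::Error automatically.
impl From<io::Error> for AppError {
    fn from(e: io::Error) -> Self {
        AppError::Io(e)
    }
}

/// The `?` operator turns the io::Error into AppError::Io via From.
fn read_input(path: &str) -> Result<String, AppError> {
    let text = std::fs::read_to_string(path)?;
    Ok(text)
}

/// Top-level policy: log the error and map it to a process exit code.
fn exit_code(result: &Result<String, AppError>) -> i32 {
    match result {
        Ok(_) => 0,
        Err(e) => {
            eprintln!("error: {}", e);
            1
        }
    }
}
```

A `main` function following this pattern would call `std::process::exit(exit_code(&result))` after running the pipeline, keeping all logging and exit-code decisions in one place.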
See Error Handling for a full reference of all error variants.