Importing Source Files
Learn how to import various file formats into Codex Editor including Word documents, subtitles, Bible texts, and more
Codex Editor supports importing a wide variety of file formats for translation projects. Whether you're working with Bible texts, documents, subtitles, or structured data, the import system automatically detects and processes your files.
Understanding Source vs Target Import
Before importing, it's important to understand the two types of imports:
Source Import - Import original content that you want to translate
- Creates a new source file (read-only reference)
- Automatically creates a blank target file for your translation
- Use this for: Bible texts, documents you're translating, video subtitles
Target Import - Import existing translations for source files already in your project
- Matches translated content to existing source files
- Populates your target notebook with translations
- Use this for: partially completed translations, importing work from other tools
Opening the Import Interface
- Click the Compass icon in the sidebar to open Navigation
- Click the "Add Files" or "Import Source Files" button
- The import wizard will guide you through the process
Choosing Your Import Type
Option 1: Import New Source Files
Use this when you're starting fresh or adding new content to translate:
- Select "Source Files" in the wizard
- Browse available importers (see supported formats below)
- Upload your file(s)
- Codex creates both source and target notebooks automatically
Option 2: Import Translations
Use this when you have existing translations to bring into Codex:
- Select "Target Files" in the wizard
- Choose which source file this translation belongs to
- Select the appropriate importer
- The system aligns your translation with the source content
Tip: If you're importing a translation, make sure you've already imported the corresponding source file first. The wizard will show you which source files are available.
Supported File Formats
Bible & Scripture Formats
USFM (Unified Standard Format Markers)
- Standard format for Bible translations
- Preserves chapter/verse structure and formatting markers
- Supports footnotes, cross-references, and special formatting
- Use for: Bible translation projects, Paratext exports
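To illustrate the chapter/verse structure the importer preserves, here is a minimal sketch of how USFM `\id`, `\c`, and `\v` markers could be turned into verse references. The function name `parse_usfm_verses` and the `GEN 1:1` reference format are assumptions for illustration. This is not Codex Editor's actual parser, and it ignores footnotes, verse ranges, and other markers.

```python
import re

def parse_usfm_verses(usfm_text: str) -> dict[str, str]:
    """Map 'BOOK chapter:verse' references to verse text (illustrative only)."""
    book = "UNK"      # placeholder until an \id line is seen
    chapter = 0
    verses: dict[str, str] = {}
    for line in usfm_text.splitlines():
        line = line.strip()
        if line.startswith("\\id "):
            # \id GEN ... -> the book code is the first token after the marker
            book = line.split()[1]
        elif line.startswith("\\c "):
            chapter = int(line.split()[1])
        elif line.startswith("\\v "):
            match = re.match(r"\\v (\d+) (.*)", line)
            if match:
                verses[f"{book} {chapter}:{match.group(1)}"] = match.group(2)
    return verses

sample = "\\id GEN\n\\c 1\n\\v 1 In the beginning...\n\\v 2 And the earth..."
print(parse_usfm_verses(sample))
```

Keying each verse by book, chapter, and verse is what makes reference-based alignment possible when a translation of the same book is imported later.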
Paratext Projects
- Import complete Paratext projects (folder or ZIP)
- Processes .SFM files with metadata
- Reads project settings and book names
- Use for: Existing Paratext translation work
eBible Corpus Files
- CSV/TSV format with verse-by-verse content
- Metadata support for original language texts
- Use for: Structured Bible data exports
eBible Download
- Download Bible translations directly from eBible repository
- Access to hundreds of translations
- Includes Macula Hebrew/Greek texts
- Use for: Source text acquisition
Open Bible Stories (OBS)
- Import OBS markdown files
- Supports single files, ZIP archives, or repository download
- Preserves images and story structure
- Round-trip export supported
- Use for: Bible story translation projects
Document Formats
Microsoft Word (DOCX)
- Two import options available:
- Standard DOCX: Basic import with images
- DOCX Round-trip: Preserves complete structure for export back to Word
- Extracts text, paragraphs, and basic formatting
- Image extraction support
- Use for: Document translation, literature projects
Important: If you plan to export back to Word format later, use the "Word Documents (Round-trip)" importer. The standard importer cannot export back to DOCX.
Markdown
- GitHub Flavored Markdown support
- Image extraction
- Preserves headers, lists, and formatting
- Use for: Documentation translation, blog posts, articles
Smart Segmenter (Universal Text)
- Supports 40+ file formats including:
- Plain text (.txt)
- Code files (.py, .js, .java, etc.)
- Config files (.json, .yaml, .xml)
- Data files (.csv)
- Intelligent structure-aware splitting
- JSON parsing with object-to-section mapping
- Use for: Any text-based content not covered by other importers
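As a rough picture of what "object-to-section mapping" could mean for JSON input, the sketch below groups the string values under each top-level key into one section. The function name `json_to_sections` and the grouping rule are assumptions. The Smart Segmenter's real behavior may differ.

```python
import json

def json_to_sections(raw: str) -> dict[str, list[str]]:
    """Group the string values found under each top-level key into one section."""
    def collect_strings(node) -> list[str]:
        if isinstance(node, str):
            return [node]
        if isinstance(node, dict):
            return [s for value in node.values() for s in collect_strings(value)]
        if isinstance(node, list):
            return [s for item in node for s in collect_strings(item)]
        return []  # numbers, booleans, and null carry no translatable text

    data = json.loads(raw)
    if not isinstance(data, dict):
        data = {"root": data}  # wrap arrays and scalars in a single section
    return {key: collect_strings(value) for key, value in data.items()}

raw = '{"menu": {"open": "Open", "close": "Close"}, "count": 3, "tips": ["Save often"]}'
print(json_to_sections(raw))
```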
Media & Subtitle Formats
VTT/SRT Subtitles
- WebVTT (.vtt) and SubRip (.srt) formats
- Timestamp-based cell alignment
- Perfect for video translation projects
- Supports media synchronization
- Use for: Video dubbing, subtitle translation
Media Translation Workflow: Import subtitle files, translate the text cells, then export back to VTT/SRT format. See our Video & Audio Translation guide for details.
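To make "timestamped cells" concrete, here is a minimal sketch that reads cue timings from WebVTT or SRT text and produces one cell per cue. The cell layout (`start`, `end`, `text`) and the function names are assumptions, not Codex Editor's internal format. The sketch also ignores cue settings, styling, and notes.

```python
import re

# WebVTT uses "." before milliseconds and may omit hours; SRT uses ",".
CUE_TIMING = re.compile(
    r"(?:(\d+):)?(\d{2}):(\d{2})[.,](\d{3})\s*-->\s*"
    r"(?:(\d+):)?(\d{2}):(\d{2})[.,](\d{3})"
)

def to_seconds(hours, minutes, seconds, millis) -> float:
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000

def parse_cues(subtitle_text: str) -> list[dict]:
    """Return one {'start', 'end', 'text'} cell per cue (illustrative only)."""
    cells = []
    # Cues are separated by blank lines in both formats.
    for block in re.split(r"\n\s*\n", subtitle_text.strip()):
        lines = block.splitlines()
        for index, line in enumerate(lines):
            timing = CUE_TIMING.search(line)
            if timing:
                groups = timing.groups()
                cells.append({
                    "start": to_seconds(*groups[:4]),
                    "end": to_seconds(*groups[4:]),
                    "text": "\n".join(lines[index + 1:]),
                })
                break  # header blocks such as "WEBVTT" have no timing line
    return cells

vtt = "WEBVTT\n\n00:00:01.000 --> 00:00:04.000\nHello, world.\n\n00:04.500 --> 00:06.000\nSecond line."
print(parse_cues(vtt))
```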
Specialized Formats
InDesign (IDML)
- Import InDesign Markup files
- Structure preservation for round-trip workflow
- Use for: Publishing and layout projects
PDF (Experimental)
- Text extraction from PDF documents
- Page-based segmentation
- Use for: Translating PDF documents
CSV/TSV (Tabular Data)
- Automatic column detection
- Intelligent mapping (source, target, ID columns)
- Creates source and target notebooks
- Use for: Spreadsheet-based translation workflows
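The column mapping described above can be pictured as a header-name heuristic. The sketch below is an assumption about how such detection might work. The hint lists in `ROLE_HINTS` and the function name `detect_columns` are hypothetical, and the actual importer's rules are not documented here.

```python
import csv
import io
from typing import Optional

# Hypothetical header hints for each role.
ROLE_HINTS = {
    "id": ("id", "key", "ref", "reference"),
    "source": ("source", "original", "src"),
    "target": ("target", "translation", "tgt"),
}

def detect_columns(text: str) -> dict[str, Optional[str]]:
    """Guess which header names hold the ID, source, and target columns."""
    first_line = text.splitlines()[0]
    delimiter = "\t" if "\t" in first_line else ","  # TSV vs CSV
    header = next(csv.reader(io.StringIO(text), delimiter=delimiter))
    roles: dict[str, Optional[str]] = {role: None for role in ROLE_HINTS}
    for name in header:
        normalized = name.strip().lower()
        for role, hints in ROLE_HINTS.items():
            if roles[role] is None and normalized in hints:
                roles[role] = name
    return roles

tsv = "ID\tSource\tTranslation\nGEN 1:1\tIn the beginning\tAu commencement\n"
print(detect_columns(tsv))
```

A role left as `None` would be a natural point to ask the user to pick the column manually.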
Import Workflow Examples
Example 1: Importing a Bible Book (USFM)
- Open Navigation → Add Files
- Select "Source Files"
- Choose "USFM" importer
- Upload your .usfm file (e.g., 01-GEN.usfm)
- Codex creates:
- GEN.source: Source text (read-only)
- GEN.codex: Your translation workspace
Example 2: Downloading a Bible from the Cloud
You can download open-license Bibles directly within Codex:
- Open Navigation → Add Files
- Select "Source Files"
- Choose "eBible Download" importer
- Search for your language or specific Bible version (e.g., "WEB", "Greek")
- Includes popular versions like BBE, WEB, ASV, KJV
- Includes original languages (Hebrew Masoretic, Greek NT, Septuagint)
- Click Download to fetch the text
- Optionally check 'Import as Translation Only' if you want to populate an existing target project
- Monitor progress and confirm once complete
Example 3: Importing Video Subtitles
- Open Navigation → Add Files
- Select "Source Files"
- Choose "Subtitles (VTT/SRT)" importer
- Upload your subtitle file (e.g., video.vtt)
- Codex creates timestamped cells for each subtitle
- Translate each cell
- Export back to VTT/SRT when complete
Example 4: Importing a Word Document for Translation
- Open Navigation → Add Files
- Select "Source Files"
- Choose "Word Documents (Round-trip)" importer
- Upload your .docx file
- Translate paragraph by paragraph
- Export back to DOCX with formatting preserved
Example 5: Importing an Existing Translation
- Open Navigation → Add Files
- Select "Target Files"
- Choose the source file this translation belongs to
- Select the appropriate importer (USFM, DOCX, etc.)
- Upload your translation file
- Codex aligns content with your source file
- Review alignment and continue translation work
Advanced Import Features
Batch Import (Multiple Files)
- Drag and drop multiple files at once
- Use ZIP archives for organized imports
- The system processes files in parallel
Alignment Algorithms
When importing translations (Target Import), Codex uses intelligent alignment:
ID-Based Matching
- Matches by verse references (for Bible texts)
- Matches by cell IDs (for structured content)
- Accuracy: Very high for properly structured files
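ID-based matching amounts to a dictionary lookup keyed by reference. The following is a minimal sketch under that assumption. `align_by_id` and its return shape are illustrative names, not Codex Editor's API.

```python
def align_by_id(source_ids: list[str], translations: dict[str, str]):
    """Pair translated text with source cells that share the same ID.

    Returns (aligned, unmatched): aligned maps source IDs to translated text,
    unmatched lists translation IDs that have no source cell.
    """
    aligned = {
        cell_id: translations[cell_id]
        for cell_id in source_ids
        if cell_id in translations
    }
    known_ids = set(source_ids)
    unmatched = [cell_id for cell_id in translations if cell_id not in known_ids]
    return aligned, unmatched

source_ids = ["GEN 1:1", "GEN 1:2", "GEN 1:3"]
translations = {"GEN 1:1": "Au commencement", "GEN 1:3": "Dieu dit", "GEN 9:9": "?"}
aligned, unmatched = align_by_id(source_ids, translations)
print(aligned, unmatched)
```

Because each match depends only on the ID, missing or reordered verses in the translation file do not shift other cells, which is consistent with the high accuracy noted for well-structured files.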
Sequential Insertion
- Fills empty cells in order
- Used for content without IDs (DOCX, Markdown, plain text)
- Accuracy: Good for similar-length translations
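Sequential insertion can be sketched as walking the target cells in order and dropping the next imported segment into each empty one. The function below is an illustration under that assumption. The name `insert_sequentially` and the choice to skip already-filled cells are hypothetical details.

```python
def insert_sequentially(target_cells: list[str], segments: list[str]):
    """Fill empty target cells in order with imported segments.

    Returns (filled, leftover): the updated cells and any segments that
    did not fit because no empty cells remained.
    """
    filled = list(target_cells)
    remaining = iter(segments)
    for index, existing in enumerate(filled):
        if existing.strip():
            continue  # keep cells that already contain a translation
        segment = next(remaining, None)
        if segment is None:
            break  # ran out of imported segments
        filled[index] = segment
    return filled, list(remaining)

filled, leftover = insert_sequentially(["", "done", "", ""], ["a", "b"])
print(filled, leftover)
```

Since position is the only signal, one extra or missing paragraph in the translation shifts every later cell. This is why the method works best when source and translation have a similar number of segments.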
Timestamp-Based (Subtitles)
- Matches by time overlap
- Handles complex 1:many and many:1 mappings
- Accuracy: Excellent for time-synced content
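Matching by time overlap can be illustrated by assigning each translated cue to the source cue it overlaps most. This is a simplified sketch, not Codex Editor's algorithm. The tuple layout `(start, end, text)` and the function names are assumptions, and a translated cue that spans several source cues is attached only to the one with the largest overlap.

```python
def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time ranges (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))

def align_by_time(source_cues, target_cues):
    """Return, for each source cue, the list of target texts that overlap it most."""
    aligned = [[] for _ in source_cues]
    for t_start, t_end, t_text in target_cues:
        overlaps = [
            overlap(s_start, s_end, t_start, t_end)
            for s_start, s_end, _ in source_cues
        ]
        best = max(range(len(source_cues)), key=overlaps.__getitem__, default=None)
        if best is not None and overlaps[best] > 0:
            aligned[best].append(t_text)
    return aligned

source = [(0.0, 4.0, "A"), (4.0, 8.0, "B")]
target = [(0.0, 2.0, "a1"), (2.0, 4.0, "a2"), (4.5, 8.0, "b")]
print(align_by_time(source, target))
```

In the example, the first source cue receives two translated cues, which is the one-to-many case mentioned above.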
Preview & Confirm
For translation imports, you can:
- Preview alignment results before applying
- See matched vs paratext content statistics
- Try different alignment methods
- Review confidence scores
Troubleshooting Import Issues
Import Fails or File Not Recognized
Problem: File won't import or shows an error
Solutions:
- Verify the file format is supported (check file extension)
- Try using the "Smart Segmenter" for unusual text formats
- Check that the file isn't corrupted (open in original application)
- Ensure the file isn't password-protected or encrypted
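For DOCX files, a quick integrity check can be run outside Codex before retrying an import. A .docx file is a ZIP container, so Python's standard `zipfile` module can confirm that it opens and that its members pass their checksum tests. The helper name `docx_problem` is hypothetical, and the check for `word/document.xml` relies on the conventional location of the main document part.

```python
import io
import zipfile

def docx_problem(data: bytes):
    """Return a short description of the problem, or None if the container looks intact."""
    buffer = io.BytesIO(data)
    if not zipfile.is_zipfile(buffer):
        # Password-protected Office files and legacy .doc files are not ZIP containers.
        return "not a ZIP container (possibly corrupted, encrypted, or a legacy .doc file)"
    with zipfile.ZipFile(buffer) as archive:
        bad_member = archive.testzip()  # first member failing its CRC check, or None
        if bad_member is not None:
            return f"corrupted member: {bad_member}"
        if "word/document.xml" not in archive.namelist():
            return "missing word/document.xml"
    return None
```

To check a file on disk, pass its bytes, for example `docx_problem(open("report.docx", "rb").read())`, where `report.docx` is a placeholder file name.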
Translation Import Shows Poor Alignment
Problem: Imported translation doesn't match source content well
Solutions:
- Try a different alignment method in the preview screen
- Verify you selected the correct source file
- Check that source and translation have similar structure
- Consider manual alignment for complex cases
Images or Formatting Lost
Problem: Images or formatting didn't import
Solutions:
- Use the "Round-trip" importer versions (DOCX, IDML)
- Standard importers focus on text content only
- Images should appear in the .project/attachments/ folder
- Some formatting is intentionally simplified for translation
Duplicate File Warning
Problem: System warns that a file with this name already exists
Solutions:
- Rename the new file if it's different content
- Overwrite if you want to replace the existing file
- Cancel and review your existing files first
Best Practices
Before Importing
- Organize your files with clear, descriptive names
- Check file formats are supported
- Back up important files before conversion
- Review file quality (no corruption, proper encoding)
During Import
- Use appropriate importers (Round-trip for exports, specific parsers for accuracy)
- Preview alignment results for translation imports
- Check cell counts to ensure complete import
- Verify structure looks correct before proceeding
After Import
- Review imported content for accuracy
- Check metadata (book names, cell IDs, timestamps)
- Test export if using round-trip workflow
- Begin translation with confidence!
Next Steps
After importing your files:
- Begin using translation tools
- Configure AI assistance
- Set up collaboration
- Learn about export options
FAQ
Need more help? Join our Discord community for tips and assistance from other users.