+Tools
Documentation

Version 0.1 beta, All rights reserved
Ⓒ 1998-2024, Yves Champollion

Introduction

+Tools (say "PlusTools") has two functions:

  1. +Tools is a Translation Memory & glossary editor for the translation industry.
    It supports the Wordfast TXT format, as well as OASIS TMX, TBX, and XLIFF formats. +Tools works on disk, not in RAM, so very large files are supported while maintaining speed. Browsing, editing, searching and/or replacing, filtering, splitting, merging, converting, and exporting are supported, as well as spotting malformed units, anonymizing data, removing redundant units, aligning content to create TMs, and much more.
    Wordfast Server (WFS) TMs and glossaries are also supported through the HTTP protocol, even while they are in use by WFS, which is crucial in most workflows.

  2. +Tools is also a Computer-Assisted Translation (CAT) tool that currently handles Microsoft DOCX Word files, PDF, and XLIFF. If that's your primary use, jump to the relevant section.

Download & install

Click here for the installer (Windows 10/11 recommended)


Translation Memory & glossary editor for the translation industry.

Most CAT tools have little to no utilities to manage Translation Memories and glossaries.
Some tools provide a simplistic lookup function, and little else. +Tools' aim is to bring a familiar, spreadsheet-like interface for browsing and searching, and to support TMs that far exceed spreadsheet capacity (millions of lines). The other aim is to provide utilities designed and optimized for linguistic databases.

Powerful utilities are available:

The list of features is constantly growing. Terminology extraction is in preparation.

Note: +Tools is currently in beta stage, and free. Use at your own risk.

Quick start

Click here for the installer.

Overall, the browser paradigm is used, so you don't need to learn yet another User Interface.
Data is presented in a spreadsheet-like way so, again, you are in a familiar UI. The F2 key toggles between source/target view (2-column) and all fields. Click a + symbol to add a tab, an x symbol to close a tab. +Tools can open up to 5 tabs.

Acronyms used in this guide.
TM stands for Translation Memory.
Glossary is frequently used instead of the more glorious "Terminology Database" for the sake of brevity.
TMX is the acronym for Translation Memory eXchange, a TM standard widely used in the translation/localization industry, maintained by the OASIS group.
TBX is the acronym for Term Base eXchange, a terminology standard widely used in the translation/localization industry, maintained by the OASIS group.
WFS means Wordfast Server, a TM and terminology (glossary) server marketed by Wordfast LLC. Note that WFS is free for personal use (up to three simultaneously connected users). It's one of the best-kept secrets in the industry, bringing immense power to users with a technical mind.
WFC means Wordfast Classic, a widely used translation tool which first appeared in 1999, and whose TM/glo formats (unchanged since 1999) probably are the most user-friendly.

Who, why

Who

+Tools is programmed by Yves Champollion.
Suggestions, bug reports, spanking can be directed to yves@champollion.net

Why

Maintaining CAT tool data (TMs and Glossaries) is a problem for power users, project managers, even freelance translators with years of activity, who have accumulated precious assets. One free and open-source utility is Olifant, a member of the Okapi framework, with Yves Savourel as the main impulse behind the project.
Olifant supports TMX and the WFC format. But it's not meant for glossaries, and can only handle TMs of modest sizes.

Here are issues power users deal with every day:

Essential actions and shortcuts

To select/unselect individual lines, create a filter first with Ctrl+L, or with the Menu > Filter.

Without a filter
  Delete one line                       Delete key
  Add one line                          Insert key
  Edit a cell                           Enter. After editing, Enter confirms changes, Escape cancels changes.
  Edit an entire line                   Shift+Ctrl+Enter. Note that with an XML file, this is a raw edit: you must pay attention to marked-up content, tags, XML-forbidden characters, etc. Do not edit XML files if you are not familiar with XML, TMX, TBX.
  Find                                  Ctrl+F (Ctrl+Up/Down to find next/previous)
  Find-replace                          Ctrl+H
  Filters                               Ctrl+L (Filter, Concordance, Suspicious TUs, Redundant TUs)
  Columns                               F2 toggles all-column or 2-column view. Right-click a column header to reset column widths.
  To end of file                        Ctrl+End
  To start of file                      Ctrl+Home
  Paste all lines from the clipboard    Ctrl+V
With a filter
  Select all                            Ctrl+A
  Unselect all                          Ctrl+B
  Reverse-select all                    Ctrl+R
  Select current line                   Space bar (it's a toggle to Select/Unselect)
  Copy                                  Ctrl+C. Copies all selected lines to the clipboard, overwriting its previous content.
  Delete lines                          Ctrl+Delete

The clipboard is just another file named clipboard.tmp, and survives closing/opening +Tools. The clipboard can be opened as any other file. As with most clipboards, every Copy action rewrites it.

Action time! A few examples

I want to filter some lines, then copy or delete them.
  1. Hamburger menu > Filter (or Ctrl+L). Choose & set a filter, apply it.
  2. Now, only lines that pass the filter are visible, and "selected", shown against a blue background.
    Spacebar: unselect a line, Ctrl+A: select all; Ctrl+B: unselect all; Ctrl+R: reverse-select all.
  3. You can now use:
    Ctrl+C ("Copy") to copy selected & filtered lines to the clipboard, or
    Ctrl+Delete to delete them from the file.

I want to add +Tools clipboard content to some file.
  • Open that file in +Tools.
  • Use Ctrl+V to paste from the +Tools clipboard into that file.

    Spot and remove suspected garbage from my TM
    1. Hamburger menu > Filter > Suspicious. Set the conditions, apply.
    2. Now, only lines that may be faulty, or unwanted, are visible. They are all "selected".
      Spacebar: unselect a line, Ctrl+A: select all; Ctrl+B: unselect all; Ctrl+R: reverse-select all.
    3. Use Ctrl+C ("Copy") to copy selected & filtered lines to the clipboard, or
      Use Ctrl+Delete to delete them from the file.
      Note that a "stopword" approach is also used. It is a linguistic criterion (not a statistical one, like the other approaches). In segments of over X words (typically, X >= 7), the absence of any stopword can reveal two crucial symptoms: A. Wrong language; B. Gibberish.

    The approach here is to statistically spot & remove garbage. On close inspection, you may fault the software, finding false positives. You can manually unselect "garbage" that is legit, however, when the TM size is over 10,000, not to mention 100,000 or a million units, that is not feasible. The idea, with very large TMs, is that, if spotting garbage is 90% reliable, and garbage is frequent, a TM that's free from 90% of its garbage (risking the deletion of a couple legit units) is still better than a dirty monster. Everything comes at a price, even cleanliness.
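
The stopword check mentioned above can be sketched in a few lines of Python. This is only an illustration, not +Tools' actual code: the function name looks_suspicious, the tiny English stopword set, and the default threshold of 7 words are assumptions made for the example.

```python
import re

# A deliberately tiny English stopword set (assumption: a real list would be
# much longer, and there would be one list per language).
ENGLISH_STOPWORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "it",
    "that", "for", "on", "with",
}


def looks_suspicious(segment, stopwords, min_words=7):
    """Return True if a segment of over min_words words has no stopword."""
    words = re.findall(r"\w+", segment.lower())
    if len(words) <= min_words:
        # Short segments often lack stopwords for perfectly good reasons.
        return False
    return not any(word in stopwords for word in words)
```

A German sentence checked against the English list would be flagged (wrong language), while a regular English sentence of the same length would pass. Short segments such as "Click OK" are never flagged.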


    Spot and remove redundant lines (TUs, or entries) from my TM or glossary
    1. Hamburger menu > Filter > Redundant. Set the conditions, apply.
    2. Now only redundant lines are visible, and "selected", in groups of siblings (redundant lines).
      Spacebar: unselect a line, Ctrl+A: select all; Ctrl+B: unselect all; Ctrl+R: reverse-select all.
    3. Use Ctrl+Delete to delete filtered & selected lines from the TM.
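
To picture what "groups of siblings" means, here is a minimal Python sketch. It assumes, for the sake of the example, that two units are redundant when their source and target are identical once case and extra spaces are ignored; the actual conditions are the ones you set in the filter. The function name find_redundant is hypothetical.

```python
from collections import defaultdict


def find_redundant(units):
    """Group (source, target) pairs that are identical after normalization.

    Returns a list of sibling groups; each group is a list of line indexes.
    """
    groups = defaultdict(list)
    for index, (source, target) in enumerate(units):
        # Assumed redundancy condition: same text, ignoring case and extra spaces.
        key = (" ".join(source.lower().split()), " ".join(target.lower().split()))
        groups[key].append(index)
    # Only keys seen more than once form a group of siblings.
    return [indexes for indexes in groups.values() if len(indexes) > 1]
```

In such a scheme, one would typically keep the first line of every group and delete its siblings.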
    Note:

    Align text to create a translation memory
    Make sure there are no more than 3 opened tabs, so there is room for the resulting TM.
    1. Open a new tab, click "Align".
    2. Copy-paste source, and target content in the appropriate boxes, click "Align".
    Supported languages are Arabic, Bosnian, Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Greek, Hebrew, Hindi, Hungarian, Indonesian, Italian, Japanese, Korean, Latvian, Lithuanian, Polish, Portuguese, Norwegian, Romanian, Russian, Slovak, Spanish, Swedish, Turkish, Ukrainian, Vietnamese.
    We're working on Korean, and more languages.

    Evaluating the aligner. If you see misalignments, try aligning the same content with other aligners, even those that come at a price, and let me know the overall result.
    I may not respond to uncompared, undocumented "This/that was wrongly aligned". See my note on the subject: Testing vs Using:Academical vs Professional.

    +Tools' aligner does not need the internet, it's totally offline. It has to download a resource file per language, but that's done only once. Once that resource file has been downloaded, the alignment process does not need an internet connection.

    Anonymize the TM to prevent confidentiality issues
    1. Hamburger menu > Settings > Anonymization. Set the conditions, apply.
    2. Hamburger menu > Reorganize. Check "Anonymize", reorganize.
    Note:

    Globally replace Foobar with Barfoo in my file
    1. Press Ctrl+H (a popular shortcut for "Replace"), or click the  ...  icon next to the search bar.
    2. Enter Foobar under Find:     enter Barfoo under Replace with:
    3. Click "Replace"

    Globally replace a language code with another one
    1. Press Ctrl+H (a popular shortcut for "Replace"), or click the  ...  icon next to the search bar.
    2. Select the appropriate language code (source or target) in the list fields.
    3. Enter what language code to find, and what other language code must replace it.
    4. Click "Replace"
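
For readers who script their own maintenance, the same operation on a TMX file can be sketched with Python's standard library. This is an illustration under assumptions, not what +Tools does internally: the sample TMX is stripped down to the bare minimum, the function name is hypothetical, and the whole file is loaded in RAM (which +Tools avoids).

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Stripped-down sample; a real TMX header carries more attributes.
SAMPLE_TMX = """<tmx version="1.4">
  <header srclang="en-US"/>
  <body>
    <tu>
      <tuv xml:lang="en-US"><seg>Hello</seg></tuv>
      <tuv xml:lang="fr-FR"><seg>Bonjour</seg></tuv>
    </tu>
  </body>
</tmx>"""


def replace_language_code(tmx_text, old, new):
    """Replace a language code on every <tuv> and in the header's srclang."""
    root = ET.fromstring(tmx_text)
    header = root.find("header")
    if header is not None and header.get("srclang", "").lower() == old.lower():
        header.set("srclang", new)
    for tuv in root.iter("tuv"):
        if tuv.get(XML_LANG, "").lower() == old.lower():
            tuv.set(XML_LANG, new)
    return ET.tostring(root, encoding="unicode")
```

Calling replace_language_code(SAMPLE_TMX, "en-US", "en-GB") returns the same document with en-GB as the source language code; the fr-FR units are left untouched.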

    I want to append a TM/glossary to my current TM/glossary
    1. Hamburger menu > Filter > Append.

    I want to append the clipboard to my TM (or glossary)
    1. Press Ctrl+V (the universal shortcut for "Paste")

    Note: the clipboard is just another file, and can be viewed as either a TM, or a glossary.

    Note: Paste/Append between an XML-based format (TMX, TBX) and Wordfast Classic TXT:
    This is possible from TXT to XML. From XML to TXT, only the donor XML units with two language codes that match those of the recipient TXT file will be copied. If you work across TMs and glossaries whose language codes are not compatible, be aware that copying data must respect language codes.
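
The XML-to-TXT rule can be pictured with the following Python sketch. The data layout (one dictionary per donor unit, mapping a language code to its text), the case-insensitive comparison, and the function name are assumptions made for the example.

```python
def units_matching_recipient(donor_units, recipient_source, recipient_target):
    """Keep only donor units that carry both language codes of the recipient.

    Each donor unit is a dict mapping a language code to its segment text.
    Returns (source_text, target_text) pairs ready to append to the TXT file.
    """
    copied = []
    for unit in donor_units:
        # Compare codes case-insensitively (an assumption of this sketch).
        by_code = {code.upper(): text for code, text in unit.items()}
        source = by_code.get(recipient_source.upper())
        target = by_code.get(recipient_target.upper())
        if source is not None and target is not None:
            copied.append((source, target))
    return copied
```

With an EN-US > FR-FR recipient, an en-US/de-DE donor unit would simply be skipped.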


    How many lines, TM TUs / Glo entries, are in my file?
    1. Press F5 (a popular shortcut for "recalculate")

    WFS TMs and Glossaries

    To connect to WFS TMs and glossaries, note that the HTTP/HTTPS mode is used. Make sure your version of WFS is 1.14.742.274, or higher. If that is not the case, just download WFS from https://www.wordfast.net/zip/WfServer.zip.
    Note that WFS is free for up to three simultaneous connections. If you see the red "demo" mode flashing in WFS, but you have under 4 connections, WFS is actually working in full mode.
    Recent versions of Wordfast Server are compatible with the standard REST interface model used by +Tools.

    Under Setup > Network, check the Port HTTP Active checkbox. Make sure the computer where WfServer runs has the port 81 opened (or any other port you set up).

    In the "Accounts" tab in WFS, you must check the "Allow raw calls" checkbox. The account and its assigned TM/Glossary should not restrict access, and not use encryption.

    When the above is set and secured, +Tools can "talk" to WFS.

    [Screenshot: WfServer 1.14.742.274 (port: 47110), Setup tab. The Network panel shows the Port WF #, Port HTTP # (with its Active checkbox) and Port HTTPS # (with its Active checkbox) fields, plus the LAN IP and WAN IP. The SysLog and Email alerts panels follow.]

    A connection string is the same as with WF Pro, Wf Anywhere, except that the wf:// part (the native WFS protocol) is replaced with http:// as follows:

    http://accountName:passWord@11.12.13.14:81

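    where accountName and passWord are the credentials of the WFS account, 11.12.13.14 is the IP address (or domain name) of the machine where WFS runs, and 81 is the HTTP port set under Setup > Network.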

    For WFS locally running in the same machine as +Tools, the IP number would be 127.0.0.1, or "localhost" if you use a domain name.
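
If you generate or check connection strings in a script, Python's standard urllib.parse module can take them apart and put them together. The sketch below is only an illustration; the function names build and describe are hypothetical, and nothing here contacts WFS.

```python
from urllib.parse import quote, unquote, urlsplit


def build(account, password, host, port):
    """Assemble an http:// connection string from its four components."""
    # quote() protects characters such as @ or : inside the credentials.
    return f"http://{quote(account, safe='')}:{quote(password, safe='')}@{host}:{port}"


def describe(connection_string):
    """Split an http:// connection string into its four components."""
    parts = urlsplit(connection_string)
    return {
        "account": unquote(parts.username) if parts.username else None,
        "password": unquote(parts.password) if parts.password else None,
        "host": parts.hostname,
        "port": parts.port,
    }
```

For example, describe("http://accountName:passWord@11.12.13.14:81") yields the account name, the password, the host 11.12.13.14 and the port 81 as separate values.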

    Note that if WFS is "localhost" (located on your computer, IP 127.0.0.1), or in your company's intranet, HTTPS is not necessary, just use HTTP. If HTTPS is used, the workstation (a physical server) must be equipped with a valid SSL certificate for the IP, or domain name.

    The TM and Account, defined under TM or Account, should not restrict, nor encrypt access.

    In case of connection difficulty, make sure the "Accept raw calls" checkbox is checked (see above). For more complex issues, check out the Troubleshooting section in WFS' manual. Note, however, that your protocol is http:// (not the default wf:// protocol as in the manual).

    Searching in a Wordfast Server TM

    Press Ctrl+F for a quick-and-easy "full file" search mode. Ctrl+F searches the entire database: source and target segments, dates, user names, all meta data.

    Note that Concordance ( ☰ Hamburger menu > Filter, or Ctrl+L) is different, as it works as a filter, only displaying TUs that contain all, or any of, the keyword(s) being searched.

    With Wordfast Server, Concordance uses the TM's index, so it's much faster than the "brute force", regular search. TUs that pass the Concordance search in WFS are sorted, with those containing multiple words - the more relevant ones - on top. Concordance only retrieves up to one megabyte of text, which is still a few thousand TUs. If you search for very common words like and or the, you may reach that maximum. But Concordance is typically used to spot real words, usually rare or peculiar ones, not frequent "stopwords".
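
The "all or any keyword, most relevant on top" behavior can be pictured with this small Python sketch. It is a naive in-memory illustration with a hypothetical function name; it does not use an index and is not how WFS implements Concordance.

```python
def concordance(units, keywords, match_all=False):
    """Return units containing the keywords, best matches first.

    units: list of (source, target) pairs; keywords: list of words.
    """
    wanted = [keyword.lower() for keyword in keywords]
    scored = []
    for unit in units:
        text = " ".join(unit).lower()
        hits = sum(1 for keyword in wanted if keyword in text)
        if hits == len(wanted) or (hits > 0 and not match_all):
            scored.append((hits, unit))
    # Stable sort: units matching more keywords come first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [unit for _, unit in scored]
```

Searching for "pressure" and "valve" in "any" mode would list units containing both words before units containing only one of them.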

    +Tools service files

    Clipboard.tmp contains any data you wrote into the clipboard with Ctrl+C. It can be deleted. It can also be opened by +Tools as any other file. Which means that you can first apply a filter, press Ctrl+C to copy filtered lines into the clipboard, open the clipboard, repeat the operation to drill down a file with successive filters.
    This shows the versatility in +Tools: you can apply successive layers of custom and/or preset filters to virtually extract whatever you want from a file.

    A file that ends with EXPORT.txt contains data exported with a custom export. It can be discarded after use.

    A FLT file (a file with a .flt extension) is a temporary file that contains the data found when applying a filter.
    FLT files can be deleted; they will be recreated by +Tools when the need occurs.

    A BAK file is a backup of a file before a complete rewrite. A Search-replace operation, or a reorganization, entirely rewrites a file. BAK files can be deleted.

    +Tools as a CAT tool

    This is at beta stage, but used by some in production.

    Click here for the installer (Windows 10/11 recommended)

    +Tools is a desktop application designed as a possible successor to Wordfast Classic, in case Microsoft ends support for Office VBA, on which Wordfast Classic runs.

    In beta stage, +Tools runs on Microsoft Windows.
    Mac support: There are plans to release +Tools for the Mac.

    Audience: +Tools is meant for solo freelance translators working directly for clients, with Microsoft DOCX, or PDF files (see the note on translating PDF). If jobs are delivered by a Project Manager (PM), please use the tool that your PM recommends.

    I know Wordfast Classic (WFC). What are the differences between WFC and +Tools?

    +Tools does not require Microsoft Word.
    This is why you will see those red critters, so-called <1>tags<2>, inside opened segments. +Tools tries to minimize the quantity of tags. With simple document formatting, tags should be rare.
    At this early stage, +Tools implements a strict one-to-one tag verification, source to target. Flexibility will be introduced for advanced use, allowing custom tags.
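
A one-to-one tag check can be sketched as follows in Python. This is an illustration only: it assumes tags are numbered placeholders such as <1> and <2>, and that "one-to-one" means the same tags with the same counts on both sides, in any order. The function name tags_match is hypothetical.

```python
import re
from collections import Counter

TAG = re.compile(r"<\d+>")  # numbered placeholders such as <1>, <2>


def tags_match(source, target):
    """True if source and target contain exactly the same tags, same counts."""
    return Counter(TAG.findall(source)) == Counter(TAG.findall(target))
```

A target segment that drops a tag, or repeats one, would fail this check, whereas moving a tag to another position in the sentence would not.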

    Other than that, you will find the simplicity that made WFC a success from the year 1999 onward.

    The open-source, user-friendly WFC TXT format for TMs and glossaries is used. If +Tools finds a pre-existing Wordfast Classic setup like wordfast.ini, it uses it.

    What is the main difference between +Tools and other CAT tools?

    +Tools has been patiently coded from scratch, instruction after instruction, to avoid an assemblage of bulky third-party libraries.

    Wordfast Classic users still run TMs and glossaries created 25 years ago, and never had to upgrade or convert them.

    I don't want to waste my time. What is +Tools missing compared to other CAT tools?

    The intended audience, as described earlier, rarely, if ever, uses that level of sophistication. They expect a simple and efficient production tool.

    The focus with +Tools, beside a blend of simplicity and efficiency, is modern Machine Translation. This is where +Tools shines, and outperforms the competition.

    Does +Tools need the internet; is it a cloud application?

    No. We have a cloud solution, another tool called Wordfast Anywhere. +Tools does not need the internet to run, unless you set it up to query Machine Translation from a web source - but that is optional. And even if you need web-based MT, +Tools offers a one-click offline mode: pre-MT the entire document in a minute, and the MT results are locally stored. From that point on, you can work offline for days, and still have MT support.

    Can +Tools be integrated in an existing TMS or workflow?

    It's coming, when we get past the proof-of-concept stage. +Tools' code is ready for that. If your TMS can bundle files into a zipped project package (document, TM, glossary, tool setup), you will be covered. The TM format, the glossary format, the tool's setup are all open source, and text-based. They are easy to create or manipulate with simple scripts. The underlying document format for +Tools is XLIFF, which is universally used in the translation industry. The DOCX format can be used too. Unlike other workflows, you can add the entire application, +Tools, into the project file, or include a link to download/install it in seconds.
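
As an illustration of how simple such a package can be, here is a Python sketch that zips a document, a TM and a glossary together. The file names and the function names are hypothetical, and the layout expected by +Tools is not defined yet; this only shows that standard tools are enough.

```python
import tempfile
import zipfile
from pathlib import Path


def build_package(package_path, files):
    """Bundle project files (document, TM, glossary, setup) into one zip."""
    with zipfile.ZipFile(package_path, "w", compression=zipfile.ZIP_DEFLATED) as package:
        for file_path in files:
            path = Path(file_path)
            # Store files flat, under their own name, without directories.
            package.write(path, arcname=path.name)
    return package_path


def demo():
    """Create three placeholder files and return the names stored in the package."""
    with tempfile.TemporaryDirectory() as folder:
        names = ["job.xlf", "memory.txt", "glossary.txt"]  # hypothetical names
        for name in names:
            Path(folder, name).write_text("placeholder", encoding="utf-8")
        package = build_package(Path(folder, "project.zip"), [Path(folder, n) for n in names])
        with zipfile.ZipFile(package) as archive:
            return archive.namelist()
```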

    It looks like +Tools is based on XLIFF. Can I translate XLIFF files from other tools, like Wordfast Pro, MemoQ, Trados, etc?

    That is intended. However, as long as +Tools is in beta stage, 100% compatibility cannot be guaranteed. TXLF, which is Wordfast Pro's XLIFF, is currently supported with good results.

    If you translate XLIFF, chances are, you are not directly working for a client, you are subcontracting for a translation agency. Use their recommended translation tool.


    Appendix 1: Accessing data through HTTP.

    As explained in the introduction, a TM is a database rather than just another "file". Handling gigabytes of data is no mean task.
    With WFS TMs, the location of an individual record in the database is not relevant. What is relevant is that the record exists, and can be accessed, edited, worked upon. If you "see" a record in +Tools, and edit it, the record may disappear from the screen after editing. This is because editing it caused WFS to move the record to another location within the database.
    This almost always happens if, after editing, the record is significantly larger than before. (That is actually rare: most edits correct typos, which rarely increases the size.) In that case, you would need to perform another search for that TU. Try reaching the end of the TM using Ctrl+End: the record may have been moved there.

    Marking individual TUs, as in TXT or TMX mode, is not yet available when using WFS. Conversions are not supported yet, but note that WFS itself can import, export data to/from TMX & TBX.

    Although fast and comfortable browsing of all TUs in a remote database, as with a local file, is not possible, +Tools aims to provide a viable alternative to browse, search, examine, and edit remote TMs. +Tools is also a good tool to debug and test other CAT tools that have difficulty reaching WFS.


    Appendix 2: Translating PDF

    Let's get something behind us: no translation tool supports first-degree PDF translation. You will always convert PDF to an editable format, like DOCX. After translation, you either

    Now for the two variants of PDF.

    A. The PDF contains text.

    Double-click the PDF to open it. Try to select just a few words with the mouse. If you can do that, and copy-paste the selected text, your PDF contains text. Otherwise, the PDF contains images of text (scans, screenshots, whatever they're called). Jump to B. The PDF contains images.
    At that point, there are various ways to proceed:

    B. The PDF contains images.


    Appendix 3: XML/HTML, and TXT Container Data Integrity

    This discussion touches the philosophy of coding.

    There are two relevant categories of code to this discussion:

    1. Code in applications used by humans (they usually have a UI).
    2. Code "libraries", components in other applications.

    In category 1, it is acceptable that the software, when faced with a malformed container, attempts at salvaging what can be salvaged, informing the user that it does so, or prompting the user for action.

    In category 2, where the code is a step in a chain of actions, it must refuse malformed XML, unless it has a User Interface (UI) to prompt users for a decision. "Libraries" rarely have a UI, they simply return an error.

    A case in point is the most used applications: internet browsers.
    Browsers do not reject malformed HTML. Browsers accept a certain degree of recoverable errors, and have mechanisms to work around them. The purpose is to display what can be displayed, which usually is most of the document.

    +Tools, like browsers, falls in category 1, with exceptions.

    If a TMX / TBX / TXT file is malformed, +Tools attempts to step around malformed units, and display what can be displayed. At that point, using "Reorganize", the file can be rewritten without the faulty units it contains.
    That is done within the limits of what's possible.
    +Tools uses other fault tolerance techniques to avoid walking out on users with the dreaded "File rejected: error at line X column Y". Although geeks can fix and survive those errors, most people can't and won't, and wish that the rest of the file (probably 99% of it) could be saved.
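
The idea of stepping around malformed units can be illustrated with the Python sketch below. It is one possible technique, not +Tools' actual code: every <tu> block is isolated with a regular expression and parsed on its own, so one faulty unit does not condemn the file. The function name is hypothetical, and a unit that lacks its closing </tu> would take the following unit down with it.

```python
import re
import xml.etree.ElementTree as ET

# Matches one translation unit; \b keeps <tuv> from being mistaken for <tu>.
TU_BLOCK = re.compile(r"<tu\b.*?</tu>", re.DOTALL)


def salvage_units(tmx_text):
    """Parse each <tu>...</tu> block on its own; return (good, faulty_count)."""
    good, faulty = [], 0
    for match in TU_BLOCK.finditer(tmx_text):
        try:
            good.append(ET.fromstring(match.group(0)))
        except ET.ParseError:
            # A malformed unit is skipped instead of rejecting the whole file.
            faulty += 1
    return good, faulty
```

The units returned in the first list are the ones a "Reorganize"-style rewrite would keep; the count tells the user how many units were left behind.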

    Exception: With XLIFF, a malformed container is more difficult to salvage. XLIFF is part of a long chain, where various tools intervene: document text extraction, segmentation, translation, reconstruction of the final document. Fixing an error in XLIFF does not guarantee success in reconstructing the final document.

    That is why malformed XLIFF files are rejected.


    Artificial Intelligence and Translation

    As of 2024, this is a hot topic, and it can be viewed from different angles, such as the human side, like the practice of translation, deontology, or the financial aspect of translation, or technology. The angle here is technology.

    In the past three decades, a few disruptive technologies have forced CAT tools to deeply modify their behaviors. The latest one is AI. AI may bring drastic improvements in translation in two major areas that will impact the daily lives of translators: Machine Translation and Dictation. It may also improve areas that are less apparent to translators, such as tag handling and segmentation. At this time (2024) +Tools focuses on the MT side of AI. You will find +Tools pre-equipped with connectors to www.openai.com.

    AI operators other than www.openai.com can be leveraged as well, using "Custom MT".

    AI can propose different styles of speech. Those different styles are achieved by fine-tuning so-called prompts.

    A "prompt" is how we, humans, tell the blind machine what we expect. It's a game of concision, clarity, rectitude.

    +Tools demonstrates the power and versatility of AI by offering 5 different types of speech in its default MT setup. They are unchecked by default, so you have to activate them.

    www.openai.com's AI-driven translation comes at a price, and requires an API key. However, in 2024, +Tools offers that for free.
    If those options (A1, A2, A3, A4, A5) are checked under "Machine Translation", once a segment is opened (press F6 to re-MT an opened segment if needed), you will see small buttons pop up. Every button demonstrates a speech style. Click a button to see what happens. Click "All" to see all styles.

    Feel free to fine-tune prompts, there's no limit. To do so, proceed to the general Setup, "Machine Translation", click one MT definition line, click "Edit".

    You can add more speech styles, the rule is that their ID contain a number (A1, A2, etc.). To add a custom MT, select "Custom", click the line, then "Edit".
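
To make the notion of prompt-driven styles concrete, here is a Python sketch. Everything in it is hypothetical: the wording of the prompts, the function names, and the way the final prompt is assembled are assumptions for illustration, and no AI service is called. The only rule taken from +Tools is that a style ID must contain a number.

```python
import re

# Hypothetical style definitions; the wording of each prompt is an assumption.
STYLES = {
    "A1": "Translate in a neutral register.",
    "A2": "Translate in a casual, conversational register.",
    "A3": "Translate using gender-inclusive language.",
}


def add_style(styles, style_id, instruction):
    """Register a style; its ID must contain a number, as in A1, A2, etc."""
    if not re.search(r"\d", style_id):
        raise ValueError(f"style ID {style_id!r} must contain a number")
    styles[style_id] = instruction


def build_prompt(styles, style_id, source_lang, target_lang, segment):
    """Assemble the text that would be sent to the AI service for one segment."""
    return (
        f"{styles[style_id]} "
        f"Translate from {source_lang} to {target_lang}. "
        f"Return only the translation.\n\n{segment}"
    )
```

Fine-tuning a style then boils down to rewording its instruction, which is what the "Edit" button lets you do in the Setup.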

    Here is a screenshot of PlusTools running two AI speech styles in addition to the default "Neutral" mode: DEI (genderless / inclusive), and casual. The source text was chosen to showcase AI's capacity to offer different speech types in translation:

    Privacy declaration

    +Tools beta is a desktop application - not a cloud application. By default, +Tools does not connect to the internet for any purpose:

    +Tools allows connections to remote online Machine Translation services such as Microsoft Translator, Google Translate, DeepL, etc. Those are opt-ins. By default, no remote Machine Translation services are connected.

    +Tools never uses the internet without users' express consent.

    Obtaining the software in beta format, and using it, does not require users to provide any personal information.

    Yves Champollion does not collect, store, share, recycle, or disclose any personal data he may become privy to.

    Contact person for privacy matters: Yves Champollion

    Yves Champollion adamantly opposes practices known as spamming, data collection, cookie collection, telemetry, analytics, etc.

    Right of objection and redress:
    All users have a right to get a communication of any data they believe is kept by Yves Champollion, a right to correct that information, or delete that information.
    Please direct all privacy matters to:

    Yves Champollion
    44 rue Danton
    94270 Le Kremlin-Bicetre
    France
    Or contact yves@champollion.net

    All trademarks noted™ are the property of their respective owners.
    Ms-Word™, Excel™, Access™, PowerPoint™ are trademarks of Microsoft Corp.
