Tags in a WFC (Wordfast Classic) TM

From Wordfast Wiki

When dealing with so-called tagged documents, a WFC translation memory (TM) records placeholders for tags rather than the tags themselves. Those placeholders have the format &tX;, where X reflects the order of appearance of tags in the source segment: A (ANSI decimal 65) for the first tag, B for the second, C for the third, and so on, up to ANSI decimal code 165. Thus, there can be no more than 100 tags in a WFC segment.

For example, the following "tagged" source segment:

<FONT FACE="Helvetica"> This is some text.</FONT>

would appear in a Wordfast TM as:

&tA;This is some text.&tB;
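
The substitution described above can be sketched in a few lines of Python. This is only an illustration, not WFC's actual code: the function name encode_source, the regular expression used to detect tags, and the choice to reuse a letter when the same tag recurs (as the later examples on this page suggest) are all assumptions.

```python
import re

# Assumption: a tag is anything between "<" and ">".
# WFC's real tag detection may differ.
TAG_RE = re.compile(r"<[^>]+>")

FIRST_CODE = 65  # "A", the first placeholder letter
MAX_TAGS = 100   # limit stated in the text above


def encode_source(segment):
    """Replace each tag with a &tX; placeholder; return (encoded, tags)."""
    tags = []  # distinct tags, in order of first appearance

    def to_placeholder(match):
        tag = match.group(0)
        if tag not in tags:
            if len(tags) >= MAX_TAGS:
                raise ValueError("too many tags in one segment")
            tags.append(tag)
        # chr() yields a Unicode code point, which only approximates
        # the ANSI code page for codes above 127.
        return "&t" + chr(FIRST_CODE + tags.index(tag)) + ";"

    return TAG_RE.sub(to_placeholder, segment), tags


encoded, tags = encode_source('<FONT FACE="Helvetica">This is some text.</FONT>')
print(encoded)  # &tA;This is some text.&tB;
```

The list of real tags is returned only for inspection; as explained below, the TM keeps the placeholders, not the tags.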

At translation time, when WFC pulls a translation unit (TU) from the TM and is about to propose the TU's target segment as a translation candidate, it uses a substitution algorithm to dress the proposed target segment with the full "real" tags. These tags are taken from the document's source segment, not from the TM, using a triangulation method:

Document's source segment <—> TM's source segment <—> TM's target segment
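
A minimal Python sketch of this triangulation follows. It assumes that the document's source segment has the same tag layout as the TM's source segment (an exact match), that tags look like <...>, and that placeholder letters map to distinct tags in order of first appearance. The names real_tags and dress_target are hypothetical, and the French TM target in the usage example is made up for illustration.

```python
import re

TAG_RE = re.compile(r"<[^>]+>")             # assumed tag syntax
PLACEHOLDER_RE = re.compile(r"&t([^=;]);")  # &tA;, &tB;, ... but not &t=;


def real_tags(document_source):
    """Distinct tags of the document's source segment, in order of appearance."""
    tags = []
    for tag in TAG_RE.findall(document_source):
        if tag not in tags:
            tags.append(tag)
    return tags


def dress_target(document_source, tm_target):
    """Replace the placeholders of a TM target with the document's real tags."""
    tags = real_tags(document_source)

    def to_tag(match):
        index = ord(match.group(1)) - 65  # "A" -> 0, "B" -> 1, ...
        # A placeholder with no counterpart in the document is left untouched.
        return tags[index] if 0 <= index < len(tags) else match.group(0)

    return PLACEHOLDER_RE.sub(to_tag, tm_target)


document_source = '<FONT FACE="Helvetica">This is some text.</FONT>'
tm_target = "&tA;Voici du texte.&tB;"  # hypothetical stored translation
print(dress_target(document_source, tm_target))
# <FONT FACE="Helvetica">Voici du texte.</FONT>
```

Because the real tags come from the document, the same TU can serve a document that uses, say, Arial instead of Helvetica without any change to the TM.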

The triangulation can be successful only if all target tags have a "parent" tag in the source segment. In the rare cases when the TM's target segment has tags that do not appear in the TM's source segment (orphaned tags), WFC records the full syntax of these orphaned tags at TU creation time, so that they can be restored properly at translation time, when the target segment must be proposed with the correct format. If we have, at TU creation time:

In source segment: <FONT FACE="Arial">This is some text:
In target segment: <FONT FACE="Arial">Voici du texte&nbsp;:

then the target segment would be recorded in the TM as:

&tA;Voici du texte&t=;&nbsp;&t=;:

where the orphaned tag's original syntax (&nbsp; in our example) is stored verbatim between two &t=; markers, each made of &t= closed by ; (a semicolon).
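
The recording of a target segment at TU creation time might be sketched as follows in Python. This is an illustration under stated assumptions, not WFC's implementation: tags are taken to be either <...> sequences or named entities such as &nbsp;, and the helper names distinct_tags and encode_target are hypothetical.

```python
import re

# Assumption: a "tag" is either <...> or a named entity such as &nbsp;
TAG_RE = re.compile(r"<[^>]+>|&[A-Za-z]+;")


def distinct_tags(segment):
    """Distinct tags of a segment, in order of first appearance."""
    tags = []
    for tag in TAG_RE.findall(segment):
        if tag not in tags:
            tags.append(tag)
    return tags


def encode_target(source, target):
    """Encode a target segment against the tags of its source segment."""
    source_tags = distinct_tags(source)

    def to_placeholder(match):
        tag = match.group(0)
        if tag in source_tags:
            # The tag has a "parent" in the source: store a placeholder.
            return "&t" + chr(65 + source_tags.index(tag)) + ";"
        # Orphaned tag: store its full syntax between two &t=; markers.
        return "&t=;" + tag + "&t=;"

    return TAG_RE.sub(to_placeholder, target)


source = '<FONT FACE="Arial">This is some text:'
target = '<FONT FACE="Arial">Voici du texte&nbsp;:'
print(encode_target(source, target))  # &tA;Voici du texte&t=;&nbsp;&t=;:
```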

Other examples of segments:

In source segment: <FT>This is some text<AR> here<FT>.
In target segment: <AR>Voici du texte<FT> ici.
In TM TU source: &tA;This is some text&tB; here&tA;.
In TM TU target: &tB;Voici du texte&tA; ici.

In source segment: <FT>This is some text<AR> here.
In target segment: <AR>Voici du<AR> texte<X;X> ici<FT>.
In TM TU source: &tA;This is some text&tB; here.
In TM TU target: &tB;Voici du&tB; texte&t=;<X;X>&t=; ici&tA;.
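
Going the other way, a stored target that mixes placeholders and orphaned tags could be restored roughly as below. Again, this is a hedged Python sketch: the function name restore_target, the tag pattern, and the <B> and <I> tags in the second call are assumptions chosen for illustration.

```python
import re

# Assumption: a "tag" is either <...> or a named entity such as &nbsp;
TAG_RE = re.compile(r"<[^>]+>|&[A-Za-z]+;")
# Either an orphaned tag stored between two &t=; markers, or a &tX; placeholder.
STORED_RE = re.compile(r"&t=;(.*?)&t=;|&t([^=;]);")


def restore_target(document_source, tm_target):
    """Rebuild a target segment from a TM target and the document's source."""
    tags = []  # distinct real tags of the document, in order of appearance
    for tag in TAG_RE.findall(document_source):
        if tag not in tags:
            tags.append(tag)

    def to_tag(match):
        if match.group(1) is not None:
            return match.group(1)  # orphaned tag, stored verbatim
        index = ord(match.group(2)) - 65
        # Leave the placeholder untouched if the document has no such tag.
        return tags[index] if 0 <= index < len(tags) else match.group(0)

    return STORED_RE.sub(to_tag, tm_target)


tm_target = "&tB;Voici du&tB; texte&t=;<X;X>&t=; ici&tA;."
print(restore_target("<FT>This is some text<AR> here.", tm_target))
# <AR>Voici du<AR> texte<X;X> ici<FT>.
print(restore_target("<B>This is some text<I> here.", tm_target))
# <I>Voici du<I> texte<X;X> ici<B>.
```

The second call shows the point of the design: the orphaned tag is reproduced exactly as stored, while the other tags follow whatever formatting the new document uses.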

In most translation memory systems, TMs are overloaded with tags that do not belong there. A TM becomes significant only when its content is reused, that is, when its past translations are leveraged for a new translation project. TM content is reused only in the presence of a new document to be translated. In other words, at use time, the software performs a triangulation between a new document's source segment, which contains the new formatting, and an existing TM source/target pair, which contains formatting placeholders.
