Difference between revisions of "Tags in a WFC TM Wordfast Classic"


Latest revision as of 07:39, 6 November 2017

When dealing with so-called tagged documents, a WFC TM records placeholders for tags. Those placeholders have a &tX; format, where X indicates the tag's order of appearance in the source segment. X is written as A (ANSI decimal 65), B, C, etc., up to ANSI decimal code 165. Thus, there can be no more than 100 tags in a WFC segment.

For example, the following "tagged" source segment:

<FONT FACE="Helvetica"> This is some text.</FONT>

would appear in a WFC TM as:

&tA;This is some text.&tB;
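
The substitution above can be sketched in a few lines of Python. This is only an illustration, not WFC's actual code: the function name encode_source and the rule that a tag is anything between < and > are assumptions made for the example. Giving the same letter to a repeated tag follows the <FT> ... <AR> ... <FT> example further down this page.

```python
import re

# Assumption: a "tag" is any <...> span. WFC's real tag detection
# rules are not described on this page.
TAG = re.compile(r"<[^>]+>")

def encode_source(segment):
    """Replace each tag with &tX;, X = A, B, C... by order of first appearance."""
    letters = {}  # raw tag -> placeholder letter

    def to_placeholder(match):
        tag = match.group(0)
        if tag not in letters:
            code = 65 + len(letters)  # 65 is "A"
            if code > 165:            # upper bound quoted in the text above
                raise ValueError("too many tags in one segment")
            letters[tag] = chr(code)
        return "&t" + letters[tag] + ";"

    return TAG.sub(to_placeholder, segment), letters

encoded, letters = encode_source('<FONT FACE="Helvetica">This is some text.</FONT>')
print(encoded)  # &tA;This is some text.&tB;
```

The returned dictionary keeps the link between each raw tag and its letter, which is what a target-side encoder would need to reuse.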

At translation time, when WFC pulls a TU from the TM and is about to propose the TU's target segment as a translation candidate, WFC uses a substitution algorithm to dress the proposed target segment with the full "real" tags, taken from the document's (not the TM's) source segment, using a triangulation method:

Document's source segment <—> TM's source segment <—> TM's target segment

The triangulation can be successful only if all target tags have a "parent" tag in the source segment. In the rare cases when the TM's target segment has tags that do not appear in the TM's source segment (orphaned tags), WFC records the full syntax of these orphaned tags at TU creation time, so that they can be restored properly at translation time, when the target segment must be proposed with the correct format. If we have, at TU creation time:

In source segment: <FONT FACE="Arial">This is some text:
In target segment: <FONT FACE="Arial">Voici du texte&nbsp;:

then the target segment would be recorded in the TM as:

&tA;Voici du texte&t=;&nbsp;&t=;:

where &t=; is placed on each side of the original tag syntax (&nbsp; in our example): the first &t=; opens the sequence and the second &t=; closes it.
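
As a rough sketch of how a source/target pair could be recorded at TU creation time, the Python below maps target tags back to their source letters and wraps orphaned tags in &t=; markers. The function name encode_pair and the tag pattern (any <...> span or a character entity such as &nbsp;) are assumptions for illustration. This is not WFC's implementation.

```python
import re

# Assumption: tags are <...> spans or character entities such as &nbsp;.
TAG = re.compile(r"<[^>]+>|&#?\w+;")

def encode_pair(source, target):
    """Return (TM source, TM target) with tags replaced by placeholders."""
    letters = {}  # raw source tag -> placeholder letter

    def encode_source_tag(match):
        tag = match.group(0)
        if tag not in letters:
            letters[tag] = chr(65 + len(letters))  # A, B, C...
        return "&t" + letters[tag] + ";"

    def encode_target_tag(match):
        tag = match.group(0)
        if tag in letters:
            # The tag has a "parent" in the source: store only its letter.
            return "&t" + letters[tag] + ";"
        # Orphaned tag: keep its full syntax between two &t=; markers.
        return "&t=;" + tag + "&t=;"

    tm_source = TAG.sub(encode_source_tag, source)
    tm_target = TAG.sub(encode_target_tag, target)
    return tm_source, tm_target

tm_source, tm_target = encode_pair(
    '<FONT FACE="Arial">This is some text:',
    '<FONT FACE="Arial">Voici du texte&nbsp;:',
)
print(tm_target)  # &tA;Voici du texte&t=;&nbsp;&t=;:
```

The source segment must be encoded first, because the target encoder can only recognize a parent tag once the source letters are known.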

Other examples of segments:

In source segment: <FT>This is some text<AR> here<FT>.
In target segment: <AR>Voici du texte<FT> ici.
In TM TU source: &tA;This is some text&tB; here&tA;.
In TM TU target: &tB;Voici du texte&tA; ici.

In source segment: <FT>This is some text<AR> here.
In target segment: <AR>Voici du<AR> texte<X;X> ici<FT>.
In TM TU source: &tA;This is some text&tB; here.
In TM TU target: &tB;Voici du&tB; texte&t=;<X;X>&t=; ici&tA;.

In most translation memory systems, TMs are overloaded with tags that do not belong there. A TM becomes significant only when its content is reused, that is, when its past translations are leveraged for a new translation project. TM content is reused only in the presence of a new document to be translated. In other words, at use time, the software triangulates between the new document's source segment, which contains the new formatting, and an existing TM source/target pair, which contains formatting placeholders.
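
The triangulation at use time might look like the following Python sketch. It is a simplified illustration under stated assumptions, not WFC's algorithm: the name restore_target, the tag pattern, and the <B> and <I> tags of the "new document" are all hypothetical. Letters are taken from the new document's source segment, and text between &t=; markers is restored as stored.

```python
import re

# Assumption: tags are <...> spans or character entities such as &nbsp;.
TAG = re.compile(r"<[^>]+>|&#?\w+;")
# Either an orphaned tag wrapped in &t=; markers, or a lettered placeholder.
TOKEN = re.compile(r"&t=;(?P<orphan>.*?)&t=;|&t(?P<letter>[^=;]);")

def restore_target(doc_source, tm_target):
    """Dress a TM target segment with the real tags of the document's source."""
    tags = {}    # placeholder letter -> raw tag from the document
    seen = set()
    for tag in TAG.findall(doc_source):
        if tag not in seen:
            seen.add(tag)
            tags[chr(65 + len(tags))] = tag  # A, B, C...

    def restore(match):
        if match.group("orphan") is not None:
            return match.group("orphan")  # full syntax was stored in the TM
        # Leave the placeholder untouched if the document has no such tag.
        return tags.get(match.group("letter"), match.group(0))

    return TOKEN.sub(restore, tm_target)

# Hypothetical new document whose tags differ from those seen at TU creation.
print(restore_target(
    "<B>This is some text<I> here.",
    "&tB;Voici du&tB; texte&t=;<X;X>&t=; ici&tA;.",
))  # <I>Voici du<I> texte<X;X> ici<B>.
```

Because the TM target holds only letters for parented tags, the proposed translation picks up the new document's formatting without the TM ever having stored it.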
