Why different tools provide different word counts

From Wordfast Wiki
Revision as of 15:17, 29 June 2012 by David Daduč (Simplification, deletion of outdated information)

Here are a few reasons why different tools report different word counts.

1. Areas of the document

Text in documents is either in the main range (the body of the document), or in special areas, such as headers/footers, footnotes/endnotes, text fields (grouped or not), embedded objects, fields (like TOC), etc. Some programs will count words in the main range only, while others may also count words in some of the special areas.
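As a rough illustration of how the choice of areas changes the total, the sketch below models a document as a plain dictionary of areas. The area names, the sample text, and the `count_words` helper are all hypothetical; no real tool is claimed to store or count documents this way.

```python
# Hypothetical document model: each area name maps to the text it contains.
document = {
    "body": "The quick brown fox jumps.",
    "header": "Annual report 2012",
    "footnotes": "See the appendix for details.",
}


def count_words(doc, areas):
    """Count whitespace-separated words in the selected areas only."""
    return sum(len(doc[area].split()) for area in areas if area in doc)


# A tool that looks at the main range only.
main_only = count_words(document, ["body"])

# A tool that also counts headers and footnotes.
all_areas = count_words(document, list(document))

print(main_only, all_areas)
```

The same file yields two different totals purely because of which areas are included.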

2. All text, or just translatable text

Translation tools, such as Wordfast Classic or Wordfast Pro, count only translatable text, while general-purpose programs, such as MS Word, also count non-translatable text. This can make a big difference, especially if the document contains a large number of isolated numbers, fields or tags.
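The effect can be sketched with two simplified counters. This is an illustration only: the tag pattern, the rule for what counts as an isolated number, and the function names are assumptions, not the actual rules used by Wordfast or MS Word.

```python
import re

# Assumption: inline tags look like <b> or </b>.
TAG = re.compile(r"<[^>]+>")
# Assumption: a "number" token consists only of digits and common separators.
NUMBER_CHARS = re.compile(r"[\d.,:/%-]+")


def is_isolated_number(token):
    """True for tokens such as 42 or 2012-06-29 (at least one digit, no letters)."""
    return bool(NUMBER_CHARS.fullmatch(token)) and any(c.isdigit() for c in token)


def count_all(text):
    """General-purpose style: every whitespace-separated token is counted."""
    return len(text.split())


def count_translatable(text):
    """Translation-tool style: tags and isolated numbers are skipped."""
    without_tags = TAG.sub(" ", text)
    return sum(1 for tok in without_tags.split() if not is_isolated_number(tok))


sample = "Total <b> 42 </b> items shipped on 2012-06-29"
print(count_all(sample), count_translatable(sample))
```

On this sample the first counter sees eight tokens and the second only four, which shows how tag-heavy or number-heavy documents can diverge sharply between tools.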

3. Definition of word

Is each of "and/or", "hard-working", or "they're" one word, or two? Each is one word for MS Word and two words for most translation tools. This can make a big difference in some languages, while it can be negligible in others. A test David Daduč made suggested the difference may be about 6% in French, 0.8% in English, and near zero in German or Czech.
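The two definitions can be compared with a minimal sketch. The set of separators (slash, hyphen, straight and curly apostrophe) is an assumption chosen to match the examples above; real tools may use other rules.

```python
import re

# Assumption: these characters split a token into separate words,
# in addition to whitespace.
SEPARATORS = re.compile(r"[\s/\-'’]+")


def count_whitespace_words(text):
    """One word per whitespace-separated token (the one-word reading)."""
    return len(text.split())


def count_split_words(text):
    """Also split on slash, hyphen and apostrophe (the two-word reading)."""
    return len([tok for tok in SEPARATORS.split(text) if tok])


sample = "They're hard-working and/or lucky"
print(count_whitespace_words(sample), count_split_words(sample))
```

Languages that use apostrophes and hyphens heavily, such as French with its elisions, would be expected to show a larger gap between the two counters, which is in line with the test figures quoted above.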