Useful regular expressions (Regex)

From Wordfast Wiki
Revision as of 20:23, 22 April 2021 by John (talk | contribs) (added (1,000s))

A regular expression (regex or regexp) is a sequence of characters that defines a search pattern.[1] Regular expressions can be very helpful for filtering segments in the TXLF Editor or for find/replace operations. Below are a few examples of useful regexes. When you use them in Wordfast Pro, make sure to tick the regex box.

Hide all number-only segments

Use the following regex in the segment filtering bar:

^(([0-9][^\n]*[^0-9])|([^0-9][^\n]*[0-9])|([^0-9]?[^\n]*[^0-9]))$

This regex hides every segment that both begins and ends with a digit, so numbers with punctuation (decimals, etc.) are hidden as well.
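To see which segments the filter keeps, the pattern can be tried outside Wordfast. The following is a minimal sketch in Python, assuming the filter shows a segment when the regex matches it and hides it otherwise; the helper name is_shown and the sample segments are made up for illustration.

```python
import re

# The filter pattern from above, unchanged (it is also valid in Python's re module).
KEEP_VISIBLE = re.compile(
    r"^(([0-9][^\n]*[^0-9])|([^0-9][^\n]*[0-9])|([^0-9]?[^\n]*[^0-9]))$"
)

def is_shown(segment: str) -> bool:
    """Return True if the pattern matches, i.e. the segment would stay visible."""
    return KEEP_VISIBLE.search(segment) is not None

# Hypothetical sample segments, not taken from a real project.
for segment in ["2021", "3.14", "8,675,309.00", "Chapter 2", "2 apples", "50%"]:
    status = "shown" if is_shown(segment) else "hidden"
    print(f"{segment!r}: {status}")
```

Segments such as 2021, 3.14 and 8,675,309.00 come out as hidden, while Chapter 2, 2 apples and 50% stay visible because they start or end with a non-digit character.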

Show only number-only segments

Use the following regex in the segment filtering bar:

^(?:(?:-|–|(?:(?:\$|€|£)(?:\h)?))?(?:\d{1,3})(?:\h|,|\.|(?:(?:\h)?(?:%|\$|€|£)))?)+$
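The pattern can be checked against a few sample segments before using it in a project. This is a rough sketch in Python, not the Wordfast implementation. Python's re module does not support \h (horizontal whitespace), so the sketch substitutes a character class covering space, tab, non-breaking space and narrow no-break space; that substitution, the name is_number_only and the sample values are assumptions made for illustration.

```python
import re

# Stand-in for \h, which Python's re module lacks. This covers only the most
# common horizontal whitespace characters, so it is an approximation.
H = r"[ \t\u00A0\u202F]"

# Same structure as the filter pattern above, with \h swapped for H.
NUMBER_ONLY = re.compile(
    r"^(?:(?:-|–|(?:(?:\$|€|£)(?:" + H + r")?))?"
    r"(?:\d{1,3})"
    r"(?:" + H + r"|,|\.|(?:(?:" + H + r")?(?:%|\$|€|£)))?)+$"
)

def is_number_only(segment: str) -> bool:
    """Return True if the whole segment consists of numeric content."""
    return NUMBER_ONLY.search(segment) is not None

# Hypothetical sample segments.
for segment in ["8,675,309.00", "-42", "€103.50", "50%", "12 apples", "Hello"]:
    print(f"{segment!r}: {is_number_only(segment)}")
```

The first four samples match, while 12 apples and Hello do not, since every repetition of the group has to start with one to three digits.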

If you have numbers like 8,675,309.00 that need to be replaced with 8.675.309,00, you can copy all sources to target with the filter applied, then apply a 3-step find and replace:

  1. Find . and replace with DUMMY
  2. Find , and replace with .
  3. Find DUMMY and replace with ,
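The placeholder is needed because replacing periods with commas directly would make the original commas indistinguishable from the new ones. The three steps can be sketched in Python as follows; the function name swap_separators is hypothetical, and DUMMY can be any string that does not already occur in the text.

```python
def swap_separators(number: str, placeholder: str = "DUMMY") -> str:
    """Swap '.' and ',' using the same three find/replace steps as above."""
    if placeholder in number:
        # The trick only works if the placeholder is not already in the text.
        raise ValueError("placeholder already occurs in the text")
    step1 = number.replace(".", placeholder)  # 1. Find . and replace with DUMMY
    step2 = step1.replace(",", ".")           # 2. Find , and replace with .
    return step2.replace(placeholder, ",")    # 3. Find DUMMY and replace with ,

print(swap_separators("8,675,309.00"))  # 8.675.309,00
```

Running the function a second time on its own output restores the original number, because the swap is symmetric.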


Show only thousands, ten thousands or hundred thousands surrounded by parentheses

If you have negative monetary figures in a financial report, they are generally surrounded by parentheses like this: (1,234) or (10,123) or (100,123).

Use the following regex in the segment filtering bar to show only these segments when the divider is a non-breaking space:

\(\d{1,3}∘\d{3}\)

And this one if the divider is a comma:

\(\d{1,3},\d{3}\)

With the filter in place, copy all sources to target, then find ∘ and replace with , (or vice versa). Here ∘ represents a non-breaking space.
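Both patterns and the separator conversion can be tried out in Python. This is a sketch under stated assumptions: the non-breaking space is written as U+00A0 instead of the ∘ stand-in, and the helper nbsp_to_comma is a hypothetical variation that uses capture groups so that only non-breaking spaces inside a parenthesized figure are converted, whereas the plain find/replace described above changes every occurrence.

```python
import re

NBSP = "\u00A0"  # the non-breaking space that the text above shows as ∘

# The two filter patterns from above.
PAREN_NBSP = re.compile(r"\(\d{1,3}" + NBSP + r"\d{3}\)")
PAREN_COMMA = re.compile(r"\(\d{1,3},\d{3}\)")

def nbsp_to_comma(segment: str) -> str:
    """Turn (1<NBSP>234) into (1,234), leaving other non-breaking spaces alone."""
    pattern = r"\((\d{1,3})" + NBSP + r"(\d{3})\)"
    return re.sub(pattern, r"(\1,\2)", segment)

source = "Loss for the year (1" + NBSP + "234)"  # hypothetical segment
print(bool(PAREN_NBSP.search(source)))  # True
print(nbsp_to_comma(source))            # Loss for the year (1,234)
```

Neither pattern matches figures with two separators, such as (1,234,567), which is consistent with the range of thousands to hundred thousands named in the heading.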


Invert currency symbols

Say you have a lot of monetary values like 103,50€ in your document and you want to replace them all with €103.50.

Open the Find/Replace function and be sure to tick the Use Regex box.

Type the following regex in the Find what field:

(^[^,]+?)(,)([^€]+?)(€)

Type the following replacement expression in the Replace with field:

\€$1\.$3

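NOTE: This only works for values up to 999, and because the pattern is anchored with ^, the value must be at the very start of the text being searched. Values in the thousands need a further find/replace operation to convert the thousands separator (comma, space, or period) as well.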
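The behavior of this find/replace pair, including its limits, can be reproduced in Python. The sketch below makes a few assumptions: Python writes group references as \1 and \3 where the Replace with field uses $1 and $3, the function names are hypothetical, and the second pattern is an illustrative alternative that is not part of the original tip.

```python
import re

# The Find what pattern from above, unchanged.
FIND = re.compile(r"(^[^,]+?)(,)([^€]+?)(€)")

def invert_currency(text: str) -> str:
    """Python equivalent of replacing with the expression shown above."""
    return FIND.sub(r"€\1.\3", text)

# Hypothetical alternative: match the amount anywhere in the text, but skip
# amounts preceded by a digit, period or comma (likely a thousands group).
ANYWHERE = re.compile(r"(?<![\d.,])(\d{1,3}),(\d{2})€")

def invert_currency_anywhere(text: str) -> str:
    return ANYWHERE.sub(r"€\1.\2", text)

print(invert_currency("103,50€"))         # €103.50
print(invert_currency("1.234,50€"))       # €1.234.50 (separator not converted)
print(invert_currency("Price: 103,50€"))  # €Price: 103.50 (symbol misplaced)
print(invert_currency_anywhere("Price: 103,50€ and 7,25€"))
```

The alternative still does not handle thousands separated by a space, so it is a starting point for experimentation only.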

References

  1. Check out this article for a more detailed explanation of the history of regular expressions and how they work.