Framework Tips VI: Escaping strings for RegEx

Regular expressions are very powerful and useful, but one thing that stops them from being widely used is their complexity. They're commonly referred to as a write-only language. Even my colleague, who is very proficient at crafting complex regular expressions, often deletes the whole thing and starts from scratch.

For easy tasks, however, like matching a literal piece of text, you can use two static methods of the Regex class: Escape and Unescape. Escape escapes all characters that have a special meaning in the regex language, and Unescape does the opposite.

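using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

string text = @"This shirt costs* $20 (55PLN). [* only red color]";

// Escape every character that has a special meaning in a pattern.
string escapedText = Regex.Escape(text);

Console.WriteLine(escapedText);

// Unescape reverses the escaping, giving back the original text.
string unescapedText = Regex.Unescape(escapedText);

Debug.Assert(text == unescapedText);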

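This escapes text to

"This\ shirt\ costs\*\ \$20\ \(55PLN\)\.\ \[\*\ only\ red\ color]"

Note that spaces are escaped too, while the closing ] is left alone: Escape does not escape ] or }, since they are only special when paired with an unescaped opening [ or {.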

Now, when you run a match against a piece of text that contains the sentence held in the text variable, using escapedText as the pattern will find it.
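To make this concrete, here is a minimal sketch of such a match. The page variable is a made-up input that just wraps the sentence in some extra words; everything else comes from the snippet above. It also shows what happens if you forget to escape: the raw text is still a valid pattern, but characters like $ and ( are interpreted as regex syntax, so it no longer matches itself.

```csharp
using System;
using System.Text.RegularExpressions;

class EscapeDemo
{
    static void Main()
    {
        string text = @"This shirt costs* $20 (55PLN). [* only red color]";
        string escapedText = Regex.Escape(text);

        // Hypothetical longer input that contains the sentence verbatim.
        string page = "Sale! " + text + " Offer ends Friday.";

        // The escaped pattern matches the literal sentence inside the page.
        Match match = Regex.Match(page, escapedText);
        Console.WriteLine(match.Success);        // True
        Console.WriteLine(match.Value == text);  // True

        // Unescaped, the same text is read as a pattern ("$" becomes an
        // end-of-input anchor, "(...)" a group), so it does not match.
        Console.WriteLine(Regex.IsMatch(page, text));  // False
    }
}
```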

Keep in mind, however, that Escape treats its input as literal text, so it won't be very helpful when you need something more complicated. In that case you'd have to roll up your sleeves and write the pattern manually.
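Writing the pattern by hand doesn't have to mean giving up on Escape altogether. One possible middle ground, shown here only as a sketch, is to escape just the literal fragments and concatenate them with the hand-written parts. The pattern and the input string below are my own illustration, reusing the price from the earlier example.

```csharp
using System;
using System.Text.RegularExpressions;

class MixedPatternDemo
{
    static void Main()
    {
        // Literal fragment that must appear verbatim; escaping keeps
        // the parentheses from being treated as a capturing group.
        string literal = Regex.Escape("(55PLN)");

        // Hand-written part: a dollar sign followed by one or more digits,
        // captured in group 1, then a space and the escaped literal.
        string pattern = @"\$(\d+) " + literal;

        Match match = Regex.Match("This shirt costs* $20 (55PLN).", pattern);
        Console.WriteLine(match.Success);          // True
        Console.WriteLine(match.Groups[1].Value);  // 20
    }
}
```

This way only the part that really needs regex syntax has to be crafted and debugged by hand.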
