Framework Tips VII: Splitting strings

Joe Developer once had to read options from a configuration file, where each line looked basically like this:

option:   first;second;third;;fifth;

Each line consisted of an option name, followed by a few spaces, and its parameters delimited by semicolons. Now the question is: how would Joe get to those parameters? The most elegant way would be to use Regular Expressions, but Joe has a strong allergy to them. For this simple example, the String.Split method will do.

Joe rolled up his sleeves and crafted this beautiful masterpiece of code:

string options = @"option:   first;second;third;;fifth;";
string[] strings = options.Split(';');
foreach (string s in strings)
    Console.WriteLine(s);

Console.WriteLine("That's all. Press any key to exit.");
Console.ReadKey(false);

However, that didn’t work quite as well as expected.

Joe successfully split the parameters onto separate lines, but he also got some empty lines, as well as extra text (the option name and the spaces) stuck to the first parameter. Fortunately, Joe noticed that, thanks to the params keyword, he could pass more separator characters to the Split method. He quickly altered his code by adding two more split characters and changing the foreach loop to a for loop that skips the option name element.

string options = @"option:   first;second;third;;fifth;";
string[] strings = options.Split(';', ':', ' ');
for (int i = 1; i < strings.Length; i++)
    Console.WriteLine(strings[i]);

Console.WriteLine("That's all. Press any key to exit.");
Console.ReadKey(false);

Proud of himself, Joe ran his program, only to notice that the output, while better, still didn’t look the way he needed. He had managed to get each parameter onto its own line, but he still got those annoying empty lines.

His immediate idea was to add a condition that checks whether a given string is empty before writing it out, but then he remembered the quote:
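The screenshots from the original post are not reproduced here, so the following is a minimal sketch of what the first two attempts return. The class name SplitDemo and the helper Describe are hypothetical, introduced only for illustration. It shows that every separator character ends the current element, so adjacent separators (or a trailing one) yield empty strings.

```csharp
using System;

public static class SplitDemo
{
    // Hypothetical helper: joins the entries with '|' so empty ones stay visible.
    public static string Describe(string[] parts)
    {
        return "[" + string.Join("|", parts) + "] (" + parts.Length + " entries)";
    }

    public static void Main()
    {
        string options = "option:   first;second;third;;fifth;";

        // First attempt: only ';' separates entries.
        // Prints: [option:   first|second|third||fifth|] (6 entries)
        Console.WriteLine(Describe(options.Split(';')));

        // Second attempt: ':' and ' ' are separators too, so the colon and
        // the three spaces each add an empty entry after "option".
        // Prints: [option||||first|second|third||fifth|] (10 entries)
        Console.WriteLine(Describe(options.Split(';', ':', ' ')));
    }
}
```

The empty entries come from the double semicolon, the trailing semicolon and, in the second attempt, the run of separators between the option name and the first parameter.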

When debugging, novices insert corrective code; experts remove defective code.  ~Richard Pattis

And since Joe was a true expert, he decided to remove the defect. He looked here and there and discovered that the StringSplitOptions enum was exactly what he needed. Fortunately, there was even an overload of the Split method that accepted StringSplitOptions as one of its arguments. Joe quickly altered the second line of his code to:

string[] strings = options.Split(new char[] {';', ':', ' '}, StringSplitOptions.RemoveEmptyEntries);
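As a quick check of what this overload returns, here is a minimal sketch; the class RemoveEmptyDemo and the method SplitOptions are hypothetical names used only for this illustration. Note that the separators now have to be passed as an explicit char[], because a params parameter must come last and this overload ends with the options argument.

```csharp
using System;

public static class RemoveEmptyDemo
{
    // Hypothetical helper wrapping the call from the article.
    public static string[] SplitOptions(string line)
    {
        char[] separators = { ';', ':', ' ' };
        return line.Split(separators, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        string[] parts = SplitOptions("option:   first;second;third;;fifth;");

        Console.WriteLine(parts.Length);            // 5
        Console.WriteLine(string.Join(",", parts)); // option,first,second,third,fifth
    }
}
```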

He was finally happy with what he saw when he ran the program. His happiness didn’t last long, however: only until he met his boss, who told him that only the first two parameters were important and all the rest should be left out.

“I can do it!” Joe thought. “All I need to do is limit the for loop to stop after the first two elements.” Then he thought, “Why waste resources and time splitting further, when all I need are the first two parameters?” He recalled the quote once again and decided that, as a true expert, he would find another solution.

Then he found yet another overload of the Split method, one that accepted an additional parameter: an integer specifying the maximum number of substrings to be produced. Joe quickly adjusted his code to the new requirements:

string options = @"option:   first;second;third;;fifth;";
string[] strings = options.Split(new char[] {';', ':', ' '}, 4, StringSplitOptions.RemoveEmptyEntries);
for (int i = 1; i < strings.Length && i < 3; i++)
    Console.WriteLine(strings[i]);

Console.WriteLine("That's all. Press any key to exit.");
Console.ReadKey(false);

Now he was really proud of himself – with one simple method, he was able to accomplish the task.
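It may help to see why the count is 4 and why the loop still stops at index 3. The sketch below uses hypothetical names (CountDemo, SplitHead). With a count limit, Split stops splitting once it has count - 1 elements and puts the rest of the string, unsplit, into the last element. Joe needs the option name plus two parameters, so the fourth element is just the leftover he ignores.

```csharp
using System;

public static class CountDemo
{
    // Hypothetical helper: splits off at most 'count' elements from the line.
    public static string[] SplitHead(string line, int count)
    {
        char[] separators = { ';', ':', ' ' };
        return line.Split(separators, count, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        string[] parts = SplitHead("option:   first;second;third;;fifth;", 4);

        Console.WriteLine(parts.Length); // 4
        Console.WriteLine(parts[1]);     // first
        Console.WriteLine(parts[2]);     // second

        // The last element is the unsplit remainder of the line, beginning at "third".
        Console.WriteLine(parts[3]);
    }
}
```

With a count of 3, the second parameter would itself be the remainder and would still carry the rest of the line, which is why the limit is one higher than the number of elements Joe actually prints plus the option name.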


Framework Tips VI: Escaping strings for RegEx

Regular Expressions are very powerful and useful, but one thing that stops them from being widely used is their complexity. They’re commonly referred to as a write-only language. Even my colleague, who is very proficient at crafting complex regular expressions, often deletes the whole thing and starts from scratch.

For easy tasks, however, like matching a literal piece of text, you can use two static methods of the Regex class: Escape and Unescape. Escape escapes all characters that have special meaning in the regex language, and Unescape does the opposite.

// Requires: using System.Text.RegularExpressions; using System.Diagnostics;
string text = @"This shirt costs* $20 (55PLN). [* only red color]";
string escapedText = Regex.Escape(text);
Console.WriteLine(escapedText);
string unescapedText = Regex.Unescape(escapedText);
Debug.Assert(text == unescapedText);

This escapes the text to:

"This\ shirt\ costs\*\ \$20\ \(55PLN\)\.\ \[\*\ only\ red\ color]"

Now, when you run a match against a piece of text containing the sentence held by the text variable, the escapedText pattern will match it.

Keep in mind, however, that this method treats its input as literal text, so it won’t be very helpful when you need something more complicated. In that case you’d have to roll up your sleeves and write the pattern manually.
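To illustrate the difference escaping makes, here is a minimal sketch that searches for the sentence inside a longer string, once with the raw text as the pattern and once with the escaped version. The class EscapeDemo, the method ContainsLiteral and the sample input are hypothetical, made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeDemo
{
    // Hypothetical helper: true if 'input' contains 'literal' verbatim.
    public static bool ContainsLiteral(string input, string literal)
    {
        return Regex.IsMatch(input, Regex.Escape(literal));
    }

    public static void Main()
    {
        string text = "This shirt costs* $20 (55PLN). [* only red color]";
        string input = "Label: " + text + " Thank you!";

        // Used raw, characters such as *, $, ( and [ act as regex operators,
        // so the pattern does not describe the literal sentence.
        Console.WriteLine(Regex.IsMatch(input, text));   // False

        // Escaped, every special character is matched literally.
        Console.WriteLine(ContainsLiteral(input, text)); // True
    }
}
```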


Framework Tips V: Extension Methods and nulls

You can call extension methods on null references. It’s obvious when you think about it: an extension method is a normal static method whose first parameter is marked with this.

using System;

namespace ExtensionMethods2
{
    class Program
    {
        static void Main(string[] args)
        {
            string isNull = null;
            Console.WriteLine(isNull.IsNullOrEmpty());
            string isNotNull = "string";
            Console.WriteLine(isNotNull.IsNullOrEmpty());
        }
    }

    public static class StringExtensions
    {
        public static bool IsNullOrEmpty(this string s)
        {
            return s == null || s == "";
        }
    }
}

Notice that if it were a normal instance method (i.e. defined in the System.String class), I would have to check that my string instance is not null before calling it, or I would get a NullReferenceException at runtime. Now I can move that check into the method, where it belongs, and make my code cleaner. Pretty slick 😉
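The contrast can be sketched as follows. NullCallDemo and ThrowsOnInstanceCall are hypothetical names used only here; ToUpper stands in for any real instance method of String.

```csharp
using System;

public static class NullCallDemo
{
    // Same extension method as in the article.
    public static bool IsNullOrEmpty(this string s)
    {
        return s == null || s == "";
    }

    // Hypothetical helper: reports whether calling an instance method on 's' throws.
    public static bool ThrowsOnInstanceCall(string s)
    {
        try
        {
            s.ToUpper(); // real instance method: needs a non-null receiver
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    public static void Main()
    {
        string isNull = null;

        // Extension syntax is compiled into the plain static call on the next line.
        Console.WriteLine(isNull.IsNullOrEmpty());             // True
        Console.WriteLine(NullCallDemo.IsNullOrEmpty(isNull)); // True

        // An instance method on the same null reference throws.
        Console.WriteLine(ThrowsOnInstanceCall(isNull));       // True
    }
}
```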

Framework Tips I: Clear StringBuilder

StringBuilder is a class that lets you manipulate strings in a mutable manner. It has many methods that let you Append, Insert and Replace portions of the string. However, in .NET Framework versions before 4.0 it has no Clear() method that would empty the content of a StringBuilder. There is the Remove(int, int) method, which can do the job, but it requires you to pass two parameters:

int startIndex = 0;
stringBuilder.Remove(startIndex, stringBuilder.Length);

There is, however, an easier and more elegant way to do this. Unlike the Length property of most classes in the framework, StringBuilder’s Length property is writable, so you can simply write:

stringBuilder.Length = 0;
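Both approaches can be compared in a minimal sketch; the class ClearDemo and its two helper methods are hypothetical names introduced for this example. Later framework versions (4.0 onward) did add a StringBuilder.Clear() method, but the two techniques below work on earlier versions as well.

```csharp
using System;
using System.Text;

public static class ClearDemo
{
    // Clears the builder by removing every character.
    public static void ClearByRemove(StringBuilder sb)
    {
        sb.Remove(0, sb.Length);
    }

    // Clears the builder by truncating it through the writable Length property.
    public static void ClearByLength(StringBuilder sb)
    {
        sb.Length = 0;
    }

    public static void Main()
    {
        StringBuilder sb = new StringBuilder("first draft");

        ClearByRemove(sb);
        Console.WriteLine(sb.Length);     // 0

        sb.Append("second draft");
        ClearByLength(sb);
        sb.Append("final");
        Console.WriteLine(sb.ToString()); // final
    }
}
```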
