Category: c#

Framework Tips I: Clear StringBuilder

StringBuilder is a class that allows you to manipulate strings in a mutable manner. It has many methods that let you Append, Insert and Replace portions of the string. However, before .NET 4.0 it has no Clear() method that would let you empty a StringBuilder in one call.
There is a Remove(int, int) method that does the job, but it requires you to pass two parameters:

int startIndex = 0;

stringBuilder.Remove(startIndex, stringBuilder.Length);

There is, however, an easier and more elegant way to do this. Unlike the Length or Count property of most classes in the framework, StringBuilder’s Length property is writable, so you can simply write:

stringBuilder.Length = 0;
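A minimal sketch of both approaches side by side. The class and method names here are made up for illustration; only Remove(int, int) and the Length setter come from the framework.

```csharp
using System.Text;

public static class StringBuilderClearDemo
{
    // Empties the builder by removing everything from index 0 to the end.
    public static StringBuilder ClearWithRemove(StringBuilder sb)
    {
        return sb.Remove(0, sb.Length);
    }

    // Empties the builder by setting its writable Length property to zero.
    public static StringBuilder ClearWithLength(StringBuilder sb)
    {
        sb.Length = 0;
        return sb;
    }
}
```

Either call leaves the builder empty and ready for reuse, so the choice is mostly about readability.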


Nullable<bool> GetHashCode() – bug or a feature?

Today I stumbled upon strange behavior that looks like a bug but seems to be a feature of the .NET Framework. I had a method that performed some action upon an instance of a class, let’s say Customer, based on the hash value of that record. Seems plain and simple, but my unit test exhibited strange behavior: in some cases, although the Customer record had been updated, the code acted as if nothing had changed.

A short investigation pointed to a field of type bool? (Nullable<bool>) that returned the same hash code even though its value had changed.

The problem is that the generic struct Nullable<T> implements GetHashCode like this:

public override int GetHashCode()

{

    if(this.HasValue)

        return Value.GetHashCode();

    return 0;

}

System.Boolean implements the same method like this:

public override int GetHashCode()

{

    if(this)

        return 1;

    return 0;

}

It boils down to the fact that for both null and false we get a hash value of 0. That’s why, although the Customer changed the value of that field from false to null (or the other way around), its hash value stayed the same.

I consider this a bug. On the other hand, returning -1 or 2 might save the day for Nullable<bool>, but what about Nullable<Int32>, where every int value is already a legitimate hash code?
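The collision is easy to reproduce. The sketch below uses made-up names; DistinctHashOf is only one possible workaround for bool? and is an assumption of mine, not something the framework provides.

```csharp
using System;

public static class NullableHashDemo
{
    // Returns the framework's hash code for a bool? value.
    public static int HashOf(bool? value)
    {
        return value.GetHashCode();
    }

    // Hypothetical workaround: give null its own value so it differs from false.
    public static int DistinctHashOf(bool? value)
    {
        if (!value.HasValue)
            return -1;
        return value.Value ? 1 : 0;
    }
}
```

HashOf(null) and HashOf(false) both return 0, while DistinctHashOf tells them apart. The same trick has no equivalent for int?, because there is no spare int value left to stand for null.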


Using ‘using’

Using classes (and structures, for that matter) that implement IDisposable has one implication: when you’re done with an instance, you should call its Dispose() method as soon as possible.

Like in the example:

Pies item1 = new Pies("Pies 1");

Console.WriteLine("Accessing {0}", item1.Name);

item1.Dispose();

To make life easier, and to avoid having to remember to call this method yourself, you can use some ‘syntactic sugar’ in the form of the ‘using’ keyword and rewrite this example as (more or less, as we’ll see in just a second) equivalent code:

using(Pies item1 = new Pies("Pies 1"))

{

    Console.WriteLine("Accessing {0}",item1.Name);

}

This approach has quite a few advantages: you are freed from calling Dispose() explicitly (it gets called when execution leaves the block between ‘{‘ and ‘}’), it’s more readable, and it limits the scope of item1, which may be desirable in some cases (for example, you will not be able to touch item1 after it has been disposed of, which might otherwise cause some nasty run-time errors).

As I said, the ‘using’ keyword is just syntactic sugar. Beneath it is code that looks like this (via Reflector):

Pies item1;

bool CS$4$0000;

item1 = new Pies("Pies 1");

Label_000C:

try

{

    Console.WriteLine("Accessing {0}", item1.Name);

    goto Label_0031;

}

finally

{

Label_0021:

    if ((item1 == null) != false)

    {

        goto Label_0030;

    }

    item1.Dispose();

Label_0030:;

}

Label_0031:

    return;

It wraps the code we put within the ‘using’ block in try/finally to make sure that Dispose() gets called even if something goes wrong.

You can nest ‘using’ statements one within another, like this:

using (Pies item1 = new Pies("Pies 1"))

{

    using (Pies item2 = new Pies("Pies 2"))

    {

        Console.WriteLine("Accessing {0}", item1.Name);

        Console.WriteLine("Accessing {0}", item2.Name);

    }

}

or in a shorter form:

using (Pies item1 = new Pies("Pies 1"))

using (Pies item2 = new Pies("Pies 2"))

{

    Console.WriteLine("Accessing {0}", item1.Name);

    Console.WriteLine("Accessing {0}", item2.Name);

}

What I didn’t know was that when all the variables you’re ‘using’ are of the same type (like in the example: both item1 and item2 are of type Pies), you can shorten it even further to this form:

using (Pies item1 = new Pies("Pies 1"), item2 = new Pies("Pies 2"))

{

    Console.WriteLine("Accessing {0}", item1.Name);

    Console.WriteLine("Accessing {0}", item2.Name);

}

All three snippets result in identical IL. I’d most often opt for the second form. It’s concise, readable, and enables me to use variables of more than one type (like StreamReader/StreamWriter).
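To watch the try/finally behavior in action, here is a small sketch. TrackedResource is a hypothetical stand-in for Pies that records when it is disposed; none of these names come from the original code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for Pies: it only logs its own disposal.
public sealed class TrackedResource : IDisposable
{
    private readonly string _name;
    private readonly List<string> _log;

    public TrackedResource(string name, List<string> log)
    {
        _name = name;
        _log = log;
    }

    public void Dispose()
    {
        _log.Add("disposed " + _name);
    }
}

public static class UsingDemo
{
    // Throws inside the using block and returns the order in which things happened.
    public static List<string> Run()
    {
        List<string> log = new List<string>();
        try
        {
            using (TrackedResource first = new TrackedResource("first", log),
                                   second = new TrackedResource("second", log))
            {
                log.Add("working");
                throw new InvalidOperationException("something went wrong");
            }
        }
        catch (InvalidOperationException)
        {
            log.Add("caught");
        }
        return log;
    }
}
```

Run() yields "working", "disposed second", "disposed first", "caught": both resources are disposed despite the exception, in reverse order of declaration, and before the catch block runs.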


I’m really loving new C# 3.0 features

The more I use the new C# 3.0 syntax, the more I love it. I still basically use .NET 2.0/3.0 and VS 2005 at work, but at home I’m migrating to Orcas. And I have to admit that more and more often, when I’m at work, I wish I could use the new syntax.

  • var keyword: it’s as simple as it can be, but when you get used to writing:
var filteredItem = filter(item);

instead of:

FilteredItem<MtfModificationLog> filteredItem = filter(item);

it’s really hard to go back. Initially I thought that using this new keyword would make code less readable, and that I would have to mouse over a variable name in Visual Studio all the time to see its type. After using it for some time, I changed my mind by 180 degrees. It all depends on how you use it, but I’ve found that using it properly actually makes code easier to read and faster to write, and often you don’t really care about the exact type of the variable, especially with generic types. What you care about is that it’s a FilteredItem<T>; typeof(T) is not that important at this point in the code.

  • lambdas: I love anonymous delegates, they are great but they can be a pain in the butt to type. Take the following code:
MtfConcept c2 = _concept.Filter(

    delegate(MtfModificationInfo i)

        {

            return new FilteredItem<MtfModificationInfo>

                (delegate(MtfModificationInfo i2)

                     {

                         return DateTime.Now.Subtract(i2.Date).Days < 1 &&

                                i2.UserName == "Pies";

                     }, i);

        });

It doesn’t do much, but with all these delegate keywords and long type names it takes a while to find out what’s really going on there. And it’s much more typing than this:

MtfConcept c2 = _concept.Filter(

    i => new FilteredItem<MtfModificationInfo>(

        i2 => DateTime.Now.Subtract(i2.Date).Days < 1 && i2.UserName == "Pies", i));

I find it not only more concise, but easier to read as well. All those delegate keywords, and curly braces (especially with nested anonymous delegates like in the example) create noise that you have to read your way through. With lambdas it’s much cleaner. So far lambdas are my favorite new addition to the language.

  • automatic properties: Is this a class or is this an interface? That’s actually one of the few cases where I prefer to be more elaborate about things, and it’s still easier to accomplish (with less typing) with ReSharper than with that odd interface-like syntax.
  • collection/object initializers: Another feature that is more syntactic sugar than real value, but I like it. Especially in tests, I often want to set up a collection with multiple elements, and this way feels much more natural than calling Add several times or creating an array just to call AddRange. I have one complaint about object initializers. They are great too; as a matter of fact they are so great that it may be tempting to use them all over the place, and thus create mutable objects in places where immutable objects would be more appropriate. With them it’s easier to do the wrong thing.
    [SetUp]

    public void SetUpDates()

    {

        _dates = new List<DateTime>()

        {

            new DateTime(2007,12,4),

            new DateTime(2007,11,4),

            new DateTime(2007,10,4),

            new DateTime(2007,9,4),

            new DateTime(2007,8,4),

            new DateTime(2007,7,4),

            new DateTime(2007,6,4)

        };

    }
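For completeness, here is a short sketch of the two features that have no snippet above: automatic properties and object initializers. The Customer class and its members are hypothetical, chosen only to illustrate the syntax.

```csharp
using System;
using System.Collections.Generic;

public class Customer
{
    // Automatic properties: the compiler generates the backing fields.
    public string Name { get; set; }
    public DateTime Registered { get; set; }
}

public static class InitializerDemo
{
    // Combines a collection initializer with object initializers.
    public static List<Customer> CreateCustomers()
    {
        return new List<Customer>
        {
            new Customer { Name = "Pies 1", Registered = new DateTime(2007, 12, 4) },
            new Customer { Name = "Pies 2", Registered = new DateTime(2007, 11, 4) }
        };
    }
}
```

Note that the object initializer only works because both setters are accessible, which is exactly how it nudges you towards mutable objects.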

I’ll write about remaining features next time.

Extension Methods "Hello World" I: Introduction

I decided that it’s about time to familiarize myself with the new features of .NET 3.5 and C# 3.0. And I don’t mean seeing an overview, because I’ve read many times about what’s new, and I have a basic understanding of all the goodies that the new version of the framework/language brings. What I mean by familiarize is to understand thoroughly, and that is what this “Hello World” series is (hopefully) about.

I’m gonna start with extension methods as this is what I’ve started playing with already, and then… well we’ll see 🙂

Extension Methods are a completely new concept in the OOP world (at least as far as I know), and they may seem odd to people with a strong OOP background. We’re used to the idea that in order to extend a class with new behavior we inherit a new class from it and add methods to this inherited class. Plain and simple. But how about, for example, sealed classes? Let’s take a basic example: you want to be able to alter a string so that the first letter of each word is capitalized.

To accomplish this you would probably want to do something like:

using System;
public class MyString:String
{
    public string ToFirstCapitalized()
    { 
        //any implementation you'd like (as long as it's black)
    }
}

Plain and simple, isn’t it? Unfortunately this code has one flaw: It won’t compile. System.String is marked as sealed, so you can’t inherit from it. The only option you have left is to delegate this work to some other class, for example like this:

using System;
public static class MyStringCapitalizator
{
    public static string ToFirstCapitalized(string input)
    { 
        //any implementation you'd like (as long as it's black)
    }
}

Well, this at least will compile, although now instead of this:

string someString = "It's SoME weIrD StrING. Isn'T iT? IT BEcamE liKE This FRoM ListTENING to JINGLE from BETTer kNOW a FRAmEwoRK";
string capitalizedString = someString.ToFirstCapitalized();

You have to write something like this:

string someString = "It's SoME weIrD StrING. Isn'T iT? IT BEcamE liKE This FRoM ListTENING to JINGLE from BETTer kNOW a FRAmEwoRK";
string capitalizedString = MyStringCapitalizator.ToFirstCapitalized(someString);

You are now able to accomplish the goal, but you have to explicitly call another class’s method, which makes your code more difficult to use.

This is one of the places where you might want to use an Extension Method. Like most new things in C# 3.0 (and VB 9.0, or Chrome 2.0 for that matter), Extension Methods were introduced to enable LINQ. They are basically syntactic sugar, but they can greatly improve the readability of your code and make it easier to use, like in this example.

Using an Extension Method you can pin methods to classes you don’t control (like System.String), so that although the implementation of the method is contained in some custom class, you can call the method as if it was implemented by System.String.

Using Extension Methods is, at its core, plain and simple. You create a static class with a static method whose first parameter is of the type you want to extend, preceded by the keyword this.

 

namespace ExtensionMethodsHelloWorldPartI
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            string someString =
                "It's SoME weIrD StrING. Isn'T iT? IT BEcamE liKE This FRoM ListTENING to JINGLE from BETTer kNOW a FRAmEwoRK";
            
            //you can call method explicitly
            string capitalizedString1 = MyStringCapitalizator.ToFirstCapitalized(someString);
 
            //or as if it was implemented by System.String
            string capitalizedString2 = someString.ToFirstCapitalized();
        }
    }
    public static class MyStringCapitalizator
    {
        public static string ToFirstCapitalized(this string input)
        {
            //any implementation you'd like (as long as it's black)
        }
    }
}

Thanks to this little cosmetic change you can call our new method directly on the type string. It is convenient for users of your code, as they will see this method in their IntelliSense dropdown. Notice that it has a different icon, with a little downward arrow, that lets you immediately distinguish Extension Methods from other methods.

In Part II we will take a closer look at how the compiler decides which of several candidate extension methods to call.
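The method bodies above are left as placeholders. If you want something that actually compiles, one possible body is sketched here. The class name and the capitalization rules (whitespace starts a new word, everything else in the word is lower-cased) are my assumptions, not part of the original example.

```csharp
using System;
using System.Text;

public static class StringCapitalizationExtensions
{
    // Upper-cases the first character of every whitespace-separated word
    // and lower-cases the remaining characters.
    public static string ToFirstCapitalized(this string input)
    {
        if (string.IsNullOrEmpty(input))
            return input;

        StringBuilder result = new StringBuilder(input.Length);
        bool startOfWord = true;
        foreach (char c in input)
        {
            if (char.IsWhiteSpace(c))
            {
                startOfWord = true;
                result.Append(c);
            }
            else
            {
                result.Append(startOfWord
                    ? char.ToUpperInvariant(c)
                    : char.ToLowerInvariant(c));
                startOfWord = false;
            }
        }
        return result.ToString();
    }
}
```

With this in scope, "weIrD StrING".ToFirstCapitalized() gives "Weird String".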

Convert int to string as hex number

Today I needed to parse colors encoded as strings in the form 0xRRGGBB, where RR, GG and BB are the red, green and blue values of a given color encoded in hexadecimal.

The problem I stumbled upon was: what if I have a number like:

0x008000

There was no problem converting it to a System.Drawing.Color, but converting it back to a string was another matter.

I used code like the one below:

string colorString = string.Format("0x{0:X}{1:X}{2:X}",

                       color.R,

                       color.G,

                       color.B);

This would yield:

0x0800

Which is definitely not what I wanted. I needed a way to force each number to be emitted with a leading zero if it’s small enough to fit in one digit.

After far too much searching and trial and error I found a solution that was so obvious when I finally discovered it that I felt it should have been the first thing to try: you just put a number right after ‘X’, indicating the minimum number of digits you want the number to have. Considering the fact that I tried ‘X,2’, ‘X:2’, ‘X;2’, ‘X,00’ and several more before I tried this, I feel now… well, not very well about myself 😉

string string1 = @"0x008000";

Color color = (Color) new ColorConverter().ConvertFromString(string1);

string colorString = string.Format("0x{0:X2}{1:X2}{2:X2}",

                               color.R,

                               color.G,

                               color.B);

Took me like 30 minutes to figure it out.
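If you want to compare the two format strings without pulling in System.Drawing, a sketch like this does it with plain bytes. The class and method names are made up for the example.

```csharp
public static class HexColorFormatter
{
    // "X" prints each component with as few hex digits as possible.
    public static string FormatUnpadded(byte r, byte g, byte b)
    {
        return string.Format("0x{0:X}{1:X}{2:X}", r, g, b);
    }

    // "X2" pads each component to at least two hex digits.
    public static string FormatPadded(byte r, byte g, byte b)
    {
        return string.Format("0x{0:X2}{1:X2}{2:X2}", r, g, b);
    }
}
```

For red 0, green 128, blue 0, FormatUnpadded returns "0x0800" and FormatPadded returns "0x008000".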

How do You regionerate your code?

I’ve been using Regionerate for some time, and I’m addicted to it. Literally, when I have to write some code on a computer that doesn’t have Regionerate installed, I feel odd. This tool is simply pure honey and nuts. The only thing I would change is its default keyboard mapping (ctrl+R for running it), because it collides with the Visual Studio/ReSharper “Refactor” shortcut. So every time I install it I have to go to the VS settings and change it to something else (alt+3 at the moment).

The main reason for this post, however, is not to praise Rauchy and his tool, but to talk a little bit about its customization capabilities. Regionerate is XML driven, that is, its regioneration (strange word, huh?) settings are kept in an XML file. It comes with an XSD, so when you edit the file in VS you’ll get IntelliSense, which is pretty sweet and will save you a lot of time.

The simplest possible Regionerate settings file would look like this:

<CodeLayout xmlns="http://regionerate.net/schemas/0.6.3.8/CodeLayout.xsd">
    <ForEachClass>
        <CreateRegion>
            <PutFields>
            </PutFields>
        </CreateRegion>
    </ForEachClass>
</CodeLayout>

It creates a region and puts all fields into it, like below:

namespace Xtoff.Tmx.Helpers
{
    public class TmxLanguage
    {
        
        #region [rgn] Unnamed Region (1)
 
        private readonly string _value;
 
        #endregion [rgn]
 
        public TmxLanguage(string value)
        {
            _value = value;
        }
        public string Value
        {
            get { return _value; }
        }
    }
}

All fields were put in a single region, and all other members were left below. Hooray! However, I guess very few would be satisfied at this point.

Before we move on, however, there are a few facts to note.

First of all, the region’s header: [rgn] Unnamed Region (1)

[rgn] is the standard prefix Regionerate uses to mark its regions. It was introduced because without some kind of differentiator Regionerate would break your manually created regions when regionerating your file. Thanks to this, it will only look into parts of your class that are not inside any region, or are inside a Regionerate-created region. You can change this prefix or remove it. Keep in mind, however, that without a prefix every region will be treated as a Regionerate-created region.

The next thing is the region’s name. We didn’t set it, so Regionerate set it to the default. I don’t have to tell you that you DO want to name your regions :).

And finally, (1) indicates how many elements are in the region. VERY useful when dealing with large files.

Next step would then probably be setting a name, and looking at other options we have.

If you go back to CreateRegion, hit space and wait for IntelliSense to come up, you’ll be presented with 4 options:

Separating lines: Allows you to specify how many blank lines you want Regionerate to leave between members in a region.

ShowCount: Flag allowing you to turn off showing the count of members inside a region. It defaults to true, and I don’t recommend changing it.

 

Style: this is one of the best and least known features.

Three valid options are Visible, Comment and Invisible. Visible is the default option, and it will wrap your code with a region like the one seen above.

Comment will clean up your code but instead of enclosing it within a region it will only put a comment on top of all fields, like this:

        // [rgn] Unnamed Region (1)
 
        private readonly string _value;

Invisible will clean up your code, but it won’t put in any regions or comments.

Title: sets the title for region 🙂

Going down the XML tree, we can define what we want to put in our region. In our example we chose fields, but you can put basically any class member (field, property, method, event and so on) or an inner region. You can have multiple Put* elements in a region.

Now we’re getting into the really interesting stuff, that is, defining filters for the specific elements we want to put in a region. In the example above we chose to keep all fields in this region, but we could have come up with something much more sophisticated, like a region for non-serialized public fields with names matching a certain regular expression.

I won’t explain every single option in detail, because there are so many that it would take too long. There are also differences between types of elements (for example, for properties you can filter by accessors). In 9 cases out of 10 you will be able to create the rules you want. You can’t create rules like “Region for methods that subscribed to some events” unless you have a naming convention for those, because it would require analysis on a higher level of abstraction, but nonetheless it’s pretty sweet.

And for those interested, I attach my Regionerate settings file.


Fun with ?: operator

First of all, take a look at the following code:

        private string _targetText;
        private int _maxLines;
        private int _maxSize;
 
        public int Lines
        {
            get
            {
                if (_targetText == null)
                    return 0;
                return _targetText.Split('\n').Length;
            }
        }
 
        public bool IsValid
        {
            get
            {
                return  _maxSize == 0 ?
                    true :
                    Size <= _maxSize
                    &&
                    _maxLines == 0 ?
                    true :
                    Lines <= _maxLines;
            }
        }

It’s fairly simple. The most important piece is the IsValid property, which checks whether _targetText meets certain length and line-count limits.

Now, let’s say that max size is 5, max lines is 1 and the target text is “Some incredibly long piece of text“. The million dollar question is: what would IsValid return for these parameters?

It would actually return true, because there is a subtle bug in this code. It may not be apparent, and it’s a tricky beast because you have to know how to look at it to see what’s actually going on. The reason why it returns true, when all signs on earth and in heaven say it should return false, is operator precedence and the way ?: gets translated by the compiler.

Thinking logically, we would expect the code to check whether maxSize is 0 and, if it is, set the left-hand flag to true, and otherwise set it to whether or not Size is less than or equal to max size; then do a similar thing with maxLines and Lines to set the right-hand flag; and then return true if both flags are true, and false otherwise. By thinking this way we assume that it will first evaluate both ?: operators and then && the results. In other words, we assume that the ?: operator has higher precedence than the && operator, which turns out not to be true: && binds tighter, so everything between the first colon and the second question mark becomes the condition of the second ?:.

That’s because people think of the ?: operator as shorthand for if/else, whereas mighty Reflector reveals its true nature to be different. When we compile the code above and then open it in Reflector we’ll see code like this (line breaks and indents added to make it easier to read).

        public bool IsValid
        {
            get
            {
                return 
                    (
                        (this._maxSize == 0) 
                        || 
                        (
                            (
                                (this.Size <= this._maxSize) 
                                && 
                                (this._maxLines == 0)
                            ) 
                            || 
                            (this.Lines <= this._maxLines)
                        )
                    );
            }
        }

I suspect that this code looks slightly different from what you expected. No if/else, only logical ands and ors. And that’s the reason for the unexpected output. If you examine that code closely you’ll notice that, no matter what the size and max size are, if the number of lines is not greater than maxLines it will return true. So how to fix that code? Either by wrapping each ?: expression in parentheses, or by moving them to other properties/methods like this:

        public bool IsValid
        {
            get
            {
                return HasValidSize
                    && HasValidLineCount;
            }
        }
 
        public bool HasValidLineCount
        {
            get
            {
                return _maxLines == 0 ?
                    true :
                    Lines <= _maxLines;
            }
        }
 
        public bool HasValidSize
        {
            get
            {
                return _maxSize == 0 ?
                    true :
                    Size <= _maxSize;
            }
        }
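The parentheses variant can be tried in isolation with a sketch like the one below. It takes the values as plain parameters instead of fields, and it assumes that Size is simply the length of the text (the Size property itself is not shown above).

```csharp
public static class TernaryPrecedenceDemo
{
    // Same shape as the buggy IsValid: ?: binds looser than &&, so the second
    // condition becomes (size <= maxSize && maxLines == 0).
    public static bool Buggy(int size, int maxSize, int lines, int maxLines)
    {
        return maxSize == 0 ? true : size <= maxSize && maxLines == 0 ? true : lines <= maxLines;
    }

    // Parentheses make each ?: a self-contained operand of &&.
    public static bool Fixed(int size, int maxSize, int lines, int maxLines)
    {
        return (maxSize == 0 ? true : size <= maxSize)
            && (maxLines == 0 ? true : lines <= maxLines);
    }
}
```

For the 34-character, single-line text from the question (size 34, max size 5, 1 line, max lines 1), Buggy returns true and Fixed returns false.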

I hope that was informative.

Replace text elements with Regular Expressions

I love/hate regular expressions. I love them for their flexibility and the amount of time you can save using RegEx as opposed to manipulating strings manually. I hate them because writing them is such a pain in the… you get the point. Today I had to quickly assemble a small tool that would replace certain elements in a text file. To be more accurate, it had to read lots of small text files that were kind of bilingual, meaning English/Chinese, and change them to true Unicode bilingual. I said kind of, because the files were written in plain ASCII, with English text written normally and Chinese encoded like this: #$2536#$5231#$AFF, that is, #$, then one or two chars denoting the code of the older (high-order) byte, and two chars denoting the code of the younger (low-order) byte. It would be quite hard to do it manually, especially since the file was a little bit more complicated than I presented here.

I used the Regex class’s Replace method, which is specifically designed to help you replace elements in a string. It takes the string that you want to modify and a MatchEvaluator delegate. The MatchEvaluator gets called every time a match occurs in the given input string; it receives a Match object representing said match and returns the string that substitutes the matched element. It may seem complicated, but the actual code is plain and simple:

public string Decode(string encodedString, Regex pattern)

{

    return pattern.Replace(encodedString, Replace);

}

 

private string Replace(Match match)

{

    string older = match.Groups["Older"].Value;

    string younger = match.Groups["Younger"].Value;

    char character = Convert(older, younger);

    return character.ToString();

}

 

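private char Convert(string first, string second)
{
    // Pad a one-digit low byte so the combined string has both hex digits.
    if (second.Length == 1)
        second = "0" + second;

    // Qualify System.Convert: inside this class the bare name Convert
    // refers to this method, so Convert.ToInt32 would not compile.
    return (char)(System.Convert.ToInt32(first + second, 16));
}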

Groups ‘Older’ and ‘Younger’ denote the location of the older (high-order) and younger (low-order) byte of the Unicode character code.

Method ‘Replace’ simply takes the byte codes from the matched string and calls Convert, which returns the single character represented by this code; that character is put in place of the matched string. Using this simple approach I can easily substitute those codes with the actual characters they represent.
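The pattern itself is not shown above, so here is a self-contained sketch with a pattern of my own. The regular expression is an assumption based on the description of the format (#$, one or two hex digits, then two hex digits), and the class name is made up.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapedTextDecoder
{
    // Assumed pattern: "#$", one or two hex digits for the older (high) byte,
    // then exactly two hex digits for the younger (low) byte.
    private static readonly Regex Pattern = new Regex(
        @"#\$(?<Older>[0-9A-Fa-f]{1,2})(?<Younger>[0-9A-Fa-f]{2})");

    // Replaces every escape sequence with the character it encodes.
    public static string Decode(string encoded)
    {
        return Pattern.Replace(encoded, new MatchEvaluator(ReplaceMatch));
    }

    private static string ReplaceMatch(Match match)
    {
        string code = match.Groups["Older"].Value + match.Groups["Younger"].Value;
        return ((char)Convert.ToInt32(code, 16)).ToString();
    }
}
```

Decode("Hi #$4E2D#$6587") returns "Hi 中文", since U+4E2D and U+6587 are the two characters of that word. Text outside the escape sequences is left untouched.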

Converting custom Strings to DateTime

One of the projects I’m currently working on involves reading a file produced by another tool that has a rather unusual way of storing date and time. For example, 1st of July 2007, 14:00:00 would be stored as 20070701T140000Z.

Using Convert.ToDateTime(string) or DateTime.Parse(string) throws a FormatException. Parsing manually (splitting the string and parsing its substrings to ints to create a DateTime object from them) is not a very elegant solution. There is, however, the static method DateTime.ParseExact(string, string, IFormatProvider). The first parameter is the encoded DateTime, the second is the pattern, and the third can be null (in which case the current culture is used). How to create the pattern you can learn from this msdn article.

Putting it all together, here’s how I parsed those strings to DateTime and DateTime back to strings:

DateTime dt = DateTime.ParseExact("20070622T142203Z", "yyyyMMdd'T'HHmmss'Z'", null);

string s = dt.ToString("yyyyMMdd'T'HHmmss'Z'");
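A slightly more defensive variant is sketched below. It assumes the trailing Z really means UTC, which the file format above does not confirm, and the class name is made up. It passes CultureInfo.InvariantCulture instead of null so the result does not depend on the machine’s culture settings.

```csharp
using System;
using System.Globalization;

public static class CompactTimestamp
{
    private const string Format = "yyyyMMdd'T'HHmmss'Z'";

    // Parses the compact form and marks the result as UTC
    // (assumption: the 'Z' suffix denotes UTC).
    public static DateTime Parse(string text)
    {
        return DateTime.ParseExact(text, Format, CultureInfo.InvariantCulture,
            DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal);
    }

    // Formats a DateTime back into the compact form.
    public static string ToText(DateTime utc)
    {
        return utc.ToString(Format, CultureInfo.InvariantCulture);
    }
}
```

Parse("20070701T140000Z") gives 1 July 2007, 14:00:00 with Kind set to Utc, and ToText turns it back into the same string.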