In a project I’m currently working on, I needed to check whether a particular character is part of a given code page. The problem with .NET’s Encoding class is that, although it maintains a table mapping Unicode characters to codes in a particular code page, it keeps that table in a private field. Moreover, it does its best to replace characters it cannot encode with some fallback character.
One might use this fact and compare the character obtained this way from an Encoding instance with the original character, assuming that if they differ, the character is not part of that code page. But this is not an elegant solution, and it involves a lot of overhead: the char is first converted to bytes and then back again.
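A minimal sketch of that round-trip comparison, for illustration only. The class and method names are hypothetical, and ISO-8859-1 (code page 28591) is just an assumed example code page:

```csharp
using System.Text;

public static class RoundTripCheck
{
    // Hypothetical helper: encodes the character, decodes the bytes again,
    // and reports whether the original character came back unchanged.
    public static bool SurvivesRoundTrip(Encoding encoding, char c)
    {
        string original = c.ToString();

        // Characters the code page cannot represent are replaced by a fallback here.
        byte[] bytes = encoding.GetBytes(original);
        string decoded = encoding.GetString(bytes);

        return decoded == original;
    }
}
```

Calling `RoundTripCheck.SurvivesRoundTrip(Encoding.GetEncoding(28591), '\u0142')` (the letter ł) should report a mismatch, because ł has no slot in ISO-8859-1 and something else comes back after decoding.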
Another solution is to use the overload of Encoding’s static GetEncoding method that accepts an encoder fallback and a decoder fallback, and to pass it the exception fallbacks.
This way, when the user tries to convert a character that is not part of the given Encoding’s character set, the fallback encoder throws an exception. So one might use try/catch and be happy with it, but this too is an awful solution. It is also limiting, because you have to create the Encoding instance yourself, so you’re helpless when you receive an arbitrary encoding from elsewhere.
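A minimal sketch of the exception-based check, assuming the `GetEncoding(int, EncoderFallback, DecoderFallback)` overload. The class and method names are hypothetical:

```csharp
using System.Text;

public static class ExceptionFallbackCheck
{
    // Hypothetical helper: builds a strict encoding for the code page
    // and treats an EncoderFallbackException as "character not supported".
    public static bool CanEncode(int codePage, char c)
    {
        Encoding strict = Encoding.GetEncoding(
            codePage,
            EncoderFallback.ExceptionFallback,
            DecoderFallback.ExceptionFallback);

        try
        {
            strict.GetBytes(new[] { c });
            return true;
        }
        catch (EncoderFallbackException)
        {
            // The exception fallback fired: the code page has no mapping for c.
            return false;
        }
    }
}
```

Note that the helper takes a code page number rather than an Encoding, which is exactly the limitation mentioned above: the strict instance has to be created by your own code.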
After a little bit of poking around I came up with yet another solution that seems to be better, faster and more elegant than those two. However, I didn’t test it thoroughly, so it may have flaws as well (or may not even work at all in some cases).
I created two classes: one inheriting from EncoderFallback and one inheriting from EncoderFallbackBuffer. Basically, my idea was to provide the Encoding instance with a fake encoder fallback that does not try to provide any fallback character. That way Encoding fails silently (and fast), and its GetBytes and GetByteCount methods return an empty array and 0, respectively.
The only problem I had was injecting an actual EncoderFallbackCheckExist instance into Encoding’s EncoderFallback property. Although this property has a setter, trying to set it when IsReadOnly is true raises an exception. Encoding, however, implements ICloneable, and cloning it does not preserve its read-only state. So after it’s cloned, you can safely assign its EncoderFallback.
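The cloning trick on its own can be sketched like this. The class and method names are hypothetical, and any EncoderFallback can be passed in:

```csharp
using System.Text;

public static class WritableEncoding
{
    // Hypothetical helper: returns a writable copy of the encoding
    // with the given encoder fallback assigned.
    public static Encoding WithEncoderFallback(Encoding source, EncoderFallback fallback)
    {
        // Clone() yields a copy whose IsReadOnly is false,
        // so the setter below does not throw.
        Encoding copy = (Encoding)source.Clone();
        copy.EncoderFallback = fallback;
        return copy;
    }
}
```

The source instance is left untouched, so an encoding handed out by `Encoding.GetEncoding` keeps its original fallback.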
I also created a simple EncodingExtensions class with a single extension method that wraps the whole logic and attaches it to the Encoding class, so the check becomes a single method call on any Encoding instance.
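Putting the pieces described above together, a sketch of the whole approach might look like the following. Only the names EncoderFallbackCheckExist and EncodingExtensions come from the text; the buffer class name `EmptyFallbackBuffer` and the method name `CanEncode` are assumptions made for illustration:

```csharp
using System.Text;

// Fake fallback: never offers a replacement character.
public sealed class EncoderFallbackCheckExist : EncoderFallback
{
    // No replacement characters are ever produced.
    public override int MaxCharCount => 0;

    public override EncoderFallbackBuffer CreateFallbackBuffer()
        => new EmptyFallbackBuffer();

    // Hypothetical name for the buffer class.
    private sealed class EmptyFallbackBuffer : EncoderFallbackBuffer
    {
        public override int Remaining => 0;

        // Returning false tells the encoder there is nothing to substitute.
        public override bool Fallback(char charUnknown, int index) => false;

        public override bool Fallback(char charUnknownHigh, char charUnknownLow, int index) => false;

        public override char GetNextChar() => '\0';

        public override bool MovePrevious() => false;
    }
}

public static class EncodingExtensions
{
    // Hypothetical method name: true when the encoding has a mapping for c.
    public static bool CanEncode(this Encoding encoding, char c)
    {
        // Clone first, because the instance we were given may be read-only.
        Encoding probe = (Encoding)encoding.Clone();
        probe.EncoderFallback = new EncoderFallbackCheckExist();

        // An unmappable character contributes zero bytes.
        return probe.GetByteCount(new[] { c }) > 0;
    }
}
```

With this in place, a call such as `Encoding.GetEncoding(28591).CanEncode('\u0142')` is expected to return false for ł and true for a character like é. The sketch clones the encoding on every call; if the check runs in a tight loop, caching the prepared clone per encoding would likely be worthwhile.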
Looks good to me and, as far as I’ve checked, it works. However, if you have a better idea how to accomplish this, please leave a comment.