C# - Convert string to alphanumeric only


Today I had a task to strip every character that is not a letter or a digit from a string. I implemented it as an extension method so that it is reusable.


// Requires "using System.Linq;" and must live inside a static class.
public static string ToAlphaNumericString(this string inString)
{
    if (string.IsNullOrEmpty(inString))
        return string.Empty;

    return new string(inString.Where(c => char.IsLetterOrDigit(c)).ToArray());
}
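
For anyone who wants to paste this into a project, here is a minimal sketch of the method inside the static class that C# requires for extension methods. The class name StringExtensions is my own placeholder, not something the method depends on.

```csharp
using System.Linq;

// "StringExtensions" is a placeholder name; any non-nested static class works.
public static class StringExtensions
{
    public static string ToAlphaNumericString(this string inString)
    {
        // Null or empty input yields an empty string rather than an exception.
        if (string.IsNullOrEmpty(inString))
            return string.Empty;

        // Keep only the characters that are letters or digits.
        return new string(inString.Where(c => char.IsLetterOrDigit(c)).ToArray());
    }
}
```

With this in place, "Hello, World! 123".ToAlphaNumericString() returns "HelloWorld123". Because it is an extension method, calling it on a null string variable is also safe and returns an empty string.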

With LINQ, the code is short and simple. There are two interesting points here.


The first piece of magic in our one-liner is char.IsLetterOrDigit(). It checks whether a character is a letter or a digit, and the Where clause keeps only the characters for which it returns true.
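
One thing worth knowing is that char.IsLetterOrDigit() follows Unicode categories, so accented letters such as 'é' also count as letters. If you need plain ASCII only, a sketch like the following could work. Both ToAsciiAlphaNumericString and IsAsciiLetterOrDigit are hypothetical helpers I made up for illustration.

```csharp
using System.Linq;

public static class AsciiStringExtensions
{
    // Hypothetical variant: keeps only A-Z, a-z and 0-9.
    public static string ToAsciiAlphaNumericString(this string inString)
    {
        if (string.IsNullOrEmpty(inString))
            return string.Empty;

        return new string(inString.Where(c => IsAsciiLetterOrDigit(c)).ToArray());
    }

    // Explicit range checks instead of the Unicode-aware char.IsLetterOrDigit.
    private static bool IsAsciiLetterOrDigit(char c)
    {
        return (c >= 'A' && c <= 'Z')
            || (c >= 'a' && c <= 'z')
            || (c >= '0' && c <= '9');
    }
}
```

For the input "café-42", filtering with char.IsLetterOrDigit() gives "café42", while this ASCII-only variant gives "caf42".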


The Where clause from LINQ returns an IEnumerable<char>. Normally we use ToList() to materialize the result in memory for later use. However, string has a constructor that accepts a char[], so we call ToArray() on the LINQ result instead and pass the resulting char[] to that constructor to instantiate a new string.
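
To make those steps easier to see, here is the same one-liner pulled apart into three statements. This is only an illustrative sketch; PipelineDemo and Filter are names I invented for it.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class PipelineDemo
{
    // Hypothetical helper that spells out the one-liner step by step.
    public static string Filter(string input)
    {
        // Step 1: Where is lazy, so nothing has been evaluated yet.
        IEnumerable<char> kept = input.Where(c => char.IsLetterOrDigit(c));

        // Step 2: ToArray enumerates the sequence and copies the surviving chars into a char[].
        char[] buffer = kept.ToArray();

        // Step 3: the string(char[]) constructor builds the final string.
        return new string(buffer);
    }
}
```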


You have probably noticed by now that what I have done is check the characters one by one and instantiate a new string object from the ones that pass. This is a potential performance issue, because the intermediate array and the new string both have to be allocated, which is a lot of work just for stripping illegal characters. I have not benchmarked it yet, but I think StringBuilder will be faster.
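
For comparison, a StringBuilder version might look like the sketch below. I have not measured it against the LINQ version, and the names StringBuilderExtensions and ToAlphaNumericStringWithBuilder are just placeholders.

```csharp
using System.Text;

public static class StringBuilderExtensions
{
    // Hypothetical StringBuilder-based variant of ToAlphaNumericString.
    public static string ToAlphaNumericStringWithBuilder(this string inString)
    {
        if (string.IsNullOrEmpty(inString))
            return string.Empty;

        // Pre-size the buffer: the result can never be longer than the input.
        var builder = new StringBuilder(inString.Length);
        foreach (char c in inString)
        {
            if (char.IsLetterOrDigit(c))
                builder.Append(c);
        }

        return builder.ToString();
    }
}
```

It should produce the same output as the LINQ version for any input; whether it is actually faster would need a benchmark.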

EDIT: It turns out LINQ actually performs pretty well here, according to some benchmarks.