Channel: Martijn's C# Programming Blog » Beginner

Manipulating Strings in C#: Replacing part of a string / Replacing all occurrences of a sub-string


Very often you need to change part of a string, maybe just once, or many times over. Strings in .NET/C# are immutable, so we cannot actually change a string in place, but we can work on copies. The code example below adds two new extension methods to the C# string class.

  • The ReplaceFirst method finds the first occurrence of “needle” in a string and replaces it with “replacement”.
  • The ReplaceAll method is similar: it steps through the string and replaces every occurrence of “needle” it finds. To avoid a possible infinite loop, it first checks whether “needle” is identical to “replacement”.

using System;
using System.Collections;

namespace StringItems
{
        static class StringExt
        {
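                // Returns a copy of haystack with the first occurrence of needle replaced.
                public static string ReplaceFirst(this string haystack, string needle, string replacement)
                {
                        // Ordinal search, so the match always spans exactly needle.Length characters
                        int pos = haystack.IndexOf(needle, StringComparison.Ordinal);
                        if (pos < 0) return haystack;

                        return haystack.Substring(0, pos) + replacement + haystack.Substring(pos + needle.Length);
                }

                // Returns a copy of haystack with every occurrence of needle replaced.
                public static string ReplaceAll(this string haystack, string needle, string replacement)
                {
                        // An empty needle matches everywhere, and an identical replacement changes nothing
                        if (string.IsNullOrEmpty(needle) || needle == replacement) return haystack;

                        int pos = 0;
                        // ">= 0" so that a match at the very start of the string is replaced too
                        while ((pos = haystack.IndexOf(needle, pos, StringComparison.Ordinal)) >= 0)
                        {
                                haystack = haystack.Substring(0, pos) + replacement + haystack.Substring(pos + needle.Length);
                                // Resume after the inserted text, so a replacement that contains needle cannot loop forever
                                pos += replacement.Length;
                        }
                        return haystack;
                }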

        }
}

Both methods are implemented as extension methods (for more on creating extension methods, see also Finding all occurrences of a string within another string). After you include these methods in your project, you can call them directly on any string instance:

string myString = "Hello World";
string myModifiedString = myString.ReplaceFirst("World", "People");
Console.WriteLine("{0}", myModifiedString); // Writes: "Hello People"
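
An extension method is an ordinary static method, and the instance-style call is just shorthand that the compiler rewrites into a static call. The following stand-alone sketch illustrates this; the class name ExtensionCallDemo is made up for the example, and ReplaceFirst is repeated inside it only so that the sample compiles on its own.

```csharp
using System;

static class ExtensionCallDemo
{
    // Same logic as ReplaceFirst from the post, repeated so this sample is self-contained.
    public static string ReplaceFirst(this string haystack, string needle, string replacement)
    {
        int pos = haystack.IndexOf(needle, StringComparison.Ordinal);
        if (pos < 0) return haystack;
        return haystack.Substring(0, pos) + replacement + haystack.Substring(pos + needle.Length);
    }

    static void Main()
    {
        string myString = "Hello World";

        // Instance-style call, as used in the post
        string viaExtension = myString.ReplaceFirst("World", "People");

        // Equivalent plain static call: this is what the compiler generates
        string viaStatic = ExtensionCallDemo.ReplaceFirst(myString, "World", "People");

        Console.WriteLine(viaExtension);              // Hello People
        Console.WriteLine(viaExtension == viaStatic); // True
    }
}
```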

An example use of the ReplaceAll method:

string myString = "boo foo is not foo boo or foo boo foo";
string myModifiedString = myString.ReplaceAll("boo", "goo");
Console.WriteLine("{0}", myModifiedString); // Writes: "goo foo is not foo goo or foo goo foo"
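
For comparison, the framework's built-in string.Replace(string, string) method also replaces every occurrence, using an ordinal, case-sensitive search. The sketch below is not from the original post: the class name BuiltInReplaceDemo and the wrapper ReplaceWithBuiltIn are hypothetical and exist only for illustration.

```csharp
using System;

static class BuiltInReplaceDemo
{
    // Hypothetical thin wrapper around the built-in method, for illustration only.
    public static string ReplaceWithBuiltIn(string haystack, string needle, string replacement)
    {
        // string.Replace returns a new string; the original is left untouched.
        // Note: it throws an ArgumentException if needle is the empty string.
        return haystack.Replace(needle, replacement);
    }

    static void Main()
    {
        string original = "boo foo is not foo boo or foo boo foo";
        string modified = ReplaceWithBuiltIn(original, "boo", "goo");

        Console.WriteLine(modified); // goo foo is not foo goo or foo goo foo
        Console.WriteLine(original); // boo foo is not foo boo or foo boo foo
    }
}
```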

Why not just use a regular expression?

If you are familiar with the Regex class in C#, you can easily write a regular expression to achieve the same string replacement result:

using System.Text.RegularExpressions;
Regex regex = new Regex("boo");
string result = regex.Replace("boo foo is not foo boo or foo boo foo", "goo");

Regular expressions are flexible, and for anything more complex than a basic string replacement they are often the only practical choice. But they come at a hefty performance price: before a regular expression can run, its pattern has to be parsed and converted into an internal form, and only then is it executed against the input. The .NET runtime caches recently used patterns for the static Regex methods, and a Regex instance can be reused, but using a regular expression for plain string replacement is still much slower.
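
One more thing to keep in mind when swapping a plain replacement for a regular expression: characters such as "." have a special meaning in a pattern, and "$" has a special meaning in the replacement text. A minimal sketch, assuming you want a purely literal replacement, is to escape both. The names RegexLiteralReplaceDemo and ReplaceLiteral are made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexLiteralReplaceDemo
{
    // Hypothetical helper: treats needle and replacement as plain text.
    public static string ReplaceLiteral(string haystack, string needle, string replacement)
    {
        // Regex.Escape turns metacharacters such as '.' into literals.
        Regex regex = new Regex(Regex.Escape(needle));
        // In a replacement pattern "$$" stands for a single literal '$'.
        return regex.Replace(haystack, replacement.Replace("$", "$$"));
    }

    static void Main()
    {
        string input = "a.b axb";

        // Unescaped: '.' matches any character, so "axb" is replaced as well.
        Console.WriteLine(new Regex("a.b").Replace(input, "X")); // X X

        // Escaped: only the literal text "a.b" is replaced.
        Console.WriteLine(ReplaceLiteral(input, "a.b", "X"));    // X axb
    }
}
```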

How much slower are regular expressions for string replacement?

In an earlier post I described the Stopwatch class in System.Diagnostics. It is ideal for a little benchmark testing, so let's compare my string replacement methods with the built-in regular expression library:

string haystack = "boo foo is not foo boo or foo boo foo";
string result;
// "regex" is the Regex instance created in the previous example
Stopwatch sw = Stopwatch.StartNew();
for (int Lp = 0; Lp < 100000; Lp++)
    result = regex.Replace(haystack, "goo");
sw.Stop();
Console.WriteLine("Time used (float): {0} ms", sw.Elapsed.TotalMilliseconds);

And the same for the string replacement functions:

string haystack = "boo foo is not foo boo or foo boo foo";
string result;
Stopwatch sw = Stopwatch.StartNew();
for (int Lp = 0; Lp < 100000; Lp++)
    result = haystack.ReplaceAll("boo", "goo");
sw.Stop();
Console.WriteLine("Time used (float): {0} ms", sw.Elapsed.TotalMilliseconds);

The regular expression code needed 1100 ms, whereas the string replacement code needed just 27 ms. So for this particular example, the string replacement was about 40 times faster than the regular expression.
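
To repeat the measurement yourself, the two fragments can be combined into one stand-alone program. This is only a sketch under a few assumptions: the helper TimeIt, the class name BenchmarkSketch and the method name ReplaceAllOrdinal are invented for this example, and the replacement method is a local variant that also handles a match at the very start of the string. No timings are shown because they depend on the machine and the runtime version.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

static class BenchmarkSketch
{
    // Local variant of the post's ReplaceAll (hypothetical name, avoids clashing with StringExt).
    public static string ReplaceAllOrdinal(this string haystack, string needle, string replacement)
    {
        if (string.IsNullOrEmpty(needle) || needle == replacement) return haystack;
        int pos = 0;
        while ((pos = haystack.IndexOf(needle, pos, StringComparison.Ordinal)) >= 0)
        {
            haystack = haystack.Substring(0, pos) + replacement + haystack.Substring(pos + needle.Length);
            pos += replacement.Length; // continue after the inserted text
        }
        return haystack;
    }

    // Runs the action the given number of times and returns the elapsed milliseconds.
    public static double TimeIt(int iterations, Func<string> action)
    {
        action(); // one warm-up call, so one-time setup cost is not measured
        Stopwatch sw = Stopwatch.StartNew();
        for (int lp = 0; lp < iterations; lp++)
            action();
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }

    static void Main()
    {
        string haystack = "boo foo is not foo boo or foo boo foo";
        Regex regex = new Regex("boo");

        double regexMs = TimeIt(100000, () => regex.Replace(haystack, "goo"));
        double stringMs = TimeIt(100000, () => haystack.ReplaceAllOrdinal("boo", "goo"));

        Console.WriteLine("Regex:             {0} ms", regexMs);
        Console.WriteLine("ReplaceAllOrdinal: {0} ms", stringMs);
    }
}
```

Passing the work as a Func<string> keeps the timing loop identical for both candidates, so any difference comes from the replacement code rather than from the harness.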

This is a post from Martijn's C# Coding Blog.

