A Quick Tip About Using RegexOptions.Compiled

Since Silverlight doesn't support RegexOptions.Compiled, it's the one option that's missing from Regex Hero.

In a full .NET application there are times when this option is worth using. It adds a significant cost to initializing the Regex instance, but in some cases it can also double the speed of the regular expression itself. It therefore works best when you can instantiate the Regex object once and reuse it many times. Jeff Atwood talks about this very point in his article, To Compile or Not to Compile.

So if you've determined that RegexOptions.Compiled is worth the initial cost of compilation in your case, here is one simple solution that we as programmers often forget about.

Lazy Loading
The advantage of lazy loading is that the object we're concerned about is only instantiated when it's first accessed. Therefore, we can use RegexOptions.Compiled without necessarily hurting the initial start-up time of our application. In some cases we can even avoid instantiating objects that are never used...
private static Regex _LibraryTagsRegex = null;
public static Regex LibraryTagsRegex
{
    get
    {
        // Compile the regex only on first access, then reuse the cached instance.
        if (_LibraryTagsRegex == null)
        {
            _LibraryTagsRegex = new Regex(@"^library/tagged/(?<Tags>.+)", RegexOptions.Compiled);
        }
        return _LibraryTagsRegex;
    }
}


We can then call the static LibraryTagsRegex in the example above from anywhere in our code, and it's only going to be instantiated the first time it's accessed.

By the way, in my test I was able to perform 2 million LibraryTagsRegex.IsMatch() calls in 256 ms with the Compiled option (as above) vs 461 ms without it. The compilation time for a Regex this simple is very small and this scenario completely justifies the use of RegexOptions.Compiled.
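
The harness behind those numbers isn't shown, so here is a minimal sketch of how such a comparison could be timed with Stopwatch. The class name, the Measure helper and the sample input string are assumptions made for illustration, and the timings will vary by machine and .NET version.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class RegexBenchmark
{
    // Same pattern as in the post; the input string is a made-up example.
    private const string Pattern = @"^library/tagged/(?<Tags>.+)";
    private const string SampleInput = "library/tagged/dotnet";

    // Times `iterations` IsMatch calls on a regex built with the given options.
    // Construction (and therefore compilation) happens before the timer starts.
    public static TimeSpan Measure(RegexOptions options, int iterations)
    {
        var regex = new Regex(Pattern, options);
        regex.IsMatch(SampleInput); // warm-up so one-time costs are not timed

        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            regex.IsMatch(SampleInput);
        }
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }
}
```

Comparing RegexBenchmark.Measure(RegexOptions.Compiled, 2000000) with RegexBenchmark.Measure(RegexOptions.None, 2000000) gives a rough idea of the per-match difference. To see the start-up cost as well, you would time the Regex constructor separately.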

You can also pull this off with the Lazy<T> type, new in .NET Framework 4.0...
public static readonly Lazy<Regex> LibraryTagsRegex =
    new Lazy<Regex>(() => new Regex(@"^library/tagged/(?<Tags>.+)", RegexOptions.Compiled));
Then you'd just reference LibraryTagsRegex.Value in your code.
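
As a rough sketch of what that looks like in practice, the snippet below wraps the lazy regex in a small helper that pulls out the named Tags group. The Routes class and the ExtractTags method are hypothetical names made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class Routes
{
    // Built on first access to .Value; Lazy<T> is thread-safe by default.
    public static readonly Lazy<Regex> LibraryTagsRegex =
        new Lazy<Regex>(() => new Regex(@"^library/tagged/(?<Tags>.+)", RegexOptions.Compiled));

    // Returns the captured tags, or null when the path does not match.
    public static string ExtractTags(string path)
    {
        Match match = LibraryTagsRegex.Value.Match(path);
        return match.Success ? match.Groups["Tags"].Value : null;
    }
}
```

Until something calls ExtractTags (or touches LibraryTagsRegex.Value directly), the regex is never constructed, which you can confirm through LibraryTagsRegex.IsValueCreated.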
