
Fixed Bugs With Saving Locally, The Analyzer, And a New Recommendation Engine

Bug Fixes

Today's update takes care of a couple of bugs:
  1. Saving locally to your computer with Regex Hero Professional would fail with certain special characters.
  2. The analyzer would fail in rare situations involving very complex character classes.

Recommendation Engine

Lastly, I've included a beta version of a new recommendation engine, available only to users of Regex Hero Professional. All of the recommendations it produces relate to performance, and the range of recommendations is limited at this point, so you'll often see the message, "No recommendations found." I intend to keep working on and improving this over the upcoming months.

Here are the possible recommendations I've included so far:

  1. IgnoreCase is not needed and slows down processing. Please disable it. This one detects when the IgnoreCase flag isn't doing anything for you. For example, \w matches word characters regardless of case, so adding IgnoreCase just for that would be pointless and would slow down the regular expression.
  2. Redundant quantifiers may be slow. Please remove the first quantifier. This identifies situations such as x+x+. The '+' quantifier used back to back on the same character does nothing but make the regular expression much slower than it should be. This can be simplified to xx+.
  3. Alternations are slow. Please change to a character class. This identifies single character alternations such as a|b|c. It is slightly more efficient to use [abc] instead.
  4. Do not repeat 3 or more characters. Use a numbered quantifier instead. This is a minor one, but rather than \w\w\w you can use \w{3} and see slightly improved performance. The gains grow with the number of repeated characters.
  5. Do not perform case insensitive matching with a character class. Use the IgnoreCase option instead. In some old regex implementations, there was no IgnoreCase flag. So the workaround would be to explicitly include both cases, e.g. [Aa][Bb][Cc]. But there's no need for that anymore, and it's more efficient to just use the IgnoreCase option.
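
As a rough illustration, each rewrite suggested by the rules above can be checked to match exactly the same text as the pattern it replaces. This sketch uses Python's re module as a stand-in for the .NET engine that Regex Hero runs on, so it only demonstrates equivalence and says nothing about .NET timings. The CASES table, the sample strings, and the helper names are made up for this example.

```python
import re

# Each entry: (original pattern, its flags, suggested rewrite, its flags, sample inputs).
# The patterns come from the five rules above; the samples are arbitrary.
CASES = [
    (r"\w+\d", re.IGNORECASE, r"\w+\d", 0, ["Ab1", "x9 Y8", "none"]),        # rule 1
    (r"x+x+", 0, r"xx+", 0, ["x", "xx", "xxxx", "axxxb", ""]),               # rule 2
    (r"a|b|c", 0, r"[abc]", 0, ["a", "cab", "xyz", ""]),                     # rule 3
    (r"\w\w\w", 0, r"\w{3}", 0, ["ab", "abc", "abcdefg", "a b c"]),          # rule 4
    (r"[Aa][Bb][Cc]", 0, r"abc", re.IGNORECASE, ["abc", "ABC", "xAbCx", "ab"]),  # rule 5
]


def spans(pattern, text, flags=0):
    """Return the (start, end) span of every non-overlapping match."""
    return [m.span() for m in re.finditer(pattern, text, flags)]


def rewrites_agree():
    """True if every rewrite matches the same spans as its original on all samples."""
    return all(
        spans(before, text, before_flags) == spans(after, text, after_flags)
        for before, before_flags, after, after_flags, samples in CASES
        for text in samples
    )


print(rewrites_agree())
```

A check like this only covers the sample strings listed, so it is a sanity test rather than a proof that two patterns are equivalent.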

More rules are coming, along with improvements to the intelligence and guidance behind the existing ones. But the big feature to come next is to let you fix the problem with the click of a button.
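
To give a feel for what an automatic fix could look like, here is a minimal sketch for rule 3 only, written in Python. It is not how Regex Hero implements anything; the function name alternation_to_class is hypothetical, and the sketch deliberately handles only the simplest case, where the whole pattern consists of single letters or digits separated by '|'. Anything else is returned unchanged.

```python
import re

# The whole pattern must be plain letters/digits separated by '|', e.g. "a|b|c".
# Restricting to alphanumerics avoids characters that need escaping inside [...].
_SINGLE_CHAR_ALTERNATION = re.compile(r"[A-Za-z0-9](?:\|[A-Za-z0-9])+")


def alternation_to_class(pattern: str) -> str:
    """Rewrite a single-character alternation as a character class.

    Patterns that are not a pure single-character alternation are returned as-is.
    """
    if not _SINGLE_CHAR_ALTERNATION.fullmatch(pattern):
        return pattern
    # dict.fromkeys drops duplicate characters while preserving their order.
    chars = dict.fromkeys(pattern.split("|"))
    return "[" + "".join(chars) + "]"


print(alternation_to_class("a|b|c"))  # [abc]
print(alternation_to_class("ab|c"))   # ab|c (left unchanged)
```

A real fixer would need to work on a parsed pattern rather than raw text, so that it could handle alternations nested inside groups and characters that are special inside a character class.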


