Notice (2018-05-24): bugzilla.xamarin.com is now in
Please join us on
Visual Studio Developer Community and in the
Mono organizations on
GitHub to continue tracking issues. Bugzilla will remain
available for reference in read-only mode. We will continue to work
on open Bugzilla bugs, copy them to the new locations
as needed for follow-up, and add the new items under Related
Our sincere thanks to everyone who has contributed on this bug
tracker over the years. Thanks also for your understanding as we
make these adjustments and improvements for the future.
Please create a new report on
GitHub or Developer Community with
your current version information, steps to reproduce, and relevant error
messages or log files if you are hitting an issue that looks similar to
this resolved bug and you do not yet see a matching new report.
I've pasted some example code here:
I first observed this in YUI Compressor, which was running fine for me in Mono 2.10.5, but appeared to hang with 100% CPU when I switched to Mono 2.11.2. I've extracted and simplified the regex that was causing the problem, to something like this:
This shows terrible running time on Mono 2.11.2 on strings of the form:
acccccccccX accccccccX acccccccX...
The thing is, I can't see any reason for it to blow up while backtracking. It looks for "a" and finds it. It then looks for "c" and finds it, then "c" or "e" repeatedly until it hits the "X", at which point it fails and has to backtrack. But there's nothing else to explore while backtracking. It should go all the way back up until it fails to find any match at position 0 and then advances to continue from position 1, quickly failing at every position until the next "a".
This equivalent Regex does not show the massive slowdown:
I tried both Mono 2.10.5 (Debian package) and Mono 2.11.2 (tarball), and for each I tried with and without MONO_NEW_RX=1. The only combination that is incredibly slow is Mono 2.11.2 *without* MONO_NEW_RX=1.
Given that this regression is recent and linked to the "|" operator, I think it's probably to do with this fix for bug 2663:
I'm not familiar enough with the regex implementation to understand what's going on in there.
I want to fix this one, it's probably easy, but I won't be available to do this until at least august 1st (about a week away). My Linux install is corrupt, and I am away from home.
If someone wants to back out that change, they are welcome to, or just wait till I have time to look at it. I can put back that change with the fix for this at the same time.
Even if I can't get my Linux install back up and running again, this particular fix I should be able to fix on my mac mini which I don't have with me.
I can now reproduce this problem, and i have a source tree built where i should be able to troubleshoot it easily. Not sure when i'll find time to troubleshoot this, but hopefully it will be soon.
I have a fix, i believe, just need time to submit it.
Created attachment 2268 [details]
This patch seems to help a lot with bug 6198.
The problem probably really lies most likely in the Check Combo function.. It's supposed to detect we've already been down a particular path and prevent us from trying the same path -- but as seen in the attached patch, which illustrates that we are finding duplicates and doing the same thing multiple times, it should be clear that as written it is not stopping us from duplicating something we already did. It should be a relatively simple function *or method if you prefer, but as i noted in original commit, i wasn't 100% convinced it was bug free and apparently, and unfortunately, it seems that i was right. I will try to find more time to troubleshoot this next week -- this week is kind of crazy for me. The above patch is more of a workaround which at minimum solves this problem and at worst it doesn't make the original problem in 2663 worse as it at least solves it some of the time.
Created attachment 2269 [details]
Revised, not as fast, but more propper
This is a revision, and discards the previous patch. This patch is not as fast as the previous patch, but it is faster than the original (without it).
The original patch for 2663 had a minor glitch in CheckComboMatch where it would allow it to repeat itself.
I cannot confirm that this gives the results you want, but it at minimum it still solves the original and should be quicker than the original, unlike the previous patch on this bug report.
If someone else wants to give this a shot, it's welcome. I"m not sure my fix is right, and i don't have the patience at this time to start looking into this any more than i have.
Could you review this patch?
I'll try to get back on this, in addition to whatever effort Martin is making. If something has already been done, let me know. Otherwise, i posted a message to the mailing list tonight(mono-devel) which describes an outline for what i need to do to fix this and i think it should work. I am familiar enough with the code that if i can get it downloaded and compiled i can probably fix it quicker than someone else just getting familiar with the code. I just need a day when i'll have time and the memory to do this.
I've got a fix which on my slow computer reduces runtime as follows
@ time mono SlowRegEx.exe
@ time mono SlowRegEx.exe
So it almost cuts the runtime in half, which is better, though admittedly this may not be fast enough for what the user wants, but i think i'm going to try to create a pull request. I just confirmed it passes the unit tests including the ones for 2663.
Ok, here's the pull request on the above if it's 'good enough' for now:
If nothing else, this is a major improvement -- while still not breaking the problem reported in 2663 which was fixed earlier.
The current approach is flawed as it changes the asymptotic runtime of the regex engine.