Notice (2018-05-24): bugzilla.xamarin.com is now in read-only mode.
Please join us on Visual Studio Developer Community and in the
Xamarin and Mono organizations on GitHub to continue tracking issues.
Bugzilla will remain available for reference in read-only mode. We
will continue to work on open Bugzilla bugs, copy them to the new
locations as needed for follow-up, and add the new items under
Related Links.
Our sincere thanks to everyone who has contributed on this bug
tracker over the years. Thanks also for your understanding as we
make these adjustments and improvements for the future.
Please create a new report for Bug 1310 on Developer Community or
GitHub if you have new information to add and do not yet see a
matching new report.
Find in Files currently reads all files into strings. This is really problematic with large files. At the very least it should have a cap on the filesize that it loads in this way.
Alternatively, maybe it could convert the search string into binary using various encodings, and match this on a binary stream. This would also enable searching binary files better.
Regexes will still need to use strings, so they'll need the cap.
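The byte-stream idea in the report could be sketched roughly like this (illustrative Python, since the relevant MonoDevelop code is C#; all names here are hypothetical): encode the search string once per candidate encoding, then scan the raw stream in overlapping chunks, so only a small buffer lives in memory regardless of file size.

```python
def stream_search(stream, needle, encodings=("utf-8", "utf-16-le"),
                  chunk_size=64 * 1024):
    """Search a binary stream for `needle` encoded in each of `encodings`.

    Returns the byte offset of the first match found, or -1. Chunks
    overlap by (longest pattern - 1) bytes so that a match spanning a
    chunk boundary is still found.
    """
    patterns = [needle.encode(enc) for enc in encodings]
    overlap = max(len(p) for p in patterns) - 1
    buf = b""
    offset = 0  # byte offset of buf[0] within the stream
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return -1
        buf += chunk
        hits = [pos for pos in (buf.find(p) for p in patterns) if pos != -1]
        if hits:
            return offset + min(hits)
        # keep only the tail that could start a boundary-spanning match
        if len(buf) > overlap:
            offset += len(buf) - overlap
            buf = buf[-overlap:] if overlap > 0 else b""
```

A cap on `chunk_size` bounds memory use the same way a file-size cap would, without refusing to search large files outright.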
Created attachment 849 [details]
Find in Files has problems not only with big files but also with many small ones. For me it crashes when there are more than 1500-2000 files in the pad.
The core RegexSearch() and Search() logic really needs to be moved into FileProvider so that they have the context to be able to search via a stream, rather than the contents of the entire file read into one big string.
Unfortunately, System.Text.RegularExpressions.Regex does not support searching a stream... sigh.
The Search() method can probably be hacked up to work on a stream (although we'd have to check the pattern string for \n's to figure out how many lines our buffer needs to contain to match against). Annoying, but doable.
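The sliding line-buffer idea might look something like this (a sketch, not the actual Search() code; names are made up). The buffer holds one more line than the pattern has newlines, so any match of the literal pattern fits inside the window:

```python
from collections import deque

def stream_find(lines, pattern):
    """Yield (line_number, column) for literal matches of `pattern`,
    which may contain '\n', over an iterable of lines (no trailing '\n').

    Memory use is bounded by the window size (pattern.count('\n') + 1
    lines) rather than the file size.
    """
    window_size = pattern.count("\n") + 1
    window = deque(maxlen=window_size)
    first_line_no = 1
    for line in lines:
        if len(window) == window.maxlen:
            first_line_no += 1  # window is about to slide down one line
        window.append(line)
        if len(window) == window_size:
            text = "\n".join(window)
            col = text.find(pattern)
            # only report matches starting in the first line of the
            # window, so a match is not reported again after sliding
            if col != -1 and col < len(window[0]):
                yield (first_line_no, col)
```

This is exactly why the pattern has to be scanned for '\n' up front; a regex like `\n+` has no fixed line count, which is the open problem mentioned below.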
I don't know how to get around the Regex limitation.
Imposing an arbitrary file size limitation seems lame. Perhaps the way to do it is to "sample" the file to see if it is even textual (try reading a few lines of text and converting to UTF-8?), if that fails, then just assume no matches... I don't think we actually want to match non-text files anyway, do we?
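The "sample the file" heuristic described above is roughly what tools like grep do: read a small prefix, and treat the file as binary if it contains NUL bytes or fails to decode. A sketch (hypothetical helper, illustrative Python):

```python
import codecs

def looks_textual(stream, sample_size=8192):
    """Heuristic: read a small prefix and decide whether a file is text.

    A NUL byte almost never appears in text, and a prefix that cannot
    be decoded as UTF-8 is likely binary. Either way the file can be
    skipped instead of being loaded whole.
    """
    sample = stream.read(sample_size)
    if not sample:
        return True  # empty files are trivially text
    if b"\x00" in sample:
        return False
    # an incremental decoder tolerates a multi-byte sequence cut off
    # at the end of the sample; invalid bytes earlier still raise
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return False
    return True
```

One known trade-off: UTF-16 text is full of NUL bytes, so this heuristic would skip it unless those encodings are probed separately.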
Real text files aren't *generally* likely to be so large that we won't be able to load them (unless they are perhaps massive log files or uuencoded blobs or something?).
That said, I saw some opportunities to reduce the number of copies of the loaded file content as a string and committed a patch to try and limit it to 1 string content buffer per file. That should help things, but might not be enough.
I guess we could actually do the same multi-line hack for Regex (except what do we do if something like \n+ is in the pattern?).
If we didn't have to support multi-line searches, this would be so much easier... would anyone be opposed to us dropping multi-line matching support?
Is this still an issue?
It appears this has been fixed: https://github.com/mono/monodevelop/blob/master/main/src/core/MonoDevelop.Ide/MonoDevelop.Ide.FindInFiles/FindReplace.cs
It only reads the full file into memory when using regexes or replace patterns.
A couple ways we could improve further:
* when replacing, only read full file into memory when it finds a match in that file.
* for regex search, use a streaming regex engine
* don't search files over a particular size (e.g. to stop multi-gigabyte files from OOMing).
Probably not worth worrying about them right now though.
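The size cap and a streaming regex pass could be combined in a guard along these lines (a sketch; the cap value and helper name are made up). For patterns that cannot span lines, applying the regex line by line already gives streaming behaviour:

```python
import os
import re

MAX_SEARCH_SIZE = 32 * 1024 * 1024  # arbitrary cap; skip anything larger

def regex_search_file(path, pattern):
    """Yield (line_number, matched_text) without reading the whole file,
    for regex patterns that do not span lines."""
    if os.path.getsize(path) > MAX_SEARCH_SIZE:
        return  # avoid OOM on multi-gigabyte files
    regex = re.compile(pattern)
    with open(path, "r", errors="replace") as f:
        for line_no, line in enumerate(f, start=1):
            for m in regex.finditer(line):
                yield (line_no, m.group())
```

Patterns containing '\n' would still need the buffered multi-line approach (or a full read), which matches the fixed code's behaviour of only loading whole files for regex and replace.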
OK, lowering the priority and marking as confirmed, since there is still work that can be done here.