Notice (2018-05-24): bugzilla.xamarin.com is now in read-only mode.
Please join us on Visual Studio Developer Community and in the Xamarin and Mono organizations on GitHub to continue tracking issues. Bugzilla will remain available for reference in read-only mode. We will continue to work on open Bugzilla bugs, copy them to the new locations as needed for follow-up, and add the new items under Related Links.
Our sincere thanks to everyone who has contributed on this bug tracker over the years. Thanks also for your understanding as we make these adjustments and improvements for the future.
Please create a new report on GitHub or Developer Community with your current version information, steps to reproduce, and relevant error messages or log files if you are hitting an issue that looks similar to this resolved bug and you do not yet see a matching new report.
Created attachment 14539 [details] mbox-parser.cs I was testing performance of MimeKit's parser this weekend and I noticed a *major* decrease in performance since the last time I ran my benchmarks a year or so ago. To narrow down the problem, I reverted my MimeKit checkout to various points between now and 1.0 and noticed that the was very little difference. At this point I was began to suspect Mono, so I downloaded Mono 4.0.5.1 to check the performance of MimeKit's parser using that. Here are the results (Mac OS X Yosemite 10.10.5): Mono JIT compiler version 4.0.5 ((detached/1d8d582 Tue Oct 20 15:15:33 EDT 2015) Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com TLS: normal SIGSEGV: altstack Notification: kqueue Architecture: x86 Disabled: none Misc: softdebug LLVM: yes(3.6.0svn-mono-(detached/a173357) GC: sgen [fejj@localhost playground]$ mono --llvm mbox-parser.exe larger.mbox Parsed 14896 messages in 1.78 seconds. [fejj@localhost playground]$ mono mbox-parser.exe larger.mbox Parsed 14896 messages in 2.37 seconds. Here are the results for the current stable release of Mono: Mono JIT compiler version 4.2.1 (explicit/6dd2d0d Fri Nov 6 12:25:19 EST 2015) Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com TLS: normal SIGSEGV: altstack Notification: kqueue Architecture: x86 Disabled: none Misc: softdebug LLVM: yes(3.6.0svn-mono-(detached/a173357) GC: sgen [fejj@localhost playground]$ mono --llvm mbox-parser.exe larger.mbox Parsed 14896 messages in 5.11 seconds. [fejj@localhost playground]$ mono mbox-parser.exe larger.mbox Parsed 14896 messages in 5.45 seconds. That's a pretty significant decrease in performance. Attached you'll find my mbox-parser.cs test program. To get MimeKit: git clone https://github.com/jstedfast/MimeKit.git I can't attach the sample mbox that I was using (because it's over a gigabyte), but you can easily recreate it by concatenating multiple copies of MimeKit/UnitTests/TestData/mbox/jwz.mbox.txt into a single file (approx. 100 times). That should give you a large enough input to rule out statistical noise.
Here is what the profiler outputs: 200089 0 1 (wrapper runtime-invoke) <Module>:runtime_invoke_void_object (object,intptr,intptr,intptr) 200089 11 1 MboxParser.Program:Main (string[]) 199843 7 12160 MimeKit.MimeParser:ParseMessage (System.Threading.CancellationToken) 199836 33 12160 MimeKit.MimeParser:ParseMessage (byte*) 161829 21 11440 MimeKit.MimeParser:ConstructMultipart (MimeKit.Multipart,byte*) 160000 59 11440 MimeKit.MimeParser:MultipartScanSubparts (MimeKit.Multipart,byte*) 148425 417 1216769 System.OrdinalComparer:GetHashCode (string) 148007 423 1216766 System.Globalization.TextInfo:GetHashCodeOrdinalIgnoreCase (string) 147584 602 1216766 System.Globalization.TextInfo:GetHashCodeOrdinalIgnoreCase (string,bool,long) 146812 430 1217122 System.Globalization.TextInfo:GetCaseInsensitiveHashCode (string,bool,long) 146381 452 1217122 System.CultureAwareComparer:GetHashCode (string) 145928 423 1217122 System.Globalization.CompareInfo:GetHashCodeOfString (string,System.Globalization.CompareOptions) 145504 596 1217122 System.Globalization.CompareInfo:GetHashCodeOfString (string,System.Globalization.CompareOptions,bool,long) 144631 445 1217122 System.Globalization.CompareInfo:GetSortKey (string,System.Globalization.CompareOptions) 144186 421 1217122 System.Globalization.CompareInfo:CreateSortKey (string,System.Globalization.CompareOptions) 143765 775 1217122 System.Globalization.CompareInfo:CreateSortKeyCore (string,System.Globalization.CompareOptions) 142538 407 1217122 Mono.Globalization.Unicode.SimpleCollator:GetSortKey (string,System.Globalization.CompareOptions) 142130 1027 1217122 Mono.Globalization.Unicode.SimpleCollator:GetSortKey (string,int,int,System.Globalization.CompareOptions) 127447 11392 1217122 Mono.Globalization.Unicode.SimpleCollator:GetSortKey (string,int,int,Mono.Globalization.Unicode.SortKeyBuffer,System.Globalization.CompareOptions) 98029 361 278400 MimeKit.HeaderList:Add (MimeKit.Header) 83428 391 21920 MimeKit.MimeMessage:.ctor (MimeKit.ParserOptions,System.Collections.Generic.IEnumerable`1<MimeKit.Header>) 68671 14597 11582357 Mono.Globalization.Unicode.SimpleCollator:FillSortKeyRaw (int,Mono.Globalization.Unicode.SimpleCollator/ExtenderType,Mono.Globalization.Unicode.SortKeyBuffer,System.Globalization.CompareOptions) 46289 34 76242 MimeKit.MimeParser:Step (byte*) 46227 1309 64080 MimeKit.MimeParser:StepHeaders (byte*) 43171 23 9760 MimeKit.MimeParser:ConstructMessagePart (MimeKit.MessagePart,byte*) 42115 218 343680 MimeKit.MimeParser:ParseAndAppendHeader ()
I guess the problem is that the code ends up here? [System.Security.SecuritySafeCritical] // auto-generated internal unsafe int GetCaseInsensitiveHashCode(String str, bool forceRandomizedHashing, long additionalEntropy) { // Validate inputs if (str==null) { throw new ArgumentNullException("str"); } Contract.EndContractBlock(); #if MONO return StringComparer.CurrentCultureIgnoreCase.GetHashCode (str); #else // Return our result return (InternalGetCaseInsHash(this.m_dataHandle, this.m_handleOrigin, this.m_textInfoName, str, forceRandomizedHashing, additionalEntropy)); #endif } I'm betting that the StringComparer.CurrentCultureIgnoreCase.GetHashCode() is insanely expensive. FWIW, GetCaseInsensitiveHashCode() is effectively called from here: [ResourceExposure(ResourceScope.None)] [ResourceConsumption(ResourceScope.Process, ResourceScope.Process)] internal static int GetHashCodeOrdinalIgnoreCase(String s, bool forceRandomizedHashing, long additionalEntropy) { // This is the same as an case insensitive hash for Invariant // (not necessarily true for sorting, but OK for casing & then we apply normal hash code rules) return (Invariant.GetCaseInsensitiveHashCode(s, forceRandomizedHashing, additionalEntropy)); } Notice that the comment says that it expects to use the InvariantCulture, not the CurrentCulture. This may make a difference. Of course, I'm not sure why they switch to Invariant seeing as how Ordinal should still be cheaper than Invariant?
D'oh, never mind. I misread the code.
The issue should be addressed by https://github.com/mono/referencesource/commit/8382c4605d622d56354e7340bf872caaf36b972a jeff, could you try your test with the latest mono or mono 4.3.2
Unfortunately that change didn't seem to help much: Mono JIT compiler version 4.3.2 (mono-4.3.2-branch/83e0157 Mon Jan 11 18:26:16 EST 2016) Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com TLS: normal SIGSEGV: altstack Notification: kqueue Architecture: x86 Disabled: none Misc: softdebug LLVM: yes(3.6.0svn-mono-master/a173357) GC: sgen [fejj@localhost perf]$ mono --llvm ./mbox-parser.exe gigabyte.mbox Parsed 14896 messages in 10.819862s. [fejj@localhost perf]$ mono ./mbox-parser.exe gigabyte.mbox Parsed 14896 messages in 10.593343s. [fejj@localhost perf]$ mono --version Mono JIT compiler version 4.0.5 ((detached/1d8d582 Tue Oct 20 15:15:33 EDT 2015) Copyright (C) 2002-2014 Novell, Inc, Xamarin Inc and Contributors. www.mono-project.com TLS: normal SIGSEGV: altstack Notification: kqueue Architecture: x86 Disabled: none Misc: softdebug LLVM: yes(3.6.0svn-mono-(detached/a173357) GC: sgen [fejj@localhost perf]$ mono --llvm ./mbox-parser.exe gigabyte.mbox Parsed 14896 messages in 5.737068s. [fejj@localhost perf]$ mono ./mbox-parser.exe gigabyte.mbox Parsed 14896 messages in 6.912308s. (testing this on my laptop which is slower, so posted timings of both 4.3.2 and 4.0.5 for comparison)
Fixed in master