Bug 23316 - System.Linq.Enumerable.GroupBy is very slow.
Summary: System.Linq.Enumerable.GroupBy is very slow.
Status: RESOLVED NOT_REPRODUCIBLE
Alias: None
Product: Class Libraries
Classification: Mono
Component: System.Core ()
Version: 3.2.x
Hardware: PC Linux
: --- normal
Target Milestone: Untriaged
Assignee: Bugzilla
URL:
Depends on:
Blocks:
 
Reported: 2014-09-23 18:37 UTC by Anthony
Modified: 2014-09-29 03:45 UTC (History)
2 users (show)

Tags:
Is this bug a regression?: ---
Last known good build:

Notice (2018-05-24): bugzilla.xamarin.com is now in read-only mode.

Please join us on Visual Studio Developer Community and in the Xamarin and Mono organizations on GitHub to continue tracking issues. Bugzilla will remain available for reference in read-only mode. We will continue to work on open Bugzilla bugs, copy them to the new locations as needed for follow-up, and add the new items under Related Links.

Our sincere thanks to everyone who has contributed on this bug tracker over the years. Thanks also for your understanding as we make these adjustments and improvements for the future.


Please create a new report on GitHub or Developer Community with your current version information, steps to reproduce, and relevant error messages or log files if you are hitting an issue that looks similar to this resolved bug and you do not yet see a matching new report.

Related Links:
Status:
RESOLVED NOT_REPRODUCIBLE

Description Anthony 2014-09-23 18:37:13 UTC
See the output of this program. In the comments are my result times.

class MainClass
    {
        public static void Main(string[] args)
        {
            var tenthousand = 10 * 1000;
            var million = 1000 * 1000;

            var w = Stopwatch.StartNew();
            // 2,227 ms
            Enumerable.Range(0, tenthousand)
                .GroupBy(i => i)
                .Count();
            w.Stop(); Console.WriteLine(w.ElapsedMilliseconds); 
            w.Restart();

            // 377 ms
            Enumerable.Range(0, million)
                .TestGroupBy(i => i)
                .Count();
            w.Stop(); Console.WriteLine(w.ElapsedMilliseconds);
            w.Restart();

            // 878 ms
            Enumerable.Range(0, million)
                .GroupJoin(Enumerable.Range(0,million), i => i, i => i, Tuple.Create)
                .Count();
            w.Stop(); Console.WriteLine(w.ElapsedMilliseconds); 
        }
    }
    public static class blah
    {
        public static Dictionary<TKey, List<TSrc>> TestGroupBy<TSrc, TKey>(
            this IEnumerable<TSrc> src, Func<TSrc,TKey> groupFunc)
        {
            var dict= new Dictionary<TKey, List<TSrc>>();

            foreach (TSrc s in src)
                {
                    TKey key = groupFunc(s);
                    List<TSrc> list ;

                    if (!dict.TryGetValue(key, out list))
                        {
                            list = new List<TSrc>();
                            dict.Add(key, list);
                        }       
                    list.Add(s);        
                }

            return dict;
        }
    }
Comment 1 Anthony 2014-09-23 18:38:53 UTC
This does not appear to be a new issue. http://stackoverflow.com/questions/7097292/linq-groupby-extremely-slow
Comment 2 Anthony 2014-09-23 18:47:02 UTC
Runnning the GroupBy component in LINQPad on Windows, my friend achieved expected performance.

(16:40:41) Squirrel: You on a windows machine?
(16:40:46) Ensign: yes.
(16:40:50) Squirrel: Do you have LINQPAd?
(16:40:58) Ensign: of course.
(16:41:13) Squirrel: Can you run this and tell me the output?

   var w = Stopwatch.StartNew();
            // 2,227 ms
            Enumerable.Range(0, 10000)
                .GroupBy(i => i)
                .Count();
            w.Stop(); Console.WriteLine(w.ElapsedMilliseconds); 
(16:41:21) Squirrel: Cuz in Mono it's taking 2+ seconds :P
(16:41:31) Ensign: 9
(16:42:22) Ensign: 9 is A LOT LESS than 2227
(16:44:26) Ensign: hrm. 9 appears to have taken into account some startup costs.
(16:44:38) Ensign: the output of 

var millis = new List<long>();
	foreach (var e in Enumerable.Range(0, 1000)) {
	var w = Stopwatch.StartNew();
            // 2,227 ms
            Enumerable.Range(0, 10000)
                .GroupBy(i => i)
                .Count();
            w.Stop(); //Console.WriteLine(w.ElapsedMilliseconds);
			millis.Add(w.ElapsedMilliseconds);
	}
	Console.WriteLine(millis.Average());

is 0.16
(16:45:24) Ensign: and took .765s to run.
Comment 3 Marek Safar 2014-09-29 03:45:23 UTC
Tested with Mono 3.10 and it works as expected for me

Results on my machine

12
488
972