Bug 217 - Performance of SelectNodes with XPath query which queries attributes is really, really, slow
Status: RESOLVED FIXED
Alias: None
Product: Class Libraries
Classification: Mono
Component: System.XML ()
Version: 2.10.x
Hardware: PC Linux
Importance: --- enhancement
Target Milestone: Untriaged
Assignee: Bugzilla
Reported: 2011-08-07 11:11 UTC by Bill Seddon
Modified: 2015-03-17 06:18 UTC

Tags: XPath,Attribute,Performance

Description Bill Seddon 2011-08-07 11:11:48 UTC
Hopefully the title says it all. In the code example below, iterating over the nodes in the downloaded document completes in about 1 second, while the XPath query takes ~400 seconds. When this code is run in .NET on Windows, both take about 1 second, which leads me to believe that Mono incurs a significant performance penalty for queries which filter on attributes.

The schema document used below is the one that companies must use when filing their quarterly (10Q) and annual (10K) submissions to the SEC as XBRL instance documents. It contains 15725 elements which meet the query criteria.

using System.Linq;
using System.Xml;

class Program
{
  // XPath: every xsd:element, at any depth, that has a substitutionGroup attribute
  private const string ELEM_NO_REFS_KEY = "//xsd:element[@substitutionGroup]";
  private const string XML_SCHEMA_URL = "http://www.w3.org/2001/XMLSchema";
  private const string XML_SCHEMA_PREFIX = "xsd";

  static void Main(string[] args)
  {
    XmlDocument doc = new XmlDocument();
    doc.Load("http://xbrl.fasb.org/us-gaap/2011/elts/us-gaap-2011-01-31.xsd");
    // Map the "xsd" prefix used in the XPath expression to the XML Schema namespace
    var theManager = new XmlNamespaceManager(doc.NameTable);
    theManager.AddNamespace(XML_SCHEMA_PREFIX, XML_SCHEMA_URL);
    // This one is quick
    var elemList = doc.DocumentElement.ChildNodes.OfType<XmlElement>()
      .Where(node => node.LocalName == "element" &&
             node.NamespaceURI == XML_SCHEMA_URL &&
             node.HasAttribute("substitutionGroup")).Cast<XmlNode>().ToList();
    // This one is slllooowww...
    elemList = doc.SelectNodes(ELEM_NO_REFS_KEY, theManager)
      .OfType<XmlNode>().ToList();
  }
}
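
One caveat when reading the two timings above: the LINQ query inspects only the direct children of the document element, while `//xsd:element` walks every descendant, so the two are not strictly the same query. The following is a minimal sketch, not the reporter's code, for timing both forms side by side without a network download. It assumes that a synthetic in-memory schema of 15725 top-level elements is an acceptable stand-in for the us-gaap document; the names `XPathTiming`, `BuildSchema` and `Time` are hypothetical helpers introduced here for illustration.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Xml;

static class XPathTiming
{
    const string XsdNs = "http://www.w3.org/2001/XMLSchema";

    // Assumption: a flat synthetic schema stands in for the real us-gaap document.
    static XmlDocument BuildSchema(int count)
    {
        var sb = new StringBuilder();
        sb.Append("<xsd:schema xmlns:xsd=\"" + XsdNs + "\">");
        for (int i = 0; i < count; i++)
            sb.Append("<xsd:element name=\"e" + i + "\" substitutionGroup=\"item\"/>");
        sb.Append("</xsd:schema>");

        var doc = new XmlDocument();
        doc.LoadXml(sb.ToString());
        return doc;
    }

    // Runs the query once, prints the node count and elapsed time, returns the count.
    static int Time(string label, Func<int> query)
    {
        var sw = Stopwatch.StartNew();
        int n = query();
        sw.Stop();
        Console.WriteLine("{0}: {1} nodes in {2} ms", label, n, sw.ElapsedMilliseconds);
        return n;
    }

    static void Main()
    {
        XmlDocument doc = BuildSchema(15725);
        var ns = new XmlNamespaceManager(doc.NameTable);
        ns.AddNamespace("xsd", XsdNs);

        // Same filter as the LINQ query in the report: direct children only.
        Time("LINQ over child nodes", () =>
            doc.DocumentElement.ChildNodes.OfType<XmlElement>()
               .Count(e => e.LocalName == "element" &&
                           e.NamespaceURI == XsdNs &&
                           e.HasAttribute("substitutionGroup")));

        // The XPath from the report: descendant axis, searches the whole tree.
        // Reading Count forces the node list to be fully evaluated.
        Time("XPath, descendant axis", () =>
            doc.SelectNodes("//xsd:element[@substitutionGroup]", ns).Count);

        // Child-axis XPath, the closer equivalent of the LINQ query.
        Time("XPath, child axis", () =>
            doc.SelectNodes("/xsd:schema/xsd:element[@substitutionGroup]", ns).Count);
    }
}
```

On this synthetic document all three queries select the same 15725 nodes, so any difference in elapsed time comes from how the runtime evaluates each form. Absolute numbers will vary by runtime version and machine.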
Comment 1 Miguel de Icaza [MSFT] 2011-08-09 10:11:21 UTC
Setting priority to enhancement.
Comment 2 Atsushi Eno 2015-03-17 06:18:54 UTC
We have imported the XPath part of referencesource, so we should no longer be responsible for this kind of issue.