XML Caching in ASP.NET can be dangerous to your health

Posted by Kyle Heon Fri, 18 Sep 2009 00:03:00 GMT

Today was a day of troubleshooting and fixes. We had a site that was getting ready to go live, hosted on a 64-bit version of Windows Server 2008 (IIS7). I won’t bore you with all the details of all the issues we worked through today but we had one nasty issue that, once it happened it brought the server down; 100% CPU usage by the w3wp.exe process and memory usage steadily climbing. Our error logger was catching mostly one xml argument exception (but with a few variations).

A partial stack trace looked like this:
System.Web.HttpUnhandledException: Exception of type 'System.Web.HttpUnhandledException' was thrown. ---> System.ArgumentException: The node to be removed is not a child of this node.
     at System.Xml.XmlNode.RemoveChild(XmlNode oldChild)
     at System.Xml.XmlNode.RemoveAll()
     at System.Xml.XmlElement.RemoveAll()
This error was at the heart of our navigation framework which makes heavy use of xml and xsl. The process we go through for building a pages navigation looks something like this:
  1. Check to see if the xml navigation file has been cached, if so get it from cache, otherwise load it and cache it.
  2. When navigation is ready to be rendered, it is first manipulated to add in an xml node section that stores the current pages selected states
  3. The modified (in memory) xml is then passed to the desired xsl which handles converting the xml to html and returning the transformed string.

Originally the process of modifying the xml (in memory) looked like this:

protected virtual void AddNavigationSelectedStates()
  {
      /*
      <selected>
          <choice level="1">Selected L1</choice>
          <choice level="2">Selected L2</choice>
          <choice level="3">Selected L3</choice>
      </selected>
      */
  
      XmlNode navigationScheme = this._xml.SelectSingleNode("navigationScheme");
      XmlNode selected = this._xml.CreateNode(XmlNodeType.Element, "selected", "");
      if (null != navigationScheme.SelectSingleNode("selected"))
      {
          selected = navigationScheme.SelectSingleNode("selected");
          selected.RemoveAll();
      }
  
      for (int i = 0; i < base.Selected.Count; i++)
      {
          XmlElement choice = this._xml.CreateElement("choice");
          choice.SetAttribute("level", (i + 1).ToString());
          choice.InnerText = base.Selected[i].ToString();
  
          selected.AppendChild(choice);
      }
      navigationScheme.InsertBefore(selected, this._xml.SelectSingleNode("//navigationScheme/mainNav"));
  }

During stress testing we identified that the problem code was the if (null != navigationScheme.SelectSingleNode(“selected”)) block. As I was the original developer I struggled to remember why such code was there, given that this node should never exist as the page builds. So, after reproducing the errors locally with the above code I removed the if statement and retested. Not a single error. Quickly browsed the site and didn’t notice any issues (I should have looked harder, there were plenty).

We posted an updated build, stress tested the site, still no errors. Then another developer noticed that the site navigation wasn’t working properly. So, back into the source I went. Digging deep I eventually found that each time the XmlNode navigationScheme = this._xml.SelectSingleNode(“navigationScheme”); was run, the section of the document grew with all previous “choices”.

Through more debugging and some pair programming with a fellow developer (and a lot of discussion about what was really happening) we discovered that, while we were never stuffing the modified xml back into cache, it was indeed being updated. I’m still trying to determine why this is but I think I have read somewhere that XmlDocument objects are in memory structures that are passed by reference (and not by value as I originally thought).

Armed with this information we attempted to clone the document and modify that and while that ultimately worked, it didn’t fit into the system we had for navigation render behaviour. With some changes to how we passed along the modified xml we ended up with this modified method (name changed to better reflect it’s newer intent):

public virtual XmlDocument SelectedStatesXml()
  {
      /*
      <selected>
          <choice level="1">Selected L1</choice>
          <choice level="2">Selected L2</choice>
          <choice level="3">Selected L3</choice>
      </selected>
      */
  
      XmlDocument xmlDoc = (XmlDocument)this._xml.CloneNode(true);
      XmlNode navigationScheme = xmlDoc.SelectSingleNode("navigationScheme");
      XmlNode selected = xmlDoc.CreateNode(XmlNodeType.Element, "selected", "");
  
      for (int i = 0; i < base.Selected.Count; i++)
      {
          XmlElement choice = xmlDoc.CreateElement("choice");
          choice.SetAttribute("level", (i + 1).ToString());
          choice.InnerText = base.Selected[i].ToString();
  
          selected.AppendChild(choice);
      }
      navigationScheme.InsertBefore(selected, xmlDoc.SelectSingleNode("//navigationScheme/mainNav"));
      return xmlDoc;
  }

Be sure to note in the code above, the call to this._xml.CloneNode(true); as this is important. We are doing a deep copy of the entire XmlDocument. Manipulation is done strictly to the clone and never the master xml document. So now when we call out to render navigation, it looks like this:

return base.OutputNavigation(
      new XslTransformBehaviour(
          ((XmlNavigation)this.Navigation).SelectedStatesXml(),
          XslHelper.GetTransformer(path),
          ((XmlNavigation)this.Navigation).NavigationArguments
          )
      );

I hope this helps someone someday. I don’t think this behaviour is something I’m going to forget any time soon.

Posted in  | Tags , ,  | 1 comment | no trackbacks

Another quick site update

Posted by Kyle Heon Fri, 03 Mar 2006 02:55:00 GMT

Just posted another small update to the website that includes the latest from Typo’s trunk (with an issue free migration, yippee!).

The big (well big in terms of what is being posted) thing tonight is a new randomizing header image. Currently it should cycle through two, but over time I’m going to add more. The tricky part with that is in getting Typo to NOT cache things so that the image will actually randomize.

Thanks to this tip from Scott Laird I’ve modified the way article caching works. Going to keep an eye on this and see if there are ways to find tune things so that I can improve performance while still allowing for the image randomization.

Posted in  | Tags ,  | no comments