I am writing this article because of some work I have been doing recently. My client wants a persistent AJAX application that works in IE6 and later. The web page will be left running for days or weeks on end without being refreshed or the browser being closed, so it cannot have ANY memory leaks.
I currently process incoming XML using an XSL document, which works very well. Under IE6, however, the XSL transformation is output as plain text [please let me know if there is a DOM alternative], which means the only way to insert my newly transformed markup is to use innerHTML. This seems to leak memory like a biatch! Note that the leak doesn't occur in Firefox.
So why does innerHTML leak? Well, after some digging I found out a few interesting facts. Let's look at the who, where, why and how...
Where did innerHTML come from?
innerHTML is not a standard; it is a proprietary extension from Microsoft. Given Microsoft's huge installed base and popularity, other browsers included innerHTML for the sake of compatibility. It was never in the W3C specification. Consequently it isn't future proof and isn't designed for XHTML [XML MIME type].
Why is innerHTML used?
innerHTML is used as a cheap and easy way to write content into a container element like a SPAN or DIV. It tends to run faster than traditional DOM methods and takes substantially less time to write. For example, you can quickly write a table structure into a DIV container using innerHTML like so:
var myDiv = document.getElementById('myDivId');
// A JavaScript string literal cannot span lines, so the markup is concatenated.
myDiv.innerHTML = "<table>" +
    "<tr><th>Title</th></tr>" +
    "<tr><td>Mr</td></tr>" +
    "</table>";
The equivalent code using DOM methods would take considerably more time and lines of code to write. So for lazy coders or new guys, innerHTML provides a great way into the world of JavaScript. To be fair to innerHTML, it works very well for the most part. Yes, it's ugly and it's not "right", but if you're in a rush and don't have the requirements my client has then it can work very well. You should try, where possible, to avoid it though, as it can have unpredictable results and it's a bad habit to rely on innerHTML to change or update your web page. DOM methods, in the long run, will serve you better.
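For comparison, here is a minimal sketch of the same table built with DOM methods. The helper name buildTitleTable and its doc parameter are made up for illustration. The explicit TBODY is there because IE does not render rows that are appended directly to a DOM-built TABLE.

```javascript
// Hypothetical helper: builds the Title/Mr table with DOM methods only.
// `doc` is the document to create nodes in (pass `document` in a browser).
function buildTitleTable(doc) {
    var table = doc.createElement('table');
    // IE only renders DOM-built rows when they sit inside a tbody.
    var tbody = doc.createElement('tbody');
    // Each entry is [cell tag, cell text].
    var rows = [['th', 'Title'], ['td', 'Mr']];
    for (var i = 0; i < rows.length; i++) {
        var tr = doc.createElement('tr');
        var cell = doc.createElement(rows[i][0]);
        cell.appendChild(doc.createTextNode(rows[i][1]));
        tr.appendChild(cell);
        tbody.appendChild(tr);
    }
    table.appendChild(tbody);
    return table;
}

// Usage in a browser:
// document.getElementById('myDivId').appendChild(buildTitleTable(document));
```

That is roughly four times the code of the innerHTML version, which is exactly why innerHTML is so tempting.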
So why / how does innerHTML leak memory?
I do not know the specific answer to this, but we can theorize about a cause. Firstly, as previously discussed, innerHTML is not designed to work with XHTML documents, so it can misinterpret the string you pass to it. This means it doesn't render exactly what you ask it to; basically it adds things you never asked for (a TBODY inside a TABLE, for example). If you parse the HTML later on using DOM methods you may run into errors or bugs where the DOM hierarchy isn't as you expected.
Is there a solution?
Yes! Simply don't use innerHTML. I don't think you'll ever find an article, and there are quite a few, that puts it in a good light [aside from its obvious benefits]. However, if you are in my position and are forced to use it then, apparently, there are a few things you can try.
Firstly, as the memory leak may be coming from its incompatibility with XHTML, it is well worth rewriting your code to accommodate this, or restructuring your XHTML in a different way. Unfortunately it's quite hard to know exactly what innerHTML is doing, so it's a bit of guesswork.
If you wish to use DOM methods to replace your existing innerHTML code then please look at this very good article on innerHTML alternatives.
As I said at the start, if anyone reading this is thinking "what a muppet, I know the answer" then please leave a comment, as I'm sure it would help me and anyone else reading this!

8 comments:
"I don't think you'll ever find an article, and there are quite a few, that puts it in a good light [aside from its obvious benefits]."
This can be said about anything. It's like saying "Nobody says anything good about the bad parts of innerHTML."
The obvious reason to use innerHTML is performance. QuirksMode ran tests showing that innerHTML is several orders of magnitude faster than DOM.
If a browser refresh fixes the memory leak (does it?) it seems like it would be advisable for your application to reload ITSELF every so often.
Unfortunately a browser refresh or navigating away from the page doesn't fix the issue.
I also experimented with changing how I construct my XHTML.
Thanks for the comment, and yes you are correct about the performance of innerHTML.
However, it will probably make developers lazy, or leave them lacking a certain understanding, which will inhibit their progress later on.
My point really was that so long as you understand its pitfalls and know where its limits are then you can do what you like. I suppose that's the beauty of programming.
W3C missed the boat yet again on innerHTML. It should be in the spec. It's a timesaver and a performance booster. W3C usually likes to make things harder instead of easier (thanks, Java geeks).
I also have to code applications that run for weeks on end. Unfortunately I also do not control all the code running on them. We've set up watchdog programs that watch the browser. When it hits a certain memory threshold it closes and restarts. You're correct that simply refreshing will do nothing for memory leaks.
Hey Jim, cheers for the comment.
Good idea about having a minder program.
I will definitely keep that in mind but we really want to keep it so no extra software is required in order to make it work.
That may just be a pipe dream though!
There is actually a DOM compliant methodology for doing the same thing as innerHTML. Assume you have new content Foo. First you remove all the child nodes of the element whose content you're changing, then create a range by calling document.createRange, then create a fragment F of your new content by calling createContextualFragment(Foo) on your range, then call appendChild(F) on that element. Yes, it's confusing, but it's really only about 6 lines of script if you want to make it into a function that will do it for you.
The problem is, I'm pretty sure IE doesn't actually support document.createRange, because that would be complying with standards and god knows IE can never do that.
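A minimal sketch of those steps, with a feature check for browsers that lack range support. The function name replaceContent is hypothetical, and the fallback to innerHTML is an assumption about what you would want when document.createRange is missing (IE6 does not provide it).

```javascript
// Hypothetical helper: replaces the content of `el` with the markup in `html`.
// Returns true when the range-based path was used, false when it fell back.
function replaceContent(el, html) {
    var doc = el.ownerDocument;
    var range = doc.createRange ? doc.createRange() : null;

    // No range support (IE6, for one): fall back to innerHTML.
    if (!range || !range.createContextualFragment) {
        el.innerHTML = html;
        return false;
    }

    // Step 1: remove all existing child nodes.
    while (el.firstChild) {
        el.removeChild(el.firstChild);
    }

    // Step 2: point the range at the element so the parser has a context.
    range.selectNodeContents(el);

    // Step 3: parse the markup into a fragment and append it.
    el.appendChild(range.createContextualFragment(html));
    return true;
}

// Usage in a browser:
// replaceContent(document.getElementById('myDivId'), '<p>New content</p>');
```

Checking for the methods rather than sniffing the browser name means the same function keeps working if a later IE release adds range support.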
It's also true that once you get a few layers deep inside the document and start replacing content with innerHTML, the browser often fails to render the CSS of the new content correctly. However, it's also the case that when you try to do the same thing with DOM compliant methods in competing browsers you get similarly broken results, so I can't point a finger solely at Micro$oft on that one.
Hmm... What a no-brainer of an article. innerHTML leaks, yet you provide no use case or simplified example of your code to reproduce this, nor do you explain what JS you are using for your AJAX transactions.
innerHTML is the fastest way to add content into the DOM using the browser's own parser.
If browsing away or refreshing doesn't clear the issue then you need to do some page unload cleanup in your JavaScript.
Handle your variables (especially variables scoped in and out of closures) and events, and clean them up as you go along. Unloading all events when you change the page also helps prevent memory leaks in IE...
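One way to keep track of handlers so they can be released on unload is a small registry. This is only a sketch: the HandlerRegistry name and its methods are invented for illustration, and it assumes handlers are attached as plain properties such as onclick.

```javascript
// Hypothetical registry that remembers every handler it attaches.
function HandlerRegistry() {
    this.entries = [];
}

// Attach `fn` as el[name] (for example name = 'onclick') and remember it.
HandlerRegistry.prototype.add = function (el, name, fn) {
    el[name] = fn;
    this.entries.push({ el: el, name: name });
};

// Detach everything that was attached and drop the element references.
HandlerRegistry.prototype.releaseAll = function () {
    while (this.entries.length) {
        var entry = this.entries.pop();
        entry.el[entry.name] = null;
        entry.el = null;
    }
};

// Usage in a browser:
// var registry = new HandlerRegistry();
// registry.add(document.getElementById('myDivId'), 'onclick', function () {});
// window.onunload = function () { registry.releaseAll(); };
```

Calling releaseAll before replacing a large chunk of the page, not just on unload, fits the "clean up as you go along" advice.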
Maybe the only correct way to use innerHTML - LEAK FREE http://www.posos.com/page/Index.cfm?SelNavID=2714. Use outerHTML...
I don't believe assigning to innerHTML by itself causes memory leaks. You have to dig deeper to find the real problem.
In IE, the memory leaks are caused by setting up circular structures via any attribute values of elements (not just event handlers) that refer to JavaScript structures that reference the element itself or any container of it. Typically event handlers are the culprit because they are functions that are closed over scope that must be preserved, and if that includes a reference to the containing element, that is the loop back that closes the circle.
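The circular structure described above can be sketched in a few lines. The function names are hypothetical; the point is only the difference between a handler that closes over the element and one that does not.

```javascript
// Leaky pattern in IE6: the handler is a closure whose scope holds `el`,
// and `el` holds the handler through onclick, so the two form a circle.
function attachLeaky(el) {
    el.onclick = function () {
        el.className = 'clicked';
    };
}

// The handler lives outside any scope that contains the element and
// reaches the element through `this` instead, so no circle is formed.
function handleClick() {
    this.className = 'clicked';
}

function attachSafe(el) {
    el.onclick = handleClick;
}

// Usage in a browser:
// attachSafe(document.getElementById('myDivId'));
```

Both versions behave the same when the element is clicked; only the second leaves nothing for IE's collector to trip over.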
JavaScript by itself cleans up its own circular structures just fine, but when DOM is involved, that's when browsers have problems figuring out what to keep and what is garbage. IE6 and earlier don't clean up such circular structures, and IE7 does so only partially (only connected elements get garbage collected when the document closes). Firefox also has some problems in this area, but I haven't investigated to learn the details.
Assigning to innerHTML of an element or deleting elements using DOM calls both disconnect some existing elements from the rest of the document, and if those fragments contain circular structures, this will be leaked memory.
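If that is the cause, one mitigation is to break the circles before the old elements are disconnected. The sketch below clears handler properties on every descendant and only then assigns innerHTML. The names purge and replaceInnerHTML are hypothetical, and the list of handler properties is an assumption that would need extending to match the handlers a real page uses.

```javascript
// Assumed list of handler properties to clear; extend it to suit your page.
var HANDLER_NAMES = ['onclick', 'onmouseover', 'onmouseout', 'onkeydown', 'onchange'];

// Null out known handler properties on `node` and all of its descendants.
function purge(node) {
    var i;
    for (i = 0; i < HANDLER_NAMES.length; i++) {
        if (typeof node[HANDLER_NAMES[i]] === 'function') {
            node[HANDLER_NAMES[i]] = null;
        }
    }
    var kids = node.childNodes;
    if (kids) {
        for (i = 0; i < kids.length; i++) {
            purge(kids[i]);
        }
    }
}

// Purge the children that are about to be discarded, then replace them.
// Handlers on `el` itself are left alone because `el` stays in the document.
function replaceInnerHTML(el, html) {
    var kids = el.childNodes;
    if (kids) {
        for (var i = 0; i < kids.length; i++) {
            purge(kids[i]);
        }
    }
    el.innerHTML = html;
}

// Usage in a browser:
// replaceInnerHTML(document.getElementById('myDivId'), '<p>New content</p>');
```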
Curiously, minimizing an IE window will clean up such garbage.