~ October, 2009 ~
Firebug is your friend
In my mind, there are really three ways to make a significant dent in performance:
1. Find bad algorithms and replace them with fast ones.
2. Find code that doesn’t actually have to be called and skip it.
3. Optimize the code that gets called most often.
And you really can’t do any of the three without a profiler. You might think you know what the problem is, but you won’t know until you profile it. In my case, I started out thinking that I had event listeners hanging around that weren’t letting go of their events, but the profiler (in this case, Firebug) told me I was completely wrong.
To get started profiling in Firebug, go to the console tab, press the ‘profile’ button, do some stuff, and hit the ‘profile’ button again. That’s it.
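The same start/stop can also be driven from code, since Firebug's console exposes `console.profile()` and `console.profileEnd()`. Here's a rough sketch of wrapping a repeated gesture in one profiling session. The helper name `profileGesture` and its arguments are made up for illustration; they are not part of Firebug or of the test described in this post.

```javascript
// Hypothetical helper: run one UI gesture several times inside a single
// profiling session so the per-function totals cover all repetitions.
function profileGesture(label, gesture, repeats) {
  // console.profile / console.profileEnd come from the Firebug console;
  // the guard keeps this helper harmless where they are not available.
  var canProfile = typeof console !== 'undefined' &&
    typeof console.profile === 'function' &&
    typeof console.profileEnd === 'function';

  if (canProfile) console.profile(label);
  try {
    for (var i = 0; i < repeats; i++) {
      gesture(i); // the caller supplies the code that performs the gesture
    }
  } finally {
    // Always close the session, even if the gesture throws.
    if (canProfile) console.profileEnd();
  }
  return repeats;
}
```

Calling something like `profileGesture('drag', doDrag, 8)` (with `doDrag` being whatever function triggers the gesture) should produce one report covering all eight runs.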
You’ll then be presented with a table of timing data, one row per function.
For my money, the two most important columns are ‘own time’ and ‘time’. ‘time’ is the total time spent in a function including any functions that are called by that function, and ‘own time’ is the same thing minus the time taken up by other functions.
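To make the relationship between the two columns concrete, here is a toy calculation. The call tree, the function names, and the numbers are all invented for illustration; they are not taken from the profile discussed in this post.

```javascript
// Toy call tree: each node has its inclusive 'time' (ms) and its direct callees.
// All names and numbers here are made up.
var handler = {
  name: 'handler', time: 100,
  children: [
    { name: 'findElements', time: 60, children: [
      { name: 'traverse', time: 55, children: [] }
    ] },
    { name: 'updateLabel', time: 30, children: [] }
  ]
};

// 'own time' = inclusive time minus the inclusive time of the direct callees.
function ownTime(node) {
  var calleeTotal = 0;
  for (var i = 0; i < node.children.length; i++) {
    calleeTotal += node.children[i].time;
  }
  return node.time - calleeTotal;
}

var handlerOwn = ownTime(handler);                         // 100 - (60 + 30)
var traverseOwn = ownTime(handler.children[0].children[0]); // leaf: own == time
```

In this made-up tree, `handler` has a large 'time' but a small 'own time', while the leaf `traverse` is where the time is actually spent. That is the pattern to look for: a high 'own time' points at the hot code, and a high 'time' points at the callers that lead there.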
Problem: $$('.class') can be SLOW!
I created a test where I did the same UI gesture 8 times, and this is what I discovered. Looking at ‘own time’ told me that most of my time was going to DOM traversal via the $$ function.
Looking at ‘time’ told me that the methods responsible for calling $$ were all central functions that were called in many places throughout my code, so it was worth making them as efficient as possible before figuring out whether there was a way to avoid calling some of them altogether.
Phase 1 — replacing traversals of the entire DOM tree (via $$) with smaller traversals
Roughly speaking, this corresponds to strategy (3).
| Change | Time | Speedup for this step | Cumulative speedup |
|---|---|---|---|
| Replace $$('.class') with $('section').getElements('.class') in critical sections | 2345ms | 20% | 20% |
| Change getElements('.class') to getElements('div.class') in critical sections | 2094ms | 12% | 34% |
| Found more places to do the above optimizations | 1723ms | 22% | 63% |
| Replaced getElements() with getChildren() where possible | 1641ms | 5% | 71% |
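Why does narrowing the search help? The sketch below is a toy model, not MooTools: it uses plain objects instead of DOM nodes and simply counts how many nodes each kind of search has to look at. The tree shape, the class names, and the helpers `el`, `findDescendants`, and `findChildren` are all assumptions made for illustration. It only models the effect of the search root and depth, not the effect of adding a tag name to the selector.

```javascript
// Minimal stand-in for a DOM element (NOT a real DOM node).
function el(className, children) {
  return { className: className || '', children: children || [] };
}

// Search every descendant of root, counting how many nodes are visited.
// Loosely models $$('.class') (root = page) or
// $('section').getElements('.class') (root = section).
function findDescendants(root, className) {
  var matches = [], visited = 0;
  function walk(n) {
    for (var i = 0; i < n.children.length; i++) {
      visited++;
      if (n.children[i].className === className) matches.push(n.children[i]);
      walk(n.children[i]);
    }
  }
  walk(root);
  return { matches: matches, visited: visited };
}

// Search direct children only; loosely models getChildren().
function findChildren(root, className) {
  var matches = [];
  for (var i = 0; i < root.children.length; i++) {
    if (root.children[i].className === className) matches.push(root.children[i]);
  }
  return { matches: matches, visited: root.children.length };
}

// A page with one small section of interest and a large unrelated sidebar.
var section = el('section', [
  el('item', [el('label'), el('label')]),
  el('item', [el('label')])
]);
var filler = [];
for (var i = 0; i < 500; i++) filler.push(el(''));
var page = el('', [section, el('sidebar', filler)]);

var wholePage = findDescendants(page, 'item'); // visits 507 nodes
var scoped = findDescendants(section, 'item'); // visits 5 nodes
var direct = findChildren(section, 'item');    // visits 2 nodes
```

All three searches find the same two elements; the only difference is how much of the tree they have to walk to get there. Note that the direct-children version is only a valid replacement when the elements you want really are immediate children.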
Along the way, I tried all sorts of other optimizations, but none of them yielded much benefit. Now that I was reaching the point of diminishing returns, it was time to see if there were chunks of code I could safely skip.
Phase 2 — skipping handler functions when possible
I knew that there was almost certainly code I was running that could be skipped (strategy 2). Why?
I find that when writing UI code, it is often easier to use brute force to make sure that everything is working consistently. For example, if an AJAX call updates a certain part of the screen, it is often easier to blow away all event handlers from everything and re-add them where needed, rather than just patching up event handlers for the portion of the screen that was updating.
My rationale is that you can always fix this at the end. And well, it was now time to pay the piper.
My test case involved doing the same UI gesture 8 times. And most of the time was going to the following functions:
add_panel_handlers_if_needed(): 8 times
add_content_handlers(): 16 times
add_panel_handlers(): 8 times
actually_do_drag_cleanup(): 8 times
remove_content_handlers(): 24 times
fix_detail_handlers(): 8 times
handle_click(): 8 times
fix_toggle_rte_handlers(): 8 times
add_drag_handlers_and_start(): 8 times
add_insert_handlers(): 8 times
You can see that most functions were called 8 times, once per gesture, but some were called 16 or even 24 times. As it turns out, this was just due to programmer laziness. By adding a few checks, some of those redundant calls could be safely avoided.
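The post doesn't show what those checks looked like, so the following is only one possible shape: a small guard that remembers whether handlers are currently attached and turns repeated add or remove requests into no-ops. The name `makeHandlerGuard` and the `attach`/`detach` callbacks are hypothetical.

```javascript
// Hypothetical guard: tracks whether handlers are attached so that redundant
// add/remove requests return immediately instead of redoing the work.
function makeHandlerGuard(attach, detach) {
  var attached = false;
  return {
    add: function () {
      if (attached) return false;  // already attached: skip
      attach();
      attached = true;
      return true;
    },
    remove: function () {
      if (!attached) return false; // nothing attached: skip
      detach();
      attached = false;
      return true;
    }
  };
}
```

With a guard like this, code paths that defensively call `remove` and then `add` several times per gesture only pay for the first call of each kind. The trade-off is that the flag has to be reset whenever the underlying content is replaced, otherwise new elements would be left without handlers.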
The other thing that was causing extra work is that only certain interactions caused screen updates that needed event handlers to be reattached. By writing some code to check for that, I was able to avoid many of these calls altogether.
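As a sketch of the idea (not the actual check used here, which the post doesn't describe), one simple approach is to remember the content seen last time and only run the fixup functions when it differs. The helper `makeFixupRunner` and the idea of comparing a content string are assumptions for illustration.

```javascript
// Hypothetical helper: run the fixup functions only when the content handed
// to it differs from what it saw last time.
function makeFixupRunner(fixups) {
  var lastContent = null;
  return function runIfChanged(content) {
    if (content === lastContent) return 0; // unchanged: skip every fixup
    lastContent = content;
    for (var i = 0; i < fixups.length; i++) {
      fixups[i]();
    }
    return fixups.length; // number of fixups that actually ran
  };
}
```

The `content` argument could be an HTML string from an AJAX response, a version counter, or anything else that changes exactly when the handlers need reattaching. Comparing large strings has a cost of its own, so a cheap marker such as a counter is likely the better choice when one is available.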
| Change | Time | Speedup for this step | Cumulative speedup |
|---|---|---|---|
| End of phase 1 | 1641ms | 71% | 71% |
| Remove redundant calls to remove_content_handlers and add_content_handlers | 1389ms | 18% | 102% |
| Skip certain fixup calls when content is determined not to have changed | 1073ms | 29% | 162% |
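For anyone checking the arithmetic: the per-step percentages in these tables appear to be the old time divided by the new time, minus one. That formula is inferred from the numbers rather than stated anywhere in the post, and the helper name `speedupPercent` is made up.

```javascript
// Percentage speedup when a measurement drops from beforeMs to afterMs.
// Inferred formula: (before / after - 1) * 100.
function speedupPercent(beforeMs, afterMs) {
  return (beforeMs / afterMs - 1) * 100;
}

// Phase 2 timings taken from the table above.
var step1 = Math.round(speedupPercent(1641, 1389)); // 18
var step2 = Math.round(speedupPercent(1389, 1073)); // 29
```

The cumulative column presumably applies the same formula against the original, unoptimized timing, which isn't listed in the tables.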
(P.S. There is some small part of my brain that tells me that instead of manually worrying about these event handlers, I should just bite the bullet and switch to jQuery. But I’m not there yet.)
So, what’s the moral? First off, doing $$('.class') is slow. Second, large performance boosts usually come from a combination of skipping code that doesn’t have to run and optimizing the code that does. This was no exception.
One more thing. I just have to say that Firebug is amazing. I expected it to have trouble giving useful timings in the face of inconsistent UI gestures and garbage collection, but it did the “right thing”, which many desktop profilers don’t manage to do. If I had one wish, it would be for a way to bundle up calls from specified library files and allocate the time spent in them to the calling function.
Ok. Back to more optimizing.