>A long, long time ago I'd first check whether .innerText had the info I needed, but often the first step was walking down the nodes, eliminating as much of the rest of body.innerWhatever as possible. If that wasn't possible, I'd resort to string-massaging innerHTML. As I was analyzing many different pages, most of them not under my control and changed at inopportune times, I think this approach saved maintenance time in the long run, as I could often build newly needed functionality easily out of existing parsing routines. YMMV depending on how many different page types you need to work on.
This is for a closed environment and the format is always the same.
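
For reference, the layered approach from the quote (look at the text of a specific node first, walk down the nodes to get there, and only then fall back to string work on innerHTML) could be sketched roughly like this in TypeScript. This is only an illustration, not the quoted poster's actual routines: `ElementLike`, `descend`, `textOf` and `extractField` are made-up names, and describing the target by a path of child indices plus a fallback regex is my own assumption about how a fixed layout might be addressed.

```typescript
// Minimal structural type (hypothetical) so the sketch runs without a browser.
// A real DOM element should fit this shape as well.
interface ElementLike {
  innerText?: string;
  textContent: string | null;
  innerHTML: string;
  children: ArrayLike<ElementLike>;
}

// Walk down the tree by child index; undefined if the layout is not as expected.
function descend(root: ElementLike, path: number[]): ElementLike | undefined {
  let node = root;
  for (const index of path) {
    const next: ElementLike | undefined = node.children[index];
    if (next === undefined) return undefined;
    node = next;
  }
  return node;
}

// Prefer innerText, fall back to textContent, and trim surrounding whitespace.
function textOf(node: ElementLike): string {
  return (node.innerText ?? node.textContent ?? "").trim();
}

// Try the node walk first; only resort to a regex over innerHTML if that fails.
// The regex is expected to capture the wanted value in its first group.
function extractField(
  root: ElementLike,
  path: number[],
  fallback: RegExp
): string | undefined {
  const node = descend(root, path);
  if (node !== undefined) {
    const text = textOf(node);
    if (text !== "") return text;
  }
  const match = fallback.exec(root.innerHTML);
  return match?.[1]?.trim();
}
```

In a browser you would presumably pass something like `document.body` as `root`. If the format really never changes, the node walk should be all that is ever exercised and the regex fallback just sits there as a safety net.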