I think some have missed the "a 1 second loop (that has to be cycle perfect)" bit, so I have to simulate the loop, not bypass it - I only bypass a "timing loop" when it doesn't go off and do other things along the way.
The bit of software I'm currently working on 'must' be exactly 10 million cycles from call to return. That loop contains another loop that must be 1 million cycles, and it also jumps out to, and back from, another bit that is 11 cycles.
So you can see that it is easier to simulate than to try adding it all up on paper (or in your head).
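
To give an idea of what the "on paper" sum looks like, here is a rough Python sketch of the cycle bookkeeping. The 10 million, 1 million and 11 cycle figures are the ones above; the call/return overhead and the 3-cycles-per-pass padding loop are made-up placeholders, and it assumes each piece runs exactly once per call, which may not match the real code.

```python
TARGET_CYCLES = 10_000_000  # call to return, from the post

# Known costs in cycles. The first two come from the post; the overhead
# entry is a hypothetical placeholder, not a measured value.
fixed_costs = {
    "inner_loop": 1_000_000,
    "jump_out_and_back": 11,
    "call_return_overhead": 4,  # assumption for illustration only
}


def remaining_cycles(target, costs):
    """Return how many cycles are still unaccounted for."""
    used = sum(costs.values())
    if used > target:
        raise ValueError(f"over budget by {used - target} cycles")
    return target - used


def split_into_padding(cycles, cycles_per_pass):
    """Split a cycle count into whole passes of a padding loop plus
    leftover single-cycle fillers (for example, NOPs)."""
    return divmod(cycles, cycles_per_pass)


left = remaining_cycles(TARGET_CYCLES, fixed_costs)
# 3 cycles per pass is an assumed cost for the padding loop.
passes, fillers = split_into_padding(left, 3)
print(f"{left} cycles to pad: {passes} passes + {fillers} single-cycle fillers")
```

Every time a branch or a jump is added, the table has to be redone by hand, which is exactly where a simulator with a cycle counter saves the effort.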
I'll have another look at the MPLAB one and see if I can make heads or tails of it - I did try it many years ago without much success.