Anyway... has anybody actually tested how much larger your code gets when optimizing for speed, or how much faster it actually runs as a result? I have, and the size/speed difference is usually irrelevant; the choice between optimization levels is more psychological than practical (when compiling the "final product", that is; debugging is a different case).
Back at work after a two-week holiday. I quickly compiled my current project with -Os and -O3.
-O3 tries to optimize code very heavily for performance. It includes all of the optimizations -O2 includes, plus some more.
-Os, on the other hand, tells GCC to "optimize for size." It enables all the -O2 optimizations that don't typically increase the size of the executable, and it toggles a few more flags specifically aimed at reducing code size.
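For anyone curious where the extra bytes go, here's a toy example (not from my project): depending on the target, GCC will typically unroll and/or vectorize a loop like this at -O3, which can be faster but costs a fair amount of extra code, while -Os keeps the small rolled loop. Compare the two object files with size (or avr-size) to see it.

#include <stdint.h>

/* Toy loop: -O3 tends to trade code size for speed here (unrolling,
 * vectorization where the target supports it); -Os keeps the compact
 * rolled version. Build with "-Os -c" and "-O3 -c" and compare. */
void scale_buffer(uint8_t *buf, uint16_t len, uint8_t k)
{
    for (uint16_t i = 0; i < len; i++)
        buf[i] = (uint8_t)(buf[i] * k);
}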
The size difference is more significant than I expected. I haven't measured performance yet; I'll post here if I do some more testing.
I have an elaborate I2C library with lots of small functions. I think most of what -O3 does here is inline those small functions... which is kind of useless for blocking I2C communication; it only wastes memory.
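Just to illustrate what I mean (the function names below are made up for the example, not my actual library, and it assumes an AVR with a TWI peripheral): a typical blocking helper looks something like this. Telling GCC not to inline it, or simply building with -Os, keeps one copy in flash instead of a copy at every call site, and since the CPU is sitting in a busy-wait anyway, inlining buys nothing.

#include <avr/io.h>
#include <stdint.h>

/* Tiny blocking helper: spin until the TWI hardware sets TWINT.
 * The CPU is stuck waiting on the bus here, so saving a call/ret by
 * inlining gains nothing; it only duplicates the loop in flash. */
__attribute__((noinline))
void i2c_wait_ready(void)
{
    while (!(TWCR & (1 << TWINT)))
        ;
}

__attribute__((noinline))
void i2c_send_byte(uint8_t data)
{
    TWDR = data;                        /* load the data register           */
    TWCR = (1 << TWINT) | (1 << TWEN);  /* clear TWINT, start transmission  */
    i2c_wait_ready();                   /* block until the byte has gone out */
}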