It certainly is doable with an MCU. You can use an ADC to digitize the signal (either the onboard ADC or an external one) and then display it. Most 20 MHz scopes can get down to 0.1 us/div, which fits one full cycle of a 10 MHz signal in a division. Assuming 10 points per div, that means you need to sample at 100 MS/s. The fastest onboard ADC I have seen can do 1 MS/s. At 10 points per div, that works out to 10 us/div. Good enough for audio work.
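The arithmetic above is easy to redo for other ADCs. Here is a rough sketch; the function names and the 10-points-per-division default are just my own choices for illustration, not anything standard:

```python
def required_sample_rate(seconds_per_div, points_per_div=10):
    """Samples per second needed to place points_per_div samples in one division."""
    return points_per_div / seconds_per_div


def fastest_timebase(sample_rate, points_per_div=10):
    """Shortest seconds-per-division a given sample rate supports."""
    return points_per_div / sample_rate


# 0.1 us/div at 10 points per div: roughly 100 MS/s
print(required_sample_rate(0.1e-6) / 1e6, "MS/s")

# 1 MS/s onboard ADC at 10 points per div: roughly 10 us/div
print(fastest_timebase(1e6) * 1e6, "us/div")
```

Plug in your own ADC's sample rate to see what timebase you can realistically get.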
Bit depth isn't a big issue: around 300 vertical points (slightly more than 8 bits of depth) would be sufficient.
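To check how many bits a given number of vertical points needs, something like this works. It is a small sketch, and `bits_for_levels` is a made-up helper name:

```python
def bits_for_levels(levels):
    """Smallest ADC resolution (in bits) that can represent `levels` distinct values."""
    if levels < 2:
        return 1
    # (levels - 1) is the largest code; its bit length is the resolution needed.
    return (levels - 1).bit_length()


print(bits_for_levels(256))  # an 8-bit ADC covers 256 levels
print(bits_for_levels(300))  # 300 levels needs one more bit
```

So 300 points lands just past an 8-bit converter, and the 10-bit or 12-bit ADCs common on MCUs have resolution to spare.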