Programming dedicated MCU-based systems is fundamentally different from writing programs to run on a general-purpose computer with an operating system.
Trying to use the same approach as for programming Windows- or Linux-based systems will only result in long-term problems, or in having to use much larger and more expensive target devices than are actually needed.
That may be OK for a one-off hobby project or a small run of some device, but for mass production the per-unit parts cost (and likely power consumption) are critical factors in profitability and market share.
You (the programmer) own the entire system and every resource in it. The only person you have to share resources with or give up CPU cycles for is yourself!
Any interpreted language wastes finite resources and invaluable CPU time; everything should be compiled or assembled machine code, using integer or fixed-point math as far as practical and avoiding trig functions like the plague.
Most serious long-term embedded / real-time device programmers would either fall about laughing or make the warding-off-evil cross sign when someone suggests such things as an RTOS or an interpreted language in an MCU-based device.
It's not that unusual to resort to optimising critical routines by cycle-counting the machine instructions to find ways to tweak the code.
Your most important tools are a really good compiler and hardware debugging interface!
If you are basing things on other people's code or libraries, then you are stuck with whatever system constraints they have - but if practical, you can write your own replacements to fit with the rest of your own code.
For your overall "pipeline" debugging, analyse the inter-device communication - capture data packets and verify the content.
That's where any problems at a device level will be visible. It's pretty obvious where a problem is if a device has valid inputs but produces incorrect outputs!