Look at earlier computers; everything since is a progressive development from those.
The "micro" in microprocessor just means the CPU (the "Processor" or central processing unit) is built as a single integrated circuit, rather than from large numbers of simpler components.
A "Microcontroller" combines a microprocessor with some memory, some amount of I/O and possibly other peripheral devices in the same IC.
The DEC PDP-8 from the 1960s is a great example for learning the concepts - it has only a handful of instructions, and the very first versions did not use any integrated circuits, just conventional resistors, capacitors, diodes and transistors.
(The PDP-8's "L" or link bit is what's now called the Carry bit).
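To make the link/carry idea concrete, here is a minimal Python sketch of a 12-bit add in the style of the PDP-8's TAD instruction, where a carry out of the accumulator complements the link bit. The function name `tad` and the way the state is passed around are my own choices for illustration, not anything from DEC documentation.

```python
MASK_12 = 0o7777  # the PDP-8 accumulator is 12 bits wide


def tad(ac, link, operand):
    """Add operand to the accumulator; a carry out of the top bit flips the link."""
    total = ac + operand
    carry = total >> 12          # 1 if the sum did not fit in 12 bits, else 0
    return total & MASK_12, link ^ carry


# Adding 1 to the largest 12-bit value wraps to zero and sets the link.
ac, link = tad(0o7777, 0, 1)
print(oct(ac), link)
```

On most later CPUs the carry flag is simply set or cleared by an add rather than toggled, but it serves the same purpose: it is the "thirteenth bit" that lets you chain additions across several words.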
For any CPU, the different binary instructions are broken down into sections (fields).
One section typically selects the class or group of instruction, to do such things as move a byte or word from memory to the CPU accumulator, move the accumulator to memory, increment or decrement, add or subtract etc.
And the most important ones, the ones that technically make it a "computer": comparisons - is the data in one location greater than, less than or equal to the data in another location, etc.
Then depending on that result being true or false, continue with the next instruction or jump to a different part of the program to do something else.
The rest of the binary instruction specifies what data, memory address or literal number is being acted on.
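As a rough illustration of that splitting, this Python sketch pulls apart a 12-bit PDP-8 memory-reference instruction: the top three bits pick the instruction, the next two are the indirect and page bits, and the low seven bits are the address offset. The `Decoded` type and `decode` function are names I made up for the example.

```python
from typing import NamedTuple

# Mnemonics for the PDP-8's eight opcodes (the top three bits of each word).
OPCODES = ["AND", "TAD", "ISZ", "DCA", "JMS", "JMP", "IOT", "OPR"]


class Decoded(NamedTuple):
    mnemonic: str
    indirect: bool       # address field points at a pointer, not the data
    current_page: bool   # offset is within the current page rather than page 0
    offset: int          # 7-bit address within the page


def decode(word: int) -> Decoded:
    """Split a 12-bit word into fields (meaningful for opcodes 0-5 only;
    IOT and OPR use the low nine bits differently)."""
    opcode = (word >> 9) & 0o7
    indirect = bool((word >> 8) & 1)
    current_page = bool((word >> 7) & 1)
    offset = word & 0o177
    return Decoded(OPCODES[opcode], indirect, current_page, offset)


print(decode(0o1234))  # a TAD from offset 0o34 in the current page
```

Other CPUs use different field widths and layouts, but the decoding step is the same kind of shift-and-mask operation.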
If that data location is where a peripheral port or register is mapped, then the instruction could be either reading or setting pin levels, or accessing e.g. UART (serial data) or timer registers.
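A toy model of that idea in Python: one address in the memory map belongs to an output port, so an ordinary "store" to it changes pin levels instead of saving data. The address `PORT_ADDR`, the `Bus` class and the 256-byte memory size are all invented for the sketch; every real part has its own memory map, given in its datasheet.

```python
# Hypothetical address of an 8-bit output port (an assumption for this sketch).
PORT_ADDR = 0x10


class Bus:
    """Toy address space: ordinary RAM plus one memory-mapped output port."""

    def __init__(self):
        self.ram = [0] * 256
        self.pins = [False] * 8   # levels of the eight port pins

    def write(self, addr, value):
        if addr == PORT_ADDR:
            # A store to the port address drives the pins instead of storing data.
            self.pins = [bool((value >> bit) & 1) for bit in range(8)]
        else:
            self.ram[addr] = value & 0xFF

    def read(self, addr):
        if addr == PORT_ADDR:
            # A load from the port address reads the pin levels back as a number.
            return sum(1 << bit for bit, level in enumerate(self.pins) if level)
        return self.ram[addr]


bus = Bus()
bus.write(PORT_ADDR, 0b00000101)   # same "store" instruction, but pins 0 and 2 go high
print(bus.pins)
```

The CPU itself does not know the difference; it is the address decoding around it that routes the access to RAM or to the peripheral.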
Programs, at the "machine code" level that CPUs execute, are just lists of numbers that represent those various instructions, memory addresses, values etc.
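To show what "just a list of numbers" looks like in practice, here is a small Python sketch of a made-up machine that fetches, decodes and executes such a list, including a conditional jump. The four opcodes and their numbering are entirely hypothetical and far simpler than any real CPU; each instruction is two numbers, an opcode followed by an operand.

```python
# Hypothetical opcode numbers for a toy machine (not any real instruction set).
HALT, LOAD_LIT, SUB_LIT, JUMP_IF_NONZERO = 0, 1, 2, 3


def run(program, max_steps=1000):
    """Execute the program; return the final accumulator and instructions executed."""
    acc = 0      # accumulator
    pc = 0       # program counter: index of the next instruction
    steps = 0
    while steps < max_steps:
        opcode, operand = program[pc], program[pc + 1]   # fetch
        pc += 2
        steps += 1
        if opcode == HALT:
            break
        elif opcode == LOAD_LIT:
            acc = operand            # operand is a literal value
        elif opcode == SUB_LIT:
            acc -= operand
        elif opcode == JUMP_IF_NONZERO:
            if acc != 0:
                pc = operand         # operand is an address within the program
        else:
            raise ValueError(f"unknown opcode {opcode}")
    return acc, steps


# Count down from 3 to 0: the whole program is nothing but numbers.
program = [
    1, 3,   # address 0: LOAD_LIT 3
    2, 1,   # address 2: SUB_LIT 1
    3, 2,   # address 4: JUMP_IF_NONZERO 2  (loop back while acc != 0)
    0, 0,   # address 6: HALT
]
print(run(program))
```

An assembler is essentially the tool that turns the readable names in the comments into the numbers in the list.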