To encode 4 buttons, you will need four frequencies: either one frequency per button, or dual frequencies like DTMF. One big advantage of dual frequencies is that two specific frequencies are far less likely to show up at the same time in random speech and music than a single frequency is (a single frequency shows up all the time).
Each switch will have a 'row' frequency and a 'column' frequency. So, for a 2x2 switch array, you will need two row and two column frequencies.
Pressing a button on the original Bell System DTMF keypad closed two switches. One would select a tap on an inductor to generate one of the four row tones. The other switch would select a tap on a second inductor to generate one of the column tones. There was probably a third switch to apply power to the oscillator.
You need to do something similar, either with four different oscillators, or with two oscillators and a means to make each oscillator run at two different frequencies.
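If it helps to see the row/column idea in numbers, here is a rough Python sketch of the transmit side. The frequencies are my assumption (borrowed from the first two DTMF rows and columns), and the button numbering, sample rate, and function names are made up for illustration; a real build would use whatever oscillators you settle on.

```python
import math

# Assumed values: the first two DTMF row and column frequencies.
ROW_HZ = (697.0, 770.0)
COL_HZ = (1209.0, 1336.0)
SAMPLE_RATE = 8000  # samples per second (assumed)


def button_tones(button):
    """Return (row_hz, col_hz) for button 0..3 in a hypothetical 2x2 layout."""
    if button not in range(4):
        raise ValueError("button must be 0, 1, 2 or 3")
    row, col = divmod(button, 2)  # button 0 -> row 0/col 0, button 3 -> row 1/col 1
    return ROW_HZ[row], COL_HZ[col]


def button_samples(button, duration_ms=50):
    """Sum the row and column sine waves, as two oscillators mixed together would."""
    row_hz, col_hz = button_tones(button)
    n = SAMPLE_RATE * duration_ms // 1000
    return [
        0.5 * math.sin(2 * math.pi * row_hz * i / SAMPLE_RATE)
        + 0.5 * math.sin(2 * math.pi * col_hz * i / SAMPLE_RATE)
        for i in range(n)
    ]
```

Each tone is scaled by 0.5 so the sum stays within -1 to +1, the same way you would keep the mixed oscillator output from clipping.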
At the receiving end, you need four bandpass (or notch) filters, one for each frequency. The inputs of the filters are connected in parallel. You may need an input buffer amplifier to control the source impedance that each filter sees.
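If you end up doing the receive side in a microcontroller instead of analog filters, one common software stand-in for a narrow bandpass filter is the Goertzel algorithm. This is a swapped-in technique, not the analog filter bank described above, and the frequencies, sample rate, and names below are assumptions for illustration only. The point it shows is the same: all four detectors look at the same input.

```python
import math

# Assumed frequencies: two rows and two columns taken from standard DTMF.
FILTER_HZ = (697.0, 770.0, 1209.0, 1336.0)
SAMPLE_RATE = 8000  # samples per second (assumed)


def goertzel_power(samples, freq_hz, sample_rate=SAMPLE_RATE):
    """Return the signal power near freq_hz using the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate)
    s1 = s2 = 0.0
    for x in samples:
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    return s1 * s1 + s2 * s2 - coeff * s1 * s2


def filter_bank(samples):
    """Run the same input through all four detectors (inputs in parallel)."""
    return {f: goertzel_power(samples, f) for f in FILTER_HZ}
```

You would then compare each power against a threshold to get four yes/no signals. The threshold has to be found by experiment, since it depends on signal level and block length.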
Then use some logic circuitry to decode the pairs of tones into the four discrete on/off outputs.
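The decode logic is just four AND gates, one per row/column pair. Here is a small Python sketch of the truth table, assuming the four filter outputs have already been turned into true/false signals. The button numbering and function names are hypothetical. I also added a validity check (exactly one row and one column) that the gates alone would not give you, since it helps reject speech and music.

```python
def decode_button(row0, row1, col0, col1):
    """Return the button number 0..3, or None if the tone pair is not valid."""
    rows = (bool(row0), bool(row1))
    cols = (bool(col0), bool(col1))
    # A valid press has exactly one row tone and exactly one column tone.
    if sum(rows) != 1 or sum(cols) != 1:
        return None
    return rows.index(True) * 2 + cols.index(True)


def button_outputs(row0, row1, col0, col1):
    """Return four on/off outputs, one per button (row AND column)."""
    button = decode_button(row0, row1, col0, col1)
    return tuple(button == n for n in range(4))
```

For example, row 0 together with column 1 turns on only the second output, and a lone row tone with no column tone turns on nothing.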