Hi,
There are many different shift registers; the 4094 has worked for me, as has the 74xx164. You should only need a clock and a data line, but you might also need a 'strobe' line if your shift register has a latch, so that the data is loaded to all the outputs at the same time.
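For what it's worth, here is roughly how the clock/data/strobe sequence looks in C. It's only a sketch: the ShiftReg struct is a software stand-in for the chip so the logic can be tried on a PC, and set_data, pulse_clock and pulse_strobe are made-up names that you'd replace with your micro's GPIO writes. It assumes MSB-first shifting and treats the strobe as a simple "copy to outputs" pulse; which physical output pin each bit lands on depends on the part, so check the datasheet.

```c
#include <stdint.h>

/* Software model of an 8-bit shift register with an output latch.
   On real hardware these fields are inside the chip. */
typedef struct {
    uint8_t shift;   /* internal shift stage */
    uint8_t outputs; /* latched parallel outputs */
    int data;        /* current level on the data line */
} ShiftReg;

/* Hypothetical pin helpers: swap these for real GPIO writes. */
static void set_data(ShiftReg *sr, int level) {
    sr->data = level ? 1 : 0;
}

static void pulse_clock(ShiftReg *sr) {
    /* One clock edge: shift along by one and take in the data bit. */
    sr->shift = (uint8_t)((sr->shift << 1) | sr->data);
}

static void pulse_strobe(ShiftReg *sr) {
    /* Strobe: copy the shift stage to the outputs in one go. */
    sr->outputs = sr->shift;
}

/* Clock out one byte, MSB first, then strobe it onto the outputs. */
static void shift_out_byte(ShiftReg *sr, uint8_t value) {
    for (int i = 7; i >= 0; i--) {
        set_data(sr, (value >> i) & 1);
        pulse_clock(sr);
    }
    pulse_strobe(sr);
}
```

Note the outputs only change once, at the strobe, no matter what passes through the shift stage while clocking.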
If you've got more than one seven-segment display then you can multiplex them; however, you'll still need pins for the segments, as well as a common anode/cathode line for each display. The common lines can be driven from a 3-to-8 line decoder, a decade counter or a Johnson counter (so many ways with logic).
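To give an idea of the multiplexing side, here is a small C sketch of one scan step. It assumes four displays and a decoder whose address inputs select which common line is active; NUM_DIGITS, ScanOutput and scan_step are names I've made up for illustration, and actually writing the two values to pins is left out because that depends on your micro.

```c
#include <stdint.h>

#define NUM_DIGITS 4  /* assumption: four displays */

typedef struct {
    uint8_t select;   /* address to put on the decoder inputs */
    uint8_t segments; /* segment pattern for the selected display */
} ScanOutput;

/* Work out what to drive for a given tick of a periodic timer.
   Each tick lights one display; call it fast enough and they all
   appear to be on together. */
static ScanOutput scan_step(const uint8_t patterns[NUM_DIGITS], unsigned tick) {
    ScanOutput out;
    out.select = (uint8_t)(tick % NUM_DIGITS);
    out.segments = patterns[out.select];
    return out;
}
```

In practice you'd blank the segments briefly while changing the select lines, otherwise the previous digit can ghost onto the next display.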
You are right btw, you will need 8 I/Os (if you include the decimal point) to display anything, i.e. any combination of segments on/off. If you only need numbers and you're limited on I/Os, you might as well get a 7-segment decoder in logic. There are many things a micro can do to multiplex inputs, and Google (and this board) has many examples, so I won't go into details here.
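If you do drive all 8 segment lines yourself, the number-to-segments translation that a decoder chip would do in hardware is just a lookup table in software. A sketch in C, assuming the common bit order of bit 0 = segment a up to bit 6 = segment g, bit 7 = decimal point, and active-high segments (invert the byte if your display is wired the other way). digit_to_segments is a name I've made up.

```c
#include <stdint.h>

/* Segment patterns for 0-9: bit 0 = a, bit 1 = b, ... bit 6 = g. */
static const uint8_t SEGMENT_TABLE[10] = {
    0x3F, 0x06, 0x5B, 0x4F, 0x66,
    0x6D, 0x7D, 0x07, 0x7F, 0x6F
};

/* Return the 8-bit pattern for a digit, with optional decimal point.
   Anything outside 0-9 gives a blank display. */
static uint8_t digit_to_segments(unsigned digit, int decimal_point) {
    uint8_t pattern = (digit < 10) ? SEGMENT_TABLE[digit] : 0x00;
    if (decimal_point) {
        pattern |= 0x80; /* bit 7 = decimal point */
    }
    return pattern;
}
```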
Looking at the datasheet for that particular shift register, the two input pins are the inputs of a NAND gate, the output of which goes through an inverter to the 'set' part of a flip-flop, and non-inverted to the 'reset'. As you probably know, a NAND is simply an AND gate with an inverted output, and because that output is inverted AGAIN, the two inversions cancel. So, treating it as an AND gate, one input can be used as the data line and the other as an 'enable'. When enable is low, the NAND output is always 1, so the value after the inverter is always 0 regardless of what's on the data line, and when you clock the SR you'll just be clocking in 0s. When enable is high, the value clocked in is whatever is on the data input.
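The gating described above is small enough to write out in C if you want to convince yourself. nand_gate and serial_input are just illustrative names, and this only models the logic levels, not the flip-flop itself.

```c
/* NAND: output is 0 only when both inputs are 1. */
static int nand_gate(int a, int b) {
    return !(a && b);
}

/* The NAND followed by an inverter behaves as a plain AND, so this
   is the bit that gets clocked into the first stage. */
static int serial_input(int data, int enable) {
    return !nand_gate(data, enable);
}
```

With enable at 0 this returns 0 for either data level; with enable at 1 it returns whatever data is.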
That particular shift register isn't buffered, meaning the parallel outputs will change every time you clock in a data bit. A buffered SR, or one with a latch, would probably be better: the parallel outputs remain the same until you send the strobe (or 'LOAD') line high, which loads the contents of the shift register to the outputs.
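To show what "changes every time you clock in a bit" means in practice, here is a little C model that records what the outputs of an unbuffered register show after each clock. It's a simulation only, assuming MSB-first shifting; unbuffered_trace is a made-up name.

```c
#include <stdint.h>

/* Shift 'value' into a register that currently holds 'start' and
   record what the parallel outputs show after each of the 8 clocks.
   With no latch, every one of these states appears on the pins. */
static void unbuffered_trace(uint8_t start, uint8_t value, uint8_t trace[8]) {
    uint8_t reg = start;
    for (int i = 0; i < 8; i++) {
        int bit = (value >> (7 - i)) & 1;
        reg = (uint8_t)((reg << 1) | bit);
        trace[i] = reg;
    }
}
```

Going from 0x00 to 0xFF, for example, the pins step through 0x01, 0x03, 0x07 and so on before settling. If you're driving LEDs at a decent clock rate you may never notice, but anything edge-sensitive on those outputs will.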
It's pretty simple; that's why they have logic diagrams in datasheets, so you know exactly what's going on. Also, if you're using a microcontroller with an SPI peripheral, then SPI can easily be used to drive shift registers. And with additional logic, such as a 3-to-8 line decoder, you could have multiple 'CS' lines for strobing the data out. The possibilities are mind-boggling: I/O expansion up to 128 I/Os is easily possible with a minimum of coding; it just depends on what these I/Os are used for.
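On the coding side, one way to keep it minimal is to hold a copy of all the outputs in RAM and just send the bytes out whenever something changes. A sketch in C, assuming 16 eight-bit registers (16 x 8 = 128) and numbering the outputs 0 to 127; NUM_REGISTERS, output_image and set_output are names I've made up, and the actual SPI transfer and strobe are left out because they depend on your micro and on how the registers are wired.

```c
#include <stdint.h>

#define NUM_REGISTERS 16  /* assumption: 16 x 8 bits = 128 outputs */

/* RAM copy of every output; one byte per shift register. */
static uint8_t output_image[NUM_REGISTERS];

/* Set or clear one output in the RAM copy. Nothing reaches the
   pins until the bytes are sent to the registers and strobed. */
static void set_output(unsigned index, int level) {
    unsigned reg = index / 8;
    unsigned bit = index % 8;
    if (reg >= NUM_REGISTERS) {
        return; /* ignore out-of-range outputs */
    }
    if (level) {
        output_image[reg] |= (uint8_t)(1u << bit);
    } else {
        output_image[reg] &= (uint8_t)~(1u << bit);
    }
}
```

After a call like set_output(10, 1) you'd send the affected byte (or the whole array) out and pulse the relevant strobe.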
My two cents,
Blueteeth