No; each 74HC166 provides eight inputs and each 74HC595 provides eight outputs, all completely independent of each other.
It's up to you what you put in the SPI transmit buffer for each output byte, e.g. a copy of the RAM locations that you use as "outputs" within the program.
Likewise, with each SPI byte transfer, you store the received byte in a RAM location and use those locations as the "inputs" in your program.
You have a separate routine that does the SPI I/O transfers on a regular schedule, or at the end of each program loop, so the RAM copies are constantly synchronised with the real I/O signals.
That's pretty much how industrial PLCs work, often using some form of serial data chained through all the I/O expansion and copying to/from RAM areas before or after each program pass in the main loop.
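As a rough sketch of that RAM-image idea (not taken from any particular device or library), something like the following C could do the refresh. All the names here (output_image, input_image, io_refresh, latch_pulse, spi_transfer and the sim_ arrays) are made up for illustration, and the two hardware functions are replaced by a simple PC-side simulation so the code can be run anywhere. On a real MCU you would swap them for the latch pin pulse and the hardware SPI byte transfer. It assumes the latch pulse comes before the transfer, so outputs appear on the pins one refresh after they are written to RAM.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_SR 2  /* two 74HC595 and two 74HC166, as in the example */

/* RAM images: the rest of the program only reads/writes these. */
uint8_t output_image[NUM_SR];
uint8_t input_image[NUM_SR];

/* Host-side stand-ins for the hardware (hypothetical, simulation only). */
uint8_t sim_pins_in[NUM_SR];    /* levels on the 74HC166 input pins  */
uint8_t sim_pins_out[NUM_SR];   /* levels on the 74HC595 output pins */
static uint8_t sim_in_shift[NUM_SR], sim_out_shift[NUM_SR];

/* One pulse on the shared latch pin: the 595s present the bytes shifted
   in earlier, the 166s capture the current state of their input pins. */
static void latch_pulse(void)
{
    for (int i = 0; i < NUM_SR; i++) {
        sim_pins_out[i] = sim_out_shift[i];
        sim_in_shift[i] = sim_pins_in[i];
    }
}

/* Full-duplex byte transfer. Index 0 is the register nearest SPI Data
   out; index NUM_SR-1 is the one nearest SPI Data in. */
static uint8_t spi_transfer(uint8_t tx)
{
    uint8_t rx = sim_in_shift[NUM_SR - 1];
    for (int i = NUM_SR - 1; i > 0; i--) {
        sim_in_shift[i]  = sim_in_shift[i - 1];
        sim_out_shift[i] = sim_out_shift[i - 1];
    }
    sim_in_shift[0]  = 0;
    sim_out_shift[0] = tx;
    return rx;
}

/* Call on a timer or at the end of each main-loop pass. */
void io_refresh(void)
{
    latch_pulse();
    /* The first byte sent ends up in the farthest register, and the
       first byte received comes from the farthest one, so count down. */
    for (int j = NUM_SR - 1; j >= 0; j--)
        input_image[j] = spi_transfer(output_image[j]);
}

/* Small usage example against the simulated pins. */
void io_demo(void)
{
    sim_pins_in[0] = 0xA5;  sim_pins_in[1] = 0x3C;
    output_image[0] = 0x0F; output_image[1] = 0xF0;
    io_refresh();
    assert(input_image[0] == 0xA5 && input_image[1] == 0x3C);
    io_refresh();  /* second pass latches the bytes sent in the first */
    assert(sim_pins_out[0] == 0x0F && sim_pins_out[1] == 0xF0);
}
```

The main program never touches the SPI port directly; it just sets bits in output_image and tests bits in input_image, and io_refresh keeps them in step with the pins. If you would rather have outputs update in the same pass, pulsing the latch after the transfer instead is the other obvious ordering, at the cost of the inputs being one pass old.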
This is an example using two SRs each for inputs and outputs; you can use as many as you like, within reason.
It also uses the same pin to latch both the outputs and inputs between the SPI transfers, so only four pins are used.
Ignore the pin numbers, they will vary with the device being used - using an SPI port, they would be, top to bottom:
SPI Data out, SPI Clock, a general output pin for the latch pulse, and SPI Data in.
As others say, there are various dedicated I/O expanders which may be more convenient in some situations - but SPI on a device that has a hardware SPI port is the fastest and has lowest overhead, as it can run at megabit speeds and needs no handshaking or waiting for remote devices.
Edit - reading the original post again, do you actually want the same pin to act alternately as an output and an input?
As long as the output is a fixed polarity drive (e.g. "open collector" style), that system could work by linking the outputs to inputs with diodes and connecting the external signal and a pull-up resistor to the input pin.