When designing and connecting the different blocks that make up the CPU I find it convenient to make use of "elastic pipelines", i.e. where the interface consists of a valid signal from source to sink, and a ready signal from the sink back to the source. However, the WISHBONE protocol is not as simple, because the WISHBONE response signal wb_ack_i cannot be delayed.
Therefore, I find it convenient to introduce the MEMORY module, which acts as an "adapter" from the WISHBONE interface to the "elastic pipeline" interface. Furthermore, the MEMORY module stores the responses on two different output pipelines, for use by the CPU.
The MEMORY module exposes a source interface (connected to the EXECUTE module) as follows:
s_valid_i : in std_logic;
s_ready_o : out std_logic;
s_op_i : in std_logic_vector(2 downto 0);
s_addr_i : in std_logic_vector(15 downto 0);
s_data_i : in std_logic_vector(15 downto 0);
The s_op_i is a one-hot encoding of the requested operation:
- C_WRITE: Write data to addr.
- C_READ_DST: Read from addr and place result in dst.
- C_READ_SRC: Read from addr and place result in src.
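As a minimal sketch of this encoding, the package below declares the three bit positions and a helper that checks the one-hot property. Only the constant names and the 3-bit width come from the interface above; the package name memory_op_pkg, the function is_one_hot_op, and the concrete bit positions are assumptions made for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package memory_op_pkg is
   -- Bit positions within s_op_i. The names come from the text;
   -- the numeric values are ASSUMED for this sketch.
   constant C_WRITE    : natural := 0;
   constant C_READ_SRC : natural := 1;
   constant C_READ_DST : natural := 2;

   -- Hypothetical helper: true when exactly one of the three bits is '1'.
   function is_one_hot_op (op : std_logic_vector(2 downto 0)) return boolean;
end package memory_op_pkg;

package body memory_op_pkg is
   function is_one_hot_op (op : std_logic_vector(2 downto 0)) return boolean is
      variable count : natural := 0;
   begin
      -- Count the bits that are set; a valid operation has exactly one.
      for i in op'range loop
         if op(i) = '1' then
            count := count + 1;
         end if;
      end loop;
      return count = 1;
   end function is_one_hot_op;
end package body memory_op_pkg;
```

With these assumed positions, a write request would be presented as "001" and a read into the SRC pipeline as "010".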
The src and dst interfaces mentioned here are the two output pipelines:
msrc_valid_o : out std_logic;
msrc_ready_i : in std_logic;
msrc_data_o : out std_logic_vector(15 downto 0);
mdst_valid_o : out std_logic;
mdst_ready_i : in std_logic;
mdst_data_o : out std_logic_vector(15 downto 0);
The idea is that the EXECUTE block issues requests and reads back the results at a later time.
The main benefit of this module is that it stores the results read back from memory, in case the EXECUTE module is not yet ready to receive them.
We want the module to have low latency - ideally a total latency of one clock cycle - from s_valid_i to, say, msrc_valid_o. In fact, this is achieved in this implementation. We'll discuss the data path and the control path separately.
So first of all, the WISHBONE request interface is driven combinatorially, i.e. the address and data signals are simply connected directly:
wb_addr_o <= s_addr_i;
wb_we_o <= s_op_i(C_WRITE);
wb_dat_o <= s_data_i;
Secondly, the WISHBONE response wb_data_i is connected to two different instances of one_stage_buffer.
i_one_stage_buffer_src : entity work.one_stage_buffer
   port map (
      clk_i     => clk_i,
      rst_i     => rst_i,
      s_valid_i => osb_src_valid,
      s_ready_o => osb_src_ready,
      s_data_i  => wb_data_i,
      m_valid_o => msrc_valid_o,
      m_ready_i => msrc_ready_i,
      m_data_o  => msrc_data_o
   ); -- i_one_stage_buffer_src
i_one_stage_buffer_dst : entity work.one_stage_buffer
   port map (
      clk_i     => clk_i,
      rst_i     => rst_i,
      s_valid_i => osb_dst_valid,
      s_ready_o => osb_dst_ready,
      s_data_i  => wb_data_i,
      m_valid_o => mdst_valid_o,
      m_ready_i => mdst_ready_i,
      m_data_o  => mdst_data_o
   ); -- i_one_stage_buffer_dst
Notice how the wb_data_i signal connects directly to both buffers, and that the outputs of these buffers are connected directly to the outputs of this module.
We have to be careful though: When the WISHBONE response arrives we must make sure that the output buffers can accept the data, because the response exists only for a single clock cycle. We'll get back to this in the section about formal verification.
The control path is responsible for the WISHBONE request control signals, controlling the two output buffers, and the upstream ready signal.
Let's review the WISHBONE interface. When a request is made, wb_cyc_o must be held high until the acknowledge is received. So we need a flag to indicate whether we're waiting for an acknowledge. This is all achieved by the following:
wb_cyc_o <= ((s_valid_i and s_ready_o) or wait_for_ack) and not rst_i;
wb_stb_o <= wb_cyc_o and s_valid_i and s_ready_o;
p_wait_for_ack : process (clk_i)
begin
   if rising_edge(clk_i) then
      if wb_cyc_o and wb_ack_i then
         wait_for_ack <= '0';
      end if;
      if wb_cyc_o and wb_stb_o and not wb_stall_i then
         wait_for_ack <= '1';
      end if;
      if rst_i = '1' then
         wait_for_ack <= '0';
      end if;
   end if;
end process p_wait_for_ack;
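One detail of the process above is worth spelling out: the three if statements are evaluated in order and the last assignment wins, so a new request accepted in the same cycle as an acknowledge keeps wait_for_ack high. The sketch below restates the update as a pure function so the individual cases can be checked one at a time. The package and function names (wait_for_ack_pkg, next_wait_for_ack) are hypothetical and not part of the module.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package wait_for_ack_pkg is
   -- Hypothetical helper mirroring p_wait_for_ack: returns the value
   -- that wait_for_ack takes after the next rising clock edge.
   function next_wait_for_ack (
      wait_for_ack              : std_logic;
      cyc, stb, stall, ack, rst : std_logic
   ) return std_logic;
end package wait_for_ack_pkg;

package body wait_for_ack_pkg is
   function next_wait_for_ack (
      wait_for_ack              : std_logic;
      cyc, stb, stall, ack, rst : std_logic
   ) return std_logic is
      variable result : std_logic := wait_for_ack;  -- default: hold
   begin
      -- An acknowledge clears the flag ...
      if cyc = '1' and ack = '1' then
         result := '0';
      end if;
      -- ... but an accepted request in the same cycle sets it again.
      if cyc = '1' and stb = '1' and stall = '0' then
         result := '1';
      end if;
      -- Reset has the highest priority.
      if rst = '1' then
         result := '0';
      end if;
      return result;
   end function next_wait_for_ack;
end package body wait_for_ack_pkg;
```

For example, next_wait_for_ack('1', '1', '1', '0', '1', '0') describes an acknowledge arriving together with a new, non-stalled request, and the flag stays at '1'.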
Both read and write requests are always followed by a corresponding acknowledge signal. It's not possible to perform read and write simultaneously. However, when an acknowledge signal arrives there is no indication of which request it originated from. This means we must keep track of the requests sent, in particular the read requests.
Therefore we instantiate a one_stage_fifo as well, to keep track of this information. This fifo only keeps track of read requests and only contains a single bit to distinguish whether the result should be stored in the SRC or DST pipeline.
osf_mem_in_valid <= s_valid_i and s_ready_o and (s_op_i(C_READ_SRC) or s_op_i(C_READ_DST));
i_one_stage_fifo_mem : entity work.one_stage_fifo
   generic map (
      G_DATA_SIZE => 1
   )
   port map (
      clk_i       => clk_i,
      rst_i       => rst_i,
      s_valid_i   => osf_mem_in_valid,
      s_ready_o   => osf_mem_in_ready,
      s_data_i(0) => s_op_i(C_READ_SRC),
      m_valid_o   => osf_mem_out_valid,
      m_ready_i   => wb_cyc_o and wb_ack_i,
      m_data_o(0) => osf_mem_out_data
   ); -- i_one_stage_fifo_mem
The above shows that whenever a read request is accepted, the request is stored in the fifo. Furthermore, when an acknowledge is received from the WISHBONE, then we read out the request information from the fifo.
Using this information we can now control writing into the two output buffers:
osb_src_valid <= wb_cyc_o and wb_ack_i and osf_mem_out_valid and osf_mem_out_data;
osb_dst_valid <= wb_cyc_o and wb_ack_i and osf_mem_out_valid and not osf_mem_out_data;
The final part is to control the s_ready_o upstream signal. The main limitation is that each output buffer can hold only one value. So if the request is a read to SRC, and the output SRC buffer contains data that it can't deliver, then we must wait. Similarly for DST.
This leads to the following logic:
s_ready_o <= not (s_op_i(C_READ_SRC) and msrc_valid_o and not msrc_ready_i)
         and not (s_op_i(C_READ_DST) and mdst_valid_o and not mdst_ready_i);
Notice how the upstream ready signal depends on the contents of the incoming s_op_i signal. This is non-standard, but allows the module to accept a read DST even though the SRC output is still waiting.
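To make the dependency on s_op_i concrete, here is a minimal sketch that wraps the same expression in a pure function, so that individual input combinations can be evaluated by hand. The package name ready_logic_pkg, the function calc_s_ready, and the bit positions of the three operations are assumptions for illustration only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package ready_logic_pkg is
   -- ASSUMED bit positions within the one-hot operation vector.
   constant C_WRITE    : natural := 0;
   constant C_READ_SRC : natural := 1;
   constant C_READ_DST : natural := 2;

   -- Hypothetical helper computing the same expression as s_ready_o.
   function calc_s_ready (
      op                     : std_logic_vector(2 downto 0);
      msrc_valid, msrc_ready : std_logic;
      mdst_valid, mdst_ready : std_logic
   ) return std_logic;
end package ready_logic_pkg;

package body ready_logic_pkg is
   function calc_s_ready (
      op                     : std_logic_vector(2 downto 0);
      msrc_valid, msrc_ready : std_logic;
      mdst_valid, mdst_ready : std_logic
   ) return std_logic is
   begin
      -- Block only when the targeted output buffer holds data
      -- that the sink is not accepting in this cycle.
      return not (op(C_READ_SRC) and msrc_valid and not msrc_ready)
         and not (op(C_READ_DST) and mdst_valid and not mdst_ready);
   end function calc_s_ready;
end package body ready_logic_pkg;
```

With the SRC output stalled (msrc_valid = '1', msrc_ready = '0') and the DST output empty, a read to SRC ("010" under the assumed encoding) is held off, while a read to DST ("100") or a write ("001") is still accepted.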
We mentioned above that a requirement for this design to work is that the output buffers are always ready to accept a response from the WISHBONE. Therefore, we begin with the following two properties:
f_osb_src_overflow : assert always {osb_src_in_valid and not rst_i} |-> {osb_src_in_ready};
f_osb_dst_overflow : assert always {osb_dst_in_valid and not rst_i} |-> {osb_dst_in_ready};
We must also constrain the input operation to be a valid one-hot value:
f_exe_op : assume always {s_valid_i} |-> {s_op_i = "001" or s_op_i = "010" or s_op_i = "100"};
To demonstrate the correct functionality of the module, let's add a cover statement. This will verify that the module can output four values back-to-back, alternating between the SRC and DST interfaces. Additionally, we restrict the ready signals correspondingly:
f_cover_burst2 : cover {msrc_valid_o and msrc_ready_i and not mdst_ready_i;
                        mdst_valid_o and mdst_ready_i and not msrc_ready_i;
                        msrc_valid_o and msrc_ready_i and not mdst_ready_i;
                        mdst_valid_o and mdst_ready_i and not msrc_ready_i};
Number of cells:                217
  BUFG                            1
  FDRE                           37
  IBUF                           58
  LUT2                            4
  LUT3                           40
  LUT4                            1
  LUT5                            1
  LUT6                            4
  MUXF7                           1
  OBUF                           70
Estimated number of LCs:         46