Resource efficiency is always a goal of the AEMB2 design. To further reduce resource consumption and improve operating speed, some minor changes are being made in the next-generation EDK63 core architecture. The first block to change is the cache memory block.

Looking at the numbers for the EDK62 core, the tag memory block uses 3% (74/2444) of the total core resources. With an additional data cache planned for the EDK63 core, this share will increase. Furthermore, the global enable signal depends on the result of the tag memory hit, which is a slow asynchronous signal. Therefore, the cache block should be further optimised.

Memory Block
The first change is the use of synchronous memory blocks for both the data and tag memory. This requires an extra-wide memory block. Since the AEMB2 is targeted at FPGA implementations, the availability of wide memory blocks is limited. However, the memory blocks can easily be configured as dual-port memory.

Exploiting this, the AEMB2 will use both ports of the dual-port memory to form a single-port double-wide memory block. The two ports will address separate halves of the memory, selected by a different page bit (the MSB or LSB of the address) on each port. This makes a single-port memory block of at least 256×72 possible on a Spartan3A FPGA.
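
The port-pairing idea can be illustrated with a small behavioural model. This is only a sketch in Python, not the AEMB2 HDL: the class names, the 512×36 geometry of the underlying dual-port block, and the choice of the address MSB as the page bit are all assumptions made for illustration.

```python
MASK36 = (1 << 36) - 1


class DualPortRam:
    """Behavioural model of a dual-port block RAM (assumed 512x36 geometry)."""
    DEPTH = 512

    def __init__(self):
        self.cells = [0] * self.DEPTH

    def read(self, addr):
        return self.cells[addr]

    def write(self, addr, value):
        self.cells[addr] = value & MASK36


class DoubleWideRam:
    """Single-port 256x72 view built from both ports of one DualPortRam."""
    DEPTH = 256

    def __init__(self):
        self.ram = DualPortRam()

    def _port_addrs(self, addr):
        if not 0 <= addr < self.DEPTH:
            raise IndexError(addr)
        # Assumption: the page bit is the MSB of the 9-bit address.
        # Port A sees page 0, port B sees page 1.
        return addr, (1 << 8) | addr

    def write(self, addr, value):
        a, b = self._port_addrs(addr)
        self.ram.write(a, value & MASK36)          # low 36 bits via port A
        self.ram.write(b, (value >> 36) & MASK36)  # high 36 bits via port B

    def read(self, addr):
        a, b = self._port_addrs(addr)
        return (self.ram.read(b) << 36) | self.ram.read(a)
```

In this model each 72-bit access touches one location per port, so both halves could be read or written in the same cycle in hardware.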

As a result, LUTs are no longer needed as tag memory. The 72-bit wide memory can also store more state information.

Pipeline Cache
LUTs were used as tag memory to provide a cache hint that is used for the global enable signal. Once they are removed, no cache hint will be available, and the cache hit/miss can only be determined with a 1-cycle latency. Therefore, the cache will need to be integrated directly into the processor pipeline, moving the hit/miss check from the FETCH stage to the DECODE stage.
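
The 1-cycle latency can be sketched with a cycle-level model. The Python below is a hypothetical illustration, not the actual core: the `SyncCache` class, its method names, and the simple index/tag split (address modulo and divided by the entry count) are assumptions. It models a memory with a registered output, so an address presented in one cycle (FETCH) only yields its tag and data for comparison in the following cycle (DECODE).

```python
class SyncCache:
    """Direct-mapped cache on a synchronous (registered-output) memory."""

    def __init__(self, entries=256):
        self.entries = entries
        # Each line holds (valid, tag, data) in one wide memory word.
        self.lines = [(False, 0, 0)] * entries
        self.out = (False, 0, 0)  # registered memory output
        self.pending = None       # address presented in the previous cycle

    def fill(self, addr, data):
        """Load a line directly (stands in for a refill from main memory)."""
        self.lines[addr % self.entries] = (True, addr // self.entries, data)

    def clock(self, fetch_addr):
        """Advance one cycle.

        Returns (addr, hit, data) for the address presented on the
        previous cycle, or None on the very first cycle.
        """
        result = None
        if self.pending is not None:
            # DECODE: compare the tag that the memory registered last cycle.
            valid, tag, data = self.out
            hit = valid and tag == self.pending // self.entries
            result = (self.pending, hit, data if hit else None)
        # FETCH: the memory registers the line for the new address.
        self.out = self.lines[fetch_addr % self.entries]
        self.pending = fetch_addr
        return result
```

For example, after `fill(0x10, 0xAA)`, the first `clock(0x10)` returns `None`, and the hit for `0x10` only appears in the result of the next `clock` call.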

The cache block will also be configured as a direct-mapped cache in order to avoid the complexities of multi-word cache line replacements. Support for these can be added in future versions.

The only disadvantage of the new cache architecture is a reduction in effective cache size for the same sized memory block. In the EDK62 core, a 2kb cache is configured for 512 cache entries, while the EDK63 core will only configure it for 256 cache entries. However, the cache hit-rate is not expected to be significantly affected, because the cache is extremely small to begin with.
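
The halving of entries follows from the block geometry. The short Python calculation below is a sketch under two assumptions that are not stated in the text: that the memory block holds 18 Kbit including parity bits, and that each cache entry carries one 32-bit data word.

```python
BLOCK_BITS = 18 * 1024  # assumed: one 18 Kbit block RAM, parity bits included
DATA_BITS = 32          # assumed: one 32-bit data word per cache entry


def entries_for(width_bits):
    """Number of addressable words when the block is configured at this width."""
    return BLOCK_BITS // width_bits


edk62_entries = entries_for(36)  # narrow configuration
edk63_entries = entries_for(72)  # double-wide configuration

# Bits left in each entry after the data word (available for tag and state).
edk63_spare_bits = 72 - DATA_BITS

# Data capacity in bytes for each configuration.
edk62_data_bytes = edk62_entries * DATA_BITS // 8
edk63_data_bytes = edk63_entries * DATA_BITS // 8

print(edk62_entries, edk63_entries, edk63_spare_bits)
print(edk62_data_bytes, edk63_data_bytes)
```

Under these assumptions the same block stores half as many data words, with the rest of each 72-bit entry left over for tag and state bits.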