Router buffer design and management strongly influence the energy, area, and performance of on-chip networks, so all of these aspects must be considered together in the design process. At the same time, network-on-chip (NoC) design must prevent network-level and protocol-level deadlocks, which requires devoting dedicated buffer resources to that purpose. In Chip Multiprocessor Systems (CMPs), the coherence protocol usually requires different virtual networks (VNETs) to avoid deadlocks. VNET utilization is highly unbalanced, yet buffers are conventionally not shared between VNETs because different traffic types must be kept isolated. This paper proposes CUTBUF, a novel NoC router architecture that dynamically assigns virtual channels (VCs) to VNETs according to the actual VNET load. This significantly reduces the number of physical buffers in routers, saving area and power without decreasing NoC performance.
CUTBUF also allows the same buffer to be reused for different traffic types while ensuring that the optimized NoC is deadlock-free at both the network and protocol levels. In this perspective, all VCs are treated as spare queues that are not statically assigned to a specific VNET, and the coherence protocol only imposes a minimum number of queues to be implemented. Synthetic applications as well as real benchmarks have been used to validate CUTBUF on architectures ranging from 16 to 48 cores. In addition, a complete RTL router has been designed to evaluate area and power overheads. Results show that CUTBUF can reduce router buffers by up to 33% with a 2% performance degradation and a 5% decrease in operating frequency, yielding area and power savings of up to 30.6% and 30.7%, respectively. Conversely, when the same number of buffers is used, the flexibility of the proposed architecture improves performance by 23.8% over the baseline NoC router.
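
To make the idea of VCs as spare queues concrete, the following is a minimal sketch of a shared VC pool for one router input port. It is not the CUTBUF allocation logic, which is not detailed here. The class `VcPool`, its methods, and the reservation rule (a grant is refused if it would leave too few free VCs for the VNETs that currently hold none) are illustrative assumptions. The rule only mirrors the stated constraint that the coherence protocol imposes a minimum number of queues.

```python
class VcPool:
    """Hypothetical pool of virtual channels (VCs) shared by all VNETs.

    Assumption for illustration: every VNET must always be able to
    obtain at least one VC, so the pool needs num_vcs >= num_vnets.
    """

    def __init__(self, num_vcs, num_vnets):
        if num_vcs < num_vnets:
            raise ValueError("need at least one VC per VNET")
        self.num_vnets = num_vnets
        # owner[i] is the VNET currently using VC i, or None if VC i is free.
        self.owner = [None] * num_vcs

    def _held(self, vnet):
        """Number of VCs currently assigned to the given VNET."""
        return sum(1 for o in self.owner if o == vnet)

    def allocate(self, vnet):
        """Grant a free VC to vnet, or return None if the grant is refused."""
        free = [i for i, o in enumerate(self.owner) if o is None]
        if not free:
            return None
        # Other VNETs that hold no VC must still find one after this grant.
        unserved = sum(
            1 for v in range(self.num_vnets)
            if v != vnet and self._held(v) == 0
        )
        if len(free) - 1 < unserved:
            return None
        vc = free[0]
        self.owner[vc] = vnet
        return vc

    def release(self, vc):
        """Return a VC to the pool so that any VNET may reuse it."""
        self.owner[vc] = None


pool = VcPool(num_vcs=4, num_vnets=3)
grants = [pool.allocate(0), pool.allocate(0), pool.allocate(0)]
```

In this sketch, with four VCs and three VNETs, a heavily loaded VNET 0 obtains two VCs and its third request is refused, because the two remaining VCs are the minimum needed for VNETs 1 and 2. Once a VC is released it can be granted to any VNET, which corresponds to reusing the same buffer for different traffic types.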