FP-NUCA: A Fast NOC Layer for Implementing Large NUCA Caches
NUCA caches have traditionally been proposed as a solution for mitigating wire delays and the delays introduced by complex networks-on-chip. Traditional approaches have reported significant performance gains with intelligent block placement, location, replication, and migration schemes. In this paper, we propose a novel approach in this space, called FP-NUCA. It differs from conventional […]
Spatial Locality Aware Disk Scheduling in Virtualized Environment
Exploiting spatial locality, a key technique for improving disk I/O utilization and performance, faces additional challenges in the virtualized cloud because of the transparency feature of virtualization. This paper contributes a novel disk I/O scheduling framework, named Pregather, to improve disk I/O efficiency through exposure and exploitation of the special spatial locality in the virtualized […]
Goodput-Aware Load Distribution for Real-Time Traffic over Multipath Networks
Load distribution is a key research issue in deploying the limited network resources available to support traffic transmissions. Developing an effective solution is critical for enhancing traffic performance and network utilization. In this paper, we investigate the problem of load distribution for real-time traffic over multipath networks. Due to the path diversity and unreliability in […]
GPU Acceleration for Simulating Massively Parallel Many-Core Platforms
Emerging massively parallel architectures such as a general-purpose processor plus many-core programmable accelerators are creating an increasing demand for novel methods to perform their architectural simulation. Most state-of-the-art simulation technologies are exceedingly slow and the need to model full system many-core architectures adds further to the complexity issues. This paper presents a fast, scalable and parallel simulator, which uses a novel […]
FreeRider: Non-Local Adaptive Network-on-Chip Routing with Packet-Carried Propagation of Congestion Information
Non-local adaptive routing techniques, which use the status of both local and distant links to make routing decisions, have recently been shown to be effective solutions for improving the performance of Network-on-Chip (NoC). The essence of non-local adaptive routing was an additional network dedicated to propagating congestion information of distant links on the NoC. While the […]
A Novel Method for Scaling Iterative Solvers: Avoiding Latency Overhead of Parallel Sparse-Matrix Vector Multiplies
In parallel linear iterative solvers, sparse matrix vector multiplication (SpMxV) incurs irregular point-to-point (P2P) communications, whereas inner product computations incur regular collective communications. These P2P communications cause an additional synchronization point with relatively high message latency costs due to small message sizes. In these solvers, each SpMxV is usually followed by an inner product computation that involves […]
A PTAS Mechanism for Provisioning and Allocation of Heterogeneous Cloud Resources
Cloud providers provision their heterogeneous resources such as CPUs, memory, and storage in the form of virtual machine (VM) instances which are then allocated to the users. One of the major challenges faced by the cloud providers is to allocate and provision these resources such that their profit is maximized, and the resources are utilized […]
A Differentiated Quality Adaptation Approach for Scalable Streaming Services
Providing scalable video streaming services for heterogeneous users in dynamic networked environments requires efficient and adaptive quality management mechanisms which deliver quality-customized services according to the client’s preferences and adapt the services to cope with various network conditions. In this paper, we address the issue of quality adaptation for providing personalized scalable media streaming services […]
Social-Aware Replication in Geo-Diverse Online Systems
Distributing long-tail content is a difficult task due to the low amortization of bandwidth transfer costs, as such content has a limited number of views. Two recent trends are making this problem harder. First, the increasing popularity of user-generated content and online social networks creates and reinforces such popularity distributions. Second, the recent trend of geo-replicating content […]
Efficient and Cost-Effective Hybrid Congestion Control for HPC Interconnection Networks
Interconnection networks are key components in high-performance computing (HPC) systems, and their performance strongly influences that of the overall system. However, at high load, congestion and its negative effects (e.g., head-of-line blocking) threaten the performance of the network, and thus that of the entire system. Congestion control (CC) is crucial to ensure an efficient utilization of the […]









