Ayse K. Coskun

Automated Analytics for Improving Efficiency, Safety, and Security of HPC Systems

Performance variations are becoming more prominent with new generations of large-scale High Performance Computing (HPC) systems. Understanding these variations and developing resilience to anomalous performance behavior are critical challenges for reaching extreme-scale computing. To help address these emerging performance variation challenges, there is increasing interest in designing data analytics methods to make sense out of the […]

AI-based Scalable Analytics for Improving Performance, Resilience, and Security of HPC Systems

Next-generation large-scale High Performance Computing (HPC) systems face important cost and scalability challenges due to anomalous system and application behavior, which results in wasted compute cycles and the ever-growing difficulty of system management. There is increasing interest in the HPC community in using AI-based frameworks to tackle analytics and management problems in HPC so […]

SHF: Small: Collaborative Research: Managing Thermal Integrity in Monolithic 3D Integrated Systems

Monolithic three-dimensional (M3D) technology is an emerging integrated circuit (IC) paradigm to maintain the exponential growth in transistor density in future computing platforms. At a time when the fundamental limits of conventional scaling are fast approaching, M3D ICs have the potential to enhance energy efficiency at lower cost. A fundamental challenge in building complex M3D […]

CSR: Small: A Just-in-Time, Cross-Layer Instrumentation Framework for Diagnosing Performance Problems in Distributed Applications

Distributed applications running in data centers are critical to society (e.g., for shopping, banking). Engineers must diagnose and fix problems observed in data centers quickly; however, doing so is extremely challenging. A major hurdle is that engineers must spend significant time and effort exploring what instrumentation (e.g., log messages about specific application behaviors) is needed […]

CI-New: Collaborative Research: Modeling the Next-Generation Hybrid Cooling Systems for High-Performance Processors

The design of future high-performance chips is hindered by severe temperature challenges. For example, existing cooling mechanisms cannot efficiently cool the extremely high power densities that are expected in exascale systems. Emerging cooling technologies, which may address these temperature challenges, are not easily accessible to computer engineers for experimentation. In fact, there is a substantial lag […]

CAREER: 3D Stacked Systems for Energy-Efficient Computing: Innovative Strategies in Modeling and Runtime Management

Energy efficiency is a central issue in computing. In large-scale computing clusters, operational and cooling costs impose significant sustainability challenges. Embedded systems run increasingly complex, performance-demanding workloads, making well-known energy management policies inadequate. High power densities also cause high on-chip temperatures and large thermal variations, both of which degrade system reliability. The research […]