Secure Architecture Design
Sphinx: Hardware-Software Co-design Framework for Binary Code Diversification Based Secure Execution
In the Sphinx project, we are developing a hardware-software attack-resistant computing system in which the privacy and integrity of application executions are maintained. The design consists of two parts: a software obfuscation module and a dedicated hardware execution engine. At compile time, obfuscation instructions are inserted into the assembly code to produce a new program. The technique allows multiple versions of a program to be produced, providing moving-target security capabilities. For each version, an encrypted obfuscation mask is produced to distinguish real instructions from obfuscation instructions. A copy of the obfuscated executable and its associated mask file is securely distributed to certified users. The dedicated secure hardware execution engine performs just-in-time decryption of the mask file and executes the obfuscated program.
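The mask mechanism can be sketched in software as follows. This is a toy illustration only, with hypothetical names and XOR standing in for real encryption: an obfuscated program interleaves real and decoy instructions, and the decrypted mask tells the trusted engine which ones to execute.

```python
def encrypt_mask(mask_bits, key):
    """XOR each mask bit with the key (a stand-in for real cryptography)."""
    return [b ^ key for b in mask_bits]

def decrypt_mask(enc_bits, key):
    return [b ^ key for b in enc_bits]

def execute(obfuscated_program, enc_mask, key, state):
    """Just-in-time mask decryption: only real instructions take effect."""
    mask = decrypt_mask(enc_mask, key)
    for instr, is_real in zip(obfuscated_program, mask):
        if is_real:
            instr(state)  # decoy instructions are skipped
    return state

# Two versions of the same program would differ in decoy placement,
# giving the moving-target property described above.
program = [
    lambda s: s.__setitem__("x", 1),            # real
    lambda s: s.__setitem__("x", 999),          # decoy
    lambda s: s.__setitem__("x", s["x"] + 2),   # real
]
mask = [1, 0, 1]
key = 1
state = execute(program, encrypt_mask(mask, key), key, {})
print(state["x"])  # 3
```

Without the correct key, the engine cannot tell real from decoy instructions, which is the property the encrypted mask file provides.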
Hermes: Secure Heterogeneous Multicore Architecture
With the emergence of general-purpose system-on-chip (SoC) architectures in an array of application domains, key security challenges arise. In these systems, tenants, i.e., intellectual property (IP) cores or processing units, may come from different providers, and executable code may have varying levels of trust. It is therefore important to support multi-level user-defined security protocols that can isolate hardware subsystems and code while enabling optimal sharing of computing resources and data among the tenants. In this work, we are developing security mechanisms for integrating multiple tenants, ranging from secure to non-secure cores, into the same chip design, maintaining their individual security and preventing data leakage and corruption while promoting collaboration among the tenants.
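One simple instance of such a multi-level policy can be sketched as a Bell-LaPadula-style lattice check. The level names and rules below are hypothetical illustrations, not the Hermes design itself:

```python
# Hypothetical tenant security levels; a real design would let users
# define these per deployment.
LEVELS = {"non-secure": 0, "secure": 1, "top-secure": 2}

def may_read(tenant_level, data_level):
    """No read-up: a tenant may only read data at or below its own level."""
    return LEVELS[tenant_level] >= LEVELS[data_level]

def may_write(tenant_level, data_level):
    """No write-down: writes must not leak data to lower levels."""
    return LEVELS[tenant_level] <= LEVELS[data_level]

print(may_read("secure", "non-secure"), may_write("secure", "non-secure"))
# True False
```

Checks like these would be enforced in hardware at the boundaries between tenant subsystems, so that sharing is possible without leakage across trust levels.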
Adaptive and Resilient Architecture Design
Helios Project: Adaptive-Approximate Computing Architecture
The Helios project is investigating approaches for designing computer systems that dynamically adapt and optimize their execution behavior according to a set of high-level application goals. The project will explore computer architectures that can automatically detect volume, variety, velocity, or veracity variations in the application input streams and reconfigure themselves to meet user-directed performance-to-power ratios or real-time constraints. The goal of the project is to develop an adaptive-approximate computing system in which accuracy can be traded off for compute time to withstand real-time changes in the input streams, or precision can be traded for power in the presence of power availability variations.
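The accuracy-for-time tradeoff can be illustrated with a small software sketch. The kernel, cost model, and parameter names below are hypothetical: an iterative approximation (a truncated Taylor series for e^x) whose iteration count is picked from a per-item time budget derived from the input rate.

```python
import math

def approx_exp(x, terms):
    """Truncated Taylor series: more terms -> more accuracy, more work."""
    return sum(x**k / math.factorial(k) for k in range(terms))

def pick_terms(items_per_sec, max_terms=20, cost_per_term_us=5.0):
    """Fit the per-item budget (1e6 / rate microseconds) to the term cost."""
    budget_us = 1e6 / items_per_sec
    return max(1, min(max_terms, int(budget_us / cost_per_term_us)))

# Low input velocity: generous budget, accurate result.
slow = approx_exp(1.0, pick_terms(items_per_sec=1_000))
# Bursty input: tight budget, a coarser approximation is accepted.
fast = approx_exp(1.0, pick_terms(items_per_sec=100_000))

print(abs(slow - math.e) < abs(fast - math.e))  # True: coarser mode loses accuracy
```

An adaptive-approximate architecture would make this decision in hardware, monitoring the input stream and reconfiguring datapath precision or iteration depth on the fly.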
Coeus Project: Reconfigurable Architecture Design for Deep Learning Acceleration
Deep learning based algorithms currently provide the best solution to many computing problems, from image recognition to health data analysis. Creating a domain-specific architecture template targeted mainly at deep learning based applications will lead to more effective execution, since their specific performance, communication, and programmability requirements can be better addressed. Four important research questions that we are addressing in this project are: (1) what is the appropriate granularity of the reconfigurable processing elements (from fine-grained architectures at the granularity of field-programmable gate arrays to complex processor cores), (2) what degree of homogeneity or heterogeneity should the microarchitecture of the processing elements exhibit, (3) what are effective memory organizations and technologies (from data placement to data movement), and (4) how to implement network intelligence to support and adapt to the communication and memory requirements between layers.
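The granularity question above can be illustrated with a toy model of one common design point: an output-stationary array of processing elements (PEs), each owning one output of a matrix multiply, the dominant kernel in deep learning. The structure below is a generic illustration, not the Coeus architecture:

```python
def pe_array_matmul(A, B):
    """Each (i, j) PE accumulates one output element of A @ B locally."""
    rows, inner, cols = len(A), len(A[0]), len(B[0])
    # One accumulator per PE in a rows x cols grid (output-stationary).
    acc = [[0] * cols for _ in range(rows)]
    for k in range(inner):                       # operands streamed step by step
        for i in range(rows):
            for j in range(cols):
                acc[i][j] += A[i][k] * B[k][j]   # local multiply-accumulate
    return acc

C = pe_array_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(C)  # [[19, 22], [43, 50]]
```

Coarser PEs would each own a tile of outputs rather than a single element; the granularity choice trades reconfiguration flexibility against per-PE control overhead.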
High-Performance Graph Processing Architecture Design
As data collection capabilities improve, both the amount of data available for analysis and the complexity of algorithms rapidly increase. In many applications, ranging from target identification and social network analysis to anomaly detection, the data of interest can be represented as a graph. A graph G = (V,E) is a pair of sets: a set of vertices, V, representing graph nodes, and a set of edges, E, representing relationships between the nodes. Graph-based algorithms and applications are primarily relation- and event-driven. They exhibit unique computational characteristics that often fall beyond the capabilities of current CPUs or GPUs. Our laboratory is investigating a new graph processing architecture that uses self-timed circuits to leverage the event-driven nature of graph-based applications.
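The event-driven style such an architecture targets can be sketched in software (a software analogue only; names are illustrative): vertices do work only when an incoming event arrives, instead of being polled every cycle as on a conventional CPU or GPU. Here, shortest-path distances are relaxed by propagating "distance updated" events:

```python
from collections import deque

def event_driven_sssp(vertices, edges, source):
    """Single-source shortest paths driven by distance-update events."""
    adj = {v: [] for v in vertices}
    for u, v, w in edges:
        adj[u].append((v, w))
    dist = {v: float("inf") for v in vertices}
    dist[source] = 0
    events = deque([source])           # pending vertex activations
    while events:
        u = events.popleft()
        for v, w in adj[u]:
            if dist[u] + w < dist[v]:  # an event fires only on improvement
                dist[v] = dist[u] + w
                events.append(v)
    return dist

dist = event_driven_sssp(
    vertices=["a", "b", "c", "d"],
    edges=[("a", "b", 1), ("a", "c", 4), ("b", "c", 2), ("c", "d", 1)],
    source="a",
)
print(dist["d"])  # 4
```

In a self-timed circuit, the event queue corresponds to handshaking between vertex-processing units: hardware is active exactly when events exist, which is what makes the event-driven model a good fit for asynchronous logic.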
Resilient and Fault-Tolerant Interconnect Network-on-Chip Design
On-chip network (OCN) design has become increasingly challenging due to the high levels of integration and complexity of modern systems-on-chip (SoCs). As feature size shrinks, transistors become less reliable and component failures increase. Transistor scaling and integration result in reliability challenges, including interference from electric fields, shrinking of the maximum-minimum voltage window, thermo-mechanical limitations, and soft, transient, and intermittent errors. Therefore, besides high-throughput and effective load-balancing routing algorithms, fault-aware design techniques are also required. Our research efforts focus on modeling and evaluating bandwidth-adaptive, fault-aware, and self-reconfigurable on-chip network designs.
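A minimal sketch of the self-reconfiguration idea, on a hypothetical 2D-mesh topology: after a link failure is detected, routes are recomputed over the surviving links instead of following fixed dimension-ordered paths.

```python
from collections import deque

def mesh_links(width, height):
    """Bidirectional links of a width x height 2D mesh."""
    links = set()
    for x in range(width):
        for y in range(height):
            if x + 1 < width:
                links |= {((x, y), (x + 1, y)), ((x + 1, y), (x, y))}
            if y + 1 < height:
                links |= {((x, y), (x, y + 1)), ((x, y + 1), (x, y))}
    return links

def route(src, dst, links):
    """BFS shortest path using only healthy links."""
    frontier, parent = deque([src]), {src: None}
    while frontier:
        u = frontier.popleft()
        if u == dst:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in [(u[0]+1, u[1]), (u[0]-1, u[1]), (u[0], u[1]+1), (u[0], u[1]-1)]:
            if (u, v) in links and v not in parent:
                parent[v] = u
                frontier.append(v)
    return None  # destination unreachable

links = mesh_links(3, 3)
baseline = route((0, 0), (2, 0), links)
links -= {((1, 0), (2, 0)), ((2, 0), (1, 0))}   # a link failure is detected
detour = route((0, 0), (2, 0), links)
print(len(baseline) - 1, len(detour) - 1)  # 2 4
```

A hardware router would of course not run BFS per packet; the point is the reconfiguration step: routing tables are recomputed from the current fault map so traffic detours around failed links.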
Architecture Design and Exploration Tools
The ASCS Lab is taking over the development and maintenance of the Heracles tool, previously housed at the MIT CSAIL Laboratory. Heracles presents designers with a complete, cycle-by-cycle view of the inner workings of a multiprocessor machine, from instruction fetches at the core in each node to flit arbitration at the routers, with RTL-level correctness. A flit is the smallest unit of information recognized by the flow-control method. This enables the designer to explore different implementation approaches, including core microarchitecture, cache levels and sizes, routing algorithm, router microarchitecture, distributed or shared memory, and network interface, and to quickly evaluate their impact on overall system performance.
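The flit concept mentioned above can be sketched as follows. The field layout is hypothetical, not Heracles' actual packet format: a packet is split into head, body, and tail flits, the unit on which router flow control and arbitration operate, with the head flit carrying the routing information.

```python
def packetize(dest, payload_words, flit_payload=2):
    """Split a packet into head/body/tail flits (toy format)."""
    flits = [{"type": "head", "dest": dest}]       # head flit carries the route
    chunks = [payload_words[i:i + flit_payload]
              for i in range(0, len(payload_words), flit_payload)]
    for chunk in chunks:
        flits.append({"type": "body", "data": chunk})
    flits[-1]["type"] = "tail"                     # last flit closes the packet
    return flits

flits = packetize(dest=(2, 1), payload_words=[10, 20, 30, 40, 50])
print([f["type"] for f in flits])  # ['head', 'body', 'body', 'tail']
```

Because routers arbitrate per flit rather than per packet, buffer sizes and link widths can stay small while packets of arbitrary length flow through the network.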