Research Experiences for Undergraduates


Exercise - Explore Emerging Computing in Science and Engineering

Sample Research Projects:

Background materials and problem sets for the research projects will be posted in the spring pre-program online workshop, and the projects will be finalized in the first week of the summer REU program. The projects are designed to put undergraduate participants at the forefront of new innovations while keeping the work manageable for their level of expertise. Some sample research projects are listed below.

Graph mining for large-scale networks using MapReduce
Faculty Mentor: Dr. Enyue (Annie) Lu

Analyzing patterns in large-scale graphs such as social and cyber networks (e.g., Facebook, LinkedIn, Twitter), with millions or even billions of edges, has many important applications, including community detection, blog analysis, intrusion and spam detection, and more. It is currently impossible to process real-world large-scale networks with millions or billions of objects on a single processor. To overcome single-processor limitations, a cluster of computers with multiple processing elements operating in parallel, connected by a distributed network, is used to solve large problems and reduce processing time.

In this project, students will enumerate and identify important graph patterns. The network is modeled as a graph: each person is represented as a vertex, and a mutual friendship between two people is represented as an edge. Finding a pattern in a real-world network is then equivalent to finding a subgraph in a large-scale graph. We will map graph-decomposition operations onto a series of MapReduce processes, implement the proposed MapReduce algorithms on Amazon Elastic MapReduce, and compare and analyze the performance of the algorithms and simulation results.
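
To make the pattern-as-subgraph idea concrete, here is a minimal single-machine sketch, in Python rather than actual Amazon Elastic MapReduce code, of counting triangles (one of the simplest graph patterns) as two MapReduce rounds. The toy graph, helper names, and two-round decomposition are illustrative assumptions, not the project's algorithms.

```python
from collections import defaultdict
from itertools import combinations

def map_reduce(records, mapper, reducer):
    """Single-machine stand-in for one MapReduce round: run the mapper,
    shuffle (group values by key), then run the reducer per key."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}

# A toy friendship graph: vertices are people, edges are mutual friendships.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
edge_set = {frozenset(e) for e in edges}

# Round 1: build each vertex's neighbor list.
def adjacency_mapper(edge):
    u, v = edge
    yield u, v
    yield v, u

adjacency = map_reduce(edges, adjacency_mapper,
                       lambda v, nbrs: sorted(set(nbrs)))

# Round 2: each vertex emits "wedges" (pairs of its neighbors); a wedge
# closed by an actual edge is a triangle, counted once per corner vertex.
def wedge_mapper(item):
    center, nbrs = item
    for a, b in combinations(nbrs, 2):
        yield frozenset((a, b)), center

closed = map_reduce(adjacency.items(), wedge_mapper,
                    lambda pair, centers: len(centers) if pair in edge_set else 0)

triangles = sum(closed.values()) // 3   # each triangle has three corners
print(triangles)  # the toy graph contains one triangle: A-B-C
```

On a real cluster, each round's mapper and reducer would run distributed across many machines, with the framework performing the shuffle between them.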

Implementation of parallel iterative improvement stable matching algorithms
Faculty Mentor: Dr. Enyue (Annie) Lu

In a graph, a set of independent edges (no two edges in the set are adjacent) is called a matching. Matching algorithms are widely used in many applications, including database search, image processing, pattern analysis, and scheduling. The stable matching problem was first introduced by Gale and Shapley in 1962. Given n men, n women, and 2n ranking lists in which each person ranks all members of the opposite sex in order of preference, a stable matching is a set of n man-woman pairs, with each man and each woman in exactly one pair, such that no man and woman who are not matched to each other both prefer each other to their current partners. Gale and Shapley showed that every instance of the stable matching problem admits at least one stable matching, which can be computed in O(n²) iterations. For real-time applications such as switch scheduling, however, the algorithm proposed by Gale and Shapley is not fast enough.
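
The Gale-Shapley deferred-acceptance procedure mentioned above fits in a few lines; this Python version is a textbook illustration (the example instance and variable names are ours, not project code):

```python
def gale_shapley(men_prefs, women_prefs):
    """Deferred-acceptance algorithm of Gale and Shapley (1962).
    men_prefs / women_prefs: dict mapping each person to a preference
    list over the opposite side. Returns a man -> woman stable matching."""
    # rank[w][m] = position of m in w's list (lower = more preferred)
    rank = {w: {m: i for i, m in enumerate(prefs)}
            for w, prefs in women_prefs.items()}
    free = list(men_prefs)              # men without a partner
    next_proposal = {m: 0 for m in men_prefs}
    engaged_to = {}                     # woman -> man
    while free:
        m = free.pop()
        w = men_prefs[m][next_proposal[m]]
        next_proposal[m] += 1
        if w not in engaged_to:
            engaged_to[w] = m           # w was free: provisionally engage
        elif rank[w][m] < rank[w][engaged_to[w]]:
            free.append(engaged_to[w])  # w trades up; old partner is free
            engaged_to[w] = m
        else:
            free.append(m)              # w rejects m
    return {m: w for w, m in engaged_to.items()}

men_prefs = {"adam": ["beth", "dana"], "carl": ["beth", "dana"]}
women_prefs = {"beth": ["carl", "adam"], "dana": ["adam", "carl"]}
print(gale_shapley(men_prefs, women_prefs))
```

Each man proposes down his list and each woman holds the best offer so far; in the worst case this takes O(n²) proposals, which is exactly the bound the project seeks to beat with parallel methods.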

To date, the best-known parallel algorithms for the stable matching problem all run on theoretical parallel computing models such as the CRCW PRAM. In this project, students will implement a parallel iterative improvement (PII) stable matching algorithm with linear average-case performance. For real-time applications with a hard time constraint, the PII algorithm can terminate at any time during its execution, and the matching with the minimum number of unstable matching pairs can be used as an approximation to a stable matching. The PII algorithm will be implemented using MPICH2 from Argonne National Laboratory, NVIDIA CUDA-enabled GPUs, and MapReduce computing on Amazon EC2. The goal of the project is to examine the parallelism of stable matching by using the practical PII algorithm. We will implement the PII algorithm in software, measure its speedup against sequential approaches, determine its efficiency and applicability, and investigate the parallelism limits of stable matching algorithms.

Anomaly Detection for Network Data Using MapReduce
Faculty Mentor: Dr. Enyue (Annie) Lu

As the volume of collected data grows at an unprecedented rate with recent technology advances, information retrieval on very large data sets has become a valuable yet challenging task. Anomaly detection, which identifies abnormal events or patterns that do not conform to expected behavior, is a useful methodology for predictive model checking of available data. However, detecting intrusions from network records has become increasingly complex due to the large volumes of network traffic data.

In this project, we will develop a new framework that combines graph modeling with MapReduce computing techniques to tackle anomaly detection on network record data at extreme scales. A graph is an expressive data structure that has been widely used to model complex data in many applications. In graph modeling, data items are represented as vertices and the relationships among them (e.g., similarity in spatial, temporal, or semantic attributes) are represented as edges. We will detect anomalies by analyzing the graph generated from the network data. We plan to tackle the problem in three research steps. First, we will analyze the characteristics of the data and develop an effective graph model for it. Second, we will develop efficient MapReduce graph-based anomaly detection algorithms and analyze their performance. Last, we will test the proposed algorithms and verify their performance using real-world network data.
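
As a flavor of graph-based anomaly detection, this small Python sketch models network records as a graph and flags vertices whose degree is statistically unusual. The records, names, and z-score rule are invented for illustration; the project's actual models and MapReduce algorithms will be far richer.

```python
import statistics
from collections import defaultdict

def build_graph(records):
    """Model network records as a graph: hosts are vertices, and a record
    linking two hosts contributes an undirected edge."""
    adj = defaultdict(set)
    for u, v in records:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def degree_anomalies(adj, z_threshold=2.0):
    """Flag vertices whose degree deviates from the mean by more than
    z_threshold standard deviations -- a very simple structural anomaly."""
    degrees = {v: len(nbrs) for v, nbrs in adj.items()}
    mu = statistics.mean(degrees.values())
    sigma = statistics.pstdev(degrees.values())
    return [v for v, d in degrees.items()
            if sigma > 0 and abs(d - mu) / sigma > z_threshold]

# Synthetic traffic: one scanner contacts 20 hosts; the rest is normal.
records = [("scanner", f"host{i}") for i in range(20)] + [("a", "b"), ("b", "c")]
print(degree_anomalies(build_graph(records)))
```

The degree statistics here are trivially parallelizable: per-vertex degrees are a natural map step, and the mean and deviation a reduce step.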

GPU accelerated medical image reconstruction and processing
Faculty Mentors: Dr. Yuanwei Jin and Dr. Enyue (Annie) Lu

Image reconstruction and processing is a rapidly developing field drawing on engineering, mathematics, and computer science. The Algebraic Reconstruction Technique (ART) is a well-known reconstruction method for computed tomography (CT) scanners. Although ART has many advantages over the popular filtered back-projection approaches, its high computational complexity means it is rarely applied in today's medical CT systems. The typical medical environment requires fast reconstructions in order to save valuable time. Industrial solutions address the performance challenge using dedicated special-purpose reconstruction platforms with digital signal processors (DSPs) and field-programmable gate arrays (FPGAs). The most apparent downside of such solutions is the loss of flexibility and their time-consuming implementation, which can lead to long innovation cycles. In contrast, research has already shown that current GPUs offer massively parallel processing capability that can handle the computational complexity of two-dimensional or three-dimensional cone-beam reconstruction.
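
In its classic form, ART is a Kaczmarz-type iteration: the current image estimate is projected, row by row, onto the hyperplane defined by each ray measurement. Here is a minimal NumPy sketch of that iteration on a toy linear system (not the PBP method, and with no real CT geometry):

```python
import numpy as np

def art_reconstruct(A, b, sweeps=50, relax=1.0):
    """Basic ART (Kaczmarz) iteration for A x = b: for each row a_i,
    project the current estimate onto the hyperplane a_i . x = b_i,
    optionally damped by a relaxation factor."""
    x = np.zeros(A.shape[1])
    row_norms = np.einsum("ij,ij->i", A, A)   # squared norm of each row
    for _ in range(sweeps):
        for i in range(A.shape[0]):
            if row_norms[i] == 0:
                continue
            residual = b[i] - A[i] @ x
            x += relax * (residual / row_norms[i]) * A[i]
    return x

# Toy consistent system standing in for a (tiny) projection matrix.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
x_true = np.array([1.0, 2.0])
x_rec = art_reconstruct(A, A @ x_true)
print(np.round(x_rec, 4))  # approaches the true solution [1, 2]
```

In a real scanner the matrix A encodes tens of millions of ray-pixel intersections, which is why the inner loop is a prime target for GPU acceleration.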

In this project, students will accelerate a new iterative image reconstruction algorithm called “propagation and backpropagation (PBP)” image reconstruction method using Matlab computing with NVIDIA CUDA-enabled GPUs. Through the project, students will learn the basics of Matlab parallel computing for medical imaging with GPU support and gain understanding of the benefits of parallel processing for large scientific computing tasks based on a real-world medical imaging problem. Furthermore, students will be able to verify their algorithms using experimentally collected data through data measurement systems funded by a Department of Defense (DOD) award and an NSF Major Research Instrumentation (MRI) award.

Deep Learning and Data Analytics for Remote Sensing Applications
Faculty Mentor: Dr. Yuanwei Jin

With massive amounts of computational power, machines can now recognize objects and translate speech in real time. Research in this area attempts to build better representations and to create models that learn these representations from large-scale unlabeled data. Deep learning is part of a broader family of machine learning methods based on learning representations of data. Deep learning algorithms attempt to learn multi-level representations of data, embodying a hierarchy of factors that may explain them. Various deep learning architectures, such as deep neural networks, convolutional neural networks, and deep belief networks, have been applied to fields such as computer vision, automatic speech recognition, and natural language processing, where they have proven effective at uncovering underlying structure in data and producing state-of-the-art results on various tasks.
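
As a tiny illustration of multi-level representations, the first level of a convolutional network applies small filters to raw input. This Python/NumPy sketch shows one convolution-plus-ReLU stage; the image and filter are invented for illustration and stand in for the learned filters of a real network.

```python
import numpy as np

def conv_relu(image, kernel):
    """One feature-extraction stage of a convolutional network:
    valid 2-D cross-correlation followed by a ReLU nonlinearity."""
    kh, kw = kernel.shape
    H, W = image.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return np.maximum(out, 0.0)

# A bright "target" on the right half of a dark image; a vertical-edge
# filter responds only where the intensity jumps.
image = np.zeros((6, 6))
image[:, 3:] = 1.0
edge_filter = np.array([[-1.0, 1.0], [-1.0, 1.0]])
feature_map = conv_relu(image, edge_filter)
```

Deeper layers stack many such stages, so that later feature maps respond to progressively more abstract structure; GPUs make stacking them tractable.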

In this project, we will focus on remote sensing applications such as radar target recognition and feature extraction of acoustic dispersion characteristics. For example, automatic target recognition based on a sequence of synthetic aperture radar (SAR) images is an important task for both military and civilian applications. By employing emerging deep learning methods applicable to SAR images and implementing the algorithms on commercial off-the-shelf graphics processing units (GPUs), we expect significant improvement in recognition performance.

Exploring the Design of Optical Interconnected Multicore Computer Architectures
Faculty Mentor: Lei Zhang

In the pursuit of more powerful computing capability, multiple or even many computing cores are integrated into a single chip. As a result, the bottleneck of computing has shifted from how fast a core can compute to how fast cores can transfer data to one another. By replacing traditional electrical wires with optical waveguides, an Optical Network-on-Chip (ONoC) integrates computing cores on one chip and lets them communicate via light. In addition, the ONoC offers better energy conservation because of the higher power efficiency of optical transmission. These outstanding properties make the ONoC one of the most promising candidates for constructing next-generation supercomputers.

In this project, we will explore the ONoC system design and development process. Students will be exposed to advanced computer architecture concepts, optical computing theories, optoelectronics fundamentals, photonic VLSI design basics, and dynamically reconfigurable ONoC architectures. Through the project, participants will study the methodology of computing-system architecture design, explore network topologies, work with mathematical tools, and develop simulation software.
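
Exploring network topologies can start from very simple models. This hypothetical Python snippet computes the average hop count between cores in an n x n mesh under minimal XY routing, one baseline metric when comparing on-chip interconnect topologies (the mesh and routing assumptions are ours, chosen for illustration):

```python
from itertools import product

def mesh_average_hops(n):
    """Average number of hops between distinct cores in an n x n mesh
    network-on-chip, assuming minimal XY routing (Manhattan distance)."""
    cores = list(product(range(n), repeat=2))
    total = pairs = 0
    for (x1, y1), (x2, y2) in product(cores, repeat=2):
        if (x1, y1) != (x2, y2):
            total += abs(x1 - x2) + abs(y1 - y2)
            pairs += 1
    return total / pairs

# Larger meshes pay more hops on average, motivating richer
# (e.g., optically interconnected) topologies.
print(mesh_average_hops(2), mesh_average_hops(4))
```

The same kind of analysis, extended with link bandwidth and optical loss models, is the starting point for the simulation software students will develop.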

Personality-Augmented Intelligent Agents and Their Behaviors in HPC
Faculty Mentor: Dr. Randall Cone

Visual representation and analysis of textual works have long aided human learning and understanding. This is particularly true in the Digital Age, given the advent of natural language processing, the wide availability of general-purpose programming languages, and the maturation of digital visualization. In our research, we eschew disciplinary boundaries to view and analyze classic literary and other textual works in unconventional ways. We study these texts with a sequence of progressively sophisticated content analysis and feature extraction software packages, many of which render a useful, artistic visual representation of a given text. To examine the entire corpus of an author's (or group of authors') works, appealing to the power of HPC is a natural choice.

We have recently begun to take the above-mentioned content analysis and feature extraction research into the realm of Artificial Intelligence (AI). Using a bootstrap of cognitive and emotional reaction vectors, we endow artificially intelligent agents with personalities, then observe their reactions to sets of textual information. This work currently incorporates the following technologies: WordNet, Word2Vec, neural networks, and a novel AI framework written in the Python programming language. Our future plans are to extend this work into distributed computing environments and HPC (via mpi4py), coupling it strongly with a study in which we establish groups of personality-endowed AI agents and examine the behavior of such a population of intelligences.
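
To give a flavor of reaction vectors, the sketch below scores two agents' "reactions" to a text as the cosine similarity between vectors, the same comparison used with Word2Vec-style embeddings. The three dimensions and all values here are invented for illustration; the project's actual vectors come from WordNet, Word2Vec, and neural-network features.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1.0 means perfectly
    aligned, 0.0 means orthogonal (unrelated)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical 3-dimensional "reaction vectors" for two agents and
# a feature vector extracted from a text.
agent_curious = [0.9, 0.1, 0.2]
agent_cautious = [0.1, 0.8, 0.5]
text_features = [0.8, 0.2, 0.1]

reaction_curious = cosine_similarity(agent_curious, text_features)
reaction_cautious = cosine_similarity(agent_cautious, text_features)
# The "curious" agent's personality aligns more strongly with this text.
```

Scoring one agent against one text is independent of every other pairing, which is why scaling to a population of agents maps naturally onto mpi4py-style distributed computing.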

Massively Parallel Machine Learning
Faculty Mentor: Joseph Anderson

Many modern computing challenges rely on massive amounts of data gathered from users, the environment, networks, and other sources. The difficulty of processing this data has swiftly outpaced even industrial-grade computing hardware. Rather than simply applying standard algorithms and approaches at a larger scale, the goal of this project is to explore which computational techniques can be adapted or reformulated specifically for high-performance computing environments. Applications include medical imaging data, signal processing and recovery, compressed sensing, and neural networks.
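
One recurring reformulation is data parallelism: split the data into shards, compute partial results independently, then combine them. The Python sketch below applies that pattern to a single least-squares gradient step; the sharding is simulated sequentially, whereas on a cluster each shard would live on its own node and the sum would be an all-reduce. The model and data are toy examples of our own invention.

```python
def partial_gradient(shard, w):
    """Gradient of the squared error on one data shard, for a 1-D
    least-squares model y ~ w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in shard)

def data_parallel_step(data, w, step, n_workers=4):
    # Shard the data round-robin; each partial gradient is independent
    # work that could run on a separate processor.
    shards = [data[i::n_workers] for i in range(n_workers)]
    total = sum(partial_gradient(shard, w) for shard in shards)
    return w - step * total / len(data)

data = [(x, 3.0 * x) for x in range(1, 6)]   # noiseless line y = 3x
w = 0.0
for _ in range(60):
    w = data_parallel_step(data, w, step=0.02)
print(round(w, 4))  # converges to the true slope, 3.0
```

Because the partial gradients commute under addition, the combine step is a single reduction, the same communication primitive that dominates distributed training of neural networks.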

High-Dimensional Convex Geometry and Optimization
Faculty Mentor: Joseph Anderson

Every person, as well as every machine, constantly makes choices. The classical way to approach making choices is to balance the cost of a choice against its benefit; this is often referred to as optimization. In many modern machine learning applications (autonomous robots, self-driving cars, signal processing, medical imaging), these choices become more complicated and also more consequential. Convex geometry is a field of mathematics that offers powerful analytical tools with which scientists can not only develop efficient computational techniques for such decisions but also provide rigorous theoretical guarantees about their performance.

This project will consider several important open problems from convex geometry that have close connections to machine learning and data analysis. Students will balance rigorous theory with experimental study, using software to approach currently open questions in convex geometry and optimization theory.
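
As one concrete meeting point of convex geometry and optimization, the Python sketch below runs projected gradient descent: minimizing a convex function over the unit ball (a convex set) by alternating gradient steps with Euclidean projection. The objective and constraint set are illustrative choices, not a problem from the project.

```python
import math

def project_to_unit_ball(x):
    """Euclidean projection onto the convex set {x : ||x|| <= 1}:
    points inside stay put; points outside are rescaled to the boundary."""
    norm = math.sqrt(sum(c * c for c in x))
    return x if norm <= 1.0 else [c / norm for c in x]

def projected_gradient_descent(grad, x0, project, step=0.1, iters=200):
    """Alternate a gradient step with projection back onto the set."""
    x = x0
    for _ in range(iters):
        g = grad(x)
        x = project([c - step * gc for c, gc in zip(x, g)])
    return x

# Minimize f(x) = ||x - (2, 0)||^2 over the unit ball; the constrained
# optimum is the boundary point (1, 0).
grad = lambda x: [2.0 * (x[0] - 2.0), 2.0 * x[1]]
x_star = projected_gradient_descent(grad, [0.0, 0.0], project_to_unit_ball)
print([round(c, 4) for c in x_star])  # [1.0, 0.0]
```

The convexity of both the objective and the set is what guarantees this simple scheme converges to the global constrained optimum, the kind of rigorous guarantee the project description alludes to.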

Parallel Processing for AI Opponents in Games
Faculty Mentor: Joseph Anderson

Computer games often rely on the implementation of “smart” adversaries to engage their players in challenging, yet rewarding, game content. However, as games become more sophisticated and allow the player more and more freedom of interaction with the world, the problem of simulating a formidable opponent -- one who is unpredictable, skilled, yet believable -- becomes computationally challenging. Traditional techniques for Artificial Intelligence break down quickly when there are too few constraints on the opponents’ behavior.

This project will focus on developing algorithms for AI behavior which take advantage of high-performance computing environments, with a focus on parallel computing and large-scale simulation.
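
A classical starting point for AI opponents is game-tree search. The toy Python negamax below, for a take-1-or-2 variant of Nim, also illustrates why such searches parallelize well: the subtrees under each candidate move are independent and could be evaluated on separate processors. The game and code are illustrative, not a project deliverable.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def best_score(pile):
    """Negamax value of a take-1-or-2 Nim position: +1 if the player to
    move can force taking the last stone, -1 otherwise. Each recursive
    call explores an independent subtree -- the natural unit of parallel
    work in a distributed game-tree search."""
    if pile == 0:
        return -1          # no stones left: the player to move has lost
    return max(-best_score(pile - take) for take in (1, 2) if take <= pile)

def best_move(pile):
    """Pick a move whose resulting position is worst for the opponent."""
    moves = [take for take in (1, 2) if take <= pile]
    return max(moves, key=lambda take: -best_score(pile - take))

# Positions divisible by 3 are losses for the player to move.
print([best_score(n) for n in range(1, 7)])  # [1, 1, -1, 1, 1, -1]
```

Real game AIs replace the exact recursion with bounded-depth search, heuristic evaluation, and pruning, but the independence of sibling subtrees remains the lever for large-scale parallel simulation.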