

Sample Research Projects:
The research project background materials and problem
sets will be posted in the spring pre-program online workshop, and the
research projects will be finalized in the first week of the summer REU
program. The projects are designed to put undergraduate participants at
the forefront of new innovations while keeping the work manageable for
their level of expertise. Some sample research projects are listed below.
Graph mining for large-scale networks using MapReduce
Faculty Mentor: Dr. Enyue (Annie) Lu
Analyzing patterns in large-scale graphs, such as social and cyber
networks (e.g., Facebook, LinkedIn, Twitter), with millions or even
billions of edges has many important applications, including community
detection, blog analysis, intrusion and spam detection, and more.
It is currently impossible to process real-world large-scale networks
with millions or billions of objects on a single processor. To overcome
this limitation, a cluster of computers with multiple processing
elements operating in parallel, connected by a distributed network, is
used to solve large problems and reduce processing time.
In this project, students will enumerate and identify important graph
patterns. The network is modeled as a graph: each person is represented
as a vertex, and a mutual friendship between two people is represented
as an edge. Finding a pattern in a real-world network is thus equivalent
to finding a subgraph in a large-scale graph. We will map
graph-decomposition operations onto a series of MapReduce processes.
The proposed MapReduce algorithms will be implemented on Amazon Elastic
MapReduce, and we will compare and analyze their performance using
simulation results.
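The map/reduce decomposition described above can be sketched in miniature on a single machine (the project itself targets Amazon Elastic MapReduce; the pattern enumerated here, a length-2 path or "wedge", and all function names are illustrative):

```python
from collections import defaultdict
from itertools import combinations

def map_edges(edges):
    # Map phase: emit each undirected edge under both endpoints,
    # so the reducer for a vertex sees its full neighborhood.
    for u, v in edges:
        yield (u, v)
        yield (v, u)

def reduce_wedges(pairs):
    # Reduce phase: group neighbors by vertex, then enumerate
    # "wedges" (length-2 paths), the simplest subgraph pattern.
    adj = defaultdict(set)
    for center, nbr in pairs:
        adj[center].add(nbr)
    wedges = []
    for center, nbrs in adj.items():
        for a, b in combinations(sorted(nbrs), 2):
            wedges.append((a, center, b))
    return wedges

# Friendship graph: vertices are people, edges mutual friendships.
edges = [("alice", "bob"), ("bob", "carol"), ("alice", "carol")]
wedges = reduce_wedges(map_edges(edges))
```

In a real MapReduce job the grouping done by `defaultdict` here is performed by the framework's shuffle phase across many machines.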
Implementation of parallel iterative improvement stable matching
algorithms
Faculty Mentor: Dr. Enyue (Annie) Lu
In a graph, a set of independent edges (no two edges in the set share a
vertex) is called a matching. Matching algorithms are widely used in
many applications, including database search, image processing, pattern
analysis, and scheduling. The stable matching problem was first
introduced by Gale and Shapley in 1962. Given n men, n women, and 2n
ranking lists in which each person ranks all members of the opposite sex
in order of preference, a stable matching is a set of n man-woman pairs,
with each man and each woman in exactly one pair, such that there is no
unmatched man-woman pair who both prefer each other to their current
partners. Gale and Shapley showed that every instance of the stable
matching problem admits at least one stable matching, which can be
computed in O(n^2) iterations. For real-time applications such as
switch scheduling, however, the algorithm proposed by Gale and Shapley
is not fast enough.
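Gale and Shapley's deferred-acceptance algorithm, referenced above, can be sketched as follows (the dictionary layout and names are illustrative):

```python
def gale_shapley(men_prefs, women_prefs):
    # Men propose in preference order; each woman keeps her best
    # proposer so far. Terminates within O(n^2) proposals with a
    # stable matching (Gale & Shapley, 1962).
    free = list(men_prefs)                 # men not yet matched
    next_choice = {m: 0 for m in men_prefs}
    engaged = {}                           # woman -> man
    rank = {w: {m: i for i, m in enumerate(prefs)}
            for w, prefs in women_prefs.items()}
    while free:
        m = free.pop()
        w = men_prefs[m][next_choice[m]]
        next_choice[m] += 1
        if w not in engaged:
            engaged[w] = m
        elif rank[w][m] < rank[w][engaged[w]]:
            free.append(engaged[w])        # w trades up; old partner freed
            engaged[w] = m
        else:
            free.append(m)                 # w rejects m
    return {m: w for w, m in engaged.items()}

men = {"m1": ["w1", "w2"], "m2": ["w1", "w2"]}
women = {"w1": ["m2", "m1"], "w2": ["m2", "m1"]}
match = gale_shapley(men, women)
```

Note this sequential version is exactly what is "not fast enough" for real-time switch scheduling, which motivates the parallel approach below.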
To date, the best-known parallel algorithms for the stable matching
problem all run on theoretical parallel computing models such as the
CRCW PRAM. In this project, students will implement a parallel iterative
improvement (PII) stable matching algorithm with linear average-case
running time. For real-time applications with a hard time constraint,
the PII algorithm can terminate at any point during its execution, and
the matching with the minimum number of unstable pairs found so far can
be used as an approximation to a stable matching. The PII algorithm will
be implemented using MPICH2 from Argonne National Lab, NVIDIA
CUDA-enabled GPUs, and MapReduce computing on Amazon EC2. The goal of
the project is to examine the parallelism of stable matching using the
practical PII algorithm. We will implement the PII algorithm in
software, measure its speedup against sequential approaches, evaluate
its efficiency and applicability, and investigate the parallelism limits
of stable matching algorithms.
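The quality measure mentioned above, the number of unstable (blocking) pairs in a partial result, follows directly from the definition of stability and can be sketched as:

```python
def unstable_pairs(matching, men_prefs, women_prefs):
    # A pair (m, w) not in the matching is unstable ("blocking")
    # if m prefers w to his current partner and w prefers m to hers.
    wife = matching                               # man -> woman
    husband = {w: m for m, w in matching.items()}
    def prefers(prefs, a, b):
        return prefs.index(a) < prefs.index(b)
    blocking = []
    for m, w_list in men_prefs.items():
        for w in w_list:
            if wife[m] == w:
                continue
            if prefers(men_prefs[m], w, wife[m]) and \
               prefers(women_prefs[w], m, husband[w]):
                blocking.append((m, w))
    return blocking

men = {"m1": ["w1", "w2"], "m2": ["w1", "w2"]}
women = {"w1": ["m2", "m1"], "w2": ["m2", "m1"]}
# m1-w1, m2-w2: m2 and w1 prefer each other -> one blocking pair.
bad = unstable_pairs({"m1": "w1", "m2": "w2"}, men, women)
```

A matching is stable exactly when this list is empty; an anytime PII run would return the intermediate matching minimizing its length.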
Anomaly Detection for Network Data Using MapReduce
Faculty Mentor: Dr. Enyue (Annie) Lu
As the volume of collected data grows at an unprecedented rate with
recent technology advances, information retrieval on very large data
sets has become a beneficial yet challenging task. Anomaly detection,
which identifies abnormal events or patterns that do not conform to
expected behavior, is a widely used methodology for predictive model
checking of available data; however, detecting intrusions from network
records has become increasingly difficult due to the large volume of
network traffic data.
In this project, we will develop a new framework that combines graph
modeling with MapReduce computing techniques to tackle anomaly
detection on network record data at extreme scales. A graph is an
expressive data structure that has been widely used to model complex
data in many applications. Under graph modeling, data items are
represented as vertices, and relationships within the data (e.g.,
similarity in spatial, temporal, or semantic attributes) are
represented as edges. We will detect anomalies by analyzing the graph
generated from the network data. We plan to tackle the problem in three
research steps. First, we will analyze the characteristics of the data
and develop an effective graph model for it. Second, we will develop
efficient MapReduce graph-based anomaly detection algorithms and
analyze their performance. Last, we will test the proposed algorithms
and verify their performance using real-world network data.
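As a toy illustration of the graph-modeling step (the project will develop real models and MapReduce algorithms; the similarity rule and degree threshold below are assumptions of this sketch):

```python
def build_graph(records, similar):
    # Model records as vertices; add an edge when two records are
    # similar (e.g., in spatial, temporal, or semantic attributes).
    edges = set()
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            if similar(records[i], records[j]):
                edges.add((i, j))
    return edges

def flag_anomalies(n, edges, min_degree=1):
    # Toy rule (an assumption of this sketch): a record whose vertex
    # connects to fewer than min_degree others is flagged anomalous.
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return [i for i in range(n) if degree[i] < min_degree]

# Byte counts of network flows; the outlier resembles no other flow.
records = [100, 102, 98, 5000]
edges = build_graph(records, lambda a, b: abs(a - b) <= 10)
anomalies = flag_anomalies(len(records), edges)
```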
GPU-accelerated medical image reconstruction and processing
Faculty Mentors: Dr. Yuanwei Jin and Dr. Enyue (Annie) Lu
Image reconstruction and processing is a rapidly developing field
drawing on engineering, mathematics, and computer science. The
Algebraic Reconstruction Technique (ART) is a well-known reconstruction
method for computed tomography (CT) scanners. Although ART has many
advantages over the popular filtered backprojection approaches, its
high computational complexity means it is rarely applied in today’s
medical CT systems. The typical medical environment requires fast
reconstructions in order to save valuable time. Industrial solutions
address the performance challenge using dedicated special-purpose
reconstruction platforms built on digital signal processors (DSPs) and
field-programmable gate arrays (FPGAs). The most apparent downside of
such solutions is the loss of flexibility and their time-consuming
implementation, which can lead to long innovation cycles. In contrast,
research has already shown that current GPUs offer massively parallel
processing capability that can handle the computational complexity of
two-dimensional or three-dimensional cone-beam reconstruction.
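For context, the core ART update (the classic Kaczmarz sweep over the rays of the linear system arising from CT projections; the PBP method studied in the project is not shown here) can be sketched as:

```python
def art_step(A, b, x, lam=1.0):
    # One ART (Kaczmarz) sweep for the linear system A x = b that
    # arises from discretizing CT projections: for each ray i,
    # project x onto the hyperplane a_i . x = b_i, relaxed by lam.
    for a_i, b_i in zip(A, b):
        dot = sum(a * xi for a, xi in zip(a_i, x))
        norm2 = sum(a * a for a in a_i)
        scale = lam * (b_i - dot) / norm2
        x = [xi + scale * a for xi, a in zip(x, a_i)]
    return x

# Tiny 2x2 system standing in for a projection matrix.
A = [[1.0, 0.0], [0.0, 1.0]]
b = [2.0, 3.0]
x = [0.0, 0.0]
for _ in range(5):
    x = art_step(A, b, x)
```

The high cost comes from repeating such sweeps over millions of rays and voxels, which is what GPU parallelism targets.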
In this project, students will accelerate a new iterative image
reconstruction algorithm, the “propagation and backpropagation (PBP)”
method, using Matlab computing with NVIDIA CUDA-enabled GPUs. Through
the project, students will learn the basics of Matlab parallel
computing for medical imaging with GPU support and gain an
understanding of the benefits of parallel processing for large
scientific computing tasks, grounded in a real-world medical imaging
problem. Furthermore, students will be able to verify their algorithms
using experimentally collected data from data measurement systems
funded by a Department of Defense (DOD) award and an NSF Major Research
Instrumentation (MRI) award.
Deep Learning and Data Analytics for Remote Sensing Applications
Faculty Mentor: Dr. Yuanwei Jin
With massive amounts of computational power, machines can now recognize
objects and translate speech in real time. Research in this area
attempts to build better representations and to create models that
learn these representations from large-scale unlabeled data. Deep
learning is part of a broader family of machine learning methods based
on learning representations of data. Deep learning algorithms attempt
to learn multi-level representations of data, embodying a hierarchy of
factors that may explain them. Various deep learning architectures,
such as deep neural networks, convolutional neural networks, and deep
belief networks, have been applied to fields like computer vision,
automatic speech recognition, and natural language processing, where
they have proved effective at uncovering underlying structure in data
and producing state-of-the-art results on various tasks.
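The basic operation underlying convolutional networks, sliding a small kernel over an input to extract local features, can be sketched in plain Python (the kernel values are an illustrative edge detector):

```python
def conv2d(image, kernel):
    # Valid-mode 2D convolution (cross-correlation, as used in
    # convolutional neural networks): slide the kernel over the
    # image and take inner products to extract local features.
    kh, kw = len(kernel), len(kernel[0])
    oh = len(image) - kh + 1
    ow = len(image[0]) - kw + 1
    out = [[0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            out[i][j] = sum(image[i + di][j + dj] * kernel[di][dj]
                            for di in range(kh) for dj in range(kw))
    return out

# A vertical-edge detector applied to a step image: the response
# peaks where the dark-to-bright transition occurs.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
kernel = [[-1, 1],
          [-1, 1],
          [-1, 1]]
out = conv2d(image, kernel)
```

A deep network stacks many such filters, with learned values, into the hierarchy of representations described above.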
In this project, we will focus on remote sensing applications such as
radar target recognition and feature extraction of acoustic dispersion
characteristics. For example, automatic target recognition based on a
sequence of synthetic aperture radar (SAR) images is an important task
for both military and civilian applications. By applying emerging deep
learning methods to SAR images and implementing the algorithms on
commercial off-the-shelf graphics processing units (GPUs), we expect
significant improvement in recognition performance.
Exploring the Design of Optically Interconnected Multicore Computer
Architectures
Faculty Mentor: Lei Zhang
In the pursuit of more powerful computing capability, multiple and even
many computing cores are integrated onto a single chip. As a result,
the bottleneck of computing has shifted from how fast a core can
compute to how fast cores can transfer data to one another. In an
Optical Network-on-Chip (ONoC), the traditional electrical wires are
replaced with optical waveguides, so the computing cores integrated on
the chip communicate via light. In addition, the ONoC offers better
energy conservation because of the higher power efficiency of optical
transmission. These outstanding properties make the ONoC one of the
most promising candidates for constructing the next generation of
supercomputers.
In this project, we will explore the ONoC system design and development
process. Students will be exposed to advanced computer architecture
concepts, optical computing theories, optoelectronics fundamentals,
photonic VLSI design basics, and dynamically reconfigurable ONoC
architectures. Through the project, participants will study the
methodology of computing system architecture design, explore network
topologies, apply mathematical tools, and develop simulation software.
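As a small example of the topology exploration mentioned above, one can compute the average hop count of a k x k mesh ONoC under XY routing (a standard interconnect metric; the mesh topology and routing choice are assumptions of this sketch):

```python
def avg_hops_mesh(k):
    # Average hop count between distinct cores in a k x k mesh
    # network-on-chip, assuming XY routing, where the hop count
    # equals the Manhattan distance between tiles.
    total, pairs = 0, 0
    cores = [(x, y) for x in range(k) for y in range(k)]
    for i, (x1, y1) in enumerate(cores):
        for x2, y2 in cores[i + 1:]:
            total += abs(x1 - x2) + abs(y1 - y2)
            pairs += 1
    return total / pairs

# Larger meshes need more hops on average, hence longer on-chip
# latency -- one trade-off a topology study would quantify.
```

Comparing such metrics across candidate topologies (mesh, torus, ring) is one way simulation software guides the architecture design.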
Personality-Augmented Intelligent Agents and Their Behaviors in HPC
Faculty Mentor: Dr. Randall Cone
Visual representation and analysis of textual works have often aided
human learning and understanding. In the Digital Age this is
particularly true, given the advent of natural language processing, the
wholesale availability of general programming languages, and the
maturation of digital visualization. In our research, we eschew
disciplinary boundaries to view and analyze classic literary and other
textual works in unconventional ways. We study these texts with a
sequence of progressively sophisticated content analysis and feature
extraction software packages, many of which render a useful, artistic
visual representation of a given text. To examine the entire corpus of
an author’s (or group of authors’) works, appealing to the power of HPC
is a natural choice.
We have recently begun to take the above-mentioned content analysis and
feature extraction research into the realm of Artificial Intelligence
(AI). Using a bootstrap of cognitive and emotional reaction vectors, we
endow artificially intelligent agents with personalities, then observe
their reactions to sets of textual information. This work currently
incorporates the following technologies: WordNet, Word2Vec, neural
networks, and a novel AI framework written in the Python programming
language. Our future plans are to extend this work into distributed
computing environments and HPC (via mpi4py), coupling it strongly with
a study in which we establish groups of personality-endowed AI agents
in order to study a population of such intelligences and their
behavior.
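A highly simplified, hypothetical sketch of the reaction-vector idea (the actual framework, its vector bootstrap, and its neural components are far richer; the class, the two-dimensional vectors, and the cosine-similarity rule here are all assumptions of this sketch):

```python
import math

def cosine(u, v):
    # Cosine similarity between two reaction vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

class Agent:
    # Hypothetical agent: its "personality" is a fixed vector of
    # cognitive/emotional weights; its reaction to a text is the
    # similarity between that vector and the text's feature vector.
    def __init__(self, personality):
        self.personality = personality

    def react(self, text_features):
        return cosine(self.personality, text_features)

calm = Agent([1.0, 0.0])
anxious = Agent([0.0, 1.0])
features = [0.9, 0.1]  # e.g., scores derived from WordNet/Word2Vec
```

A population study would run many such agents over a corpus in parallel, which is where mpi4py and HPC enter.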
Massively Parallel Machine Learning
Faculty Mentor: Joseph Anderson
Many modern computing challenges rely on massive amounts of data
gathered from the user, the environment, a network, or other sources.
The difficulty of processing this data has swiftly outpaced even
industrial-grade computing hardware. Instead of simply applying
standard algorithms and approaches at a larger scale, the goal of this
project is to explore which computational techniques can be adapted or
reformulated specifically for high-performance computing environments.
Applications include medical imaging data, signal processing and
recovery, compressed sensing, and neural networks.
High-Dimensional Convex Geometry and Optimization
Faculty Mentor: Joseph Anderson
Every person, as well as every machine, constantly makes choices. The
classical way to approach making choices is to balance the cost of a
choice against its benefit; this is often referred to as optimization.
In many modern machine learning applications (autonomous robots,
self-driving cars, signal processing, medical imaging), these choices
become more complicated and also more consequential. Convex geometry is
a field of mathematics that offers powerful analytical tools for
scientists not only to develop efficient computational techniques for
such decisions but also to offer rigorous theoretical guarantees about
their performance.
This project will consider several important open problems from convex
geometry that have important relationships to machine learning and data
analysis. Students will balance rigorous theory with experimental
study, using software to approach currently open questions in convex
geometry and optimization theory.
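As a minimal example of the optimization framing above, gradient descent on a convex cost converges to the globally best choice; the quadratic cost function here is purely illustrative:

```python
def gradient_descent(grad, x0, step=0.1, iters=200):
    # For a convex differentiable cost, repeatedly stepping against
    # the gradient converges to the global minimum -- the "best
    # choice" under the cost/benefit trade-off described above.
    x = x0
    for _ in range(iters):
        x = [xi - step * g for xi, g in zip(x, grad(x))]
    return x

# Convex quadratic cost: f(x) = (x0 - 3)^2 + (x1 + 1)^2,
# minimized at (3, -1).
grad = lambda x: [2 * (x[0] - 3), 2 * (x[1] + 1)]
best = gradient_descent(grad, [0.0, 0.0])
```

The theoretical guarantee (convergence to the global optimum for any starting point) is exactly the kind of statement convex geometry makes rigorous.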
Parallel Processing for AI Opponents in Games
Faculty Mentor: Joseph Anderson
Computer games often rely on the implementation of “smart” adversaries
to engage their players in challenging yet rewarding game content.
However, as games become more sophisticated and allow the player ever
more freedom of interaction with the world, the problem of simulating a
formidable opponent (one who is unpredictable, skilled, yet believable)
becomes computationally challenging. Traditional techniques for
Artificial Intelligence break down quickly when there are too few
constraints on the opponents’ behavior. This project will focus on
developing algorithms for AI behavior that take advantage of
high-performance computing environments, with a focus on parallel
computing and large-scale simulation.
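One traditional technique for game adversaries is minimax search, in which the AI assumes an optimally playing opponent; a minimal sketch on a toy game follows (the game rules here are purely illustrative):

```python
def minimax(state, depth, maximizing, moves, apply_move, evaluate):
    # Classic minimax search: the AI assumes the opponent plays
    # optimally and picks the move with the best worst-case score.
    options = moves(state)
    if depth == 0 or not options:
        return evaluate(state), None
    if maximizing:
        best_score, best_move = float("-inf"), None
    else:
        best_score, best_move = float("inf"), None
    for mv in options:
        score, _ = minimax(apply_move(state, mv), depth - 1,
                           not maximizing, moves, apply_move, evaluate)
        if (maximizing and score > best_score) or \
           (not maximizing and score < best_score):
            best_score, best_move = score, mv
    return best_score, best_move

# Toy game: the state is a number; each player adds 1 or 2; the
# maximizer wants a large final value, the minimizer a small one.
score, move = minimax(
    0, 2, True,
    moves=lambda s: [1, 2],
    apply_move=lambda s, mv: s + mv,
    evaluate=lambda s: s,
)
```

The search tree grows exponentially with depth and branching factor, which is why open-ended games overwhelm this technique and motivate the parallel, simulation-based approaches the project targets.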

