REU SITE: Exercise - Explore Emerging Computing in Science and Engineering
Sample Research Projects: The research project background materials and problem
sets will be posted in the spring pre-program online workshop. The
research projects will be finalized in the first week of the summer REU
program. The research projects will be designed to allow
undergraduate
student participants to be at the forefront of new innovations, yet work
within an environment that is manageable for their level of expertise.
Some sample research projects are listed as follows.
Anomaly Detection for Network Data Using MapReduce
Graph Clustering with Applications on Covid-19 Growth Data
Faculty Mentors: Dr. Enyue (Annie) Lu
Coronavirus Disease 2019 (Covid-19) has affected all people across the world.
According to the Centers for Disease Control and Prevention (CDC), more than 600,000
lives have been lost due to Covid-19 across the U.S. Much of this has
been attributed to a lack of preparedness and a lack of resources.
Different counties across the U.S. have had varying rates of Covid-19
case growth due to area density, strictness of guidelines, deployment of
Covid-19 vaccines, or other factors. The ability to analyze Covid-19
growth data across counties in the U.S. and find similarities and possible
relations between them could help predict future trends in the nation,
so that resources can be allocated in a way that would allow the spread,
number of cases, and mortality rate of the virus to decrease
substantially in all counties.
A graph is an expressive data structure and has been widely used to model complex data
in many applications. Data-driven graph construction and graph learning
methods have proven effective for designing general
machine learning algorithms and have achieved promising research
results. In this project, we plan to leverage our prior work on MapReduce
cloud computing algorithms for mining large-scale graphs to
develop graph clustering machine learning algorithms with applications
in public health. The REU students will investigate graph clustering
techniques for Covid-19 data and apply graph clustering machine learning
algorithms to identify the centers and diameters of the Covid-19 case
clusters. We will test the accuracy of our algorithms on a testbed that
has been created based on our preliminary work on network data using
Apache Spark. We will also test the scalability of our algorithms on
Amazon Elastic MapReduce, Google Colab, and XSEDE infrastructure.
Image Processing and Computer Vision Algorithms for Sustainable Shellfish Farming
Faculty Mentors: Drs. Enyue (Annie) Lu, Yuanwei Jin, Lei Zhang
Aquaculture of shellfish such as oysters, mussels, and scallops provides a
sustainable, environmentally beneficial source of high-protein food, as
well as a way to grow the economy in rural coastal areas. As the demand for
seafood continues to surpass the supply of wild-caught fish and
shellfish, sustainable aquaculture is becoming recognized as a solution
for feeding a future global population of nine billion. Current
practices and technologies used in shellfish farming lack the
advancement found in today’s digital, automated world. Transforming
traditional shellfish farming to sustainable smart farming demands new
technologies, such as sensing and imaging, machine learning, artificial
intelligence, high performance computing, computer vision, and robotics
in the seeding, dredging, and harvesting processes.
In this project, we propose to develop innovative image processing and computer
vision algorithms by applying machine learning and high performance
computing. By leveraging our currently funded USDA research, this project
will enable undergraduate students to develop algorithms based upon our
previous REU work on image processing algorithms using deep neural
networks. Students will dive into two sources of authentic data: data
collected in water tanks in the SPIS-Lab at UMES and data collected
by underwater drones at the Pacific Shellfish Institute. They will decipher what
happens in the video frames in order to detect oysters and recognize the
activity (active versus resting) of each individual oyster. Students
will further design effective algorithms to monitor the oysters’ amount
of interaction, length of interaction, growth, and overall
adaptivity to their surroundings, and develop a smart shellfish farming
software system for remote crop inventory monitoring and identification of
oyster behaviors and their relationship with their habitat.
Finally, students will test the algorithms over commodity off-the-shelf
GPU clusters for HPC implementation.
GPU accelerated medical image reconstruction and processing
In this project, students will accelerate a new iterative image reconstruction algorithm, the “propagation and backpropagation (PBP)” method, using Matlab with NVIDIA CUDA-enabled GPUs. Through the project, students will learn the basics of Matlab parallel computing for medical imaging with GPU support and gain an understanding of the benefits of parallel processing for large scientific computing tasks based on a real-world medical imaging problem. Furthermore, students will be able to verify their algorithms using experimentally collected data through data measurement systems funded by a Department of Defense (DOD) award and an NSF Major Research Instrumentation (MRI) award.
Deep Learning and Data Analytics for Remote Sensing Applications
Faculty Mentor: Dr. Yuanwei Jin
With massive amounts of computational power, machines can now recognize objects and translate speech in real time. Research in this area attempts to make better representations and create models to learn these representations from large-scale unlabeled data. Deep learning is part of a broader family of machine learning methods based on learning representations of data. Deep learning algorithms attempt to learn multi-level representations of data, embodying a hierarchy of factors that may explain them. Various deep learning architectures such as deep neural networks, convolutional deep neural networks, and deep belief networks have been applied to fields like computer vision, automatic speech recognition, and natural language processing, where they have been demonstrated to be effective at uncovering underlying structure in data and producing state-of-the-art results on various tasks.
In this project, we will focus on remote sensing applications such as radar target recognition and feature extraction of acoustic dispersion characteristics. For example, automatic target recognition based upon a sequence of synthetic aperture radar (SAR) images is an important task for both military and civilian applications. By employing emerging deep learning methods applicable to SAR images and implementing the algorithms on commercial off-the-shelf graphics processing units (GPUs), significant improvement in recognition performance is expected.
Flood detection by Deep Learning with Uninhabited Aerial Vehicle Radar Imagery
Floods are the most frequent, disastrous, and widespread natural
hazards. Research shows that floods account for more than 70% of hazard
events occurring globally between 1994 and 2013. One of the leading
causes of flooding in the U.S. is hurricanes.
As climate change intensifies, the frequency and severity of floods are
expected to increase, posing significant challenges for societies
worldwide.
One modern approach to minimizing the risk associated with massive
inundation is to build on radar technology, for example NASA’s
Uninhabited Aerial Vehicle Synthetic Aperture Radar (UAVSAR), to automate
flood detection, allowing organizations and first responders to begin
response and recovery efforts sooner. Unlike optical imagery, UAVSAR
is well suited to disaster segmentation because it can operate
in a variety of lighting and weather conditions, even when clouds are
present in the atmosphere.
In this project, we will use images captured by NASA’s
UAVSAR for flood detection based upon emerging machine learning and image
processing techniques. Machine learning often refers to the use
of data and learning algorithms to enable machines or computers to
perform tasks by imitating intelligent human behavior. In the field of
flood detection, reports have shown that machine learning methods applicable to
flood water segmentation with UAVSAR imagery include edge
detection, random forests, and clustering methods (k-means and fuzzy
c-means).
Moreover, deep learning (DL), a type of machine learning but with a more
complex structure of algorithms to process and learn from data, has
proven to be of great use for several applications of UAVSAR. We’ll
focus on two major research tasks: (1)
Revisiting flood zone maps using AI/DL and remote sensing integration;
and (2) Measuring coastal land cover changes from flooding using
AI/DL models. By leveraging the prior work of former REU students on
Hurricane Florence, which struck North Carolina in 2018, this
work will use deep learning and AI methods to segment bodies of
water in flooded areas caused by hurricanes using UAVSAR imagery.
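As a rough illustration of the clustering approach to water segmentation mentioned above, the following minimal sketch applies k-means (k = 2) to the pixel intensities of a synthetic single-channel image. It is not the project's method and does not read UAVSAR data: the synthetic image, the function names `kmeans_1d` and `segment_water`, and the assumption that open water returns lower backscatter than the surrounding land are all illustrative choices.

```python
import numpy as np

def kmeans_1d(values, k=2, iters=50):
    """Plain Lloyd's k-means on a 1-D array of pixel intensities."""
    values = np.asarray(values, dtype=float)
    # Deterministic start: spread the initial centers over the intensity range.
    centers = np.linspace(values.min(), values.max(), k)
    for _ in range(iters):
        # Assign each pixel to its nearest center.
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = values[labels == j]
            if members.size:  # keep the old center if a cluster is empty
                new_centers[j] = members.mean()
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

def segment_water(image):
    """Label the darker of two intensity clusters as water (illustrative assumption)."""
    labels, centers = kmeans_1d(image.ravel(), k=2)
    water_cluster = int(np.argmin(centers))
    return (labels == water_cluster).reshape(image.shape)

# Synthetic 64x64 "backscatter" image: bright land with a dark rectangle of water.
rng = np.random.default_rng(0)
image = rng.normal(0.8, 0.05, size=(64, 64))
truth = np.zeros((64, 64), dtype=bool)
truth[20:40, 10:50] = True
image[truth] = rng.normal(0.2, 0.05, size=int(truth.sum()))

mask = segment_water(image)
accuracy = float((mask == truth).mean())
print(f"pixel agreement with synthetic ground truth: {accuracy:.3f}")
```

Real radar imagery is affected by speckle noise and by bright returns from flooded vegetation, so intensity clustering alone would serve only as a baseline against which the deep learning segmentation models can be compared.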
Exploring the Design of Optical Interconnected Multicore Computer Architectures
In this project, we will explore the optical network-on-chip (ONoC) system design and development process. Students will be exposed to advanced computer architecture concepts, optical computing theories, optoelectronics fundamentals, photonic VLSI design basics, and dynamically reconfigurable ONoC architectures. Through the project, participants will study the methodology of computing system architecture design, explore network topologies, experiment with mathematical tools, and develop software for simulation.
Personality-Augmented Intelligent Agents and Their Behaviors in HPC
Real-time Model Adaptation for Human Activity Recognition
In this project, we will address the above research challenge by combining the
following three strategies: 1) Parallel-computing-based human activity
recognition (HAR) model training on GPUs. We will investigate how to minimize the training time
of deep learning HAR models by taking full advantage of the parallel
computing ability of GPUs; 2) Sample selection for model adaptation.
Sample selection eliminates redundant training samples to further reduce
the computation cost of model adaptation. We will investigate how to
evaluate and select the most representative training samples; and 3)
Incremental-learning-based model update. Model adaptation consists of model
retraining and/or model update. Compared with model retraining, model
update adapts a model incrementally and requires less computation.
We will investigate how to update deep learning HAR models using
incremental learning methods. Through this project, students will learn
the process of HAR and the related deep learning models, such as
Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM).
In addition, they will accumulate hands-on experience by conducting
experiments on a public dataset using Matlab.
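To make the difference between model retraining and incremental model update concrete, here is a minimal sketch under strong simplifying assumptions. A small softmax classifier on synthetic 2-D features stands in for a deep HAR model, and a shift of the class centers stands in for a new user or environment. The class `SoftmaxClassifier`, the helper `make_data`, the learning rates, and all data are hypothetical and chosen only for illustration.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

class SoftmaxClassifier:
    """Linear softmax classifier trained by full-batch gradient descent."""

    def __init__(self, n_features, n_classes):
        self.W = np.zeros((n_features, n_classes))
        self.b = np.zeros(n_classes)

    def train(self, X, y, epochs, lr):
        # Training always continues from the current weights, so calling
        # train() again on new samples is an incremental update.
        for _ in range(epochs):
            p = softmax(X @ self.W + self.b)
            p[np.arange(len(y)), y] -= 1.0  # cross-entropy gradient w.r.t. logits
            self.W -= lr * (X.T @ p) / len(y)
            self.b -= lr * p.mean(axis=0)

    def accuracy(self, X, y):
        return float((np.argmax(X @ self.W + self.b, axis=1) == y).mean())

def make_data(rng, n_per_class, shift):
    """Three Gaussian classes; `shift` moves all class centers (a new condition)."""
    centers = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]]) + shift
    X = np.vstack([rng.normal(c, 0.5, size=(n_per_class, 2)) for c in centers])
    y = np.repeat(np.arange(3), n_per_class)
    return X, y

rng = np.random.default_rng(0)
X_old, y_old = make_data(rng, 200, shift=np.array([0.0, 0.0]))
X_new, y_new = make_data(rng, 20, shift=np.array([2.0, 2.0]))     # small adaptation set
X_test, y_test = make_data(rng, 200, shift=np.array([2.0, 2.0]))  # held-out new-condition data

model = SoftmaxClassifier(n_features=2, n_classes=3)
model.train(X_old, y_old, epochs=500, lr=0.5)    # base model
acc_before = model.accuracy(X_test, y_test)

model.train(X_new, y_new, epochs=200, lr=0.05)   # incremental update only
acc_after = model.accuracy(X_test, y_test)
print(f"accuracy on new condition: before={acc_before:.2f}, after={acc_after:.2f}")
```

The update step touches only the 60 adaptation samples and starts from the already trained weights, whereas retraining would repeat the full training run on the old and new samples together. Sample selection, the second strategy above, would further shrink the adaptation set passed to the update.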
Online Deep Network Model Adaptation for Facial Emotion Recognition
Facial emotion recognition (FER) aims to identify and distinguish human
emotional states (joy, anger, fear, disgust, sadness, etc.) through
analysis of facial images and/or videos, employing machine learning and
artificial intelligence technologies. Its applications span healthcare,
education, advertising, public safety, social media, and entertainment.
In recent years, deep network models, especially convolutional neural
networks (CNNs), have revolutionized FER, outperforming traditional
machine learning approaches that rely on manual feature engineering and
simple algorithms. Nevertheless, achieving highly accurate FER remains
challenging, particularly in diverse application scenarios,
including highly personalized facial
expressions (impacted by socio-cultural contexts, individual
differences, etc.) and dynamic environments (impacted by lighting
conditions, camera angles, backgrounds, facial occlusions, etc.).
Therefore, it is essential to adapt FER models to diverse users and
environments. However, retraining deep network models demands a large
number of labeled samples and involves intensive computation to re-learn
millions of model parameters, leading to unacceptable delays in
real-world applications. One important research question is how to adapt
deep network models online and in real time for highly accurate emotion
recognition across diverse application scenarios.
Reeb Graph and Persistent Diagram of 3D Mesh Models
Faculty Mentor: Junyi Tu
Topological Data Analysis (TDA) has emerged as a new and promising field for processing, analyzing, and understanding complex data and has gained great impetus in the last two decades. TDA has been applied in machine learning, computer vision, drug design, computer graphics, and many other fields. The popularity of topology-based techniques is due in large part to their ability to capture the intrinsic properties of data, their robustness, and their applicability to a wide variety of datasets and scientific domains. The Reeb graph was originally proposed as a data structure to encode the geometric skeleton of 3D objects, but recently it has been re-purposed as an important tool in TDA. A Reeb graph encodes the evolution of level sets obtained from a scalar function by sweeping the entire domain space and tracking topology changes such as the birth and death of connected components in the level sets.
The scalar field on the 3D mesh model determines the shape of a Reeb graph.
In this project, students will explore different scalar fields on the 3D
mesh model. The first is a Gaussian distribution on the vertices
of the 3D mesh model, and the second is the geodesic distance integral. We
will deploy our algorithms on HPC machines to speed up the computing
process. After computing Reeb graphs, we will obtain the persistence
diagrams using the software. Another goal of this project is to gain a
better understanding of visualizations that relate Reeb graphs and persistence
diagrams, and to add an interaction interface to the visualization/user
study in a WebGL-enabled browser. The interaction interface will have the ability
to adjust the number of contours shown, reduce low persistence features
in Reeb graphs, and remove low persistence points in persistence
diagrams.
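The birth and death of connected components, and the removal of low persistence points, can be sketched in a few lines. The example below is a simplified stand-in, not a Reeb graph algorithm and not the project's software: it computes only the 0-dimensional sublevel-set persistence pairs of a scalar field given on the vertices of a graph (for a mesh, its vertex and edge lists), using a union-find sweep. The function names `sublevel_persistence` and `prune` and the toy data are assumptions made for illustration.

```python
import math

def sublevel_persistence(values, edges):
    """0-dimensional persistence pairs of a scalar field on a graph.

    values: one scalar per vertex; edges: list of (u, v) vertex-index pairs.
    Returns (birth, death) pairs; the oldest component of each connected
    piece never dies and gets death = infinity.
    """
    n = len(values)
    neighbors = [[] for _ in range(n)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    parent = [None] * n  # None means the sweep has not reached this vertex yet
    birth = {}           # component root -> scalar value at which it was born

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    pairs = []
    # Sweep the vertices in order of increasing scalar value.
    for v in sorted(range(n), key=lambda i: values[i]):
        parent[v] = v
        birth[v] = values[v]
        for w in neighbors[v]:
            if parent[w] is None:
                continue
            rv, rw = find(v), find(w)
            if rv == rw:
                continue
            # Elder rule: the component born later dies at the merge value.
            young, old = (rv, rw) if birth[rv] > birth[rw] else (rw, rv)
            pairs.append((birth[young], values[v]))
            parent[young] = old
    roots = {find(i) for i in range(n)}
    pairs.extend((birth[r], math.inf) for r in roots)
    return pairs

def prune(pairs, min_persistence):
    """Keep only the points whose persistence (death - birth) is large enough."""
    return [(b, d) for (b, d) in pairs if d - b >= min_persistence]

# Toy scalar field on a path of five vertices: minima at vertices 0, 2, and 4.
values = [0, 3, 1, 5, 2]
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
diagram = prune(sublevel_persistence(values, edges), min_persistence=2.5)
print(sorted(diagram))
```

In this toy field the minimum at vertex 2 merges into the older component at value 3 (persistence 2) and is pruned, while the minimum at vertex 4 survives until value 5 (persistence 3) and is kept, along with the component that never dies.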
Illustrating n-tuple graphs and their internally disjoint paths
Faculty Mentor: Alexander Halperin
Can subcollections of nodes be used as coordinates for an even larger
structure? Consider a graph G, which is a collection of points connected
by lines, whose vertices have labels 1 through k. Now, imagine an
n-tuple graph U_n(G), whose points correspond to n-element subsets of
{1,…,k} and whose lines connect nearly identical subsets. While small
examples can be done by hand, illustrating U_n(G) for large G is
nearly impossible without HPC because of the large number of points and
lines. It was recently shown that each pair of points in U_n(G) has
(n-1)(d-n+1)+1 internally disjoint paths between them. Viewing these
(n-1)(d-n+1)+1 internally disjoint paths would provide insight into
their structure and symmetry throughout U_n(G).
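The construction of the n-tuple graph can be sketched as follows. The text above does not spell out what "nearly identical" means, so this sketch assumes one common reading: two n-element subsets are joined when they differ in exactly one element each and the two differing labels are joined by a line of G. The function name `n_tuple_graph` and the example graph are hypothetical.

```python
from itertools import combinations
from math import comb

def n_tuple_graph(vertices, edges, n):
    """Adjacency sets of the n-tuple graph of G, under the assumed adjacency rule."""
    edge_set = {frozenset(e) for e in edges}
    nodes = [frozenset(c) for c in combinations(sorted(vertices), n)]
    adjacency = {node: set() for node in nodes}
    for a, b in combinations(nodes, 2):
        # a ^ b is the symmetric difference; it is a line of G exactly when the
        # subsets swap one label for an adjacent one (assumed adjacency rule).
        if a ^ b in edge_set:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency

# Small example: G is the path 1 - 2 - 3 and n = 2.
graph = n_tuple_graph([1, 2, 3], [(1, 2), (2, 3)], n=2)
for node, nbrs in graph.items():
    print(sorted(node), "->", sorted(sorted(x) for x in nbrs))

# The number of points is "k choose n", which grows quickly with k.
print(comb(20, 10))
```

This brute-force version compares every pair of subsets, which is only practical for small k. Already for k = 20 and n = 10 there are 184,756 points, which is where parallel construction and layout on HPC resources become relevant.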