
USF CS Night 2016: Student Projects

posted Dec 8, 2016, 10:34 AM by Rosa Maria Garay [ updated Dec 13, 2016, 12:05 PM ]
Following are some of the student projects presented at USF CS Night 2016. Projects came from
the Master's and Senior Project courses, the Bioinformatics course, and some faculty-sponsored projects.

Using Animation to Alleviate Overdraw in Multi-class Scatterplot Matrices

Presenter:                 Helen Chen (hhchen@dons.usfca.edu)

Faculty Advisors:      Profs. Sophie Engle and Alark Joshi

Abstract:
Scatterplots are a widely-used technique for visualizing multivariate datasets. Even though scatterplots play an important role in data visualization,
they have known issues with overdraw. Overdraw occurs when points or glyphs are drawn on top of each other and obscure the underlying data.
Overdraw affects the ability of viewers to correctly understand the data distribution and discern relationships among subgroups of the data.
There are a variety of techniques for alleviating overdraw, none of which involve animation. Our research aims to use animation to visualize
multidimensional data for multi-class scatterplot matrices and compare its efficacy in alleviating overdraw against that of other techniques.

Student Record Verification App - A Decentralized Application

Presenters:                Mayank Thirani
Ryan Zhu
Jakob Tarnow

Sponsor:                     Jim Huang

Faculty Advisor:      CS690 Master's Project, Prof. Olga Karpenko

Abstract:
The Student Record Verification App uses the characteristics of Blockchain to solve a trust problem: recruiters in human resource
departments around the world need to verify the authenticity of applicants' claimed educational degrees without relying on an intermediary third party
to perform that verification. Each transaction in the Blockchain is verified by consensus of a majority of the participants in the network. The Blockchain contains
all verifiable student records and past transactions. Allowing a recruiter to validate via Blockchain that a student’s educational background matches the records of the respective
registrar removes the need for a central entity and leads to faster attestation of student records.
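
To illustrate how a recruiter could check a claimed record against a tamper-evident chain, here is a minimal sketch. It assumes the registrar stores only a SHA-256 hash of each record in a simple hash-linked list; the function names, the sample record, and the data layout are hypothetical, and the sketch does not model network consensus or the app's actual design.

```python
import hashlib
import json


def digest(data):
    """SHA-256 hex digest of a JSON-serializable object with stable key order."""
    return hashlib.sha256(json.dumps(data, sort_keys=True).encode()).hexdigest()


def add_block(chain, record):
    """Append a block holding the record's hash and a link to the previous block."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = {"record_hash": digest(record), "prev_hash": prev_hash}
    chain.append({**body, "hash": digest(body)})


def chain_is_valid(chain):
    """Check that every block's own hash and back-link are consistent."""
    prev_hash = "0" * 64
    for block in chain:
        body = {"record_hash": block["record_hash"], "prev_hash": block["prev_hash"]}
        if block["prev_hash"] != prev_hash or block["hash"] != digest(body):
            return False
        prev_hash = block["hash"]
    return True


def verify_record(chain, claimed_record):
    """True if the claimed record's hash appears in a valid chain."""
    claimed = digest(claimed_record)
    return chain_is_valid(chain) and any(b["record_hash"] == claimed for b in chain)


# Hypothetical record published by a registrar (illustrative values only).
record = {"student": "A. Student", "degree": "MS Computer Science", "year": 2016}
chain = []
add_block(chain, record)
print(verify_record(chain, record))
```

Because only hashes are stored, a recruiter holding the applicant's claimed record can confirm it without the chain exposing the record itself, and any change to the claim or to a stored block makes the check fail.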

Ten-X Hackday Tool

Presenters:               Jeremiad Raymond
Teng Hu
Yi Xiao

Sponsor:                   Jon Rahoi

Faculty Advisor:        CS490 Senior Project, Prof. Jeffrey Johnson

Abstract:
Ten-X Hackday Tool is a web application designed as a platform for our sponsor company (Ten-X) to run an annual programming competition
called “Hackday”. This platform will be used to collect, store, and process data entered by participants and graders. The purpose of this project is to
provide a better platform for Hackday organizers to control the flow of Hackday by handling the most repetitive tasks.

Sudokil

Presenter:              Dominic Mortlock
Sponsor:                The client would be students interested in learning more about Unix and scripting concepts. The game also aims to be a fun
challenge in which people can practice both their Unix knowledge and their problem-solving skills.

Faculty Advisor:    Prof. David Wolber

Abstract:
Sudokil is a hacking/scripting themed puzzle game about using Unix-like commands on a terminal to control computers, robots, and various other devices.
Progress through levels and get access to different puzzle elements while collecting more scripts, permissions and tools.

Customer Ticket Classification Engine - Applying Machine Learning Algorithms to SnapLogic Metadata

Presenters:              Min Chen
Shiyi Tan

Sponsor:                  Prof. Gregory Benson

Faculty Advisor:     CS690 Master's Project, Prof. Karpenko

Abstract:
The SnapLogic customer service team needs to prioritize customer tickets and measure customer satisfaction, tasks that were previously done manually and
were very time consuming. To automate this process, we built two engines, one for prioritizing tickets and one for the sentiment analysis of customer
comments. We first analyzed the ticket data and fit the two models, then used the models to predict the ticket priority and the sentiment of the comment
(“neutral” vs “negative”). If the ticket is labeled as “high priority” or contains a negative comment, our system sends an alert to the customer support team.
This allows the team to handle interactions with customers more effectively and saves them time.

Snap Recommendation Engine

Presenter:               Thanawut Ananpiriyakul

Sponsor:                 Prof. Gregory Benson

Faculty Advisor:     CS690 Master's Project, Prof. Karpenko

Abstract:
SnapLogic has been providing data integration services for years. A snap is a pre-built component that performs an operation on data. A pipeline is a
directed acyclic graph (DAG) of snaps that executes a specific task. In order to successfully build a pipeline, the user needs to select the right snap and connect it correctly
to the previous snap. For this project, we built an engine that recommends the most likely next snaps to users. We achieved an 88% hit rate in the final prototype,
implemented in Python. This means that 88% of the time, "deciding on the type of snap + searching for it among 100 types of snaps + dragging and dropping it to
the canvas" will be reduced to "1 click."
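
As a rough illustration of what a hit rate means for next-snap recommendation, the sketch below counts which snaps most often follow each snap in some example pipelines, then checks how often the true next snap appears among the top-k suggestions. The snap names, the sample pipelines, and the frequency-count approach are all assumptions for illustration; the abstract does not describe the model the project actually used.

```python
from collections import Counter, defaultdict


def train(pipelines):
    """Count how often each snap follows another across the given pipelines."""
    follows = defaultdict(Counter)
    for pipeline in pipelines:
        for prev, nxt in zip(pipeline, pipeline[1:]):
            follows[prev][nxt] += 1
    return follows


def recommend(follows, prev, k=3):
    """Return up to k snaps most frequently seen right after `prev`."""
    return [snap for snap, _ in follows.get(prev, Counter()).most_common(k)]


def hit_rate(follows, pipelines, k=3):
    """Fraction of transitions whose true next snap is in the top-k list."""
    hits = total = 0
    for pipeline in pipelines:
        for prev, nxt in zip(pipeline, pipeline[1:]):
            total += 1
            hits += nxt in recommend(follows, prev, k)
    return hits / total if total else 0.0


# Hypothetical pipelines, each a linear sequence of snap types.
TRAIN = [
    ["Reader", "Mapper", "Writer"],
    ["Reader", "Filter", "Writer"],
    ["Reader", "Mapper", "Filter", "Writer"],
]
TEST = [
    ["Reader", "Mapper", "Writer"],
    ["Reader", "Sort", "Writer"],
]

follows = train(TRAIN)
print(recommend(follows, "Reader", k=2))
print(hit_rate(follows, TEST, k=2))
```

A transition counts as a hit when the snap the user actually chose is among the suggestions, which is the case where searching and dragging could be replaced by a single click.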

My Smart Financial Advisor - A Mobile Application for Mutual Fund Investment Management

Presenters:                Richard Wang
Chen-Ning Chi
Kaynat Quayyum

Sponsor:                    Stephen Y. Pak, The Core Group

Faculty Advisor:         CS690 Master's Project, Prof. Olga Karpenko

Abstract:
Mutual fund investment currently makes up a vast proportion of the retirement assets for Americans. At the same time, as mobile devices attain increasing
capabilities and popularity, more people switch from PC to mobile devices such as tablet computers and smartphones. We provide a platform to buy and sell
mutual fund shares on both iOS and Android devices. This enables users to manage mutual fund investment anywhere and anytime. Our application is
implemented in C# using Xamarin, which allows us to build iOS and Android apps from a single shared codebase. Our app provides a good user experience
with a high level of security.

An Exploration of Single Nucleotide Polymorphisms on Type 2 Diabetes Outcome

Presenters:                Michael Totagrande
Irina Popova

Sponsors:                   Sean Kimbro, North Carolina Central University, and La Creis Kidd, University of Louisville

Faculty Advisor:         CS640 Bioinformatics, Professor Patricia Francis-Lyon

Abstract:
Type 2 Diabetes (T2D) affects millions and is characterized by the inability to produce enough insulin, resulting in improper glucose regulation. With
numerous direct risk factors, including increased body mass index (BMI), race, high blood pressure, and the presence or absence of certain single
nucleotide polymorphisms (SNPs), T2D is a complicated disease. Herein, we explore over 600,000 SNP frequencies for more than 2000 individuals
in order to determine their impact on T2D outcome.

Using Face Tracking for Computationally-Efficient Visualization of Large Vector Data

Presenter:                Thanawut Ananpiriyakul

Faculty Advisor:       Prof. Alark Joshi

Abstract:
Visualizing large vector data is computationally expensive. Given that human beings can only view a certain region of a screen at a time, we have developed
a novel face tracking-based technique for visualization of large vector data. This focus+context visualization of vector fields reduces visual clutter and helps the
user visualize features of interest. We chose to use streamline and glyph-based methods to represent the vector data. Users can interact with the data in real time,
choosing regions of interest through a mouse, a touch interface, or their face. The presented visualization technique results in a frame rate that is almost 5 times
higher than that of the full-detail visualization of vector data.

Exploring Leap Motion for Intuitive Interaction of Scientific Data

Presenter:                Shiyi Tan

Faculty Advisor:     Directed Study, Prof. Alark Joshi

Abstract:
We explore the use of the Leap Motion for intuitive interaction with medical data, aiming to help practitioners interact with large, high-resolution datasets.
We use VTK for the visualization pipeline that includes data processing, surface extraction/volume rendering, and basic user interaction. We facilitate
freeform interaction without the use of a mouse and keyboard using the Leap Motion. With the Leap Motion controller, users can explore the 3D data
and perform basic interactions such as rotation, translation, and zooming.

Computational Enzymology

Presenters:                Stephanie Martin
Meriam Vejiga
Adrian Ramirez

Sponsor:                    Distributed Bio is an antibody discovery, engineering, informatics and services company focused on
producing next-generation antibody libraries and revolutionary vaccines.

Faculty Advisor:         CS640 Bioinformatics, Professor Patricia Francis-Lyon

Abstract:
Enzymes are the original organic chemists (OCC), capable of catalyzing a wide variety of reactions that have great therapeutic potential. Many enzymes
have been cataloged and annotated using the Gene Ontology, a gene annotator's reference, and categorized by the Enzyme Commission, a database that
classifies enzymes based on the nature of their enzymatic activity. We took advantage of these two databases to mine for homology groups with similar
enzymatic activity, but different substrates. We characterized these enzyme groups by sequence variability and enzymatic variability. This work provides a
foundation for the creation of a new class of enzyme replacement therapy and for the creation of a new generalized synthesis technology.

Mechanistic Indicators of Childhood Asthma

Presenter:                Stephanie Styx

Sponsor:                   Dr. ClarLynda Williams-DeVane from North Carolina Central University sponsored the mechanistic indicators of childhood asthma project.
Her objective for this project is to identify key environmental exposure contributors to asthma subtypes of varying severity.

Faculty Advisor:        CS640 Bioinformatics, Professor Patricia Francis-Lyon

Abstract:
This project seeks to understand the relationship between environmental factors and their effect on asthmatic children. Through principal component analysis, we examined
how much variance there is when asthmatic children are exposed to similar or different environmental factors. The cohort of patients analyzed in
this project consisted of asthmatic African American children from Detroit, Michigan.

Computational Enzymology

Presenters:                Stephanie Martin
Meriam Vejiga
Adrian Ramirez

Sponsor:                   Dr. Jacob Glanville, former Principal Scientist at Pfizer, PhD in Computational and Systems Immunology at Stanford University School of Medicine
and current Chief Science Officer of Distributed Bio.

Faculty Advisor:         CS640 Bioinformatics, Professor Patricia Francis-Lyon

Abstract:
Knowing the three-dimensional structure of a protein is essential to understanding how the protein functions. Currently, protein structures are determined
through X-ray crystallography, which can be difficult and laborious for some projects. Dr. Jake Glanville, CSO of Distributed Bio, created a software package to
predict the structure of B cell and T cell receptors. This code draws on probabilistic alignment and hidden Markov modeling and uses HMMER 3.0 and the
NCBI BLAST toolkit to identify potential templates for homology modeling, then generates a model using UCSF’s Modeller. We tested the ability of the script
pdb-getModels.pl to produce accurate models by using an input, self-models=0, to remove any template with more than 95% identity to the query sequence,
ensuring the program didn’t fetch the known crystal structure of the query for homology modeling. After generating hundreds of models, we used another script,
rmsd-Calculate.py, to calculate the root-mean-square deviation (RMSD) of each generated model superimposed on the published structure to validate whether
this package has the potential to accurately predict the variable regions of antibodies.
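
The RMSD comparison can be illustrated with a short sketch. It assumes the model and the published structure have already been superimposed and that their atoms are paired one-to-one; the function name and the toy coordinates are hypothetical, and this is not the code of rmsd-Calculate.py.

```python
import math


def rmsd(model, reference):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) points."""
    if len(model) != len(reference) or not model:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    # Sum of squared distances between each pair of corresponding atoms.
    squared = sum(
        (a - b) ** 2
        for p, q in zip(model, reference)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(model))


# Toy coordinates in angstroms, already superimposed (illustrative values only).
published = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
predicted = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
print(rmsd(predicted, published))
```

A lower RMSD means the predicted atoms sit closer to their positions in the published structure, so the value gives a single number for comparing many generated models.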

Ozone exposure causes differential expression of genes involved in cell growth and DNA binding

Presenters:                Chelsea Yee,
Amrita Rishi

Sponsor:                    Dr. Mehrdad Arjomandi

Faculty Advisor:        CS640 Bioinformatics, Professor Patricia Francis-Lyon

Abstract:
Ozone, a gas with high oxidation potential, is a major component of air pollution and has been found to damage the respiratory tissues in humans.
To our knowledge, no one has yet published the results of an exacerbation study utilizing ozone as a model for the impact of air pollutants. An ongoing
study by Dr. Mehrdad Arjomandi and associates at UCSF aims to establish the impact of ozone-induced injury and inflammation in asthma and other
lung diseases. Currently, this study aims to determine the differentially expressed genes (DEGs) in subjects, both with and without asthma, that were
exposed to medium (100 ppm) and ambient (200 ppm) levels of ozone. Gene expression levels for 18 subjects were determined by Affymetrix microarray
in an ozone-exacerbation study performed by Dr. Arjomandi’s team at UCSF. In partnership with Dr. Arjomandi and Prof. Francis-Lyon (USF), our team
performed statistical analysis of the Affymetrix microarray data in R using the limma package to identify DEGs in airway epithelial tissues in response to
ozone exposure. A total of 68 DEGs was determined from the Affymetrix microarray data for all 18 patients. Among the 68 DEGs, 4 were more frequently
differentially expressed (adjusted p-value < 0.1): MAPRE3, HKR1, MOB3B and ZFR. Genes MAPRE3 and HKR1 were up-regulated whereas MOB3B and
ZFR were down-regulated. Further studies providing new knowledge of the function and downstream effects of these genes can lead to the possibility of
new gene therapy and pharmacological targets.

Muse Mobile App

Presenter:                MD Naseem Ashraf

Faculty Advisor:       CS640 Bioinformatics, Professor Patricia Francis-Lyon

Abstract:
An Android app that leverages Muse headbands to record and transmit EEGs from mobile devices easily and quickly.

AI for Princes of California

Presenters:                Kyle Baker
Austin Bushree
Cole Howard

Sponsors:                Jon Rahoi and Justin Sher, Noo Games

Faculty Advisor:        CS490 Senior Project, Prof. Jeffrey Johnson

Abstract:
Princes of California is a strategic board game that is similar to a hybrid of Monopoly and Poker. A single turn consists of playing a tile on the board and
buying up to three shares of any companies that have been built from the tiles on the board. The current built-in opponent makes random moves and is
easily defeated by human players. Our project seeks to use multiple techniques to build a competitive AI opponent for this game.
We will be implementing a heuristic algorithm based on the strategies we have developed while playing the game. We will also be fine-tuning a neural
network using TensorFlow, an open source machine learning package. The network will be trained by playing against random bots. The gameplay tactics
change based on the number of players, so our AI will be trained separately for 2, 3, 4, 5, and 6 player games. The ultimate goal of our project is to build an
AI that strategically places tiles and buys shares of companies to create an entertaining opponent for online players.

Fitness App for Vue Smart Glasses

Presenters:                Scott Zhu
Ji Lu
Shengcai Cheng

Sponsor:                    Jason Gui, Vigo Technologies

Faculty Advisor:        CS690 Master's Project, Prof. Olga Karpenko

Abstract:
Vue is a wearable device, a pair of “smart” glasses designed for everyday use. Our team developed the companion app for Vue on iOS and Android.
The app provides fitness features such as step tracking, calorie counting and inactivity alerts that help people lead healthier and more active lives. It
also provides additional features such as finding the device using the app and delivering notifications.

Visualization of Hierarchical Time-Series Data Using the Sunburst Technique

Presenters:                Joey Estella,
Marissa Masangcay,
Lyndon Ong Yiu,
Mohammad Bazarbay

Sponsors:                  Profs. Sophie Engle and Alark Joshi

Advisor:                      CS490 Senior Project, Prof. Jeffrey Johnson

Abstract:
The Visualizing Time Series Data project addresses the need for new ways to visualize aggregated large time series data. Often, data is
too large to interpret in its raw form. The problem then becomes how to aggregate that data, i.e., what kind of metrics (mean, median, etc.) and levels (days,
hours, etc.) need to be used to summarize the data and see its patterns and trends.
Our visualization tool attempts to address this problem. Our tool features an interactive dashboard where users are able to view organized data in a
sunburst visualization that displays the data in a meaningful way. Included in the interactive dashboard is an interactive sunburst visualization with a
complementary line chart that corresponds to data in the sunburst. This tool gives added context to otherwise ‘normal’ looking data so that the user
can draw meaningful conclusions about the data at hand. This tool also features a non-interactive dashboard where various static sunburst
visualizations are displayed in a grid for easy comparisons. These static visualizations feature multiple metrics (mean, median, etc.) across different levels
(days, hours, etc.).
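
The choice of metric and level can be made concrete with a small sketch that groups timestamped values by a level and summarizes each group with a metric. The function name, the level and metric tables, and the sample readings are assumptions for illustration, not the project's implementation.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, median

# Hypothetical key functions that map a timestamp to an aggregation level.
LEVELS = {
    "day": lambda ts: ts.strftime("%Y-%m-%d"),
    "hour": lambda ts: ts.strftime("%Y-%m-%d %H:00"),
}
METRICS = {"mean": mean, "median": median}


def aggregate(readings, level, metric):
    """Group (timestamp, value) pairs by level and summarize each group."""
    groups = defaultdict(list)
    for ts, value in readings:
        groups[LEVELS[level](ts)].append(value)
    return {key: METRICS[metric](values) for key, values in sorted(groups.items())}


# Illustrative readings only.
readings = [
    (datetime(2016, 12, 1, 9, 15), 10.0),
    (datetime(2016, 12, 1, 9, 45), 20.0),
    (datetime(2016, 12, 1, 17, 30), 60.0),
    (datetime(2016, 12, 2, 8, 0), 5.0),
]

for level in LEVELS:
    for metric in METRICS:
        print(level, metric, aggregate(readings, level, metric))
```

Running every level and metric combination over the same readings mirrors the grid of static views described above, where the same data is compared across different summaries.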

Real Estate Recommendation Engine

Presenters:                Rob Reeves,
Simon Kwong,
Zhe Xu

Sponsor:                    Jon Rahoi, Ten-X

Faculty Advisor:        CS690 Master's Project, Prof. Olga Karpenko

Abstract:
Buying or selling a property is not something most of us do every day. But when the time comes, searching for a new home or office is exhausting. It's stressful,
agitating, and searchers often find themselves settling for less. We developed a recommendation engine for a Ten-X real estate marketplace that will help
alleviate some of that stress. The engine recommends available properties based on past user activity. It uses a graph-based recommendation algorithm that
combines collaborative and content-based filtering. The goal is to maximize the likelihood that a user will interact with the recommended properties embedded in the ad.
