Graduates of the MS in Data Science Program
We are proud of our graduates. Their theses are available for download through the CCSU library.
- Hunter Wagner (2019), Feature Engineering Yelp Preview Data: Using Yelp to Predict Food Safety Infractions Cited by Toronto's DineSafe Program.
Hunter is a Senior Business Intelligence Analyst at USAA, Inc., in Washington, DC.
- W. David Dotson (2019), Analysis of Predictive Models in the Attribution of Text to Medieval Copyists
David is a Health Scientist at the Centers for Disease Control, Office of Public Health Genomics, Atlanta.
- Indu Kumari Bhatt (2018), Efficacy of Deep Learning in Image Classification
Indu is a Data Analyst / Data Scientist at a startup in San Francisco.
- Amy Wiseman (2018), Closed-end funds: Factors predicting premium-discount
- Patrick Keller (2018), Exploring Player Synergy Using Association Rules
Patrick is the Campus Director of Institutional Research for Quinebaug Valley CC and Three Rivers CC in Connecticut.
- Valeriia Ilginisova (2018), Applying Data-driven Error Costs to Predict the Default Rate in the Online Peer-to-peer Lending Platform LendingClub
Valeriia is a Data Scientist at Wheel the World, in Nairobi, Kenya.
- Zachary Felix (2018), Classifying College Basketball Players by Skillset
- Ramin Reybod (2018), Applying Data Mining on Relational Data and Networks Analysis with Emphasis on Link Prediction
Ramin is a Senior Analytics Consultant at CIBC, Inc., in Ontario, Canada.
- Albert James Viscio (2017), Diagnosis of Alzheimer's Disease Based on a Parsimonious Serum Autoantibody Biomarker Derived from Multivariate Feature Selection
James is a Data Scientist, Analytics and Research, at Travelers Insurance Company in Hartford, CT.
- Rajini Prabhu (2017), Predictive Modeling of Credit Card Default
- Nicholas Restifo (2016), Modeling the NBA Draft: Leveraging Data Mining Techniques to Project Future NBA Talent
Nick is a Basketball Analytics Scientist with the Minnesota Timberwolves.
- Kevin Bahr (2016), Text Classification and Topic Modeling of Local Newspaper Articles from Local Newspaper RSS Feeds.
- Jim Eggers (2016), Consumer Sentiment Analysis of Twitter: A Multi-Industry Analysis.
- Brian Ironside (2016), Improving the Performance of Ensemble Classifier Models through the Local Specialization of Base Classifiers.
- James Cunningham (2016), Employing Data Mining Methods to Assess the Efficacy of Classical Credit Risk Models.
- Cathy Farrell (2016), Trinary Predictive Classification of Diabetic Episode Recurrence.
- Philip Hickey (2016), Identifying the Attributes that Contribute to Success in the PGA Tour’s FedEx Cup Using ShotLink Data.
- Nicholas Restifo (2016), Modeling the NBA Draft: Predicting the Best Future NBA Players
- Rick Rountree (2015), Opinion Mining of Unstructured Text with Application to Extracted User Article Comment Text from the New York Times Website
- Marcos Martins-Souza (2015), Use of Parallel Computing to fit OLS Regression Models Using SAS
- Marc Glettenberg (2015), Predictive Classification of Player Skill Level Using Telemetric Data in the StarCraft 2 Video Game
- Xavier Antony (2015), Classification of Fetal Heart Rate Patterns
- Andrew Hendrickson (2015), Classification Modeling of Freddie Mac Home Loan Delinquency
- Jeffrey Richardson (2014), Applying Association Rules to Optimize Enterprise Software Development
- Jyota Snyder (2014), Mining Gene Expression Data Generated by Next-Generation Sequencing Technology
- Ben Dickman (2014), Using Data Mining to Predict Success in a Nursing Program
- Eric Flores-Acosta (2014), Implementation of Stability-Based Cluster Validation Using Prediction Strength
- John Almeida (2014), Using Predictive Analysis to Enhance the Efficiency of Commuter Transportation Networks
- Alex Bitiukov (2014), Implementation of Customer Lifetime Value Model in the Context of Financial Services.
- Richard Aceves (2014), Predicting Change in County-Level Presidential Election Voter Turnout Using Data Mining Methods
- Jill Willie (2014), Assessment of Similarity-Based Cluster Validation Methods
- Sairam Tadigadapa (2014), The Application of Decision Trees for Diagnosing Liver Disease
- Paolo Carbone (2014), Distributor Price Optimization Using Market Segmentation
- Abe Weston (2014), Predictive Modeling of Pay-per-Click Keywords Bid Value
- Jeffrey Allard (2013), An Application of Gradient Boosted Decision Trees and Random Forests to Prospect Direct Marketing Response Modeling
- Thomas Wilk, Jr. (2013), Improving Workplace Accident Fatality Classification Models with Text Mining and Ensemble Methods
- George DeVarennes (2013), Applying Cost Benefit Analysis to a Trinary Classification Model
- Daniel Aloi (2013), Using Crime Prediction Models to Aid Law Enforcement in Resource Allocation and Decision Making
- William E. Rowe (2013), Classifying Web Pages by Image Attributes
- Martin Couture (2013), Applying Data Mining Techniques in Classifying Personal Automobile Insurance Risk
- Steven Cultrera (2013), Analysis of the Impact of Weather on Runs Scored in Baseball Games at Fenway Park
- Kay Batta (2013), Applying Misclassification Costs to Ameliorate the False Positive Rate in Bioassay Screening
- Scott W. Burk, PhD (2012), Measuring Serial Emotional Content in the Enron Email Corpus
- Senthil Murugan (2012), Mining for Profitable Low-Risk Delta-Neutral Long Straddle Option Strategies
- Rajiv Sambisavan (2012), Modeling of Flight Delays
- Giancarlo Crocetti (2012), Topical Discovery of Web Content
- Malcolm Houtz (2012), Applying Natural Language Processing and Document Classification to Text Mining RSS Feeds in order to Classify Documents as Interesting or Not, to an Analyst at the Company, Alliant
- Edwin Rivera (2012), Anti-Money Laundering Behavior: Reducing the Number of Non-Productive Alerts in Structuring through Effective Data Mining.
Edwin is an Optimization Statistician and VP at Citigroup in Florida.
- Judith Gu (2012), Using Data Mining to Model Market Reaction to Corporate Earnings Announcements
- Sampson Adu-Poku (2012), Comparing Classification Algorithms in Data Mining.
Sampson is a Senior IT Project Consultant at UnitedHealth Group.
- Judith Spomer (2009), Latent Semantic Analysis and Classification Modeling in Applications for Social Movement Theory
- Thierry Vallaud (2009), Estimating Potential Customer Value Using Classification of Customer Data
- Donald Wedding, PhD (2009), Extending the Data Mining Software Packages SAS Enterprise Miner and SPSS Clementine to Handle Fuzzy Cluster Membership: Implementation with Examples
- Kathleen Alber (2007), Identifying Patterns of Potentially Preventable Emergency Department Utilization by American Children
- Steven Barbee (2007), The Discovery by Data Mining of Rogue Equipment in the Manufacture of Semiconductor Devices
- Eric Taylor (2005), Comparing Unsupervised Multivariate Normal Cluster Results between Datasets and Consolidating Similar Clusters
- James Steck (2005), NETPIX: A Method of Feature Selection Leading to Accurate Sentiment-Based Classification Models.
James completed all of the program coursework from his home in Washington state. He now works as a statistician at Nordstrom Corporation in Washington.
- Rafiqul Islam (2004), Knowledge Discovery in Microarray Data