Price Optimization Decision Support Tool (2015 – to date).
Developed predictive models using R and machine learning to estimate
parameters used in price optimization methodologies. Used optimization
algorithms to find optimal pricing strategies for multiple products
considering elasticities, discounts, variable costs and other parameters.
Created a proof of concept for a Decision Support Tool used for
scenario-based optimization problems.
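
As a rough illustration of how an elasticity and a variable cost can drive a
price recommendation, the sketch below assumes a single product with
constant-elasticity demand. The demand form, the function names, and the
numbers are assumptions made for this example and are not taken from the tool
itself, which handled multiple products, discounts, and other parameters.

```python
def demand(price, scale, elasticity):
    """Constant-elasticity demand (assumed form): q = scale * price**(-elasticity)."""
    return scale * price ** (-elasticity)


def profit(price, unit_cost, scale, elasticity):
    """Profit is the margin per unit times the units sold at this price."""
    return (price - unit_cost) * demand(price, scale, elasticity)


def best_price_on_grid(unit_cost, scale, elasticity, lo, hi, steps=10_000):
    """Scan evenly spaced candidate prices and keep the most profitable one."""
    candidates = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(candidates, key=lambda p: profit(p, unit_cost, scale, elasticity))


def closed_form_price(unit_cost, elasticity):
    """Analytic optimum for constant-elasticity demand; requires elasticity > 1."""
    if elasticity <= 1:
        raise ValueError("no finite profit-maximizing price when elasticity <= 1")
    return unit_cost * elasticity / (elasticity - 1)


if __name__ == "__main__":
    # Hypothetical inputs: unit cost 4.0, elasticity 2.0, demand scale 100.0.
    print(closed_form_price(4.0, 2.0))
    print(best_price_on_grid(4.0, 100.0, 2.0, lo=4.0, hi=20.0))
```

The grid search is included only as a cross-check on the closed-form price; a
multi-product problem with cross-elasticities would generally need a
numerical optimizer instead.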
Software Engineering Defect Prediction (2015 – to date).
Developed predictive models using R and machine learning that use code
metrics and code process metrics to predict whether a software product is
defective. Trained supervised classification algorithms on code metrics data
to help the client anticipate which new software products could be defective.
The client is using the predictions to allocate software testing resources
more efficiently.
Consumer’s Sentiment Analysis of Popular Mobile Phone Brands using Social Media Data (2015 – to date).
Performed preliminary consumer sentiment analyses of popular mobile phone
brands using Twitter data. Sentiment analysis included the basic task of
determining the polarity (e.g., positive, negative, or neutral) of expressions
in the tweets. Beyond polarity, an attempt was made to classify the emotions
(e.g., joy, anger) expressed about the devices in general.
In addition to analyses about the devices as a whole, the datasets were
analyzed to try to determine polarity and emotions about specific device
features (camera, screen, etc.). Some interesting highlights from the results
include: the general polarity seems to be more positive than negative for all
devices, and negative emotions seem to be unimportant for all devices. This
preliminary study can be improved by analyzing larger data samples from
Twitter and other social media sources.
MSW DST Development (1994 - 2015).
Developed a
quantitative framework to aid in decision making for integrated municipal
solid waste (MSW) management. The MSW Decision Support Tool (MSW DST) uses a
flexible framework to represent many site-specific issues and considerations.
It incorporates both cost and environmental objectives. The environmental
objectives are defined in terms of life cycle inventories of energy,
emissions (carbon monoxide, fossil- and biomass-derived carbon dioxide,
nitrogen oxides, sulfur oxides, particulate matter [PM], and PM10), and
greenhouse gases associated with MSW management strategies. The application
of the MSW DST was demonstrated through realistic hypothetical case studies.
Several MSW management scenarios of typical interest to U.S. municipalities
were studied. Through these illustrative applications, the flexibility and
capabilities of the MSW DST were demonstrated.
The MSW DST has an optimization module that selects the best group of
technology options based on cost or environmental criteria. Developed the
mathematical model that constitutes the optimization core of the tool. This
model is a set of linear equations that serves as the input to a linear
programming (LP) solver. The first version of the MSW DST uses the commercial
LP solver CPLEX®. The MSW DST comprises multiple modules. The MSW models are
written in VB.NET to represent the objective functions and thousands of
constraints and decision variables in an LP formulation. The models use
object-oriented programming to represent the components of the optimization
problem as objects and to convert these abstractions into LP and Mathematical
Programming System (MPS) file formats in memory. The LP or MPS problem is
then loaded into CPLEX via dynamic link libraries (DLLs) to find an optimal
solution using the simplex algorithm. If the problem is found to be
infeasible, potential causes of the infeasibility are suggested. If a feasible
solution is found, the optimal decision variables are rearranged and
interpreted in terms of the subject matter objects, and reports describing
the optimal solution are created.
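
The idea of holding the problem as objects and rendering it as LP-format text
in memory can be sketched as follows. This is a minimal illustration in
Python rather than the VB.NET used in the tool; the class names, the variable
names (landfill, recycle), and the coefficients are hypothetical, and only a
small subset of the LP file format (objective, constraints, End) is produced.

```python
from dataclasses import dataclass, field


@dataclass
class Constraint:
    name: str
    coeffs: dict   # variable name -> coefficient
    sense: str     # one of "<=", ">=", "="
    rhs: float


@dataclass
class LinearProgram:
    objective: dict                                   # variable name -> cost coefficient
    constraints: list = field(default_factory=list)   # list of Constraint objects

    @staticmethod
    def _expr(coeffs):
        """Render {'x': 2, 'y': -1} as '2 x - 1 y'."""
        parts = []
        for i, (var, c) in enumerate(coeffs.items()):
            sign = "-" if c < 0 else "+"
            term = f"{abs(c):g} {var}"
            # A leading non-negative term is written without an explicit sign.
            parts.append(term if i == 0 and c >= 0 else f"{sign} {term}")
        return " ".join(parts)

    def to_lp_text(self):
        """Build the LP-format text as one in-memory string."""
        lines = ["Minimize", f" cost: {self._expr(self.objective)}", "Subject To"]
        for con in self.constraints:
            lines.append(f" {con.name}: {self._expr(con.coeffs)} {con.sense} {con.rhs:g}")
        lines.append("End")
        return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical two-option waste problem: all 100 units must be handled.
    lp = LinearProgram(objective={"landfill": 30, "recycle": 45})
    lp.constraints.append(Constraint("mass_balance", {"landfill": 1, "recycle": 1}, "=", 100))
    lp.constraints.append(Constraint("landfill_cap", {"landfill": 1}, "<=", 70))
    print(lp.to_lp_text())
```

Keeping the model as objects until the last step means that reports can map
solver output back to named constraints and variables, which is the role the
paragraph above assigns to the subject matter objects.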
The MSW DST also includes multi-objective optimization capabilities for
choosing among competing objective functions such as cost, environmental
emissions, energy consumption, and recycling levels. The CPLEX DLL engine was
used repeatedly to obtain the Pareto surface for convex multi-objective
instances. Additionally, the CPLEX DLL engine was used to obtain near-optimal
solutions for a specific objective function. The Modeling to Generate
Alternatives (MGA) methodology was used to alter the LP formulation submitted
to the CPLEX DLL to obtain multiple interesting near-optimal solutions.
Impact: Reduced waste management and engineering costs for business operations
and municipalities with improved decision making in technology adoption,
potential new markets and regulation compliance.
Topological Insulators for Meso Dynamic Architectures (2014 - 2015).
Performed experimental designs for a project to study the metalorganic
chemical vapor deposition (MOCVD) growth of ultrathin (≤ 300 nm)
Bi0.1Sb1.9Te3 films. This semiconductor was being explored for its potential
as an efficient thermoelectric material for refrigeration or portable power
generation. In this project, a series of statistically designed experiments
(SDEs) was conducted to optimize the Bi0.1Sb1.9Te3 growth process. In these
experiments, several material properties were tracked (mobility, resistivity,
carrier concentration, Seebeck coefficient, film thickness, growth rate,
elemental percentage, surface morphology); however, the primary focus was
power factor (measured in µW/K²-cm).
Options for Sustainable Waste Management in the City of Durham (2014 - 2015).
Developed a sustainable waste management system to reduce the resources expended by the City of Durham to manage its waste while minimizing impacts to health and the environment. The system shifts the view of waste from unusable materials to valuable commodities that can be used to grow industries and associated jobs. Current sustainability programs are expected to result in many benefits, including decreased use of virgin materials in products or processes, economic development opportunities for material recyclers, and social benefits. Beyond these benefits, other (and perhaps unforeseen) economic, social, and environmental impacts may result from new municipal solid waste (MSW) management strategies. Thus, decision makers must balance the objective of promoting sustainable waste management with the need to protect human health and the environment and to minimize any negative economic or social impacts. The MSW Decision Support Tool (MSW DST) was used for this study. The study profiled the current solid waste operations and infrastructure provided by the City of Durham and summarized results from the analyses of targeted waste management options and strategies that were defined in collaboration with City of Durham staff.
Smart Grid Data and Electric Power Load Forecasting (2013 - 2015).
Developed accurate models for electric power load
forecasting that are essential to the operation and planning of a utility
company. These load forecasting models help electric utilities make important
decisions, including purchasing and generating electric power, load switching,
and infrastructure development. Developed methodologies using agent-based
model simulations and synthetic populations that could help develop a new
electric forecasting paradigm. In this new paradigm, the forecast of future
electricity consumption quantities and geographical locations could be
analyzed in concurrent rather than separate models. The “how much,” “when,”
and “where” could be simulated and answered at once in one combined
simulation.
System Reliability Model for Solid State Lighting (SSL) Luminaires (2011 - 2015).
Developed reliability model and accelerated life testing (ALT) methodologies
for predicting the lifetime of integrated SSL luminaires. Standard SSL test
methods, including Illuminating Engineering Society LM-79-08 and LM-80-08,
were used to evaluate luminaire and component performance. An initial
reliability model based on assumed Arrhenius behavior was built. In the
absence of comparable datasets, initial ALT studies were conducted using the
Joint Electron Device Engineering Council’s standard test methods.
Temperature, relative humidity, particle ingress, and atmospheric pollutant
exposure were used as environmental stressors. Statistically valid sample
sets, based on the assumed Arrhenius behavior, were used in this initial
study. Phase II of this project created a multivariable reliability model
based on measured statistical distributions of experimental values and
degradation factors, with greatly improved accuracy over the initial model.
This model was created by statistical analysis of the experiment data obtained
during Phase I and includes the effects of environmental stressors on system
reliability. ALT methodologies were refined with additional environmental
stressors, including step-stress methodologies, to significantly reduce test
duration. The multivariable reliability model was refined through additional
ALT studies using these modified techniques. The model was validated by
performing additional ALTs, including lumen maintenance and system
reliability testing, on select luminaires. The final outcome from this project
was a multivariable reliability prediction tool for SSL luminaires and new ALT
methodologies for evaluating the system performance of SSL luminaires in less
than 3,000 hours of testing. Designed and developed reliability models,
Kaplan-Meier models, and Arrhenius models. Used multivariate regression
models, statistical learning, and cluster analysis.
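
The assumed Arrhenius behavior mentioned above is commonly expressed as an
acceleration factor between a stress temperature and a use temperature. The
sketch below shows that standard relationship only; the activation energy and
temperatures are hypothetical placeholders, not values from the project, and
the project's multivariable model also covered stressors (humidity, particle
ingress, pollutants) that this temperature-only formula ignores.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def acceleration_factor(activation_energy_ev, use_temp_c, stress_temp_c):
    """Arrhenius acceleration factor: exp((Ea / k) * (1/T_use - 1/T_stress))."""
    use_k = use_temp_c + 273.15        # convert Celsius to kelvin
    stress_k = stress_temp_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1 / use_k - 1 / stress_k)
    return math.exp(exponent)


def projected_use_hours(test_hours, activation_energy_ev, use_temp_c, stress_temp_c):
    """Scale hours accumulated under stress to equivalent hours at use conditions."""
    return test_hours * acceleration_factor(activation_energy_ev, use_temp_c, stress_temp_c)


if __name__ == "__main__":
    # Hypothetical inputs: Ea = 0.7 eV, use at 25 C, stress at 85 C.
    print(acceleration_factor(0.7, 25.0, 85.0))
```

The result is very sensitive to the assumed activation energy, which is one
reason a model based on measured distributions can improve on an initial
model built from assumed Arrhenius behavior.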
Impact of Genomics and Personalized Medicine on the Cost-effectiveness of Preventing and Screening for Breast Cancer in Younger Women (2011 - 2015).
Developed mathematical models to compare the costs and benefits of
personalized medicine approaches and to identify the most cost-effective ways
to screen younger women for increased risk of developing breast cancer.
Results from this study can be used to address critical questions related to
which new technologies are more likely to be cost-effective for screening
young women; threshold values for these new technologies to be cost-effective;
the feasibility of these technologies in the real-world clinical setting; the
impact of genomics-based screening technologies on the current screening
pathways; and the costs and benefits of initiating genomics testing at
specific age thresholds. The results from this modeling study provide
important evidence for developing guidelines and recommendations related to
breast cancer screening programs for young women.
Worked with the Centers for Disease Control and Prevention to study the impact
of personalized medicine on the cost-effectiveness of screening young women
for breast cancer using an agent-based model to simulate individual behaviors
and interactions and to assess their collective impacts at the population
level. Designed and developed agent-based models in Repast Simphony, risk
assessment models, incidence and prevalence models, Gompertz growth models,
natural history models, screening models, and treatment models.
Time-Varying Factors Associated with Lipid Lowering Medications for Primary Prevention of Cardiovascular Disease (2013 to 2014).
Developed multi-state Markov models and micro-simulations with agent-based
models to predict and prevent cardiovascular disease (CVD). This project used
de-identified clinical data derived from electronic medical records adopted
and maintained since 1997 by Midwest Heart Specialists/Advocate Medical
Group, a 50-physician cardiology practice. The objectives were to determine
health status trajectories for primary prevention, starting with the
development of elevated levels of low-density lipoprotein cholesterol
(>100 mg/dL) and progressing through revascularization, in patients without
coronary artery disease at the start of observation; and to use the
information derived from a multi-state Markov model to construct a simulator
of CVD development, accounting for known CVD risk factors, that will be
useful in investigating predictive analytic questions. Designed and developed
agent-based models in Repast Simphony.
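
A multi-state Markov micro-simulation of this general kind can be sketched in
a few lines. The three states and the yearly transition probabilities below
are invented for illustration and are not estimates from the clinical data;
the project's models were built in Repast Simphony, whereas this sketch uses
only the Python standard library.

```python
import random

# Hypothetical yearly transition probabilities; NOT estimates from the study.
TRANSITIONS = {
    "normal_ldl": {"normal_ldl": 0.90, "elevated_ldl": 0.10},
    "elevated_ldl": {"elevated_ldl": 0.85, "normal_ldl": 0.10, "revascularization": 0.05},
    "revascularization": {"revascularization": 1.0},  # absorbing state
}


def step(state, rng):
    """Draw next year's state from the row of the current state."""
    options = TRANSITIONS[state]
    return rng.choices(list(options), weights=list(options.values()), k=1)[0]


def simulate_patient(start, years, rng):
    """Return one simulated trajectory, including the starting state."""
    path = [start]
    for _ in range(years):
        path.append(step(path[-1], rng))
    return path


def share_reaching(target, n_patients, years, seed=0):
    """Fraction of simulated patients whose trajectory ever visits `target`."""
    rng = random.Random(seed)
    hits = sum(
        target in simulate_patient("normal_ldl", years, rng)
        for _ in range(n_patients)
    )
    return hits / n_patients


if __name__ == "__main__":
    print(share_reaching("revascularization", n_patients=1000, years=20))
```

In a fuller simulator the transition probabilities would depend on each
simulated patient's risk factors rather than being fixed per state.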
Translational Cocaine Addiction: From Man to Mouse to Man (2012).
Developed an ontology-based network model of cocaine abuse and addiction. Drug addiction cannot be adequately addressed within a single discipline and instead requires a more comprehensive approach. First used mouse systems genetics to identify genes, gene networks, and pathways associated with cocaine dependence. As in previous studies with nicotine and heroin dependence, previously identified candidate genes for cocaine abuse phenotypes in humans and model animals were used to initiate the mouse systems genetics studies. The findings of these studies were then integrated with known environmental factors, such as drug availability, social stressors, peer support, and environmental exposures, to build an ontology-based network model of cocaine abuse and addiction using the Protégé ontology editor and framework. This model can be used to provide a framework for future cocaine-addiction studies.
Comparative Effectiveness of Alcohol Treatments (2011 to 2014).
Developed a predictive framework to improve the quality of comparative effectiveness research by identifying subpopulations most responsive to alcohol treatment. The framework provides an analysis flow that links theoretical, exploratory, predictive, and Markov models to fill gaps not addressed by any of these methods used separately. Designed and developed agent-based models for alcohol use and treatment and simulators associated with those models in Repast Simphony, random forest models, and survival analysis models.
Methods for Assessing Vulnerability and Resilience of Critical Infrastructure (IHSS Brief) (2010).
Developed an inclusive approach that incorporates physical, social,
organizational, economic, and environmental variables in addition to empirical
measurements and operationalization of resilience and vulnerability. The
objective was to help improve the understanding and management of risk
associated with threats to complex infrastructure systems. The framework uses
network theory, model-based vulnerability analysis, and reliability theory
(fault tree analysis).
Services Accountability Improvement System (SAIS) (2009 to 2011).
Developed and maintained data warehouses for the SAIS system. SAIS is a
service of the Substance Abuse and Mental Health Services Administration
(SAMHSA). SAIS is intended for use by the Center for Substance Abuse
Treatment’s (CSAT’s) Discretionary Services and Best Practices grantees, and
by SAMHSA and CSAT staff. SAIS was developed as part of the effort mandated by
the Government Performance and Results Act (GPRA) of 1993. GPRA is intended to
increase program effectiveness and public accountability by promoting a focus
on results, service quality, and customer satisfaction. Led the monitoring of
the SAIS IT infrastructure and helped develop the monitoring systems used.
Wrote several standard operating procedure (SOP) manuals including the SOP to
conduct the monitoring activities. Performed a variety of activities to
improve the SAIS SQL Server production database.
Violent Intent Modeling and Simulation (VIMS) (2009).
Applied agent-based modeling and cellular automata concepts to create a
prototype model and simulator based on published literature regarding civil
violence. VIMS was conceived by the Human Factors/Behavioral Sciences Division
of the U.S. Department of Homeland Security's Science and Technology
Directorate. The VIMS project team developed social science models as the core
of an analytic decision support tool to interpret the motivations and
behaviors of violent groups and identify factors indicating that a group may
engage in ideologically motivated violent activity. This task generated a
report for the VIMS project.
Modeling the Effectiveness of Hepatitis Vaccination When Accounting for Transmission Dynamics (2008).
Developed a compartmental
susceptible-exposed-infected-recovered model in MATLAB to study the
effectiveness of hepatitis A vaccination. Helped analyze the results from the
dynamic transmission model that accounts for natural declines in force of
infection, foreign sources of infection, and vaccination coverage rates.
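
A compartmental susceptible-exposed-infected-recovered model with vaccination
can be sketched as below. This is a simplified illustration in Python rather
than the MATLAB model: it uses fixed-step Euler integration, works with
population fractions, and moves vaccinated susceptibles directly to the
recovered (immune) compartment. The parameter names and values are
hypothetical, and the declining force of infection and foreign sources of
infection handled by the original model are left out.

```python
def simulate_seir(beta, sigma, gamma, vaccination_rate, days, dt=0.1,
                  s0=0.99, e0=0.0, i0=0.01, r0=0.0):
    """Integrate an SEIR model with vaccination using fixed-step Euler updates.

    beta: transmission rate, sigma: rate of leaving the exposed state,
    gamma: recovery rate, vaccination_rate: daily fraction of susceptibles
    vaccinated. Returns a list of (time, S, E, I, R) tuples.
    """
    s, e, i, r = s0, e0, i0, r0
    history = [(0.0, s, e, i, r)]
    steps = int(round(days / dt))
    for n in range(1, steps + 1):
        new_exposed = beta * s * i          # S -> E
        vaccinated = vaccination_rate * s   # S -> R (assumed fully protective)
        new_infectious = sigma * e          # E -> I
        recovered = gamma * i               # I -> R
        s += dt * (-new_exposed - vaccinated)
        e += dt * (new_exposed - new_infectious)
        i += dt * (new_infectious - recovered)
        r += dt * (recovered + vaccinated)
        history.append((n * dt, s, e, i, r))
    return history


if __name__ == "__main__":
    # Hypothetical rates per day; compare peak infected share with and without vaccination.
    base = simulate_seir(0.5, 0.2, 0.1, vaccination_rate=0.0, days=200)
    vax = simulate_seir(0.5, 0.2, 0.1, vaccination_rate=0.01, days=200)
    print(max(row[3] for row in base), max(row[3] for row in vax))
```

Comparing runs that differ only in `vaccination_rate` is the simplest way to
see the indirect protection that a dynamic transmission model captures and a
static model does not.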
Economic Issues in Seasonal Influenza Vaccination (2006 to 2007).
Developed a model that
estimates the likelihood that the influenza vaccination will result in
positive net benefits for several specific population subgroups. Used software
that accounts for uncertainty and variability in the impacts of influenza and
influenza vaccination, both across population subgroups and from one season to
the next. A key feature of this Monte Carlo–style simulation model is its
ability to use information on a range of possible values for influenza
severity and vaccine effectiveness to calculate a range of possible economic
impacts and the likelihood of occurrence for each, i.e., a distribution of
possible outcomes. This feature is important because influenza severity and
vaccine effectiveness are usually unknown early in the influenza season when
policymakers may be called upon to provide guidance on priority populations
for vaccination.
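
The Monte Carlo idea described above, drawing uncertain inputs from ranges
and reporting the share of draws with positive net benefits, can be sketched
as follows. The net-benefit formula, the uniform distributions, the use of
an attack rate as the severity input, and all numbers are assumptions made
for this example; they are not the model's actual structure or inputs.

```python
import random


def simulate_net_benefits(n_draws, cost_per_vaccination, cost_per_case,
                          attack_rate_range, effectiveness_range, seed=0):
    """Draw uncertain inputs and return the net benefit per person for each draw."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_draws):
        attack_rate = rng.uniform(*attack_rate_range)       # chance of illness if unvaccinated
        effectiveness = rng.uniform(*effectiveness_range)   # share of cases the vaccine prevents
        averted_cost = attack_rate * effectiveness * cost_per_case
        results.append(averted_cost - cost_per_vaccination)
    return results


def probability_positive(net_benefits):
    """Share of simulated seasons in which vaccination pays for itself."""
    return sum(nb > 0 for nb in net_benefits) / len(net_benefits)


if __name__ == "__main__":
    # Hypothetical subgroup: vaccination costs 20, a case costs 400.
    draws = simulate_net_benefits(10_000, 20.0, 400.0, (0.02, 0.20), (0.3, 0.8))
    print(probability_positive(draws))
```

Running the same function with different ranges for each population subgroup
gives one distribution of outcomes per subgroup, which is the kind of output
the paragraph above describes.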
Models for Infectious Disease Agent Study (MIDAS) (2005 to 2010).
Reviewed and analyzed existing models for the spread of infectious diseases.
MIDAS was funded by the National Institute of General Medical Sciences to
encourage development of infectious disease modeling to address a wide range
of possible infectious agents, explore a variety of possible responses, and
enhance the model interface, thereby making the modeling process
understandable and accessible by nonscientists exploring health policy
options. Developed information technology tools and performed modeling
analyses. Developed methicillin-resistant Staphylococcus aureus
(MRSA) agent-based models in collaboration with the University of
Pittsburgh and Harvard Medical School. Served as team leader in charge of
maintaining and improving the MIDAS portal. Designed the ORACLE
Ultrasearch-based search system and helped maintain the MIDAS Historic Data
and Document Catalog.
Risks to Watershed Health from Wildfires in the Western United States (2005 to 2006).
Developed a system to identify geographic areas at risk of catastrophic forest
fires. The system, called FORWARDWest, was a suite of Java-based tools whose
main goal was to give the end user the ability to weigh the different
parameters to isolate areas of interest. The core of this suite of tools was
the FORWARDWest slider toolbar. Once the sliders were set to the user’s
preference, a 1:24k USGS quad tile layer was scored to indicate geographic
risk based on the slider settings. A Java-based configuration program,
WestWardHO, gave the user the ability to introduce custom data to the
FORWARDWest interface by modifying the underlying XML configuration file.
Air Quality Modeling Decision Support Tool (2004 to 2005).
Developed an air-quality modeling decision support tool to help the city of
Beijing, China, analyze and improve air quality before the 2008 Olympic Games.
Designed management plans to improve air quality conditions. The decision
support tool included an air emissions database and a set of small programs to
link many existing air quality models designed by the U.S. Environmental
Protection Agency (EPA).
Development of Integrated Water Quality Analyses for the Shared Waters of the United States and Mexico (2003 to 2005).
Designed and developed a database to collect and store water quality data from
monitoring stations on the U.S.-Mexico Border. This project was funded by
EPA’s Border 2012 Program. The Border 2012 goal was to reduce water
contamination on the U.S.-Mexico border by collecting and analyzing water
quality data. These assessments of significant shared and transboundary
surface waters identified current water quality status and trends and helped
the United States and Mexico formulate water resource management strategies to
achieve, by 2012, a majority of water quality standards currently being
exceeded in those waters. Used EPA’s STOrage and RETrieval system (STORET)
water quality data dictionary and many EPA data standards. Performed analyses
on the water quality data stored in the repository to determine water quality
status and water quality trends on the U.S.-Mexico border. Coordinated a
binational group of stakeholders who represented U.S. and Mexican states,
federal and state agencies, and consortiums.
GUI Development for the Total Risk Integrated Methodology (TRIM) (2002 to 2005).
Developed a Java-based graphical user interface
(GUI) as part of EPA’s TRIM project. The GUI helped users perform ecological
hazard calculations for wildlife and species assemblages for spatially
explicit areas of interest; calculated hazards for acute, sub-chronic, and
chronic benchmarks; and calculated hazards for different endpoints. Inputs for
this model were time series of annual concentrations in abiotic media and time
series of average daily doses for biota.
Data Mining and Analysis Tool Development (2002 to 2004).
Developed a Web-based data mining and analysis tool to present environmental
project results more efficiently. The main purpose of this tool was to replace
large amounts of printed data tables that needed to be analyzed, summarized,
and delivered to the client. The back-end of the tool was an Oracle database,
and the front-end was a combination of JSP pages and applets to provide the
user with a GUI to perform analyses. With this tool, data could be made
available to the client in a format that was engaging and easy to understand;
users could ask follow-up questions rather than searching through the hardcopy
data tables; clients could view tables and graphics that were useful in
decision making; and clients could present results and benefits of a project
to upper management via the easily accessible Web site.
Programming in Support of EPA Reach Indexing Projects (2001 to 2005).
Developed an ORACLE PL/SQL package as a stored procedure and used
object-oriented approaches to handle batch indexing jobs for the EPA’s Surface
Water Reach Indexing tool (WebRIT). The package communicated and integrated
with other packages written to handle different functionalities for the
WebRIT. Used the batch-indexing package to perform several indexing jobs,
including indexing of Combined Sewer Overflow data, Drinking Water Initiative
data, and Clean Watersheds Needs Survey data. Wrote a series of small functions
in Oracle Spatial to manipulate georeferenced data related to water sources
for the EPA Total WATERS Project. This project created summaries of total
miles of waterbodies by state and by waterbody type.