I am a computer science Ph.D. graduate from Florida International University (FIU) where I was advised by Dr. Niki Pissinou and supported by the GEM Consortium with MIT Lincoln Laboratory, FIU Director's Fellowship, and the 2021 NSF Graduate Research Fellowship.
My doctoral research specialized in adversarial machine learning, where I aimed to provide defenses against imperceptible adversarial perturbations that exploit model uncertainty. Specifically, I analyzed various learning system components (i.e., training data transformations, activation function characteristics, and overall model complexity) to modify the loss surface and increase adversarial robustness. Additionally, from 2019-2023, I co-advised on research projects in adversarial machine learning for domain-specific applications such as the Internet of Things (IoT) and natural language processing (NLP) through the NSF-funded Research Experience for Undergraduates and Research Experience for Teachers programs.
I earned a B.S. in computer science and a B.S. in mathematics in May 2019 [Worlds Ahead Graduate] and an M.S. in telecommunications and networking from FIU in December 2022. Throughout my undergraduate career, my research primarily focused on creating reliable data management techniques for mobile wireless sensor networks. This research resulted in outstanding undergraduate research awards [CRA-E Finalist, FIU SCIS]. Additionally, I was fortunate to receive a merit-based scholarship [Great Minds in STEM] and participated in two summer research internships at MIT Lincoln Laboratory.
Increasing adversarial robustness around uncertain boundary regions with amodal segmentation
IEEE International Conference on Machine Learning and Applications proceedings (ICMLA 2024)
abstract
Adversarial perturbations in object recognition and image classification tasks impact a learning model's ability to perform accurately and increase the safety risks of deployed machine learning models. During the adversarial example generation process, adversaries approach areas most prone to model uncertainty. Identifying partially occluded items, especially without understanding general object shapes, contributes to significant model uncertainty since object boundaries are not inherently at the forefront of the feature generalization process in deep learning models. Thus, this work aims to reduce model uncertainty surrounding partially occluded boundaries and increase adversarial robustness by augmenting the training dataset with amodal segmentation boundary masks. By observing performance degradation, robust sensitivity, and loss sensitivity, we show how including these masks during training impacts an adversary's ability to generate effective adversarial examples on the versatile MS COCO dataset. Lastly, we observe how including these masks during training influences the performance of adversarial training.
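As an illustrative sketch of the augmentation idea only (not the paper's pipeline), an amodal boundary mask can be appended to a training image as an extra input channel; the mask source and channel layout here are assumptions:

```python
import numpy as np

def augment_with_boundary_mask(image, amodal_mask):
    """Append a boundary map derived from an amodal mask as an extra channel.

    image:       (H, W, C) array
    amodal_mask: (H, W) binary mask of the full (visible + occluded)
                 object extent; its edges approximate object boundaries.
    """
    # A simple gradient marks pixels where the mask value changes,
    # i.e., the amodal object boundary.
    gy, gx = np.gradient(amodal_mask.astype(float))
    boundary = ((np.abs(gx) + np.abs(gy)) > 0).astype(image.dtype)
    # Stack the boundary map as an additional input channel.
    return np.concatenate([image, boundary[..., None]], axis=-1)
```

For MS COCO, the amodal masks would come from an amodal-segmentation annotation set; here any (H, W) binary mask works.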
Unifying robust activation functions for reduced adversarial vulnerability with the parametric generalized gamma function
IEEE International Conference on Machine Learning and Applications proceedings (ICMLA 2024)
abstract code
Adversaries minimally perturb deep learning input data to reduce a learning model's ability to produce domain-specific data-driven recommendations for specialized tasks. This vulnerability to adversarial perturbations has been argued to stem from a learning model's nonlocal generalization over complex input data. Given the incomplete information in a complex dataset, a learning model captures nonlinear patterns between data points with volatility in the loss surface and exploitable areas of low-confidence knowledge. Capturing the nonlinearity in data is the responsibility of activation functions, which has inspired disjointed research efforts to create robust activation functions. This work unifies the properties of activation functions that contribute to robust generalization with the generalized gamma distribution function. We show that combining the disjointed characteristics presented in the literature with our parametric generalized gamma activation function provides more effective robustness than the individual characteristics alone.
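A rough, hypothetical sketch of the idea: a GELU-style gate in which each positive input is weighted by the generalized gamma CDF with shape parameters (a, d, p). The gate form, the leak for negative inputs, and the default parameter values are all assumptions; the paper's actual parameterization lives in its code release.

```python
import math
import numpy as np

def reg_lower_gamma(s, x, terms=60):
    """Regularized lower incomplete gamma P(s, x) via its power series
    (adequate for the moderate arguments an activation sees)."""
    x = np.asarray(x, dtype=float)
    total = np.zeros_like(x)
    term = np.ones_like(x) / s
    for k in range(terms):
        total += term
        term = term * x / (s + k + 1)
    log_x = np.log(np.maximum(x, 1e-12))
    return np.clip(total * np.exp(s * log_x - x) / math.gamma(s), 0.0, 1.0)

def gengamma_activation(x, a=1.0, d=2.0, p=2.0):
    """Hypothetical activation: x weighted by the generalized gamma
    CDF F(x; a, d, p) = P(d / p, (x / a)**p) on the positive half-line,
    with a small leak for negative inputs so gradients do not vanish."""
    x = np.asarray(x, dtype=float)
    gate = reg_lower_gamma(d / p, (np.maximum(x, 0.0) / a) ** p)
    return np.where(x > 0, x * gate, 0.01 * x)
```

With a = 1, d = 2, p = 2 the gate reduces to 1 - exp(-x**2), so large positive inputs pass through nearly unchanged while small ones are damped.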
Towards parametric robust activation functions in adversarial machine learning
Tiny Paper at International Conference on Learning Representations (ICLR 2023)
abstract code
Machine learning's vulnerability to adversarial perturbations has been argued to stem from a learning model's non-local generalization over complex input data. Given the incomplete information in a complex dataset, a learning model captures non-linear patterns between data points with volatility in the loss surface and exploitable areas of low-confidence knowledge. Capturing the non-linearity in data is the responsibility of activation functions, which has inspired disjointed research efforts to create robust activation functions. This work unifies the properties of activation functions that contribute to robust generalization with the generalized gamma distribution function. We show that combining the disjointed characteristics presented in the literature provides more effective robustness than the individual characteristics alone. (Full paper: Unifying robust activation functions for reduced adversarial vulnerability with the parametric generalized gamma function)
The dilemma between data transformations and adversarial robustness in time series application systems
AAAI's Workshop on Artificial Intelligence Safety proceedings (Safe AI 2022)
abstract
Adversarial examples, or nearly indistinguishable inputs created by an attacker, significantly reduce machine learning accuracy. Theoretical evidence has shown that the high intrinsic dimensionality of datasets facilitates an adversary's ability to develop effective adversarial examples in classification models. Relatedly, the presentation of data to a learning model impacts its performance. For example, we have seen this through dimensionality reduction techniques used to aid with the generalization of features in machine learning applications. Thus, data transformation techniques go hand-in-hand with state-of-the-art learning models in decision-making applications such as intelligent medical or military systems. With this work, we explore how data transformation techniques such as feature selection, dimensionality reduction, or trend extraction may impact an adversary's ability to create effective adversarial samples on a recurrent neural network. Specifically, we analyze this impact from the perspective of the data manifold and the presentation of its intrinsic features. Our evaluation empirically shows that feature selection and trend extraction techniques may increase the RNN's vulnerability. A data transformation technique reduces the vulnerability to adversarial examples only if it approximates the dataset's intrinsic dimension, minimizes codimension, and maintains higher manifold coverage.
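One crude proxy for the intrinsic dimension discussed above is the number of principal components needed to explain most of the variance. This linear PCA-based sketch (with an arbitrarily chosen variance threshold) is far simpler than the manifold-based measures used in the paper:

```python
import numpy as np

def pca_intrinsic_dim(X, var_threshold=0.95):
    """Rough intrinsic-dimension estimate: the number of principal
    components needed to explain `var_threshold` of the total variance."""
    Xc = X - X.mean(axis=0)
    # Singular values of the centered data give the component variances.
    s = np.linalg.svd(Xc, compute_uv=False)
    var_fraction = s ** 2 / np.sum(s ** 2)
    return int(np.searchsorted(np.cumsum(var_fraction), var_threshold) + 1)
```

For data that truly lies on a low-dimensional linear subspace, this recovers the subspace dimension; for curved manifolds it only upper-bounds it.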
Jespipe: A plugin-based, open MPI framework for adversarial machine learning analysis
IEEE International Conference on Big Data proceedings (BigData 2021)
abstract code
Research is increasingly showing the tremendous vulnerability in machine learning models to seemingly undetectable adversarial inputs. One of the current limitations in adversarial machine learning research is the incredibly time-consuming testing of novel defenses against various attacks and across multiple datasets, even with high computing power. To address this limitation, we have developed Jespipe as a new plugin-based, parallel-by-design Open MPI framework that aids in evaluating the robustness of machine learning models. The plugin-based nature of this framework enables researchers to specify any pre-training data manipulations, machine learning models, adversarial models, and analysis or visualization metrics with their input Python files. Because this framework is plugin-based, a researcher can easily incorporate model implementations using popular deep learning libraries such as PyTorch, Keras, TensorFlow, Theano, or MXNet, or adversarial robustness tools such as IBM's Adversarial Robustness Toolbox or Foolbox. The parallelized nature of this framework also enables researchers to evaluate various learning or attack models with multiple datasets simultaneously by specifying all the models and datasets they would like to test with our XML control file template. Overall, Jespipe shows promising results by reducing latency in adversarial machine learning algorithm development and testing compared to traditional Jupyter notebook workflows.
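Jespipe's actual plugin and XML control-file interfaces are documented in its repository; the snippet below only illustrates the generic dynamic-import mechanism that plugin-based frameworks of this kind rely on, with the plugin path and `attack` function as invented examples:

```python
import importlib.util
import pathlib

def load_plugin(path):
    """Import a user-supplied Python file at runtime and return the
    resulting module, so the framework can call its functions."""
    path = pathlib.Path(path)
    spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module
```

This is how a framework can accept arbitrary user models or attacks as plain Python files without hard-coding any library choice.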
Leontief-based data cleaning workload distribution strategy for EH-MWSN [Best Paper]
IEEE International Workshop Technical Committee on Communications Quality and Reliability proceedings (CQR 2020)
abstract
The use of energy-harvesting technologies in mobile wireless sensor networks (MWSN) delivers a promising opportunity to mitigate the limitations that irreplaceable energy sources impose on conventional MWSN. We propose the Leontief-Data Cleaning Distribution Strategy (Leontief-DCD), an economic model-based method designed to distribute the data cleaning workload in energy harvesting MWSN powered by predictable energy sources, such as solar energy. Leontief-DCD creates interdependencies among sensor nodes to predict the required cooperation from each node in the data cleaning process. Unlike existing task allocation methods, the interdependencies in Leontief-DCD allow planning a workload distribution that benefits the network as a whole, rather than only individual sensors, which consequently improves overall system performance. Our results show that when employing our method to distribute the data cleaning workload in highly dirty, real-world datasets under both high- and low-energy scenarios, it increased the number of data samples engaged in data cleaning processes by up to 25.57%, the count of active sensor nodes by up to 44.01%, and the network's overall well-being by up to 55.42% compared to data cleaning performed by each node individually.
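The interdependency idea rests on the classical Leontief input-output model. As a minimal sketch of that model only (the matrix values are invented, and the full distribution strategy involves much more), solving (I - A) x = d gives each node's required gross workload:

```python
import numpy as np

def leontief_output(A, demand):
    """Classical Leontief input-output model: A[i, j] is the amount of
    node i's work consumed per unit of node j's output, and `demand` is
    the final workload required from each node. Solving (I - A) x = d
    yields the gross output x each node must produce. A productive A
    (spectral radius < 1) guarantees a nonnegative solution."""
    A = np.asarray(A, dtype=float)
    d = np.asarray(demand, dtype=float)
    return np.linalg.solve(np.eye(A.shape[0]) - A, d)
```

The gross outputs exceed the final demands exactly by what the nodes consume from each other, which is the cooperation the strategy plans around.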
Predicting hurricane trajectories using a recurrent neural network
AAAI Conference on Artificial Intelligence proceedings (AAAI 2019)
abstract code presentation
Hurricanes are cyclones originating over tropical and subtropical waters that circulate about a defined center with closed wind speeds exceeding 75 mph. At landfall, hurricanes can result in severe disasters. The accuracy of predicting their trajectory paths is critical to reduce economic loss and save human lives. Given the complexity and nonlinearity of weather data, a recurrent neural network (RNN) could be beneficial in modeling hurricane behavior. We propose the application of a fully connected RNN to predict the trajectory of hurricanes. We employed the RNN over a fine grid to reduce typical truncation errors. We utilized the latitude, longitude, wind speed, and pressure data publicly provided by the National Hurricane Center (NHC) to predict the trajectory of a hurricane at 6-hour intervals. Results show that this proposed technique is competitive with methods currently employed by the NHC and can predict up to approximately 120 hours of hurricane path.
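As a small sketch of the data preparation only (the lookback length is an assumption, and the paper's grid construction and RNN architecture are not reproduced here), each training sample pairs a window of past 6-hour fixes with the next position:

```python
import numpy as np

def make_trajectory_windows(track, lookback=4):
    """Turn one hurricane track into supervised (window, next-fix) pairs.
    `track` is a (T, 4) array of [lat, lon, wind, pressure] fixes at
    6-hour intervals (the NHC reporting cadence); each sample uses the
    previous `lookback` fixes to predict the next position."""
    X, y = [], []
    for t in range(lookback, len(track)):
        X.append(track[t - lookback:t])
        y.append(track[t, :2])  # next latitude/longitude
    return np.stack(X), np.stack(y)
```

Feeding the model its own predictions back as inputs then extends the forecast horizon one 6-hour step at a time.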
Quantifying location privacy in permissioned blockchain-based Internet of Things
EAI International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services proceedings (MobiQuitous 2019)
abstract
Recently, blockchain has received much attention from the mobility-centric Internet of Things (IoT). It is deemed key to ensuring the built-in integrity of information and immutability by design in the peer-to-peer (P2P) network of mobile devices. In a permissioned blockchain, the authority of the system has control over the identities of its users. Such information can allow an ill-intentioned authority to map identities with their spatiotemporal data, which undermines the location privacy of a mobile user. In this paper, we study the location privacy preservation problem in the context of permissioned blockchain-based IoT systems under three conditions. First, the authority of the blockchain holds the public and private key distribution task in the system. Second, there exists a spatiotemporal correlation between consecutive location-based transactions. Third, users communicate with each other through short-range communication technologies such that it constitutes a proof of location (PoL) on their actual locations. We show that, in a permissioned blockchain with an authority and the presence of a PoL, existing approaches cannot be applied in a plug-and-play fashion to protect location privacy. In this context, we propose BlockPriv, an obfuscation technique that quantifies, both theoretically and experimentally, the relationship between privacy and utility in order to dynamically protect the privacy of sensitive locations in the permissioned blockchain.
Using candlestick charting and dynamic time warping for data behavior modeling and trend prediction for MWSN in IoT
IEEE International Conference on Big Data proceedings (BigData 2018)
abstract
There is a rapid emergence of new applications involving mobile wireless sensor networks (MWSN) in the field of the Internet of Things (IoT). Although useful, MWSN still carry the restrictions of having limited memory, energy, and computational capacity. At the same time, the amount of data collected in the IoT is exponentially increasing. We propose Behavior-Based Trend Prediction (BBTP), a data abstraction and trend prediction technique, designed to address the limited memory constraint in addition to providing future trend predictions. Predictions made by BBTP can be employed by real-time decision-making applications and data monitoring. BBTP applies a candlestick charting technique to abstract the data behavior of a time partition in evolving data streams. It also quantifies differences between a pair of consecutive time partitions using dynamic time warping (DTW) at the sensor node. Then, it forwards the data to an Internet-enabled device, where the sensor's future data trends are predicted using a multi-class Support Vector Machine (SVM). A comparative study was conducted to investigate the effectiveness of our BBTP method on real-world datasets. Our results demonstrate that data trends predicted by BBTP achieve better precision, recall, and accuracy scores when contrasted against four well-known techniques while reducing the space complexity by at least a factor of 10.
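The candlestick abstraction step can be sketched as follows (the partition length is an assumption, and the DTW comparison and SVM prediction stages are omitted):

```python
import numpy as np

def candlestick_abstract(stream, partition=10):
    """Abstract each time partition of a sensor stream into a candlestick
    (open, high, low, close), shrinking storage from `partition` raw
    samples to four summary values per partition."""
    n = len(stream) // partition
    sticks = []
    for i in range(n):
        seg = stream[i * partition:(i + 1) * partition]
        sticks.append((seg[0], max(seg), min(seg), seg[-1]))
    return np.array(sticks)
```

This is where the factor-of-10 space reduction comes from when the partition holds ten or more samples.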
A dynamic trust weight allocation technique for data reconstruction in mobile wireless sensor networks
IEEE International Conference On Trust, Security And Privacy In Computing And Communications proceedings (TrustCom 2018)
abstract
Data accuracy and low energy consumption in mobile wireless sensor networks (MWSN) are crucial attributes for real-time applications. Although there are many existing methods to reconstruct data for wireless sensor networks, there are few developed for highly mobile environments. We propose the Dynamic Trust Weight Allocation Technique (DTWA), a novel in-network data reconstruction method that determines the trust level in the data accuracy of each candidate node by evaluating spatio-temporal correlations, trajectory behavior, quantity and quality of data, and the number of hops traveled by the received data from the source. DTWA is capable of evaluating second-hand data when no first-hand data is available and of selecting second-hand data when the latter is more accurate than the first-hand data. Our results demonstrate that data reconstructed using DTWA exhibits significantly lower Root Mean Square Error (RMSE) compared to the IMC method when tested in both low- and high-incompleteness dataset scenarios.
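As an illustrative fragment of the final fusion step only (the trust scores here are simply given, whereas DTWA derives them from the spatio-temporal, trajectory, data-quality, and hop-count evaluations above), a missing sample can be reconstructed as a trust-weighted mean of candidate readings:

```python
import numpy as np

def trust_weighted_reconstruction(readings, trust_scores):
    """Reconstruct a missing sample as the trust-weighted mean of the
    candidate nodes' readings; higher-trust candidates contribute more."""
    w = np.asarray(trust_scores, dtype=float)
    w = w / w.sum()  # normalize trust scores into weights
    return float(np.dot(w, np.asarray(readings, dtype=float)))
```

A candidate trusted three times as much as another pulls the reconstruction three times as hard toward its reading.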
Context-aware data cleaning for mobile wireless sensor networks: A diversified trust approach
International Conference on Computing, Networking and Communications proceedings (ICNC 2018)
abstract
In mobile wireless sensor networks (MWSN), data imprecision is a common problem. Decision making in real-time applications may be greatly affected by a minor error. Even though many existing techniques take advantage of the spatio-temporal characteristics exhibited in mobile environments, few measure the trustworthiness of sensor data accuracy. We propose a unique online context-aware data cleaning method that measures trustworthiness by employing an initial candidate reduction through the analysis of trust parameters used in financial markets theory. Sensors with similar trajectory behaviors are assigned trust scores estimated through the calculation of “betas” for finding the most accurate data to trust. Instead of devoting all the trust to a single candidate sensor's data to perform the cleaning, a Diversified Trust Portfolio (DTP) is generated based on the selected set of spatially autocorrelated candidate sensors. Our results show that samples cleaned by the proposed method exhibit lower percent error when compared to two well-known and effective data cleaning algorithms in tested outdoor and indoor scenarios.
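The financial “beta” borrowed here measures how strongly a candidate sensor's readings co-move with a reference series; a minimal sketch (the choice of reference series is an assumption):

```python
import numpy as np

def sensor_beta(candidate, reference):
    """Financial-style beta of a candidate sensor's readings against a
    reference series (e.g., the querying node's own data): covariance
    over variance. Beta near 1 means the candidate tracks the reference
    closely, which DTP-style portfolio weighting can then exploit."""
    c = np.asarray(candidate, dtype=float)
    r = np.asarray(reference, dtype=float)
    return float(np.cov(c, r, ddof=0)[0, 1] / np.var(r))
```

A candidate whose readings are exactly twice the reference's deviations yields a beta of 2, flagging it as exaggerating the shared trend.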