I am a PhD Candidate in Computational Science, graduating in June 2026 and seeking new career opportunities. My research focuses on Machine Learning (ML), Reinforcement Learning (RL), and agentic AI for vehicle and science autonomy, with a specialty in adaptive dynamic deep neural networks. I have a proven track record of developing full autonomy stacks and predictive model-based pipelines, supported by eight peer-reviewed publications as lead author across UCI, NASA, LBNL, and industry. I am an expert in parallel and High Performance Computing (HPC) for training multimodal machine learning models and for large-scale data processing, and in optimizing compute-heavy pipelines for inference efficiency, especially for deployment to resource-constrained devices. My interdisciplinary STEM background emphasizes rigorous experimental design, iterative model improvement, a deep understanding of physics-based and statistical methods, and effective collaborative communication with domain experts.

Research Summary

I work with full stacks that integrate multiple hierarchical levels for autonomous operation of AI agents and predictive pipelines. Applications include autonomous vehicles, navigation through unknown environments, and remote science, sensing, exploration, and discovery. Below I describe an end-to-end approach, highlighting the specific components I typically investigate in my research.

The displayed graphic is a .gif showing a robot starting in an unexplored space, tasked with reaching a target position. At a variable frequency, it probes the environment with onboard RGB-D, IMU, and GPS sensors. The acquired sensor data is used to (1) perceive the surrounding environment, (2) navigate around nearby detected obstacles with low-level controllers, (3) build a 3D global map and occupancy grid via ray-tracing, and (4) reason at a high level with global parameters and mission objectives in mind.

Multimodal sensor data is fused by training independent frozen stems and encoders for each modality. A Denoising AutoEncoder (DAE) with Monte Carlo dropout then handles corrupted real-world data collected in the field. This also provides uncertainty quantification, which can be used alongside the perceived features in downstream logic and reasoning modules.

I typically train the low-level controllers with Deep Reinforcement Learning (DRL) algorithms such as DQN and TD3, using simulators built on Unreal Engine such as Microsoft AirSim and CARLA. I train onboard perception models, such as monocular depth estimation, from scratch and via supervised post-training, using a combination of simulator-collected and real-world data to ensure efficient sim-to-real transfer.

The processed data is further fused with mission parameters to scale onboard computing and activate different sensor modalities, using cutting-edge adaptive and dynamic neural networks that restructure the stack at runtime in response to perceived context. This enables efficient deployment on resource-constrained edge hardware, with attention to energy expenditure, power demands, and inference latency. I benchmark these metrics with a hardware-in-the-loop setup that executes computation onboard small embedded chips such as NVIDIA Jetson devices. Onboard data is relayed to an edge server, using compression-oriented optimization techniques such as split computing to transmit data effectively over volatile communication links.

On the edge server, more compute-heavy high-level policies and algorithms can be executed. One is building the 3D global map. Another is waypoint generation with multimodal large language models (MLLMs), such as Gemma 3. I integrate MLLMs with vision-language-action (VLA) methodologies to act as a high-level reasoning module and overseer that supervises the entire process at a lower frequency than the low-level controller. This enables zero-shot prediction, circumventing the need for fine-tuning on specific robot embodiments and environments, while leveraging the common-sense knowledge of language models trained on large-scale heterogeneous web data. The MLLM accepts generalized input commands, instructing the AI agent where to go and what to do next, while the low-level controller maintains safety and reliability. I have also used this to further improve the navigation performance of the DRL-trained low-level kinematic controller: the MLLM generates intermediate waypoints to escape dead-ends that arise from out-of-distribution (generalization) errors and from complex obstacles or scenarios.

Shown above is an AI agent autonomously navigating through an unknown environment, exploring it with local depth sensors and avoiding collisions via a DRL-based low-level kinematic controller. Grey pixels indicate unexplored space, black pixels indicate free space, and white pixels indicate detected obstacles. Notice that the .gif "stalls" at one point, at which the agent queries a multimodal large language model to generate a restorative waypoint (the triangle) to escape a dead-end arising from out-of-distribution error.

Shown above is a slimmable neural network dynamically activating a sub-network of active channels within a CNN block. The active channels are determined by the slimming factor, ρ, set during a forward pass of the slimmable CNN. In this hypothetical example, ρ = 0.5 and there are four maximum channels.

Dynamic Neural Networks

Common workhorses of my research are neural networks that can dynamically change their structure at runtime (during inference). This is accomplished by adopting architectures that are either multi-branch, allowing for real-time context switching of large neural blocks, or embedded, allowing for finer control of active sub-networks contained within a larger super-network. Specific variants scale neural operations at runtime in terms of: the depth with early exits, the width with slimmable networks, the input sensor modalities, or other sub-structures with more novel approaches.
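A minimal sketch of the width-scaling (slimmable) idea, using a toy NumPy dense layer rather than a full CNN; the sizes and the `slimmable_dense` helper are illustrative assumptions, not my deployed implementation:

```python
import numpy as np

def slimmable_dense(x, W, b, rho):
    """Forward pass through a dense layer at width fraction rho.

    Only the first ceil(rho * n_out) output channels (and their
    weights) are active; the rest of the super-network is skipped.
    """
    n_out = W.shape[1]
    k = max(1, int(np.ceil(rho * n_out)))  # number of active channels
    return x @ W[:, :k] + b[:k]

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 4))  # super-network: 4 maximum channels
b = rng.standard_normal(4)
x = rng.standard_normal(8)

full = slimmable_dense(x, W, b, rho=1.0)  # all 4 channels active
half = slimmable_dense(x, W, b, rho=0.5)  # 2 active channels
assert half.shape == (2,)
assert np.allclose(half, full[:2])  # sub-network nested in super-network
```

The key property is that the ρ = 0.5 sub-network is nested inside the full network, so a single set of weights serves every width selected at runtime.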

Autonomous Navigation and Decision Making

Remote tasks such as sensing, science, exploration, delivery, surveillance, and search and rescue rely on autonomous navigation of drones and rovers through unknown terrain in otherwise hard-to-reach or hazardous environments. Such vehicles are equipped with efficient instruments that balance the information-complexity tradeoff between sensing the nearby environment and the low cost, power, mass, and volume requirements of lightweight robotics. These sensors provide limited information for navigation, which requires knowledge of the 3D structure of the surrounding environment, and for autonomy, which requires various other features to be extracted for downstream control. To this end, I integrate perception-driven modules that process data with deep neural networks into navigation pipelines and autonomy stacks. Methods such as deep reinforcement learning and diffusion policies offer flexibility in multimodal inputs and variable decision making, real-time collision avoidance, and generalizability to unknown environments, while methods such as SLAM and physics-based trajectory planning provide precise, less black-box behavior under the assumption that more global information is available.

Shown above is a drone autonomously navigating through an urban environment.

Shown here is knowledge distillation used to reduce the size of a large model.

Model Reduction

Model reduction simplifies deep neural networks so that they can be executed on resource-constrained devices, and includes methods such as pruning, quantization, and direct design. These approaches yield networks that are static in nature, must remain complex enough to handle the most challenging scenarios, and often incur a degradation in task accuracy. Knowledge distillation is a powerful technique that follows a teacher-student paradigm to transfer knowledge from a larger teacher model to a smaller student model. This is primarily accomplished with a dual-objective loss function that minimizes: (a) a hard target, representing the downstream task error, and (b) a soft target, representing the distillation error between the teacher and student outputs and intermediate latent features.
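The dual-objective loss can be sketched as follows, assuming the standard logit-matching form with a softmax temperature T; the `distillation_loss` helper and its parameter values are illustrative, and intermediate-feature terms are omitted for brevity:

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax (numerically stabilized)."""
    z = z / T
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, label,
                      T=4.0, alpha=0.5):
    """Dual objective: (a) hard cross-entropy on the true label,
    (b) soft KL divergence between teacher and student outputs."""
    p_s = softmax(student_logits)
    hard = -np.log(p_s[label] + 1e-12)          # (a) task error
    q_t = softmax(teacher_logits, T)
    q_s = softmax(student_logits, T)
    soft = np.sum(q_t * (np.log(q_t + 1e-12)    # (b) distillation error,
                         - np.log(q_s + 1e-12))) * T * T  # scaled by T^2
    return alpha * hard + (1 - alpha) * soft

loss = distillation_loss(np.array([2.0, 0.5, -1.0]),
                         np.array([3.0, 1.0, -2.0]), label=0)
assert loss >= 0.0  # both terms are non-negative
```

A higher temperature softens the teacher's distribution, exposing the relative probabilities it assigns to incorrect classes, which is where much of the transferred "dark knowledge" lives.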

Adaptive Neural Networks

Determining how to optimally activate different components of a dynamic neural network remains an ongoing challenge. The central contribution of my research is the development of adaptive policies that scale the dynamic portions of a neural network in response to contextual cues, environmental conditions, and mission objectives. This approach enables a self-adapting neural architecture that intelligently adjusts its own complexity, improving both efficiency and overall performance. Developing cohesive pipelines that adopt adaptive methods is a significant challenge, and I primarily use deep reinforcement learning, which brings with it several challenges of its own. More details can be found in my research statement.

Shown above is a pipeline in which an adaptive mechanism is integrated into a dynamic CNN block, enabling controlled scaling of the slimmable network.

Shown here is a multimodal neural network with a variable sensor array as input.

Multimodal Neural Networks

Real-world data is collected across various time windows, geographic locations, spatial resolutions, and sensing modalities. Such datasets differ not only in their structure but also in their dimensionality, posing significant challenges in developing cohesive data-processing pipelines. Network stems are trained independently, then later joined by mutual neural layers, to integrate heterogeneous data types that can be processed and used for downstream tasks.
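A minimal sketch of this stem-then-fuse pattern, with hypothetical input sizes (a 64-dim spectrum and a 12-dim weather vector) standing in for real modalities; the layer shapes and weights are illustrative:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def stem(x, W, b):
    """Modality-specific stem: maps one sensor's native
    dimensionality to a shared embedding size (each stem can be
    trained independently, then frozen)."""
    return relu(x @ W + b)

rng = np.random.default_rng(0)
d_embed = 16
# Heterogeneous inputs with different dimensionalities.
spectrum = rng.standard_normal(64)
weather = rng.standard_normal(12)
W_spec, b_spec = rng.standard_normal((64, d_embed)), np.zeros(d_embed)
W_wx, b_wx = rng.standard_normal((12, d_embed)), np.zeros(d_embed)

# Fuse by concatenating stem outputs, then apply mutual layers.
fused = np.concatenate([stem(spectrum, W_spec, b_spec),
                        stem(weather, W_wx, b_wx)])
W_joint = rng.standard_normal((2 * d_embed, 4))
logits = fused @ W_joint  # shared head for the downstream task
assert logits.shape == (4,)
```

Because each stem normalizes its modality into a common embedding size, the mutual layers never see the raw dimensionality mismatch.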

Edge/Split Computing

Edge computing remotely executes a full deep neural network on a compute-capable device — the edge server. This requires transmitting input data (e.g., images) over volatile, capacity-constrained wireless links, creating problems in efficient channel usage, delay, delay variance, and security. The benefit is that the onboard computational workload is drastically reduced by offloading it to the edge server.

Split computing, also known as split DNN or model partitioning, is a recent class of approaches in mobile computing. The DNN architecture is partitioned into two sections — head and tail — executed by the mobile device and the edge server, respectively. The objective is to balance computing load, energy consumption, and channel usage by using supervised compression (compression learned with respect to the downstream task), which has been shown to outperform generalized compression.
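The head/tail partition can be illustrated with a toy NumPy network whose split point sits at a narrow bottleneck, so the transmitted tensor is smaller than the raw input; the layer sizes and `head`/`tail` helpers are illustrative assumptions:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

rng = np.random.default_rng(0)
W1 = rng.standard_normal((64, 32))
W2 = rng.standard_normal((32, 8))   # bottleneck layer at the split point
W3 = rng.standard_normal((8, 10))

def head(x):
    """Runs on the mobile device: early layers ending in a narrow
    bottleneck representation that is cheap to transmit."""
    return relu(relu(x @ W1) @ W2)

def tail(z):
    """Runs on the edge server: remaining layers to the task output."""
    return z @ W3

x = rng.standard_normal(64)
z = head(x)              # this is what crosses the wireless link
assert z.size < x.size   # bottleneck smaller than the raw input
y = tail(z)
assert y.shape == (10,)
```

In practice the bottleneck is trained with a supervised-compression objective, so the transmitted code preserves exactly the information the downstream task needs.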

Shown here is a teacher model injected with a split point used in split computing.

Shown here is a machine learning model using various input sensor modalities as input into a neural network for downstream hydrological studies.

Predicting Evapotranspiration

Another of my applications is in predicting Evapotranspiration (ET) from real-time satellite imagery, meteorological forcing data, weather towers, and field sensors. Existing approaches typically rely on interpolation methods followed by rigid physics-based calculations. In contrast, I develop flexible DNN frameworks capable of operating directly on noisy, incomplete data by integrating a DAE into the pipeline. These improvements have implications for agricultural monitoring and water-resource management. Further methods can be utilized to determine sensor placement, based on confidence levels and quantified uncertainty.

Denoising Autoencoder

A related challenge with real-world data is mitigating the inherent noise and missing values that result from environmental factors and imperfect sensors. An autoencoder is a neural network that learns a compressed latent representation of its input data. A denoising autoencoder is a variant that takes an observation sampled from a corrupted feature space, encodes it to a compressed latent space, then decodes it to a denoised feature space. The governing principle is that the correlation structure among the input dimensions is learned during training, so that high signal-to-noise variables can be leveraged to denoise low signal-to-noise ones.
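A toy NumPy version of this training principle, corrupting a clean signal with Gaussian noise and fitting a one-hidden-layer DAE by manual SGD; the dimensions, learning rate, and noise level are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def corrupt(x, noise_std=0.5):
    """Sample an observation from the corrupted feature space."""
    return x + rng.normal(0.0, noise_std, size=x.shape)

# Tiny DAE: encode to a 4-dim latent, decode back to 16 dims.
d_in, d_latent, lr = 16, 4, 1e-2
W_enc = 0.1 * rng.standard_normal((d_in, d_latent))
W_dec = 0.1 * rng.standard_normal((d_latent, d_in))
x_clean = rng.standard_normal(d_in)

losses = []
for _ in range(500):
    x_noisy = corrupt(x_clean)
    z = np.tanh(x_noisy @ W_enc)   # encode to compressed latent space
    x_hat = z @ W_dec              # decode to denoised feature space
    losses.append(float(np.sum((x_hat - x_clean) ** 2)))
    g = 2.0 * (x_hat - x_clean)                # dLoss/dx_hat
    g_z = (g @ W_dec.T) * (1.0 - z ** 2)       # backprop through tanh
    W_dec -= lr * np.outer(z, g)
    W_enc -= lr * np.outer(x_noisy, g_z)

# Training drives reconstruction error well below its initial value.
assert np.mean(losses[-20:]) < np.mean(losses[:20])
```

The network never sees the clean input at inference time; it only learns, through the reconstruction objective, how correlated dimensions can compensate for each other's noise.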

Shown here is a noisy image from the MNIST dataset, which contains handwritten digits, input into a denoising autoencoder that denoises the image.

Shown here is a sampling method that perturbs inputs and samples nodes from a neural network used for inference.

Monte Carlo Sampling

I further advance neural network predictions by integrating Monte Carlo (MC) sampling methods into the inference pipeline for sensitivity analysis and uncertainty quantification. Classical neural networks produce a single deterministic output, which obscures information about predictive uncertainty. To address this, I introduce random perturbations to the input data to evaluate sensitivity to predictors, and apply random dropout during inference to estimate uncertainty related to model parameters. This produces a distribution of predictions, allowing for an assessment of confidence, variance, and modes. Such information is especially valuable in autonomous surveying systems, where regions of high uncertainty or sensitivity can be targeted for further data collection or more careful analysis.
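Both perturbation mechanisms can be sketched together, assuming a toy two-layer NumPy regressor; the weights, dropout rate, and input-noise scale are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

W1 = rng.standard_normal((8, 32))
W2 = rng.standard_normal((32, 1))

def mc_forward(x, p_drop=0.2, input_noise=0.05):
    """One stochastic forward pass: perturb the input (sensitivity
    to predictors) and drop hidden units at inference time
    (uncertainty in model parameters)."""
    x = x + rng.normal(0.0, input_noise, size=x.shape)
    h = relu(x @ W1)
    mask = rng.random(h.shape) > p_drop
    h = h * mask / (1.0 - p_drop)   # inverted-dropout scaling
    return (h @ W2).item()

# Repeated passes yield a distribution of predictions, not a point.
x = rng.standard_normal(8)
samples = np.array([mc_forward(x) for _ in range(200)])
mean, std = samples.mean(), samples.std()  # predictive mean and spread
assert samples.shape == (200,)
assert std > 0.0
```

The sample standard deviation is the quantity of interest for surveying: inputs that produce a wide spread flag regions where more data collection or closer analysis is warranted.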

Mineral Classification

Planetary surface missions have greatly benefitted from intelligent systems capable of semi-autonomous navigation and surveying. However, the scientific instruments onboard these missions are not similarly equipped with automated science-analysis classifiers, which could further improve scientific yield and autonomy. Such rovers carry scientific instruments, including heterogeneous spectrometers and imagers, and typically transmit raw data to an orbiting satellite, which then relays it to scientists on Earth for analysis and further instructions. This process is exposed to severe communication latency and significant resource overhead. Greater autonomy onboard the rover would allow faster and more efficient in-situ decision making, data processing, and science. To address this, I apply my methods to present both single- and multi-mineral autonomous classifiers that integrate observations from a co-registered dual-band Raman spectrometer and imagers. I experiment with different neural network structures, showing that a multimodal neural network is the most robust.

Shown above is the progression of more complex methods to integrate different sensor modalities for the downstream task of mineral classification.

Shown here is the resulting distribution of predictions with four different levels of noise added to the input spectra.

Satellite Development

During the conceptual stages of satellite development, a topic of interest is assessing whether given mission and instrument design parameters will provide data suitable for extracting the quantities of interest. To this end, I developed stand-alone C++ neural network code that can help speed observation and mission-design planning, by training a neural network pipeline with the selected mission parameters and evaluating the downstream distribution of predictions. The figure shown here illustrates how the noise resulting from the selected parameters of a proposed satellite mission would affect the ability to detect the atmospheric cloud properties, f_sed, of exoplanets.