Projects
Selected publications, research projects, and independent ventures.
Medical Imaging & Clinical AI
PneumoXttention
IEEE ISPA 2021
PneumoXttention is a convolutional neural network ensemble designed to detect pneumonia in chest X-rays while explicitly reducing human diagnostic error. The model was evaluated against radiologist performance on the RSNA and NIH ChestX-ray datasets, demonstrating strong F1 scores and consistent sensitivity across patient subgroups. The work focuses on reliability and clinical validation rather than purely maximizing accuracy.
Automated Coronary Calcium Scoring
IEEE MIT URTC 2022
Developed a semi-supervised U-Net model to estimate coronary artery calcium scores from non-gated CT scans, enabling cardiovascular risk assessment without specialized imaging protocols. Introduced targeted cropping techniques that reduced mean absolute error by 91% and improved F1 score by 32%, significantly outperforming baseline approaches.
CheX-Nomaly
arXiv Preprint
CheX-Nomaly introduces a Siamese U-Net framework with contrastive learning to localize thoracic abnormalities across 14+ disease categories. The model prioritizes cross-dataset generalization and robustness to spurious correlations, addressing perceptual diagnostic errors common in chest X-ray interpretation rather than optimizing single-dataset performance.
Mask R-CNN for Brain Tumor Segmentation
arXiv Preprint
Applied Mask R-CNN with image subtraction techniques to segment heterogeneous brain tumors from MRI scans. The approach improved tumor boundary delineation and achieved a Dice coefficient of 0.75, outperforming standard segmentation baselines on complex tumor morphologies.
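For readers unfamiliar with the metric, the Dice coefficient measures the overlap between a predicted mask and a ground-truth mask as twice the intersection divided by the sum of the two mask sizes. The following is a minimal sketch in plain Python, not the project's evaluation code; the function name and the toy masks are hypothetical and chosen only for illustration.

```python
def dice_coefficient(pred, target):
    """Dice = 2 * |A and B| / (|A| + |B|) for two binary masks.

    Masks are nested lists of 0/1 values with identical shapes
    (an assumption of this sketch, not a detail from the project).
    """
    flat_pred = [p for row in pred for p in row]
    flat_target = [t for row in target for t in row]

    # Pixels marked as tumor in both masks.
    intersection = sum(1 for p, t in zip(flat_pred, flat_target) if p and t)
    # Total tumor pixels across the two masks.
    total = sum(1 for p in flat_pred if p) + sum(1 for t in flat_target if t)

    if total == 0:
        # Both masks are empty: treat as perfect agreement (a common convention).
        return 1.0
    return 2.0 * intersection / total


# Hypothetical 2x3 toy masks: 2 overlapping pixels, 3 pixels in each mask.
predicted_mask = [[1, 1, 0],
                  [0, 1, 0]]
reference_mask = [[1, 0, 0],
                  [0, 1, 1]]
dice = dice_coefficient(predicted_mask, reference_mask)  # 2*2 / (3+3)
```

A score of 1.0 means the masks coincide exactly and 0.0 means they share no pixels, so 0.75 indicates substantial but imperfect overlap.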
Minimization of False Negatives / Positives
arXiv Preprint
Proposed a post-pretraining input-adjustment method to reduce false negatives and false positives in binary classification tasks. The method consistently improved performance across multiple datasets, demonstrating that targeted input perturbations can correct decision boundary biases without retraining large models.
Time-Series & Sustainability
Water Consumption Analysis
Independent Project
Designed unsupervised machine learning pipelines to disaggregate household water consumption into appliance-level usage. Used clustering and distance-based algorithms (K-Means, Dynamic Time Warping) to infer usage patterns from raw meter data, enabling conservation insights and cost-saving recommendations without smart-meter hardware.
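Dynamic Time Warping is useful here because two runs of the same appliance rarely line up sample for sample. The sketch below shows the classic dynamic-programming form of the DTW distance in plain Python. It is an illustration under stated assumptions, not the project's pipeline: the function name and the flow readings are hypothetical, and a real pipeline would typically feed such distances into a clustering step.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) Dynamic Time Warping distance.

    Uses absolute difference as the local cost and allows the three
    standard moves (match, insertion, deletion).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] aligned with an already-used b sample
                cost[i][j - 1],      # b[j-1] aligned with an already-used a sample
                cost[i - 1][j - 1],  # both series advance together
            )
    return cost[n][m]


# Hypothetical flow readings: the same usage shape, one delayed by a sample.
event_a = [0, 1, 2, 1, 0]
event_b = [0, 0, 1, 2, 1, 0]
shifted = dtw_distance(event_a, event_b)

# A flat series against a constant offset cannot be warped away.
offset = dtw_distance([0, 0, 0], [1, 1, 1])
```

Because the delayed event can be warped onto the original, the two are treated as the same shape, which is the property that makes DTW a reasonable distance for grouping usage events of varying duration.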
Knowledge Systems & NLP
News-to-arXiv Pipeline
Independent Project | 2025–Present
Built an automated Python pipeline that maps 500+ real-world news queries to arXiv search tasks. The system uses LLMs and heuristic ranking to retrieve the top 10 relevant academic papers from a corpus of 50k+ articles, enabling scalable discovery of emerging research topics from noisy media signals.
Enterprise ML & Production Systems
Credit Card Fraud Detection
MIT CSAIL · Prof. Amar Gupta
Developed LSTM and Transformer-based fraud detection models trained on over 1M transaction records. Enhanced dataset diversity using synthetic data generation, improving F1 score by ~12% and significantly increasing precision on rare fraud cases.
O-Health Symptom Extractor
MIT CSAIL · Prof. Amar Gupta
Processed 50k+ patient records using unsupervised learning to cluster symptoms and infer medical specializations. Achieved over 60% accuracy in specialization prediction, demonstrating the viability of clustering-based classification in noisy clinical text.
Kognitos – Engineering Internship
Summer 2025
Built automated LLM + regex pipelines to improve error messaging in enterprise workflows. The system reduced customer wait times by 89% and achieved 99% structural match accuracy across production logs, directly impacting customer experience at scale.
Patient Embeddings & Representation Learning
Cardiovascular Disease Embeddings
MIT CSAIL · Prof. Manolis Kellis
Developed personalized latent space embeddings for 6,000+ patient records to model cardiovascular disease progression. Used representation learning and dimensionality reduction to uncover phenotypic similarity clusters and longitudinal disease patterns.
Additional Projects
Water Consumption Disaggregation
Appliance-level inference from smart-meter time series using shape-based clustering.
Synapses.news Platform
End-to-end system embedding, clustering, and visualizing latent structure in news & research.
O-Health Symptom Extractor
Unsupervised clustering of 50k+ patient records to infer medical specializations.
Kognitos LLM Error Repair Pipelines
LLM-driven exception tracing and semantic error repair in production systems.
Django Feature-Flag Microservice
Scalable backend integrating LaunchDarkly for 500+ production tags.