Projects
This section highlights some of my key hands-on projects. I believe that building real systems is the best way to turn theoretical knowledge into practical expertise. Each project represents a deep dive into a different area of AI, from multimodal creativity to life-saving safety systems.
DanceNet: AI-Enabled Choreography System
Technologies: Python, PyTorch, VAEs, BERT
Timeline: Mar 2025 – Apr 2025
A cross-modal AI system translating textual descriptions into dance choreography by bridging natural language and motion data.
- Built a preprocessing pipeline for 10,000+ motion sequences, using OpenPose for skeleton extraction and BERT to embed the text descriptions.
- Designed a VAE architecture (2 LSTM encoders, 1 GRU decoder) compressing 53-point skeleton poses into 128-dimensional latent vectors, achieving 87% reconstruction fidelity.
- Implemented a contrastive loss (margin = 0.5) to align text and motion embeddings, yielding 84% retrieval precision.
- Developed a retrieval system with KMeans clustering (K=3) that reduced annotation needs by 75% while maintaining semantic similarity scores above 0.82.
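To illustrate the text-motion alignment step, here is a minimal sketch of a pairwise margin-based contrastive loss. It assumes the classic formulation (matched pairs are pulled together, mismatched pairs are pushed apart until they are at least `margin` away) and uses NumPy rather than PyTorch for brevity. The function name, argument names, and the exact loss form are illustrative assumptions, not the project's actual code.

```python
import numpy as np

def contrastive_loss(text_emb, motion_emb, is_match, margin=0.5):
    """Pairwise contrastive loss over (text, motion) embedding pairs.

    Hypothetical helper: text_emb and motion_emb are (N, D) arrays,
    is_match is a length-N array of 1 (matched pair) or 0 (mismatched pair).
    """
    text_emb = np.asarray(text_emb, dtype=np.float64)
    motion_emb = np.asarray(motion_emb, dtype=np.float64)
    is_match = np.asarray(is_match, dtype=np.float64)

    # Euclidean distance between the two embeddings of each pair
    dist = np.linalg.norm(text_emb - motion_emb, axis=1)

    # Matched pairs are penalized by their squared distance
    pos_term = is_match * dist ** 2
    # Mismatched pairs are penalized only while closer than the margin
    neg_term = (1.0 - is_match) * np.maximum(0.0, margin - dist) ** 2

    return float(np.mean(pos_term + neg_term))
```

With `margin=0.5`, a mismatched pair contributes to the loss only while its embeddings are less than 0.5 apart, so the penalty vanishes once unrelated text and motion are sufficiently separated.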
High-Fidelity 3D Cardiac MRI Synthesis and Artifact Removal
Technologies: GANs, U-Net, PyTorch
Timeline: Feb 2025 – Present (Ongoing Research)
A generative pipeline for synthesizing realistic 3D cardiac MRI volumes and robustly removing motion/equipment artifacts to improve clinical diagnostic imaging.
- Designed a teacher–student GAN framework for 3D MRI synthesis, achieving 92.6% fidelity vs. real scans.
- Developed a U-Net-based denoiser that improved PSNR by 5.68 dB and SSIM by 0.16 on scans with motion and low-SNR artifacts.
- Simulated multiple types of clinical noise (Rician, Gaussian, salt-and-pepper) for robustness, enabling successful artifact removal in >95% of test cases.
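The noise types and the PSNR metric mentioned above can be sketched as follows. This is an illustrative NumPy version under stated assumptions: volumes are float arrays scaled to [0, 1], Rician noise is modeled as the magnitude of a complex signal with independent Gaussian noise in each channel, and all function names and parameters are hypothetical rather than taken from the project.

```python
import numpy as np

def add_gaussian(vol, sigma, rng):
    """Additive zero-mean Gaussian noise."""
    return vol + rng.normal(0.0, sigma, vol.shape)

def add_rician(vol, sigma, rng):
    """Rician noise: magnitude of the signal after Gaussian noise is
    added independently to the real and imaginary channels."""
    real = vol + rng.normal(0.0, sigma, vol.shape)
    imag = rng.normal(0.0, sigma, vol.shape)
    return np.sqrt(real ** 2 + imag ** 2)

def add_salt_pepper(vol, fraction, rng):
    """Set roughly `fraction` of voxels to the volume's min or max value."""
    out = vol.copy()
    u = rng.random(vol.shape)
    out[u < fraction / 2] = vol.min()        # "pepper"
    out[u > 1.0 - fraction / 2] = vol.max()  # "salt"
    return out

def psnr(reference, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((reference - test) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(data_range ** 2 / mse))

# Usage: corrupt a random stand-in volume and measure the degradation
rng = np.random.default_rng(0)
volume = rng.random((8, 16, 16))
noisy = add_rician(volume, sigma=0.05, rng=rng)
degraded_psnr = psnr(volume, noisy)
```

A PSNR gain such as the one reported above would then be the difference between `psnr(volume, denoised)` and `psnr(volume, noisy)`.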
Rethinking Encoders for Generative Virtual Try-On
Technologies: Diffusion Models, Generative AI, PyTorch
Timeline: Apr 2025 – Present (Research in progress)
Inspired by Stable-ViTON, this project investigates how encoder design can unlock fine-grained texture transfer and realistic garment deformation in virtual try-on systems.
- Proposed a spatially-aware encoding strategy integrated into a diffusion-based pipeline.
SuperSafety: An Intelligent Industrial Safety System
Technologies: Python, PyTorch, MediaPipe, GANs, YOLOv8
Timeline: Jan 2024 – May 2024
A full-stack computer vision system enhancing industrial worker safety by combining super-resolution, PPE detection, and worker tracking.
- Implemented a Progressive Growing GAN to upscale low-resolution surveillance footage from 64×64 to 256×256, achieving an FID of 4.2.
- Fine-tuned a YOLOv8 model on a 2,500-image custom dataset, reaching 96% mAP@0.5 for PPE detection.
- Built a MediaPipe-based tracker to associate PPE with individual workers, sustaining 93% accuracy through occlusions.
- Coordinated a 9-person team using Agile (2-week sprints), delivering 97% of planned features on time.
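As an illustration of the PPE-to-worker association step, the sketch below assigns each detected PPE box to the worker box that contains the largest share of it. This is a simplified stand-in, not the project's MediaPipe-based tracker: the overlap rule, the `min_overlap` threshold, and all names are assumptions, and boxes are assumed to be `(x1, y1, x2, y2)` tuples in pixel coordinates.

```python
def overlap_fraction(ppe_box, worker_box):
    """Fraction of the PPE box area that lies inside the worker box."""
    ix1 = max(ppe_box[0], worker_box[0])
    iy1 = max(ppe_box[1], worker_box[1])
    ix2 = min(ppe_box[2], worker_box[2])
    iy2 = min(ppe_box[3], worker_box[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = (ppe_box[2] - ppe_box[0]) * (ppe_box[3] - ppe_box[1])
    return inter / area if area > 0 else 0.0

def assign_ppe(ppe_boxes, worker_boxes, min_overlap=0.5):
    """Map each PPE index to the best-matching worker index, or None
    when no worker box covers at least `min_overlap` of the PPE box."""
    assignments = {}
    for i, ppe in enumerate(ppe_boxes):
        scores = [overlap_fraction(ppe, w) for w in worker_boxes]
        if not scores:
            assignments[i] = None
            continue
        best = max(range(len(scores)), key=scores.__getitem__)
        assignments[i] = best if scores[best] >= min_overlap else None
    return assignments

# Usage: two workers, two helmets on them, one stray detection
workers = [(0, 0, 100, 200), (150, 0, 250, 200)]
ppe = [(30, 0, 70, 30), (180, 0, 220, 30), (300, 300, 320, 320)]
result = assign_ppe(ppe, workers)
```

A per-frame rule like this does not handle occlusions on its own; sustaining associations through occlusion, as described above, would additionally require carrying worker identities across frames.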
