I am a graduate student in the Master of Science in Computer Vision program at Carnegie Mellon University, advised by Prof. Matthew P. O'Toole, where I work on imaging, cameras, and machine learning. For my capstone project at Meta Reality Lab, I worked with my partner Ashwin Vaswani and was advised by Prof. Deepak Pathak, He Wen, and Yuan Dong.
Prior to CMU, I worked at Actuate AI as a Data Scientist for more than three years. Before Actuate, I earned degrees in Data Science from DePaul University, where I was advised by Prof. Jacob Furst, and in E-commerce from Foreign Trade University of Hanoi, where I was advised by Prof. Hung Nguyen.
Introduction
Currently, my focus is on 3D Vision applications, specifically researching the integration of vision and language to generate CAD models with industrial accuracy. My goal this year is to develop a versatile framework capable of producing CAD models for furniture components from image and/or language inputs, accommodating out-of-distribution requests like unconventional chair designs.
If humans could see stars as nocturnal animals do, hear high-frequency sounds, and feel magnetic fields using external sensors, how would the brain handle these new types of signals? How would it affect our consciousness and subconsciousness? Is it possible to unlock new perceptions for humans, and if so, how?
Biological species evolve based on primal goals. How can robots be made to evolve into a high-order organized society similar to that of humans, and what would it look like? How would they cooperate, handle conflicts, yield, compromise, and come up with new, non-predetermined goals (such as performing art or exploring space)? When robots can explore through self-adjustment (questioning, reasoning, and creating), do they become truly intelligent subjects?
My work
The following projects showcase my skills and experience through real-world examples of my work. Each project is briefly described with links to code repositories and reports. Together, they reflect my ability to solve complex problems and manage projects effectively.
Part segmentation can reduce the ambiguity of meshes used in further downstream tasks for AR/VR at Meta. Multi-view (MV) part segmentation faces challenges due to complexity and high labeling costs/time (can...
Classify Region and Detect Landmark for Localization...
Improve learning from missing modalities in federated settings...
Optical Flow and Depth for Pose Estimation on TartanAir...
Is structured light better?...
Fine-tune the SAM model for 3D inputs...
Meta Learning with Distributionally Robust Optimization for Medical Text...
A home-made pipeline of incremental learning for object detection...
Add object tracker to YOLOv4 Darknet...
Localization on the Moon using ground image...
My Accomplishments
Courses I took
16-820
Advanced Computer Vision.
11-777
Multimodal Machine Learning.
16-811
Mathematical Fundamentals for Robotics.
16-861
Space Robotics.
16-823
Physics-based Methods in Vision.
16-825
Learning for 3D Vision.
15-858
Discrete Differential Geometry.
CS330
Deep Multi-Task and Meta Learning.
IS467
Fundamentals of Data Science.
CSC424
Advanced Data Analysis.
CSC478
Machine Learning.
CSC495
Social Networks Analysis.
CSC481
Image Processing.
CSC528
Computer Vision.
CSC555
Mining Big Data.
CSC529
Advanced Data Mining.
CSC578
Neural Networks and Deep Learning.
CSC594
Natural Language Processing.
CSC587
Cognitive Science.