Date of Award

2025

Degree Name

Data Science

College

College of Engineering and Computer Sciences

Type of Degree

M.S.

Document Type

Thesis

First Advisor

Dr. Haroon Malik

Second Advisor

Dr. Ananya Jana

Third Advisor

Dr. Aniruddha Maiti

Fourth Advisor

Dr. Husnu Narman

Abstract

Deep learning methods have advanced rapidly, and digital dentistry has adopted them for tasks such as tooth segmentation from intraoral scans and tooth crown generation. However, a significant implementation gap remains due to factors such as limited data availability, the complex nature of tooth images, and inherent ambiguities. My research began by exploring a gap in tooth segmentation methods for intraoral scans. Intraoral scans are large, often containing over 200K mesh cells, and must be downsampled to make them suitable for deep learning methods. We tested different resolutions to assess how much information is lost during downsampling and found that segmentation quality drops sharply when the resolution falls below 6K mesh cells. I also experimented with large language models and prompt engineering while cleaning and preparing a scientific text dataset, and we compared the performance of two large language models on scientific text categorization using the dataset we prepared. Through these experiments, we realized that very few specialized models exist for the overall analysis of dental images, and that there is a lack of dental image captioning datasets with which to train such models. This led to the research question: Can Vision-Language Models be used to generate structured dental image captions in the absence of paired datasets, and thereby support the creation of a dental captions dataset? Building on these experiences, we devised a framework that generates captions for dental images using Vision-Language Models and evaluated the quality of the generated captions. The model produced clinically relevant captions with structured tooth type and surface information, achieving high accuracy for certain tooth types and disease conditions. Manual evaluation confirmed that the framework produced consistent and interpretable captions across multiple datasets.
Overall, this methodology was designed to advance dental image analysis and dental caption generation by using Vision-Language Models to address a critical gap in the domain.

Subject(s)

Computer science.

Dentistry.

Machine learning.

Dentistry -- Data processing.

Dental instruments and apparatus.

Data sets.

Available for download on Friday, December 04, 2026
