Date of Award
2026
Degree Name
Data Science
College
College of Engineering and Computer Sciences
Type of Degree
M.S.
Document Type
Thesis
First Advisor
Dr. Husnu Narman
Second Advisor
Dr. Haroon Malik
Third Advisor
Dr. Ananya Jana
Abstract
The rapid advancements in generative Artificial Intelligence (AI), alongside its accessibility and ease of use, have necessitated the development of detection frameworks that are accurate, efficient, and explainable. This thesis evaluates four lightweight Vision Transformers (MobileViT, MobileViTv2, EdgeNeXt, and EfficientViT) for the task of single-face deepfake detection. Using large-scale benchmarks (DF40 and DDL) for training and testing, the thesis investigates the trade-off between computational cost and generalization capability. The experimental protocol involved fourteen configurations, evolving from binary classification (Real vs. Fake) to a five-class taxonomy (Real, Face Reenactment, Face Swapping, Entire Face Synthesis, and Face Editing). To address source-domain bias and model overconfidence, the thesis adopted an iterative training pipeline, implementing distinct data-splitting approaches alongside aggressive regularization techniques, specifically Mixup and CutMix. The results showed that MobileViTv2 is the optimal architecture for real-time edge-device deployment, achieving a peak multi-class accuracy of 96.3% while maintaining the lowest inference time (2.56 ms) and a minimal memory footprint (4.39 million parameters). Furthermore, the thesis goes beyond subjective Explainable AI (XAI) analysis by using the DDL benchmark ground-truth masks to quantify feature localization. The optimal model achieved a mean Intersection over Union (mIoU) of 23.37% under weakly supervised conditions. Overall, this thesis provides an efficient and verifiable foundation for deepfake detection in high-stakes environments.
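For readers unfamiliar with the localization metric mentioned in the abstract, the following is a minimal sketch of how an mIoU score between model attention maps and ground-truth manipulation masks could be computed. The abstract does not state which XAI method, binarization threshold, or averaging scheme the thesis used, so the 0.5 threshold, the per-sample averaging, and all function names here (`binarize`, `iou`, `mean_iou`) are illustrative assumptions rather than the author's implementation.

```python
from typing import Sequence

Grid = Sequence[Sequence[float]]


def binarize(heatmap: Grid, threshold: float = 0.5) -> list:
    """Turn a saliency map with values in [0, 1] into a 0/1 mask.

    The 0.5 default is an assumption; the abstract gives no threshold.
    """
    return [[1 if v >= threshold else 0 for v in row] for row in heatmap]


def iou(pred: Grid, target: Grid) -> float:
    """Intersection over Union of two binary masks of equal shape."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks agree perfectly, so treat that case as IoU = 1.
    return inter / union if union else 1.0


def mean_iou(heatmaps: Sequence[Grid], masks: Sequence[Grid],
             threshold: float = 0.5) -> float:
    """Average the per-sample IoU over a set of (heatmap, mask) pairs.

    Averaging over samples is an assumption; some benchmarks average
    over classes instead.
    """
    if not heatmaps or len(heatmaps) != len(masks):
        raise ValueError("need equal, non-zero numbers of heatmaps and masks")
    scores = [iou(binarize(h, threshold), m) for h, m in zip(heatmaps, masks)]
    return sum(scores) / len(scores)


# Toy 2x2 example: the first prediction covers the forged pixel plus one
# extra pixel (IoU 0.5); the second matches its mask exactly (IoU 1.0).
heatmaps = [[[0.9, 0.8], [0.1, 0.2]], [[0.0, 0.0], [0.7, 0.0]]]
masks = [[[1, 0], [0, 0]], [[0, 0], [1, 0]]]
score = mean_iou(heatmaps, masks)
```

Because the classifier is trained only with image-level labels, any overlap between its attention maps and the pixel-level masks is obtained without mask supervision, which is the sense in which the abstract describes the setting as weakly supervised.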
Subject(s)
Computer science.
Big data.
Image processing -- Digital techniques.
Computer vision.
Deception -- Technological innovations.
Artificial intelligence.
Machine learning.
Neural networks (Computer science)
Pattern recognition systems.
Recommended Citation
Al Babele, Omar, "Deepfake detection beyond the black box: interpretability and performance benchmarking of lightweight Vision Transformers" (2026). Theses, Dissertations and Capstones. 2071.
https://mds.marshall.edu/etd/2071
