Interests
I believe an AI system can’t be called “intelligent” unless it correctly handles the many ways in which human users can interact with it. The road to this point is long, and it requires understanding where and when an AI system will fail so that potential mistakes can be identified before they happen. In practice, this means developing methods for uncovering bias in both models and datasets, techniques for generating challenging test cases, algorithms for detecting annotation mistakes, and better evaluation metrics.
Education
PhD Computer Science - Vanderbilt University (ongoing)
MSE Electrical and Computer Engineering - University of Michigan
BSE Computer Engineering - University of Michigan
Preprints
1. Document Type Classification using File Names
Zhijian Li, Stefan Larson, Kevin Leach
arXiv preprint, 2024
2. ShabbyPages: A Reproducible Document Denoising and Binarization Dataset
Alexander Groleau, Kok Wei Chee, Stefan Larson, Samay Maini, Jonathan Boarman
arXiv preprint, 2023
3. A Survey of Datasets for Intent Classification and Slot-Filling for Task-Oriented Dialog
Stefan Larson, Kevin Leach
arXiv preprint, 2022
Publications
1. De-Identification of Sensitive Personal Data in Datasets Derived from IIT-CDIP
Stefan Larson, Nicole Cornehl Lima, Santiago Pedroza Diaz, Amogh Manoj Joshi, Siddharth Betala, Jamiu Tunde Suleiman, Yash Mathur, Kaushal Kumar Prajapati, Ramla Alakraa, Junjie Shen, Temi Okotore, Kevin Leach
EMNLP 2024
2. Generating Hard-Negative Out-of-Scope Data with ChatGPT for Intent Classification
Zhijian Li, Stefan Larson, Kevin Leach
LREC-COLING 2024
3. Augraphy: A Data Augmentation Library for Document Images
Alexander Groleau, Kok Wei Chee, Stefan Larson, Samay Maini, Jonathan Boarman
ICDAR 2023
4. On Evaluation of Document Classification using RVL-CDIP
Stefan Larson, Gordon Lim, Kevin Leach
EACL 2023
5. Evaluating Out-of-Distribution Performance on Document Image Classifiers
Stefan Larson, Gordon Lim, Yutong Ai, David Kuang, Kevin Leach
NeurIPS D&B 2022
6. Redwood: Using Collision Detection to Grow a Large-Scale Intent Classification Dataset
Stefan Larson, Kevin Leach
SIGDIAL 2022
7. Exploring Out-of-Distribution Generalization in Text Classifiers Trained on Tobacco-3482 and RVL-CDIP
Stefan Larson, Navtej Singh, Saarthak Maheshwari, Shanti Stewart, Uma Krishnaswamy
Document Images and Language Workshop (DIL) at ICDAR 2021
8. LSOIE: A Large-Scale Dataset for Supervised Open Information Extraction
Jacob Solawetz, Stefan Larson
EACL 2021
9. Inconsistencies in Crowdsourced Slot-Filling Annotations: A Typology and Identification Methods
Stefan Larson, Adrian Cheung, Anish Mahendran, Kevin Leach, Jonathan K. Kummerfeld
COLING 2020
10. Iterative Feature Mining for Constraint-Based Data Collection to Increase Data Diversity and Model Robustness
Stefan Larson, Anthony Zheng, Anish Mahendran, Rishi Tekriwal, Adrian Cheung, Eric Guldan, Kevin Leach, Jonathan K. Kummerfeld
EMNLP 2020
11. Data Query Language and Corpus Tools for Slot-Filling and Intent Classification Datasets
Stefan Larson, Eric Guldan, Kevin Leach
LREC 2020
12. An Evaluation Dataset for Intent Classification and Out-of-Scope Prediction
Stefan Larson, Anish Mahendran, Joseph J. Peper, Christopher Clarke, Andrew Lee, Parker Hill, Kevin Leach, Jonathan K. Kummerfeld, Michael A. Laurenzano, Lingjia Tang, Jason Mars
EMNLP 2019
13. Outlier Detection for Improved Data Quality and Diversity in Dialog Systems
Stefan Larson, Anish Mahendran, Andrew Lee, Jonathan K. Kummerfeld, Parker Hill, Michael A. Laurenzano, Johann Hauswald, Lingjia Tang, Jason Mars
NAACL 2019
Datasets
OOD data for RVL-CDIP
This collection is a companion dataset for RVL-CDIP, a popular document image classification benchmark. RVL-CDIP-N includes in-domain, out-of-distribution data: documents that belong to one of the 16 RVL-CDIP categories but come from a different source. RVL-CDIP-O includes out-of-domain, out-of-distribution data: documents that belong to none of those categories. Both the -N and -O datasets consist of documents found on DocumentCloud and through web search (e.g., Google and Bing).
OOS Intent Classification Dataset
This dataset targets the task of intent classification. It contains 150 “in-scope” system-supported intents across 10 domain areas, and notably includes a substantial number of “out-of-scope” samples to test out-of-distribution detection performance.
Patents
1. Systems and Methods Implementing Data Query Language and Utterance Corpus Implements for Handling Slot-Filling and Dialogue Intent Classification Data in a Machine Learning Task-Oriented Dialogue System
US Patent No. 11,183,175
2. Systems and Methods for Mixed Setting Training for Slot Filling Machine Learning Tasks in a Machine Learning Task-Oriented Dialogue System
US Patent No. 11,043,208; 2021
3. Systems and Methods for Automatically Detecting and Repairing Slot Errors in Machine Learning Training Data for a Machine Learning-Based Dialogue System
US Patent No. 10,929,761; 2021
4. Systems and Methods for Constructing an Artificially Diverse Corpus of Training Data Samples for Training a Contextually-Biased Model for a Machine Learning-Based Dialogue System
US Patent No. 10,796,104; 2020
5. Systems and Methods for Automatically Configuring Training Data for Training Machine Learning Models of a Machine Learning-Based Dialogue System Including Seeding Training Samples or Curating a Corpus of Training Data Based on Instances of Training Data Identified as Anomalous
US Patent No. 10,679,150; 2020