Interaction Matching for Long-Tail Multi-Label Classification

AAAI 2021 Workshop on Scientific Document Understanding

Published February 9, 2021

Sean Macavaney, Franck Dernoncourt, Walter Chang, Nazli Goharian, Ophir Frieder

We present an elegant and effective approach for addressing limitations in existing multi-label classification models by incorporating interaction matching, a concept shown to be useful for ad-hoc search result ranking. By performing soft n-gram interaction matching, we match labels with natural language descriptions (which are common to have in most multi-labeling tasks). Our approach can be used to enhance existing multi-label classification approaches, which are biased toward frequently-occurring labels. We evaluate our approach on two challenging tasks: automatic medical coding of clinical notes and automatic labeling of entities from software tutorial text. Our results show that our method can yield up to an 11% relative improvement in macro performance, with most of the gains stemming from labels that appear infrequently in the training set (i.e., the long tail of labels).

Learn More