Explainable Machine Learning for Fake News Detection

Abstract

Recently, many research efforts have aimed to understand the fake news phenomenon and to identify typical patterns and features of fake news. Yet, the real discriminating power of these features is still unknown: some are fairly general, while others perform well only on specific data. In this work, we conduct a highly exploratory investigation that produces hundreds of thousands of models from a large and diverse set of features. These models are unbiased in the sense that their features are chosen at random from the pool of available features. While the vast majority of the models are ineffective, we are able to produce a number of models that yield highly accurate decisions, effectively separating fake news from actual stories. Specifically, we focus our analysis on models that rank a randomly chosen fake news story higher than a randomly chosen fact with probability greater than 0.85 (equivalently, models with an area under the ROC curve above 0.85). For these models, we find a strong link between features and model predictions: some features are clearly tailored to detecting certain types of fake news, evidencing that different combinations of features cover specific regions of the fake news space. Finally, we present an explanation of the factors contributing to model decisions, thus promoting civic reasoning by complementing our ability to evaluate digital content and reach warranted conclusions.
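
The exploratory procedure described above (randomly sampled feature subsets, one model per subset, and an AUC threshold of 0.85) can be sketched as follows. This is a minimal illustration assuming a NumPy feature matrix X, binary labels y (1 for fake news, 0 for fact), and scikit-learn; the function name, the subset size, and the random forest classifier are illustrative assumptions, not the authors' exact setup.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def explore_feature_subsets(X, y, n_models=1000, subset_size=10, seed=0):
    """Train models on random feature subsets and keep those with AUC > 0.85."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    accepted = []  # (feature indices, mean AUC) of models above the threshold
    for _ in range(n_models):
        # Unbiased subset: features drawn uniformly at random from the pool.
        idx = rng.choice(n_features, size=subset_size, replace=False)
        clf = RandomForestClassifier(n_estimators=100, random_state=0)
        # ROC AUC equals the probability that a randomly chosen fake story
        # is ranked above a randomly chosen fact, matching the 0.85 criterion.
        auc = cross_val_score(clf, X[:, idx], y, cv=5, scoring="roc_auc").mean()
        if auc > 0.85:
            accepted.append((idx, auc))
    return accepted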

Publication
Proceedings of the 10th ACM Conference on Web Science