Times Change and Your Training Data Should Too: The Effect of Training Data Recency on Twitter Classifiers

PDF, 3.43MB
Published: 11 Jul, 2018
Created by: Ryan O'Grady

Sophisticated adversaries are moving their botnet command and control infrastructure to social media microblogging sites such as Twitter. As security practitioners work to identify new methods for detecting and disrupting such botnets, including machine-learning approaches, we must better understand what effect training data recency has on classifier performance. This research investigates how well several binary classifiers distinguish between non-verified and verified tweets as the offset between the age of the training data and the age of the test data changes.
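
The experimental setup described above (train once, then test against progressively newer data) can be sketched roughly as follows. This is a minimal illustration, not the paper's code: the function names, the single-feature threshold classifier, and the toy data are all hypothetical, and the toy values were constructed to drift so that the mechanics are visible. They say nothing about the paper's actual findings.

```python
from collections import defaultdict


def accuracy_by_offset(samples, train_period, fit, predict):
    """Train on one time period, then score every later period.

    samples: iterable of (period, feature, label) tuples (hypothetical layout).
    Returns {offset: accuracy}, where offset = test period - train period.
    """
    by_period = defaultdict(list)
    for period, feature, label in samples:
        by_period[period].append((feature, label))

    model = fit(by_period[train_period])
    results = {}
    for period in sorted(by_period):
        if period <= train_period:
            continue  # only evaluate on data newer than the training data
        rows = by_period[period]
        correct = sum(predict(model, x) == y for x, y in rows)
        results[period - train_period] = correct / len(rows)
    return results


def fit_threshold(rows):
    """Toy stand-in classifier: midpoint between the two class means."""
    pos = [x for x, y in rows if y == 1]
    neg = [x for x, y in rows if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict_threshold(threshold, x):
    return 1 if x > threshold else 0


# Synthetic data, invented for illustration only: (period, feature, label).
# Label 1 stands in for "verified", 0 for "non-verified".
TOY = [
    (0, 1, 0), (0, 2, 0), (0, 5, 1), (0, 6, 1),
    (1, 2, 0), (1, 3, 0), (1, 6, 1), (1, 7, 1),
    (2, 3, 0), (2, 4, 0), (2, 7, 1), (2, 8, 1),
    (3, 4, 0), (3, 5, 0), (3, 8, 1), (3, 9, 1),
]

if __name__ == "__main__":
    print(accuracy_by_offset(TOY, 0, fit_threshold, predict_threshold))
    # {1: 1.0, 2: 0.75, 3: 0.5}
```

Passing `fit` and `predict` as arguments keeps the offset loop independent of the classifier, so the same harness could in principle be reused to compare several binary classifiers on identical time splits.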