Analyzed Java Code Snippets: The Corpus

Published: 29 Sep, 2021

Created by:

Hitarth Patel

Static code analysis is a way to find vulnerabilities in source code. However, the process is hampered by the large number of false-positive findings, which take additional time and resources to address and divert effort from remediating actual vulnerabilities. For this research, a corpus was created as a first step toward developing a machine learning model that could potentially weed out false positives. Because few datasets were available for this purpose, the project focuses on the process of developing the datasets that could feed such a model. This paper is a blueprint for future research and for improving the processes used to find vulnerabilities without false positives.

SEC595: Applied Data Science and AI/Machine Learning for Cybersecurity Professionals
