A data analyst takes a deep dive into defect datasets. What does she learn about AppSec along the way? Introducing the Enso Research Den’s three-part AppSec research series, which compares defect data, the technologies involved in collection, severity levels, and CVE alignment over time from the perspective of a data analyst.
As an Application Security Posture Management (ASPM) solution, Enso continuously gathers information from across different organizational R&D and AppSec tools to provide security teams with visibility into the dynamic production environments involved in application development. As a data analyst using these tools, I am naturally exposed to large sets of data on defects and their corresponding information.
Technologies evolve at a rapid pace, and there seems to be no end in sight to the digital transformation that began years ago. As a result, vulnerability management has become one of the most critical processes for ensuring continuous business operations in organizations of all types. To address potential security issues, AppSec teams are in a constant race to discover, classify, and prioritize vulnerabilities.
As a data analyst, I decided it was time to jump into the AppSec wild and see how using raw data can inform and enhance our understanding of security vulnerabilities in applications.
Enso’s data team took a deep dive into our comprehensive data collection in order to contribute to public AppSec-related research and security vulnerability-related topics. Our three-part series will break down the research surrounding some of these topics into actionable insights.
For our research, we used known CVE data collections that rely on NVD’s CVE feed, such as the ‘Open CVE’ and ‘CVE Details’ databases. CVE (Common Vulnerabilities and Exposures) is a standard identifier scheme that security professionals use to reference publicly known vulnerabilities. We used CVSS (Common Vulnerability Scoring System) scores as the baseline for determining severity.
In the following graph, we can see an upward trend in the number of new vulnerabilities published over time. Note that 60% of known vulnerabilities were published in the past six years alone.
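As a rough illustration, a share like the one quoted above can be computed from a list of CVE publication years. This is only a sketch: the function name and the sample years are hypothetical and are not taken from our dataset.

```python
def share_published_since(cve_years, first_year):
    """Return the fraction of CVEs whose publication year is >= first_year."""
    if not cve_years:
        return 0.0
    recent = sum(1 for year in cve_years if year >= first_year)
    return recent / len(cve_years)


# Hypothetical toy data: one publication year per CVE record.
sample_years = [2009, 2012, 2015, 2017, 2018, 2018, 2019, 2020, 2021, 2021]

share = share_published_since(sample_years, first_year=2017)
print(f"{share:.0%} of the sample CVEs were published in 2017 or later")
```

On a real feed, the same calculation would run over the publication date of every CVE record rather than a hand-written list.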
There are several valid hypotheses to explain this dramatic rise in this specific timeframe:
As open-source projects become increasingly popular, we see a rise in related security research and in the identification of vulnerabilities in open-source projects. This has naturally led to an abundance of SCA scanning tools, as they provide fast vulnerability detection in today’s centralized development environments. The rise in SCA usage might therefore contribute to security researchers disclosing more vulnerabilities in open-source projects.
In order to analyze exploits and their associated CVEs we examined multiple data sources, which in many cases were not aligned. In addition, we used public sources such as ‘Exploit DB’ to tag the collected CVE data with their associated exploits.
Attempting to correlate the different data sources, we encountered many challenges. Not every exploit has an associated CVE, and many CVEs have exploits from different sources, making it difficult to reach a true verdict for the question above.
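The correlation problem can be sketched in a few lines of Python. The example below is a simplified illustration, not our actual pipeline: the function name, the record layout, and the placeholder identifiers are all assumptions. It groups exploits by CVE and keeps the exploits that reference no known CVE in a separate list.

```python
from collections import defaultdict


def tag_cves_with_exploits(cve_ids, exploits):
    """Group exploit IDs by CVE and collect exploits with no known CVE.

    cve_ids  -- list of CVE identifiers taken from the CVE feed
    exploits -- iterable of (exploit_id, cve_id_or_None) pairs
    """
    known = set(cve_ids)
    by_cve = defaultdict(list)
    unmatched = []
    for exploit_id, cve_id in exploits:
        if cve_id in known:
            by_cve[cve_id].append(exploit_id)
        else:
            # Either no CVE is referenced or the CVE is missing from the feed.
            unmatched.append(exploit_id)
    # Every CVE gets an entry, even if no exploit was found for it.
    tagged = {cve_id: by_cve.get(cve_id, []) for cve_id in cve_ids}
    return tagged, unmatched


# Hypothetical placeholder identifiers, for illustration only.
cves = ["CVE-0000-0001", "CVE-0000-0002", "CVE-0000-0003"]
exploit_rows = [
    ("EXP-1", "CVE-0000-0001"),
    ("EXP-2", "CVE-0000-0001"),  # a second exploit for the same CVE
    ("EXP-3", None),             # an exploit with no associated CVE
    ("EXP-4", "CVE-0000-0003"),
]

tagged, unmatched = tag_cves_with_exploits(cves, exploit_rows)
print(tagged)
print(unmatched)
```

Keeping the unmatched exploits visible, rather than silently dropping them, makes it easier to judge how much of the exploit data never lines up with a CVE.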
There is also a need to address the time that elapses between the disclosure of a CVE and the appearance of an available PoC exploit (which raises the vulnerability severity). The distribution of exploits over the noted years below raises the question: what were the underlying factors that made vulnerabilities a target for more exploits during this timeframe? The answer can potentially help us predict which vulnerabilities are likely to be exploited in the AppSec wild sooner than others, allowing us to prioritize tasks better and mitigate risk faster.
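One way to measure that elapsed time is to compare each CVE's publication date with the date of its earliest known exploit. The sketch below assumes both dates are available per CVE; the function name, the data layout, and the sample dates are hypothetical.

```python
from datetime import date


def exploit_lag_days(cve_published, exploit_published):
    """Days from CVE publication to its earliest exploit.

    cve_published     -- dict mapping CVE id to publication date
    exploit_published -- dict mapping CVE id to a list of exploit dates
    A negative value means an exploit was published before the CVE.
    """
    lags = {}
    for cve_id, exploit_dates in exploit_published.items():
        if cve_id in cve_published and exploit_dates:
            first_exploit = min(exploit_dates)
            lags[cve_id] = (first_exploit - cve_published[cve_id]).days
    return lags


# Hypothetical placeholder records, for illustration only.
cve_dates = {
    "CVE-0000-0001": date(2017, 1, 10),
    "CVE-0000-0002": date(2008, 6, 1),
}
exploit_dates = {
    "CVE-0000-0001": [date(2017, 3, 1), date(2017, 1, 20)],
    "CVE-0000-0002": [date(2017, 6, 1)],  # a late exploit for an old CVE
}

lags = exploit_lag_days(cve_dates, exploit_dates)
for cve_id, days in sorted(lags.items()):
    print(cve_id, days)
```

A long lag, as in the second record, is the case where a new exploit for an old CVE shows up years after disclosure.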
However, when looking at the number of CVEs that have an exploit attached to them, we see that the number of exploits is not increasing over time as we would have expected.
An interesting research topic would be to analyze the reasons behind the peaks in 2008 and 2017 and to explore why exploit activity concentrated in these years in particular.
One possible explanation is that exploits are generally published after the corresponding CVE. The peaks around 2008 and 2017 suggest that new exploits associated with old CVEs are still being published, which makes it difficult to maintain a single source of truth and answer the question above.
The findings above are important because they suggest an issue with how this sort of data is reported by security researchers and collected by the industry, and emphasize how difficult it is for AppSec professionals to explore all the available information in public data.
Prioritizing vulnerability remediation has become a challenge for many AppSec teams, as many rely on CVSS in order to prioritize their critical issues.
In the left figure, we can see the distribution of CVE severity levels, based on their CVSS scores. Out of ~180K CVEs examined, 58% are classified in the medium severity level (CVSS between 4 and 6).
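A distribution like this comes from mapping each CVSS score to a severity label and counting the labels. The sketch below assumes the standard CVSS v3 qualitative bands (low 0.1 to 3.9, medium 4.0 to 6.9, high 7.0 to 8.9, critical 9.0 to 10.0), which may differ from the exact cut-offs behind the figure. The function names and sample scores are hypothetical.

```python
from collections import Counter


def severity_label(score):
    """Map a CVSS base score to a qualitative label (CVSS v3 bands assumed)."""
    if score >= 9.0:
        return "critical"
    if score >= 7.0:
        return "high"
    if score >= 4.0:
        return "medium"
    if score > 0.0:
        return "low"
    return "none"


def severity_distribution(scores):
    """Return the fraction of scores that fall into each severity label."""
    if not scores:
        return {}
    counts = Counter(severity_label(score) for score in scores)
    return {label: count / len(scores) for label, count in counts.items()}


# Hypothetical toy scores, not taken from the ~180K CVEs examined.
sample_scores = [2.1, 4.3, 5.0, 6.8, 7.5, 9.8, 5.5, 4.0, 6.1, 3.9]

distribution = severity_distribution(sample_scores)
for label, fraction in sorted(distribution.items()):
    print(f"{label}: {fraction:.0%}")
```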
In the figure on the right, we see how CVEs with exploit attachments are distributed by severity. Most of the exploits are found in vulnerabilities with medium and high severity, which is reasonable since critical-severity issues are comparatively rare.
Criticality is different from the probability of exploitation. Our findings show that while high-severity issues are less likely to be identified, they are more likely to be exploited.
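The distinction is between two different ratios: the share of all exploited CVEs that fall into a severity level, and the exploitation rate within that level. The sketch below computes both from per-CVE records. The function name and the toy numbers are hypothetical and chosen only to show how the two measures can point in different directions.

```python
from collections import Counter


def exploit_share_and_rate(records):
    """Compute exploit share and exploitation rate per severity level.

    records -- iterable of (severity, has_exploit) pairs, one per CVE
    share   -- fraction of all exploited CVEs that have this severity
    rate    -- fraction of CVEs with this severity that have an exploit
    """
    totals = Counter()
    exploited = Counter()
    for severity, has_exploit in records:
        totals[severity] += 1
        if has_exploit:
            exploited[severity] += 1
    all_exploited = sum(exploited.values())
    share = {}
    if all_exploited:
        share = {sev: exploited[sev] / all_exploited for sev in totals}
    rate = {sev: exploited[sev] / totals[sev] for sev in totals}
    return share, rate


# Hypothetical toy data: 8 medium CVEs (2 exploited), 2 high CVEs (1 exploited).
toy_records = (
    [("medium", True)] * 2 + [("medium", False)] * 6
    + [("high", True)] * 1 + [("high", False)] * 1
)

share, rate = exploit_share_and_rate(toy_records)
print("share of exploited CVEs:", share)
print("exploitation rate:", rate)
```

In this toy data, medium-severity CVEs account for most of the exploited CVEs simply because there are more of them, while a high-severity CVE is more likely to have an exploit.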
With so many emerging defects to cover, the real challenge is not only to manage the vulnerabilities but also to find, implement and enforce an effective remediation process.
In the next episode of a data analyst wandering into the AppSec wild, we will take a closer look at the data trends surrounding SAST and SCA tools across the most popular development technologies.
Questions and suggestions are welcome! Please reach out to me at firstname.lastname@example.org.