Improving the Issue Detection Rate
In code analysis, the issue detection rate is defined as: (Number of Real Issues Correctly Identified / Total Number of Real Issues Actually Present in the Project) × 100%.
The detection rate is influenced by multiple factors, such as rule configurations, project dependencies, and newly disclosed vulnerabilities. Raising the detection rate can also raise the false positive rate, and different business teams may disagree on what counts as a false positive. Efforts to improve the detection rate therefore need to handle false positives sensibly as well, with coordinated optimization across tools, processes, rule configurations, and team collaboration.
Tencent Cloud Code Analysis (TCA) uses self-developed enhanced analysis tools built on techniques such as lexical analysis, syntactic analysis, control flow analysis, and data flow analysis. It also integrates many well-known open-source tools and supports more than 30 programming languages. Drawing on years of internal experience in improving detection rates, the following improvement strategies (among others) can be adopted:
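As a worked illustration of the definition above, the following minimal Python sketch computes the detection rate for a hypothetical project. The function names and the numbers are assumptions made for this example, and the false positive share shown here (reported issues that are not real, divided by all reported issues) is one common way to quantify false positives rather than a definition taken from TCA.

```python
def detection_rate(true_positives: int, total_real_issues: int) -> float:
    """Percentage of the real issues in a project that were correctly identified."""
    if total_real_issues <= 0:
        raise ValueError("total_real_issues must be positive")
    return 100 * true_positives / total_real_issues


def false_positive_share(reported: int, true_positives: int) -> float:
    """Percentage of reported issues that are not real issues (assumed definition)."""
    if reported <= 0:
        raise ValueError("reported must be positive")
    return 100 * (reported - true_positives) / reported


# Hypothetical project: 50 real issues exist; the tool reports 60 issues,
# 40 of which are real.
print(detection_rate(40, 50))                   # 80.0
print(round(false_positive_share(60, 40), 2))   # 33.33
```

In this made-up case the tool finds 80% of the real issues, but one in three reports still needs to be triaged, which is why both numbers have to be watched together.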
Analysis Rule Scheme Design.
Custom Tools and Rules.
Tool Rule Upgrades and Synchronization.
Strategies for Handling Issues Unnecessary to Fix.
1. Analysis Rule Scheme Design
TCA supports more than 30 programming languages and provides a rich library of tools and rules. It offers a variety of rule packages designed around specific focuses or purposes, such as code specification rule packages and sensitive information rule packages.
Business teams can flexibly select and combine rule packages, or pick individual rules from the rule library, based on project requirements, team specifications, security requirements, and so on. For example, they can enable rules from compilation-based tools and combine multiple analysis techniques to analyze projects more deeply and accurately.
Selecting appropriate rules and letting rules from multiple tools complement one another effectively improves the detection rate of project issues.
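The idea of combining rule packages can be sketched as simple set operations. This is an illustrative model only: the `Rule` type, the package contents, the tool names, and the `build_scheme` helper are all hypothetical and do not reflect how TCA stores or configures its rule schemes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    tool: str      # hypothetical tool name
    name: str      # hypothetical rule name
    category: str  # e.g. "style" or "security"


# Hypothetical rule packages, each built around one focus.
CODE_SPEC_PACKAGE = {
    Rule("style-tool", "line-too-long", "style"),
    Rule("style-tool", "unused-import", "style"),
}
SENSITIVE_INFO_PACKAGE = {
    Rule("secret-tool", "hardcoded-password", "security"),
    Rule("secret-tool", "private-key-in-repo", "security"),
}


def build_scheme(*packages, extra_rules=(), disabled=()):
    """Union the chosen packages, add individual rules, drop disabled rule names."""
    scheme = set().union(*packages)
    scheme.update(extra_rules)
    return {rule for rule in scheme if rule.name not in disabled}


scheme = build_scheme(
    CODE_SPEC_PACKAGE,
    SENSITIVE_INFO_PACKAGE,
    extra_rules=[Rule("flow-tool", "sql-injection", "security")],
    disabled={"line-too-long"},
)
print(sorted(rule.name for rule in scheme))
```

Modeling a scheme as a set makes overlapping packages harmless, since a rule that appears in two packages is only enabled once.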
2. Custom Tools and Rules
Although TCA's rich rule library covers a large number of common issues, these general-purpose rules are not tailored to any particular business context and have limitations in catching issues specific to a given business logic, project, framework, or technology stack.
To address this, TCA provides capabilities such as custom tools and rules, framework adaptation, and rule parameter configuration, enabling more precise and comprehensive identification of real issues and improving the detection rate of project issues.
Rule Parameter Adjustment: Some rules ship with default configurations and support parameters that change how broadly and deeply they check. Default configurations may not always align with business needs, potentially leading to false negatives or false positives, so business teams need to adjust parameters according to actual requirements. Examples include adjusting the maximum line length for line length rules or enabling strict mode for variable shadowing rules.
Custom Rules: Some tools support custom rules and framework adaptation, which teams can develop based on business needs. Examples include custom rules that check for specific keywords, functions, or configuration items, and definitions of taint and sanitization functions that adapt the analysis to internal frameworks.
Custom Tools: Some teams have internally developed tools or use open-source or commercial tools. TCA lets teams integrate these tools themselves, expanding their tool and rule libraries and improving detection effectiveness.
Custom Rule Development by TCA Team: For scenarios where existing rules or custom tool rules cannot meet needs (e.g., complex business logic issues), TCA provides custom rule development services. Teams can submit relevant requirements to the TCA team.
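To make rule parameters and custom rules more concrete, here is a minimal Python sketch of both ideas. It is not TCA's rule format or API: the `Issue` type, the rule names, the `max_length` parameter, and the banned-function list are assumptions chosen for illustration, and the custom rule is implemented with Python's standard `ast` module.

```python
import ast
from dataclasses import dataclass


@dataclass
class Issue:
    rule: str
    line: int
    message: str


def check_line_length(source: str, max_length: int = 80) -> list[Issue]:
    """Parameterized rule: max_length controls how strict the check is."""
    issues = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if len(line) > max_length:
            issues.append(
                Issue("line-too-long", lineno, f"{len(line)} > {max_length} characters")
            )
    return issues


def check_banned_calls(source: str, banned: set[str]) -> list[Issue]:
    """Custom rule: flag calls to function names a team has chosen to ban."""
    issues = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            # Plain calls such as eval(...) carry the name in .id,
            # attribute calls such as os.system(...) carry it in .attr.
            name = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
            if name in banned:
                issues.append(
                    Issue("banned-call", node.lineno, f"call to banned function '{name}'")
                )
    return sorted(issues, key=lambda issue: issue.line)


SAMPLE = "import os\nresult = eval(user_input)\nos.system('ls')\n"

print(check_line_length(SAMPLE))                 # default limit: nothing reported
print(check_line_length(SAMPLE, max_length=20))  # stricter limit: line 2 reported
print(check_banned_calls(SAMPLE, {"eval", "system"}))
```

The same source produces no line length issues under the default limit and one issue under the stricter limit, which mirrors how a default configuration can hide or surface findings depending on the team's standard.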
3. Tool Rule Upgrades and Synchronization
TCA continuously iterates and upgrades tool rules by reducing false positives, covering newly disclosed vulnerabilities, adapting to changes in languages and frameworks, and enhancing analysis capabilities, thereby enriching the rule library and improving issue detection rates.
Business teams can enhance the depth, breadth, and precision of code analysis by upgrading tool functionality and versions, adopting enhanced rules, enabling advanced analysis capabilities, and optimizing how tools are combined in the toolchain.
TCA typically iterates and upgrades tool rules through the following channels:
Newly Disclosed Vulnerabilities in the Industry: New vulnerabilities (e.g., the Log4j vulnerability) emerge in the industry over time. The TCA team develops and releases tool rules to cover them as they are disclosed.
Internal Tool Development Plans: The TCA team iterates tool rules according to internal tool development plans. After internal gray-release (staged rollout) iterations, these rules are prepared for public use (e.g., AI-assisted inspection tools like CodeLingo).
Custom Rule Development for Customer Needs: Based on internal and external customer needs, the TCA team uses techniques such as lexical analysis, syntactic analysis, control flow analysis, and data flow analysis to develop custom rules that cover those needs. These rules are evaluated for security, compliance, and general applicability before being made available.
4. Strategies for Handling Issues Unnecessary to Fix
Business developers usually care most about whether a detected issue actually requires fixing. Issues the business deems unnecessary to fix often lead to complaints, and business teams may even attribute them to tool false positives.
Understanding False Positives Correctly
"False positive" is a colloquial term used for ease of understanding in practice. Strictly speaking, static analysis has no true false positives: when the static information chain is incomplete or the analysis chain is broken, the tool reports the issue speculatively. Tools prioritize exposing potential code issues, and the relevant developers then confirm each one and decide whether to mark it as ignored, after which it will not be reported again. This drives developers to conduct precise code reviews and reduces the cost of superficial reviews.
Scenarios where the business deems issues unnecessary to fix typically include:
- Test/unused code: The business does not actually use the code containing the vulnerability.
- Duplicate issues: The issue has already been resolved in other branches.
- No fix required: The tool's analysis is correct, but the code is designed this way or requires temporary non-action due to historical reasons.
- Tool false positives: The tool's analysis is inaccurate.
When these scenarios occur, the causes should be analyzed, and issues should be handled rationally based on tool rule design and actual business contexts. Strategies include (but are not limited to):
Project repositories may contain test code, automatically generated code, or third-party code, which can introduce significant noise and reduce analysis efficiency. Configurations such as specifying analysis directories, filtering paths, and filtering out Git submodules concentrate analysis resources on core business code, improving analysis efficiency, detection rate, and fix rate.
For long-standing repositories where some issues need to be marked as not requiring handling, issues can be ignored through code comments or by marking them as ignored on the platform. Ignoring through code comments is generally recommended, because it records the reason at the source and prevents the same issue from being reported again in other branches.
If the same issue is found in different branches of the same code repository and has already been ignored in one branch, code comment ignoring or global issue ignoring can be configured to prevent it from being reported repeatedly.
For tool false positives: Some tool rules adopt stricter strategies to avoid false negatives and to support comprehensive manual review, or may not suit specific business scenarios (e.g., certain security rules). These cases can also be handled by ignoring through code comments or marking as ignored on the platform. Reporting these scenarios to the TCA team additionally helps it iteratively optimize detection rules and configurations, reducing the false positive rate.
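The handling strategies above (path filtering, ignoring through code comments, and ignoring the same issue across branches) can be sketched as a small triage pipeline. Everything named here is hypothetical: the exclusion patterns, the `analysis-ignore` comment marker, and the fingerprint scheme are assumptions for illustration and are not TCA's actual configuration syntax or matching logic.

```python
import fnmatch
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    path: str
    rule: str
    line_text: str  # the source line on which the issue was reported


# Hypothetical settings, not TCA's configuration syntax.
EXCLUDED_PATHS = ["tests/*", "third_party/*", "*_generated.py"]
IGNORE_MARKER = "analysis-ignore"


def is_excluded(path: str) -> bool:
    """Path filtering: skip test, third-party, and generated code."""
    return any(fnmatch.fnmatchcase(path, pattern) for pattern in EXCLUDED_PATHS)


def is_comment_ignored(finding: Finding) -> bool:
    """Comment ignoring: '# analysis-ignore: <rule> <reason>' on the reported line.

    Naive on purpose: it does not handle '#' characters inside string literals.
    """
    _, _, comment = finding.line_text.partition("#")
    return comment.strip().startswith(f"{IGNORE_MARKER}: {finding.rule}")


def fingerprint(finding: Finding) -> str:
    """Branch-independent identity: path, rule, and whitespace-normalized code."""
    normalized = " ".join(finding.line_text.split())
    key = f"{finding.path}|{finding.rule}|{normalized}"
    return hashlib.sha256(key.encode()).hexdigest()


def triage(findings, globally_ignored: set[str]) -> list[Finding]:
    """Keep only findings that still need a developer's attention."""
    return [
        f for f in findings
        if not is_excluded(f.path)
        and not is_comment_ignored(f)
        and fingerprint(f) not in globally_ignored
    ]


findings = [
    Finding("tests/test_pay.py", "hardcoded-secret", 'token = "abc"'),
    Finding("src/pay.py", "hardcoded-secret",
            'token = "demo"  # analysis-ignore: hardcoded-secret sample value'),
    Finding("src/legacy.py", "sql-injection", "cursor.execute(query)"),
    Finding("src/order.py", "sql-injection", "cursor.execute(sql % user_id)"),
]
# Suppose src/legacy.py was already marked as ignored on another branch.
globally_ignored = {fingerprint(findings[2])}

for finding in triage(findings, globally_ignored):
    print(finding.path, finding.rule)
```

Because the fingerprint omits the branch name and line number, an issue ignored once stays ignored when the same code appears on another branch, while the comment marker keeps the reason next to the code it applies to.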
Once issues deemed unnecessary to fix are effectively controlled, business teams can focus on the genuine issues that tools identify, which indirectly improves both the attention paid to real defects and security vulnerabilities and the efficiency of handling them. The end result is a higher issue detection rate with a low false positive rate.