
Is Penetration Testing Security Testing?
Some people start “Security Testing” by buying a pen-test tool and using it on a project. Such tools uncover security vulnerabilities (though they seldom help with root cause analysis or even with obtaining double-digit code coverage).
These tools are degenerate, at best, in facilitating a security testing strategy. Why? Because these tools are “black box” tools. What are black box tools?
The term “black box” stems from old testing literature and means “without internal knowledge”. An external perspective has always excluded “code” but sometimes goes as far as (in my opinion, appropriately) excluding software design. Obviously, though, you need to know something about the application’s architecture and design to test it, and that is where the slope gets slippery.
In the realm of penetration testing, the term “black box” often has meaning beyond the tester’s knowledge of the product. The OSSTMM defines “black box” to mean that neither attacker nor defender is given knowledge of the other. They mean for this test procedure to accurately represent the kinds of opportunities, need for stealth, accessibility, and exploit that a real attacker would have, and also to evaluate the defender’s ability to identify and prevent or recover from an attack.
Because black box tools to a large extent run canned tests, they will not satisfy my security testing goal (see previous entry) of having run tests that one traces back to requirements: requirements that one created as a result of doing risk analysis that determines exactly what behaviors (and their impacts) should be avoided were the software attacked.
Arguably, security folk have “cached” this risk analysis and these implicit requirements in the pen testing tool. Fine, this is that small benefit that I mentioned pen tests do provide. And, they DO find bugs. Again, this is at best a degenerate case of security testing in the same way running a fuzz testing tool is a degenerate way of conducting functional testing.
For QA folk wary of accepting the previous statement, it will suffice to say that you wouldn’t defend your job based on achieving less than fifty percent coverage, would you?
Vendors have begun using hybrid approaches (this will only become more common). Do these approaches solve our coverage problem and allay our concerns?
A few years back I demo’d Compuware’s DevPartner, which has a poorly advertised (and perhaps now nascent) security scanning capability, and was pretty impressed with the start they had made in hybrid .NET analysis. I’m not sure where it’s gone since then. Fortify’s PTA also combines static and dynamic analyses to help prioritize static findings and provide root cause analysis for dynamic ones.
These hybrid tools don’t get our security testers off the hook either though, as they’re still not addressing the project-specific risk analysis nor are they anchoring tests in requirements.
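To make “anchoring tests in requirements” a little more concrete, here is a minimal Python sketch. Everything in it is hypothetical and invented for illustration: the requirement IDs (SEC-001, SEC-002), the toy LoginService, and the traces_to decorator are not taken from any tool mentioned above. The point is only that each test declares the risk-derived requirement it verifies, so gaps show up as requirements with no test rather than as an opaque coverage number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SecurityRequirement:
    req_id: str
    risk: str        # the risk-analysis finding that motivated the requirement
    statement: str   # the behavior the software must (or must not) exhibit


# Hypothetical output of a project-specific risk analysis.
REQUIREMENTS = {
    "SEC-001": SecurityRequirement(
        "SEC-001",
        "Account takeover through password guessing",
        "Lock an account after 5 consecutive failed logins.",
    ),
    "SEC-002": SecurityRequirement(
        "SEC-002",
        "Session hijacking",
        "Invalidate the session token on logout.",
    ),
}

# Maps requirement ID -> names of the tests that trace back to it.
TRACE = {}


def traces_to(req_id):
    """Register a test against the requirement it verifies."""
    def decorator(test_fn):
        if req_id not in REQUIREMENTS:
            raise KeyError(f"unknown requirement: {req_id}")
        TRACE.setdefault(req_id, []).append(test_fn.__name__)
        return test_fn
    return decorator


class LoginService:
    """Toy system under test (an assumption, standing in for the real app)."""
    MAX_FAILURES = 5

    def __init__(self, passwords):
        self._passwords = dict(passwords)
        self._failures = {}

    def login(self, user, password):
        if self._failures.get(user, 0) >= self.MAX_FAILURES:
            return "locked"
        if self._passwords.get(user) == password:
            self._failures[user] = 0
            return "ok"
        self._failures[user] = self._failures.get(user, 0) + 1
        return "denied"


@traces_to("SEC-001")
def test_lockout_after_five_failures():
    svc = LoginService({"alice": "s3cret"})
    for _ in range(5):
        assert svc.login("alice", "wrong") == "denied"
    # Even the correct password must now be refused.
    assert svc.login("alice", "s3cret") == "locked"


def untested_requirements():
    """Requirements that no registered test traces back to."""
    return sorted(set(REQUIREMENTS) - set(TRACE))


if __name__ == "__main__":
    test_lockout_after_five_failures()
    print("untested:", untested_requirements())
```

Run as a script, this reports SEC-002 as having no test, which is exactly the kind of gap a canned scan cannot tell you about.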


April 9th, 2008 at 12:24 pm
From a book I recently read:
Functional (black-box) methods can be applied to any unit, build, or system, since they assume no knowledge of how either was constructed or what it contains. Such methods require a sufficient and unambiguous specification if they are to be effective. (Devising black-box tests for some code from its specification at the time the specification is written is a useful verification of the specification; if it can’t be done, the specification isn’t good enough). This is what you will normally do for a system test.
Structural (white-box) methods can also be used on any unit or build, but are cheapest to devise when used on structured programs, e.g., well-written programs in a structured language. These are usually applied to unit tests.
Dynamic analysis techniques are derivative methods which assume that a particular technique such as condition tables, finite state machines, object-orientation, or functional programming, has been used to develop the software; we use knowledge of that technique to determine the test cases, so these have very special areas of application. These can be applied at both system test and unit test times. These are usually applied to unit tests.
Static analysis techniques can best be used before integration, and can also point to the need for more-rigorous unit testing. These require using tools to analyze the source code. These are usually applied to unit tests.
Symbolic evaluation carried out by an automated symbolic evaluation tool places no special requirement on the unit, except of course that it be in the language evaluated by the tool. This involves “dry-running” the code, and recording the output values of variables algebraically rather than using specific values for input variables. This is usually applied to unit tests.
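The “dry-running” idea in the last quoted paragraph can be sketched in a few lines of Python. This is a toy under stated assumptions, not how any real symbolic evaluation tool works: the Sym class and the padded_area function are hypothetical, only addition and multiplication are supported, and branches (which real tools handle with path conditions) are left out entirely.

```python
class Sym:
    """A symbolic value that records arithmetic as an algebraic expression."""

    def __init__(self, expr):
        self.expr = expr

    @staticmethod
    def _text(value):
        # Symbolic operands contribute their expression, constants their literal.
        return value.expr if isinstance(value, Sym) else repr(value)

    def __add__(self, other):
        return Sym(f"({self.expr} + {Sym._text(other)})")

    def __radd__(self, other):
        return Sym(f"({Sym._text(other)} + {self.expr})")

    def __mul__(self, other):
        return Sym(f"({self.expr} * {Sym._text(other)})")

    def __rmul__(self, other):
        return Sym(f"({Sym._text(other)} * {self.expr})")

    def __repr__(self):
        return self.expr


def padded_area(width, height, padding):
    """Hypothetical unit under test."""
    side_a = width + 2 * padding
    side_b = height + 2 * padding
    return side_a * side_b


if __name__ == "__main__":
    # Dry run: inputs are symbols, so the output is an algebraic expression.
    print(padded_area(Sym("w"), Sym("h"), Sym("p")))
    # Ordinary run with specific input values, for comparison.
    print(padded_area(3, 4, 1))
```

The same unmodified function yields a formula in the first call and a number in the second, which is the distinction the passage draws between algebraic and specific input values.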
Coverity, Valgrind, Purify, and Insure++ are all dynamic analysis tools. However, none of these are “black box” tools. In fact, dynamic analysis is usually a white-box (structural) activity, just as code coverage is also a structural / white-box activity.
Fortify SCA, Ounce, and SPIN are examples of static analysis tools in the classic sense, which is the only sense you refer to.
I often consider all requirements gathering analysis tools (e.g. HP/Mercury TestDirector, IBM/Rational RequisitePro, NASA SATC ARM) to be static analysis tools, even though the definition above doesn’t factor these tools in. In some ways, you don’t hit on the importance of requirements enough, but you’re mixing a lot of testing/inspection methods. Let me see if I can be of some assistance here.
There are tools strictly outside of static analysis that perform code comprehension or other analysis of source code. Examples include SciTools Understand, Clover, EMMA, PartCover, NCover, tcov, gcov, and lcov. Some of these require integration. I would prefer to put most tools in a separate “code comprehension” category. The word “hybrid” is simply too generic and confusing for iterative purposes.
In summary:
1) Black-box testing - EP, BVA, feature tests
2) White-box testing - Code comprehension, dynamic analysis
2a) Code-comprehension/coverage (white-box) - Statement, decision, condition, multiple-condition, LCSAJ, path analysis
2b) Dynamic analysis (white-box) - runtime tests usually done with a symbol table, assertion tests, source tracing
3) Static analysis - DFA, CFA, function value analysis, symbolic execution, mutation tests, fault-injection
3a) Fault-injection - input-based fuzz tests to cause crashes (e.g. stack-based buffer overflow), input tests propagate to output (e.g. cross-site scripting), input tests propagate to function pointer (e.g. heap overflows), etc
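Item 1 in the list above can be illustrated with a short Python sketch, assuming EP stands for equivalence partitioning and BVA for boundary value analysis (the usual expansions). The function under test, is_valid_port, and the helper names are hypothetical. Note that nothing here looks inside the implementation: the test values come purely from the specified range, which is what makes it black-box.

```python
def is_valid_port(port):
    """Hypothetical function under test: accepts TCP/UDP ports 1..65535."""
    return isinstance(port, int) and 1 <= port <= 65535


def partition_representatives(lo, hi):
    """EP: one representative value per equivalence class of [lo, hi]."""
    return {
        "below_range": lo - 10,
        "in_range": (lo + hi) // 2,
        "above_range": hi + 10,
    }


def boundary_values(lo, hi):
    """BVA: values on and immediately around each edge of [lo, hi]."""
    return [lo - 1, lo, lo + 1, hi - 1, hi, hi + 1]


LO, HI = 1, 65535

# One check per equivalence class.
ep_results = {
    name: is_valid_port(value)
    for name, value in partition_representatives(LO, HI).items()
}

# Six checks clustered around the two edges, where off-by-one bugs live.
bva_results = [(value, is_valid_port(value)) for value in boundary_values(LO, HI)]

if __name__ == "__main__":
    print(ep_results)
    print(bva_results)
```

Negative testing in the sense used later in this comment corresponds to the below_range and above_range classes and to the two out-of-range boundary points.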
Compuware SecurityChecker and Fortify PTA are fault-injection tools (not hybrids, sorry!). CUTE, however, is an example of a real hybrid tool that also supports symbolic execution, although it is also technically a fuzz testing tool!
Regardless, these are static analysis tools. There is no such thing as a hybrid tool in the way that you speak of. There is no gray-box. There is no way to combine a dynamic analysis tool with a static analysis tool (special note below).
You are correct in identifying that most other classic penetration-testing tools are black-box “feature testing” tools. I think that some classically labeled fuzz testing tools are mislabeled, and they are actually much more like EC/BVA tools for negative testing.
Real fuzz testing tools are fault-injectors, which are static analysis tools that have access to code. Note that source code is not necessary thanks to binary and bytecode disassembly and decompilation.
PaiMei is a type of hybrid tool that crosses the dynamic and static analysis boundaries. Your holy grail is basically a tool that uses the control-flow path of execution information from dynamic analysis and combines it with output from a DFA (static analysis). Great idea, but it’s lacking quite a bit of important information, such as the data values.
See: P. Fairfield, D. Hedley, and M. A. Hennell. Test coverage analysers. Alvey deliverable A35. Alvey Directorate. Project SE/031.
May 12th, 2008 at 11:25 am
John,
Fascinating analysis. I would like to point out a few things. First, let me reference something I’ve already written on the topic of penetration testing and its value [http://portal.spidynamics.com/blogs/rafal/archive/2008/04/04/What_2700_s-the-point-of-_2200_penetration-testing_22003F00_.aspx]
1) First - Blackbox testing (a la web app vulnerability scanners) has its place in immediately identifying to a business the high-risk “low hanging fruit” that attackers will likely go for. While some pundits out there are claiming that you can only achieve as much as 50% vulnerability detection with a Blackbox tool, I would argue that it’s the 50% that will be found by 99% of the “attackers” and should be fixed and analyzed first.
2) Second - once you’ve had a chance to look at the Blackbox results, it’s time to pull out the IDE-integrated tools, which can help you not only by analyzing your code (byte-code + source) [http://portal.spidynamics.com/blogs/rafal/archive/2008/05/06/Static-Code-Analysis-Failures.aspx] but also by providing a thorough data-trace analysis of your inputs as they move through the application. After all, external data is the risk you’re mitigating… right?
3) Next - I do agree that QA testers need tools. They should have something that maps business requirements right into their testing framework so that “Security Vulnerabilities” are just as much a defect as anything else the business dictates [http://portal.spidynamics.com/blogs/rafal/archive/2008/04/01/Security-vulnerabilities-as-quality-defects_3F00_.aspx].
Anywho… I have been saying for a very long time - security analysis of web applications is not a point-in-time, “do it before we go live”, activity. It must be integrated into the SDLC, scrutinized from the developer’s IDE, analyzed as a defect in QA testing, and finally Blackbox tested before releasing to production. Oh, and just to throw in a shameless plug, the HP ASC (formerly SPI Dynamics, http://www.hp.com/go/securitysoftware/) suite does exactly that with scary precision.
Thanks!