Testing software is (computationally) hard

In my post “A Tale of Two Approaches”, I mentioned that integration testing is an NP-Hard problem. This seems very intuitive to me, but other people don’t; so, it seems appropriate for me to justify this claim through argumentation.

Why test software?

According to Rice’s Theorem, “there exists no automatic method that decides with generality non-trivial questions on the behavior of computer programs”. This means that the general problem of knowing whether a program correctly runs is undecidable (can not be solved by an algorithm).

This limitation forces developers to either (a) prove their algorithms correct using some formal system or (b) write tests for “functional correctness” by verifying that expected inputs match expected outputs.

Most developers choose the later because providing exact proofs cost too much time and money.

For example, suppose that I wrote the following python method that squared a number:

A program to test that function could look like the following

Problems with software testing

The process of writing very specific functional tests often results in a large number of tests for even simple behavior.

Unfortunately, to be completely thorough you would have to write tests for all possible inputs and verify that you get the expected output. This is completely unrealistic.

Now, I don’t know of a good way to prove that claim, but the theory of NP-completeness comes very close: if I can prove that functional testing belongs to the class of NP-complete problems then it is just as hard as the hardest problems in computer science.

Let me give a quick explanation of NP-completeness so you have enough background to understand my claim.

What is NP-completeness?

In computer science, we often try to write software that solves mechanical problems as efficiently as possible. However, there exist a certain class of “decision” problems that we do not know how to solve efficiently in general cases. We call these problems NP-Complete.

We also have a way to identify NP-Complete problems: create a method that can “efficiently” map one NP-Complete problem to another problem.

Why are functional tests NP-complete

Recall, the class TestSquareMethod in the code sample above.

That test uses the arrange-act-assert paradigm which organizes tests into 3 logical groups:

  • Arrange all necessary preconditions and inputs
  • Act on the object or method under test
  • Assert that the expected results occurs

This model has the same logical form as the traditional if-then statement from propositional logic: if the properties in the arrange grouping are true then the properties in the assert grouping are true when I run the program.

Using this fact, I claim (but will not prove) that any arrange-act-assert style test is the same as a logical proposition and vice versa.

For example, by using the rule of material implication, I can model the test “test_2_squared_equals_4” with the boolean formula p ^ ~q where p represents the “preconditions” in the arrange group and q represents the “postconditions” in the assert group. Further, any boolean formula with the form p ^ ~q I can represent with the test “test_2_squared_equals_4”.

Suppose that I asked the question: Given a functional testing application written in an arrange-act-assert style, does there exist some input that obeys the preconditions but whose output does not obey the postconditions?

Alternatively, we can phrase the question as does there exist some arrangement of inputs that do not satisfy the boolean formula’s associated with a collection of functional test.

Now, there exists a NP-Hard problem called SAT which asks whether there exists an interpretation that satisfies a boolean formula, and I claim (from intuition) that we can reduce the SAT problem to this problem.



I want to appeal to your intuition rather than provide something very rigorous.

Providing a proper proof would likely involve Turing Machines and Hoare Logic, and I just don’t want to do the necessary work.

It would also be boring.


Some people claim that writing functional tests are easy.

I call bullshit.

Writing functional tests are not just hard; they are NP-Hard.


One thought on “Testing software is (computationally) hard

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s