Submitted to STVR.
Abstract
Search-based techniques have been applied successfully to the task of generating unit tests for object-oriented software. However, as for any meta-heuristic search, efficiency depends heavily on many factors; seeding, which refers to the use of previous related knowledge to help solve the testing problem at hand, is one such factor that may strongly influence this efficiency. This paper investigates different seeding strategies for unit test generation, in particular seeding of numerical and string constants derived statically and dynamically, seeding of type information, and seeding of previously generated tests. To understand the effects of these seeding strategies, the results of a large empirical analysis carried out on open source projects from the SF110 corpus and the Apache Commons repository are reported. These experiments show with strong statistical confidence that, even for a testing tool already able to achieve high coverage, the use of appropriate seeding strategies can further improve performance.
Experimental Material
- Executable EvoSuite jar file (evosuite-master-1.0.2-SNAPSHOT.jar).
- Relevant parameters (an example invocation is sketched after this list)
- Seeding constants from bytecode: -Dprimitive_pool=[probability]
- Seeding values observed at runtime: -Ddynamic_pool=[probability]
- Seeding type information: -Dseed_types=[boolean]
- Seeding a previous solution (previously generated tests): -Dseed_clone=[probability] -Dseed_mutations=[probability] -Dtest_factory=JUnit -Dselected_junit=[JUnit class]
- Corpus of classes used in RQ1-3 and RQ5 (SF110 Corpus of Java classes).
- Random stratified sample of 100 classes from SF110 (classes_sf110_stratified_100.txt).
- List of all SF110 classes (classes_sf110.txt).
- Apache Commons projects used for RQ4 (apache_commons_projects.txt).
- List of 1212 classes with accompanying test suites from the Apache Commons repository (classes_commons_with_tests.txt).
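Example invocation. The command lines below are only a sketch of how the seeding parameters listed above can be passed to the EvoSuite jar: the target class org.example.Foo, the classpath build/classes, the seed test class org.example.FooTest, and the probability values are placeholders for illustration, not the settings used in the experiments; -class and -projectCP are the standard EvoSuite options for selecting the class under test and its classpath.

  # Seeding constants (static and dynamic) and type information
  java -jar evosuite-master-1.0.2-SNAPSHOT.jar -class org.example.Foo -projectCP build/classes \
       -Dprimitive_pool=0.3 -Ddynamic_pool=0.3 -Dseed_types=true

  # Seeding a previously generated JUnit test suite as the initial solution
  java -jar evosuite-master-1.0.2-SNAPSHOT.jar -class org.example.Foo -projectCP build/classes \
       -Dseed_clone=0.3 -Dseed_mutations=0.3 -Dtest_factory=JUnit -Dselected_junit=org.example.FooTest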