IBM®
Skip to main content
    Israel [change]    Terms of use
 
 
 
    Home    Products    Services & solutions    Support & downloads    My account    
IBM Research

Formal Verification and Testing Technologies

Reliable Systems Technologies


BMC Benchmark Illustrations


In order to provide some illustrations of the use of the IBM CNF Benchmark (from http://www.haifa.il.ibm.com/projects/verification/RB_Homepage/bmcbenchmarks.html), we ran zChaff and berkmin561 on it. We used the default configuration for both engines.

For each model, we used SAT_dat.k.cnf with k=1, 10, 15, 20, 25, 30, 35, 40, 45, 50. The time-out was set at 10 000 sec on a workstation with 867841X Intel(R) Xeon(TM) CPU 2.40GHz.

Here is a summary of the results we achieved:

  zchaff berkmin561
Total Time (10000 sec timeouts) 344765 414094
First (# of CNF where the engine is the fastest) 298 131
Timeout Number 25 30
+ (# of CNF where the engine is the fastest by more than a minute and 20%) 67 32
First by Model (# of models where the engine is the fastest) 28 18


For complete details, see the attached ziped Excel file.

The CNFs are available on line; here is some information about their sizes.

The results shows that zChaff and Berkmin561 achieved close results. On some CNFs, zChaff runs faster, and on others, CNFs Berkmin561 runs faster. In most cases, the differences between their performance is not very significant (the faster speed is not faster by more than one minute or 20%). However, while zChaff seems to perform slightly better overall, Berkmin561 gets better results than zChaff on the UNSAT CNFs.

From these results, it is not possible to conclude that berkmin561 performs better than zChaff. This is not consistent with what can be read in the literature. The explanation could be in hardware differences (e.g., cache size) or in the benchmark composition (as we saw, the ranking between zChaff and Berkmin561 is very data dependant).

We believe that this experiment clearly shows the added value of using the IBM Benchmark for SAT tool evaluation.

We extended our comparison to siege_v4 (with 123456789 as a seed). Here is a summary of the results achieved:

  Total Time (including 10000sec timeout) #timeout
zChaff 345000 sec 25
Berkmin561 414000 sec 30
siege_v4 187000 sec 14


Siege_v4 is the fastest in 298 cases out of 498.
In many hard cases, siege_v4 is fastest by an order of magnitude or more. In some cases, siege_v4 performs worse than zchaff or berkmin in a significant way (e.g., for 26_rule, siege_v4 is slower than zchaff by an order of magnitude and slower than berkmin by two orders of magnitude).

For complete details, see this zipped Excel file.

We made some experiments with zChaff II (zChaff 2004.5.13) on the IBM public benchmark.

Here are some results (zChaff II code was left unchanged, so it used random seeds)

  Total Time (in seconds)
zChaff 344 764
berkmin561 414 035
siege_v4 197 239
zChaff II 389 304


zChaff II is faster than:

  • zChaff for 229 CNFs (out of 442)
  • Berkmin561 for 208 CNFs (out of 442)
  • Siege_v4 for 82 CNFs (out of 442)


The details of the results are available in a zipped Excel file.

All in all, it is difficult to conclude that zChaff II performs better than zChaff. Besides, Siege_v4 performs significantly better than zChaff II on the IBM CNF Benchmark.



 
 


 


    About IBMPrivacyContact