
SimCon - Fortran Analysis, Engineering & Migration

WinFPT - Relative Debugging and Tracing Program Execution
 
Tracing Execution

The run-time trace facility may be used to debug situations where bad data are generated in a program run but the origin of the bad data is unclear. It may also be used to remove numerical drift when comparing runs of a program built with two different compilers or run in different computing environments.

FPT instruments the program code, or selected components of the code, to capture all left-hand-side scalar quantities to file. For example, click here for a program which models a cannon ball with square-law drag.

This code is modified by FPT to capture the outputs of every statement. Click here for the modified code.

The routine trace_start_sub_program logs the start of a program, subroutine or function. The FPT library routines trace_r4_data, trace_i4_data etc. write the left-hand-side quantities to file. These routines are self-initialising - the log file is created on the first call if it does not already exist.

The output shows entries to sub-programs, and the left-hand-side quantities, written one to each line. Click here to see the trace output. The main loop, starting with the copy of hddot to p_hddot and ending with the computation of the new value of x can clearly be seen.
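The idea of a self-initialising trace routine can be sketched in a few lines of Python. This is only an illustration of the behaviour described above, not the FPT library: the class Tracer, its methods, the line format and the function step are all hypothetical names chosen for the example.

```python
import os
import tempfile


class Tracer:
    """Toy stand-in for a trace library (hypothetical, not the FPT routines)."""

    def __init__(self, path):
        self.path = path
        self._file = None  # the log file is opened lazily, on the first call

    def _write(self, text):
        if self._file is None:
            # Append mode creates the file if it does not already exist.
            self._file = open(self.path, "a")
        self._file.write(text + "\n")

    def start_sub_program(self, name):
        """Log entry to a sub-program."""
        self._write("SUB " + name)

    def data(self, name, value):
        """Log one left-hand-side quantity and hand the value back unchanged."""
        self._write(name + " " + repr(value))
        return value

    def close(self):
        if self._file is not None:
            self._file.close()
            self._file = None


def step(tracer, h, v, dt):
    """One instrumented time step: every assignment passes through the tracer."""
    tracer.start_sub_program("step")
    h = tracer.data("h", h + v * dt)
    v = tracer.data("v", v - 9.81 * dt)
    return h, v


trace_path = os.path.join(tempfile.mkdtemp(), "trace.txt")
tracer = Tracer(trace_path)
step(tracer, 0.0, 10.0, 0.1)
tracer.close()
with open(trace_path) as trace_file:
    trace_lines = trace_file.read().splitlines()
print(trace_lines)
```

Each assignment writes one line, so the trace reads in the same order as the statements were executed.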

 
Tracing Execution for Debugging

When a program crashes or runs incorrectly, it is often possible to use a debugger to see bad data at the site of the crash. It is sometimes difficult to find where the bad data have come from. The FPT run-time trace facility may be used for tracing arithmetic errors.

The program is instrumented for run-time trace (please see the FPT Reference Manual page for a description of the procedure) and a trace file is generated. The bad data may be found in the trace file, and then, by searching backwards, their origin may be identified.
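As a rough sketch of this search, the Python fragment below scans a trace for the first non-finite value and lists the assignments that precede it. The "name value" line format and the function names are assumptions made for the example; the real FPT trace file format may differ.

```python
import math


def find_first_bad(lines):
    """Return (line_number, name) of the first non-finite value, or None."""
    for number, line in enumerate(lines, start=1):
        parts = line.split()
        if len(parts) != 2:
            continue
        name, text = parts
        try:
            value = float(text)
        except ValueError:
            continue  # not a data line, e.g. a sub-program entry
        if not math.isfinite(value):
            return number, name
    return None


def lines_before(lines, line_number, count):
    """Return up to `count` trace lines immediately before `line_number`."""
    end = line_number - 1
    return lines[max(0, end - count):end]


# Hypothetical trace content, invented for the illustration.
trace = ["SUB main", "drag 0.5", "v 0.0", "ratio nan", "x nan"]
bad = find_first_bad(trace)
context = lines_before(trace, bad[0], 2)
print(bad, context)
```

Here the first bad value is ratio, and the two lines before it point to v, which is zero, as the likely origin.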

Note that the trace output files can become very large, so we recommend that only a small sub-set of the files in a large program should be instrumented.

 
Relative Debugging - Removing Numerical Drift

The same program may produce significantly different results when it is built with different compilers or with different levels of optimisation under the same compiler. The differences may be due only to numerical drift. This occurs when different systems choose different orders of execution, or different variables to store in processor registers, with the result that there are small differences in rounding errors. These differences integrate and eventually affect the results. However, differences may also be due to compiler bugs or to coding errors which behave differently in different environments.
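The effect of execution order on rounding can be seen with three numbers. The short Python example below adds the same values in two different orders. It illustrates floating-point arithmetic in general and is not specific to any compiler.

```python
# The same three values, summed in two different orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

# The two sums differ in the last bit of the double-precision result.
print(left_to_right == right_to_left)
print(abs(left_to_right - right_to_left))
```

A difference of this size is harmless in a single statement, but it can grow when the result feeds back into later steps of an iteration.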

The WinFPT run-time trace facility, and the library of support routines distributed with WinFPT, are used to analyse this issue. Suppose that we wish to compare runs under two compilers, for example, gfortran and ifort. We want to know whether differences between the runs are due to coding errors or just to numerical drift. The procedure is as follows:

  • The program is instrumented to capture a run-time trace.

  • It is built under ifort and run. A trace file is generated.

  • It is built under gfortran.

It would now be possible to run the program again and compare the two trace files. This is usually not practical. The trace files drift apart because of numerical drift, and any differences due, for example, to coding errors are hidden amongst the large number of differences due to drift. Instead:

  • In the second run, under gfortran, the same subroutines which captured the trace of the first run read the trace file and compare every value computed by gfortran with the value computed by ifort. If the values are the same, no action is taken. If the values differ by more than a criterion amount, the difference is reported. The values computed in the second run are then overwritten by the values from the first run. This prevents the accumulation of numerical drift so that the runs do not drift apart.

The run-time trace files record a unique index which identifies each trace routine call. These indices are used to detect the situation where the two program runs follow different paths. If this occurs, the second run terminates at once, with a report of the point at which the two runs diverge.
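The record-then-compare scheme can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the WinFPT library: the class RelativeTracer, the exception PathDivergence, the in-memory list used in place of a trace file, and the single absolute tolerance are all hypothetical simplifications.

```python
class PathDivergence(Exception):
    """Raised when the comparison run takes a different path from the first run."""


class RelativeTracer:
    def __init__(self, reference=None, tolerance=1e-4):
        self.reference = reference  # None means this is the recording run
        self.records = []           # (call_index, value) pairs from the first run
        self.reports = []           # differences found in the comparison run
        self.position = 0
        self.tolerance = tolerance

    def trace(self, call_index, value):
        if self.reference is None:
            self.records.append((call_index, value))
            return value
        if self.position >= len(self.reference):
            raise PathDivergence("comparison run is longer than the first run")
        ref_index, ref_value = self.reference[self.position]
        self.position += 1
        if ref_index != call_index:
            raise PathDivergence(
                "expected call %d, got call %d" % (ref_index, call_index))
        if abs(value - ref_value) > self.tolerance:
            self.reports.append((call_index, ref_value, value))
        # Overwrite with the first run's value so that drift cannot accumulate.
        return ref_value


def run(tracer, rate):
    """A toy iteration; call index 1 identifies the single traced statement."""
    x = 1.0
    for _ in range(3):
        x = tracer.trace(1, x * rate)
    return x


first = RelativeTracer()
run(first, 2.0)

drifting = RelativeTracer(reference=first.records)
drift_result = run(drifting, 2.000001)  # tiny per-step difference

faulty = RelativeTracer(reference=first.records)
run(faulty, 2.5)                        # a real difference at every step
print(drift_result, drifting.reports, len(faulty.reports))
```

In the drifting run each value is within tolerance and is replaced by the recorded value, so the final result matches the first run exactly. In the faulty run every step is reported, and a call with an unexpected index would stop the run with PathDivergence.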

This technique has proved to be very powerful in detecting:

  • Uninitialised variables

  • Array references out-of-bounds

  • Ill-defined orders of execution

  • Compiler bugs

 
Relative Debugging - Refining the Comparisons

The detailed behaviour in the second, comparison, run may be refined by writing an optional configuration file.

This file specifies:

  • The criteria for comparing real numbers. Two criteria are specified, a relative criterion difference and an absolute criterion difference. By default, the relative criterion difference is 1% and the absolute criterion difference is 0.0001. A real number is reported as different only if the difference exceeds both criteria. The requirement for an absolute criterion difference prevents the report of spurious differences when values are close to zero.

  • Whether integer and logical values are to be overwritten when differences are detected. Some programs use integers to store file and database handles which are always different on different runs. If these are overwritten, the file or database handling may fail.

  • The location of the trace file. These files may become very large and it may be necessary to store them on external devices.
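The comparison rule for real numbers can be written as a short Python function using the default criteria quoted above. The function name values_differ is hypothetical. Taking the relative difference against the larger of the two magnitudes is an assumption, since the text does not say which value is used as the base.

```python
def values_differ(reference, value, rel_criterion=0.01, abs_criterion=0.0001):
    """Report a difference only if it exceeds BOTH criteria."""
    difference = abs(value - reference)
    # Assumption: the relative criterion is scaled by the larger magnitude.
    scale = max(abs(reference), abs(value))
    return difference > abs_criterion and difference > rel_criterion * scale


print(values_differ(100.0, 102.0))  # exceeds both criteria
print(values_differ(100.0, 100.5))  # within the relative criterion
print(values_differ(1e-6, 2e-6))    # large relative change, but close to zero
```

The third case shows why the absolute criterion is needed: the value has doubled, but the difference is far below 0.0001, so it is not reported.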

Copyright ©1995 to 2015 Software Validation Ltd. All rights reserved.