Skip to content

Add a new delegate to allow API tracing#505

Merged
kfriedberger merged 300 commits intomasterfrom
tracing_delegate
Mar 28, 2026
Merged

Add a new delegate to allow API tracing#505
kfriedberger merged 300 commits intomasterfrom
tracing_delegate

Conversation

@daniel-raffler
Copy link
Copy Markdown
Contributor

@daniel-raffler daniel-raffler commented Aug 14, 2025

This is a preliminary draft for adding API tracing to JavaSMT with the help of a new delegate. The idea is to record all API calls and generate a new Java program from them. By running this program the exact sequence of calls can then be recreated. The main application here is debugging, where the traces allow us to create easy to reproduce examples for solver errors. This is especially useful when the error occurs as part of a larger program where it can be hard to pin down the exact sequence of JavaSMT calls that are needed to trigger the bug.

We use a new delegate to implement this feature. Setting solver.trace to true will enable tracing, and the output will be stored in a file called trace*.java

TODO

  • Finish the implementation. Currently we only have (parts of) the ArrayFormulaManager, IntegerFormulaManager, BooleanFormulaManager, UFManager and ProverEnvironment
  • Write the trace to a file while it's being created. We'll need this to debug segfaults as the trace is otherwise lost done
  • Consider adding an option to skip duplicate calls. (The trace is currently way too long) Fixed, but not committed yet
  • Write a simple delta-debugger to shrink the trace down even further3 Maybe later..

We're now using ddSmt, see comment #505 (comment)

Things left to do

  • Add support for missing formula managers in the script
    Still missing: floating point, quantifier, strings and separation logic. At least the first two should still be added before merging
  • Handle solver options in the script
  • Fix undo point in the trace logger
    Done, but we should double check the Rebuilder
  • Merge Add support for indexed functions #507
  • Add user documentation for debugging with the tracer (see the comment below)
  • Update the changelog
  • Run some tests in CPAchecker to see if there are still issues in the script
  • Add support for quantifiers and interpolation to the Smtlib translation script
  • Test with more solvers

@daniel-raffler
Copy link
Copy Markdown
Contributor Author

Write a simple delta-debugger to shrink the trace down even further

I started working on the delta-debugger today and wrote a python script to reduce the size of the traces. So far it does little more that some dead-code elimination, but that's already enough to bring down the size of the trace by a factor of ten. I believe that another factor of two should be possible with some aggressive optimization.

The issue now is that I don't quite know where to put such a script in JavaSMT. We could handle this as a separate project, or maybe include it in the JavaSMT source tree, similar to the Example projects. However, neither really seems quite ideal.

@baierd, @kfriedberger: What is your opinion?

Here is the file in question:

#!/usr/bin/env python3
import re
import sys
from collections import defaultdict
from pathlib import Path


# Read a trace file
def readTrace(path):
    with open(path) as file:
        return [line.rstrip() for line in file]


# Build a map with line numbers for all variable definitions
def getLinesForDefinitions(trace):
    lineNumber = 1
    lineDefs = dict()
    for line in trace:
        if line.find('=') >= 0:
            leftSide = line[0:(line.find('=') - 1)]
            name = re.match('var (.*)', leftSide)
            lineDefs[name.group(1)] = lineNumber
        lineNumber = lineNumber + 1
    return lineDefs


# Build a dependency graph for the definitions
# Maps from variables to the places where they are used
def buildDependencies(lineDefs, trace):
    lineNumber = 1
    deps = defaultdict(list)
    for line in trace:
        expr = line[(line.find('=') + 2):] if line.find('=') >= 0 else line
        object = expr[0:expr.find('.')]
        if object[0].islower():
            deps[lineDefs[object]].append(lineNumber)
        # FIXME Parse the expression to get the variables
        for m in re.finditer('(config|logger|notifier|var[0-9]+)', expr):
            deps[lineDefs[m.group()]].append(lineNumber)
        lineNumber += 1
    return deps


# Collect all top-level statements
# Top-level statements are:
#  *.addConstraint(*)
#  *.isUnsat()
#  *.getModel()
#  *.asList()
# FIXME Finish this list
def usedTopLevel(lineDefs, trace):
    tl = set()
    for line in trace:
        m = re.fullmatch(
            'var (var[0-9]+) = (var[0-9]+).(isUnsat\\(\\)|getModel\\(\\)|asList\\(\\)|addConstraint\\((var[0-9]+)\\));',
            line)
        if m != None:
            tl.add(lineDefs[m.group(1)])
    return tl


# Calculate the closure of all used definitions, starting with the top-level statements
def usedClosure(tl, deps):
    cl = set()
    st = set(tl)
    while cl.union(st) != cl:
        cl = cl.union(st)
        st = set()
        for (key, val) in deps.items():
            if set(val).intersection(cl) != set():
                st.add(key)
    return cl


# Keep only statements and definitions that are used
def filterUnused(used, trace):
    lineNumber = 1
    reduced = []
    for line in trace:
        if line.find('=') == -1 or lineNumber in used:
            reduced.append(line)
        lineNumber += 1
    return reduced


# Remove all definitions that are not used (recursively)
def removeDeadCode(trace):
    lineDefs = getLinesForDefinitions(trace)
    deps = buildDependencies(lineDefs, trace)
    tl = usedTopLevel(lineDefs, trace)
    cl = usedClosure(tl, deps)
    return filterUnused(cl, trace)


# We'll use multiple passes to reduce the size of the trace:
# 1. Read the trace
# 2. Remove unused code
# 3. Remove unnecessary toplevel commands
# 4. Loop: Remove aliasing (by duplicating the definitions)
# 5.    Loop: Reduce terms
# 6. Remove unused prover environments
if __name__ == '__main__':
    arg = sys.argv
    if not len(sys.argv) == 2:
        print('Expecting a path to a trace file as argument')
        exit(-1)

    path = Path(sys.argv[1])
    if not (path.is_file()):
        print(f'Could not find file "{path}"')
        exit(-1)

    # TODO Implement steps 3-6
    # TODO Check that the reduced trace still crashes

    trace = readTrace(path)
    for line in removeDeadCode(trace):
        print(line)

The idea is to run JavaSMT with solver.trace=true to collect a trace of the crash, and then use the script to reduce the trace. Since we're "crashing" (posssibly with a segfault) there doesn't seem to be a good way to do this in one go

@daniel-raffler
Copy link
Copy Markdown
Contributor Author

@kfriedberger Please be careful when pushing changes to this branch. It's fine for now, but I still have quite a few unpublished changes and don't want to run into merge conflicts 😅

@kfriedberger
Copy link
Copy Markdown
Member

kfriedberger commented Aug 30, 2025

I find this branch quite interesting and was just playing around with it yesterday. I wished for such a tracing layer some years ago, as it would simplify debugging from other tools, such as CPAchecker, and could help with bug-reports. Some SMT solvers, such as Z3, provide tracing on their side, too.

I try to keep my changes small to avoid conflicts, or open other PRs, just like you.

For the delta-debugger script, you can either add it in a new directory scripts/ or under doc/examples/. We will not package it into any released JAR file, but can keep it as part of this repository, until the scripts grow further and can be moved into their own project.

kfriedberger
kfriedberger previously approved these changes Mar 26, 2026
Copy link
Copy Markdown
Member

@kfriedberger kfriedberger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR brings a nice feature for debugging in user applications and exporting a trace for JavaSMT. It covers about 90% of features provided by JavaSMT, with the remaining features being expensive, complex, waiting or under discussion to be implemented. This PR also includes several smaller and unrelated updates to solver bindings, which should be done in a separate PR, so maybe we can extract them and solve/merge upfront in another commit.

@daniel-raffler
Copy link
Copy Markdown
Contributor Author

@kfriedberger
Thanks for having a look at this! I temporarily turned tracing back on for the tests. Just revert ad8693c once you're done

About 23bb920: I think this cuts off too early now. We're already checking in TraceLogger.logDef if this is a known formula:

  /** Log an API call with return value. */
  public synchronized <R extends Formula> R logDef(
      String prefix, String method, Callable<R> closure) {
    String var = newVariable();
    try {
      appendDef(var, prefix + "." + method);
      R f = closure.call();
      if (isTracked(f)) {
        undoLast();
        return f;
      } else { // <- It's not tracked
        keepLast();
        mapVariable(var, f); // <- Now we track it
        return mgr.rebuild(f); // <- And then call 'rebuild'
      }
    } catch (Exception e) {
      sneakyThrow(e);
      throw new RuntimeException(e);
    }
  }

If we add another check in TraceFormulaManager.rebuild the formula is never traversed, and we may end up with missing subterms in our cache

Did you notice any issues with the old version? Otherwise, I'd prefer to go back to it

@kfriedberger kfriedberger merged commit c22a91b into master Mar 28, 2026
26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

3 participants