Specifying goals to deep neural networks with answer set programming

Published in International Conference on Automated Planning and Scheduling (ICAPS), 2024

This paper shows how logical goal specifications can condition a learned heuristic so one trained DNN can solve many goal variants without retraining.

Why it matters

  • DNN-based heuristics for planning typically assume a fixed, pre-defined goal.
  • Changing the goal often requires retraining or enumerating all acceptable goal states.
  • Many real problems require specifying properties of a goal rather than a single exact state.
  • Prior approaches offer no formal mechanism for expressing logical goal constraints directly to a trained DNN.
  • In domains with many valid target states (for example, Rubik's cube patterns), enumerating all goals is computationally burdensome.

What we did

  • Introduced a goal-conditioned DQN, Q(s, a, G), that estimates cost-to-go for a set of goal states.
  • Represented goals as sets of ground atoms in first-order logic.
  • Used Answer Set Programming (ASP) to generate stable models that define goal specifications.
  • Trained with random walks (100-200 moves for Rubik's cube starts) and subsampled logical goal atoms.
  • Combined the learned heuristic with weighted batched Q* search (batch size 10,000; weight 0.6).
  • Demonstrated diverse Rubik's cube and Sokoban goals without retraining the DNN.
  • Showed broader goal specifications can reduce solve time (for example, Cross6: 218.45 s vs canonical: 625.62 s).
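
The search step in the list above can be sketched on a toy problem. This is a minimal illustration, not the paper's implementation: the integer-line domain, the hand-written `q_values` stand-in for a trained DQN, and every function name are hypothetical, and the priority `weight * g + Q(s, a)` assumes the weight scales the path cost accumulated so far. Only the weight (0.6) and batch size (10,000) are taken from the summary above.

```python
import heapq
from itertools import count

GOALS = {7, -9}        # toy goal set: any of these integers counts as solved
ACTIONS = (-1, +1)     # toy actions: step left or right on the integer line

def step(state, action):
    """Apply an action; every move has unit cost."""
    return state + action, 1

def q_values(state):
    """Stand-in for a trained DQN: cost of each action plus distance to the nearest goal."""
    return [1 + min(abs(state + a - g) for g in GOALS) for a in ACTIONS]

def batched_weighted_q_search(start, is_goal, actions, step, q_values,
                              weight=0.6, batch_size=10_000, max_iters=1_000):
    """Best-first search over (state, action) pairs ordered by weight * g + Q(s, a)."""
    if is_goal(start):
        return []
    tie = count()          # tie-breaker so the heap never compares states or paths
    best_g = {start: 0}    # cheapest known path cost per state
    frontier = []

    def push(state, g, path):
        # One (simulated) network call scores every action of this state.
        for action, q in zip(actions, q_values(state)):
            heapq.heappush(frontier, (weight * g + q, next(tie), state, action, g, path))

    push(start, 0, [])
    for _ in range(max_iters):
        if not frontier:
            break
        # Pop up to batch_size entries and expand them together.
        batch = [heapq.heappop(frontier) for _ in range(min(batch_size, len(frontier)))]
        for _, _, state, action, g, path in batch:
            child, cost = step(state, action)
            child_g, child_path = g + cost, path + [action]
            if is_goal(child):
                return child_path
            if child_g < best_g.get(child, float("inf")):
                best_g[child] = child_g
                push(child, child_g, child_path)
    return None

path = batched_weighted_q_search(0, lambda s: s in GOALS, ACTIONS, step, q_values)
```

Because the goal is a set rather than a single state, the search stops at whichever goal state it reaches first, which is one intuition for why broader specifications can be cheaper to solve.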

How it works

  • Goal-conditioned Q-learning: learn Q(s, a, G) where G is a set of ground atoms.
  • Goal generation: convert a state to logical atoms and remove subsets to create generalized goals.
  • ASP specification: use stable models from an ASP program to represent goal sets.
  • Model refinement: if a stable model is not a valid goal model, iteratively expand it.
  • Search: use batched weighted Q* search to reach states satisfying the logical goal.
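
The goal-generation step can be illustrated with a small sketch. It assumes a toy state made of colored positions and a hypothetical `("at", position, color)` atom format; the paper's actual predicates and sampling scheme may differ, and `keep_prob` is an illustrative parameter.

```python
import random

def state_to_atoms(state):
    """Describe a toy state (a tuple of colors) as ground atoms ("at", position, color)."""
    return frozenset(("at", pos, color) for pos, color in enumerate(state))

def sample_goal(state, rng, keep_prob=0.5):
    """Drop a random subset of the state's atoms to obtain a more general goal."""
    atoms = state_to_atoms(state)
    return frozenset(a for a in sorted(atoms) if rng.random() < keep_prob)

def satisfies(state, goal):
    """A state satisfies a goal when every goal atom holds in that state."""
    return goal <= state_to_atoms(state)

rng = random.Random(0)
reached = ("red", "blue", "green", "red")   # e.g. the state at the end of a random walk
goal = sample_goal(reached, rng)            # a generalized goal that `reached` satisfies
```

Removing atoms can only enlarge the set of states that satisfy the goal, so the state the atoms came from always remains a valid goal state for the sampled specification.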

Figure 1. Overview of the training and specification pipeline: training the goal-conditioned DQN and specifying goals via ASP.

Figure 1 illustrates how logical specifications are converted into ground atoms and passed into the DQN.
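
One way a set of ground atoms could be handed to a network is as a fixed-length multi-hot vector. The sketch below is an assumption for illustration only: this summary does not describe the paper's input encoding, and the vocabulary layout, atom format, and function names are hypothetical.

```python
def build_vocabulary(positions, colors):
    """Assign each possible ("at", position, color) atom a fixed index (assumed layout)."""
    pairs = [(p, c) for p in positions for c in colors]
    return {("at", p, c): i for i, (p, c) in enumerate(pairs)}

def encode_goal(goal_atoms, vocab):
    """Multi-hot encoding: 1.0 where an atom is required by the goal, 0.0 elsewhere."""
    vec = [0.0] * len(vocab)
    for atom in goal_atoms:
        vec[vocab[atom]] = 1.0
    return vec

vocab = build_vocabulary(range(4), ["red", "green", "blue"])
goal_vec = encode_goal({("at", 0, "red"), ("at", 3, "green")}, vocab)
```

Under this encoding an empty goal maps to the all-zero vector, and atoms left unspecified are simply absent, so the same input shape covers both fully specified and partially specified goals.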

Key contributions

  • Formalizes goal specification to DNN heuristics using first-order logic and ASP.
  • Introduces a training procedure that generalizes across unseen goals without retraining.
  • Demonstrates diverse Rubik's cube goals (for example, Cross6, CupSpot, Checkers) and Sokoban goals.
  • Empirically shows that broader goal sets can reduce solve time and path cost (for example, Cross6 vs canonical).

Recommended citation: Forest Agostinelli, Rojina Panta, and Vedant Khandelwal. (2024). "Specifying goals to deep neural networks with answer set programming." Proceedings of the ICAPS, vol. 34, pp. 2-10.