Research

2026

Can I Have Your Order? Monte-Carlo Tree Search for Slot Filling Ordering in Diffusion Language Models

Joshua Ong Jun Leang, Yu Zhao, Mihaela Cătălina Stoian, Wenda Li, Shay B. Cohen, Eleonora Giunchiglia

ICML · 2026 · doi:10.48550/arXiv.2602.12586

We introduce McDiffuSE, a framework that formulates slot selection in Masked Diffusion Models as decision making and optimises infilling orders through Monte Carlo Tree Search. Look-ahead simulations evaluate partial completions before commitment, systematically exploring the combinatorial space of generation orders. McDiffuSE achieves average gains of 3.2% over autoregressive baselines and 8.0% over plan-and-infill baselines, with notable improvements of 19.5% on MBPP and 4.9% on MATH500.

PiCSAR: Probabilistic Confidence Selection And Ranking for Reasoning Chains

Joshua Ong Jun Leang, Zheng Zhao, Aryo Pradipta Gema, Sohee Yang, Wai-Chung Kwan, Xuanli He, Wenda Li, Pasquale Minervini, Eleonora Giunchiglia, Shay B. Cohen

ACL (Findings) · 2026 · doi:10.48550/arXiv.2508.21787

We propose PiCSAR (Probabilistic Confidence Selection And Ranking), a simple training-free method for best-of-n sampling that scores candidate reasoning chains using the joint log-likelihood of the reasoning and final answer. This naturally decomposes into reasoning confidence and answer confidence. PiCSAR achieves substantial gains across diverse benchmarks (+10.18 on MATH500, +9.81 on AIME2025), outperforming baselines with at least 2× fewer samples in 16 out of 20 comparisons.
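As a rough illustration of the scoring idea described above, the sketch below ranks candidates by the sum of their reasoning and answer log-probabilities. The `Candidate` class, its field names, and the per-token log-probability lists are hypothetical inputs chosen for this example; the released PiCSAR code may compute or normalise these terms differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """Hypothetical container for one sampled reasoning chain."""
    reasoning_logprobs: List[float]  # assumed token log-probs of the reasoning
    answer_logprobs: List[float]     # assumed token log-probs of the final answer
    answer: str


def joint_score(c: Candidate) -> float:
    # Joint log-likelihood = reasoning confidence + answer confidence.
    reasoning_confidence = sum(c.reasoning_logprobs)
    answer_confidence = sum(c.answer_logprobs)
    return reasoning_confidence + answer_confidence


def select_best(candidates: List[Candidate]) -> Candidate:
    # Best-of-n selection: keep the candidate with the highest joint score.
    return max(candidates, key=joint_score)


c1 = Candidate([-0.1, -0.2], [-0.05], "42")
c2 = Candidate([-0.05, -0.05], [-1.0], "41")
best = select_best([c1, c2])
```

Here `c2` has the more confident reasoning but a much less confident answer, so the joint score prefers `c1`.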

Code

A Survey on Deep Learning Approaches for Tabular Data Generation: Utility, Alignment, Fidelity, Privacy, Diversity, and Beyond

Mihaela Cătălina Stoian, Eleonora Giunchiglia, Thomas Lukasiewicz

TMLR · 2026 · doi:10.48550/arXiv.2503.05954

We review deep generative modelling approaches for tabular data from the perspective of four types of requirements: utility of the synthetic data, alignment of the synthetic data with domain-specific knowledge, statistical fidelity of the synthetic data distribution compared to the real data distribution, and privacy-preserving capabilities.

2025

Right for the Right Reasons: Avoiding Reasoning Shortcuts via Prototypical Neurosymbolic AI

Luca Andolfi, Eleonora Giunchiglia

NeurIPS · 2025 · doi:10.48550/arXiv.2510.25497

Neurosymbolic AI models are prone to reasoning shortcuts, that is, learning spurious correlations rather than the intended concepts. We introduce Prototypical Neurosymbolic architectures that avoid shortcuts at their root cause by training models to satisfy background knowledge while accounting for input similarity to a handful of labelled datapoints. We validate on the rsbench benchmark suite across synthetic (MNIST-EvenOdd, Kand-Logic) and real-world (BDD-OIA) tasks, showing significant improvements in learning the right concepts even in extremely low data regimes.

Code

Beyond the convexity assumption: Realistic tabular data generation under quantifier-free real linear constraints

Mihaela Cătălina Stoian, Eleonora Giunchiglia

ICLR · 2025 · doi:10.48550/arXiv.2502.18237

We introduce the Disjunctive Refinement Layer (DRL), a novel layer designed to enforce the alignment of generated data with the background knowledge specified in user-defined constraints. DRL is the first method able to automatically make deep learning models inherently compliant with constraints as expressive as quantifier-free linear formulas, which can define non-convex and even disconnected spaces.

Code

A Posteriori Verification or a Priori Design? Navigating Requirements-Driven Deep Learning

Eleonora Giunchiglia

ECAI · 2025 · doi:10.3233/FAIA250782

We contrast two approaches to ensuring machine learning systems satisfy formal requirements in safety-critical settings: integrating requirements directly into model architecture and training (a priori design) versus analyzing trained models for desired properties (a posteriori verification).

2024

How Realistic Is Your Synthetic Data? Constraining Deep Generative Models for Tabular Data

Mihaela Cătălina Stoian, Salijona Dyrmishi, Maxime Cordy, Thomas Lukasiewicz, Eleonora Giunchiglia

ICLR · 2024 · doi:10.48550/arXiv.2402.04823

We show how deep generative models for tabular data can be constrained such that their generated samples are guaranteed to be compliant with given constraints. This is achieved by automatically parsing the constraints and transforming them into a Constraint Layer seamlessly integrated with the model.
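To make the idea of a layer that guarantees compliance more concrete, here is a minimal sketch under strong simplifying assumptions. It handles only constraints of the hypothetical form `x[j] >= x[i] + offset`, given in an order where each feature is corrected after the features it depends on. The function names and the constraint encoding are illustrative and are not taken from the paper or its code.

```python
from typing import List, Tuple

# Hypothetical encoding: (i, j, offset) means x[j] >= x[i] + offset.
Constraint = Tuple[int, int, float]


def constraint_layer(sample: List[float],
                     constraints: List[Constraint]) -> List[float]:
    """Minimally adjust a generated sample so that every constraint holds.

    Assumes the constraints are acyclic and listed so that x[i] is final
    before it is used as a bound for x[j].
    """
    out = list(sample)
    for i, j, offset in constraints:
        lower_bound = out[i] + offset
        # Leave x[j] untouched if compliant, otherwise raise it to the bound.
        out[j] = max(out[j], lower_bound)
    return out


def satisfies(sample: List[float], constraints: List[Constraint]) -> bool:
    return all(sample[j] >= sample[i] + offset for i, j, offset in constraints)


rules = [(0, 1, 0.0), (1, 2, 1.0)]   # x1 >= x0 and x2 >= x1 + 1
raw = [3.0, 1.0, 2.0]                # violates both rules
fixed = constraint_layer(raw, rules)
```

Because the correction is applied to every output, compliance holds by construction rather than being encouraged through a loss term.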

Code

Deep generative models as an adversarial attack strategy for tabular machine learning

Salijona Dyrmishi, Mihaela Cătălina Stoian, Eleonora Giunchiglia, Maxime Cordy

ICMLC · 2024 · doi:10.48550/arXiv.2409.12642

We adapt popular tabular deep generative models into adversarial models and evaluate their effectiveness in generating realistic adversarial examples that conform to domain constraints.

Code

CCN+: A neuro-symbolic framework for deep learning with requirements

Eleonora Giunchiglia, Alex Tatomir, Mihaela Cătălina Stoian, Thomas Lukasiewicz

International Journal of Approximate Reasoning · 2024 · doi:10.1016/j.ijar.2024.109124

We present CCN+, a neuro-symbolic framework that integrates logical requirements directly into neural network outputs using inference rules to ensure compliance, and adapts the standard binary cross-entropy loss for constraint satisfaction in deep learning.

ULLER: A Unified Language for Learning and Reasoning

Emile van Krieken, Samy Badreddine, Robin Manhaeve, Eleonora Giunchiglia

NeSy (Spotlight) · 2024 · doi:10.1007/978-3-031-71167-1_12

We introduce ULLER, a unified language for learning and reasoning that standardises how background knowledge is expressed across neuro-symbolic AI frameworks. ULLER provides first-order logic syntax with multiple semantic interpretations.

PiShield: A PyTorch Package for Learning with Requirements

Mihaela Cătălina Stoian, Alex Tatomir, Thomas Lukasiewicz, Eleonora Giunchiglia

IJCAI · 2024 · doi:10.48550/arXiv.2402.18285

We introduce PiShield, the first package that allows (propositional or linear) requirements to be integrated into a neural network's topology. PiShield guarantees compliance with these requirements regardless of the input.

Code

Website
