ML Tea: Chain-of-Thought Degrades Abstention in LLMs, Unless Inverted / Context-aware sequence-to-function model of human gene regulation
Speakers: Abinitha Gourabathina and Ekin Deniz Aksu
Bios:
Abinitha is a second-year EECS PhD student in MIT LIDS. She is co-advised by Professors Marzyeh Ghassemi and Collin Stultz. Her research interests lie broadly in trustworthy machine learning, with a particular focus on sensitive domains like healthcare. She completed her B.S.E. in Operations Research and Financial Engineering at Princeton University.
Ekin is a PhD student at the Max Planck Institute for Molecular Genetics in Berlin, working in computational biology to uncover the regulatory code of the human genome. He is an MD and has previously worked on cancer biology and infectious diseases.
Abstracts:
(1) For Large Language Models (LLMs) to be reliably deployed, models must know when not to answer, that is, abstain. Chain-of-Thought (CoT) prompting has gained popularity for improving model performance by producing structured outputs that follow a logical sequence. In this paper, we first investigate how current abstention methods perform on CoT outputs, finding that direct use of reasoning traces can degrade the performance of existing abstention methods by more than 5%. As a result, we introduce a new framework for thinking about hallucinations in LLMs: not as answering a question incorrectly, but as answering the wrong question. Based on this framework, we develop a new class of state-of-the-art abstention methods called Trace Inversion. First, we generate the reasoning trace of a model. Then, from the trace alone, we reconstruct the most likely query that the model responded to. Finally, we compare the initial query with the reconstructed query. A low similarity score between the initial query and the reconstructed query suggests that the model likely answered the wrong question, and the model is flagged to abstain. Our extensive experiments demonstrate impressive performance gains with Trace Inversion methods.
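The three-step procedure above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the trace-generation and query-reconstruction steps would call an LLM in practice, and the similarity metric here is a simple token-overlap (Jaccard) score standing in for whatever scorer the paper uses.

```python
def jaccard_similarity(a: str, b: str) -> float:
    """Token-overlap similarity between two strings (placeholder metric;
    a real system might use embedding cosine similarity instead)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def should_abstain(original_query: str, reconstructed_query: str,
                   threshold: float = 0.5) -> bool:
    """Trace Inversion decision rule (sketch): if the query reconstructed
    from the reasoning trace diverges from the original query, the model
    likely answered the wrong question, so flag it to abstain.

    In the full pipeline, `reconstructed_query` would come from an LLM
    prompted with only the reasoning trace.
    """
    return jaccard_similarity(original_query, reconstructed_query) < threshold
```

For example, if the reconstructed query shares almost no tokens with the original, `should_abstain` returns `True`; if the two queries match closely, it returns `False` and the answer is kept.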
(2) Sequence-to-function models have been very successful in predicting gene expression, chromatin accessibility, and epigenetic marks from DNA sequence alone. However, current state-of-the-art models have a fundamental limitation: they cannot extrapolate beyond the cell types and conditions included in their training data. Here, we introduce a new approach designed to overcome this limitation: Corgi, a context-aware sequence-to-function model that accurately predicts genome-wide gene expression and epigenetic signals, even in previously unseen cell types. We designed an architecture that strives to emulate the cell: Corgi integrates DNA sequence and trans-regulator expression to predict the coverage of multiple assays, including chromatin accessibility, histone modifications, and gene expression. We define trans-regulators as transcription factors, histone modifiers, transcriptional coactivators, and RNA-binding proteins, which directly modulate chromatin states, gene expression, and mRNA decay. Trained on a diverse set of bulk and single-cell human datasets, Corgi achieves robust predictive performance, approaching experimental-level accuracy for gene expression prediction in previously unseen cell types, while also setting a new state of the art for joint cross-sequence and cross-cell-type epigenetic track prediction. In practice, Corgi can be used to impute many assays, including DNA accessibility and histone ChIP-seq, from RNA-seq data alone.
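The model's interface described above (cis sequence plus trans context in, multi-assay coverage out) can be sketched as follows. The shapes, names, and the random stand-in for the learned network are illustrative assumptions, not Corgi's actual code.

```python
import numpy as np

def predict_tracks(dna_onehot: np.ndarray,
                   trans_regulator_expr: np.ndarray,
                   n_assays: int = 3,
                   seed: int = 0) -> np.ndarray:
    """Interface sketch for a context-aware sequence-to-function model.

    dna_onehot:           (seq_len, 4) one-hot DNA sequence (cis input)
    trans_regulator_expr: (n_regulators,) expression of transcription
                          factors, histone modifiers, coactivators, and
                          RNA-binding proteins (trans context, e.g. taken
                          from an RNA-seq profile of the target cell type)
    returns:              (n_assays, seq_len) predicted coverage tracks,
                          e.g. chromatin accessibility, a histone mark,
                          and gene expression.

    A real model fuses both inputs through a learned network; this stub
    only emits placeholder values with the expected output shape.
    """
    assert dna_onehot.ndim == 2 and dna_onehot.shape[1] == 4
    assert trans_regulator_expr.ndim == 1
    rng = np.random.default_rng(seed)
    return rng.random((n_assays, dna_onehot.shape[0]))
```

The key design point this interface captures is that the cell-type context enters only through the trans-regulator expression vector, which is why predictions can be made for cell types never seen during training, given only their RNA-seq data.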