Large language models (LLMs) often face conflicts between stored knowledge and contextual information, which can lead to outdated or incorrect responses. Analyzing LLMs’ internal activations, we […]
Low-rank lottery tickets: finding efficient low-rank neural networks via matrix differential equations
Neural networks deliver exceptional performance but can be impractical for applications with […]
Robust low-rank training via approximate orthonormal constraints
As models and datasets grow, pruning techniques using low-rank matrix factorizations have become popular for reducing resource […]
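The entry above concerns keeping the factors of a low-rank weight matrix close to orthonormal while training. Below is a minimal sketch of that general idea, assuming a linear layer factorized as W ≈ U Vᵀ with a soft Frobenius-norm penalty pushing UᵀU and VᵀV toward the identity; the names `LowRankLinear` and `orthonormal_penalty` are illustrative, and the paper may enforce the constraint differently (e.g., via projections rather than a penalty).

```python
import torch
import torch.nn as nn

class LowRankLinear(nn.Module):
    """Linear layer parameterized as W ~= U @ V.T, with rank-r factors U and V."""
    def __init__(self, d_in, d_out, rank):
        super().__init__()
        self.U = nn.Parameter(torch.randn(d_out, rank) / rank ** 0.5)
        self.V = nn.Parameter(torch.randn(d_in, rank) / rank ** 0.5)
        self.bias = nn.Parameter(torch.zeros(d_out))

    def forward(self, x):
        # Project down to the rank-r subspace, then back up to d_out.
        return (x @ self.V) @ self.U.T + self.bias

def orthonormal_penalty(layer):
    """Soft penalty ||U^T U - I||_F^2 + ||V^T V - I||_F^2 keeping factors near-orthonormal."""
    eye = torch.eye(layer.U.shape[1], device=layer.U.device)
    return ((layer.U.T @ layer.U - eye).pow(2).sum()
            + (layer.V.T @ layer.V - eye).pow(2).sum())

# Usage: add the penalty to the task loss during training.
layer = LowRankLinear(d_in=128, d_out=64, rank=8)
x = torch.randn(32, 128)
loss = layer(x).pow(2).mean() + 1e-2 * orthonormal_penalty(layer)
loss.backward()
```

Keeping the factors near-orthonormal bounds the conditioning of the factorized weights, which is the property the robustness argument relies on.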
Are We Done with MMLU?
Our analysis uncovers significant issues with the Massive Multitask Language Understanding (MMLU) benchmark, which is widely used to assess […]
Enhancing AI Model Robustness with Natural Language Explanations
In this paper, we explore how natural language explanations (NLEs) can improve the robustness of large […]
Probing the Emergence of Cross-lingual Alignment during LLM Training
Multilingual LLMs excel at zero-shot cross-lingual transfer, likely by aligning languages without parallel sentence supervision. […]
Using Natural Language Explanations to Improve Robustness of In-context Learning
This work explores improving the robustness of LLMs against adversarial inputs by augmenting in-context […]
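Since this entry is about augmenting in-context demonstrations with NLEs, here is a minimal sketch of what such a prompt could look like for an NLI-style task; the demonstrations, field names, and template are illustrative assumptions, not the paper's exact format.

```python
# Hypothetical demonstrations; the "explanation" field is the NLE.
demos = [
    {"premise": "A man is playing a guitar on stage.",
     "hypothesis": "A man is performing music.",
     "label": "entailment",
     "explanation": "Playing a guitar on stage is a form of performing music."},
    {"premise": "A child is sleeping in a crib.",
     "hypothesis": "The child is running in a park.",
     "label": "contradiction",
     "explanation": "A sleeping child cannot be running at the same time."},
]

def build_prompt(demos, premise, hypothesis):
    """Few-shot prompt where each demonstration gives an explanation before its label."""
    parts = []
    for d in demos:
        parts.append(
            f"Premise: {d['premise']}\nHypothesis: {d['hypothesis']}\n"
            f"Explanation: {d['explanation']}\nLabel: {d['label']}\n"
        )
    parts.append(f"Premise: {premise}\nHypothesis: {hypothesis}\nExplanation:")
    return "\n".join(parts)

print(build_prompt(demos, "Two dogs chase a ball.", "Animals are playing."))
```

The intuition is that asking the model to justify its answer in the same format as the demonstrations makes the prediction less sensitive to adversarial perturbations of the test input.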
A Simple and Effective L2 Norm-Based Strategy for KV Cache Compression
The deployment of large language models (LLMs) is often hindered by the extensive […]
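The title describes ranking cached key/value pairs by the L2 norm of the keys. Below is a minimal sketch of that idea in PyTorch, assuming the heuristic retains the tokens whose key vectors have the smallest norm and that the cache is laid out as (num_heads, seq_len, head_dim); per-head budgets and any protection of recent tokens are assumptions, not details taken from the paper.

```python
import torch

def compress_kv_cache(keys, values, keep: int):
    """Keep only `keep` cached tokens per head, ranked by the L2 norm of their keys.

    keys, values: tensors of shape (num_heads, seq_len, head_dim).
    Assumption: tokens with the smallest key norms are retained, treating a low
    key norm as a proxy for receiving high attention weight.
    """
    norms = keys.norm(dim=-1)                                 # (num_heads, seq_len)
    idx = norms.topk(keep, dim=-1, largest=False).indices     # lowest-norm tokens
    idx = idx.sort(dim=-1).values                             # keep original token order
    gather_idx = idx.unsqueeze(-1).expand(-1, -1, keys.shape[-1])
    return keys.gather(1, gather_idx), values.gather(1, gather_idx)

# Usage: shrink a 1024-token cache to 256 entries per head.
k = torch.randn(8, 1024, 64)
v = torch.randn(8, 1024, 64)
k_small, v_small = compress_kv_cache(k, v, keep=256)
print(k_small.shape, v_small.shape)  # (8, 256, 64) for both
```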
SPARSEFIT: Few-shot Prompting with Sparse Fine-tuning for Jointly Generating Predictions and Natural Language Explanations
This work introduces SparseFit, a sparse few-shot fine-tuning strategy for […]
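SparseFit combines few-shot fine-tuning on a small parameter subset with a target sequence that carries both the label and the explanation. Below is a minimal sketch of those two ingredients, assuming bias and normalization parameters are the trainable subset and a "label because explanation" target format; both choices are illustrative, not the paper's actual configuration.

```python
import torch.nn as nn

def sparsify_trainable_params(model: nn.Module, keywords=("bias", "norm")):
    """Freeze everything except parameters whose names match the keywords.

    Which subset stays trainable is an illustrative choice; SparseFit's actual
    selection strategy may differ.
    """
    trainable, total = 0, 0
    for name, p in model.named_parameters():
        total += p.numel()
        p.requires_grad = any(k in name.lower() for k in keywords)
        trainable += p.numel() if p.requires_grad else 0
    print(f"trainable parameters: {trainable}/{total}")

def joint_target(label: str, explanation: str) -> str:
    """Single target sequence so the model generates prediction and NLE together."""
    return f"{label} because {explanation}"

# Usage on a small stand-in module.
layer = nn.TransformerEncoderLayer(d_model=64, nhead=4)
sparsify_trainable_params(layer)
print(joint_target("entailment", "playing a guitar on stage is performing music"))
```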