Prompt-OIRL: Offline Inverse RL for Query-Dependent Prompting

Best AI papers explained - A podcast by Enoch H. Kang

This paper introduces Prompt-OIRL, a method for improving the arithmetic reasoning of large language models by optimizing prompts on a per-query basis. The authors identify two obstacles: prompt quality is hard to evaluate at inference time, when ground-truth answers are unavailable, and optimizing prompts online requires costly repeated LLM calls. To address these, Prompt-OIRL applies offline inverse reinforcement learning to existing prompt evaluation data, learning a reward model that performs cost-efficient, query-specific prompt assessment and selection, validated across multiple models and datasets. A minimal sketch of the selection step appears below.
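
To make the idea concrete, here is a minimal sketch (not the paper's code) of query-dependent prompt selection with an offline-learned reward model: each candidate prompt is scored for the incoming query and the highest-scoring one is used. The names (`select_prompt`, `CANDIDATE_PROMPTS`, `toy_reward`) and the toy scoring rule are illustrative assumptions; in Prompt-OIRL the scorer would be the reward model trained on logged prompt evaluation data.

```python
from typing import Callable, List, Tuple

# Hypothetical pool of candidate zero-shot reasoning prompts.
CANDIDATE_PROMPTS: List[str] = [
    "Let's think step by step.",
    "Break the problem into smaller parts before answering.",
    "Answer directly with the final number.",
]

def select_prompt(
    query: str,
    prompts: List[str],
    score_fn: Callable[[str, str], float],
) -> Tuple[str, float]:
    """Pick the prompt the reward model scores highest for this query.

    score_fn(query, prompt) stands in for a reward model learned offline
    from logged (query, prompt, success) data; no LLM call is made at
    selection time, which is what keeps this step cheap.
    """
    scored = [(p, score_fn(query, p)) for p in prompts]
    return max(scored, key=lambda item: item[1])

if __name__ == "__main__":
    # Toy stand-in reward model: favors step-by-step prompts for longer queries.
    def toy_reward(query: str, prompt: str) -> float:
        return len(query) * float("step" in prompt.lower()) + 0.01 * len(prompt)

    query = "A train travels 60 km in 45 minutes. What is its average speed in km/h?"
    best_prompt, score = select_prompt(query, CANDIDATE_PROMPTS, toy_reward)
    print(f"Selected prompt: {best_prompt!r} (score={score:.2f})")
```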