MSL: Enhancing LLM Recommenders via Masked Softmax Loss
Best AI papers explained - A podcast by Enoch H. Kang - Tuesdays

The paper "MSL: Not All Tokens Are What You Need for Tuning LLM as a Recommender" identifies two limitations of the standard language modeling loss when fine-tuning large language models as recommenders: the objective diverges from the actual recommendation goal, and treating all non-positive item descriptions as negatives injects misleading negative signals. To address both issues, the authors introduce Masked Softmax Loss (MSL), which masks invalid tokens when computing the loss so that training better aligns with recommendation objectives. Because this masking can cause gradients to vanish (with fewer competing tokens in the softmax, the target probability can saturate), the paper further proposes an Adaptive Temperature Strategy (ATS) that dynamically adjusts the softmax temperature during training. Experiments across multiple datasets validate the effectiveness of MSL, demonstrating significant improvements over existing methods.
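
To make the masking idea concrete, here is a minimal PyTorch sketch of a masked softmax cross-entropy with a temperature parameter. The function name, tensor shapes, and the fixed scalar temperature are illustrative assumptions rather than the paper's exact implementation; in particular, the paper's ATS would adapt the temperature during training instead of fixing it.

```python
import torch
import torch.nn.functional as F

def masked_softmax_loss(logits, targets, valid_token_mask, temperature=1.0):
    """Cross-entropy over valid tokens only (a sketch of the MSL idea).

    logits:           (batch, vocab) raw next-token scores
    targets:          (batch,) indices of the ground-truth tokens;
                      each must be marked valid in the mask
    valid_token_mask: (batch, vocab) bool; True marks tokens that are
                      legitimate continuations (e.g., tokens of real
                      item titles), False marks tokens to exclude
    temperature:      softmax temperature; a fixed scalar here, where
                      the paper's ATS would adjust it adaptively
    """
    scaled = logits / temperature
    # Masked tokens get -inf, so they drop out of the softmax
    # normalization and contribute no misleading negative gradient.
    scaled = scaled.masked_fill(~valid_token_mask, float("-inf"))
    return F.cross_entropy(scaled, targets)

# Toy usage: a 6-token vocabulary where only tokens {1, 3, 4} belong
# to valid item titles at this decoding step.
logits = torch.randn(2, 6)
targets = torch.tensor([1, 4])
mask = torch.zeros(2, 6, dtype=torch.bool)
mask[:, [1, 3, 4]] = True
loss = masked_softmax_loss(logits, targets, mask, temperature=0.8)
```

Note the design choice in this sketch: invalid tokens are removed from the softmax denominator entirely rather than merely down-weighted, which is what prevents valid but non-target item tokens from being pushed down as false negatives.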