We provide theoretical and empirical analysis of when proxy rewards can improve sample efficiency in preference learning. Our work establishes conditions under which proxy-reward approaches outperform direct preference optimization, offering practical guidance for integrating human feedback more efficiently into AI systems.