Collaboration & evaluation for LLM apps

Practical AI: Machine Learning, Data Science - A podcast by Changelog Media

Small changes in prompts can cause large changes in the output of generative AI models. Add to that the uncertainty around how to properly evaluate LLM applications, and you have a recipe for confusion and frustration. Raza and the Humanloop team have been diving into these problems, and in this episode Raza helps us understand how non-technical prompt engineers can collaborate productively with technical software engineers while building AI-driven apps.