Ask HN: Which RAG evaluations do you care about?
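I'm currently improving the testing part of our infrastructure at Morphik (https://github.com/morphik-org/morphik-core). There seem to be a lot of different RAG evaluations out there, and I'm curious which ones you care about the most.
Of course, these will differ by use case, but I'd love to learn more about your use case, and which benchmark or eval matters most for it and why.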