Evaluating Long-Context Question and Answer Systems (eugeneyan.com)
15 points by swyx 6 days ago | 1 comment
  • rooftopzen 2 days ago

    Seems AI-generated; if not, there's nothing new here. The post regurgitates info that has been known for a long time and misses the biggest issues and nuances of “LLM-as-a-judge”, as if it were written in 2023 for an audience living under a rock (why?):

    >> This is where LLM-evaluators (also called “LLM-as-Judge”) can help