Use the vitals package with ellmer to evaluate and compare the accuracy of LLMs, including writing evals to test local models.
People are getting excessive mental health advice from generative AI. This is unsolicited advice. Here's the backstory and what to do about it. An AI Insider scoop.
Use headings for responses longer than five lines. Use numbered lists for sequences and bullet lists for collections. Use tables for comparisons by default. Avoid tables that will be too wide for the ...
Shambaugh recently closed a request from one such AI agent, because the issue it was attempting to weigh in on was open only to human contributors. The bot then retaliated by writing a 'hit piece' about ...