Artificial Intelligence
Running Evaluation Metrics With Different LLMs + Q/A
Hosted by
Mohammad Arshad

Tue, 26 Aug

03:30 PM - 04:30 PM

Online

Register to get link
Ticket Price: $9.99

About

In this beginner-friendly 40-minute workshop, you’ll learn a simple, repeatable way to evaluate Q&A answers from different LLMs using a tiny dataset and two complementary approaches: basic automatic scores (Exact Match/F1) and an “LLM-as-Judge” rubric for Correctness, Faithfulness, Relevance, and Conciseness. We’ll show how to compare models fairly (same prompt and settings, temperature=0, consistent context), interpret results, and turn findings into actions using a light Analyze → Measure → Open Coding → Axial Coding loop. You’ll leave with a plug-and-play rubric, a mini dataset template, and a beginner notebook that generates a clear side-by-side report, so you can pick the right model with confidence and iterate quickly.

Plus an AI Residency Q/A.

Add the DDS Google calendar link so that you don't miss any events.
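
To give a feel for the two approaches named above, here is a minimal Python sketch. It is not the workshop notebook: the normalization rules follow the common SQuAD-style convention (lowercase, drop punctuation and English articles), and the dataset rows, the model names model_a and model_b, the 1-5 scoring scale, and the function names are all assumptions made up for illustration. The judge part only builds a prompt; sending it to a model is left out.

```python
import re
import string
from collections import Counter

def normalize(text: str) -> list[str]:
    """Lowercase, drop punctuation and English articles, split into tokens."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()

def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized answers are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))

def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall against the reference."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

# Hypothetical mini dataset: one reference and one answer per model per question.
DATASET = [
    {"question": "What is the capital of France?", "reference": "Paris",
     "answers": {"model_a": "Paris",
                 "model_b": "The capital of France is Paris."}},
    {"question": "How many days are in a leap year?", "reference": "366",
     "answers": {"model_a": "365", "model_b": "366 days"}},
]

def compare(dataset: list[dict]) -> dict[str, dict[str, float]]:
    """Average EM and F1 per model over the whole dataset."""
    totals: dict[str, dict[str, float]] = {}
    for row in dataset:
        for model, answer in row["answers"].items():
            scores = totals.setdefault(model, {"em": 0.0, "f1": 0.0})
            scores["em"] += exact_match(answer, row["reference"])
            scores["f1"] += token_f1(answer, row["reference"])
    n = len(dataset)
    return {m: {k: v / n for k, v in s.items()} for m, s in totals.items()}

# Rubric criteria come from the event description; the 1-5 scale is an assumption.
RUBRIC = ["Correctness", "Faithfulness", "Relevance", "Conciseness"]

def build_judge_prompt(question: str, context: str, answer: str) -> str:
    """Build the same judge prompt for every model so the comparison stays fair."""
    criteria = "\n".join(f"- {name}: score 1-5" for name in RUBRIC)
    return (
        "Rate the answer on each criterion and explain briefly.\n"
        f"{criteria}\n\nQuestion: {question}\nContext: {context}\nAnswer: {answer}"
    )

if __name__ == "__main__":
    for model, scores in compare(DATASET).items():
        print(f"{model}: EM={scores['em']:.2f}  F1={scores['f1']:.2f}")
```

In this toy data, model_b never gets an exact match because it answers in full sentences, yet its token overlap with the references is still substantial. That gap between strict and lenient scores is one reason to pair automatic metrics with a rubric-based judge.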
4 people attending

Location

Register to get event link
Online
This event is part of a community
Artificial Intelligence
11,713 Members