From the Front Lines to the Test Bench: Applied AI and RAG Evaluation, ft. BCG
BCG X is thrilled to host an exclusive in-person Tech Talk for the MIT CSAIL community featuring Harnish Jani (https://www.linkedin.com/in/harnishjani/), Managing Director and Partner at BCG X, and Max Struever (https://www.linkedin.com/in/maxstruever/), Principal at BCG X.
Join us for two dynamic discussions exploring GenAI-augmented field operations and retrieval-augmented generation (RAG) evaluation.
Discussion A (12:00 PM – 12:30 PM ET)
Session Title: Accelerating Value Unlock for Field Operations, Maintenance, and Safety Through (Gen)AI
Led by: Harnish Jani
In high-stakes, hands-on environments like field services, the potential for AI is massive, but realizing that potential isn’t straightforward. The explosive emergence of Generative AI has dominated headlines, leading innovators and operators in the field services sector to ask deeper questions: What does this technology actually mean for operations on the ground? How do we drive adoption among technicians? And how can we turn hype into tangible impact? In this session, we’ll explore practical applications of (Gen)AI across field operations, maintenance, and safety, grounded in lessons from real-world deployments. We’ll highlight the roles being transformed and how to build the trust needed to develop next-gen field technicians.
Key topics include: Artificial Intelligence and Generative AI; Next-Gen Field Technicians (Upskilling and Workforce Transformation); Trust and Adoption in (Gen)AI Implementation.
Discussion B (12:30 PM – 1:00 PM ET)
Session Title: Do You Know What You’re Testing? A Generalizable Coverage Metric for RAG Evaluation
Led by: Max Struever
Most LLM evaluation tools focus on output quality and often overlook a critical question: whether the evaluation questions themselves comprehensively and representatively test the underlying knowledge base. In this session, we’ll introduce a new, generalizable methodology for assessing test set coverage in retrieval-augmented generation (RAG) systems. This approach is especially important in constrained domains like enterprise RAG applications. We’ll present a metric-based framework using vector embeddings to quantify coverage, share real-world examples where coverage diagnostics enhanced evaluation validity, and explore broader implications for benchmarking, system tuning, and error analysis.
Key topics include: Retrieval-Augmented Generation (RAG); LLM Evaluation; Coverage Metrics.
We hope you’ll join us for this engaging hour of insights and conversation. Lunch will be served, and there will be time for live Q&A with both speakers. Reserve your seat now and be part of the conversation shaping the future of tech!
REGISTRATION: https://airtable.com/appbkKhr4y8cVCCdm/shr8GUIWjRQxIhjsv