Interpretable Conversation Routing with Latent Embeddings Approach

Abstract

Large language models (LLMs) are being rapidly deployed in question-answering and support systems to automate customer experience across domains, including medical use cases. Models in such environments must handle a variety of tasks: general knowledge questions, queries to external sources, function calling, and more. Some requests may not require full text generation at all, while others may call for different prompts or even different models. All of this can be managed by a routing step. This paper focuses on interpretable few-shot approaches to conversation routing, such as latent embeddings retrieval. The work presents a benchmark, a thorough analysis, and a set of visualizations of how latent embeddings routing behaves for long-context conversations in a multilingual, domain-specific environment. The results show that a latent embeddings router can match the performance of LLM-based routers while offering additional interpretability and a higher degree of control over model decision making.
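
The core mechanism the abstract refers to, routing by latent embeddings retrieval, can be illustrated with a minimal sketch: each route is described by a handful of exemplar utterances, an incoming query is embedded into the same latent space, and the route whose exemplars are most similar wins. The embedding model, route labels, exemplars, and threshold below are illustrative assumptions, not the authors' configuration.

```python
import numpy as np
from sentence_transformers import SentenceTransformer  # pip install sentence-transformers

# Hypothetical route definitions: a few exemplar utterances per route.
# Labels and examples are illustrative, not taken from the paper.
ROUTES = {
    "general_knowledge": ["What is diabetes?", "Explain how vaccines work."],
    "function_call": ["Book an appointment for Monday.", "Cancel my last order."],
    "external_search": ["What are the clinic's opening hours today?"],
}

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

# Pre-compute exemplar embeddings once; normalizing makes the dot
# product equal to cosine similarity.
route_embeddings = {
    name: model.encode(examples, normalize_embeddings=True)
    for name, examples in ROUTES.items()
}

def route(query: str, threshold: float = 0.4) -> str:
    """Return the best-matching route, or a fallback when nothing is similar enough."""
    q = model.encode([query], normalize_embeddings=True)[0]
    # Score each route by its most similar exemplar (max cosine similarity).
    scores = {name: float(np.max(embs @ q)) for name, embs in route_embeddings.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else "fallback_llm"

print(route("Can you explain what hypertension is?"))  # likely -> general_knowledge
```

Interpretability follows naturally from this setup: the per-route similarity scores and the nearest exemplar behind each decision can be surfaced directly, and routing behavior is adjusted by editing exemplars or the threshold rather than by retraining a model.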
