LLM

Building an End-2-End Agentic RAG

Kunal Verma

Sep 3, 2025 • 7 min read

No, it’s not about juggling various frameworks, chunking strategies, or LangChain/LlamaIndex; just a direct, straightforward implementation.

Something very interesting just happened: Weaviate, a popular vector database company, just open-sourced Elysia, an efficient way to build end-to-end agentic RAG.

What’s the current scenario & problem?

Anyone out there who has tried to build a RAG or question-answering AI chat app knows that AI loves to act confident, but the moment you ask a real question, it usually crumbles. You know the pain: you load the documents, connect your databases by applying glue across dozens of frameworks, hook it all up to some LLM, cross your fingers, and pray that the answers are accurate and not hallucinated nonsense.
Still, 9 times out of 10 the answers will be either irrelevant to your context or too vague to be of any use. That’s the core issue with traditional RAG systems. They’re basically blind. They just grab vector embeddings, run a similarity search, and hope for the best.

That’s why the team at Weaviate decided to take this whole problem head-on.

They built a new system called Elysia that does things differently: it shows you the exact steps behind the answers it generates, adapts how it displays your data with tables, product cards, or charts, and even learns from your feedback. This is honestly bliss for the developer community as a whole: instead of making you build all the parts individually, Elysia combines them into one.
Let’s talk more about it.

Introduction: What is Elysia?

An open-source, agentic RAG framework that represents a fundamental rethink of how data can be fed through AI to generate a more factual and more user-friendly representation of that data.

Elysia is a decision-tree-based agentic system (a decision tree is a model that makes decisions by asking a series of questions about the data until it reaches an answer). It intelligently decides what tools to use, reviews what results have been obtained, and determines whether it should continue the process or whether its goal has been completed. It offers both a full frontend interface and an easy-to-use Python package.

By default, it sits on top of your Weaviate database and performs smart searches, automatically generating unique filters and search parameters based on just natural language from the user, and displays the results dynamically on the frontend.

The best part is that it is open source, so we can customise it to our needs and create tools for whatever purpose we need an agentic AI for: for example, customer support, an internal search engine, text-to-SQL, etc.

Architecture of Elysia

Elysia is architected as a modern web application with a full-featured frontend for a responsive, real-time interface and a FastAPI backend serving both the web interface and the API.

The core logic is written in pure Python as custom logic, with DSPy (a framework for programming LLM pipelines instead of hand-writing prompts) handling LLM interactions for flexible, future-proof implementations.

Three Pillars of Elysia

1. Decision trees & decision agents

At the heart of Elysia is its decision tree architecture. Unlike simple agentic platforms, which have access to all possible tools at runtime, Elysia has a pre-defined web of nodes, each with a corresponding action. Each node in the tree is orchestrated by a decision agent with global context awareness about its environment and its available options. The decision agent evaluates its environment, available actions, past actions and future actions to strategize the best tool to use.

Think of it as agent function calling, but where your functions can be databases, logic, tools, previous context, etc.

The decision agent also outputs reasoning, which is handed down to future agents to continue working towards the same goal. Each agent will be aware of the previous agent’s intentions.
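To make the idea concrete, here is a minimal sketch of a decision agent choosing among a node's options and handing its reasoning forward. This is not Elysia's API: `Node`, `TreeState`, `Decision` and `toy_decision_agent` are hypothetical names, and a simple word-overlap rule stands in for the LLM call.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Decision:
    tool: str       # name of the chosen action
    reasoning: str  # explanation handed down to later agents


@dataclass
class Node:
    name: str
    options: Dict[str, str]  # tool name -> short description


@dataclass
class TreeState:
    query: str
    history: List[Decision] = field(default_factory=list)


def toy_decision_agent(node: Node, state: TreeState) -> Decision:
    """Stand-in for the LLM call: pick the option whose description
    shares the most words with the user's query."""
    query_words = set(state.query.lower().split())

    def overlap(item):
        _, description = item
        return len(query_words & set(description.lower().split()))

    tool, _ = max(node.options.items(), key=overlap)
    previous = state.history[-1].reasoning if state.history else "none"
    decision = Decision(
        tool=tool,
        reasoning=f"Chose '{tool}' at node '{node.name}' (previous: {previous})",
    )
    state.history.append(decision)  # later agents can read earlier intentions
    return decision


# Hypothetical root node with three available actions.
root = Node("root", {
    "query": "search a collection for matching items",
    "aggregate": "count or average values in a collection",
    "text_response": "reply to the user directly",
})
state = TreeState(query="count the average price")
first = toy_decision_agent(root, state)
second = toy_decision_agent(root, state)
print(first.tool)
print(second.reasoning)
```

The point of the sketch is the shape of the data flow: every decision is made against the node's limited set of options, and each decision carries the previous agent's reasoning along with it.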

The tree structure also enables advanced error handling mechanisms and completion conditions. For example, the agents can set an “impossible flag” during a tree step when they determine a task cannot be completed with the available data. (This is so cool and will save a lot of time.)
For example, if you ask about trouser prices in an ecommerce collection but only have a jewelry collection available, the agent will recognize this mismatch and report back to the decision tree that the task is impossible.
Likewise, if a search returns irrelevant results, they go to the decision agent, which recognises this and queries again with different search terms (self-evaluating, in simple words, instead of just giving us nonsense).

Additionally, when tools encounter errors due to connection issues or typos, the decision agent decides intelligently on the best way to tackle them. To prevent infinite loops, there is a hard limit setting as well.
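The three safeguards described above (the impossible flag, retrying with different search terms, and a hard step limit) can be sketched in a small loop. Everything here is an assumption made for illustration: `run_tree`, `toy_search`, `StepResult` and the "drop the last word" reformulation are hypothetical stand-ins, not Elysia's implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StepResult:
    objects: List[str] = field(default_factory=list)
    impossible: bool = False  # set when the data cannot answer the query


def toy_search(terms: str, collections: Dict[str, dict]) -> StepResult:
    words = set(terms.lower().split())
    # No collection covers any query word: flag the task as impossible.
    if not any(words & set(c["topics"]) for c in collections.values()):
        return StepResult(impossible=True)
    # An item is a hit only if it contains every search word.
    hits = [item for c in collections.values() for item in c["items"]
            if words <= set(item.lower().split())]
    return StepResult(objects=hits)


def run_tree(query, collections, search=toy_search, max_steps=5):
    """Loop until objects are found, the task is flagged impossible,
    or the hard step limit is reached."""
    log = []
    terms = query
    for step in range(1, max_steps + 1):
        try:
            result = search(terms, collections)
        except ConnectionError as exc:
            log.append(f"step {step}: tool error ({exc}), retrying")
            continue
        if result.impossible:
            log.append(f"step {step}: task flagged impossible")
            return "impossible", [], log
        if result.objects:
            log.append(f"step {step}: found {len(result.objects)} object(s)")
            return "done", result.objects, log
        # Nothing relevant: retry with different (here: shortened) terms.
        shorter = terms.split()[:-1]
        terms = " ".join(shorter) if shorter else query
        log.append(f"step {step}: nothing relevant, retrying with '{terms}'")
    return "step_limit", [], log


shop = {"Jewelry": {"topics": ["ring", "necklace", "gold", "silver"],
                    "items": ["gold ring", "silver necklace"]}}
print(run_tree("trouser prices", shop)[0])
print(run_tree("gold ring cheap", shop))
```

The trouser query stops immediately with the impossible status, while the second query fails once, gets reformulated, and succeeds on the second step. A tool that keeps raising `ConnectionError` would simply run into `max_steps` instead of looping forever.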

Developers can add custom tools and branches in a super simple way, as per their requirements, with additional features like auto-run tools, for example triggering a summarising tool when the chat context exceeds 50,000 tokens.
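As a rough sketch of what an auto-run condition like that could look like, the snippet below checks a token estimate before every turn and only then runs a summariser. The names `estimate_tokens`, `summarise` and `auto_run` are hypothetical, the 4-characters-per-token rule is a crude assumption, and the summariser is a placeholder rather than a real LLM call.

```python
from typing import List

TOKEN_LIMIT = 50_000  # threshold mentioned in the article


def estimate_tokens(messages: List[str]) -> int:
    # Rough assumption: about 4 characters per token.
    return sum(len(m) for m in messages) // 4


def summarise(messages: List[str], keep_last: int = 2) -> List[str]:
    """Placeholder summariser: collapse older messages into one stub line."""
    older, recent = messages[:-keep_last], messages[-keep_last:]
    return [f"[summary of {len(older)} earlier messages]"] + recent


def auto_run(messages: List[str], limit: int = TOKEN_LIMIT) -> List[str]:
    """Run the summariser only when its trigger condition holds."""
    if estimate_tokens(messages) > limit:
        return summarise(messages)
    return messages


long_chat = ["a" * 120_000, "b" * 120_000, "short question", "short answer"]
print(auto_run(long_chat))
print(auto_run(["hi", "there"]))
```

The design choice worth noticing is that the tool is not picked by the decision agent at all; it fires on its own whenever the condition is true.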

So that it isn’t a black-box AI system, real-time observability is also enabled out of the box in Elysia’s frontend: it displays the entire decision tree as it is traversed, allowing you to watch the LLM’s reasoning within each node as it processes the query. This is very important and badly needed when it comes to RAG apps.

Think of this as building an n8n workflow but for RAG applications, where you can see why and how the answers are generated, which data sources they refer to, and what to tweak.

2. Displaying data sources in dynamic formats

The best thing about Elysia is that you don’t need to build a custom user interface each time you build a new AI agent: maybe a customer support agent where the output is text, maybe an ecommerce app where the output is a product card UI, or maybe a data analysis agent where the output should be a table.

Elysia manages this out of the box and dynamically chooses how to display data based on what makes the most sense for the content and context. The system currently has seven different display formats: generic data display, tables, ecommerce product cards, tickets, conversations and messages, documents, and charts.

By default, Elysia offers built-in user interfaces that are generated dynamically based on the response.

They’ll continue adding more displays as well.
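The article does not spell out how the display type is picked, so the following is only a toy illustration of the idea of mapping the shape of the data to a display format. The function `choose_display`, the field names it looks for, and the rule order are all assumptions; in Elysia the choice is made by the system itself, not by hard-coded rules like these.

```python
from typing import Dict, List


def choose_display(records: List[Dict[str, object]]) -> str:
    """Pick a display type from the fields present in the records."""
    if not records:
        return "generic"
    keys = set().union(*(r.keys() for r in records))
    if {"name", "price"} <= keys:
        return "product_card"
    if {"subject", "status"} <= keys:
        return "ticket"
    if {"author", "message"} <= keys:
        return "conversation"
    if {"title", "content"} <= keys:
        return "document"
    # Several records with identical fields read well as a table.
    same_shape = all(r.keys() == records[0].keys() for r in records)
    if same_shape and len(records) > 1:
        return "table"
    return "generic"


print(choose_display([{"name": "Gold ring", "price": 120}]))
print(choose_display([{"region": "EU", "sales": 10},
                      {"region": "US", "sales": 12}]))
```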

3. Elysia is an automatic expert on your data

Naive RAG systems, like Verba or apps built with LangChain or LlamaIndex, can deeply struggle with complex data, multiple data types or locations, or data that is repeated or similar, because, well, they don’t have the full picture of the environment.

How is Elysia doing this?
Super simple, but beautiful: Elysia, as mentioned above, sits on top of a Weaviate cluster, so it already knows your data structure. It generates metadata about your collections and feeds that to the LLM, which gives full context to each query you ask Elysia, enabling it to handle complex queries and provide knowledgeable responses.
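Here is a minimal sketch of that "describe the data first, then ask" idea, using plain Python dictionaries instead of a real Weaviate collection. The helpers `describe_collection` and `build_prompt` are hypothetical, and the sketch assumes every field holds values of one consistent type.

```python
from typing import Dict, List


def describe_collection(name: str, records: List[Dict[str, object]]) -> str:
    """Summarise field names, types and value ranges of a collection."""
    fields: Dict[str, dict] = {}
    for record in records:
        for key, value in record.items():
            info = fields.setdefault(
                key, {"type": type(value).__name__, "values": []})
            info["values"].append(value)

    lines = [f"Collection '{name}' with {len(records)} objects:"]
    for key, info in fields.items():
        values = info["values"]
        if info["type"] in ("int", "float"):
            detail = f"range {min(values)} to {max(values)}"
        else:
            distinct = sorted(set(map(str, values)))
            detail = f"{len(distinct)} distinct values, e.g. {distinct[0]}"
        lines.append(f"- {key} ({info['type']}): {detail}")
    return "\n".join(lines)


def build_prompt(query: str, metadata: str) -> str:
    # The metadata goes in front of the question so the model knows
    # which fields and value ranges exist before it plans a search.
    return f"{metadata}\n\nUser question: {query}"


jewelry = [{"name": "Ring", "price": 120.0},
           {"name": "Necklace", "price": 80.0}]
metadata = describe_collection("Jewelry", jewelry)
print(build_prompt("What is the cheapest item?", metadata))
```

With a summary like this in the context, a model can tell that `price` is numeric and roughly what values to expect, which is the kind of knowledge a blind similarity search never has.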

And obviously, you can analyse, view and edit your data on Elysia’s native data dashboard, which gives you an overview of all available collections, while the collection explorer allows detailed inspection of individual datasets.

A few other things they introduced

Self-Learning Feedback System

This is not your regular (👍/👎) system; it goes beyond that.
Each user maintains their own set of feedback examples stored within their Weaviate instance. In simple words, each user’s own preferences, feedback, likes and dislikes get stored in a separate store.
When the user makes a query, Elysia first uses vector similarity matching to search for similar past queries that the user has rated positively, and adds them to the context as few-shot examples.
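The retrieval step can be sketched in a few lines. This is only an illustration under heavy assumptions: `FeedbackStore` is a hypothetical in-memory class rather than a Weaviate collection, and a bag-of-words vector with cosine similarity stands in for real embeddings.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Counter:
    # Toy embedding: word counts instead of a learned vector.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class FeedbackStore:
    """Keeps positively rated (query, answer) pairs separately per user."""

    def __init__(self) -> None:
        self._examples: Dict[str, List[Tuple[str, str, Counter]]] = {}

    def add_positive(self, user_id: str, query: str, answer: str) -> None:
        self._examples.setdefault(user_id, []).append(
            (query, answer, embed(query)))

    def few_shot(self, user_id: str, query: str, k: int = 2,
                 min_score: float = 0.3) -> List[Tuple[str, str]]:
        vector = embed(query)
        scored = [(cosine(vector, v), q, a)
                  for q, a, v in self._examples.get(user_id, [])]
        scored = [s for s in scored if s[0] >= min_score]
        scored.sort(key=lambda s: s[0], reverse=True)
        return [(q, a) for _, q, a in scored[:k]]


store = FeedbackStore()
store.add_positive("alice", "cheapest gold ring", "A table sorted by price")
store.add_positive("alice", "open support tickets", "A ticket list")
print(store.few_shot("alice", "cheapest silver ring"))
print(store.few_shot("bob", "cheapest silver ring"))
```

Only Alice's earlier ring query is similar enough to be reused, and Bob gets nothing because he has no feedback of his own, which mirrors the per-user separation described above.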

Getting rid of the chunking headache

Yes, this is something that was needed, literally.

Traditional RAG systems want you to experiment with every chunking strategy in the universe, so Elysia came up with chunking at query time. In other words, instead of having to deal with pre-chunking strategies, the initial search uses document-level vectors (like a high-level summary), which give a good overview of the main points of each document. When documents exceed a token threshold and prove relevant to the query, Elysia steps in and chunks them dynamically.
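A small sketch of that two-stage flow: rank whole documents first, and only chunk the ones that are both relevant and long. The function names, the word-overlap score (a stand-in for vector similarity) and the use of word counts as "tokens" are all assumptions made to keep the example self-contained.

```python
from typing import List


def score(query: str, text: str) -> int:
    # Stand-in for vector similarity: number of shared distinct words.
    return len(set(query.lower().split()) & set(text.lower().split()))


def chunk(text: str, size: int) -> List[str]:
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def search(query: str, documents: List[str], token_threshold: int = 40,
           chunk_size: int = 20, top_k: int = 2) -> List[str]:
    # Stage 1: document-level search, no pre-chunking involved.
    ranked = sorted(documents, key=lambda d: score(query, d),
                    reverse=True)[:top_k]
    results = []
    for doc in ranked:
        if score(query, doc) == 0:
            continue  # not relevant, never chunked
        if len(doc.split()) > token_threshold:
            # Stage 2: relevant and long, so chunk now and keep the best part.
            results.append(max(chunk(doc, chunk_size),
                               key=lambda c: score(query, c)))
        else:
            results.append(doc)
    return results


long_doc = " ".join(["filler"] * 50
                    + ["refund", "policy", "lasts", "30", "days"]
                    + ["filler"] * 50)
short_doc = "shipping takes five days"
print(search("refund policy", [long_doc, short_doc]))
```

Short or irrelevant documents never pay the chunking cost, which is the whole appeal of doing it at query time.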

❌ One thing I didn’t like is their frontend/backend stack

Instead of building the frontend in Next.js (which would be cool, and easy to edit), they shipped static HTML, stating that this is “eliminating the need for a separate nodejs”, which I just don’t understand. The thing developers love most is complete control over all the components in terms of customisation.
But anyway, this whole system is still pretty cool.

Conclusion

What Elysia did is not rocket science; it is just playing with metadata, a summary of your database and schema structures, and feeding that to LLMs before making a query.
A few frameworks (like WrenAI and Vanna.AI) have also tried this, but Elysia automated it along with other cool features like a dynamic UI and a general RAG implementation system, using Weaviate as a base.
But again, this is cool and can definitely help us create accurate RAG, and that’s what matters to us in the end.
Check out Elysia: https://github.com/weaviate/elysia