Explainable Retrieval in Document Search

This code implements an Explainable Retriever, a system that not only retrieves relevant documents based on a query but also provides explanations for why each retrieved document is relevant. It combines vector-based similarity search with natural language explanations, enhancing the transparency and interpretability of the retrieval process.

Motivation

Traditional document retrieval systems often work as black boxes, providing results without explaining why they were chosen. This lack of transparency can be problematic in scenarios where understanding the reasoning behind the results is crucial. The Explainable Retriever addresses this by offering insights into the relevance of each retrieved document.

Key Components

Vector store creation from input texts
Base retriever using FAISS for efficient similarity search
Language model (LLM) for generating explanations
Custom ExplainableRetriever class that combines retrieval and explanation generation

Method Details

Document Preprocessing and Vector Store Creation

Input texts are converted into embeddings using OpenAI’s embedding model.
A FAISS vector store is created from these embeddings for efficient similarity search.

Retriever Setup

A base retriever is created from the vector store, configured to return the top 5 most similar documents.
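To make the two steps above concrete, here is a minimal, dependency-free sketch of indexing texts as vectors and returning the top-k matches. It is not the notebook's implementation: `toy_embed`, `cosine`, and `ToyVectorStore` are hypothetical names, a bag-of-words count vector stands in for OpenAI embeddings, and a linear scan stands in for FAISS.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory index that stores one vector per input text."""

    def __init__(self, texts):
        self.entries = [(text, toy_embed(text)) for text in texts]

    def search(self, query, k=5):
        """Return up to k (score, text) pairs, most similar first."""
        query_vec = toy_embed(query)
        scored = [(cosine(query_vec, vec), text) for text, vec in self.entries]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


store = ToyVectorStore([
    "The sky is blue because of the way sunlight interacts with the atmosphere.",
    "Photosynthesis is the process by which plants use sunlight to produce energy.",
    "Global warming is caused by the increase of greenhouse gases in Earth's atmosphere.",
])
top_hits = store.search("Why is the sky blue?", k=5)
for score, text in top_hits:
    print(f"{score:.3f}  {text}")
```

In this sketch, asking for k=5 from a store holding three texts simply returns all three, ranked by similarity.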

Explanation Generation

An LLM (gpt-4o-mini in the code below, with temperature set to 0) is used to generate explanations.
A custom prompt template is defined to guide the LLM in explaining the relevance of retrieved documents.
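As a rough illustration of what the template does, the sketch below fills a query and one retrieved passage into a prompt string using plain `str.format`. The names `EXPLAIN_TEMPLATE` and `build_explain_prompt` are hypothetical; the notebook itself uses a LangChain `PromptTemplate` for this step.

```python
# Hypothetical template text, modeled on the explanation prompt described above.
EXPLAIN_TEMPLATE = (
    "Analyze the relationship between the following query and the retrieved context.\n"
    "Explain why this context is relevant to the query and how it might help answer the query.\n\n"
    "Query: {query}\n\n"
    "Context: {context}\n\n"
    "Explanation:"
)


def build_explain_prompt(query, context):
    """Fill the template with one query/document pair."""
    return EXPLAIN_TEMPLATE.format(query=query, context=context)


prompt_text = build_explain_prompt(
    query="Why is the sky blue?",
    context="The sky is blue because of the way sunlight interacts with the atmosphere.",
)
print(prompt_text)
```

Ending the prompt with "Explanation:" nudges the model to continue with the explanation itself rather than restating the query.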

ExplainableRetriever Class

Combines the base retriever and explanation generation into a single interface.
The retrieve_and_explain method:
- Retrieves relevant documents using the base retriever.
- For each retrieved document, generates an explanation of its relevance to the query.
- Returns a list of dictionaries containing both the document content and its explanation.
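The control flow of these three steps can be sketched without any external service by passing the retriever and the explainer in as plain functions. This is an assumption-laden illustration, not the class defined later: `fake_retrieve` and `fake_explain` are hypothetical stubs standing in for the FAISS retriever and the LLM call.

```python
from typing import Callable, Dict, List


def retrieve_and_explain(
    query: str,
    retrieve: Callable[[str], List[str]],
    explain: Callable[[str, str], str],
) -> List[Dict[str, str]]:
    """Pair every retrieved text with an explanation of its relevance."""
    results = []
    for content in retrieve(query):
        results.append({"content": content, "explanation": explain(query, content)})
    return results


# Hypothetical stubs standing in for the base retriever and the LLM call.
def fake_retrieve(query: str) -> List[str]:
    return ["The sky is blue because of the way sunlight interacts with the atmosphere."]


def fake_explain(query: str, content: str) -> str:
    return f"Placeholder explanation linking {query!r} to a {len(content)}-character passage."


explained = retrieve_and_explain("Why is the sky blue?", fake_retrieve, fake_explain)
for item in explained:
    print(item["content"])
    print(item["explanation"])
```

Because the explanation step runs once per retrieved document, returning k documents means k LLM calls, so latency and cost grow linearly with k.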

Benefits of this Approach

Transparency: Users can understand why specific documents were retrieved.
Trust: Explanations build user confidence in the system’s results.
Learning: Users can gain insights into the relationships between queries and documents.
Debugging: Easier to identify and correct issues in the retrieval process.
Customization: The explanation prompt can be tailored for different use cases or domains.

The Explainable Retriever represents a significant step towards more interpretable and trustworthy information retrieval systems. By providing natural language explanations alongside retrieved documents, it bridges the gap between powerful vector-based search techniques and human understanding. This approach has potential applications in various fields where the reasoning behind information retrieval is as important as the retrieved information itself, such as legal research, medical information systems, and educational tools.

Import libraries

In [ ]:

import os
import sys
from dotenv import load_dotenv
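sys.path.append(os.path.abspath(os.path.join(os.getcwd(), '..')))  # Add the parent directory to the path since we work with notebooks
from helper_functions import *  # expected to provide OpenAIEmbeddings, FAISS, ChatOpenAI and PromptTemplate
from evaluation.evalute_rag import *
# Load environment variables from a .env file
load_dotenv()
# Set the OpenAI API key environment variable, failing early if it is missing
api_key = os.getenv('OPENAI_API_KEY')
if api_key is None:
    raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file")
os.environ["OPENAI_API_KEY"] = api_key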

Define the explainable retriever class

In [6]:

class ExplainableRetriever:
    """Retrieve documents for a query and explain each one's relevance with an LLM."""

    def __init__(self, texts):
        self.embeddings = OpenAIEmbeddings()
        self.vectorstore = FAISS.from_texts(texts, self.embeddings)
        self.llm = ChatOpenAI(temperature=0, model_name="gpt-4o-mini", max_tokens=4000)
    
        # Create a base retriever
        self.retriever = self.vectorstore.as_retriever(search_kwargs={"k": 5})    
        # Create an explanation chain
        explain_prompt = PromptTemplate(
            input_variables=["query", "context"],
            template="""
            Analyze the relationship between the following query and the retrieved context.
            Explain why this context is relevant to the query and how it might help answer the query.
            
            Query: {query}
            
            Context: {context}
            
            Explanation:
            """
        )
        self.explain_chain = explain_prompt | self.llm

    def retrieve_and_explain(self, query):
        # Retrieve relevant documents (invoke() supersedes the deprecated get_relevant_documents())
        docs = self.retriever.invoke(query)
        
        explained_results = []
        
        for doc in docs:
            # Generate explanation
            input_data = {"query": query, "context": doc.page_content}
            explanation = self.explain_chain.invoke(input_data).content
            
            explained_results.append({
                "content": doc.page_content,
                "explanation": explanation
            })
        
        return explained_results

Create a mock example and explainable retriever instance

In [7]:

# Usage
texts = [
    "The sky is blue because of the way sunlight interacts with the atmosphere.",
    "Photosynthesis is the process by which plants use sunlight to produce energy.",
    "Global warming is caused by the increase of greenhouse gases in Earth's atmosphere."
]

explainable_retriever = ExplainableRetriever(texts)

Show the results

In [ ]:

query = "Why is the sky blue?"
results = explainable_retriever.retrieve_and_explain(query)

for i, result in enumerate(results, 1):
    print(f"Result {i}:")
    print(f"Content: {result['content']}")
    print(f"Explanation: {result['explanation']}")
    print()

Dr. Harun

