RAG System with Feedback Loop
This system implements a Retrieval-Augmented Generation (RAG) approach with an integrated feedback loop. It aims to improve the quality and relevance of responses over time by incorporating user feedback and dynamically adjusting the retrieval process.
Motivation
Traditional RAG systems can sometimes produce inconsistent or irrelevant responses due to limitations in the retrieval process or the underlying knowledge base. By implementing a feedback loop, we can:
- Continuously improve the quality of retrieved documents
- Enhance the relevance of generated responses
- Adapt the system to user preferences and needs over time
Key Components
- PDF Content Extraction: Extracts text from PDF documents
- Vectorstore: Stores and indexes document embeddings for efficient retrieval
- Retriever: Fetches relevant documents based on user queries
- Language Model: Generates responses using retrieved documents
- Feedback Collection: Gathers user feedback on response quality and relevance
- Feedback Storage: Persists user feedback for future use
- Relevance Score Adjustment: Modifies document relevance based on feedback
- Index Fine-tuning: Periodically updates the vectorstore using accumulated feedback
Method Details
1. Initial Setup
- The system reads PDF content and creates a vectorstore
- A retriever is initialized using the vectorstore
- A language model (LLM) is set up for response generation
2. Query Processing
- When a user submits a query, the retriever fetches relevant documents
- The LLM generates a response based on the retrieved documents
3. Feedback Collection
- The system collects user feedback on the response’s relevance and quality
- Feedback is appended to a JSON file, one JSON object per line, so it persists across sessions
4. Relevance Score Adjustment
- For subsequent queries, the system loads previous feedback
- An LLM evaluates the relevance of past feedback to the current query
- Each document's relevance score is multiplied by the average relevance rating of the applicable feedback divided by 3, the neutral point of the 1-5 rating scale
5. Retriever Update
- The retriever is updated with the adjusted document scores
- This ensures that future retrievals benefit from past feedback
6. Periodic Index Fine-tuning
- At regular intervals, the system fine-tunes the index
- High-quality feedback is used to create additional documents
- The vectorstore is updated with these new documents, improving overall retrieval quality
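The score adjustment in step 4 comes down to simple arithmetic. The following is a minimal sketch, assuming ratings on a 1-5 scale with 3 as the neutral point; the function name `adjusted_score` is hypothetical and exists only for illustration.

```python
from typing import List


def adjusted_score(score: float, ratings: List[int], neutral: float = 3.0) -> float:
    """Scale a document's score by the mean feedback rating relative to neutral.

    Hypothetical helper for illustration; assumes ratings on a 1-5 scale.
    """
    if not ratings:
        # No applicable feedback: leave the score unchanged
        return score
    avg = sum(ratings) / len(ratings)
    return score * (avg / neutral)


print(adjusted_score(1.0, [5, 4]))  # mean rating 4.5 is above neutral, so the score grows
print(adjusted_score(1.0, [2]))     # rating below neutral, so the score shrinks
print(adjusted_score(1.0, []))      # no feedback, score unchanged
```

Because the adjustment is multiplicative, it compounds if the same document object is re-scored on every query. A production system might prefer to recompute from a stored base score instead.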
Benefits of this Approach
- Continuous Improvement: The system learns from each interaction, gradually enhancing its performance.
- Personalization: By incorporating user feedback, the system can adapt to individual or group preferences over time.
- Increased Relevance: The feedback loop helps prioritize more relevant documents in future retrievals.
- Quality Control: Low-quality or irrelevant responses are less likely to be repeated as the system evolves.
- Adaptability: The system can adjust to changes in user needs or document contents over time.
By learning from user interactions, this RAG system with a feedback loop offers a more adaptive and user-centric approach to retrieval and response generation than a static RAG pipeline. It is particularly useful in domains where information accuracy and relevance are critical and where user needs evolve over time.
The feedback loop adds complexity compared to a basic RAG system: each query triggers additional LLM calls to judge whether past feedback applies, and the index must be rebuilt periodically. For applications that require high-quality, context-aware retrieval and generation, the gains in response quality can justify that cost.
Import relevant libraries
import os
import sys
from dotenv import load_dotenv
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import ChatOpenAI
from langchain.chains import RetrievalQA
import json
from typing import List, Dict, Any
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), '..')))  # Add the parent directory to the path since we work with notebooks
from helper_functions import *
from evaluation.evalute_rag import *
# Load environment variables from a .env file
load_dotenv()
# Set the OpenAI API key environment variable
os.environ["OPENAI_API_KEY"] = os.getenv('OPENAI_API_KEY')
os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"
Define documents path
path = "../data/Understanding_Climate_Change.pdf"
Create vector store and retrieval QA chain
content = read_pdf_to_string(path)
vectorstore = encode_from_string(content)
retriever = vectorstore.as_retriever()
llm = ChatOpenAI(temperature=0, model_name="gpt-4o", max_tokens=4000)
qa_chain = RetrievalQA.from_chain_type(llm, retriever=retriever)
Function to format user feedback in a dictionary
def get_user_feedback(query, response, relevance, quality, comments=""):
    return {
        "query": query,
        "response": response,
        "relevance": int(relevance),
        "quality": int(quality),
        "comments": comments
    }
Function to store the feedback in a json file
def store_feedback(feedback):
    with open("../data/feedback_data.json", "a") as f:
        json.dump(feedback, f)
        f.write("\n")
Function to read the feedback file
def load_feedback_data():
    feedback_data = []
    try:
        with open("../data/feedback_data.json", "r") as f:
            for line in f:
                feedback_data.append(json.loads(line.strip()))
    except FileNotFoundError:
        print("No feedback data file found. Starting with empty feedback.")
    return feedback_data
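The store/load pair above follows an append-and-reread pattern with one JSON object per line. As a self-contained sketch of that pattern, the example below runs a round trip against a temporary file. The names `append_feedback`, `read_feedback`, and the demo file name are hypothetical, and skipping blank lines is an added safeguard rather than part of the original functions.

```python
import json
import os
import tempfile
from typing import Any, Dict, List


def append_feedback(path: str, feedback: Dict[str, Any]) -> None:
    """Append one feedback record as a single JSON line."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(feedback) + "\n")


def read_feedback(path: str) -> List[Dict[str, Any]]:
    """Read all feedback records, skipping blank lines."""
    if not os.path.exists(path):
        return []
    with open(path, "r", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Round trip against a temporary file (hypothetical path, for illustration only)
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "feedback_demo.jsonl")
    append_feedback(demo_path, {"query": "q1", "response": "r1", "relevance": 5, "quality": 4})
    append_feedback(demo_path, {"query": "q2", "response": "r2", "relevance": 2, "quality": 3})
    records = read_feedback(demo_path)

print(len(records), records[0]["query"])
```

Appending keeps writes cheap and never rewrites earlier records, at the cost of re-parsing the whole file on each load.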
Function to adjust document relevance based on the stored feedback
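# BaseModel, Field, and PromptTemplate are assumed to come from the wildcard import of helper_functions
class Response(BaseModel):
    answer: str = Field(..., description="The answer to the question. The options can be only 'Yes' or 'No'")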
def adjust_relevance_scores(query: str, docs: List[Any], feedback_data: List[Dict[str, Any]]) -> List[Any]:
    # Create a prompt template for relevance checking
    relevance_prompt = PromptTemplate(
        input_variables=["query", "feedback_query", "doc_content", "feedback_response"],
        template="""
        Determine if the following feedback response is relevant to the current query and document content.
        You are also provided with the original feedback query that was used to generate the feedback response.
        Current query: {query}
        Feedback query: {feedback_query}
        Document content: {doc_content}
        Feedback response: {feedback_response}
        Is this feedback relevant? Respond with only 'Yes' or 'No'.
        """
    )
    llm = ChatOpenAI(temperature=0, model_name="gpt-4o", max_tokens=4000)

    # Build the relevance-checking chain; the LLM returns a structured Response
    relevance_chain = relevance_prompt | llm.with_structured_output(Response)

    for doc in docs:
        relevant_feedback = []
        for feedback in feedback_data:
            # Use the LLM to check whether this feedback applies to the current query and document
            input_data = {
                "query": query,
                "feedback_query": feedback['query'],
                "doc_content": doc.page_content[:1000],
                "feedback_response": feedback['response']
            }
            result = relevance_chain.invoke(input_data).answer
            # Compare case-insensitively: the prompt asks for 'Yes' or 'No'
            if result.strip().lower() == 'yes':
                relevant_feedback.append(feedback)

        # Adjust the relevance score based on feedback
        if relevant_feedback:
            avg_relevance = sum(f['relevance'] for f in relevant_feedback) / len(relevant_feedback)
            # Default to 1.0 if the document has no score yet
            current_score = doc.metadata.get('relevance_score', 1.0)
            doc.metadata['relevance_score'] = current_score * (avg_relevance / 3)  # Assuming a 1-5 scale, 3 is neutral

    # Re-rank documents based on adjusted scores
    return sorted(docs, key=lambda x: x.metadata.get('relevance_score', 1.0), reverse=True)
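To experiment with the re-ranking logic without making API calls, the relevance check can be made pluggable. The sketch below is an illustration, not the implementation above: it swaps the LLM judgment for a toy keyword-overlap judge. `Doc`, `rerank_with_feedback`, and `keyword_judge` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Doc:
    """Stand-in for a retrieved document (hypothetical; mirrors page_content/metadata)."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def rerank_with_feedback(
    query: str,
    docs: List[Doc],
    feedback_data: List[Dict[str, Any]],
    is_relevant: Callable[[str, Doc, Dict[str, Any]], bool],
) -> List[Doc]:
    """Scale each document's score by the mean rating of feedback judged relevant."""
    for doc in docs:
        ratings = [f["relevance"] for f in feedback_data if is_relevant(query, doc, f)]
        if ratings:
            base = doc.metadata.get("relevance_score", 1.0)
            # Ratings assumed to be on a 1-5 scale with 3 as neutral
            doc.metadata["relevance_score"] = base * (sum(ratings) / len(ratings)) / 3
    return sorted(docs, key=lambda d: d.metadata.get("relevance_score", 1.0), reverse=True)


def keyword_judge(query: str, doc: Doc, feedback: Dict[str, Any]) -> bool:
    """Toy judge: feedback applies if its query shares a word with the document."""
    feedback_words = set(feedback["query"].lower().split())
    doc_words = set(doc.page_content.lower().split())
    return bool(feedback_words & doc_words)


docs = [Doc("Oceans absorb heat"), Doc("Greenhouse gases trap heat")]
feedback = [{"query": "greenhouse effect", "response": "example answer", "relevance": 5, "quality": 5}]
ranked = rerank_with_feedback("What is the greenhouse effect?", docs, feedback, keyword_judge)
print([d.page_content for d in ranked])
```

Passing the judge as a function keeps the scoring arithmetic testable on its own. The LLM-based check can be supplied later as another callable with the same signature.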
Function to fine-tune the vector index so that it also includes queries and answers that received good feedback
def fine_tune_index(feedback_data: List[Dict[str, Any]], texts: str) -> Any:
    # Filter high-quality responses
    good_responses = [f for f in feedback_data if f['relevance'] >= 4 and f['quality'] >= 4]

    # Extract queries and responses, and create new documents
    additional_texts = []
    for f in good_responses:
        combined_text = f['query'] + " " + f['response']
        additional_texts.append(combined_text)

    # Join the new texts into a single string
    additional_texts = " ".join(additional_texts)

    # Create a new index from the original text plus the high-quality texts
    # (the separator keeps the last original word from fusing with the first new one)
    all_texts = texts + " " + additional_texts
    new_vectorstore = encode_from_string(all_texts)

    return new_vectorstore
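The filtering step can be checked in isolation before any re-indexing takes place. The sketch below is a minimal illustration, assuming the same threshold of 4 on both ratings. `build_additional_text` is a hypothetical name, and the blank-line separator between pairs is an illustrative choice rather than the behavior of the function above.

```python
from typing import Any, Dict, List


def build_additional_text(feedback_data: List[Dict[str, Any]], threshold: int = 4) -> str:
    """Join query/response pairs whose relevance and quality both meet the threshold."""
    good = [
        f for f in feedback_data
        if f["relevance"] >= threshold and f["quality"] >= threshold
    ]
    return "\n\n".join(f"{f['query']} {f['response']}" for f in good)


sample = [
    {"query": "Q1?", "response": "A1.", "relevance": 5, "quality": 4},
    {"query": "Q2?", "response": "A2.", "relevance": 5, "quality": 2},  # filtered out: low quality
    {"query": "Q3?", "response": "A3.", "relevance": 4, "quality": 4},
]
extra = build_additional_text(sample)
print(repr(extra))
```

The same query may collect good feedback many times, so deduplicating pairs before indexing may be worth considering to keep repeated answers from dominating retrieval.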
Demonstration of how to retrieve answers while taking user feedback into account
query = "What is the greenhouse effect?"
# Get response from RAG system
response = qa_chain.invoke({"query": query})["result"]
relevance = 5
quality = 5
# Collect feedback
feedback = get_user_feedback(query, response, relevance, quality)
# Store feedback
store_feedback(feedback)
# Adjust relevance scores for future retrievals
docs = retriever.invoke(query)
adjusted_docs = adjust_relevance_scores(query, docs, load_feedback_data())
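# Update the retriever with adjusted docs
# Note: 'docs' is not a standard search parameter, so the underlying vector store may ignore or reject it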
retriever.search_kwargs['k'] = len(adjusted_docs)
retriever.search_kwargs['docs'] = adjusted_docs
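One way to make sure the adjusted scores influence later retrievals is to wrap the retriever so that every query passes through the re-ranking step. The sketch below is an illustration under stated assumptions. `RerankingRetriever` and `FakeRetriever` are hypothetical, the wrapped object only needs an `invoke(query)` method that returns a list, and the wrapper is not a LangChain retriever subclass, so it may not plug directly into chains that expect one.

```python
from typing import Any, Callable, List


class RerankingRetriever:
    """Wrap a retriever-like object and re-rank its results on every call.

    Hypothetical helper: `base` only needs an `invoke(query)` method that returns
    a list of documents, and `rerank` is any function (query, docs) -> docs.
    """

    def __init__(self, base: Any, rerank: Callable[[str, List[Any]], List[Any]], k: int = 4):
        self.base = base
        self.rerank = rerank
        self.k = k

    def invoke(self, query: str) -> List[Any]:
        docs = self.base.invoke(query)
        # Keep only the top-k documents after re-ranking
        return self.rerank(query, docs)[: self.k]


class FakeRetriever:
    """Toy retriever returning fixed results, used only to exercise the wrapper."""

    def invoke(self, query: str) -> List[str]:
        return ["short", "a much longer passage", "medium text"]


# Toy re-ranker for the demo: longest passage first
wrapped = RerankingRetriever(
    FakeRetriever(),
    lambda q, docs: sorted(docs, key=len, reverse=True),
    k=2,
)
top_docs = wrapped.invoke("any query")
print(top_docs)
```

In this notebook's setting, `rerank` could be a small function that calls `adjust_relevance_scores(query, docs, load_feedback_data())`.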
Fine-tune the vectorstore periodically
# Periodically (e.g., daily or weekly), fine-tune the index
new_vectorstore = fine_tune_index(load_feedback_data(), content)
retriever = new_vectorstore.as_retriever()

Reranking Methods in RAG Systems
