This system implements a Retrieval-Augmented Generation (RAG) approach with an integrated feedback loop. It aims to improve the quality and relevance of responses over time by incorporating user feedback and dynamically adjusting the retrieval process.

Motivation

Traditional RAG systems can sometimes produce inconsistent or irrelevant responses due to limitations in the retrieval process or the underlying knowledge base. By implementing a feedback loop, we can:

Continuously improve the quality of retrieved documents
Enhance the relevance of generated responses
Adapt the system to user preferences and needs over time

Key Components

PDF Content Extraction: Extracts text from PDF documents
Vectorstore: Stores and indexes document embeddings for efficient retrieval
Retriever: Fetches relevant documents based on user queries
Language Model: Generates responses using retrieved documents
Feedback Collection: Gathers user feedback on response quality and relevance
Feedback Storage: Persists user feedback for future use
Relevance Score Adjustment: Modifies document relevance based on feedback
Index Fine-tuning: Periodically updates the vectorstore using accumulated feedback

Method Details

1. Initial Setup

The system reads PDF content and creates a vectorstore
A retriever is initialized using the vectorstore
A language model (LLM) is set up for response generation

2. Query Processing

When a user submits a query, the retriever fetches relevant documents
The LLM generates a response based on the retrieved documents

3. Feedback Collection

The system collects user feedback on the response’s relevance and quality
Feedback is stored in a JSON file for persistence

4. Relevance Score Adjustment

For subsequent queries, the system loads previous feedback
An LLM evaluates the relevance of past feedback to the current query
Document relevance scores are adjusted based on this evaluation
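
The adjustment in this step comes down to a multiplicative rescaling. Below is a minimal sketch of the arithmetic, assuming ratings on a 1-5 scale with 3 as the neutral point (as the implementation later in this notebook does) and a hypothetical starting score of 1.0. The helper name `scaled_score` is illustrative and does not appear in the notebook.

```python
def scaled_score(base_score: float, ratings: list, neutral: float = 3.0) -> float:
    """Rescale a document score by the mean feedback rating relative to neutral."""
    if not ratings:
        return base_score  # no relevant feedback: leave the score unchanged
    avg = sum(ratings) / len(ratings)
    return base_score * (avg / neutral)

boosted = scaled_score(1.0, [5, 4])   # mean 4.5, factor 1.5
demoted = scaled_score(1.0, [1, 2])   # mean 1.5, factor 0.5
unchanged = scaled_score(1.0, [])     # no ratings, score stays as-is
```

Ratings above the neutral point raise a document's score and ratings below it lower the score, so repeated poor feedback gradually pushes a document down the ranking.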

5. Retriever Update

The retriever is updated with the adjusted document scores
This ensures that future retrievals benefit from past feedback

6. Periodic Index Fine-tuning

At regular intervals, the system fine-tunes the index
High-quality feedback is used to create additional documents
The vectorstore is updated with these new documents, improving overall retrieval quality
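
The notebook does not specify how "regular intervals" are scheduled beyond suggesting daily or weekly runs. One possible sketch is a simple time-based check; the function name `is_fine_tune_due` and the weekly default are assumptions made for illustration.

```python
from datetime import datetime, timedelta

def is_fine_tune_due(last_run: datetime, now: datetime,
                     interval: timedelta = timedelta(days=7)) -> bool:
    """Return True once at least `interval` has passed since the last fine-tuning run."""
    return now - last_run >= interval

last_run = datetime(2024, 1, 1)
due_after_3_days = is_fine_tune_due(last_run, datetime(2024, 1, 4))
due_after_8_days = is_fine_tune_due(last_run, datetime(2024, 1, 9))
```

A check like this could guard the fine-tuning call so that the index is rebuilt only when enough time has passed.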

Benefits of this Approach

Continuous Improvement: The system learns from each interaction, gradually enhancing its performance.
Personalization: By incorporating user feedback, the system can adapt to individual or group preferences over time.
Increased Relevance: The feedback loop helps prioritize more relevant documents in future retrievals.
Quality Control: Low-quality or irrelevant responses are less likely to be repeated as the system evolves.
Adaptability: The system can adjust to changes in user needs or document contents over time.

This RAG system with a feedback loop extends a traditional RAG pipeline with a mechanism for learning from user interactions. By continuously incorporating feedback, it offers a more dynamic, adaptive, and user-centric approach to information retrieval and response generation. It is particularly valuable in domains where information accuracy and relevance are critical and where user needs may evolve over time.

While the implementation adds complexity compared to a basic RAG system, the benefits in terms of response quality and user satisfaction make it a worthwhile investment for applications requiring high-quality, context-aware information retrieval and generation.

Import relevant libraries

In [ ]:

import os
import sys
from dotenv import load_dotenv
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import ChatOpenAI
from langchain.chains import RetrievalQA
import json
from typing import List, Dict, Any
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), '..'))) # Add the parent directory to the path since we work with notebooks
from helper_functions import *
from evaluation.evalute_rag import *
# Load environment variables from a .env file
load_dotenv()
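# Set the OpenAI API key environment variable (only if it is defined, since assigning None to os.environ raises a TypeError)
if os.getenv('OPENAI_API_KEY'):
    os.environ["OPENAI_API_KEY"] = os.getenv('OPENAI_API_KEY')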
os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"

Define documents path

In [2]:

path = "../data/Understanding_Climate_Change.pdf"

Create vector store and retrieval QA chain

In [3]:

content = read_pdf_to_string(path)
vectorstore = encode_from_string(content)
retriever = vectorstore.as_retriever()
llm = ChatOpenAI(temperature=0, model_name="gpt-4o", max_tokens=4000)
qa_chain = RetrievalQA.from_chain_type(llm, retriever=retriever)

Function to format user feedback in a dictionary

In [4]:

def get_user_feedback(query, response, relevance, quality, comments=""):
    return {
        "query": query,
        "response": response,
        "relevance": int(relevance),
        "quality": int(quality),
        "comments": comments
    }

Function to store the feedback in a JSON file

In [5]:

def store_feedback(feedback):
    with open("../data/feedback_data.json", "a") as f:
        json.dump(feedback, f)
        f.write("\n")

Function to read the feedback file

In [6]:

def load_feedback_data():
    feedback_data = []
    try:
        with open("../data/feedback_data.json", "r") as f:
            for line in f:
                feedback_data.append(json.loads(line.strip()))
    except FileNotFoundError:
        print("No feedback data file found. Starting with empty feedback.")
    return feedback_data
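
Taken together, the two functions above treat the file as JSON Lines (one JSON object per line) rather than as a single JSON document, despite the `.json` extension. The sketch below reproduces that round trip on a temporary file instead of `../data/feedback_data.json`; the sample records are made up for illustration.

```python
import json
import os
import tempfile

records = [
    {"query": "q1", "response": "r1", "relevance": 5, "quality": 4, "comments": ""},
    {"query": "q2", "response": "r2", "relevance": 2, "quality": 3, "comments": "too vague"},
]

# Append each record as one line, mirroring the append-mode write used above
tmp_path = os.path.join(tempfile.mkdtemp(), "feedback_demo.json")
for record in records:
    with open(tmp_path, "a") as f:
        json.dump(record, f)
        f.write("\n")

# Read the file back line by line, skipping any blank lines
with open(tmp_path) as f:
    loaded = [json.loads(line) for line in f if line.strip()]
```

Because every line is parsed independently, calling `json.load` on the whole file would fail as soon as it holds more than one record.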

Function to adjust document relevance based on the feedback file

In [7]:

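# BaseModel, Field and PromptTemplate are assumed to be provided by the helper_functions star import above
class Response(BaseModel):
    answer: str = Field(..., description="The answer to the question. The options can be only 'Yes' or 'No'")
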
def adjust_relevance_scores(query: str, docs: List[Any], feedback_data: List[Dict[str, Any]]) -> List[Any]:
    # Create a prompt template for relevance checking
    relevance_prompt = PromptTemplate(
        input_variables=["query", "feedback_query", "doc_content", "feedback_response"],
        template="""
        Determine if the following feedback response is relevant to the current query and document content.
        You are also provided with the Feedback original query that was used to generate the feedback response.
        Current query: {query}
        Feedback query: {feedback_query}
        Document content: {doc_content}
        Feedback response: {feedback_response}
        
        Is this feedback relevant? Respond with only 'Yes' or 'No'.
        """
    )
    llm = ChatOpenAI(temperature=0, model_name="gpt-4o", max_tokens=4000)
    # Create an LLMChain for relevance checking
    relevance_chain = relevance_prompt | llm.with_structured_output(Response)
    for doc in docs:
        relevant_feedback = []       
        for feedback in feedback_data:
            # Use LLM to check relevance
            input_data = {
                "query": query,
                "feedback_query": feedback['query'],
                "doc_content": doc.page_content[:1000],
                "feedback_response": feedback['response']
            }
            result = relevance_chain.invoke(input_data).answer
            
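            # Compare case-insensitively: the prompt asks for 'Yes' or 'No', so an exact match on 'yes' would never succeed
            if result.strip().lower() == 'yes':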
                relevant_feedback.append(feedback)        
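        # Retrieved documents may not carry a score yet; default to a neutral 1.0 (an assumption)
        doc.metadata.setdefault('relevance_score', 1.0)
        # Adjust the relevance score based on feedback
        if relevant_feedback:
            avg_relevance = sum(f['relevance'] for f in relevant_feedback) / len(relevant_feedback)
            doc.metadata['relevance_score'] *= (avg_relevance / 3)  # Assuming a 1-5 scale, 3 is neutral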
    
    # Re-rank documents based on adjusted scores
    return sorted(docs, key=lambda x: x.metadata['relevance_score'], reverse=True)
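
To see the re-ranking behaviour without calling an LLM, the sketch below swaps the relevance chain for a hypothetical keyword-overlap check, `is_relevant`, and uses a stand-in `Doc` dataclass in place of LangChain's document class. Both are assumptions made so the example runs offline; only the scoring and sorting mirror the function above.

```python
from dataclasses import dataclass, field

@dataclass
class Doc:
    """Stand-in for a retrieved document (assumption: only these two fields matter here)."""
    page_content: str
    metadata: dict = field(default_factory=dict)

def is_relevant(feedback: dict, doc: Doc) -> bool:
    """Hypothetical stub for the LLM judgement: any shared word counts as relevant."""
    query_words = set(feedback["query"].lower().split())
    doc_words = set(doc.page_content.lower().split())
    return bool(query_words & doc_words)

def rerank(docs: list, feedback_data: list) -> list:
    for doc in docs:
        ratings = [f["relevance"] for f in feedback_data if is_relevant(f, doc)]
        score = doc.metadata.setdefault("relevance_score", 1.0)  # neutral default
        if ratings:
            doc.metadata["relevance_score"] = score * (sum(ratings) / len(ratings) / 3)
    return sorted(docs, key=lambda d: d.metadata["relevance_score"], reverse=True)

docs = [Doc("oceans absorb heat"), Doc("greenhouse gases trap heat")]
feedback_data = [{"query": "greenhouse effect", "response": "...", "relevance": 5}]
ranked = rerank(docs, feedback_data)
```

Here the second document shares a word with the feedback query, receives the 5/3 multiplier, and moves to the top of the ranking, while the first keeps its neutral score.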

Function to fine-tune the vector index so that it also includes queries and answers that received good feedback

In [13]:

def fine_tune_index(feedback_data: List[Dict[str, Any]], texts: str) -> Any:
    # Filter high-quality responses
    good_responses = [f for f in feedback_data if f['relevance'] >= 4 and f['quality'] >= 4]    
    # Extract queries and responses, and create new documents
    additional_texts = []
    for f in good_responses:
        combined_text = f['query'] + " " + f['response']
        additional_texts.append(combined_text)
    # Join the additional texts into a single string
    additional_texts = " ".join(additional_texts)
    # Create a new index with original and high-quality texts (space-separated so words do not run together)
    all_texts = texts + " " + additional_texts
    new_vectorstore = encode_from_string(all_texts)   
    return new_vectorstore
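
As a small illustration of the filtering step, the sketch below applies the same threshold (relevance and quality both at least 4) to two made-up feedback records and builds the text that would be appended to the original content. The records are invented for the example.

```python
feedback_data = [
    {"query": "What is the greenhouse effect?", "response": "Gases trap heat.",
     "relevance": 5, "quality": 4},
    {"query": "What causes tides?", "response": "Unsure.",
     "relevance": 2, "quality": 5},
]

# Keep only records rated highly on both dimensions
good = [f for f in feedback_data if f["relevance"] >= 4 and f["quality"] >= 4]

# Combine each surviving query with its response into one block of text
additional_text = " ".join(f["query"] + " " + f["response"] for f in good)
```

The second record is dropped because its relevance rating is below the threshold, even though its quality rating is high.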

Demonstration of how to retrieve answers while taking user feedback into account

In [29]:

query = "What is the greenhouse effect?"
# Get response from RAG system
response = qa_chain.invoke({"query": query})["result"]
relevance = 5
quality = 5
# Collect feedback
feedback = get_user_feedback(query, response, relevance, quality)
# Store feedback
store_feedback(feedback)
# Adjust relevance scores for future retrievals
docs = retriever.invoke(query)
adjusted_docs = adjust_relevance_scores(query, docs, load_feedback_data())
# Update the retriever with adjusted docs
retriever.search_kwargs['k'] = len(adjusted_docs)
retriever.search_kwargs['docs'] = adjusted_docs
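
Whether a `docs` entry in `search_kwargs` affects later retrievals depends on the underlying vector store. An alternative sketch, not what the notebook does, is to wrap retrieval and re-ranking in one function so the feedback is applied on every query. The name `make_feedback_aware_retrieve` and the stand-in callables below are hypothetical.

```python
from typing import Any, Callable, Dict, List

def make_feedback_aware_retrieve(
    retrieve: Callable[[str], List[Any]],
    rerank: Callable[[str, List[Any], List[Dict[str, Any]]], List[Any]],
    load_feedback: Callable[[], List[Dict[str, Any]]],
) -> Callable[[str], List[Any]]:
    """Return a function that retrieves documents and re-ranks them with stored feedback."""
    def retrieve_with_feedback(query: str) -> List[Any]:
        docs = retrieve(query)
        return rerank(query, docs, load_feedback())
    return retrieve_with_feedback

# Stand-ins so the sketch runs without a vector store or an LLM
fake_retrieve = lambda query: ["doc_a", "doc_b"]
fake_rerank = lambda query, docs, feedback: list(reversed(docs)) if feedback else docs
fake_load = lambda: [{"query": "q", "relevance": 5}]

retrieve_with_feedback = make_feedback_aware_retrieve(fake_retrieve, fake_rerank, fake_load)
result = retrieve_with_feedback("What is the greenhouse effect?")
```

In the notebook, the retriever's retrieval call, `adjust_relevance_scores`, and `load_feedback_data` would presumably take the place of the three stand-ins.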

Fine-tune the vectorstore periodically

In [14]:

# Periodically (e.g., daily or weekly), fine-tune the index
new_vectorstore = fine_tune_index(load_feedback_data(), content)
retriever = new_vectorstore.as_retriever()


RAG System with Feedback Loop