arsalandywriter.com

Enhancing Large Language Models with Retrieval Augmented Generation

Retrieval Augmented Generation (RAG) is emerging as a crucial technique for large language models (LLMs), which are increasingly central to organizations adopting artificial intelligence. LLMs have become popular for good reasons, but used carelessly they can produce unexpected responses, fabricated information, and biased output. Fabricated or unsupported output is commonly termed "hallucination," and it can arise from several factors.

To counteract LLM hallucinations, several strategies have been developed, including fine-tuning, prompt engineering, and notably, Retrieval Augmented Generation (RAG). RAG has garnered attention for its effectiveness in addressing the misinformation generated by large language models.

In this article, we will explore the workings of RAG through a practical implementation using SingleStore as a vector database for managing vector data.

What is Retrieval Augmented Generation (RAG)?

LLMs sometimes generate hallucinated outputs, and RAG serves as one of the methods to mitigate this issue. In response to a user query, RAG retrieves relevant information from a pre-defined source or dataset, which is stored in a vector database. Unlike traditional databases, a vector database is specifically designed for storing vector data.

Vector data is represented as embeddings, which encapsulate the context and meaning of various objects. For instance, if one seeks customized responses from an AI application, the organization’s documents can be transformed into embeddings using an embedding model and then stored in a vector database. When a query is issued to the AI application, it is converted into a vector query embedding. This embedding is then used to search the vector database for the most similar object through vector similarity search. As a result, the LLM-powered application is less likely to hallucinate, as it has been instructed to generate tailored responses based on the provided custom data.

A practical application could be in customer support, where specific data relevant to products or services is stored in a vector database. When a user inquiry is made, the application can generate an appropriate response rather than a generic one. Thus, RAG is transforming various domains.
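The lookup described above can be sketched in a few lines of plain Python. This is a minimal illustration, not how a vector database is implemented: the three-dimensional vectors are hand-made stand-ins for real embeddings, and the names `store` and `top_k` are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)

def top_k(query_vector, store, k=2):
    """Return the k (score, text) pairs most similar to the query vector."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Hypothetical support articles with made-up 3-dimensional "embeddings"
store = [
    ("How to reset your password", [0.9, 0.1, 0.0]),
    ("Refund policy for damaged items", [0.1, 0.9, 0.1]),
    ("Shipping times by region", [0.0, 0.2, 0.9]),
]

# Made-up embedding of a user query about account access
query_vector = [0.8, 0.2, 0.1]

for score, text in top_k(query_vector, store):
    print(f"{score:.3f}  {text}")
```

A real vector database performs the same kind of ranking over embeddings with hundreds or thousands of dimensions, typically using indexes so that it does not have to compare the query against every stored vector.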

RAG Pipeline

The RAG pipeline consists of three essential components: Retrieval, Augmentation, and Generation.

  • Retrieval: This component is responsible for sourcing relevant information from an external knowledge base, such as a vector database, for any given user query. This step is critical for curating meaningful and contextually accurate responses.
  • Augmentation: This stage combines the retrieved information with the user query, adding relevant context to the prompt so that the response can be tailored to the query.
  • Generation: Finally, the LLM compiles a conclusive output for the user, utilizing both its prior knowledge and the provided context to formulate an appropriate response.

These three components form the backbone of the RAG pipeline, ensuring that users receive contextually rich and accurate information. This is why RAG is particularly valuable in developing chatbots, question-answering systems, and similar applications.
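The three stages can be sketched as three small functions. This is a toy illustration under stated assumptions: retrieval here ranks passages by shared words rather than by vector similarity, and `fake_llm` is a hypothetical placeholder standing in for a real model call.

```python
def retrieve(query, knowledge_base, k=1):
    """Retrieval: rank passages by how many query words they share.

    Word overlap is a stand-in for vector similarity search.
    """
    query_words = set(query.lower().split())
    ranked = sorted(
        knowledge_base,
        key=lambda passage: len(query_words & set(passage.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def augment(query, passages):
    """Augmentation: place the retrieved passages in the prompt as context."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )

def generate(prompt, llm):
    """Generation: hand the augmented prompt to a language model."""
    return llm(prompt)

def fake_llm(prompt):
    # Placeholder: a real application would call a model here
    return f"[model output for a prompt of {len(prompt)} characters]"

# Hypothetical two-passage knowledge base
knowledge_base = [
    "Orders ship within two business days.",
    "Refunds are issued within ten days of a return.",
]

query = "When are refunds issued?"
passages = retrieve(query, knowledge_base)
prompt = augment(query, passages)
print(prompt)
print(generate(prompt, fake_llm))
```

Keeping the stages as separate functions makes it easy to swap the toy retriever for a vector database lookup, or the placeholder for a real LLM client, without touching the other two steps.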

RAG Tutorial

Let's construct a straightforward AI application that can retrieve contextually relevant information from our own dataset for any user query.

Begin by signing up for a SingleStore database to employ it as our vector database. After registration, create a workspace, which is a simple and free process.

Once your workspace is established, create a database and name it as desired.

From the interface, you can create the database via the 'Create Database' tab.

Next, navigate to ‘Develop’ to access the Notebooks feature, akin to Jupyter Notebooks.

Create a new Notebook and give it a name of your choice.

Before proceeding, select your workspace and database from the dropdown menu in the Notebook.

Now, start adding the following code snippets into your newly created Notebook.

Install the Required Libraries

!pip install openai numpy pandas singlestoredb langchain==0.1.8 langchain-community==0.0.21 langchain-core==0.1.25 langchain-openai==0.0.6

Vector Embeddings Example

def word_to_vector(word):
    # Define some basic rules for our vector components
    # (assumes a non-empty, lowercase word)
    vector = [0] * 5  # Initialize a vector of 5 dimensions

    # Rule 1: Length of the word (normalized to a max of 10 characters for simplicity)
    vector[0] = len(word) / 10

    # Rule 2: Number of vowels in the word (normalized to the length of the word)
    vowels = 'aeiou'
    vector[1] = sum(1 for char in word if char in vowels) / len(word)

    # Rule 3: Whether the word starts with a vowel (1) or not (0)
    vector[2] = 1 if word[0] in vowels else 0

    # Rule 4: Whether the word ends with a vowel (1) or not (0)
    vector[3] = 1 if word[-1] in vowels else 0

    # Rule 5: Fraction of consonants in the word
    vector[4] = sum(1 for char in word if char not in vowels and char.isalpha()) / len(word)

    return vector

# Example usage
word = "example"
vector = word_to_vector(word)
print(f"Word: {word}\nVector: {vector}")

Vector Similarity Example

import numpy as np

def cosine_similarity(vector_a, vector_b):
    # Calculate the dot product of vectors
    dot_product = np.dot(vector_a, vector_b)

    # Calculate the norm (magnitude) of each vector
    norm_a = np.linalg.norm(vector_a)
    norm_b = np.linalg.norm(vector_b)

    # Calculate cosine similarity
    similarity = dot_product / (norm_a * norm_b)

    return similarity

# Example usage
word1 = "example"
word2 = "sample"
vector1 = word_to_vector(word1)
vector2 = word_to_vector(word2)

# Calculate and print cosine similarity
similarity_score = cosine_similarity(vector1, vector2)
print(f"Cosine similarity between '{word1}' and '{word2}': {similarity_score}")

Embedding Models

from openai import OpenAI

OPENAI_KEY = "INSERT OPENAI KEY"
client = OpenAI(api_key=OPENAI_KEY)

def openAIEmbeddings(text):
    # Request an embedding for the text that was passed in
    response = client.embeddings.create(
        input=text,
        model="text-embedding-3-small"
    )
    return response.data[0].embedding

print(openAIEmbeddings("Golden Retriever"))

Creating a Vector Database with SingleStoreDB

We will utilize the LangChain framework and SingleStore as a vector database to store our embeddings, along with a public .txt file link containing Sherlock Holmes stories.

Add OpenAI API Key as an Environment Variable:

import os
os.environ['OPENAI_API_KEY'] = 'mention your openai api key'

Next, import the necessary libraries, specify the file to use in the example, load the file, split it, and insert the content into the SingleStore database. Finally, pose a query related to the document.

import openai
from langchain.text_splitter import CharacterTextSplitter
from langchain_community.document_loaders import TextLoader
from langchain_community.embeddings import OpenAIEmbeddings
from langchain_community.vectorstores.singlestoredb import SingleStoreDB
import os
import pandas as pd
import requests

# URL of the public .txt file you want to use
file_url = "https://sherlock-holm.es/stories/plain-text/stud.txt"

# Send a GET request to the file URL
response = requests.get(file_url)

# Proceed if the file was successfully downloaded
if response.status_code == 200:
    file_content = response.text

    # Save the content to a file
    file_path = 'downloaded_example.txt'
    with open(file_path, 'w', encoding='utf-8') as f:
        f.write(file_content)

    # Load and process documents
    loader = TextLoader(file_path, encoding='utf-8')  # Use the downloaded document
    documents = loader.load()
    text_splitter = CharacterTextSplitter(chunk_size=2000, chunk_overlap=0)
    docs = text_splitter.split_documents(documents)

    # Generate embeddings and create a document search database
    OPENAI_KEY = "add your openai key"  # Replace with your OpenAI API key
    embeddings = OpenAIEmbeddings(api_key=OPENAI_KEY)

    # Create Vector Database
    vector_database = SingleStoreDB.from_documents(docs, embeddings, table_name="scarlet")  # Replace "scarlet" with your own table name

    # Search for the chunks most similar to the query and print the best match
    query = "which university did he study?"
    docs = vector_database.similarity_search(query)
    print(docs[0].page_content)
else:
    print("Failed to download the file. Please check the URL and try again.")

After executing the above code, the Notebook prints the passage from the referenced Sherlock Holmes story that is most similar to the query. To ask a different question, change the value of query and run the cell again.

We successfully retrieved pertinent information from the provided dataset, which an LLM can then use as context when generating a response. By converting our file into embeddings and storing them in the SingleStore database, we established a retrievable corpus of information. This helps ensure that responses are relevant and grounded in the provided dataset.
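The splitting step in the tutorial code determines what each stored embedding represents, so it is worth seeing what chunk_size and chunk_overlap mean. The sketch below is a simplified fixed-window splitter written for illustration. It is not LangChain's CharacterTextSplitter implementation, which also takes separators into account, and the function name `split_into_chunks` is hypothetical.

```python
def split_into_chunks(text, chunk_size, chunk_overlap=0):
    """Cut text into windows of chunk_size characters.

    Consecutive windows share chunk_overlap characters, so a sentence that
    straddles a boundary still appears whole in at least one chunk when the
    overlap is large enough.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be at least 0 and less than chunk_size")
    step = chunk_size - chunk_overlap  # How far each new window advances
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]

sample = "abcdefghij"
print(split_into_chunks(sample, chunk_size=4))
# ['abcd', 'efgh', 'ij']
print(split_into_chunks(sample, chunk_size=4, chunk_overlap=2))
# ['abcd', 'cdef', 'efgh', 'ghij', 'ij']
```

Smaller chunks make each retrieved passage more focused, while larger chunks or some overlap preserve more surrounding context. The tutorial's values of 2000 and 0 are one reasonable starting point rather than a rule.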

This article is published under Generative AI Publication.
