I am building an app using RAG (retrieval-augmented generation). I have loaded some private documents into a vector database, and I can have a conversation with the AI model (I am using OpenAI) about the data in those documents and receive satisfactory responses. The right data is retrieved from the vector database based on my query prompts (the user message). However, when I ask a question that is completely outside the context of the documents loaded in the vector database, I still receive a response, despite no documents being retrieved from the vector database.
For example, when I ask a question like 'How many planets are there in our Solar System', I still get an answer from the AI model even though 0 documents (from the vector database) are passed to it in the system prompt. Why is this happening? Is there a way I can prevent the AI model from answering questions outside the context of the documents?
This is my code for the vector similarity search:
private Message generateSystemMessage(String message) {
    LOGGER.info("Retrieving documents");
    List<Document> similarDocuments = vectorStore.similaritySearch(SearchRequest.query(message)
            .withTopK(2).withSimilarityThreshold(0.75));
    LOGGER.info("Found {} similar documents", similarDocuments.size());
    SystemPromptTemplate systemPromptTemplate = new SystemPromptTemplate(this.systemPromptResource);
    if (similarDocuments.isEmpty()) {
        return systemPromptTemplate.createMessage(Map.of("documents", "No information found"));
    }
    String documentContent = similarDocuments.stream().map(Document::getContent).collect(Collectors.joining("\n"));
    return systemPromptTemplate.createMessage(Map.of("documents", documentContent));
}
This is my system prompt:
`You are a helpful assistant, conversing with a user about the subjects contained in a set of documents. Use the information from the DOCUMENTS section to provide accurate answers. If unsure or if the answer isn't found in the DOCUMENTS section, simply state that you don't know the answer. And do not answer to any question that is not related to the context provided in the DOCUMENTS section.
DOCUMENTS: {documents}`
These are the logs when I ask a question outside the context of the documents provided:
`2024-03-13T21:26:17.172+11:00 INFO 25298 --- [io-8080-exec-10] c.example.rag.qa.StreamingChatService : Retrieving documents
2024-03-13T21:26:17.714+11:00 INFO 25298 --- [io-8080-exec-10] c.example.rag.qa.StreamingChatService : Found 0 similar documents
2024-03-13T21:26:17.716+11:00 INFO 25298 --- [io-8080-exec-10] c.example.rag.qa.StreamingChatService : The system prompt is -- You are a helpful assistant, conversing with a user about the subjects contained in a set of documents. Use the information from the DOCUMENTS section to provide accurate answers. If unsure or if the answer isn't found in the DOCUMENTS section, simply state that you don't know the answer. And do not answer to any question that is not related to the context provided in the DOCUMENTS section.
DOCUMENTS: No information found `
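One option that might address this is to skip the model call entirely when retrieval returns nothing, rather than relying on the system prompt alone. The following is only a minimal sketch in plain Java, not the Spring AI API: `RetrievalGuard`, `FALLBACK_REPLY`, and the `retriever` and `model` parameters are hypothetical stand-ins for the vector store lookup and the chat model call.

```java
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical helper (not part of Spring AI): decides whether the model is called at all.
final class RetrievalGuard {

    // Hypothetical canned reply used when no documents were retrieved.
    static final String FALLBACK_REPLY = "I don't know the answer.";

    private RetrievalGuard() {
    }

    /**
     * @param question  the user message
     * @param retriever stand-in for the similarity search; returns the matching document contents
     * @param model     stand-in for the chat model call; receives (joined documents, question)
     */
    static String answer(String question,
                         Function<String, List<String>> retriever,
                         BiFunction<String, String, String> model) {
        List<String> documents = retriever.apply(question);
        if (documents.isEmpty()) {
            // Nothing was retrieved, so the model is never invoked and
            // cannot fall back on its general knowledge.
            return FALLBACK_REPLY;
        }
        // Join the document contents the same way the question's code does.
        return model.apply(String.join("\n", documents), question);
    }
}
```

With a guard like this, an out-of-context question such as the planets example would get the fixed fallback reply whenever the similarity search comes back empty. It would not help when the search does return loosely related documents, so it is at best a partial answer to the problem.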