RAG-POWERED KNOWLEDGE BASE
Web ScrapingLLMRAGPrompt Engineering
CONTEXT AND OBJECTIVE
As part of the Swiss democratic system, the legislative branch of the government can ask formal questions to the executive branch, which is then required to answer in written form. These answers contain interesting information on various public topics such as healthcare and education. While publicly available, they are difficult to find in practice. Web searches are not reliable and navigating the government website can be time consuming.
The goal of this project was the retrieve in a knowledge base all publicly available documents, and to design a RAG interface to provide fast and convenient access to all this information.
WHAT WAS DONE
Using Python’s BeautifulSoup packages, the relevant documents were scraped from the government website.
They were uploaded on a cloud Retrieval-Augmented Generation (RAG) platform, powered by OpenAI ChatGPT-4 API. The system was configured, via prompt engineering, to act as a chatbot specialized on retriving answers from the executive branch.