RAG (Retrieval-Augmented Generation) is a way of improving an AI model's answers by providing it with relevant information. Why do AI models like ChatGPT or Gemini lie to you? The reason is simple: AI has no common sense about facts it was never shown, so it fills the gap with plausible-sounding guesses. RAG helps the model generate an answer based on related information that is retrieved for it.
With AnythingLLM, you can easily build your own RAG. This matters particularly when your source information must not leak to outside systems: AnythingLLM is a RAG application that runs in your local environment (your laptop or desktop, even without an internet connection).
First, you need a local LLM for AnythingLLM. Ollama provides a good number of LLMs, including recently released models: https://ollama.com/
Once Ollama is installed, install a few LLM models. I run Ollama on a low-spec PC (my laptop), so I try to use small models such as:
qwen2.5:0.5b (Alibaba)
gemma3:1b (Google)
llama3.1 / llama3.2 (Meta)
deepseek-r1 (DeepSeek)
exaone3.5 / exaone-deep (LG)
phi4-mini (Microsoft)
You can find many more on the Ollama website. Feel free to choose and test models until you get a satisfactory result; you can pull and try them from a script as well, as in the sketch below.
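As an illustration, here is a minimal Python sketch for pulling and smoke-testing a model. It assumes the official `ollama` Python package is installed (`pip install ollama`) and the Ollama server is running locally; the model tags are the ones from the list above.

```python
import ollama

# Model tags from the list above, following ollama.com library naming.
SMALL_MODELS = ["qwen2.5:0.5b", "gemma3:1b", "phi4-mini"]

for tag in SMALL_MODELS:
    ollama.pull(tag)  # downloads the model if it is not present yet

# Quick smoke test: ask one model a question and print the reply.
reply = ollama.chat(
    model="qwen2.5:0.5b",
    messages=[{"role": "user", "content": "In one sentence, what is RAG?"}],
)
print(reply["message"]["content"])
```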
Now we are ready to install AnythingLLM: https://anythingllm.com/
On the first launch after installation, it will ask which LLM you are going to use. Select Ollama, and AnythingLLM will answer using the LLMs installed in Ollama.
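If AnythingLLM can't see your models at this step, the usual cause is that the Ollama server isn't running. A quick way to check, assuming Ollama's default local port 11434, is to query its tags endpoint:

```python
import requests  # assumes the requests package is installed

# Ollama exposes a local REST API; /api/tags lists the installed models.
resp = requests.get("http://localhost:11434/api/tags", timeout=5)
resp.raise_for_status()
for model in resp.json().get("models", []):
    print(model["name"])  # e.g. "qwen2.5:0.5b"
```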
The first step in building a RAG is creating a workspace where you upload your documents. Upload a text file to the workspace, and the embedding process begins. Embedding converts text into a vector (a list of numbers), which is essential for information retrieval. AnythingLLM splits the text in the document into chunks, embeds each chunk, and saves the vectors in an internal vector database. Relevant information is found by converting the user prompt into a vector as well: chunks whose vectors are most similar to the prompt's vector are retrieved.
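To make the idea concrete, here is a minimal sketch of vector retrieval. It assumes the `ollama` Python package and an embedding model such as `nomic-embed-text` pulled separately; AnythingLLM ships its own embedder and vector database internally, so this only illustrates the principle, not its implementation.

```python
import math
import ollama

def embed(text: str) -> list[float]:
    # Assumption: nomic-embed-text was pulled with `ollama pull nomic-embed-text`.
    return ollama.embeddings(model="nomic-embed-text", prompt=text)["embedding"]

def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity: how closely two vectors point in the same direction.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

chunks = [
    "Our refund policy allows returns within 30 days.",
    "The office is closed on national holidays.",
]
index = [(c, embed(c)) for c in chunks]  # a tiny in-memory "vector database"

query = "Can I return a product after two weeks?"
q_vec = embed(query)
best = max(index, key=lambda item: cosine(q_vec, item[1]))
print(best[0])  # the chunk most similar to the prompt
```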
Once the embedding is ready, you're good to go. AnythingLLM will answer after searching for information in the uploaded text. You can check the source of the information through the citations. If a citation is irrelevant to the prompt, there is a high chance that the model's answer is also irrelevant. In my case study, RAG didn't work well, probably because of the document format: I uploaded a PDF file with a lot of tables, and the model couldn't reliably pull information out of them. RAG can also search information not only from documents but also from YouTube videos; providing a YouTube URL is enough to retrieve information from the video.
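Under the hood, the answering step boils down to stuffing the retrieved chunks into the prompt and asking the LLM, roughly as in this sketch (continuing the toy retrieval example above; the model tag and the prompt wording are assumptions, not AnythingLLM's actual template):

```python
import ollama

def answer_with_context(question: str, retrieved_chunks: list[str]) -> str:
    # Build a grounded prompt: the model is told to answer only from the
    # retrieved context, which is what makes the citations meaningful.
    context = "\n\n".join(retrieved_chunks)
    prompt = (
        "Answer the question using ONLY the context below. "
        "If the context is not relevant, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    reply = ollama.chat(
        model="qwen2.5:0.5b",  # any model installed in Ollama
        messages=[{"role": "user", "content": prompt}],
    )
    return reply["message"]["content"]

print(answer_with_context(
    "Can I return a product after two weeks?",
    ["Our refund policy allows returns within 30 days."],
))
```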