Knowledge Base: Auto-RAG

Build autonomous brains by syncing PDF and Notion data to vector stores.


Step 1: Load and Chunk. Large PDFs exceed an LLM's context window (its token limit), so we split them into overlapping chunks.
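
To make the overlap idea concrete, here is a minimal sketch of a sliding-window chunker. It is an illustration under stated assumptions, not part of any specific library: the function name `chunk_with_overlap` and its parameters are hypothetical, and it measures size in characters rather than tokens to stay dependency-free.

```python
def chunk_with_overlap(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size windows that share `overlap` characters.

    Assumption: sizes are counted in characters, not model tokens.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_with_overlap("abcdefghij", chunk_size=4, overlap=2):
        print(piece)
```

Each chunk repeats the tail of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk. A production pipeline would likely count tokens with the target model's tokenizer instead of characters.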

Pipeline Steps

Step 1: Recursive Chunking

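Use a library such as LangChain to extract text and split it into chunks while preserving metadata (page numbers, source URLs). That metadata is essential for source attribution: it lets each answer point back to the page it came from.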
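
As a rough illustration of the idea, the sketch below implements recursive chunking in plain Python and attaches page and source metadata to every chunk. It does not use LangChain's API. The names `Chunk`, `recursive_split`, and `chunk_pages`, the separator order, and the example URL are all assumptions made for this example, and it assumes the PDF text has already been extracted into one string per page.

```python
from dataclasses import dataclass, field

# Assumed separator order: try the coarsest boundary first, then finer ones.
SEPARATORS = ("\n\n", "\n", ". ", " ")


@dataclass
class Chunk:
    text: str
    metadata: dict = field(default_factory=dict)


def _merge(pieces: list[str], max_chars: int, joiner: str) -> list[str]:
    """Greedily re-join adjacent pieces while the result fits in max_chars."""
    merged, current = [], ""
    for piece in pieces:
        candidate = piece if not current else current + joiner + piece
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                merged.append(current)
            current = piece
    if current:
        merged.append(current)
    return merged


def recursive_split(text: str, max_chars: int, separators=SEPARATORS) -> list[str]:
    """Split on the coarsest separator present, recursing into oversized parts."""
    text = text.strip()
    if len(text) <= max_chars:
        return [text] if text else []
    for i, sep in enumerate(separators):
        if sep in text:
            pieces = []
            for part in text.split(sep):
                pieces.extend(recursive_split(part, max_chars, separators[i + 1:]))
            return _merge(pieces, max_chars, sep)
    # No separator left: fall back to a hard cut.
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def chunk_pages(pages: list[str], source: str, max_chars: int = 500) -> list[Chunk]:
    """Chunk each page separately so every chunk keeps its page number."""
    chunks = []
    for page_number, page_text in enumerate(pages, start=1):
        for piece in recursive_split(page_text, max_chars):
            chunks.append(Chunk(piece, {"source": source, "page": page_number}))
    return chunks


if __name__ == "__main__":
    pages = ["First paragraph.\n\nSecond paragraph.", "Only page two."]
    for chunk in chunk_pages(pages, "https://example.com/doc.pdf", max_chars=20):
        print(chunk.metadata, repr(chunk.text))
```

Chunking page by page keeps the page number unambiguous, at the cost of never merging text across a page break. Splitters in libraries such as LangChain follow a similar separator-hierarchy idea and add configurable overlap.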

RAG Achievements

📄
Data Miner

Master PDF chunking & cleaning.

💾
Vector Voyager

Embed data into Pinecone/Supabase.

📝
Notion Ninja

Automate workspace ingestion via API.