Capstone: Forging a Custom AI Assistant
System Admin
AI Architecture // Syllabus Corp
LLMs alone are just text predictors. By giving them "Tools" (APIs) and "Memory" (Context), we transform them from passive chatbots into autonomous Agents capable of executing real-world tasks.
1. The Core: System Prompting
Every AI assistant begins with a System Prompt. Unlike user messages, the system prompt defines the unshakeable rules of engagement. If you are building an assistant for a hospital, the system prompt should strictly forbid it from giving legal or medical advice.
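As a minimal sketch, here is how a system prompt is typically placed ahead of user messages in an OpenAI-style chat message list. The hospital wording and the helper name are illustrative assumptions, not a specific provider's API:

```python
# Hypothetical system prompt for the hospital example above.
HOSPITAL_SYSTEM_PROMPT = (
    "You are a hospital front-desk assistant. "
    "You may answer questions about visiting hours and appointments. "
    "You must never give legal or medical advice."
)

def build_messages(user_input, system_prompt=HOSPITAL_SYSTEM_PROMPT):
    """Prepend the system prompt so it frames every request."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_input},
    ]

messages = build_messages("Can I sue my doctor?")
```

Because the system message travels with every request, its rules apply no matter what the user types.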
2. The Bridge: Function Calling
LLMs cannot natively browse the web or check databases. Function Calling bridges this gap. You provide the LLM with a JSON description of your APIs (e.g., check_inventory(sku)). When the user asks "Do we have shoes in stock?", the LLM realizes it needs external data and outputs a JSON command telling your code to run that function.
User: "What's the weather in Tokyo?"
LLM: { "tool_call": "get_weather", "args": { "location": "Tokyo" } }
App: *Fetches API and returns 22°C*
LLM: "It is currently 22°C in Tokyo."
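The app-side half of this exchange can be sketched as a small dispatcher: parse the model's output, and if it is a tool call, run the matching function. The `get_weather` stub and the exact JSON shape are assumptions for illustration; real providers each have their own tool-call format:

```python
import json

# Hypothetical tool the app exposes; the LLM only sees its JSON description.
def get_weather(location):
    # A real app would call a weather API here; stubbed for the sketch.
    return {"location": location, "temp_c": 22}

TOOLS = {"get_weather": get_weather}

def handle_model_output(raw):
    """Run the requested tool if the model emitted a tool call,
    otherwise treat the output as a plain text answer."""
    try:
        msg = json.loads(raw)
    except json.JSONDecodeError:
        return raw  # plain text, no tool needed
    func = TOOLS[msg["tool_call"]]
    return func(**msg["args"])

result = handle_model_output(
    '{"tool_call": "get_weather", "args": {"location": "Tokyo"}}'
)
```

The tool's return value is then sent back to the LLM, which phrases the final answer ("It is currently 22°C in Tokyo.").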
3. The Engine: ReAct Agent Loop
For complex tasks, a single tool call isn't enough. The ReAct (Reasoning and Acting) framework allows the model to loop: it creates a Thought, takes an Action, views the Observation, and repeats until the goal is met.
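The Thought → Action → Observation cycle can be sketched with a scripted stand-in for the model. Everything here (the scripted replies, the `check_inventory` stub, the history format) is a simplifying assumption to show the loop's shape:

```python
def scripted_model(history):
    """Stub: a real agent would send `history` to an LLM here."""
    if "Observation" not in history:
        return {"thought": "I need the inventory count.",
                "action": ("check_inventory", "SKU-42")}
    return {"thought": "I have the data.",
            "answer": "Yes, 7 pairs in stock."}

def check_inventory(sku):
    return 7  # stubbed tool

def react_loop(goal, max_steps=5):
    history = f"Goal: {goal}"
    for _ in range(max_steps):
        step = scripted_model(history)          # Reason
        if "answer" in step:                    # goal met: stop looping
            return step["answer"]
        tool, arg = step["action"]              # Act
        observation = {"check_inventory": check_inventory}[tool](arg)
        # Feed the Observation back so the next Thought can use it.
        history += f"\nThought: {step['thought']}\nObservation: {observation}"
    return "Step limit reached."

answer = react_loop("Do we have shoes in stock?")
```

The `max_steps` cap matters in practice: without it, a confused agent can loop forever.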
Architecture FAQs
What is the difference between RAG and Function Calling?
RAG (Retrieval-Augmented Generation): You automatically fetch relevant documents (via Vector DB) *before* asking the LLM the question, injecting them as context. Ideal for knowledge bases.
Function Calling: The LLM decides *dynamically* if it needs to trigger a function (like a math calculator or fetching a user profile) based on the conversation flow. Ideal for actions and live data.
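The "retrieve first, then inject as context" flow can be shown with a toy retriever. Real RAG systems use embeddings and a vector DB; the word-overlap scoring and sample documents below are stand-in assumptions:

```python
DOCS = [
    "Returns are accepted within 30 days with a receipt.",
    "Shipping to Tokyo takes 5-7 business days.",
]

def retrieve(question, docs=DOCS):
    """Toy retrieval: pick the doc sharing the most words with the question."""
    q_words = set(question.lower().split())
    return max(docs, key=lambda d: len(q_words & set(d.lower().split())))

def build_rag_prompt(question):
    """Inject the retrieved doc as context BEFORE the model sees the question."""
    context = retrieve(question)
    return f"Context: {context}\n\nQuestion: {question}"

prompt = build_rag_prompt("How long is shipping to Tokyo?")
```

Note the contrast with function calling: here the retrieval happens unconditionally before the model runs, rather than the model deciding mid-conversation to call a tool.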
How do AI Agents retain memory?
LLMs are stateless. To simulate memory, developers use a Conversation Buffer: an array of the entire conversation history (User, Assistant, User, Assistant) that is re-sent to the API with every single new prompt.
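A minimal sketch of such a buffer, with a stub standing in for the model call (the class name and `echo_model` are illustrative assumptions):

```python
class ConversationBuffer:
    """Accumulates the full history and re-sends it on every turn,
    because the model itself remembers nothing between calls."""

    def __init__(self, system_prompt):
        self.messages = [{"role": "system", "content": system_prompt}]

    def ask(self, user_input, model):
        self.messages.append({"role": "user", "content": user_input})
        reply = model(self.messages)  # the FULL history goes out every time
        self.messages.append({"role": "assistant", "content": reply})
        return reply

# Stub model that "remembers" only because it can read the earlier turns.
def echo_model(messages):
    return f"(seen {len(messages)} messages so far)"

buf = ConversationBuffer("You are helpful.")
buf.ask("Hi", echo_model)      # history now holds 3 messages
buf.ask("Again", echo_model)   # history now holds 5 messages
```

The obvious cost is that the buffer grows with every turn, which is why production systems eventually truncate or summarize old history to stay within the context window.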