Can I store my API keys in a `.env.local` file for a React or Next.js frontend?

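Only if the key is read exclusively by server-side code. In Next.js, variables in `.env.local` stay on the server unless you give them the `NEXT_PUBLIC_` prefix, which inlines them into the browser bundle, so you must NEVER use that prefix for private API keys. In a purely client-side React app there is no server at all, so any key the build embeds in the bundle is readable by every visitor. Your LLM provider keys should strictly remain in the Node.js backend environment so nobody can extract them and run up charges on your account.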

Why does the model sometimes break character even if I use a System role?

While models heavily prioritize the System prompt, if the conversation history grows too large and the user constantly injects conflicting instructions, the model can suffer from 'context dilution'. Truncating old, irrelevant history helps keep the model strictly focused on the current System constraints.

Are input tokens and output tokens billed at the same rate?

Almost never. With most AI providers (like OpenAI and Anthropic), output tokens (the text the AI generates) are significantly more expensive than input tokens (the prompt and history you send). Always factor this into your cost estimates.

Using OpenAI & Anthropic APIs

Learn to integrate industry-leading models into your applications. This guide covers the technical architecture of Chat Completions, authentication strategies, role-based messaging, and the economic considerations of token usage.

The playground is for testing; the API is for building. Mastering programmatic access to Large Language Models is the absolute foundation of modern AI software development. If you want to build autonomous agents, intelligent chatbots, or dynamic reasoning engines, you have to understand how to connect your code directly to the model's brain.

1. The Illusion of Memory (Stateless Architecture)

Here is a concept that typically trips up junior developers: AI APIs have what we call 'total amnesia'. They are completely stateless. Every single time your server sends an HTTP request to OpenAI or Anthropic, the model treats it as a brand new conversation. It has absolutely zero context about what you asked it five seconds ago.

To create the illusion of a fluid, human-like conversation, it is entirely up to you (the developer) to send the *entire chat history* in every new request. That history has to fit inside the model's 'Context Window', the maximum number of tokens it can read in a single request.

When you see a chatbot that 'remembers' your name, it's not because the AI actually remembered; it's because the frontend application silently appended the previous messages into a massive array and sent the whole payload back to the server. This requires careful state management on your end.

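// The API is stateless, so every request must resend the whole conversation so far
const fullHistoryArray = [
  { role: "user", content: "My name is Ada." },
  { role: "assistant", content: "Nice to meet you, Ada!" },
  { role: "user", content: "What is my name?" }
];
const requestPayload = {
  model: "gpt-4o",
  messages: fullHistoryArray
};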
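
To make the state management concrete, here is a minimal sketch of a per-session history holder. The `createConversation` helper and its method names are hypothetical, not part of any provider SDK; it only shows how each new payload carries every earlier message.

```javascript
// Hypothetical helper: keeps the running history for one chat session.
function createConversation(systemPrompt) {
  const messages = [{ role: "system", content: systemPrompt }];
  return {
    // Record what the user typed and return the payload to send.
    addUserMessage(text, model = "gpt-4o") {
      messages.push({ role: "user", content: text });
      // Copy the array so later pushes do not alter a payload already built.
      return { model, messages: [...messages] };
    },
    // Record the model's reply so the next request includes it.
    addAssistantMessage(text) {
      messages.push({ role: "assistant", content: text });
    }
  };
}

const chat = createConversation("You are a helpful assistant.");
const first = chat.addUserMessage("My name is Ada.");
chat.addAssistantMessage("Nice to meet you, Ada!");
const second = chat.addUserMessage("What is my name?");

console.log(first.messages.length);  // 2
console.log(second.messages.length); // 4
```

The second payload is larger than the first even though the user only typed one more sentence, which is exactly why the model appears to remember the name.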

2. Secure Authentication

Let's talk about security, because a mistake here will cost you your job. Connecting to premium models costs real money and requires powerful access keys. These 'API Keys' are practically bearer bonds.

They must be injected into the headers of your HTTP requests to authenticate you with the provider. But here is the critical rule: Never, under any circumstances, expose these keys in your frontend client code (like React components or vanilla JS shipped to the browser).

If you put an API key in the frontend, malicious users will extract it using the browser's developer tools and use it to run up thousands of dollars in charges on your account. You must always construct these requests securely on your backend (Node.js server) where you can safely access your environment variables (process.env).

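const headers = {
  // Read the key from the server environment; OPENAI_API_KEY is an assumed variable name.
  // Anthropic expects the key in an 'x-api-key' header instead of 'Authorization'.
  'Authorization': `Bearer ${process.env.OPENAI_API_KEY}`,
  'Content-Type': 'application/json'
};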
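
As a rough sketch of the backend side, the helper below builds (but does not send) a Chat Completions request and refuses to continue when the key is missing. `buildChatRequest` and the `OPENAI_API_KEY` variable name are assumptions made for illustration; the key in the demo is a fake placeholder.

```javascript
// Hypothetical helper: builds a request object on the server without sending it.
// OPENAI_API_KEY is an assumed name; use whatever your deployment defines.
function buildChatRequest(messages, env = process.env) {
  const apiKey = env.OPENAI_API_KEY;
  if (!apiKey) {
    // Fail fast instead of sending "Bearer undefined" to the provider.
    throw new Error("OPENAI_API_KEY is not set on the server");
  }
  return {
    url: "https://api.openai.com/v1/chat/completions",
    options: {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${apiKey}`,
        "Content-Type": "application/json"
      },
      body: JSON.stringify({ model: "gpt-4o", messages })
    }
  };
}

// On a real server you would pass the result to fetch(url, options).
const demo = buildChatRequest(
  [{ role: "user", content: "Hello!" }],
  { OPENAI_API_KEY: "sk-test-placeholder" } // fake key for illustration only
);
console.log(demo.options.headers.Authorization); // Bearer sk-test-placeholder
```

Because this function only ever runs on the server, the browser sees your own endpoint and its response, never the key itself.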

3. The Trio of Roles

Modern chat APIs don't just accept a single string of text. They expect a highly organized array of objects, where each object defines a specific role. This structured separation is what allows the model to tell your hardcoded backend instructions apart from the arbitrary text typed by an end user.

The three standard roles are:

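1. System: Placed at the very top of the array, this is the heart of your agent's behavior. You use it to define the persona, strict rules, and operational limits. Models are trained to heavily prioritize the System prompt over anything else. (OpenAI's Chat Completions accepts it as the first message; Anthropic's Messages API takes it as a separate top-level `system` parameter rather than as a message in the array.)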

2. User: This role represents the human's input. It's the prompt submitted from your application's UI.

3. Assistant: We use this role to reinject the responses that the model itself generated in previous turns. By alternating between user and assistant messages, we build the chronological timeline that creates the illusion of memory.

const messages = [
  { role: 'system', content: 'You are a pirate.' },
  { role: 'user', content: 'Hello!' },
  { role: 'assistant', content: 'Ahoy matey!' }
];
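
The array above matches the OpenAI-style shape. Anthropic's Messages API takes the system prompt as a top-level `system` field and requires `max_tokens`, so a small adapter is one way to reuse the same history with both providers. The sketch below is an assumption-laden illustration: `toAnthropicPayload` is a hypothetical helper and "your-model-id" is a placeholder, not a real model name.

```javascript
// Hypothetical converter from the three-role array to an Anthropic-style body.
function toAnthropicPayload(messages, model, maxTokens = 1024) {
  // Gather system instructions into one top-level string.
  const system = messages
    .filter((m) => m.role === "system")
    .map((m) => m.content)
    .join("\n");
  // Only user and assistant turns stay in the messages array.
  const turns = messages.filter((m) => m.role !== "system");
  const payload = { model, max_tokens: maxTokens, messages: turns };
  if (system) payload.system = system;
  return payload;
}

const history = [
  { role: "system", content: "You are a pirate." },
  { role: "user", content: "Hello!" },
  { role: "assistant", content: "Ahoy matey!" },
  { role: "user", content: "Where is the treasure?" }
];

// "your-model-id" is a placeholder; substitute a model your account can use.
const payload = toAnthropicPayload(history, "your-model-id");
console.log(payload.system);          // You are a pirate.
console.log(payload.messages.length); // 3
```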

4. Tokenomics & History Truncation

I want you to pay close attention to this, because this is where startups can hemorrhage cash. AI APIs don't charge you per request; they charge you by the Token. A token is roughly a fragment of a word (about 4 characters in English).
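
For back-of-the-envelope budgeting, the 4-characters-per-token rule can be turned into a quick estimator. This is only a sketch: `estimateTokens` and `estimateCostUSD` are hypothetical helpers, and the rates are placeholder numbers, not any provider's real prices.

```javascript
// Rough heuristic from the text: about 4 characters per token in English.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Rates are expressed in dollars per one million tokens.
function estimateCostUSD(inputTokens, outputTokens, rates) {
  return (inputTokens * rates.inputPerMillion +
          outputTokens * rates.outputPerMillion) / 1e6;
}

// Placeholder rates for illustration only; check your provider's pricing page.
const exampleRates = { inputPerMillion: 2, outputPerMillion: 8 };

const promptTokens = estimateTokens("You are a pirate. Greet the user warmly.");
console.log(promptTokens); // 10

const cost = estimateCostUSD(1000, 500, exampleRates);
console.log(cost); // 0.006
```

For exact counts you would use the provider's own tokenizer rather than a character heuristic.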

Because the API is stateless, your message history array grows larger with every single turn. This means you are constantly paying to re-upload the entire conversation. If you let that array grow indefinitely, your per-request cost keeps climbing and you will eventually exceed the model's hard context limit, at which point the API rejects the request with an error that your application has to handle.
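
A little arithmetic shows how fast this adds up. The sketch below assumes, purely for simplicity, that every message is the same size; `cumulativeInputTokens` is a hypothetical helper, not a billing formula from any provider.

```javascript
// Total input tokens sent over a conversation when every request
// resends the full history. tokensPerMessage is a simplifying assumption.
function cumulativeInputTokens(turns, tokensPerMessage, systemTokens = 0) {
  let total = 0;
  for (let turn = 1; turn <= turns; turn++) {
    // The request at turn n carries n user messages and n - 1 earlier replies.
    const messagesSent = 2 * turn - 1;
    total += systemTokens + messagesSent * tokensPerMessage;
  }
  return total;
}

console.log(cumulativeInputTokens(10, 50));  // 5000
console.log(cumulativeInputTokens(100, 50)); // 500000
```

Ten times as many turns means a hundred times as many input tokens, because the total grows with the square of the conversation length.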

To prevent this, senior engineers implement History Truncation. We use code (like .slice()) to aggressively trim the oldest messages out of the array before making the request. We always preserve the System prompt, but we intentionally discard the user's oldest inputs to save money and keep the payload light.

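// Keep system prompt + last 10 messages
const [systemPrompt, ...conversation] = messages;
const optimizedHistory = [
  systemPrompt, // The System Prompt
  // Slice only the conversation so short histories never duplicate the system prompt
  ...conversation.slice(-10)
];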
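
Counting messages is the simplest policy, but since you pay per token, trimming against a token budget is a common alternative. Here is a minimal sketch of that idea; `truncateToBudget` is a hypothetical helper, it assumes the first message is the System prompt, and it uses the rough 4-characters-per-token estimate rather than a real tokenizer.

```javascript
// Rough estimate: about 4 characters per token in English.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Keep the system prompt plus as many of the newest messages as fit the budget.
// Assumes messages[0] is the system prompt.
function truncateToBudget(messages, maxTokens) {
  const [system, ...rest] = messages;
  let used = estimateTokens(system.content);
  const kept = [];
  // Walk backwards so the most recent messages are kept first.
  for (let i = rest.length - 1; i >= 0; i--) {
    const cost = estimateTokens(rest[i].content);
    if (used + cost > maxTokens) break;
    kept.unshift(rest[i]);
    used += cost;
  }
  return [system, ...kept];
}

const longHistory = [
  { role: "system", content: "You are a pirate." },
  { role: "user", content: "a".repeat(400) },
  { role: "assistant", content: "b".repeat(400) },
  { role: "user", content: "c".repeat(40) }
];

const trimmed = truncateToBudget(longHistory, 120);
console.log(trimmed.map((m) => m.role)); // [ 'system', 'assistant', 'user' ]
```

In production you would swap the estimate for an exact count from a tokenizer such as Tiktoken, and leave headroom for the model's reply.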

Pascual Vila

Frontend Instructor // Code Syllabus

Lesson Glossary

[01] API Key

A secret token used to authenticate your requests to an AI provider. Never expose this in client-side code. Example: sk-...

[02] Endpoint

A specific URL where the API receives requests, such as /v1/chat/completions.

[03] Payload

The JSON data structure sent to the API, containing the model, messages, and parameters.

[04] Tiktoken

A fast BPE (Byte Pair Encoding) tokenizer for use with OpenAI's models, used to count tokens before sending a request.

[05] Latency

The time it takes for the API to process a request and start returning a response.
