Web App Architecture for AI Integration

Pascual Vila
AI & Web Architect // Code Syllabus
Adding AI to your application isn't just a matter of copy-pasting an API call. It demands robust security, strategies for handling high latency, and streaming to deliver seamless real-time user experiences.
The Backend-For-Frontend (BFF) Pattern
The absolute golden rule of web AI integration is: Never expose your API keys to the browser. If you place your OpenAI or Anthropic keys in your React frontend, anyone can inspect the network tab, steal your key, and rack up massive bills on your account.
Instead, we use a BFF architecture. The client (React) sends the user's prompt to your own server (e.g., a Next.js API Route). Your server, holding the API key safely in its environment variables, securely constructs the payload and forwards it to the LLM provider.
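As a sketch, a minimal App Router Route Handler implementing this pattern might look like the following. The route path, model name, and response shape are illustrative assumptions, not prescriptions:

```ts
// app/api/chat/route.ts — a minimal BFF sketch (Next.js App Router).
// The key is read from the server environment and never ships to the client bundle.
export async function POST(req: Request) {
  const { prompt } = await req.json();

  // The browser only ever talks to /api/chat; this server-side fetch
  // is the only place the provider key is attached.
  const upstream = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({
      model: "gpt-4o-mini",
      messages: [{ role: "user", content: prompt }],
    }),
  });

  const data = await upstream.json();
  return Response.json({ reply: data.choices[0].message.content });
}
```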
Handling Latency with Streaming
Traditional web requests take milliseconds. Generating an AI response can take 5 to 30 seconds. If a user clicks a button and the UI freezes for 10 seconds waiting for the API, they will think the app is broken.
Modern architectures solve this using HTTP Streaming (Server-Sent Events). As the LLM generates the response token by token, your server pipes those tokens directly to the client UI in real time, creating the "typing" effect seen in ChatGPT. Libraries like the Vercel AI SDK simplify this process tremendously.
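Here is a server-side sketch of that streaming flow using the Vercel AI SDK; the exact imports and response helpers vary between SDK versions, so treat this as illustrative:

```ts
// app/api/chat/route.ts — streaming variant using the Vercel AI SDK.
import { openai } from "@ai-sdk/openai";
import { streamText } from "ai";

export async function POST(req: Request) {
  const { messages } = await req.json();

  // streamText starts the completion and exposes tokens as they arrive,
  // instead of waiting for the full response.
  const result = streamText({
    model: openai("gpt-4o-mini"),
    messages,
  });

  // Pipe the token stream to the browser as a streaming HTTP response.
  return result.toDataStreamResponse();
}
```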
❓ Frequently Asked Questions
How do I securely use the OpenAI API in a Next.js application?
To use the OpenAI API securely in Next.js, make the request on the server side using an API Route (Pages Router) or a Route Handler (App Router). Store your API key in a `.env.local` file without the `NEXT_PUBLIC_` prefix so it never ships in the client bundle. Your React component then makes a POST request to your local route, which authenticates and forwards the request to OpenAI.
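On the client side, the component never sees the key; it only calls your own route. A hypothetical example, where the component name and route path are placeholders:

```tsx
"use client";
import { useState } from "react";

// Hypothetical client component: it POSTs to the local BFF route,
// so the OpenAI key stays on the server.
export function PromptBox() {
  const [reply, setReply] = useState("");

  async function send(prompt: string) {
    const res = await fetch("/api/chat", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ prompt }),
    });
    const data = await res.json();
    setReply(data.reply);
  }

  return (
    <div>
      <button onClick={() => send("Hello!")}>Ask</button>
      <p>{reply}</p>
    </div>
  );
}
```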
What is the Vercel AI SDK and why should I use it?
The Vercel AI SDK is an open-source library designed to help developers build conversational, streaming AI user interfaces. It abstracts away the complex boilerplate required to handle Server-Sent Events (SSE) and stream chunks of data from providers like OpenAI to React components (via hooks like `useChat`).
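For illustration, a minimal chat component built on `useChat` might look like this; the import path and returned helpers differ across SDK major versions, so this is a sketch rather than a canonical example:

```tsx
"use client";
import { useChat } from "@ai-sdk/react";

// Sketch: useChat manages message state and streams tokens from /api/chat
// by default, re-rendering as each chunk arrives to produce the "typing" effect.
export default function Chat() {
  const { messages, input, handleInputChange, handleSubmit } = useChat();

  return (
    <form onSubmit={handleSubmit}>
      {messages.map((m) => (
        <p key={m.id}>
          <strong>{m.role}:</strong> {m.content}
        </p>
      ))}
      <input value={input} onChange={handleInputChange} placeholder="Ask something..." />
    </form>
  );
}
```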
Why use Edge Functions for AI API Routes?
Edge Functions run geographically closer to the user and avoid most of the cold-start latency of traditional serverless functions (such as AWS Lambda), because they use a lightweight V8 isolate runtime rather than spinning up a full Node.js container. Since AI responses stream data back to the client continuously over a long-lived connection, the fast startup and low overhead of the Edge Runtime make it a strong choice for AI middleware.
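In Next.js, opting a Route Handler into the Edge Runtime is a one-line segment config; everything else in the handler stays the same:

```ts
// app/api/chat/route.ts — running the handler on the Edge Runtime.
// The segment config below is the standard Next.js convention.
export const runtime = "edge";

export async function POST(req: Request) {
  // ...same streaming logic as the earlier sketches...
  return new Response("streaming response goes here");
}
```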