OpenAI’s Apps SDK (launched October 2025) lets you build interactive “apps” that run inside ChatGPT conversations. These apps can surface rich UIs – e.g. inline carousels, maps, or video players – alongside ChatGPT’s responses. Under the hood, the Apps SDK uses the open Model Context Protocol (MCP): your backend (an MCP server) lists tools (with JSON schemas) and serves HTML widgets as needed. When ChatGPT invokes a tool, your server returns structured JSON plus a pointer to an HTML template (“widget”) for the UI. In practice, you register frontend bundles as MCP resources (`mimeType="text/html+skybridge"`) and mark your tool with the metadata key `"openai/outputTemplate"` so ChatGPT knows which UI to render. For example:
```js
// In your Node.js MCP server (using @modelcontextprotocol/sdk)
const KANBAN_JS = readFileSync("web/dist/kanban.js", "utf8");
const KANBAN_CSS = readFileSync("web/dist/kanban.css", "utf8");

server.registerResource("kanban-widget", "ui://widget/kanban-board.html", {}, async () => ({
  contents: [
    {
      uri: "ui://widget/kanban-board.html",
      mimeType: "text/html+skybridge",
      text: `
<div id="kanban-root"></div>
${KANBAN_CSS ? `<style>${KANBAN_CSS}</style>` : ""}
<script type="module">${KANBAN_JS}</script>
      `.trim(),
    },
  ],
}));

server.registerTool(
  "kanban-board",
  {
    title: "Show Kanban Board",
    _meta: { "openai/outputTemplate": "ui://widget/kanban-board.html" },
    inputSchema: { tasks: z.string() },
  },
  async () => {
    // Return structuredContent for the UI plus optional chat text
    const board = await loadKanbanBoard();
    return {
      structuredContent: {
        columns: board.columns.map((col) => ({
          id: col.id,
          title: col.title,
          tasks: col.tasks,
        })),
      },
      content: [{ type: "text", text: "Here's your latest board." }],
      _meta: { tasksById: board.tasksById },
    };
  }
);
```
This example (from OpenAI’s docs) shows two key pieces: `registerResource` declares an HTML+JS bundle as a “widget”, and `registerTool` ties a tool to that widget via `_meta["openai/outputTemplate"]`. When ChatGPT calls `"kanban-board"`, it renders your `div#kanban-root` and runs the bundled script inside an isolated iframe.
Building the Frontend UI
Apps SDK UIs are essentially web components (typically written in React, but any framework works) that run in a sandboxed iframe inside ChatGPT. Your component code uses the `window.openai` global bridge to communicate with ChatGPT. For example, the `window.openai` API provides methods like:

- `window.openai.callTool(name, args)` to invoke another tool on your server and await its result.
- `window.openai.sendFollowUpMessage({ prompt })` to inject a new message into the chat as if the user sent it.
- `window.openai.requestDisplayMode({ mode: "fullscreen" })` to ask the host to switch your app from inline to fullscreen or picture-in-picture.
- `window.openai.setWidgetState(stateObj)` to persist widget-local state (which the model will see as context).
For example, inside a React component you might do:
```js
const refresh = async () => {
  await window.openai?.callTool("refresh_pizza_list", { city: "Berlin" });
};

await window.openai?.sendFollowUpMessage({
  prompt: "Summarize these locations in a paragraph.",
});
```
These calls let your UI trigger backend actions or chat messages seamlessly.
The docs recommend using hooks to read globals from `window.openai`, such as the theme or the tool’s current input/output. For instance, a `useToolInput()` hook could subscribe to the `openai:set_globals` event and return `window.openai.toolInput`. This keeps your component reactive to context changes (e.g. new data after a tool call). In short, your frontend acts like any single-page app: it renders data from `window.openai.toolOutput`, updates widget state, and makes calls when needed.
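As an illustration, the subscription logic behind such a hook might look like this. This is a minimal framework-free sketch, assuming the `openai:set_globals` event and `toolInput` global described above; the `subscribeToolInput` helper name is hypothetical:

```javascript
// Subscribe to the host's tool input and invoke `onChange` whenever the
// host pushes new globals. Returns an unsubscribe function.
// Assumes the `openai:set_globals` event and `window.openai.toolInput`
// global described in the Apps SDK docs.
function subscribeToolInput(win, onChange) {
  const handler = () => onChange(win.openai?.toolInput);
  win.addEventListener("openai:set_globals", handler);
  // Deliver the current value immediately so the UI can render right away.
  handler();
  return () => win.removeEventListener("openai:set_globals", handler);
}
```

In a React component this would typically be wrapped in `useSyncExternalStore` or `useEffect`/`useState`; the subscription logic stays the same.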
Why Not Use Raw JS/CSS?
You might wonder: can I just write plain HTML, CSS, and vanilla JS and serve it? In theory, yes – but in practice, complexity and scale call for a build tool. Considerations for Apps SDK projects:
- Multiple modules and dependencies: If you use any npm packages (e.g. React, Axios, Leaflet), you need a bundler to package them into one script. The Apps SDK’s sandboxed iframe won’t magically resolve your imports. In fact, the troubleshooting guide warns: “Make sure the HTML inlines your compiled JS and that all dependencies are bundled.” This means you must ship a single JS file (with styles) for your widget. Tools like Vite or Webpack crawl your `import` statements and pack everything together. Without bundling, you’d face missing scripts or CSP blocks.
- Modern JavaScript: Apps are typically built with ES modules, JSX/TSX, or modern JS syntax. Browsers (even with `type="module"`) don’t support JSX out of the box. A compiler/transpiler (part of a bundler toolchain) converts your JSX/TS into plain JS that the ChatGPT iframe can execute.
- Optimizations: Bundlers minify code, tree-shake unused bits, and generate hashed filenames for caching. The official examples use hashed bundles so updates bust the cache. This is crucial for production performance and for ensuring users get the latest code. Simply serving raw JS files would lead to caching headaches and larger payloads.
- Development experience: Without a dev server, you’d have to rebuild and reload manually. Vite, for example, provides a hot-reload dev server, TypeScript support, and fast startup. In large projects, rebuilding everything on each change is slow; Vite addresses this by pre-bundling dependencies with esbuild, which is 10–100× faster than older JavaScript-based bundlers. This makes iterative development much smoother.
In short, an Apps SDK project is just a web app (with a special runtime), so you benefit from the same frontend tooling best practices. The OpenAI docs imply this: they assume your UI is a compiled bundle (see the examples and GitHub repo). For instance, OpenAI’s sample repo uses a Vite setup with multiple entry points – one JS/CSS pair per widget – built via a `build-all.mts` script. That script produces versioned `.html`, `.js`, and `.css` assets for each component, which are then either served by your server or uploaded to a CDN. The readme shows commands like `pnpm run dev` (to start Vite’s dev server) and `pnpm run build` (to produce hashed bundles).
Setting Up Vite for an AI App
To illustrate, a typical Vite workflow might be:
- Project structure: Keep a frontend folder (e.g. `web/`) where your React/Vue code lives, and a backend folder for the MCP server. In `web/`, write your components (JSX/TSX, CSS/SCSS, etc.).
- Vite config: Configure Vite with multiple HTML entry points (one per widget) or use dynamic imports. You might use `vite-plugin-mpa` or manually specify `build.rollupOptions.input`. The OpenAI example uses a custom `vite.config.mts` to handle multiple widget outputs.
- Dev server: Run `vite` (often via `npm run dev`). Vite launches on a localhost port (e.g. 5173). You can then point your MCP server or ChatGPT dev connector at this origin to fetch widget HTML/JS. The example repo serves built assets with CORS enabled on port 4444 for testing.
- Building: Run `vite build` (or `npm run build`). This bundles and outputs each widget’s assets into e.g. `dist/`, with hashed filenames. The example uses `build-all.mts` to run Vite for each entry. After building, your server code can `readFileSync` or otherwise load those `.js` and `.css` files (as shown above).
Example Vite config snippet (illustrative; see the openai-apps-sdk-examples repo for the full config):

```js
// vite.config.js
import { defineConfig } from 'vite';
import react from '@vitejs/plugin-react';

export default defineConfig({
  plugins: [react()],
  build: {
    // Suppose each widget has its own HTML entry in web/src/widgets/
    rollupOptions: {
      input: {
        "pizza-map": "src/widgets/pizza-map.html",
        "pizza-carousel": "src/widgets/pizza-carousel.html",
        // ...other widgets
      },
      output: {
        entryFileNames: '[name]-[hash].js',
        assetFileNames: '[name]-[hash].[ext]'
      }
    }
  }
});
```
With this, running `vite build` generates `pizza-map-[hash].js`, `pizza-map-[hash].css`, etc. Your server then includes those in the MCP resource HTML. This automated bundling (with dependency resolution and hashing) would be extremely tedious to replicate by hand with raw JS/CSS.
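Since the hash changes on every rebuild, the server can resolve the current filenames at startup instead of hard-coding them. A minimal sketch – the `findHashedAsset` helper and the `pizza-map` widget name are illustrative, not part of the SDK:

```javascript
// Given the filenames in your dist/ folder, find the current hashed
// bundle for a widget, e.g. "pizza-map-83d4f6.js" for ("pizza-map", "js").
// Illustrative helper -- not part of the Apps SDK.
function findHashedAsset(files, widgetName, ext) {
  const pattern = new RegExp(`^${widgetName}-[0-9a-f]+\\.${ext}$`);
  const match = files.find((f) => pattern.test(f));
  if (!match) throw new Error(`No built ${ext} asset found for ${widgetName}`);
  return match;
}

// At startup you might do:
//   const files = fs.readdirSync("web/dist");
//   const name = findHashedAsset(files, "pizza-map", "js");
//   const js = fs.readFileSync(path.join("web/dist", name), "utf8");
```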
Security and Best Practices
Network & CSP: Frontend widgets run in a strict sandbox. The Apps SDK enforces a Content Security Policy: you can only `fetch` from allowed domains and can’t use privileged APIs (`alert`, `prompt`, clipboard, etc.). If your UI needs to fetch data or load assets, you must allow-list those domains in the resource’s `_meta["openai/widgetCSP"]` (`connect_domains` for fetch targets, `resource_domains` for scripts/images). For example, from the OpenAI example:

```js
_meta: {
  "openai/widgetCSP": {
    connect_domains: [],
    resource_domains: ["https://persistent.oaistatic.com"]
  }
}
```
This lets your iframe load scripts/images from that domain. Always validate inputs server-side; don’t trust the model or user input to generate safe URLs or file paths. The docs stress least privilege and input validation: only request the scopes and permissions you need, and thoroughly validate all tool arguments to guard against injection attacks.
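As one illustration of server-side validation, a tool handler might sanity-check an argument before using it. This is a hedged sketch – the `city` argument and the specific limits are illustrative, not from the SDK:

```javascript
// Validate a tool argument server-side before using it.
// Illustrative checks -- adapt the limits to your own tool schema
// (the docs' examples use zod schemas for the same purpose).
function validateCityArg(args) {
  const city = args?.city;
  if (typeof city !== "string") throw new Error("city must be a string");
  const trimmed = city.trim();
  if (trimmed.length === 0 || trimmed.length > 100) {
    throw new Error("city must be 1-100 characters");
  }
  // Reject control characters and angle brackets (obvious injection attempts).
  if (/[\u0000-\u001f<>]/.test(trimmed)) {
    throw new Error("city contains disallowed characters");
  }
  return trimmed;
}
```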
Avoid insecure file reads: The Node code example above uses `readFileSync` to load your built JS/CSS. This is fine for your own static assets, but be cautious: never call `readFile` on paths derived from user input, as that could allow directory traversal or reveal sensitive files. In general, static bundling is safer – consider serving assets via HTTPS or a CDN instead of reading them dynamically. (OpenAI’s sample also links to a public CDN domain, `persistent.oaistatic.com`, for assets; in your app you’d host on your own domain or a cloud bucket.) If you need the user to upload a file, use a controlled upload endpoint and parse it on the server rather than trying to `readFile` on the client. Browser JavaScript cannot arbitrarily read local files for security reasons; if needed, use an `<input type="file">` with user consent.
Data & Logging: By default, anything sent via `setWidgetState` is visible to the model, so keep that payload small (the docs suggest under ~4k tokens) and free of secrets. Avoid logging raw prompts or sensitive data. Follow the guide’s advice: redact PII in logs, honor deletion requests, and require user confirmation for any write/delete tools.
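A cheap guard before persisting state is to estimate the payload size. This sketch uses the rough ~4-characters-per-token heuristic – a crude approximation, not a real tokenizer, and the helper is hypothetical:

```javascript
// Rough guard before calling setWidgetState: estimate tokens at ~4
// characters each (a crude heuristic, not a real tokenizer) and refuse
// payloads that would blow past the ~4k-token guidance.
function checkWidgetStateSize(state, maxTokens = 4000) {
  const json = JSON.stringify(state);
  const approxTokens = Math.ceil(json.length / 4);
  if (approxTokens > maxTokens) {
    throw new Error(`Widget state too large: ~${approxTokens} tokens`);
  }
  return state;
}
```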
Vite Benefits: Bundling, Dependencies, Dev DX
In practical terms, using a tool like Vite greatly enhances development:
- Bundling & caching: Vite/Rollup bundles all your JS, CSS, and assets into optimized files, and supports code-splitting and hashing. For example, OpenAI’s examples repo ships each widget “wrapped with the CSS it needs so you can host the bundles directly”. The hashed filenames (`app-83d4f6.js`) ensure ChatGPT loads the latest version (cache-busting). Doing this manually (managing `<script>` tags, `<link>` tags for styles, version strings, etc.) would be error-prone.
- Dependency resolution: With Vite, you can freely `import React` or other libraries. Vite’s dependency pre-bundling (using esbuild) means `import React from 'react'` will pull the React package from `node_modules`, bundle it, and serve it as part of your build. You don’t have to copy library files manually, and if you update or add a package, Vite picks it up on the next rebuild.
- Fast dev server: Vite launches a local server and supports Hot Module Replacement (HMR). During development, saving a file can update the widget instantly without a full reload, greatly speeding iteration. Vite’s use of native ES modules means it only rebuilds changed files instead of the whole app. This contrasts with plain JS, where you’d refresh manually, or older bundlers, where builds are slow.
- TypeScript and JSX: If you write your widget in TypeScript or use JSX, Vite compiles/transpiles these for you. For example, you can use `zod` schemas and TS typings end-to-end (as shown in the Node server examples) and import components like normal.
- Asset handling: CSS, images, and other static assets can be imported or inlined. Vite can bundle your CSS into `<style>` tags (as in the example), or emit separate `.css` files referenced by `<link>` automatically. This means your MCP resource HTML doesn’t have to hand-craft `<style>` tags – the bundler can inject them.
Overall, Vite (or a similar tool) lets a JavaScript frontend developer work in a familiar way. You write modular code, run a local web server with live reload, and get optimized builds ready for production. The OpenAI docs and examples essentially assume this modern workflow: the example MCP server code calls `readFileSync` on `web/dist/*.js`, which implies a build step beforehand. In other words, raw script tags without a build simply won’t scale once your app grows beyond “Hello World.”
Limitations and Missing Details
The Apps SDK docs are comprehensive on core concepts, but as with any new platform, some gaps remain for front-end developers:
- Build Tool Guidance: The docs say “bundle it, and wire it up”, but don’t walk through setting up a bundler. They assume you’ll study the examples repo or infer the process. If you’re new to Vite/webpack, this can be confusing. (For instance, the docs show both `text/html+skybridge` and `text/html` in examples without explicitly explaining the difference; in practice `text/html+skybridge` is used for the main widget shell, while plain `text/html` appears in the “kitchen sink” example with a custom CSP override.)
- Framework Assumptions: The “Build a custom UX” guide focuses on React and React Router, but some developers may prefer Vue or Svelte. The docs don’t cover those explicitly, so you must adapt the React examples or consult community guides.
- Testing and Iteration: There’s little detail on iterative testing. The examples mention using ngrok to expose your localhost for Developer Mode, but more tips (like unit-testing strategies or mocking `window.openai`) would help.
- Error Handling and Logging: The docs list common errors (in Troubleshooting), but a beginner might not immediately see how to debug inside ChatGPT’s UI. More guidance on using the MCP Inspector or browser devtools in the ChatGPT iframe would be useful.
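On the testing gap specifically, one workable pattern is to stub `window.openai` yourself before rendering a component in unit tests. A minimal sketch – the mock’s shape is an assumption based on the APIs listed earlier, and `installOpenAiMock` is a hypothetical helper:

```javascript
// Install a minimal window.openai stub for unit tests. The shape is an
// assumption based on the documented APIs (callTool, sendFollowUpMessage,
// setWidgetState, plus the toolInput/toolOutput globals).
function installOpenAiMock(win, { toolInput = {}, toolOutput = {} } = {}) {
  const calls = []; // Record every bridge call for later assertions.
  win.openai = {
    toolInput,
    toolOutput,
    callTool: async (name, args) => {
      calls.push({ kind: "callTool", name, args });
      return { structuredContent: {} };
    },
    sendFollowUpMessage: async ({ prompt }) => {
      calls.push({ kind: "followUp", prompt });
    },
    setWidgetState: async (state) => {
      calls.push({ kind: "setWidgetState", state });
    },
  };
  return calls;
}
```

Your test can then render the component against the stubbed `win` and assert on the recorded `calls` array.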
Despite these gaps, OpenAI has published design guidelines and examples, and early adopters have written step-by-step tutorials. As the ecosystem matures, we expect more boilerplates and tools (like React component libraries pre-built for ChatGPT) to emerge.
Conclusion
OpenAI’s new Apps SDK lets you integrate rich, interactive experiences into ChatGPT. For front-end developers, it means building single-page app components that communicate via the `window.openai` bridge. Key takeaways:
- Apps are driven by MCP: You define tools on a server and serve HTML/JS as resources.
- Use a build tool: Plain JS/CSS quickly becomes unwieldy. Tools like Vite handle modules, bundling, hashing, and a fast dev server.
- Follow security best practices: Honor sandbox CSP rules, validate all inputs, and avoid risky file operations. Don’t read arbitrary files – use curated static hosting or bundler output instead.
- Documentation is evolving: Leverage the official guides, example repo, and community tutorials. Watch for updates as OpenAI fleshes out more tooling and patterns.
In summary, developing OpenAI ChatGPT apps is like building a web widget with React/Vue – the main difference is the ChatGPT host and the `window.openai` APIs. Embrace modern frontend workflows (npm, bundlers, CI) to keep your app scalable and secure. With those in place, you’ll be ready to tap into 800+ million ChatGPT users with your AI-powered app!
Sources: OpenAI Apps SDK documentation and examples; OpenAI Dev Day announcements; security guidelines.