OpenAI Function Calling in 2026: A Complete Guide for Node.js Developers

Function calling is what turns an LLM from a text generator into something that actually does things. This is the complete Node.js + TypeScript guide — tool schemas, parallel calls, streaming, strict mode, tool_choice, and the production-grade error handling most tutorials skip.

The moment OpenAI shipped function calling, building AI that actually does things became dramatically simpler.

Before function calling, getting an LLM to interact with external systems meant fragile prompt hacks — "respond only in JSON", "always use this format" — and then praying the model cooperated. It rarely did consistently.

Function calling changed the contract entirely. You define tools with a precise schema, the model decides when to call them and what arguments to pass, you execute the function and return the result, and the model incorporates it and responds.

That loop — model decides, you execute, model incorporates — is the foundation of every serious AI product being built right now. This is the guide I wish existed when I started: Node.js, TypeScript, real examples, production patterns.

You define tools, the model decides which to call and with what arguments, your code executes them and returns the result, and the model incorporates that result — looping until it has everything it needs to answer.

What Changed in the Tool-Calling API

OpenAI has refined this API significantly since the original 2023 launch. In 2026, four things matter:

  • tools replaced functions. OpenAI deprecated functions in favour of tools. Older tutorials using functions: [...] still work but are legacy — use tools: [...] for all new code.
  • Parallel function calls. The model can now call multiple tools in a single response — simultaneously, not sequentially — unlocking genuinely useful agentic patterns without multiple round trips.
  • Strict mode. strict: true on a tool definition forces the model to follow your JSON schema exactly — no extra fields, no missing required fields. This is the feature that finally makes function calling reliable for production.
  • tool_choice control. Force the model to always call a specific tool, never call tools, or let it decide automatically. Essential for predictable workflows.

Setting Up the OpenAI SDK in Node.js

Start with a clean TypeScript project:

mkdir openai-tools-demo
cd openai-tools-demo
pnpm init
pnpm add openai zod dotenv
pnpm add -D typescript tsx @types/node
npx tsc --init

Update your tsconfig.json:

{
  "compilerOptions": {
    "target": "ES2022",
    "module": "NodeNext",
    "moduleResolution": "NodeNext",
    "strict": true,
    "outDir": "./dist",
    "esModuleInterop": true
  }
}

Create a .env file for your key:

OPENAI_API_KEY=sk-your-key-here

Then create src/client.ts — a singleton OpenAI client you import everywhere:

import OpenAI from "openai";
import * as dotenv from "dotenv";

dotenv.config();

if (!process.env.OPENAI_API_KEY) {
  throw new Error("OPENAI_API_KEY is not set in environment variables");
}

export const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

Defining Your First Tool Schema

A tool definition has three parts, and the model reads all three to decide whether to call it:

  • name — what the model calls it (snake_case, no spaces).
  • description — when and why to call it. This is the single most important field.
  • parameters — the JSON Schema for the arguments.

Write the description like you are telling a smart developer exactly when to reach for this function:

// src/tools/weather.ts
import OpenAI from "openai";

export const weatherTool: OpenAI.Chat.ChatCompletionTool = {
  type: "function",
  function: {
    name: "get_current_weather",
    description:
      "Retrieves the current weather for a given city. Use this whenever the user asks about weather conditions, temperature, or climate in a specific location. Do NOT use for historical weather data.",
    strict: true, // Enforces schema exactly — use this in production
    parameters: {
      type: "object",
      properties: {
        city: {
          type: "string",
          description:
            "The city name to get weather for, e.g. 'London', 'New York', 'Tokyo'",
        },
        unit: {
          type: "string",
          enum: ["celsius", "fahrenheit"],
          description: "Temperature unit. Default to celsius unless the user specifies.",
        },
      },
      required: ["city", "unit"],
      additionalProperties: false, // Required when strict: true
    },
  },
};

Why strict: true and additionalProperties: false? Without strict mode, the model may occasionally pass extra fields or omit optional ones in unexpected ways. With strict mode, OpenAI uses constrained decoding — the output is guaranteed to match your schema. That turns function calling from "mostly reliable" into "production reliable."

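Strict mode requires every property to be listed in required. To make a field optional, keep it in required and give it a nullable type (for example type: ["string", "null"]), so the model passes null when it has nothing to supply. Adjust your schemas accordingly.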
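As a rough illustration of that rule, here is a minimal sketch of a strict schema with one "optional" field. The tool name get_forecast and the country_code property are hypothetical, invented for this example, and the helper missingFromRequired is just a local sanity check, not part of the OpenAI SDK.

```typescript
// Hypothetical tool: country_code is optional in spirit, so under strict mode
// it stays in `required` and its type admits null instead.
const forecastTool = {
  type: "function",
  function: {
    name: "get_forecast", // illustrative name, not from the article
    description: "Returns a multi-day forecast for a city.",
    strict: true,
    parameters: {
      type: "object",
      properties: {
        city: { type: "string", description: "City name, e.g. 'Lisbon'" },
        country_code: {
          type: ["string", "null"], // null stands in for "not provided"
          description: "ISO 3166-1 alpha-2 code, or null if unknown",
        },
      },
      required: ["city", "country_code"], // every property, including the nullable one
      additionalProperties: false,
    },
  },
} as const;

// Local sanity check: lists any declared property that is absent from `required`.
function missingFromRequired(params: {
  properties: Record<string, unknown>;
  required: readonly string[];
}): string[] {
  return Object.keys(params.properties).filter(
    (key) => !params.required.includes(key)
  );
}

console.log(missingFromRequired(forecastTool.function.parameters)); // []
```

A check like this can run in a unit test so that a schema which would be rejected under strict mode fails before it ever reaches the API.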

Your First Function Calling Request

Here is the complete loop — send the message, detect the tool call, execute the function, send the result back:

// src/basic-example.ts
import OpenAI from "openai"; // needed for the OpenAI.Chat.* types used below
import { openai } from "./client";
import { weatherTool } from "./tools/weather";

// This is the function you actually execute
// In production this calls a real weather API (OpenWeatherMap, Tomorrow.io, etc.)
async function getCurrentWeather(city: string, unit: string): Promise<string> {
  // Mock response — replace with real API call
  const mockData: Record<string, { temp: number; condition: string }> = {
    london: { temp: unit === "celsius" ? 14 : 57, condition: "Cloudy" },
    tokyo: { temp: unit === "celsius" ? 22 : 71, condition: "Sunny" },
    "new york": { temp: unit === "celsius" ? 18 : 64, condition: "Partly cloudy" },
  };

  const data = mockData[city.toLowerCase()] ?? { temp: 20, condition: "Unknown" };
  return JSON.stringify({
    city,
    temperature: data.temp,
    unit,
    condition: data.condition,
    timestamp: new Date().toISOString(),
  });
}

async function runWeatherAgent(userMessage: string) {
  console.log(`\nUser: ${userMessage}`);

  const messages: OpenAI.Chat.ChatCompletionMessageParam[] = [
    {
      role: "system",
      content: "You are a helpful weather assistant. Always use the weather tool to get current conditions before answering.",
    },
    { role: "user", content: userMessage },
  ];

  // First call — model decides whether to call a tool
  const response = await openai.chat.completions.create({
    model: "gpt-4o",
    messages,
    tools: [weatherTool],
    tool_choice: "auto", // Let the model decide
  });

  const responseMessage = response.choices[0].message;

  // Check if the model wants to call a tool
  if (responseMessage.tool_calls && responseMessage.tool_calls.length > 0) {
    // Add the assistant's response (with tool calls) to the message history
    messages.push(responseMessage);

    // Execute each tool call
    for (const toolCall of responseMessage.tool_calls) {
      let result: string;

      if (toolCall.function.name === "get_current_weather") {
        const args = JSON.parse(toolCall.function.arguments) as {
          city: string;
          unit: string;
        };

        console.log(`[Tool Call] get_current_weather(${args.city}, ${args.unit})`);
        result = await getCurrentWeather(args.city, args.unit);
        console.log(`[Tool Result] ${result}`);
      } else {
        // Every tool_call_id needs a reply, or the follow-up request is rejected
        result = JSON.stringify({ error: `Unknown tool: ${toolCall.function.name}` });
      }

      // Add the tool result to the message history
      messages.push({
        role: "tool",
        tool_call_id: toolCall.id,
        content: result,
      });
    }

    // Second call — model incorporates the tool result and responds
    const finalResponse = await openai.chat.completions.create({
      model: "gpt-4o",
      messages,
    });

    const answer = finalResponse.choices[0].message.content;
    console.log(`\nAssistant: ${answer}`);
    return answer;
  }

  // Model chose not to call a tool — return its direct response
  console.log(`\nAssistant: ${responseMessage.content}`);
  return responseMessage.content;
}

// Test it
runWeatherAgent("What's the weather like in Tokyo right now?").catch(console.error);

Run it:

npx tsx src/basic-example.ts

You'll see the tool call logged, the mock result, then the model's final answer incorporating the weather data. The thing carrying state across those two API calls is the messages array — it accumulates every step of the exchange:

The messages array is the model's whole memory for a turn — it grows from the system prompt and question to the assistant's tool calls, each tool result, and finally the answer, every step fed back in.
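To make that accumulation concrete, here is a sketch of what the array might look like once the Tokyo question has been answered. The Msg type is a trimmed local stand-in for the SDK's message types, and the id call_abc123 and the final answer text are made up for illustration. The helper unansweredToolCalls is hypothetical; it checks the one invariant that matters before the second request, namely that every tool call has a matching tool message.

```typescript
// Local stand-in for the SDK's message param type, trimmed to the fields used here.
type ToolCallRef = {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
};

type Msg =
  | { role: "system" | "user"; content: string }
  | { role: "assistant"; content: string | null; tool_calls?: ToolCallRef[] }
  | { role: "tool"; tool_call_id: string; content: string };

// Illustrative history after one full turn (ids and answer text are invented).
const history: Msg[] = [
  { role: "system", content: "You are a helpful weather assistant." },
  { role: "user", content: "What's the weather like in Tokyo right now?" },
  {
    role: "assistant",
    content: null,
    tool_calls: [
      {
        id: "call_abc123",
        type: "function",
        function: {
          name: "get_current_weather",
          arguments: '{"city":"Tokyo","unit":"celsius"}',
        },
      },
    ],
  },
  {
    role: "tool",
    tool_call_id: "call_abc123",
    content: '{"city":"Tokyo","temperature":22,"unit":"celsius","condition":"Sunny"}',
  },
  { role: "assistant", content: "It is currently 22°C and sunny in Tokyo." },
];

// Returns the ids of tool calls that have no matching tool message yet.
function unansweredToolCalls(msgs: Msg[]): string[] {
  const answered = new Set<string>();
  const requested: string[] = [];
  for (const m of msgs) {
    if (m.role === "tool") answered.add(m.tool_call_id);
    if (m.role === "assistant") {
      for (const call of m.tool_calls ?? []) requested.push(call.id);
    }
  }
  return requested.filter((id) => !answered.has(id));
}

console.log(unansweredToolCalls(history)); // []
```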

Parallel Function Calls

This is where it gets genuinely useful. Ask "What's the weather in London, Tokyo, and New York?" and GPT-4o can return three get_current_weather calls in a single response, rather than asking for them one round trip at a time:

One parallel turn: a single request fans out into three tool calls that run at the same time with Promise.all, and their results fan back in before the model writes one combined answer.

// src/parallel-example.ts
import OpenAI from "openai"; // needed for the OpenAI.Chat.* types used below
import { openai } from "./client";
import { weatherTool } from "./tools/weather";

async function getCurrentWeather(city: string, unit: string): Promise<string> {
  // Simulate API latency
  await new Promise((resolve) => setTimeout(resolve, 100));
  return JSON.stringify({ city, temperature: Math.floor(Math.random() * 30), unit });
}

async function runParallelWeatherAgent() {
  const messages: OpenAI.Chat.ChatCompletionMessageParam[] = [
    {
      role: "user",
      content: "What is the weather in London, Tokyo, and New York? Use celsius.",
    },
  ];

  const response = await openai.chat.completions.create({
    model: "gpt-4o",
    messages,
    tools: [weatherTool],
    tool_choice: "auto",
  });

  const responseMessage = response.choices[0].message;

  if (responseMessage.tool_calls && responseMessage.tool_calls.length > 0) {
    console.log(`[Parallel] Model made ${responseMessage.tool_calls.length} tool calls simultaneously`);

    messages.push(responseMessage);

    // Execute ALL tool calls in parallel — do not await them one by one
    const toolResults = await Promise.all(
      responseMessage.tool_calls.map(async (toolCall) => {
        const args = JSON.parse(toolCall.function.arguments) as {
          city: string;
          unit: string;
        };
        const result = await getCurrentWeather(args.city, args.unit);
        return { toolCall, result };
      })
    );

    // Add all results to the message history
    for (const { toolCall, result } of toolResults) {
      messages.push({
        role: "tool",
        tool_call_id: toolCall.id,
        content: result,
      });
    }

    // Final response incorporating all results
    const finalResponse = await openai.chat.completions.create({
      model: "gpt-4o",
      messages,
    });

    console.log(`\nAssistant: ${finalResponse.choices[0].message.content}`);
  }
}

runParallelWeatherAgent().catch(console.error);

Always use Promise.all() for parallel tool calls. Await them one by one in a loop and you serialise what the model intended to parallelise, losing the entire performance benefit.

Handling Partial Tool Call Results

When one call in a parallel batch fails, you have two options. Returning an error string is almost always the right one — the model can work with partial results and acknowledge the failure:

// Option 1: Return an error string (recommended)
// The model can work with partial results and acknowledge the failure
const toolResults = await Promise.all(
  responseMessage.tool_calls.map(async (toolCall) => {
    try {
      const args = JSON.parse(toolCall.function.arguments);
      const result = await executeToolCall(toolCall.function.name, args);
      return { id: toolCall.id, result };
    } catch (error) {
      // Return error as string — model handles it gracefully
      return {
        id: toolCall.id,
        result: `Error: ${error instanceof Error ? error.message : "Tool call failed"}`,
      };
    }
  })
);

// Option 2: Fail the entire request
// Use this only when ALL results are required to answer correctly
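The two options can be put side by side in a small sketch. The ToolCall and ToolRunner types, the function names, and the flaky runner are all hypothetical stand-ins for your own executor. The first helper is Option 2: it relies on Promise.all rejecting as soon as any call fails. The second is a variant of Option 1 built on Promise.allSettled, which is one reasonable way to collect partial results.

```typescript
type ToolCall = { id: string; name: string; args: unknown };
type ToolRunner = (call: ToolCall) => Promise<string>;
type ToolOutput = { id: string; result: string };

// Option 2: any single failure rejects the whole batch.
async function runAllOrFail(
  calls: ToolCall[],
  run: ToolRunner
): Promise<ToolOutput[]> {
  return Promise.all(
    calls.map(async (call) => ({ id: call.id, result: await run(call) }))
  );
}

// Option 1 variant: every call settles, and failures become error strings.
async function runWithPartialResults(
  calls: ToolCall[],
  run: ToolRunner
): Promise<ToolOutput[]> {
  const settled = await Promise.allSettled(calls.map((call) => run(call)));
  return settled.map((outcome, i) => ({
    id: calls[i].id,
    result:
      outcome.status === "fulfilled"
        ? outcome.value
        : `Error: ${
            outcome.reason instanceof Error
              ? outcome.reason.message
              : "Tool call failed"
          }`,
  }));
}

// Hypothetical runner that fails for one tool name, to exercise both paths.
const flaky: ToolRunner = async (call) => {
  if (call.name === "broken_tool") throw new Error("upstream timeout");
  return JSON.stringify({ ok: true, tool: call.name });
};

const batch: ToolCall[] = [
  { id: "call_1", name: "get_current_weather", args: {} },
  { id: "call_2", name: "broken_tool", args: {} },
];

runWithPartialResults(batch, flaky).then((results) => console.log(results));
```

With runAllOrFail the same batch rejects with "upstream timeout", so the caller has to catch it and decide what to tell the user. Note that Promise.all does not cancel the other calls when one fails; they keep running in the background.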

Building a Real Weather + Calculator Agent

Combine multiple tools with a loop and you have something close to a real product. First, a second tool — a calculator:

// src/tools/calculator.ts
import OpenAI from "openai";

export const calculatorTool: OpenAI.Chat.ChatCompletionTool = {
  type: "function",
  function: {
    name: "calculate",
    description:
      "Evaluates a mathematical expression and returns the result. Use for any arithmetic, percentages, or unit conversions the user asks for.",
    strict: true,
    parameters: {
      type: "object",
      properties: {
        expression: {
          type: "string",
          description:
            "A valid mathematical expression using numbers and operators (+, -, *, /, **, %). Example: '(145 * 1.2) + 50'",
        },
        context: {
          type: "string",
          description: "Brief description of what this calculation is for",
        },
      },
      required: ["expression", "context"],
      additionalProperties: false,
    },
  },
};

Then a router that maps tool names to implementations, wrapped in an agent loop that keeps going until the model stops asking for tools:

// src/multi-tool-agent.ts
import { openai } from "./client";
import { weatherTool } from "./tools/weather";
import { calculatorTool } from "./tools/calculator";
import OpenAI from "openai";

type ToolName = "get_current_weather" | "calculate";

// Tool execution router — maps tool names to their implementations
async function executeTool(
  name: string,
  args: Record<string, string>
): Promise<string> {
  switch (name as ToolName) {
    case "get_current_weather":
      // Replace with real weather API
      return JSON.stringify({
        city: args.city,
        temperature: 22,
        unit: args.unit,
        condition: "Sunny",
      });

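    case "calculate":
      // The expression is written by the model, so treat it as untrusted input.
      // Only digits, whitespace, and arithmetic characters may reach the evaluator.
      if (!/^[\d\s+\-*\/%().]+$/.test(args.expression)) {
        return JSON.stringify({ error: "Invalid expression" });
      }
      try {
        const result = Function(`"use strict"; return (${args.expression})`)();
        return JSON.stringify({
          expression: args.expression,
          result,
          context: args.context,
        });
      } catch {
        return JSON.stringify({ error: "Invalid expression" });
      }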

    default:
      return JSON.stringify({ error: `Unknown tool: ${name}` });
  }
}

async function runMultiToolAgent(userMessage: string): Promise<void> {
  console.log(`\nUser: ${userMessage}\n`);

  const messages: OpenAI.Chat.ChatCompletionMessageParam[] = [
    {
      role: "system",
      content: `You are a helpful assistant with access to weather data and a calculator.
Use the available tools to answer questions accurately.
When you have all the information you need, give a clear, concise answer.`,
    },
    { role: "user", content: userMessage },
  ];

  const tools = [weatherTool, calculatorTool];

  // Agent loop — keep going until the model stops calling tools
  let iteration = 0;
  const MAX_ITERATIONS = 5;

  while (iteration < MAX_ITERATIONS) {
    iteration++;

    const response = await openai.chat.completions.create({
      model: "gpt-4o",
      messages,
      tools,
      tool_choice: "auto",
    });

    const message = response.choices[0].message;
    const finishReason = response.choices[0].finish_reason;

    messages.push(message);

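    // Model is done: print whatever it returned and stop.
    // Anything other than "tool_calls" ends the loop, so "length" and
    // "content_filter" do not spin until MAX_ITERATIONS.
    if (finishReason !== "tool_calls") {
      console.log(`Assistant: ${message.content}`);
      break;
    }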

    // Model wants to call tools
    if (finishReason === "tool_calls" && message.tool_calls) {
      const results = await Promise.all(
        message.tool_calls.map(async (toolCall) => {
          const args = JSON.parse(toolCall.function.arguments) as Record<string, string>;
          console.log(`[Tool] ${toolCall.function.name}(${JSON.stringify(args)})`);
          const result = await executeTool(toolCall.function.name, args);
          console.log(`[Result] ${result}`);
          return { toolCall, result };
        })
      );

      for (const { toolCall, result } of results) {
        messages.push({
          role: "tool",
          tool_call_id: toolCall.id,
          content: result,
        });
      }
    }
  }
}

// Test queries
const queries = [
  "What's the weather in London? If it's above 20°C, calculate how many degrees above average (15°C) that is.",
  "Get me the weather in both Tokyo and Paris, then tell me the average temperature.",
];

(async () => {
  for (const query of queries) {
    await runMultiToolAgent(query);
    console.log("\n" + "─".repeat(60));
  }
})().catch(console.error); // surface API errors instead of an unhandled rejection

The MAX_ITERATIONS ceiling is not optional. The defining failure mode of a tool-calling loop is the runaway, and no prompt trick beats a hard stop.
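One thing the loop above leaves open is what the user sees when the ceiling is actually hit. A minimal sketch of one way to handle it, assuming a hypothetical runBounded helper where each step stands in for one model round trip: the helper reports whether the budget ran out, so the caller can return a fallback message rather than nothing.

```typescript
type StepResult = { done: true; answer: string } | { done: false };

type BoundedOutcome = {
  answer: string;
  iterations: number;
  exhausted: boolean; // true when the ceiling was hit without a final answer
};

// Runs `step` until it reports done or the iteration budget is spent.
async function runBounded(
  step: (iteration: number) => Promise<StepResult>,
  maxIterations: number,
  fallback: string
): Promise<BoundedOutcome> {
  for (let i = 1; i <= maxIterations; i++) {
    const result = await step(i);
    if (result.done) {
      return { answer: result.answer, iterations: i, exhausted: false };
    }
  }
  return { answer: fallback, iterations: maxIterations, exhausted: true };
}

// Stand-in step that finishes on the third round trip.
runBounded(
  async (i) => (i < 3 ? { done: false } : { done: true, answer: "22°C and sunny" }),
  5,
  "Sorry, I could not finish that request."
).then((outcome) => console.log(outcome));
// { answer: '22°C and sunny', iterations: 3, exhausted: false }
```

Logging the exhausted flag is worth the extra line: a rising rate of exhausted runs usually points at a tool that keeps returning something the model cannot use.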

TypeScript Types for Function Call Responses

One of the biggest friction points with function calling is getting proper TypeScript types. This pattern gives you full type safety on the arguments the model passes back:

// src/types/tools.ts

// Define your tool argument types
export interface WeatherArgs {
  city: string;
  unit: "celsius" | "fahrenheit";
}

export interface CalculatorArgs {
  expression: string;
  context: string;
}

// Union type for all possible tool calls
export type ToolArgs = WeatherArgs | CalculatorArgs;

// Type-safe tool call parser
export function parseToolArgs<T>(toolCall: {
  function: { name: string; arguments: string };
}): T {
  try {
    return JSON.parse(toolCall.function.arguments) as T;
  } catch {
    throw new Error(
      `Failed to parse arguments for tool ${toolCall.function.name}: ${toolCall.function.arguments}`
    );
  }
}

// Usage:
// const args = parseToolArgs<WeatherArgs>(toolCall);
// args.city — fully typed, autocomplete works

For anything more complex, use Zod to validate at runtime and derive the TypeScript type — one schema, both guarantees:

// src/schemas/weather.schema.ts
import { z } from "zod";

export const WeatherArgsSchema = z.object({
  city: z.string().min(1),
  unit: z.enum(["celsius", "fahrenheit"]),
});

export type WeatherArgs = z.infer<typeof WeatherArgsSchema>;

// In your tool executor:
function parseWeatherArgs(raw: string): WeatherArgs {
  const parsed = JSON.parse(raw);
  return WeatherArgsSchema.parse(parsed); // Throws ZodError if invalid
}

You get compile-time types and runtime validation at once — essential when you do not fully trust what the model passes back.
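The same idea extends to routing. As a dependency-free sketch (hand-written checks stand in for the Zod schemas here, purely to keep the example self-contained), a hypothetical parseCall function can turn a raw tool name and arguments string into a discriminated union, so the executor switches on name and gets correctly typed args in each branch. Invalid input comes back as an error object rather than an exception.

```typescript
interface WeatherArgs {
  city: string;
  unit: "celsius" | "fahrenheit";
}

interface CalculatorArgs {
  expression: string;
  context: string;
}

// One variant per tool: checking `name` narrows `args` to the right type.
type ParsedCall =
  | { name: "get_current_weather"; args: WeatherArgs }
  | { name: "calculate"; args: CalculatorArgs };

type ParseFailure = { error: string };

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function parseCall(name: string, rawArguments: string): ParsedCall | ParseFailure {
  let value: unknown;
  try {
    value = JSON.parse(rawArguments);
  } catch {
    return { error: `Arguments for ${name} are not valid JSON` };
  }
  if (!isRecord(value)) {
    return { error: `Arguments for ${name} must be a JSON object` };
  }

  switch (name) {
    case "get_current_weather": {
      const { city, unit } = value;
      if (
        typeof city === "string" &&
        city.length > 0 &&
        (unit === "celsius" || unit === "fahrenheit")
      ) {
        return { name: "get_current_weather", args: { city, unit } };
      }
      return { error: "get_current_weather expects { city, unit }" };
    }
    case "calculate": {
      const { expression, context } = value;
      if (typeof expression === "string" && typeof context === "string") {
        return { name: "calculate", args: { expression, context } };
      }
      return { error: "calculate expects { expression, context }" };
    }
    default:
      return { error: `Unknown tool: ${name}` };
  }
}

const parsed = parseCall("get_current_weather", '{"city":"Tokyo","unit":"celsius"}');
if ("error" in parsed) {
  console.log(parsed.error);
} else if (parsed.name === "get_current_weather") {
  console.log(parsed.args.city); // Tokyo
}
```

In a real project the hand-written checks would be replaced by WeatherArgsSchema.safeParse and its siblings; the union shape stays the same.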

Streaming Function Calls

Streaming text is trivial. Streaming tool calls is slightly more involved, because the arguments arrive as chunks you have to assemble — the SDK's stream() helper does the assembly for you:

// src/streaming-tools.ts
import { openai } from "./client";
import { weatherTool } from "./tools/weather";
import OpenAI from "openai";

async function streamWithTools(userMessage: string) {
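  // In recent SDK versions stream() lives on openai.chat.completions;
  // older v4 releases exposed it as openai.beta.chat.completions.stream().
  const stream = openai.chat.completions.stream({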
    model: "gpt-4o",
    messages: [{ role: "user", content: userMessage }],
    tools: [weatherTool],
    tool_choice: "auto",
  });

  // Stream text tokens as they arrive
  stream.on("content", (delta) => {
    process.stdout.write(delta);
  });

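  // Wait for the complete response.
  // finalChatCompletion() returns the full completion (with choices);
  // finalMessage() would return only the assistant message.
  const completion = await stream.finalChatCompletion();
  const choice = completion.choices[0];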

  if (choice.finish_reason === "tool_calls" && choice.message.tool_calls) {
    console.log("\n[Executing tool calls...]");

    const messages: OpenAI.Chat.ChatCompletionMessageParam[] = [
      { role: "user", content: userMessage },
      choice.message,
    ];

    // Execute tools and get results
    for (const toolCall of choice.message.tool_calls) {
      const args = JSON.parse(toolCall.function.arguments);
      const result = JSON.stringify({ city: args.city, temp: 22, unit: args.unit });

      messages.push({
        role: "tool",
        tool_call_id: toolCall.id,
        content: result,
      });
    }

    // Stream the final response after tool execution
    // Same helper as above: openai.beta.chat.completions.stream() on older v4 SDKs
    const finalStream = openai.chat.completions.stream({
      model: "gpt-4o",
      messages,
    });

    process.stdout.write("\nAssistant: ");
    finalStream.on("content", (delta) => process.stdout.write(delta));
    await finalStream.finalMessage();
    process.stdout.write("\n");
  }
}

streamWithTools("What's the weather in Tokyo today?").catch(console.error);
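For a sense of what the helper is doing on your behalf, here is a sketch of the assembly step on its own. It assumes the streamed shape in which each chunk carries tool-call fragments keyed by an index, with the id and name arriving once and the arguments arriving as string pieces. ToolCallDelta is a trimmed local type written for this example, and the sample fragments are invented.

```typescript
// Trimmed local mirror of a streamed tool-call fragment.
type ToolCallDelta = {
  index: number;
  id?: string;
  function?: { name?: string; arguments?: string };
};

type AssembledCall = { id: string; name: string; arguments: string };

// Merges fragments by index, concatenating the argument pieces in arrival order.
function assembleToolCalls(deltas: ToolCallDelta[]): AssembledCall[] {
  const calls: AssembledCall[] = [];
  for (const delta of deltas) {
    const call = (calls[delta.index] ??= { id: "", name: "", arguments: "" });
    if (delta.id) call.id = delta.id;
    if (delta.function?.name) call.name += delta.function.name;
    if (delta.function?.arguments) call.arguments += delta.function.arguments;
  }
  return calls;
}

// Invented fragments for two parallel calls.
const fragments: ToolCallDelta[] = [
  { index: 0, id: "call_1", function: { name: "get_current_weather", arguments: "" } },
  { index: 0, function: { arguments: '{"city":' } },
  { index: 0, function: { arguments: '"Tokyo","unit":"celsius"}' } },
  { index: 1, id: "call_2", function: { name: "get_current_weather", arguments: "" } },
  { index: 1, function: { arguments: '{"city":"Paris","unit":"celsius"}' } },
];

for (const call of assembleToolCalls(fragments)) {
  // Arguments are only valid JSON once every fragment has arrived.
  console.log(call.id, call.name, JSON.parse(call.arguments));
}
```

The key point is that a partially assembled arguments string is not parseable, so JSON.parse belongs after the stream has finished, not inside the chunk handler.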

Controlling Tool Behavior with tool_choice

tool_choice is the knob you reach for when "auto" is not precise enough:

// Let the model decide (default — use for most cases)
tool_choice: "auto"

// Never call any tools — get a text response only
tool_choice: "none"

// Force the model to call at least one tool
tool_choice: "required"

// Force a specific tool to be called
tool_choice: {
  type: "function",
  function: { name: "get_current_weather" }
}

When to use each:

  • "auto" — most conversations, where tool use is optional.
  • "none" — when you want the model's reasoning without executing anything.
  • "required" — when you want structured output and a tool call is the mechanism.
  • a specific tool — when you know exactly what you want, e.g. always extracting user intent into one schema.
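One practical wrinkle when combining a forcing value with an agent loop: if "required" or a specific tool is sent on every iteration, the model has to call a tool each time and never gets the chance to write a final answer. A minimal sketch of one way around that, using a hypothetical toolChoiceFor helper and a local ToolChoice type that mirrors the four values above, is to force only the first turn and fall back to "auto" afterwards.

```typescript
// Local mirror of the four tool_choice shapes shown above.
type ToolChoice =
  | "auto"
  | "none"
  | "required"
  | { type: "function"; function: { name: string } };

// Forces the named tool on the first iteration only, then lets the model decide.
function toolChoiceFor(iteration: number, forcedTool?: string): ToolChoice {
  if (iteration === 1 && forcedTool) {
    return { type: "function", function: { name: forcedTool } };
  }
  return "auto";
}

console.log(toolChoiceFor(1, "get_current_weather"));
// { type: 'function', function: { name: 'get_current_weather' } }
console.log(toolChoiceFor(2, "get_current_weather")); // auto
```

Inside a loop like the multi-tool agent, the result would be passed as tool_choice on each request, with iteration being the loop counter.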

Error Handling and Fallback Strategies

Production function calling needs layered error handling — validate arguments, retry with backoff, and always degrade gracefully:

// src/robust-agent.ts
import OpenAI from "openai";
import { WeatherArgsSchema, type WeatherArgs } from "./schemas/weather.schema";
// getCurrentWeather is the executor defined in src/basic-example.ts; export it
// from a shared module and import it here.

async function robustToolExecution(
  toolCall: OpenAI.Chat.ChatCompletionMessageToolCall
): Promise<string> {
  const MAX_RETRIES = 2;
  const toolName = toolCall.function.name;

  if (toolName !== "get_current_weather") {
    return JSON.stringify({ error: `Unknown tool: ${toolName}` });
  }

  // Parse and validate once, outside the retry loop: malformed arguments are
  // deterministic, so retrying them only adds latency.
  let validated: WeatherArgs;
  try {
    validated = WeatherArgsSchema.parse(JSON.parse(toolCall.function.arguments));
  } catch (error) {
    const reason = error instanceof Error ? error.message : String(error);
    return JSON.stringify({
      error: true,
      message: `Invalid arguments: ${reason}`,
      tool: toolName,
    });
  }

  let lastError: Error | null = null;

  // Retry only the execution itself, where failures can be transient
  for (let attempt = 1; attempt <= MAX_RETRIES; attempt++) {
    try {
      return await getCurrentWeather(validated.city, validated.unit);
    } catch (error) {
      lastError = error instanceof Error ? error : new Error(String(error));

      if (attempt < MAX_RETRIES) {
        console.warn(`[Retry ${attempt}] Tool call failed: ${lastError.message}`);
        await new Promise((resolve) => setTimeout(resolve, 500 * attempt)); // Backoff
      }
    }
  }

  // After all retries, return a graceful error string
  // The model will acknowledge the failure in its response
  return JSON.stringify({
    error: true,
    message: `Tool execution failed after ${MAX_RETRIES} attempts: ${lastError?.message}`,
    tool: toolName,
  });
}

The most important rule: always return a string from your tool executor, even on failure. Never let an exception propagate out of the loop, because that turns a graceful "that tool failed" into a 500 for your user.
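That rule can be enforced in one place with a small wrapper. The sketch below is one possible shape, not a prescribed pattern: safeExecute is a hypothetical helper that catches anything the tool throws and also adds a time limit (the 5 second default is an arbitrary assumption), so a hung upstream call cannot stall the whole turn.

```typescript
// Wraps any tool function so the caller always gets a string back.
async function safeExecute(
  run: () => Promise<string>,
  toolName: string,
  timeoutMs = 5_000 // arbitrary default for illustration
): Promise<string> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`Timed out after ${timeoutMs}ms`)),
      timeoutMs
    );
  });

  try {
    // Whichever settles first wins: the tool result or the timeout rejection.
    return await Promise.race([run(), timeout]);
  } catch (error) {
    return JSON.stringify({
      error: true,
      tool: toolName,
      message: error instanceof Error ? error.message : String(error),
    });
  } finally {
    clearTimeout(timer); // avoid keeping the process alive after a fast result
  }
}

// Usage with a stand-in tool that always fails.
safeExecute(async () => {
  throw new Error("weather API unreachable");
}, "get_current_weather").then((result) => console.log(result));
// {"error":true,"tool":"get_current_weather","message":"weather API unreachable"}
```

The timeout only stops waiting; it does not cancel the underlying work. If the tool makes HTTP requests, passing an AbortSignal through to them is the usual way to stop the request itself.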

The Assistants API vs Function Calling — Choosing the Right One

A question that comes up constantly. The short version: function calling gives you control, the Assistants API gives you convenience.

                   Function Calling              Assistants API
State management   You manage message history    OpenAI manages threads
File handling      Manual                        Built-in (file search, code interpreter)
Complexity         Lower                         Higher
Control            Full                          Partial
Cost               Pay per token                 Pay per token + storage
Best for           Custom agents, full control   Persistent assistants, file Q&A

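Use function calling when you want full control over the conversation loop, which covers most custom agent implementations. Reach for the Assistants API when you want OpenAI to manage thread state and files, or you want the Code Interpreter tool without building it yourself. Before committing to it, check its current status: OpenAI has announced that the Assistants API is being deprecated in favour of the Responses API.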

You now understand function calling end to end — tool schemas, parallel calls, streaming, typed responses, and error handling. The full source for this tutorial is on GitHub.

Frequently asked questions

What's the difference between function calling and the Assistants API?

Function calling is a feature of the Chat Completions API — you own the conversation loop and the message history. The Assistants API is a higher-level abstraction where OpenAI manages threads, history, and file storage for you. Function calling gives more control; the Assistants API gives more convenience for things like persistent chat and file Q&A.

How do I call multiple functions in one request?

GPT-4o and GPT-4o mini support parallel tool calling natively. When the model decides several tools are needed, it returns all of them in a single response. Execute them concurrently with Promise.all() and return every result before making the next API call.

Can I use function calling with GPT-3.5?

GPT-3.5-turbo supports function calling, but it is a legacy model with far less reliable tool selection and parallel tool use. For production agents, use GPT-4o or GPT-4o mini; the reliability difference is substantial.

How do I type function call responses in TypeScript?

The OpenAI SDK ships full types — use OpenAI.Chat.ChatCompletionTool for tool definitions and OpenAI.Chat.ChatCompletionMessageToolCall for tool-call responses. For the arguments JSON, parse with a Zod schema and use z.infer<> to derive the type, giving you runtime validation and compile-time types at once.

What is strict: true in tool definitions?

Strict mode uses constrained decoding to guarantee the model's arguments exactly match your JSON schema: no extra fields, no missing required ones. It requires additionalProperties: false and every property to be listed in required; a field that is optional in practice stays in required and takes a nullable type instead. Turn it on for all production tool definitions.

How do I stop the model calling tools when it shouldn't?

Set tool_choice: "none" to disable tools for a specific request, or steer it through the system prompt and the tool descriptions themselves — an explicit "do NOT use this for…" line in a description is one of the most effective ways to keep the model from reaching for the wrong tool.

Where to go next

Function calling is the primitive. These are the natural next builds on top of it:

Define the tools precisely, let the model choose, execute and return strings, bound the loop, and validate everything that comes back — and function calling stops being a clever demo and becomes the dependable core of an AI product you can actually ship.
