
LiveKit Agents for Node.js

The Agent Framework is designed for building realtime, programmable participants that run on servers. Use it to create conversational, multi-modal voice agents that can see, hear, and understand.

This is a Node.js distribution of the LiveKit Agents framework, originally written in Python.

Looking for the Python library? Check out Agents.

✨ 1.0 Release ✨

This README reflects the 1.0 release. See the migration guide if you're trying to upgrade from 0.x.

Features

  • Flexible integrations: A comprehensive ecosystem to mix and match the right STT, LLM, TTS, and Realtime API to suit your use case.
  • Extensive WebRTC clients: Build client applications using LiveKit's open-source SDK ecosystem, supporting all major platforms.
  • Exchange data with clients: Use RPCs and other Data APIs to seamlessly exchange data with clients.
  • Semantic turn detection: Uses a transformer model to detect when a user is done with their turn, helping to reduce interruptions.
  • Open-source: Fully open-source, allowing you to run the entire stack on your own servers, including LiveKit server, one of the most widely used WebRTC media servers.

Installation

The framework includes a variety of plugins that make it easy to process streaming input or generate output. For example, there are plugins for converting text-to-speech or running inference with popular LLMs.

  • Install pnpm if you haven't already:
npm install -g pnpm

To install the core Agents library, run:

pnpm install @livekit/agents

Plugins are installed as separate packages. Currently, only the following plugins are supported:

Plugin                              Features
@livekit/agents-plugin-openai       LLM, TTS, STT
@livekit/agents-plugin-google       LLM, TTS
@livekit/agents-plugin-deepgram     STT
@livekit/agents-plugin-elevenlabs   TTS
@livekit/agents-plugin-cartesia     TTS
@livekit/agents-plugin-neuphonic    TTS
@livekit/agents-plugin-resemble     TTS
@livekit/agents-plugin-silero       VAD
@livekit/agents-plugin-livekit      EOU (end-of-utterance turn detection)
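
For example, to install the core library together with the plugins used in the voice agent example below (package names as listed in the table above):

pnpm install @livekit/agents @livekit/agents-plugin-openai @livekit/agents-plugin-deepgram @livekit/agents-plugin-elevenlabs @livekit/agents-plugin-silero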

Docs and guides

Documentation on the framework and how to use it can be found here.

Core concepts

  • Agent: An LLM-based application with defined instructions.
  • AgentSession: A container for agents that manages interactions with end users.
  • entrypoint: The starting point for an interactive session, similar to a request handler in a web server.
  • Worker: The main process that coordinates job scheduling and launches agents for user sessions.

Usage

Check out the quickstart guide.

Simple voice agent


import {
  type JobContext,
  type JobProcess,
  WorkerOptions,
  cli,
  defineAgent,
  llm,
  voice,
} from '@livekit/agents';
import * as deepgram from '@livekit/agents-plugin-deepgram';
import * as elevenlabs from '@livekit/agents-plugin-elevenlabs';
import * as openai from '@livekit/agents-plugin-openai';
import * as silero from '@livekit/agents-plugin-silero';
import { fileURLToPath } from 'node:url';
import { z } from 'zod';

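// Define a function tool the LLM can call to look up weather for a location.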
const lookupWeather = llm.tool({
  description: 'Used to look up weather information.',
  parameters: z.object({
    location: z.string().describe('The location to look up weather information for'),
  }),
  execute: async ({ location }, { ctx }) => {
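    // Stub implementation: a real tool would query a weather service with `location`.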
    return { weather: 'sunny', temperature: 70 };
  },
});

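// prewarm runs once per worker process (here, loading the Silero VAD model); entry runs for each assigned job.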
export default defineAgent({
  prewarm: async (proc: JobProcess) => {
    proc.userData.vad = await silero.VAD.load();
  },
  entry: async (ctx: JobContext) => {
    await ctx.connect();

    const agent = new voice.Agent({
      instructions: 'You are a friendly voice assistant built by LiveKit.',
      tools: { lookupWeather },
    });

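    // Wire up the voice pipeline: VAD for speech detection, then STT -> LLM -> TTS.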
    const session = new voice.AgentSession({
      vad: ctx.proc.userData.vad! as silero.VAD,
      stt: new deepgram.STT(),
      llm: new openai.LLM(),
      tts: new elevenlabs.TTS(),
    });

    await session.start({
      agent,
      room: ctx.room,
    });

    await session.generateReply({
      instructions: 'greet the user and ask about their day',
    });
  },
});

cli.runApp(new WorkerOptions({ agent: fileURLToPath(import.meta.url) }));

You'll need the following environment variables for this example:

  • DEEPGRAM_API_KEY
  • OPENAI_API_KEY
  • ELEVEN_API_KEY

Multi-agent handoff


This code snippet is abbreviated. For the full example, see multi_agent.ts

type StoryData = {
  name?: string;
  location?: string;
};

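// First agent: gathers the user's name and location, then hands off to StoryAgent.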
class IntroAgent extends voice.Agent<StoryData> {
  constructor() {
    super({
      instructions: `You are a story teller. Your goal is to gather a few pieces of information from the user to make the story personalized and engaging. Ask the user for their name and where they are from.`,
      tools: {
        informationGathered: llm.tool({
          description:
            'Called when the user has provided the information needed to make the story personalized and engaging.',
          parameters: z.object({
            name: z.string().describe('The name of the user'),
            location: z.string().describe('The location of the user'),
          }),
          execute: async ({ name, location }, { ctx }) => {
            ctx.userData.name = name;
            ctx.userData.location = location;

            return llm.handoff({
              agent: new StoryAgent(name, location),
              returns: "Let's start the story!",
            });
          },
        }),
      },
    });
  }

  // Use inheritance to create an agent with custom hooks
  async onEnter() {
    this.session.generateReply({
      instructions: '"greet the user and gather information"',
    });
  }
}

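// Second agent: uses the gathered details to tell a personalized story.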
class StoryAgent extends voice.Agent<StoryData> {
  constructor(name: string, location: string) {
    super({
      instructions: `You are a storyteller. Use the user's information in order to make the story personalized.
        The user's name is ${name}, from ${location}`,
    });
  }

  async onEnter() {
    this.session.generateReply();
  }
}

export default defineAgent({
  prewarm: async (proc: JobProcess) => {
    proc.userData.vad = await silero.VAD.load();
  },
  entry: async (ctx: JobContext) => {
    await ctx.connect();
    const participant = await ctx.waitForParticipant();
    console.log('participant joined: ', participant.identity);

    const userdata: StoryData = {};

    const session = new voice.AgentSession({
      vad: ctx.proc.userData.vad! as silero.VAD,
      stt: new deepgram.STT(),
      llm: new openai.LLM(),
      tts: new elevenlabs.TTS(),
      userData: userdata,
    });

    await session.start({
      agent: new IntroAgent(),
      room: ctx.room,
    });
  },
});

Running your agent

The framework exposes a CLI interface to run your agent. To get started, you'll need the following environment variables set:

  • LIVEKIT_URL
  • LIVEKIT_API_KEY
  • LIVEKIT_API_SECRET
  • any additional provider API keys (e.g. OPENAI_API_KEY)
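
For example, assuming a bash-like shell, you might export placeholder values before starting the worker:

export LIVEKIT_URL=wss://<your-project>.livekit.cloud
export LIVEKIT_API_KEY=<your-api-key>
export LIVEKIT_API_SECRET=<your-api-secret>
export OPENAI_API_KEY=<your-openai-key>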

The following command will start the worker and wait for users to connect to your LiveKit server:

pnpm run build && node ./examples/src/restaurant_agent.ts dev

Using playground for your agent UI

To ease the process of building and testing an agent, we've developed a versatile web frontend called "playground". You can use or modify this app to suit your specific requirements. It can also serve as a starting point for a completely custom agent application.

Running for production

pnpm run build && node ./examples/src/restaurant_agent.ts start

Runs the agent with production-ready optimizations.

FAQ

What happens when I run my agent?

When you follow the steps above to run your agent, a worker is started that opens an authenticated WebSocket connection to a LiveKit server instance (defined by your LIVEKIT_URL and authenticated with an access token).

No agents are actually running at this point. Instead, the worker is waiting for LiveKit server to give it a job.

When a room is created, the server notifies one of the registered workers about a new job. The notified worker can decide whether or not to accept it. If the worker accepts the job, the worker will instantiate your agent as a participant and have it join the room where it can start subscribing to tracks. A worker can manage multiple agent instances simultaneously.

If a notified worker rejects the job or does not accept within a predetermined timeout period, the server will route the job request to another available worker.

What happens when I SIGTERM a worker?

The orchestration system was designed for production use cases. Unlike a typical web server, an agent is a stateful program, so it's important that a worker isn't terminated while active sessions are ongoing.

When a worker receives SIGTERM, it signals to the LiveKit server that it no longer wants additional jobs. It also auto-rejects any new job requests that arrive before the server has processed that signal. The worker remains alive while it manages any agents still connected to rooms.

License

This project is licensed under Apache-2.0, and is REUSE-3.2 compliant. Refer to the license for details.


LiveKit Ecosystem
LiveKit SDKs: Browser · iOS/macOS/visionOS · Android · Flutter · React Native · Rust · Node.js · Python · Unity · Unity (WebGL)
Server APIs: Node.js · Golang · Ruby · Java/Kotlin · Python · Rust · PHP (community) · .NET (community)
UI Components: React · Android Compose · SwiftUI
Agents Frameworks: Python · Node.js · Playground
Services: LiveKit server · Egress · Ingress · SIP
Resources: Docs · Example apps · Cloud · Self-hosting · CLI
