LLM.js — Simple LLM library for Node.js
LLM.js is the fastest way to use Large Language Models in JavaScript. It’s a single simple interface to hundreds of popular LLMs:
- OpenAI: `gpt-4`, `gpt-4-turbo-preview`, `gpt-3.5-turbo`
- Google: `gemini-1.5-pro`, `gemini-1.0-pro`, `gemini-pro-vision`
- Anthropic: `claude-3-opus`, `claude-3-sonnet`, `claude-3-haiku`, `claude-2.1`, `claude-instant-1.2`
- Groq: `mixtral-8x7b`, `llama2-70b`, `gemma-7b-it`
- Together: `llama-3-70b`, `llama-3-8b`, `nous-hermes-2`, …
- Mistral: `mistral-medium`, `mistral-small`, `mistral-tiny`
- llamafile: `LLaVa-1.5`, `TinyLlama-1.1B`, `Phi-2`, …
- Ollama: `llama-3`, `llama-2`, `gemma`, `dolphin-phi`, …
await LLM("the color of the sky is", { model: "gpt-4" }); // blue
Features
- Easy to use
- Same API for all LLMs (OpenAI, Google, Anthropic, Mistral, Groq, Llamafile, Ollama, Together)
- Chat (Message History)
- JSON
- Streaming
- System Prompts
- Options (`temperature`, `max_tokens`, `seed`, …)
- Parsers
- `llm` command for your shell
- Node.js and Browser supported
- MIT license
Install
Install `LLM.js` from NPM:
npm install @themaximalist/llm.js
Setting up LLMs is easy. Just make sure your API key is set in your environment:
export OPENAI_API_KEY=...
export ANTHROPIC_API_KEY=...
export MISTRAL_API_KEY=...
export GOOGLE_API_KEY=...
export GROQ_API_KEY=...
export TOGETHER_API_KEY=...
For local models like llamafile and Ollama, ensure an instance is running.
Usage
The simplest way to call `LLM.js` is as an `async function`.
const LLM = require("@themaximalist/llm.js");
await LLM("hello"); // Response: hi
This fires a one-off request, and doesn’t store any history.
Chat
Initialize an LLM instance to build up message history.
const llm = new LLM();
await llm.chat("what's the color of the sky in hex value?"); // #87CEEB
await llm.chat("what about at night time?"); // #222d5a
Streaming
Streaming provides a better user experience by returning results immediately, and it's as simple as passing `{stream: true}` as an option.
const stream = await LLM("the color of the sky is", { stream: true });
for await (const message of stream) {
process.stdout.write(message);
}
Sometimes it’s helpful to handle the stream in real-time and also process it once it’s all complete. For example, providing real-time streaming in chat, and then parsing out semantic code blocks at the end.
`LLM.js` makes this easy with an optional `stream_handler` option.
const colors = await LLM("what are the common colors of the sky as a flat json array?", {
    model: "gpt-4-turbo-preview",
    stream: true,
    stream_handler: (c) => process.stdout.write(c),
    parser: LLM.parsers.json,
});
// ["blue", "gray", "white", "orange", "red", "pink", "purple", "black"]
Instead of the stream being returned as a generator, it's passed to the `stream_handler`. The return value from `LLM.js` is the complete response, which can be parsed or handled as normal.
JSON
`LLM.js` supports JSON Schema for OpenAI and LLaMa. You can ask any LLM model for JSON, but using JSON Schema will enforce the output structure.
const schema = {
    "type": "object",
    "properties": {
        "colors": { "type": "array", "items": { "type": "string" } }
    }
}
const obj = await LLM("what are the 3 primary colors in JSON format?", { schema, temperature: 0.1, service: "openai" });
Different formats are used by different models (JSON Schema, BNFS), so `LLM.js` converts between them automatically. Note: JSON Schema can still produce invalid JSON, such as when the response exceeds `max_tokens`.
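One way to guard against truncation is to give the model a generous token budget and handle parse failures. A minimal sketch, assuming a failed parse surfaces as a thrown error:

// Sketch: defend against JSON truncated by max_tokens (assumes parse failures throw)
let obj = null;
try {
    obj = await LLM("what are the 3 primary colors in JSON format?", {
        schema,
        max_tokens: 512, // generous budget so the JSON isn't cut off mid-object
        service: "openai",
    });
} catch (err) {
    // retry with a larger max_tokens, or fall back to a plain-text prompt
    console.error("could not get valid JSON:", err.message);
}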
System Prompts
Create agents that specialize at specific tasks using `llm.system(input)`.
const llm = new LLM();
llm.system("You are a friendly chat bot.");
await llm.chat("what's the color of the sky in hex value?"); // Response: sky blue
await llm.chat("what about at night time?"); // Response: darker value (uses previous context to know we're asking for a color)
Note: OpenAI has suggested that system prompts may not be as effective as user prompts, which `LLM.js` supports with `llm.user(input)`.
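If you'd rather follow that advice, the same instruction can be delivered as a user message instead. A minimal sketch using `llm.user(input)`:

const llm = new LLM();
llm.user("You are a friendly chat bot."); // instruction sent as a user message
await llm.chat("what's the color of the sky in hex value?"); // Response: a blue hex value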
Message History
`LLM.js` supports simple string prompts, but also full message history. This is especially helpful for guiding LLMs in a more precise way.
await LLM([
    { role: "user", content: "remember the secret codeword is blue" },
    { role: "assistant", content: "OK I will remember" },
    { role: "user", content: "what is the secret codeword I just told you?" },
]); // Response: blue
The OpenAI message format is used, and converted on-the-fly for specific services that use a different format (like Google, Mixtral and LLaMa).
Switch LLMs
`LLM.js` supports most popular Large Language Models, including:
- OpenAI: `gpt-4`, `gpt-4-turbo-preview`, `gpt-3.5-turbo`
- Google: `gemini-1.0-pro`, `gemini-1.5-pro`, `gemini-pro-vision`
- Anthropic: `claude-3-sonnet`, `claude-3-haiku`, `claude-2.1`, `claude-instant-1.2`
- Groq: `mixtral-8x7b`, `llama2-70b`, `gemma-7b-it`
- Together: `llama-3-70b`, `llama-3-8b`, `nous-hermes-2`, …
- Mistral: `mistral-medium`, `mistral-small`, `mistral-tiny`
- llamafile: `LLaVa 1.5`, `Mistral-7B-Instruct`, `Mixtral-8x7B-Instruct`, `WizardCoder-Python-34B`, `TinyLlama-1.1B`, `Phi-2`, …
- Ollama: `Llama 2`, `Mistral`, `Code Llama`, `Gemma`, `Dolphin Phi`, …
`LLM.js` can guess the LLM provider based on the model, or you can specify it explicitly.
// defaults to Llamafile
await LLM("the color of the sky is");
// OpenAI
await LLM("the color of the sky is", { model: "gpt-4-turbo-preview" });
// Anthropic
await LLM("the color of the sky is", { model: "claude-2.1" });
// Mistral AI
await LLM("the color of the sky is", { model: "mistral-tiny" });
// Groq needs a specific service
await LLM("the color of the sky is", { service: "groq", model: "mixtral-8x7b-32768" });
// Google
await LLM("the color of the sky is", { model: "gemini-pro" });
// Ollama
await LLM("the color of the sky is", { model: "llama2:7b" });
// Together
await LLM("the color of the sky is", { service: "together", model: "meta-llama/Llama-3-70b-chat-hf" });
// Can optionally set service to be specific
await LLM("the color of the sky is", { service: "openai", model: "gpt-3.5-turbo" });
Being able to quickly switch between LLMs prevents you from getting locked in.
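For example, comparing providers is just a loop over model names (a sketch; the model list is illustrative):

// Sketch: run one prompt across several providers by switching the model option
const prompt = "the color of the sky is";
for (const model of ["gpt-4-turbo-preview", "claude-3-haiku", "mistral-tiny"]) {
    const response = await LLM(prompt, { model, max_tokens: 10 });
    console.log(model, response);
}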
Parsers
`LLM.js` ships with a few helpful parsers that work with every LLM. These are separate from the typical JSON formatting with `tool` and `schema` that some LLMs (like OpenAI's) support.
JSON Parsing
const colors = await LLM("Please return the primary colors in a JSON array", {
    parser: LLM.parsers.json
});
// ["red", "green", "blue"]
Markdown Code Block Parsing
const story = await LLM("Please return a story wrapped in a Markdown story code block", {
    parser: LLM.parsers.codeBlock("story")
});
// A long time ago...
XML Parsing
const code = await LLM("Please write a simple website, and put the code inside of a <WEBSITE></WEBSITE> xml tag", {
    parser: LLM.parsers.xml("WEBSITE")
});
// <html>....
Note: OpenAI works best with Markdown and JSON, while Anthropic works best with XML tags.
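Following that note, you might pair the XML parser with an Anthropic model. A sketch (prompt and model choice are illustrative):

// Sketch: Anthropic models respond well to XML-tagged prompts
const summary = await LLM("Summarize the sky in one sentence inside a <SUMMARY></SUMMARY> xml tag", {
    model: "claude-3-haiku",
    parser: LLM.parsers.xml("SUMMARY"),
});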
API
The `LLM.js` API provides a simple interface to dozens of Large Language Models.
new LLM(input, {          // Input can be string or message history array
    service: "openai",    // LLM service provider
    model: "gpt-4",       // Specific model
    max_tokens: 100,      // Maximum response length
    temperature: 1.0,     // "Creativity" of model
    seed: 1000,           // Stable starting point
    stream: false,        // Respond in real-time
    stream_handler: null, // Optional function to handle stream
    schema: { ... },      // JSON Schema
    tool: { ... },        // Tool selection
    parser: null,         // Content parser
});
The same API is supported in the short-hand interface of `LLM.js`, calling it as a function:
await LLM(input, options);
Input (required)
- `input` `<string>` or `<array>`: Prompt for the LLM. Can be a text string or an array of objects in `Message History` format.
Options
All config parameters are optional. Some config options are only available on certain models, and are specified below.
- `service` `<string>`: LLM service to use. Default is `llamafile`.
- `model` `<string>`: Explicit LLM to use. Defaults to the `service` default model.
- `max_tokens` `<int>`: Maximum token response length. No default.
- `temperature` `<float>`: "Creativity" of a model. `0` typically gives more deterministic results, and higher values (`1` and above) give less deterministic results. No default.
- `seed` `<int>`: Get more deterministic results. No default. Supported by `openai`, `llamafile` and `mistral`.
- `stream` `<bool>`: Return results immediately instead of waiting for the full response. Default is `false`.
- `stream_handler` `<function>`: Optional function that is called when a stream receives new content. The function is passed the string chunk.
- `schema` `<object>`: JSON Schema object for steering the LLM to generate JSON. No default. Supported by `openai` and `llamafile`.
- `tool` `<object>`: Instruct the LLM to use a tool, useful for more explicit JSON Schema and building dynamic apps. No default. Supported by `openai`.
- `parser` `<function>`: Handle formatting and structure of the returned content. No default.
Public Variables
- `messages` `<array>`: Array of message history, managed by `LLM.js` but can be referenced and changed.
- `options` `<object>`: Options config that was set on start, but can be modified dynamically.
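A short sketch of reading and adjusting both public variables:

const llm = new LLM();
llm.system("You are a friendly chat bot.");
console.log(llm.messages); // [{ role: "system", content: "You are a friendly chat bot." }]
llm.options.temperature = 0; // tweak options dynamically between calls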
Methods
async send(options=<object>)
Sends the current `Message History` to the current LLM with the specified `options`. These local options will override the global default options. The response will be automatically added to `Message History`.
await llm.send(options);
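For example, you can queue up history manually and then send it with local overrides (a sketch):

// Sketch: build history with user(), then send() with options that override the defaults
const llm = new LLM();
llm.user("the color of the sky is");
const response = await llm.send({ temperature: 0, max_tokens: 5 }); // blue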
async chat(input=<string>, options=<object>)
Adds the `input` to the current `Message History` and calls `send` with the current override `options`. Returns the response directly to the user, while updating `Message History`.
const response = await llm.chat("hello");
console.log(response); // hi
abort()
Aborts an ongoing stream. Throws an `AbortError`.
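A sketch of cancelling a stream mid-flight, assuming the `AbortError` surfaces from the `for await` loop:

// Sketch: abort a stream after one second and catch the resulting AbortError
const llm = new LLM();
const stream = await llm.chat("tell me a very long story", { stream: true });
setTimeout(() => llm.abort(), 1000);
try {
    for await (const chunk of stream) process.stdout.write(chunk);
} catch (err) {
    if (err.name !== "AbortError") throw err; // rethrow anything unexpected
    console.log("\n[stream aborted]");
}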
user(input=<string>)
Adds a message from `user` to `Message History`.
llm.user("My favorite color is blue. Remember that");
system(input=<string>)
Adds a message from `system` to `Message History`. This is typically the first message.
llm.system("You are a friendly AI chat bot...");
assistant(input=<string>)
Adds a message from `assistant` to `Message History`. This is typically a response from the AI, or a way to steer a future response.
llm.user("My favorite color is blue. Remember that");
llm.assistant("OK, I will remember your favorite color is blue.");
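Seeding history this way steers the next reply. Continuing the example (a sketch):

await llm.chat("what's my favorite color?"); // Response: blue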
Static Variables
- `LLAMAFILE` `<string>`: `llamafile`
- `OPENAI` `<string>`: `openai`
- `ANTHROPIC` `<string>`: `anthropic`
- `MISTRAL` `<string>`: `mistral`
- `GOOGLE` `<string>`: `google`
- `MODELDEPLOYER` `<string>`: `modeldeployer`
- `OLLAMA` `<string>`: `ollama`
- `TOGETHER` `<string>`: `together`
- `parsers` `<object>`: List of default `LLM.js` parsers
  - `codeBlock(<blockType>)(<content>)` `<function>`: Parses out a Markdown code block
  - `json(<content>)` `<function>`: Parses out overall JSON or a Markdown JSON code block
  - `xml(<tag>)(<content>)` `<function>`: Parses the XML tag out of the response content
Static Methods
serviceForModel(model)
Returns the LLM `service` for a particular model.
LLM.serviceForModel("gpt-4-turbo-preview"); // openai
modelForService(service)
Returns the default LLM for a `service`.
LLM.modelForService("openai"); // gpt-4-turbo-preview
LLM.modelForService(LLM.OPENAI); // gpt-4-turbo-preview
Response
`LLM.js` returns results from `llm.send()` and `llm.chat()`, typically the string content from the LLM completing your prompt.
await LLM("hello"); // "hi"
But when you use `schema` and `tools`, `LLM.js` will typically return a JSON object.
const tool = {
    "name": "generate_primary_colors",
    "description": "Generates the primary colors",
    "parameters": {
        "type": "object",
        "properties": {
            "colors": {
                "type": "array",
                "items": { "type": "string" }
            }
        },
        "required": ["colors"]
    }
};

await LLM("what are the 3 primary colors in physics?", { tool });
// { colors: ["red", "green", "blue"] }

await LLM("what are the 3 primary colors in painting?", { tool });
// { colors: ["red", "yellow", "blue"] }
And by passing `{stream: true}` in `options`, `LLM.js` will return a generator and start yielding results immediately.
const stream = await LLM("Once upon a time", { stream: true });
for await (const message of stream) {
process.stdout.write(message);
}
The response is based on what you ask the LLM to do, and `LLM.js` always tries to do the obviously right thing.
Message History
The `Message History` API in `LLM.js` is exactly the same as the OpenAI message history format.
await LLM([
    { role: "user", content: "remember the secret codeword is blue" },
    { role: "assistant", content: "OK I will remember" },
    { role: "user", content: "what is the secret codeword I just told you?" },
]); // Response: blue
Options
- `role` `<string>`: Who is saying the `content`? `user`, `system`, or `assistant`
- `content` `<string>`: Text content of the message
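A sketch with all three roles in a single history:

await LLM([
    { role: "system", content: "You are a terse assistant." },
    { role: "user", content: "remember the secret codeword is blue" },
    { role: "assistant", content: "OK" },
    { role: "user", content: "what is the codeword?" },
]); // Response: blue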
LLM Command
`LLM.js` provides a useful `llm` command for your shell. `llm` is a convenient way to call dozens of LLMs and access the full power of `LLM.js` without programming.
Access it globally by installing from NPM:
npm install @themaximalist/llm.js -g
Then you can call the `llm` command from anywhere in your terminal.
> llm the color of the sky is
blue
Messages are streamed back in real time, so everything is really fast.
You can also initiate a `--chat` to remember message history and continue your conversation (`Ctrl-C` to quit).
> llm remember the codeword is blue. say ok if you understand --chat
OK, I understand.
> what is the codeword?
The codeword is blue.
Or easily change the LLM on the fly:
> llm the color of the sky is --model claude-v2
blue
See help with `llm --help`:
Usage: llm [options] [input]
Large Language Model library for OpenAI, Google, Anthropic, Mistral, Groq and LLaMa
Arguments:
input Input to send to LLM service
Options:
-V, --version output the version number
-m, --model <model> Completion Model (default: llamafile)
-s, --system <prompt> System prompt (default: "I am a friendly accurate English speaking chat bot")
-t, --temperature <number> Model temperature (default: 0.8)
-c, --chat Chat Mode
-h, --help display help for command
Debug
`LLM.js` and `llm` use the `debug` npm module with the `llm.js` namespace. View debug logs by setting the `DEBUG` environment variable.
> DEBUG=llm.js* llm the color of the sky is
# debug logs
blue
> export DEBUG=llm.js*
> llm the color of the sky is
# debug logs
blue
Examples
`LLM.js` has lots of tests which can serve as a guide for seeing how it's used.
Deploy
Using LLMs in production can be tricky because of tracking history, rate limiting, managing API keys and figuring out how to charge.
Model Deployer is an API in front of `LLM.js` that handles all of these details and more:
- Message History — keep track of what you’re sending to LLM providers
- Rate Limiting — ensure a single user doesn’t run up your bill
- API Keys — create a free faucet for first time users to have a great experience
- Usage — track and compare costs across all LLM providers
Using it is simple: specify `modeldeployer` as the `service` and your API key from Model Deployer as the `model`.
await LLM("hello world", { service: "modeldeployer", model: "api-key" });
You can also set up specific settings and optionally override some on the client.
await LLM("the color of the sky is usually", {
    service: "modeldeployer",
    model: "api-key",
    endpoint: "https://example.com/api/v1/chat",
    max_tokens: 1,
    temperature: 0,
});
`LLM.js` can be used without Model Deployer, but if you're deploying LLMs to production it's a great way to manage them.
Changelog
`LLM.js` has been under heavy development while LLMs are rapidly changing. We've started to settle on a stable interface, and will document changes here.
- 04/24/2024 — v0.6.6 — Added browser support
- 04/18/2024 — v0.6.5 — Added Llama 3 and Together
- 03/25/2024 — v0.6.4 — Added Groq and abort()
- 03/17/2024 — v0.6.3 — Added JSON/XML/Markdown parsers and a stream handler
- 03/15/2024 — v0.6.2 — Fix bug with Google streaming
- 03/15/2024 — v0.6.1 — Fix bug to not add empty responses
- 03/04/2024 — v0.6.0 — Added Anthropic Claude 3
- 03/02/2024 — v0.5.9 — Added Ollama
- 02/15/2024 — v0.5.4 — Added Google Gemini
- 02/13/2024 — v0.5.3 — Added Mistral
- 01/15/2024 — v0.5.0 — Created website
- 01/12/2024 — v0.4.7 — OpenAI Tools, JSON stream
- 01/07/2024 — v0.3.5 — Added ModelDeployer
- 01/05/2024 — v0.3.2 — Added Llamafile
- 04/26/2023 — v0.2.5 — Added Anthropic, CLI
- 04/24/2023 — v0.2.4 — Chat options
- 04/23/2023 — v0.2.2 — Unified LLM() interface, streaming
- 04/22/2023 — v0.1.2 — Docs, system prompt
- 04/21/2023 — v0.0.1 — Created LLM.js with OpenAI support
Projects
`LLM.js` is currently used in the following projects:
- AI.js — simple AI library
- Infinity Arcade — play any text adventure game
- News Score — score and sort the news
- Images Bot — image explorer
- Model Deployer — deploy AI models in production
- HyperType — knowledge graph toolkit
- HyperTyper — multidimensional mind mapping
License
MIT
Author
Created by The Maximalist; see our open-source projects.