v0.1 — Experimental Release
This is pre-production software. It has been tested (669 tests passing) but has not been battle-tested in production environments. Do not use blindly in production. Verify any behavior you depend on with your own eyes. We tried our best, but we make no guarantees that everything works correctly. APIs may change without notice.
Built by AI. This codebase was written and maintained primarily by AI agents (Claude and Codex), with human direction and review. We believe it to be useful, safe, and secure — but verify for yourself. Issues and pull requests are welcome.
Full-parity TypeScript port of DSPy v3.1.3 — the framework for programming, not prompting, language models.
- Signature — typed input/output field declarations with automatic parsing
- Example / Prediction — structured data containers for demos and outputs
- Module / BaseModule — composable program units with `namedPredictors()`, `deepCopy()`, and `dumpState()`/`loadState()`
- LM — abstract language model class with caching, history, usage tracking, and streaming hooks
- Evaluate — batch evaluation framework with metrics and parallel execution
- Predict — core prediction module with demos and temperature auto-adjustment (n > 1 with temperature ≤ 0.15 is raised to 0.7); see the usage sketch below
- ChainOfThought — adds a `reasoning` field before the output
- BestOfN — generate N completions, pick best by reward function
- MultiChainComparison — generate multiple reasoning chains, compare them
- Refine — iterative self-refinement with feedback
- ReAct — reasoning + action loops with tool use
- ProgramOfThought — generate and execute code to produce answers
- CodeAct — agentic code execution with REPL state
- RLM — retrieval-augmented language model
- Parallel — run multiple predict calls concurrently
- Retrieve — pluggable retrieval module with global retriever config
- Tool / ToolCall / ToolCalls — structured tool definitions and invocations
- ChatAdapter — formats signatures as chat messages with `[[ ## field ## ]]` markers
- JSONAdapter — JSON-formatted output parsing
- XMLAdapter — XML-formatted output parsing
- TwoStepAdapter — two-pass extraction (natural language → structured)
Image, Audio, DSPyFile, History, Code, Reasoning, Document, Citation, Citations — rich types for multimodal and structured content with native response type extraction
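For orientation, here is a minimal sketch of how Example, Predict, and a string signature fit together. The string-signature constructor and `.call()` follow the quick start further below; the `demos` property and the `Example` constructor shape are assumptions modeled on Python DSPy, not confirmed API of this port:

```ts
import { Predict, Example } from "dspy-ts";

// Assumes an LM has already been configured via configure({ lm }),
// as shown in the quick start below.
// Illustrative only: assumes Predict accepts a string signature and that
// few-shot demos are Example instances attached to the predictor.
const qa = new Predict("question -> answer");

qa.demos = [
  new Example({ question: "What is 2 + 2?", answer: "4" }),
  new Example({ question: "What color is the sky?", answer: "Blue" }),
];

const prediction = await qa.call({ question: "Who wrote Hamlet?" });
console.log(prediction.answer);
```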
| Optimizer | Description |
|---|---|
| LabeledFewShot | Select demos from labeled examples |
| BootstrapFewShot | Generate demos via bootstrapped execution |
| BootstrapFewShotWithRandomSearch | Bootstrap + random search over demo sets |
| BootstrapFewShotWithOptuna | Bootstrap backed by Optuna-style trial search |
| COPRO | Collaborative Prompt Optimization — LM-proposed instructions |
| MIPROv2 | Multi-prompt Instruction Proposal Optimizer with TPE + minibatch eval |
| SIMBA | Softmax selection, Poisson demo dropping, rule generation |
| GEPA | Reflective prompt evolution via Genetic-Pareto candidate search |
| KNNFewShot | k-nearest-neighbor demo selection at inference time |
| InferRules | Extract reusable rules from execution traces |
| Ensemble | Combine multiple program variants |
| AvatarOptimizer | Persona-based optimization |
| BootstrapFinetune | Bootstrap training data then finetune the LM |
| GRPO | Group Relative Policy Optimization (RL-based weight training) |
| BetterTogether | Joint optimization of prompts and finetunes |
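A hedged sketch of the typical optimizer workflow, assuming the TS port mirrors Python DSPy's compile pattern. The constructor options, metric signature, and `compile` call shape here are assumptions, not confirmed API:

```ts
import { BootstrapFewShot, ChainOfThought, Example } from "dspy-ts";

// Labeled data to bootstrap demos from.
const trainset = [
  new Example({ question: "Largest planet in the solar system?", answer: "Jupiter" }),
  new Example({ question: "Capital of France?", answer: "Paris" }),
];

// A metric scores a (gold example, prediction) pair; the exact signature is assumed.
const exactMatch = (example: any, prediction: any) =>
  prediction.answer?.trim() === example.answer ? 1 : 0;

const program = new ChainOfThought("question -> answer");

// Assumed to mirror Python DSPy: run the program over the trainset, keep traces
// that pass the metric, and install them as few-shot demos on the program.
const optimizer = new BootstrapFewShot({ metric: exactMatch });
const optimized = await optimizer.compile(program, { trainset });
```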
- TPE — Tree-structured Parzen Estimator for Bayesian optimization (used by MIPROv2)
- KNN — k-nearest-neighbor retriever with pluggable embedding
- Cache — disk-backed caching with TTL and eviction
- Embedder — embedding client with caching support
- Callbacks — full observability system (module start/end, LM start/end, adapter format/parse)
- Streaming — `streamify()` wrapper, `StreamListener`, `StatusMessage` types
- Dataset — data loading utilities
- UsageTracker — token usage tracking across LM calls
- ParallelExecutor — concurrent execution with configurable limits
- Provider / TrainingJob / ReinforceJob — finetuning infrastructure
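As a sketch of batch evaluation, the snippet below scores a program against a dev set. The options-object constructor (`devset`, `metric`, `numThreads`) and the `.call()` invocation are assumptions modeled on Python DSPy's `dspy.Evaluate`, not confirmed API:

```ts
import { Evaluate, ChainOfThought, Example } from "dspy-ts";

// Assumes an LM has been configured via configure({ lm }).
const devset = [
  new Example({ question: "H2O is commonly called?", answer: "water" }),
  new Example({ question: "Largest ocean?", answer: "Pacific" }),
];

// Simple containment metric; the exact metric signature is assumed.
const metric = (example: any, prediction: any) =>
  prediction.answer?.toLowerCase().includes(example.answer.toLowerCase()) ? 1 : 0;

const program = new ChainOfThought("question -> answer");

// Assumed options: held-out examples, a metric, and a parallelism limit.
const evaluate = new Evaluate({ devset, metric, numThreads: 4 });
const score = await evaluate.call(program);
console.log(`Average metric: ${score}`);
```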
```bash
bun install
```

```ts
import { LM, configure, ChainOfThought, type LMConfig, type LMResponse } from "dspy-ts";

// Implement the LM abstract class for your provider
class MyLM extends LM {
  async forward(
    messages: Array<{ role: string; content: string }>,
    config: LMConfig
  ): Promise<LMResponse[]> {
    // Call your LLM provider here
    const response = await callMyProvider(messages, config);
    return [{ text: response.text, usage: response.usage }];
  }
}

const lm = new MyLM({ model: "my-model", temperature: 0.7 });
configure({ lm });

const cot = new ChainOfThought("question -> answer");
const result = await cot.call({ question: "What is DSPy?" });
console.log(result.answer);
```

Extend the `LM` abstract class and implement `forward()`:
```ts
import { LM, type LMConfig, type LMResponse } from "dspy-ts";

class CustomLM extends LM {
  async forward(
    messages: Array<{ role: string; content: string }>,
    config: LMConfig
  ): Promise<LMResponse[]> {
    // config.temperature, config.maxTokens, config.n are available
    // Return an array of LMResponse objects (length = config.n)
    return [{ text: "response text", usage: { promptTokens: 10, completionTokens: 20 } }];
  }
}
```

The base class handles caching, history tracking, usage aggregation, retry logic, and streaming hooks automatically.
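To illustrate what the base class tracks, the sketch below inspects call history after a prediction. The `history` property and the shape of its entries are assumptions based on Python DSPy's `lm.history`; only the fact that the LM tracks history and usage is confirmed above:

```ts
import { configure, ChainOfThought } from "dspy-ts";

// Reuses the CustomLM class defined in the previous example.
const lm = new CustomLM({ model: "my-model" });
configure({ lm });

const cot = new ChainOfThought("question -> answer");
await cot.call({ question: "What is a signature?" });

// Assumption: the base LM records each call (prompt, response, token usage),
// analogous to lm.history in Python DSPy.
console.log(lm.history.length);
console.log(lm.history[lm.history.length - 1]);
```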
Observe every level of execution:
```ts
import { BaseCallback, setGlobalCallbacks, ChainOfThought } from "dspy-ts";

class LoggingCallback extends BaseCallback {
  onModuleStart(callId: string, instance: unknown, inputs: Record<string, unknown>) {
    console.log(`[${(instance as any).constructor.name}] start:`, inputs);
  }
  onModuleEnd(callId: string, outputs: unknown, exception?: Error) {
    console.log(" done:", exception ? `error: ${exception.message}` : outputs);
  }
  onLmStart(callId: string, instance: unknown, inputs: Record<string, unknown>) {
    console.log(" LM call");
  }
  onLmEnd(callId: string, outputs: unknown, exception?: Error) {
    console.log(" LM done");
  }
}

setGlobalCallbacks([new LoggingCallback()]);
```

```bash
bun test          # run all 669 tests
bun test tests/   # run specific directory
```

DSPy programs are built from composable modules, each containing one or more Predict instances. Signatures define typed input/output contracts. Adapters format signatures into LM prompts and parse responses. Optimizers search the space of instructions and demos to maximize a metric.
```
Module (ChainOfThought, ReAct, etc.)
└── Predict (core prediction unit)
    ├── Signature (typed I/O contract)
    ├── Adapter (prompt formatting / response parsing)
    └── LM (language model backend)
```
Optimizers wrap this pipeline: they generate candidate prompts (instructions + demos), evaluate them against a metric, and select the best configuration.
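To make the composition concrete, here is a hypothetical two-stage module. It assumes custom modules extend `Module` and implement `forward()`, mirroring the `LM` subclassing pattern shown earlier; the exact base-class contract is an assumption:

```ts
import { Module, ChainOfThought } from "dspy-ts";

// Hypothetical composite module: draft an answer, then refine it.
// Assumes Module subclasses implement forward() and are invoked via call().
class DraftThenRefine extends Module {
  draft = new ChainOfThought("question -> answer");
  refine = new ChainOfThought("question, draft -> answer");

  async forward(inputs: { question: string }) {
    const first = await this.draft.call({ question: inputs.question });
    return this.refine.call({ question: inputs.question, draft: first.answer });
  }
}

const program = new DraftThenRefine();
const result = await program.call({ question: "Why do optimizers need a metric?" });
console.log(result.answer);
```

Because the sub-predictors are discoverable via `namedPredictors()`, a composite module like this can be optimized as a whole: optimizers tune the instructions and demos of each inner Predict against the chosen metric.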
MIT