Build an AI Coding Assistant with Code Interpreter

Persona: You're building a Cursor or Replit-like coding assistant — users describe what they want in natural language, your LLM generates the code, and E2B executes it safely in an isolated sandbox. Output streams back to the user in real time.

Skills used: e2b-code-interpreter

What You're Building

A conversational coding assistant that:

Accepts a natural language coding task from the user
Sends the task to an LLM to generate code
Executes the generated code in an E2B sandbox
Streams stdout/stderr back to the user as it runs
Detects errors and retries with auto-correction (up to 3 attempts)
Maintains session state — variables defined in one message persist in the next

Prerequisites

bash

npm install @e2b/code-interpreter @anthropic-ai/sdk

bash

export E2B_API_KEY=e2b_your_key
export ANTHROPIC_API_KEY=sk-ant-your_key

Step 1: Initialize the sandbox and LLM client

Create one sandbox per user session. Keep it alive for the full conversation so variables and imports persist between turns.

typescript

import Anthropic from '@anthropic-ai/sdk'
import { Sandbox } from '@e2b/code-interpreter'

const anthropic = new Anthropic()

// One sandbox per session — reused across multiple messages
const sandbox = await Sandbox.create({
  timeoutMs: 30 * 60 * 1000, // 30 minutes per session
})

console.log('Sandbox ready:', sandbox.sandboxId)

Step 2: Generate code from natural language

Ask the LLM to write Python code for the user's task. Use a system prompt that enforces clean, executable output.

typescript

async function generateCode(
  task: string,
  history: Array<{ role: 'user' | 'assistant'; content: string }>,
  errorContext?: string
): Promise<string> {
  const systemPrompt = `You are an expert Python developer.
The user will describe a coding task. Write clean, working Python code to accomplish it.
Rules:
- Output ONLY the Python code. No markdown fences, no explanation.
- Use print() to show results — they will be streamed to the user.
- You may import any standard library or common packages (pandas, numpy, matplotlib, requests).
- Keep code concise and readable.
${errorContext ? `\nThe previous attempt failed with this error:\n${errorContext}\nFix the code.` : ''}`

  const messages = [
    ...history,
    { role: 'user' as const, content: task },
  ]

  const response = await anthropic.messages.create({
    model: 'claude-opus-4-5',
    max_tokens: 2048,
    system: systemPrompt,
    messages,
  })

  const code = response.content
    .filter(block => block.type === 'text')
    .map(block => (block as { type: 'text'; text: string }).text)
    .join('')
    .trim()

  return code
}
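
Models sometimes wrap their answer in a Markdown fence even when the system prompt forbids it, and a fenced answer would fail in the sandbox with a syntax error. A minimal sketch of a guard that could be applied to the value returned by `generateCode`; `stripCodeFences` is a hypothetical helper, not part of either SDK, and it only handles a single fence around the whole answer.

```typescript
// Hypothetical helper: remove one surrounding Markdown fence, if present.
// Input without a surrounding fence is returned trimmed but otherwise unchanged.
function stripCodeFences(raw: string): string {
  const text = raw.trim()
  // Opening fence with an optional language tag, the body, then a closing fence
  const match = text.match(/^```[\w+-]*[ \t]*\r?\n([\s\S]*?)\r?\n?```$/)
  return match ? match[1].trim() : text
}

// Example: both calls yield the bare Python statement
const fenced = stripCodeFences('```python\nprint("hi")\n```')
const plain = stripCodeFences('print("hi")')
console.log(fenced === plain) // true
```

One way to use it is to return `stripCodeFences(code)` at the end of `generateCode`, so the rest of the pipeline always sees bare Python.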

Step 3: Execute code with streaming output

Run the generated code in the sandbox. Stream every line of stdout/stderr to the user as it arrives.

typescript

interface ExecutionResult {
  success: boolean
  output: string
  error?: string
  richOutputs?: unknown[]
}

async function executeCode(
  sandbox: Sandbox,
  code: string,
  onOutput: (line: string) => void
): Promise<ExecutionResult> {
  const outputLines: string[] = []
  let errorMessage: string | undefined

  const result = await sandbox.runCode(code, {
    onStdout: (output) => {
      outputLines.push(output.line)
      onOutput(`  ${output.line}`)
    },
    onStderr: (output) => {
      outputLines.push(`[stderr] ${output.line}`)
      onOutput(`  ⚠️  ${output.line}`)
    },
  })

  if (result.error) {
    errorMessage = `${result.error.name}: ${result.error.value}`
    if (result.error.traceback) {
      errorMessage += `\n${result.error.traceback}`
    }
  }

  return {
    success: !result.error,
    output: outputLines.join('\n'),
    error: errorMessage,
    richOutputs: result.results,
  }
}
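
`executeCode` returns `result.results` as `richOutputs` but nothing consumes them yet. The sketch below shows one way charts might be pulled out of that array. It assumes each rich result exposes an optional base64-encoded `png` string; check the SDK's own result type before relying on that. `RichResult`, `decodeBase64`, and `collectPngs` are hypothetical names introduced for illustration.

```typescript
// Structural type covering only the field this sketch reads.
// Assumption: rich results carry an optional base64-encoded `png` string.
interface RichResult {
  png?: string
}

// Decode a base64 string into raw bytes using the global atob()
function decodeBase64(b64: string): Uint8Array {
  const binary = atob(b64)
  return Uint8Array.from(binary, ch => ch.charCodeAt(0))
}

// Hypothetical helper: return the decoded bytes of every PNG in the rich outputs
function collectPngs(results: RichResult[]): Uint8Array[] {
  const images: Uint8Array[] = []
  for (const r of results) {
    if (typeof r.png === 'string' && r.png.length > 0) {
      images.push(decodeBase64(r.png))
    }
  }
  return images
}
```

The returned byte arrays could then be written to disk or sent to the chat UI alongside the streamed text.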

Step 4: Auto-retry on error

If execution fails, send the error back to the LLM and ask it to fix the code. Allow up to 3 attempts in total (the first run plus two retries).

typescript

const MAX_RETRIES = 3

async function generateAndRun(
  sandbox: Sandbox,
  task: string,
  history: Array<{ role: 'user' | 'assistant'; content: string }>,
  onOutput: (line: string) => void
): Promise<{ code: string; result: ExecutionResult }> {
  let code = ''
  let result: ExecutionResult | null = null
  let lastError: string | undefined

  for (let attempt = 1; attempt <= MAX_RETRIES; attempt++) {
    if (attempt > 1) {
      onOutput(`\n🔄 Attempt ${attempt}/${MAX_RETRIES} — fixing error...\n`)
    }

    // Generate (or regenerate) code
    code = await generateCode(task, history, lastError)
    onOutput(`\n📝 Generated code:\n\`\`\`python\n${code}\n\`\`\`\n\n🚀 Running...\n`)

    // Execute
    result = await executeCode(sandbox, code, onOutput)

    if (result.success) {
      onOutput('\n✅ Done.\n')
      return { code, result }
    }

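    // Include the failing code so the model can see what it is being asked to fix
    lastError = `${result.error}\n\nThe code that failed:\n${code}`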
    onOutput(`\n❌ Error: ${result.error}\n`)
  }

  onOutput(`\n🛑 Failed after ${MAX_RETRIES} attempts.\n`)
  return { code, result: result! }
}
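
Tracebacks from deep call stacks can be long, and the whole error string is pasted into the system prompt on each retry. Python prints the exception type and message at the end of a traceback, so keeping only the tail usually preserves the most useful part. A minimal sketch, assuming a hypothetical helper `tailOfError` and an arbitrary default of 20 lines:

```typescript
// Hypothetical helper: keep only the last `maxLines` lines of an error message.
// The 20-line default is an arbitrary choice for illustration.
function tailOfError(error: string, maxLines = 20): string {
  const limit = Math.max(1, Math.floor(maxLines))
  const lines = error.split('\n')
  if (lines.length <= limit) return error

  const omitted = lines.length - limit
  // Note how much was dropped so the model knows the traceback is partial
  return [`... (${omitted} earlier lines omitted)`, ...lines.slice(-limit)].join('\n')
}

console.log(tailOfError('frame 1\nframe 2\nframe 3\nValueError: bad input', 2))
```

The shortened string could be what gets passed to `generateCode` as `errorContext`, while the full error is still shown to the user.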

Step 5: Multi-turn session loop

Run the full conversation loop. The LLM conversation history grows with each turn; the sandbox preserves Python state.

typescript

async function runCodingAssistant() {
  const sandbox = await Sandbox.create({ timeoutMs: 30 * 60 * 1000 })

  // LLM conversation history
  const history: Array<{ role: 'user' | 'assistant'; content: string }> = []

  console.log('🤖 AI Coding Assistant ready. Type a task or "exit" to quit.\n')

  // Simple REPL loop (replace with your chat UI)
  const readline = await import('readline')
  const rl = readline.createInterface({ input: process.stdin, output: process.stdout })

  const askQuestion = (prompt: string) =>
    new Promise<string>(resolve => rl.question(prompt, resolve))

  try {
    while (true) {
      const userInput = (await askQuestion('You: ')).trim()

      if (userInput.toLowerCase() === 'exit') break
      if (!userInput) continue

      console.log('\nAssistant:')

      const { code, result } = await generateAndRun(
        sandbox,
        userInput,
        history,
        (line) => process.stdout.write(line + '\n')
      )

      // Add this turn to LLM history
      history.push({ role: 'user', content: userInput })
      history.push({
        role: 'assistant',
        content: result.success
          ? `I ran this code:\n\`\`\`python\n${code}\n\`\`\`\nOutput:\n${result.output}`
          : `I tried to run this code but it failed:\n\`\`\`python\n${code}\n\`\`\`\nError: ${result.error}`,
      })

      console.log()
    }
  } finally {
    // Clean up even if a turn throws, so the sandbox is not left running until its timeout
    rl.close()
    await sandbox.kill()
    console.log('Session ended. Sandbox destroyed.')
  }
}

runCodingAssistant().catch(console.error)
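
Because the history grows with every turn and each assistant entry embeds code plus output, a long session will eventually push the prompt toward the model's context limit. A minimal sketch of capping the history to the most recent turns, assuming (as in the loop above) that messages are always appended in user/assistant pairs; `ChatMessage` and `trimHistory` are hypothetical names.

```typescript
type ChatMessage = { role: 'user' | 'assistant'; content: string }

// Hypothetical helper: keep only the last `maxTurns` user/assistant pairs.
// Assumes messages were appended in pairs, so the result still starts with a user message.
function trimHistory(history: ChatMessage[], maxTurns: number): ChatMessage[] {
  const keep = Math.max(0, Math.floor(maxTurns)) * 2
  if (keep === 0) return []
  if (history.length <= keep) return history.slice()
  return history.slice(-keep)
}

// Example: three turns recorded, only the last two are kept
const demoHistory: ChatMessage[] = [
  { role: 'user', content: 'u1' }, { role: 'assistant', content: 'a1' },
  { role: 'user', content: 'u2' }, { role: 'assistant', content: 'a2' },
  { role: 'user', content: 'u3' }, { role: 'assistant', content: 'a3' },
]
console.log(trimHistory(demoHistory, 2).map(m => m.content).join(','))
```

Trimming only affects what the LLM sees. The sandbox still holds every variable from earlier turns, so the model may need a reminder of names it can no longer see in the history.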

Step 6: Example session

You: Create a list of the first 20 fibonacci numbers and show their statistics

A:

📝 Generated code:
```python
fib = [0, 1]
for _ in range(18):
    fib.append(fib[-1] + fib[-2])

import statistics
print("Fibonacci numbers:", fib)
print("Min:", min(fib))
print("Max:", max(fib))
print("Mean:", round(statistics.mean(fib), 2))
print("Median:", statistics.median(fib))
```

🚀 Running...

  Fibonacci numbers: [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181]
  Min: 0
  Max: 4181
  Mean: 547.25
  Median: 44.5

✅ Done.

You: Now plot them as a bar chart and save it as fibonacci.png

A:

📝 Generated code:
```python
import matplotlib.pyplot as plt

fib = [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181]
plt.figure(figsize=(12, 5))
plt.bar(range(len(fib)), fib, color="steelblue")
plt.title("First 20 Fibonacci Numbers")
plt.xlabel("Index")
plt.ylabel("Value")
plt.tight_layout()
plt.savefig("fibonacci.png", dpi=150)
print("Chart saved to fibonacci.png")
```

🚀 Running...

  Chart saved to fibonacci.png

✅ Done.


---

## Key Design Decisions

- **One sandbox per session**: All variables, imports, and files created in earlier turns are available in later turns — just like a Jupyter notebook.
- **Streaming first**: Users see output as it prints, not after the full run completes. This is critical for long-running tasks.
- **Auto-retry loop**: Many errors in LLM-generated code can be fixed by sending the traceback back to the model. Three attempts in total resolve most of them without user intervention.
- **History in the LLM context**: Including previous code and output in the conversation history lets the assistant reference earlier results (e.g., "now plot them" without re-explaining what "them" is).

---

## Production Considerations

- **Sandbox per user session**: Create one sandbox per user, store its ID in your session/database, and reconnect with `Sandbox.connect(sandboxId)` across HTTP requests.
- **Idle timeout**: Set `timeoutMs` to match your session timeout. Extend with `sandbox.setTimeout()` on activity.
- **File downloads**: Use `sandbox.files.read(path)` to retrieve generated files (charts, CSVs) and serve them to users.
- **Language selection**: Pass `{ language: "javascript" }` or `{ language: "bash" }` to `runCode()` for non-Python execution.
- **Resource limits**: E2B sandboxes have CPU/memory limits. For heavy computation, consider Modal (see [modal-labs skill](../skills/modal-labs/SKILL.md)) instead.
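
The first bullet above implies a small piece of bookkeeping: for each incoming request, decide whether to create a sandbox or reconnect to an existing one. A minimal sketch of that decision, kept free of SDK calls so the logic is easy to follow. `SandboxRegistry` and `SandboxPlan` are hypothetical names, and the in-memory `Map` stands in for whatever session store or database the application already uses.

```typescript
// What the request handler should do for a given user session
type SandboxPlan =
  | { action: 'create' }
  | { action: 'connect'; sandboxId: string }

// Hypothetical registry mapping session IDs to sandbox IDs.
// The Map is an assumption for illustration; production code would persist this.
class SandboxRegistry {
  private readonly bySession = new Map<string, string>()

  // Reconnect if a sandbox ID is on record, otherwise create a new sandbox
  plan(sessionId: string): SandboxPlan {
    const sandboxId = this.bySession.get(sessionId)
    return sandboxId ? { action: 'connect', sandboxId } : { action: 'create' }
  }

  // Record the ID of a newly created sandbox
  remember(sessionId: string, sandboxId: string): void {
    this.bySession.set(sessionId, sandboxId)
  }

  // Drop a stale entry, for example after the sandbox timed out or was killed
  forget(sessionId: string): void {
    this.bySession.delete(sessionId)
  }
}
```

A handler would call `plan(sessionId)`, then either `Sandbox.create()` followed by `remember(sessionId, sandbox.sandboxId)`, or `Sandbox.connect(plan.sandboxId)`. If reconnecting fails because the sandbox has already expired, calling `forget(sessionId)` and creating a fresh sandbox is one reasonable fallback, at the cost of losing the earlier Python state.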
