Skip to main content

The Launch

Today marks the release of Olly Chat MVP—the first web-based interface for Olly, our local AI coding assistant. Developers can now interact with Olly directly from their browser, without needing to use the terminal.

What Is Olly?

Olly is a local-first coding LLM sidekick that helps developers:
  • Read and understand codebases
  • Search for patterns across files
  • List symbols and function definitions
  • View git status, diffs, and history
All powered by Qwen2.5-Coder-7B running locally via llama-cpp-python.

What’s New

Web Chat Interface

  • Real-time message display with WebSocket streaming
  • Tool execution cards showing exactly what Olly is doing
  • Agent status indicator (thinking, executing, idle)
  • Message history persistence

HTTP API

  • RESTful endpoints for integration
  • WebSocket for real-time updates
  • Health check and state endpoints

Easy Deployment

  • Docker container with auto model download
  • Railway one-click deploy
  • Vercel frontend ready

Try It Now

# Run locally
cd olly && ./serve
cd web && pnpm run dev

# Visit http://localhost:5173/chat

What’s Next

  • Authentication: Secure access control
  • Streaming: Real-time token-by-token responses
  • More Tools: Write file, execute commands (Phase 2)
  • Model Options: Switch between different models

Built with ❤️ by the Nestr team