Building a local-first AI coding assistant with unlimited completions and zero cloud dependencies
The subscription problem
Most major AI coding assistants route your code through external servers. Your proprietary code, your unreleased projects, your client work: all of it hits a third-party server before a completion appears.
I built guIDE to break both constraints: the subscription meter and the cloud dependency.
Local-first architecture
guIDE runs LLMs directly on your machine, so completions are generated locally and your code never leaves your computer. Each request is a plain HTTP call to a server on localhost:
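// OpenAI-style chat request, sent to a server running on this machine
const response = await fetch('http://localhost:8080/v1/chat/completions', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    model: 'qwen2.5-coder',
    messages: [{ role: 'user', content: prompt }],
    stream: true // tokens arrive incrementally as server-sent events
  })
});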
The UI looks like a standard AI coding environment. The difference is everything you type goes to localhost, not api.openai.com.
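Because the request sets stream: true, the response body arrives as a stream of server-sent events rather than a single JSON object. Here is a minimal sketch of reading it, assuming the local server follows the OpenAI streaming format ("data: {json}" lines ending with "data: [DONE]"). The names parseSseBuffer and streamCompletion are hypothetical helpers for illustration, not guIDE's actual code.

```javascript
// Sketch only: assumes OpenAI-style SSE chunks from the local server.

// Pull complete "data:" lines out of a text buffer.
// Returns the text deltas found, the trailing partial line, and a done flag.
function parseSseBuffer(buffer) {
  const lines = buffer.split('\n');
  const rest = lines.pop(); // last piece may be an incomplete line
  const deltas = [];
  let done = false;
  for (const raw of lines) {
    const line = raw.trim();
    if (!line.startsWith('data:')) continue; // skip blank separator lines
    const payload = line.slice('data:'.length).trim();
    if (payload === '[DONE]') { done = true; break; }
    const chunk = JSON.parse(payload);
    const text = chunk.choices?.[0]?.delta?.content;
    if (text) deltas.push(text);
  }
  return { deltas, rest, done };
}

// Send a prompt to the local server and call onToken for each text delta.
// baseUrl and the model name mirror the request above; adjust to your setup.
async function streamCompletion(prompt, onToken, baseUrl = 'http://localhost:8080') {
  const response = await fetch(`${baseUrl}/v1/chat/completions`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model: 'qwen2.5-coder',
      messages: [{ role: 'user', content: prompt }],
      stream: true
    })
  });
  if (!response.ok) throw new Error(`Local server returned ${response.status}`);

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const parsed = parseSseBuffer(buffer);
    buffer = parsed.rest; // carry the partial line into the next read
    parsed.deltas.forEach(onToken);
    if (parsed.done) break;
  }
}
```

In an editor, onToken would append each delta to the suggestion being shown, for example streamCompletion(prompt, (t) => process.stdout.write(t)) in a terminal. Keeping the parser separate from the network loop makes it easy to test without a running model.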
Unlimited completions
Because the model runs locally, there's no API meter. The only per-completion cost is electricity.
Try it
graysoft.dev: download, point it at your local llama.cpp server, and start coding.
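Before pointing an editor at a server, it can help to confirm that the address really is local and that the server is up. The sketch below is one way to do that. It assumes a llama.cpp server build that exposes GET /health (recent builds do, but check yours), and isLoopbackUrl and checkLocalServer are hypothetical helper names, not part of guIDE or llama.cpp.

```javascript
// Sketch only: a pre-flight check for a local inference server.

// True only when the URL's host is this machine (localhost, 127.x.x.x, or ::1).
function isLoopbackUrl(url) {
  // Strip the brackets that URL parsing keeps around IPv6 hosts.
  const hostname = new URL(url).hostname.replace(/^\[|\]$/g, '');
  return (
    hostname === 'localhost' ||
    hostname === '::1' ||
    /^127(\.\d{1,3}){3}$/.test(hostname)
  );
}

// Resolves to true if the server answers its health check, false otherwise.
// Refuses to contact anything that is not a loopback address.
async function checkLocalServer(baseUrl = 'http://localhost:8080') {
  if (!isLoopbackUrl(baseUrl)) {
    throw new Error(`${baseUrl} is not a loopback address`);
  }
  try {
    // Assumption: the server exposes GET /health, as llama.cpp's server does.
    const response = await fetch(new URL('/health', baseUrl));
    return response.ok;
  } catch {
    return false; // connection refused: the server is probably not running
  }
}
```

The loopback guard is the part that backs up the privacy claim: if a misconfigured base URL ever pointed at a remote host, the check would fail before any code was sent.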