GitAsk

turn any github repo into an AI you can talk to, right in your browser. no server. no API key. no cloud.

paste a github URL. ask a question. get an answer grounded in the actual code.

no API key setup. no docker. no postgres. no cloud bill. the model runs on your GPU, the index lives in your browser, and nothing leaves your machine (unless you want it to).

it started as a toy to see if I could build a basic offline RAG system for code. it ended up being the way I actually explore unfamiliar repos.

demo

↑ click to watch with music

how it works

ingestion

github fetch - pulls the full file tree in one API call, then fetches files on-demand.
AST chunking - tree-sitter WASM parses your source and cuts at real boundaries: functions, classes, methods. not arbitrary line counts. supports JS/TS/TSX, Python, Rust, Go, Java, C/C++, and more. falls back to text splitting for everything else.
embedding - all-MiniLM-L12-v2 via @huggingface/transformers, running on WebGPU if available, WASM otherwise. adaptive batch sizing squeezes out throughput.
binary quantization - float32 embeddings are sign-bit packed into Uint32Arrays. 32x smaller in memory. hamming distance runs fast even on large repos.
persistence - everything lives in IndexedDB via entity-db. re-open the tab, the index is still there.
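
a minimal sketch of the binary quantization step described above, assuming the sign convention "positive value -> bit 1". the function names (`binarize`, `popcount32`, `hammingDistance`) are hypothetical and not taken from the GitAsk source.

```typescript
// sketch only: names and the "> 0 means bit set" convention are assumptions.

/** Pack one bit per dimension (1 if the value is positive) into Uint32 words. */
function binarize(embedding: Float32Array): Uint32Array {
  const packed = new Uint32Array(Math.ceil(embedding.length / 32));
  for (let i = 0; i < embedding.length; i++) {
    if (embedding[i] > 0) {
      // word index is floor(i / 32), bit index is i mod 32.
      // the typed array wraps the result back to an unsigned 32-bit value.
      packed[i >>> 5] |= 1 << (i & 31);
    }
  }
  return packed;
}

/** Count the set bits in a 32-bit integer (classic SWAR popcount). */
function popcount32(x: number): number {
  x = x - ((x >>> 1) & 0x55555555);
  x = (x & 0x33333333) + ((x >>> 2) & 0x33333333);
  x = (x + (x >>> 4)) & 0x0f0f0f0f;
  return Math.imul(x, 0x01010101) >>> 24;
}

/** Number of differing bits between two packed vectors of equal length. */
function hammingDistance(a: Uint32Array, b: Uint32Array): number {
  let distance = 0;
  for (let i = 0; i < a.length; i++) {
    distance += popcount32(a[i] ^ b[i]);
  }
  return distance;
}
```

for a 384-dimensional vector (the output size of all-MiniLM-L12-v2), this goes from 1536 bytes of float32 to 12 words, i.e. 48 bytes, which is where the 32x figure comes from.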

retrieval

this is the part that actually makes answers good.

multi-query expansion - with a cloud LLM, the query is expanded into 3 rephrased variants using conversation context. with the local model, a code-symbol search path is added for free instead. follow-up queries get terms from prior turns injected automatically.
adaptive retrieval refinement - if the first pass comes back weak, the LLM rewrites the query and tries again. no user input needed.
hybrid search - dense retrieval (hamming distance on binary embeddings) fused with BM25 sparse search. neither one alone is enough.
RRF - reciprocal rank fusion merges the ranked lists from each query/retriever into one signal.
graph expansion - the import/definition graph lets retrieval hop from a file to its dependencies when the chunk boundary cuts off relevant context.
cosine rerank - the coarse RRF candidates get reranked with full cosine similarity before being handed to the LLM.
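
a rough sketch of the RRF step from the list above: each ranked list contributes `1 / (k + rank)` per chunk, and the sums decide the fused order. the function name and the `k = 60` constant are assumptions (60 is a commonly used default), not values read from the GitAsk source.

```typescript
// sketch only: reciprocalRankFusion and k = 60 are illustrative assumptions.

/** Fuse several ranked lists of chunk ids into one list sorted by RRF score. */
function reciprocalRankFusion(
  rankings: string[][],
  k: number = 60,
): Array<[string, number]> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      // ranks are 1-based, so the top hit contributes 1 / (k + 1).
      const previous = scores.get(id) ?? 0;
      scores.set(id, previous + 1 / (k + index + 1));
    });
  }
  // highest fused score first.
  return Array.from(scores.entries()).sort((x, y) => y[1] - x[1]);
}
```

a chunk that shows up near the top of both the dense and the BM25 list beats one that is first in only one of them, which is the point of fusing before the cosine rerank.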

generation

WebLLM - Qwen2-0.5B runs entirely in your browser via @mlc-ai/web-llm. first load takes a few minutes (model download), then it's cached.
CoVe loop - after generating an answer, the model extracts its own claims and verifies each one against the vector store. wrong claims get corrected. it's one pass of chain-of-verification, tuned for a small model.
BYOK - if you'd rather use Gemini or Groq, you can. keys are encrypted in a local vault and never leave your device.
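
the CoVe loop above can be sketched roughly like this. everything here is hypothetical: the `Llm` and `Retriever` types, the prompts, and the YES/NO verdict format are assumptions made for illustration, not GitAsk's actual prompts or API.

```typescript
// sketch only: every name and prompt here is hypothetical.
type Llm = (prompt: string) => Promise<string>;
type Retriever = (query: string) => Promise<string[]>;

/** Split the model's claim list into one trimmed claim per non-empty line. */
function parseClaims(text: string): string[] {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

/** Treat any verdict that starts with "yes" (case-insensitive) as supported. */
function isSupported(verdict: string): boolean {
  return /^yes/i.test(verdict.trim());
}

/** Draft an answer, verify each extracted claim, then do one revision pass. */
async function answerWithVerification(
  question: string,
  llm: Llm,
  retrieve: Retriever,
): Promise<string> {
  const context = await retrieve(question);
  const draft = await llm(
    `Answer using only this code:\n${context.join("\n")}\n\nQuestion: ${question}`,
  );

  // 1. the model lists the claims made in its own draft.
  const claims = parseClaims(
    await llm(`List each factual claim in this answer on its own line:\n${draft}`),
  );

  // 2. each claim is checked against freshly retrieved chunks.
  const unsupported: string[] = [];
  for (const claim of claims) {
    const evidence = await retrieve(claim);
    const verdict = await llm(
      `Evidence:\n${evidence.join("\n")}\n\nIs this claim supported? Reply YES or NO.\nClaim: ${claim}`,
    );
    if (!isSupported(verdict)) unsupported.push(claim);
  }

  // 3. one revision pass, only if something failed verification.
  if (unsupported.length === 0) return draft;
  return llm(
    `Rewrite the answer, fixing or removing these unsupported claims:\n${unsupported.join("\n")}\n\nAnswer:\n${draft}`,
  );
}
```

keeping it to a single verify-and-revise pass bounds the number of extra LLM calls, which matters when the model is a 0.5B one running in a browser tab.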

quick start

npm install
npm run dev

open http://localhost:3000, paste a github URL, and start asking.

that's the whole setup.

Local note: this repo uses webpack for npm run dev because the [owner]/[repo] route currently hangs under Turbopack on some local machines.

stack

| layer | what |
| --- | --- |
| framework | Next.js 16 + React 19 |
| LLM (local) | `@mlc-ai/web-llm` - Qwen2-0.5B on WebGPU |
| embeddings | `@huggingface/transformers` - all-MiniLM-L12-v2 |
| AST parsing | `web-tree-sitter` (WASM) |
| vector store | `@babycommando/entity-db` -> IndexedDB |
| cloud LLM (optional) | Gemini, Groq via BYOK vault |
| UI | Framer Motion, React Markdown, syntax highlighting |

features

zero backend - the entire pipeline runs in-browser
WebGPU inference with WASM fallback - works on most modern machines
AST-aware chunking - chunks respect code structure, not just line counts
binary quantization - 32x memory savings on the embedding index
hybrid search - dense + sparse, fused with RRF
multi-query expansion - CodeRAG-style. LLM or heuristic, depending on provider. follow-up queries get context from prior turns
adaptive retrieval refinement - weak results trigger an automatic second pass with an LLM-rewritten query
dependency graph traversal - retrieval follows imports to surface related code
CoVe self-correction - the model checks its own answers against the codebase
persistent index - close the tab, reopen, resume chatting
BYOK - swap in Gemini or Groq if you want a bigger model
multi-session chat - multiple chat histories per repo
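
to make the dependency graph traversal item a bit more concrete, here is a small sketch of hop-limited expansion over an import graph. the `ImportGraph` shape, the function name, and the default of one hop are assumptions for illustration, not how GitAsk necessarily stores or walks its graph.

```typescript
// sketch only: the graph shape and names are hypothetical.
type ImportGraph = Map<string, string[]>; // file path -> files it imports

/** Return the seed files plus everything reachable within maxHops imports. */
function expandWithDependencies(
  seeds: string[],
  graph: ImportGraph,
  maxHops: number = 1,
): string[] {
  const seen = new Set<string>(seeds);
  let frontier = seeds;
  for (let hop = 0; hop < maxHops; hop++) {
    const next: string[] = [];
    for (const file of frontier) {
      for (const dep of graph.get(file) ?? []) {
        if (!seen.has(dep)) {
          seen.add(dep);
          next.push(dep);
        }
      }
    }
    frontier = next;
  }
  // a Set keeps insertion order: seeds first, then dependencies hop by hop.
  return Array.from(seen);
}
```

capping the hop count keeps the extra context small, so the retrieved chunks stay within what a small model can actually use.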

why not just use an existing tool?

most code-search tools either require a cloud backend, or they do naive chunking that cuts functions in half. I wanted something that:

runs completely locally
understands code structure (AST chunks)
retrieves well (hybrid search + RRF, not just cosine similarity)
is actually fast to set up (paste URL, done)

this is the thing I built to scratch that itch.

research

the retrieval design draws from two papers:

CodeRAG - Zhang et al., Finding Relevant and Necessary Knowledge for Retrieval-Augmented Repository-Level Code Completion, EMNLP 2025. arXiv:2509.16112 -> multi-query expansion, hybrid retrieval, RRF fusion, adaptive retrieval refinement
CoVe - Dhuliawala et al., Chain-of-Verification Reduces Hallucination in Large Language Models, Findings of ACL 2024. arXiv:2309.11495 -> self-verification loop on generated answers

acknowledgments

shoutout to CosmoBean for shipping a ridiculous number of PRs on this. BM25, prompt injection guards, chunking perf, metrics, embeddings - a lot of the good stuff in here is his. he's my roommate and he just kept opening pull requests. genuinely made this project way better than it would've been.

contributing

issues and PRs are welcome. if something doesn't work on your machine, open an issue with your browser + GPU info. browser ML is still a mess and edge cases are real.

license

MIT
