Try it live: The chatbot runs on Hugging Face Spaces with a clean Gradio UI.
👉 https://huggingface.co/spaces/Sanuwar/meet-sanuwar
🚀 What This Project Does
Meet Sanuwar is a personal Q&A chatbot built with only Python. It converts a single Markdown profile into a searchable index and answers career questions with strict context grounding—no vector database, no heavy frameworks.
- Reads
activities.md→ builds embeddings → savesdata/retrieval_index.json - Retrieves top-k chunks with cosine similarity
- Answers with a concise, context-only prompt
- Logs leads and unknown questions to simple CSVs
Why it’s different: Minimal code, transparent retrieval, production-friendly guardrails, and easy deploys to Hugging Face Spaces.
🤖 Agentic AI Workflow (Pure Python)
This career chatbot is a multi-step, agentic LLM workflow implemented without frameworks. The model has narrow, purposeful autonomy: it chooses between answering from context or invoking small tools when needed.
- Perception → Reasoning → Action loop
- Perception: Retrieve top-k chunks from
activities.md(cosine similarity). - Reasoning: Apply a strict, context-only prompt with synonym + timeframe logic.
- Action:
- If the answer is known → reply concisely.
- If unknown → log the question to
unknown_questions.csv. - If the user shares contact → record name/email to
leads.csv.
- Perception: Retrieve top-k chunks from
- Autonomy with guardrails
- The LLM never invents facts; it answers only from the provided Context.
- “Niceties” (hi/thanks/bye) are handled conversationally without tools.
- Year/timeframe questions synthesize overlapping roles (e.g., 2020 transitions).
- Why this is agentic
- The model decides when to answer vs. when to call tools (logging, lead capture).
- Each tool is a tiny, auditable Python function (CSV writes)—no external services.
- The loop is transparent and easy to extend (e.g., add email alerts or a task queue).
🏗️ Architecture Overview
Index Builder
Parses headings → creates chunks → generates OpenAI embeddings → writes data/retrieval_index.json.
Chat Engine
Cosine search → top-k context → concise answer with strict RAG prompt (synonyms + timeframe logic).
Lightweight Logging
Captures name/email to leads.csv and unknown questions to unknown_questions.csv—no database required.
⚡ Key Capabilities
🎯 Context-Only Answers (Strict RAG)
The bot answers only from the provided context—no hallucinations. Synonym mapping handles phrases like “professional experience” → Industry/Research/Teaching.
```text User: “Which company does Sanuwar work for?” Bot: “Humana (2020–Present).”