HelixTelemetry is a production-grade, clinically aligned medical RAG architecture engineered for high-velocity data ingestion, deterministic safety routing, and automated hallucination tracking. Built to go beyond basic LLM wrappers, it adds a multi-threaded, non-blocking telemetry engine and an enterprise 3-pane UI, and it is deployed to the web and as a native Android app.
- Deterministic Safety Router (Zero-Temperature): Prevents dangerous LLM extrapolations by passing raw queries through a strict gatekeeper. Malicious prompts or acute emergency symptoms bypass the generator entirely and immediately render hardcoded clinical safety interventions.
- High-Velocity Async Token Streaming: Bridges LangChain's asynchronous generators directly into Streamlit's synchronous loop, enabling multi-threaded execution. Renders responses at hundreds of tokens per second using Groq's LPU hardware.
- Automated Hallucination Telemetry (Ragas): Every RAG transaction is graded asynchronously in the background for Faithfulness, Context Precision, and Answer Relevancy.
- Non-Blocking I/O Logging: Protected by asyncio.Lock() and managed via aiofiles, system metrics (TTFT, tokens/sec, Ragas scores) are written to persistent CSV storage during generation without blocking the event loop or freezing the UI.
- Domain-Specific Embeddings (PubMedBERT): Leverages NeuML/pubmedbert-base-embeddings alongside ChromaDB for accurate local vector similarity search against the MedQA USMLE clinical corpus.
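The gatekeeper idea behind the safety router can be illustrated with a small sketch. This is not the project's router: it swaps the zero-temperature model call for plain regular-expression rules so the example stays self-contained, and every pattern, message, and name below (`route_query`, `RouteDecision`, the pattern lists) is a hypothetical placeholder rather than clinical guidance.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical patterns and canned messages, for illustration only.
EMERGENCY_PATTERNS = [r"\bchest pain\b", r"\bcan'?t breathe\b", r"\bsuicid"]
INJECTION_PATTERNS = [
    r"ignore (all |any )?(previous|prior) instructions",
    r"\bsystem prompt\b",
]

EMERGENCY_MESSAGE = "This may be an emergency. Contact local emergency services now."
REFUSAL_MESSAGE = "This request cannot be processed."


@dataclass(frozen=True)
class RouteDecision:
    route: str               # "emergency", "blocked", or "rag"
    response: Optional[str]  # canned text, or None when the query goes on to the generator


def route_query(query: str) -> RouteDecision:
    """Decide the route with fixed rules, so the same input always gives the same result."""
    text = query.lower()
    # Emergency checks run first so they cannot be masked by other rules.
    if any(re.search(p, text) for p in EMERGENCY_PATTERNS):
        return RouteDecision("emergency", EMERGENCY_MESSAGE)
    if any(re.search(p, text) for p in INJECTION_PATTERNS):
        return RouteDecision("blocked", REFUSAL_MESSAGE)
    # Nothing matched: hand the query to the retrieval and generation pipeline.
    return RouteDecision("rag", None)
```

Only a decision with `route == "rag"` would reach the LLM; the other two routes return their hardcoded response without calling the generator at all.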
Streamlit's default styling is fully overridden via CSS injection (src/ui/themes.py).
- Dark-Mode Enterprise Matrix: Styled to mimic Tier-1 EHR systems (Cerner/Epic) to reduce clinical eye strain.
- Live Plotly Integration: Visualizes the backend system_metrics.csv asynchronous logs dynamically on execution completion.
- Persistent State Management: Seamlessly retains chat history and telemetry scores through strict st.session_state singleton rules.
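The system_metrics.csv file that the dashboard plots is described as being written under an asyncio.Lock(). A minimal sketch of that pattern follows, with two assumptions stated up front: it uses the standard library's `asyncio.to_thread` (Python 3.9+) in place of aiofiles so it runs without extra dependencies, and the column names and the `MetricsLogger` class are hypothetical, not taken from the project.

```python
import asyncio
import csv
from pathlib import Path

# Hypothetical column names; the project's actual CSV schema is not shown in this README.
FIELDS = ["query_id", "ttft_s", "tokens_per_sec", "faithfulness"]


class MetricsLogger:
    """Append metric rows to a CSV file, one writer at a time."""

    def __init__(self, path: str) -> None:
        self._path = Path(path)
        # Create the logger inside a running event loop so the lock binds to that loop.
        self._lock = asyncio.Lock()

    def _append_row(self, row: dict) -> None:
        # Blocking file I/O; this runs in a worker thread, not on the event loop.
        new_file = not self._path.exists() or self._path.stat().st_size == 0
        with self._path.open("a", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=FIELDS)
            if new_file:
                writer.writeheader()
            writer.writerow(row)

    async def log(self, row: dict) -> None:
        # The lock serializes writers so rows and the header never interleave.
        async with self._lock:
            await asyncio.to_thread(self._append_row, row)


async def demo(path: str, n: int = 5) -> None:
    """Fire n concurrent log calls at the same file."""
    logger = MetricsLogger(path)
    rows = [
        {"query_id": i, "ttft_s": 0.12, "tokens_per_sec": 310.0, "faithfulness": 0.9}
        for i in range(n)
    ]
    await asyncio.gather(*(logger.log(r) for r in rows))
```

Calling `asyncio.run(demo("system_metrics.csv"))` would append five rows under one header. The lock matters because each write checks whether the file is empty before deciding to emit the header; without it, two concurrent writers could both see an empty file.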
- Inference Model: Meta Llama-3.1-8b-instant (via Groq API)
- Embedding Model: PubMedBERT (HuggingFace)
- Vector Store: ChromaDB (Persistent SQLite)
- Pipeline/Agent Framework: LangChain Core
- Telemetry & Grading: Ragas, Plotly, Pandas, Aiofiles
- Frontend: Streamlit & Custom CSS
- Mobile Portability: Google Chrome Labs Bubblewrap (TWA)
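Pairing LangChain's asynchronous streaming with Streamlit's synchronous script model needs some kind of bridge. One common way to build one, sketched here with the standard library only, is to drive the async generator on a background thread and hand items to the caller through a queue. This is an assumption about the general technique, not the project's code; `iter_over_async` and `fake_token_stream` are hypothetical names, and the fake stream stands in for a real LLM token stream.

```python
import asyncio
import queue
import threading
from typing import AsyncIterator, Callable, Iterator

_DONE = object()  # sentinel marking the end of the stream


def iter_over_async(make_agen: Callable[[], AsyncIterator[str]]) -> Iterator[str]:
    """Run an async generator on a background thread and yield its items synchronously."""
    q: "queue.Queue[object]" = queue.Queue()

    async def _drain() -> None:
        try:
            async for item in make_agen():
                q.put(item)
        except Exception as exc:
            # Forward failures so the consumer sees them instead of hanging.
            q.put(exc)
        finally:
            q.put(_DONE)

    # The worker thread owns its own event loop via asyncio.run.
    thread = threading.Thread(target=lambda: asyncio.run(_drain()), daemon=True)
    thread.start()

    while True:
        item = q.get()  # blocks the caller only until the next token arrives
        if item is _DONE:
            break
        if isinstance(item, Exception):
            raise item
        yield item
    thread.join()


async def fake_token_stream() -> AsyncIterator[str]:
    # Stand-in for a model's token stream; the tokens are made-up sample data.
    for token in ["Hyper", "tension", " is", " common", "."]:
        await asyncio.sleep(0)
        yield token
```

A synchronous generator like `iter_over_async(fake_token_stream)` is the kind of object Streamlit's `st.write_stream` accepts, so tokens can be rendered as they arrive rather than after the full response completes.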
1. Clone the repository:
    git clone https://github.com/ramkumar27072006/HelixTelemetry.git
    cd HelixTelemetry
2. Create the Python virtual environment:
    python -m venv venv
    source venv/bin/activate # Windows: .\venv\Scripts\activate
3. Install dependencies:
    pip install -r requirements.txt
4. Configure environment variables. Create a .env file in the root directory and add your Groq API key:
    GROQ_API_KEY=gsk_your_api_key_here
5. Boot the engine:
    streamlit run app.py

- Cloud Web Application: Hosted on Streamlit Community Cloud. Secrets are managed in the Streamlit advanced settings portal.
- Native Android (.APK): The live Streamlit URL is wrapped in a Trusted Web Activity (TWA) compiled using Median (GoNative) / Bubblewrap CLI.
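Both the local .env file and the hosted secrets exist to supply the same GROQ_API_KEY value, so a startup check that fails fast on a missing or placeholder key can save a confusing API error later. The sketch below is a standard-library illustration of that check. The README does not say which loader the project uses, so `parse_dotenv` and `resolve_groq_key` are hypothetical helpers, and the parser handles only simple KEY=VALUE lines.

```python
import os
from typing import Dict, Mapping, Optional

PLACEHOLDER = "gsk_your_api_key_here"  # the sample value from the setup steps


def parse_dotenv(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def resolve_groq_key(
    dotenv_text: str = "", environ: Optional[Mapping[str, str]] = None
) -> str:
    """Return GROQ_API_KEY, preferring the process environment over .env contents."""
    env = os.environ if environ is None else environ
    key = env.get("GROQ_API_KEY") or parse_dotenv(dotenv_text).get("GROQ_API_KEY", "")
    if not key or key == PLACEHOLDER:
        raise RuntimeError("GROQ_API_KEY is missing or still set to the placeholder value")
    return key
```

Preferring the process environment over the file means a key configured on the hosting platform wins over a stale local .env, which is a design choice rather than a requirement.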
- Disclaimer: HelixTelemetry is a portfolio engineering project demonstrating advanced ML architecture. It is not an FDA-approved medical device and should not be deployed in real-world clinical environments or used for actual medical diagnostic decision-making.
