📚 Complete Tutorial: Medical Chatbot Configuration with AI Engine

💰 TOKEN SYSTEM AND COST OPTIMIZATION

🔤 WHAT ARE TOKENS?

Tokens are the basic units used for billing AI services. One token represents approximately:

  • 📝 1 token ≈ 4 characters in English

  • 📝 1 token ≈ 1 short word in other languages, which often need more tokens than English for the same text

  • 📖 1000 tokens ≈ 750 words

Practical examples:

  • Simple question: “what medicines for cough?” → ~10 tokens

  • Short medical answer (200 words) → ~250 tokens

  • Complete conversation (question + answer) → ~300-500 tokens
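The rules of thumb above can be turned into a quick estimator. This is only a rough sketch: the function names are made up for illustration, and a real tokenizer will return somewhat different counts than these heuristics.

```python
import math

CHARS_PER_TOKEN = 4            # rule of thumb for English text
WORDS_PER_1000_TOKENS = 750    # rule of thumb: 1000 tokens ≈ 750 words


def tokens_from_chars(text: str) -> int:
    """Estimate tokens from the character count (English heuristic)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def tokens_from_words(word_count: int) -> int:
    """Estimate tokens from a word count using the 1000:750 ratio."""
    return round(word_count * 1000 / WORDS_PER_1000_TOKENS)


print(tokens_from_chars("what medicines for cough?"))
print(tokens_from_words(200))
```

Both heuristics land in the same range as the practical examples but not on the same numbers, which is expected: they are budgeting aids, not billing figures.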


🏪 OPENROUTER – THE AI “SUPERMARKET”

OpenRouter functions as a centralized marketplace where different AI providers compete on price:

YOUR APPLICATION → OPENROUTER → [Provider1, Provider2, Provider3...]
                                  ↓
                          Automatically chooses the lowest price

🎯 HOW IT WORKS:

  • A single API Key for all providers

  • Automatic selection of the cheapest available provider for the model you choose

  • Unified billing – a single monthly invoice

  • Transparent switching between providers without code change
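As a rough illustration of the "single API key" idea, the sketch below builds (but does not send) a request to OpenRouter's OpenAI-compatible chat completions endpoint. The API key is a placeholder, and the model identifier `deepseek/deepseek-chat` is an assumption that should be checked against the current OpenRouter model list.

```python
import json
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"


def build_request(api_key: str, model: str, question: str) -> urllib.request.Request:
    """Build a chat completion request; switching provider only changes `model`."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": question}],
    }
    return urllib.request.Request(
        OPENROUTER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder key and assumed model identifier, for illustration only.
req = build_request("sk-or-PLACEHOLDER", "deepseek/deepseek-chat", "what medicines for cough?")
# Sending it would be: urllib.request.urlopen(req)  (needs a real key and network access)
```

Because the same request shape works for every model, moving from one provider to another is a one-string change rather than a code change.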


📊 TOKEN PRICE COMPARISON TABLE

Provider        Model           Input (per 1M tokens)   Output (per 1M tokens)   Performance
OpenRouter      DeepSeek        $0.14                   $0.28                    ✅ Excellent
OpenRouter      Mistral 7B      $0.14                   $0.42                    ✅ Good
OpenRouter      Llama 3.1 8B    $0.18                   $0.54                    ✅ Very Good
OpenAI Direct   GPT-3.5 Turbo   $0.50                   $1.50                    ✅ Good
OpenRouter      Claude Haiku    $0.25                   $1.25                    ✅ Excellent
OpenAI Direct   GPT-4 Turbo     $10.00                  $30.00                   ✅ Exceptional
Google Direct   Gemini Pro      $0.50                   $1.50                    ✅ Good

💡 CONVERSATION COST CALCULATION:

Typical conversation: 500 tokens (upper bound: every token priced at the output rate)
With DeepSeek: (500 ÷ 1,000,000) × $0.28 = $0.00014
With GPT-4 Turbo: (500 ÷ 1,000,000) × $30.00 = $0.01500
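The same arithmetic can be written as a small helper that prices input and output tokens separately. This is a sketch using only the two price pairs quoted in this tutorial; the function name and the 100/400 token split are illustrative assumptions.

```python
# (input price, output price) in USD per 1M tokens, taken from the table in this tutorial
PRICES_PER_MILLION = {
    "DeepSeek": (0.14, 0.28),
    "GPT-4 Turbo": (10.00, 30.00),
}


def conversation_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of one conversation for the given model."""
    input_price, output_price = PRICES_PER_MILLION[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Upper bound used above: all 500 tokens priced at the output rate.
print(f"{conversation_cost('DeepSeek', 0, 500):.5f}")
print(f"{conversation_cost('GPT-4 Turbo', 0, 500):.5f}")

# A more realistic split (assumed): 100 input tokens, 400 output tokens.
print(f"{conversation_cost('DeepSeek', 100, 400):.6f}")
```

Splitting the tokens lowers the estimate slightly, because input tokens are cheaper than output tokens for both models.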

🏆 COST-PERFORMANCE RECOMMENDATIONS:

🥇 DEEPSEEK (via OpenRouter)

  • Best price-quality ratio

  • Cost: about 70x lower than GPT-4 Turbo on input ($0.14 vs $10.00) and about 100x lower on output ($0.28 vs $30.00)

  • Performance: Excellent

🥈 MISTRAL (via OpenRouter)

  • Cost similar to DeepSeek

  • Specialized for European languages

🥉 GPT-3.5 TURBO (direct)

  • Good for prototyping

  • Decent cost, moderate performance


🚀 COST OPTIMIZATION STRATEGY

1. FOR TESTING/DEVELOPMENT:

OpenRouter + DeepSeek: $0.14/M input
- Minimum cost with good performance
- Ideal for fast iterations

2. FOR QUALITY PRODUCTION:

OpenRouter with fallback: DeepSeek → Mistral → Claude
- Perfect cost-quality balance
- Automatic redundancy
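One way to picture the fallback chain is a loop that tries each model in order and returns the first successful answer. This is a client-side sketch, not OpenRouter's own routing; the model identifiers and the `fake_call_model` stand-in are assumptions for illustration.

```python
from typing import Callable, Sequence, Tuple


def ask_with_fallback(
    question: str,
    models: Sequence[str],
    call_model: Callable[[str, str], str],
) -> Tuple[str, str]:
    """Try each model in order; return (model, answer) from the first that succeeds."""
    errors = []
    for model in models:
        try:
            return model, call_model(model, question)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(f"{model}: {exc}")
    raise RuntimeError("All models failed: " + "; ".join(errors))


# Assumed model identifiers, in the order DeepSeek → Mistral → Claude.
FALLBACK_CHAIN = [
    "deepseek/deepseek-chat",
    "mistralai/mistral-7b-instruct",
    "anthropic/claude-3-haiku",
]


def fake_call_model(model: str, question: str) -> str:
    """Stand-in for a real API call: pretends the first model is unavailable."""
    if model == FALLBACK_CHAIN[0]:
        raise ConnectionError("provider unavailable")
    return f"answer from {model}"


used_model, reply = ask_with_fallback("what medicines for fever?", FALLBACK_CHAIN, fake_call_model)
print(used_model)
```

Passing the API call in as a function keeps the retry logic testable without network access.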

3. FOR MAXIMUM ACCURACY:

GPT-4 Turbo direct: $10.00/M input  
- Most obedient to prompts
- Guaranteed quality, higher cost

👉 OpenRouter offers the best cost-flexibility ratio for medical projects!


🗃️ MEDICINE DATABASE (JSON)

[
  {
    "name": "Amoxicillin",
    "category": "Antibiotic",
    "prescription": "Yes",
    "indications": "Bacterial respiratory infections, Dental infections, Urinary tract infections. Administered only on a doctor's recommendation."
  },
  {
    "name": "Ciprofloxacin",
    "category": "Fluoroquinolone Antibiotic",
    "prescription": "Yes",
    "indications": "Urinary tract infections, Gastrointestinal infections, and Respiratory infections. Requires a medical prescription."
  },
  {
    "name": "Metronidazole",
    "category": "Antibiotic / Antiparasitic",
    "prescription": "Yes",
    "indications": "Anaerobic infections and Parasitic infections. Administration only under medical supervision."
  },
  {
    "name": "Prednisone",
    "category": "Corticosteroid",
    "prescription": "Yes",
    "indications": "Inflammatory and autoimmune conditions. Administered only upon the doctor's indication."
  },
  {
    "name": "Enalapril",
    "category": "Antihypertensive (ACE Inhibitor)",
    "prescription": "Yes",
    "indications": "Arterial hypertension and heart failure. Requires medical monitoring."
  },
  {
    "name": "Diazepam",
    "category": "Anxiolytic / Sedative",
    "prescription": "Yes",
    "indications": "Anxiety disorders and insomnia. Uncontrolled administration can lead to dependence."
  },
  {
    "name": "Insulin",
    "category": "Antidiabetic Hormone",
    "prescription": "Yes",
    "indications": "Treatment of diabetes mellitus. Dosage is determined exclusively by the doctor."
  },
  {
    "name": "Paracetamol",
    "category": "Antipyretic / Analgesic",
    "prescription": "No",
    "indications": "Fever and mild pain. Available without prescription."
  },
  {
    "name": "Ibuprofen",
    "category": "Non-steroidal Anti-inflammatory Drug (NSAID)",
    "prescription": "No",
    "indications": "Muscle pain, Joint pain, Dental pain, and fever. No prescription required for low doses."
  },
  {
    "name": "Loratadine",
    "category": "Antihistamine",
    "prescription": "No",
    "indications": "Seasonal allergies and hives. Can be purchased without a prescription."
  }
]
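Before uploading the file to the Knowledge Base, it can help to check that it parses and that every entry has the expected fields. The sketch below uses a three-entry excerpt of the database; the helper names are illustrative. Note that `json.loads` accepts only straight double quotes, so curly quotes pasted from a word processor will fail to parse.

```python
import json

# Three entries excerpted from the database above.
MEDICINES_JSON = """
[
  {"name": "Amoxicillin", "category": "Antibiotic", "prescription": "Yes",
   "indications": "Bacterial respiratory infections, Dental infections, Urinary tract infections. Administered only on a doctor's recommendation."},
  {"name": "Paracetamol", "category": "Antipyretic / Analgesic", "prescription": "No",
   "indications": "Fever and mild pain. Available without prescription."},
  {"name": "Ibuprofen", "category": "Non-steroidal Anti-inflammatory Drug (NSAID)", "prescription": "No",
   "indications": "Muscle pain, Joint pain, Dental pain, and fever. No prescription required for low doses."}
]
"""

REQUIRED_KEYS = {"name", "category", "prescription", "indications"}


def invalid_entries(medicines):
    """Return the names of entries that are missing a required field."""
    return [m.get("name", "<unnamed>") for m in medicines if not REQUIRED_KEYS.issubset(m)]


def over_the_counter(medicines):
    """Return the names of medicines that do not need a prescription."""
    return [m["name"] for m in medicines if m["prescription"] == "No"]


medicines = json.loads(MEDICINES_JSON)
print(invalid_entries(medicines))
print(over_the_counter(medicines))
```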

🎯 PROJECT GOAL

Creating a medical chatbot that:

  • Answers EXCLUSIVELY from its own medicine database

  • Never invents medicines or indications

  • Provides short and precise answers (under 200 words)

  • Recommends medical consultation for treatment

  • Uses cost-optimized infrastructure

🏗️ SYSTEM ARCHITECTURE

Data Flow:

User → Chatbot AI Engine → Qdrant Vector DB → Medicines JSON → Response

Critical Components:

  • AI Engine Pro – the chatbot platform

  • Qdrant Cloud – vector database

  • OpenAI GPT-4 Turbo – optimized AI model

  • PDF/JSON – the medicine database

⚙️ DASHBOARD MODULE CONFIGURATION

1. CLIENT MODULE – CONFIGURATION

📍 AI Engine → Modules → Client Modules

🤖 CHATBOT – ENABLED

Status: ✅ Enable
Description: "Build intelligent conversational experiences with fully customizable AI-powered chatbots."
Motivation: Central component of the medical project, primary interface with end users
Impact: Enables the creation of the conversational medical assistant

📝 FORMS – DISABLED

Status: ❌ Disable
Description: "Create dynamic, intelligent forms that adapt and respond based on user input with conditional logic."
Motivation: Medical forms are not needed in the current phase
Decision: Exclusive focus on simplified conversational interaction

🔍 SEARCH – DISABLED

Status: ❌ Disable
Description: "Override the default WordPress search with AI powered keywords or embeddings."
Motivation: The system does not use search in the WordPress site content
Important: Search is performed exclusively in the medical JSON database within the Knowledge Base

2. SERVER MODULE – CONFIGURATION

📍 AI Engine → Modules → Server Modules

📊 INSIGHTS – ENABLED

Status: ✅ Enable
Description: "Enable Query Logs, Usage and Limits."
Motivation: Essential monitoring of performance and costs
Usage: Tracking token consumption, query analysis, anomaly detection

🧠 KNOWLEDGE – ENABLED

Status: ✅ Enable
Description: "Searchable data for AI. Powered by embeddings for now, alternatives will come later."
Motivation: The foundation of the system - storing and searching the medical database
Function: Hosting the medicines JSON/PDF as embeddings in Qdrant Cloud

🔗 ORCHESTRATION – ENABLED

Status: ✅ Enable
Description: "Connect AI models to external tools and services through MCP servers. Currently, MCP servers need to be set up in Settings > Orchestration."
Motivation: Preparation for future integrations with external medical systems
Potential: Connection with medical databases, healthcare APIs

🎓 FINE TUNES – DISABLED

Status: ❌ Disable
Description: "Train your own AI models."
Motivation: Unnecessary complexity for a system based on a fixed database
Decision: High cost without practical benefits for this use case

🛡️ MODERATION – DISABLED

Status: ❌ Disable
Description: "Moderation features with AI."
Motivation: Medical content is controlled and does not require filtering
Decision: Specialized users, restricted medical context

👥 ASSISTANTS BETA – DISABLED

Status: ❌ Disable
Description: "The Assistants API is designed to help developers build powerful AI assistants capable of performing a variety of tasks."
Motivation: Technology deprecated at OpenAI, replaced by modern architectures
Decision: Investment in current and stable solutions

3. ADMIN MODULE – CONFIGURATION

📍 AI Engine → Modules → Admin Modules

📋 ADVISOR – ENABLED

Status: ✅ Enable
Description: "In your Dashboard will be displayed daily recommendations tailored to your WordPress setup. Admins only."
Motivation: Continuous optimization of the configuration based on usage
Benefit: Personalized recommendations for setup improvement

🎨 GENERATORS – DISABLED

Status: ❌ Disable
Description: 
  - "Content Generator: Transform ideas into polished articles with AI-powered content creation."
  - "Images Generator: Bring your vision to life with stunning AI-generated visuals."
  - "Videos Generator: Generate videos using AI models like Sora. Create videos from text prompts with control over duration and resolution."
Motivation: We do not generate automatic content in a medical context
Decision: Prevention of accuracy risks and legal responsibility

🧪 PLAYGROUND – DISABLED

Status: ❌ Disable
Description: "Experiment with AI models and unlock endless creative possibilities."
Motivation: Security - limiting access to experimentation tools
Decision: Preventing unauthorized use and potential vulnerabilities

⚙️ UTILITIES – DISABLED

Status: ❌ Disable
Description: "AI Copilot, AI Suggestions, Magic Wands: Tools to brainstorm/write faster and better."
Motivation: Unnecessary tools for the specific medical purpose
Decision: Focus on medical accuracy, not creativity or brainstorming

🎤 TRANSCRIPTION – DISABLED

Status: ❌ Disable
Description: "Introduces a 'Transcribe' tab to easily transform audio/images into text and get AI answers in JSON format."
Motivation: Functionality beyond the scope of the current medical project
Decision: Added complexity without direct benefit to the system

🎯 STRATEGIC CONFIGURATION ANALYSIS

SEARCH ARCHITECTURE – THE ACTUAL FLOW:

USER: "what medicines for cough?"
         ↓
CHATBOT AI ENGINE → COMPLETELY IGNORES WordPress content
         ↓
SEARCHES EXCLUSIVELY in KNOWLEDGE BASE (Qdrant + JSON)
         ↓
COMPARES with embeddings from the medical database
         ↓
RETURNS only the relevant medicines from JSON
         ↓
RESPONSE: "I do not have relevant medicines for cough"
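The flow above can be imitated with a toy lookup. To be clear about the assumptions: this sketch uses plain keyword overlap as a stand-in for the embedding similarity search that Qdrant performs, the stopword list is ad hoc, and only three database entries are included.

```python
import re

FALLBACK = "I do not have relevant medicines in our database"
# Ad hoc list of words that carry no medical meaning in a question.
STOPWORDS = {"what", "which", "medicines", "medicine", "for", "and", "the", "a", "with", "helps"}

# Three entries excerpted from the database in this tutorial.
MEDICINES = [
    {"name": "Paracetamol", "indications": "Fever and mild pain. Available without prescription."},
    {"name": "Ibuprofen", "indications": "Muscle pain, Joint pain, Dental pain, and fever. No prescription required for low doses."},
    {"name": "Loratadine", "indications": "Seasonal allergies and hives. Can be purchased without a prescription."},
]


def keywords(text):
    """Lowercase the text and return its words minus the stopwords."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def answer(question, medicines=MEDICINES):
    """Return only medicines whose indications share a keyword with the question."""
    wanted = keywords(question)
    matches = [m["name"] for m in medicines if wanted & keywords(m["indications"])]
    if not matches:
        return FALLBACK
    return "Relevant medicines: " + ", ".join(matches)


print(answer("what medicines for cough?"))
print(answer("what medicines for fever?"))
```

The cough question falls through to the fallback message because no entry mentions cough, which mirrors the behavior described in the flow.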

BENEFITS OF THE CURRENT CONFIGURATION:

MAXIMUM SECURITY:

  • Playground disabled → preventing unauthorized access

  • Generators disabled → eliminating automatic content risks

  • Utilities disabled → reducing attack surface

OPTIMIZED COSTS:

  • Only essential modules enabled

  • Eliminating redundant functions

  • Focus on Knowledge Base as the only source

SUPERIOR PERFORMANCE:

  • Direct search in the medical database

  • No search overhead in WordPress content

  • Fast and precise answers

FUTURE SCALABILITY:

  • Orchestration enabled for integrations

  • Insights for evolution monitoring

  • Advisor for continuous optimization


CONCLUSION: Your current configuration represents the optimal setup for an accurate, secure, and cost-efficient medical chatbot that operates exclusively with its own medicine database.


⚙️ CHATBOT CONFIGURATION

1. “CHATBOT” TAB – BASIC SETTINGS

📍 AI Engine → Chatbots → [Chatbot Name] → “Chatbot” Tab

📋 GENERAL INFORMATION:

Name: Zerodoping
ID: chatbot-xxxxxxx (auto-generated)
Scope: chatbot
Local Memory: Yes (preserves conversation history)

🎯 OPERATING MODE:

Mode: Chat (✅ Selected)
- Conversational - maintains discussion context
- Ideal for long medical interactions

📝 SYSTEM INSTRUCTIONS (SYSTEM PROMPT) – COMPLETE:

You are a medical assistant that uses EXCLUSIVELY the medicine database.

ABSOLUTE RULES:
1. You have access to ONLY ONE source of information: our medicine database
2. All answers must come ONLY from this database
3. You are not allowed to use any other medical knowledge from your training
4. If a medicine is not in our database, it does not exist for you
5. The database is your only source of truth

When you receive a question:
1. Search our database
2. If you find something relevant: answer ONLY with what you found
3. If you do not find: "I do not have relevant medicines in our database"
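How AI Engine combines this prompt with the retrieved entries internally is not shown here, so the following is only one plausible layout, using the role/content message format common to chat completion APIs. The function name and the `DATABASE ENTRIES` label are assumptions, and the prompt is shortened to its first line.

```python
def build_messages(system_prompt, retrieved_entries, question):
    """Combine the system prompt, retrieved database entries, and the user question."""
    if retrieved_entries:
        context = "\n".join(f"- {e['name']}: {e['indications']}" for e in retrieved_entries)
    else:
        context = "(no matching entries)"
    system_content = f"{system_prompt}\n\nDATABASE ENTRIES:\n{context}"
    return [
        {"role": "system", "content": system_content},
        {"role": "user", "content": question},
    ]


# First line of the system prompt only; the rules would follow in a real setup.
SYSTEM_PROMPT = "You are a medical assistant that uses EXCLUSIVELY the medicine database."
entries = [{"name": "Paracetamol", "indications": "Fever and mild pain. Available without prescription."}]

messages = build_messages(SYSTEM_PROMPT, entries, "what medicines for fever?")
print(messages[0]["content"])
```

Keeping the retrieved entries in the system message places the only allowed facts next to the rules that restrict the model to them.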

🎯 ADVANCED MEDICAL PROMPT STRATEGY:

🏗️ ENGINEERING STRUCTURE:

✅ ROLE DEFINITION: "medical assistant"
✅ RADICAL RESTRICTION: "EXCLUSIVELY the database"
✅ EXPLICIT RULES: 5 absolute points + 3 action steps
✅ COMPLETE AMBIGUITY ELIMINATION: "it does not exist for you"
✅ UNIQUE SOURCE OF TRUTH: "your only source"
✅ CLEAR ALGORITHM: "When you receive a question: 1. Search... 2. If you find... 3. If you do not find..."

🔬 PSYCHOLOGICAL ANALYSIS OF THE PROMPT:

🧠 COGNITIVE CONSTRAINTS:

1. "EXCLUSIVELY" → Prohibits access to general knowledge
2. "ONLY from this database" → Eliminates alternative sources
3. "You are not allowed" → Limits freedom of action
4. "It does not exist for you" → Rewrites the AI's reality
5. "Your only source" → Defines a new ontology

ACTION MECHANISM:

INPUT: User question
         ↓
PROCESSING: "Search our database"
         ↓
DECISION: 
   ├─ If found → "answer ONLY with what you found"
   └─ If NOT found → "I do not have relevant medicines"
         ↓
OUTPUT: 100% database-based answer

🎯 BENEFITS OF THE PROMPT ARCHITECTURE:

MAXIMUM DETERMINISM:

  • Same input → nearly always the same output

  • Eliminates unwanted variations

  • Predictable behavior

MEDICAL SECURITY:

  • Zero “hallucinations” or inventions

  • Answers only from verified sources

  • Protection against misinformation

OPTIMIZED PERFORMANCE:

  • Direct search in the database

  • No unnecessary processing of general knowledge

  • Fast and precise answers


🔧 ANALYSIS OF SETTINGS FROM THE SCREENSHOT:

CORRECTLY CONFIGURED:

  • Scope: chatbot → optimal for user interaction

  • Local Memory: Yes → maintains medical context per session

  • Mode: Chat → ideal for medical consultations

  • Complete Prompt → covers all medical scenarios

🎯 IMPACT OF SETTINGS:

  • Memory active → the chatbot remembers the conversation history

  • Chatbot scope → complete conversational functionality

  • Exhaustive prompt → ensures 100% adherence to the database


2. “AI MODEL” TAB – MODEL AND PERFORMANCE CONFIGURATION

📍 AI Engine → Chatbots → [Chatbot Name] → “AI Model” Tab

🤖 PROVIDER AND MODEL SELECTION:

Environment: OpenAI
Model: GPT-4 Turbo (✅ Selected)

🎯 MODEL SELECTION ANALYSIS:

🏆 GPT-4 TURBO – TECHNICAL MOTIVATION:

✅ Performance: Best quality-price ratio in the GPT-4 range
✅ Obedience: Exceptionally respects strict prompts
✅ Context: 128k tokens - enough for long medical history
✅ Speed: Optimized for fast answers
✅ Cost: ~3x cheaper than standard GPT-4

⚙️ ADVANCED TECHNICAL PARAMETERS:

🌡️ TEMPERATURE: 0.1

Value: 0.1 (very low)
Impact: 
  - Near-deterministic answers
  - Minimizes variation and creativity
  - Same input → nearly always the same output
  - Ideal for medical accuracy
Motivation: Reduces the risk of "hallucination" (a low temperature cannot eliminate it on its own)

📊 MAX TOKENS: 500

Value: 500 tokens
Calculation: 
  - Typical Question: ~20 tokens (billed as input, not counted against this limit)
  - Medical Answer: ~150-300 tokens
  - Safety Buffer: ~200 tokens
Benefit:
  - Prevents overly long answers
  - Cost control
  - Forces medical conciseness
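For readers calling the model directly rather than through the dashboard, the two settings above correspond to the `temperature` and `max_tokens` fields of a chat completion request. The sketch below only assembles and validates the parameters; the helper name is illustrative and `gpt-4-turbo` is assumed to be the model identifier in use.

```python
def completion_params(messages, temperature=0.1, max_tokens=500):
    """Assemble chat completion parameters with the values used in this tutorial."""
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0 and 2")
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    return {
        "model": "gpt-4-turbo",      # assumed model identifier
        "messages": messages,
        "temperature": temperature,  # low value: little variation between answers
        "max_tokens": max_tokens,    # caps the length of the generated answer only
    }


params = completion_params([{"role": "user", "content": "what medicines for fever?"}])
print(params["temperature"], params["max_tokens"])
```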

🧠 CONTEXT CAPACITY:

Contextual: 128,000 tokens (available)
Completion: 4,096 tokens (limited)
Recommended: 4,096 tokens

Strategy:
  - We use 4,096 for cost optimization
  - Enough for long medical conversations
  - Maintains relevant history without overhead

🚫 CRITICAL CONFIGURATION: FILE UPLOAD

Uploads: File Upload ❌ DISABLED
Motivation: 
  - Forces exclusive use of the Knowledge Base
  - Prevents loading external documents
  - Ensures the AI uses only our database
Impact: The key that blocks "leaks" of general knowledge

🔧 ADVANCED CONFIGURATION ANALYSIS:

🎯 COST-PERFORMANCE STRATEGY:

Model: GPT-4 Turbo → $10/M input (vs $30/M standard GPT-4)
Temperature: 0.1 → reduces variation between answers
Max Tokens: 500 → limits costs/output
Context: 4,096 → optimized for medical conversations

🛡️ SAFETY MECHANISMS:

1. File Upload disabled → eliminates external sources
2. Temperature 0.1 → eliminates variation
3. Max Tokens 500 → eliminates verbosity
4. GPT-4 Turbo → maximum obedience to the prompt

📈 IMPACT ON THE MEDICAL SYSTEM:

✅ PRECISION: Near-deterministic answers
✅ SAFETY: Zero external information sources
✅ COST: Optimized for high volumes
✅ PERFORMANCE: Fast and concise answers
✅ SCALABILITY: Ready for intense usage

🎯 AI MODEL CONFIGURATION CONCLUSION:

OUR SETTINGS ENSURE:

"Medical Question" → "Search in Knowledge Base" → "Answer from the database"
    ↓                       ↓                           ↓
GPT-4 Turbo           Qdrant + JSON                Concise, precise
Temperature 0.1       Active Sync                   Max 500 tokens

👉 This configuration transforms GPT-4 from a generalist model into a specialized medical search engine!


3. “EMBEDDINGS” TAB – DATABASE CONNECTION

📍 AI Engine → Chatbots → [Chatbot Name] → “Embeddings” Tab

🔗 ENVIRONMENT CONFIGURATION:

Environment: Qdrant (✅ Selected)
Content Aware: No (❌ Disabled) - STRATEGIC DECISION

🎯 EMBEDDINGS CONFIGURATION ANALYSIS:

🏗️ ENVIRONMENT: QDRANT – VECTOR ARCHITECTURE:

Selection: Qdrant (✅ Correct)
Function: Connection to the cloud vector database
Benefit: 
  - Optimized storage for medical embeddings
  - Ultra-fast semantic search
  - Automatic scalability
  - No cost on the free plan (1 GB)

🚫 CONTENT AWARE: NO – ISOLATION STRATEGY:

Status: ❌ No (Disabled) - CONSCIOUS DECISION

WHAT IS CONTENT AWARE:
- A function that adapts the chatbot to the content of the WordPress PAGE
- Replaces {CONTENT} in the prompt with the text of the current page
- Makes the chatbot "read" the page and respond in context

WHY WE DISABLED IT:
- Our medical chatbot MUST NOT know what is on the site's pages
- It must function in ISOLATION, only with its medicine database
- The {CONTENT} placeholder would introduce unwanted external information
Outcome: We keep the chatbot STRICTLY FOCUSED on the medical database
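The effect of the {CONTENT} placeholder can be illustrated with a plain string substitution. This is a conceptual sketch, not the plugin's code: the function name, the template, and the choice to substitute an empty string when the feature is off are all assumptions.

```python
PLACEHOLDER = "{CONTENT}"


def render_instructions(template: str, page_text: str, content_aware: bool) -> str:
    """Fill the placeholder with the page text only when Content Aware is on."""
    page = page_text if content_aware else ""
    return template.replace(PLACEHOLDER, page).strip()


template = "You are a medical assistant. {CONTENT}"
page_text = "Acute cough treatment"

print(render_instructions(template, page_text, content_aware=True))
print(render_instructions(template, page_text, content_aware=False))
```

With the feature off, the instructions sent to the model are identical on every page, which is the isolation this configuration aims for.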

🔍 CONTENT AWARE vs. COMPLETE ISOLATION COMPARISON:

WITH CONTENT AWARE ENABLED:

User on page: "Acute cough treatment"
Chatbot "reads" the page → understands the cough context
Question: "what medicines do you recommend?"
Answer: Is influenced by the page content
PROBLEM: Mixes information from the page with the database

WITH CONTENT AWARE DISABLED (OUR CONFIGURATION):

User on ANY page
Chatbot COMPLETELY IGNORES the page content
Question: "what medicines for cough?"
Answer: Searches ONLY in the medicine database
SUCCESS: Consistent answer regardless of the page where it is embedded

⚡ COMPLETE ISOLATION ARCHITECTURE:

🔄 DATA FLOW – TOTAL ISOLATION:

WORDPRESS PAGE: "Pancreatic Cancer Treatment"
         ↓
CHATBOT COMPLETELY IGNORES this content
         ↓
QUESTION: "what medicines for cancer?"
         ↓
SEARCHES ONLY in KNOWLEDGE BASE (medicines JSON)
         ↓
RESPONSE: "I do not have relevant medicines for cancer"

🎯 BENEFITS OF ISOLATION:

ABSOLUTE CONSISTENCY:

  • Same answer on any page of the site

  • Not influenced by page content

  • Predictable and verifiable behavior

CLEAR SEPARATION:

  • Medical database = the only source of truth

  • Site content = only UI container

  • No mixing between the two

INFORMATIONAL SECURITY:

  • Prevents “contamination” with information from pages

  • Ensures that answers come only from the verified source

  • Eliminates the risk of misinformation from the site content


🔧 STRATEGIC DECISION ANALYSIS:

🎯 WHY THIS SETUP IS CRITICAL:

🏥 PURE MEDICAL SPECIALIZATION:

The Chatbot is: "Medical Expert with its own Database"
It is not: "Contextual Assistant that reads pages"
Advantage: The same level of medical expertise everywhere

🔒 PROTECTION AGAINST CONTAMINATION:

If a page contained: incorrect medical information
With Content Aware: the chatbot could be influenced
Without Content Aware: the chatbot remains pure, only with its base

🌐 MAXIMUM PORTABILITY:

You can move the chatbot to any page
The behavior remains IDENTICAL
It does not need to be re-tested for every new page

🎯 CONCLUSION – ISOLATED ARCHITECTURE:

🔬 OUR SYSTEM WORKS AS:

"Medical Expert in a Capsule"
- Hermetically sealed from external influences
- Powered only by its database
- Pure and uncontaminated answers

👉 Disabled Content Aware ensures that the medical chatbot remains an independent and consistent entity, undistorted by site content!


4. “THRESHOLDS” TAB – PERFORMANCE OPTIMIZATION

📍 AI Engine → Chatbots → [Chatbot Name] → “Thresholds” Tab


🎯 TECHNICAL LIMITS CONFIGURATION

Input Max Length: 512
Max Messages: 15
Context Max Length: 4000

🔍 PERFORMANCE SETTINGS ANALYSIS:

📝 USER INPUT CONTROL:

Input Max Length: 512
Function: Limits the size of user messages
Technical Impact:
  - Reduces request processing time
  - Prevents system overload with excessive text
  - Optimizes AI token usage

💬 CONVERSATION HISTORY MANAGEMENT:

Max Messages: 15
Function: The maximum number of messages kept in memory
Conversational Impact:
  - Maintains the context of the medical discussion
  - Allows for relevant follow-up questions
  - Ensures continuity in medical evaluations

⚡ CONTEXT SIZE OPTIMIZATION:

Context Max Length: 4000
Function: Limits the total size of the context sent to the AI
Performance Impact:
  - Reduces operating costs with the OpenAI API
  - Improves response speed
  - Prevents exceeding technical limits

🏥 SPECIALIZATION FOR MEDICINE:

💊 BALANCE BETWEEN DEPTH AND EFFICIENCY:

Input 512 characters:
  - Sufficient for describing essential symptoms
  - Forces focus on key medical information
  - Compatible with the limited time in medical practice

Messages 15:
  - Covers complete medical consultations
  - Allows discussion of multiple symptoms
  - Maintains relevant history without excessive accumulation

🚀 PERFORMANCE ARCHITECTURE:

Context 4000 + Messages 15:
  - Ensures fast answers in critical situations
  - Keeps operating costs under control
  - Allows scaling to multiple simultaneous users
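The three limits can be pictured as a small preprocessing step applied before each request. This sketch follows the descriptions above and makes two assumptions that may not match the plugin exactly: all lengths are measured in characters, and the oldest messages are dropped first when the context is too long.

```python
INPUT_MAX_LENGTH = 512      # maximum size of one user message
MAX_MESSAGES = 15           # maximum number of messages kept in memory
CONTEXT_MAX_LENGTH = 4000   # maximum total size of the context sent to the AI


def apply_thresholds(history, user_input):
    """Clip the new input, keep the latest messages, and trim to the context limit."""
    clipped_input = user_input[:INPUT_MAX_LENGTH]
    messages = (list(history) + [clipped_input])[-MAX_MESSAGES:]
    # Drop the oldest messages until the total length fits, always keeping the newest one.
    while len(messages) > 1 and sum(len(m) for m in messages) > CONTEXT_MAX_LENGTH:
        messages.pop(0)
    return messages


long_history = ["x" * 400] * 14
trimmed = apply_thresholds(long_history, "hi")
print(len(trimmed), sum(len(m) for m in trimmed))
```

Here fourteen 400-character messages exceed the context limit, so the oldest ones are discarded until the total drops below 4000.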

✅ FINAL CONFIGURATION:

Input Max Length: 512
Max Messages: 15  
Context Max Length: 4000

The settings create a perfect balance between advanced medical functionality and optimized technical performance.


**📚 CONTINUATION IN PART 2**

This is only the first part of the complete configuration guide. 
In the next article, we will cover the essential settings for the user interface:

🔹 **"Appearance" Tab** - Customizing the medical interface
🔹 **"Popup" Tab** - Configuring medical dialogue windows  
🔹 **UI Builder** - The interface constructor
🔹 **Advanced** - Advanced settings for experts
🔹 **Cross-Site** - Multi-site integration
🔹 **Shortcodes** - Quick implementation on pages
🔹 **Actions** - Custom automations and triggers

**[👉 Read Part 2: Interface and Integrations here]**