The AI with an Elephant’s Memory and a Chicken’s Mouth – It Speaks Only What We’ve Taught It

💊 Interactive Test: How Permissive Are AIs with Medical Advice

🧠 How We Built a “Lobotomized” RAG System Using Qdrant, OpenAI, and AI Engine

📌 Purpose:
This document details the functioning of a Retrieval-Augmented Generation (RAG) system designed to provide strictly controlled responses based exclusively on its own internal database, while excluding the general pre-trained knowledge of a language model (LLM).
This process is referred to as “lobotomization.”
The system’s sole source of knowledge consists of four files: one drug catalog (Excel) and three structured medical guidelines (Word documents).

1. 🔍 What is Qdrant?

Qdrant (pronounced Quadrant) is a vector database specialized in storing, managing, and rapidly searching vector embeddings.

Main purpose: It is used in artificial intelligence applications, such as:

🔎 Semantic search

🧩 RAG systems (Retrieval-Augmented Generation)

💬 Specialized chatbots – just like in the use case with the drug database.

2. ⚙️ How Qdrant Works (Conceptual)

The process can be divided into three fundamental stages:

Pas	Descriere	Exemplu
1. Data Vectorization	Raw data (text, images) is transformed into numeric vectors (lists of numbers) using an embedding model.	A model like text-embedding-ada-002 (OpenAI) transforms the word `"Paracetamol"` into a 1536-dimensional vector: `[0.12, -0.44, 0.56, ...]`.
2. Storage in Qdrant	The generated vectors are stored in Qdrant, along with metadata (payload) containing the original information.	`{ "id": 1, "vector": [0.12, -0.44, 0.56, ...], "payload": { "name": "Paracetamol", "manufacter": "XYZ" } }`
3. Querying and Search	When a user asks a question, it is also vectorized. Qdrant compares the query vector with all vectors in the database and returns the most similar ones based on a distance algorithm (e.g., Cosine, Euclidean).	A search for `"headache pill"` will return the vectors associated with `"Paracetamol"`, even if that word does not appear in the query.

🧩 Conceptual Conclusion: Qdrant does not look for exact text matches; it searches for semantic similarities between meanings.

3. 🏗️ Qdrant Architecture and Usage Modes

Qdrant can be deployed in two main ways:

Mode	Description	Features
☁️ Qdrant Cloud	Hosted and managed service.	– Ideal for prototypes and small projects. – Free plan: 1 collection, ~1GB, 10k-20k points. – Web interface and simple management.
🖥️ Qdrant Local (Self-Hosted)	Installed on your own infrastructure (e.g., via Docker).	– Full control and enhanced security. – Optimized performance for specific hardware. – Requires administration knowledge.

⚠️ Important: Both options use the same REST or gRPC API, meaning the application code will be identical regardless of the deployment choice.

4. 📚 Data Structure in Qdrant

Qdrant’s data structure is hierarchical and intuitive:

Collection → "medications"
       |
       ├── Point 1 -> Vector + Payload for "Paracetamol"
       ├── Point 2 -> Vector + Payload for "Ibuprofen"
       └── Point n -> Vector + Payload for another medication

Glossary of Terms:

Term	Explanation
Collection	A logical group of related vectors (e.g., the “medications” collection).
Point	A unique record in the collection, consisting of a vector and a payload.
Vector	Numeric representation (embedding) of a text fragment.
Payload	Metadata attached to the vector (e.g., name, manufacturer, code, original text).
Filter	Conditions applied to restrict the search based on payload (e.g., `manufacturer = "Pfizer"`).

5. 🧮 Query Types in Qdrant

Qdrant supports several query operations, the most relevant of which are:

🔍 Simple Search: Returns the most similar vectors for a given query.
🧩 Filtered Search: Combines semantic search with metadata filters (e.g., “find antidepressants only from manufacturer Roche”).
🔁 Recommend: Generates recommendations based on a set of reference points already in the system.

6. ⚡ Performance and Key Advantages

Aspect	Qdrant Evaluation
Speed	Very high, thanks to optimizations in the Rust language.
Persistence	Data is stored on disk, ensuring durability.
Filtering	Advanced capabilities for filtering metadata (payload).
Scalability	Supports sharding and replication for large datasets.
Compatibility	APIs for all popular languages (Python, JS, Go, Java).

🧩 Contextual Application: Lobotomized Medication System

🎯 Final Goal: Create a chatbot that responds only based on the 4 files and the 3 text documents, without relying on the general knowledge of the LLM.

🔧 System Components:

🗃️ Vector Database (Qdrant): Stores all information from your files as vectors and payloads.
🛠️ Embedding Engine (OpenAI): Converts user queries and file content into vectors.
🧠 Response Engine (OpenAI GPT): Generates the final textual answer.
🔗 Orchestrator (AI Engine – WordPress Plugin): Connects all components and applies strict rules.

7. 🔐 The Crucial Role of “Severe Instructions” (System Prompt)

This is the soul of the system and the lobotomization mechanism. These instructions are configured in the AI Engine interface and sent to OpenAI GPT at each interaction.

📍 Where do these rules operate?

NOT in Qdrant.
NOT in the embedding engine.
YES, in the response engine (OpenAI GPT), which is required to follow the instructions provided via AI Engine.

📜 Example of Severe Instructions for AI Engine:

# IDENTITY: You are a specialized assistant for the company's medication database.
# KNOWLEDGE SOURCE: Your only source of information is the vector database (Qdrant), which contains only files X, Y, Z.
# ABSOLUTE RULES:
# 1. Respond EXCLUSIVELY based on text fragments (context) returned by Qdrant for the user's question.
# 2. NEVER use general pre-trained knowledge about medications, diseases, or any other topic.
# 3. IF the fragments returned by Qdrant contain no relevant information for the question, your answer must be STRICTLY: "No information available on this topic in our database."
# 4. Answers must be concise, clear, and based solely on the provided data.

8. 🔁 Data Flow and Lobotomization in Action (Extended Examples)

Let’s follow a few concrete scenarios to understand how lobotomization and smart information retrieval work.

Use Case 1: Searching for a Specific Medication (Strict Knowledge Control)

Step	Action	Result
1.	User asks: “Which medications containing morphine do you recommend?”	The question is sent to AI Engine.
2.	AI Engine vectorizes it and sends it to Qdrant.	Qdrant searches for similar vectors.
3.	Assumption: There is no mention of “morphine” in your files.	Qdrant finds no relevant results.
4.	AI Engine builds the prompt for GPT: System Prompt + Context (empty) + Question.	Context is empty.
5.	OpenAI GPT processes the prompt. The absolute rule forbids using general knowledge.	Final Answer: “No information available on this topic in our database.” ✅

Use Case 2: From Symptom to Diagnosis and Treatment (Full Knowledge Flow)

Step	Action	Result
1.	User asks: “I have pulsatile headaches with nausea. What could it be and what can I take?”	The question is vectorized and sent to Qdrant.
2.	Qdrant identifies relevant fragments from all files: – `Symptoms.doc`: “Headache: pulsatile, nausea” – `Diagnoses.doc`: “Migraine with/without aura” – `Treatments.doc`: “Migraine -> Sumatriptan, Topiramate” – `Excel.xlsx`: Products containing “Sumatriptan”	Returns rich and relevant context.
3.	AI Engine assembles the final prompt with all these fragments.	Context contains the full hierarchy: Symptom -> Diagnosis -> Treatment -> Medications.
4.	OpenAI GPT synthesizes the information.	Final Answer: “The described symptoms (pulsatile headache with nausea) may correspond to migraine. For migraine, our database recommends medications such as Sumatriptan or Topiramate. Please consult a physician for an accurate diagnosis.” ✅

Use Case 3: Advanced Semantic Search (Without Exact Keywords)

Step	Action	Result
1.	User asks: “Is there something for dizziness and the feeling that everything is spinning?”	The question is vectorized.
2.	Qdrant, using semantic similarity, finds key fragments: – `Symptoms.doc`: “Vertigo: feeling that the house is spinning” – `Treatments.doc`: “Vertigo -> Betahistine” – `Excel.xlsx`: Rows with “BETAHISTINE” from various manufacturers (ATB, LPH, MCC, GEMAX PHARMA).	Looks for meaning, not exact words. “Dizziness” and “spinning” match “Vertigo”.
3.	The final prompt includes these connections.	The system understands the question refers to vertigo.
4.	OpenAI GPT formulates the answer.	Final Answer: “The described symptoms may be associated with vertigo. In our database, a medication used in such cases is Betahistine, available under multiple brands (e.g., Betahistine ATB, Betahistine LPH).” ✅

Use Case 4: Implicit Filtering via File Content

Step	Action	Result
1.	User asks: “What is the treatment for Parkinson’s Disease?”	The question is vectorized.
2.	Qdrant quickly finds associations: – `Diagnoses.doc`: “Parkinson’s Disease” linked to gait/balance disorders and abnormal movements. – `Treatments.doc`: “Parkinson’s Disease -> Dopamine, Ropinirole, Entacapone” – `Excel.xlsx`: Does not find products with these active ingredients (e.g., “ALEDOV” contains Levodopa/Carbidopa, but Dopamine itself is not listed).	The system reports only what exists in the files.
3.	The prompt includes only what Qdrant found.	Context includes recommended treatments, but not all specific products.
4.	OpenAI GPT sticks strictly to the context.	Final Answer: “For Parkinson’s Disease, our database indicates treatments including Dopamine, Ropinirole, and Entacapone. In the catalog, a medication containing Levodopa/Carbidopa (ALEDOV) is available. Please consult a neurologist for proper prescription.” ✅ Note: The answer does not invent other medications.

Use Case 5: Excluding Any External Information

Step	Action	Result
1.	User asks: “Is Aspirin good for preventing heart attacks?”	The question is vectorized.
2.	Qdrant finds: – `Excel.xlsx`: Dozens of entries for “ACID ACETILSALICYLIC” (active ingredient in Aspirin) with therapeutic action: “ANTITHROMBOTIC ANTIPLATELET”. – `Treatments.doc`: “Strokes -> Acetylsalicylic Acid” (semantically similar to “prevent heart attack”).	No direct mention of “heart attack,” but cardiovascular uses are found.
3.	The prompt includes only these pharmaceutical and therapeutic data.	No direct medical statements about “heart attack prevention.”
4.	OpenAI GPT sticks to the payload data.	Final Answer: “In our database, Acetylsalicylic Acid (active ingredient in Aspirin) is classified as antithrombotic and antiplatelet, with indications including strokes. Please consult a physician for specific guidance and dosage.” ✅ Note: The answer is neutral, data-based, without confirming or denying a specific medical use.

9. 🧠 How Logical Connections Between Files Are Made

Who makes the connections? Not Qdrant, but OpenAI GPT.
How?
1. All files (Excel and DOCs) are “chunked” into smaller text pieces.
2. Each chunk is vectorized and inserted into Qdrant as an independent point.
3. Upon a query, Qdrant returns the most relevant fragments, regardless of which file they come from.
4. GPT receives these mixed fragments and synthesizes them, finding semantic links between them to build a coherent answer.

Example: If one fragment mentions “Paracetamol” and another “liver contraindications,” and the user asks about liver effects, GPT connects the two fragments to provide a complete answer.

🔗 System Knowledge Structure: The vector database is built on a set of 4 files forming a complete clinical hierarchy:

Symptoms.doc – Structured list of symptoms (e.g., Headache, Vertigo, Muscle Weakness).
Diagnoses.doc – Possible diagnoses for each symptom.
Treatments.doc – Recommended treatments for each diagnosis.
Excel.xlsx – Catalog of specific commercial products (Name, INN, Manufacturer, ATC Code, etc.).

This structure allows the system to follow a complete logical chain: from the patient’s symptom to the appropriate diagnoses, to the recommended treatment, and finally to the specific medications available in the catalog.

✅ Final Conclusion

The system built is robust and perfectly achieves the goal of providing a specialized and controlled chatbot:

Qdrant is the high-performance semantic memory.
OpenAI is the brain that understands and generates text.
AI Engine is the conductor that applies the score (severe instructions).
Severe Instructions are the logical quarantine ensuring lobotomization, forcing the system to respond only from the specific data source.

This architecture eliminates the risk of “hallucinations” (invented responses) and provides full control over information, making it ideal for sensitive domains such as medical applications. The final system will function as a virtual neurology expert, reporting exclusively on the knowledge base structured in the 4 files, providing safe, traceable, and verified answers.

Recent Posts

Recent Comments