Small Language Models (SLMs) are compact neural models designed for efficiency, balancing a lightweight architecture with effective performance on tasks like sentiment analysis and embedding generation. MiniLM, developed by Microsoft, exemplifies this by offering good speed and accuracy on natural language understanding tasks while using modest resources. all-MiniLM-L6-v2 is a version of MiniLM fine-tuned for sentence embeddings.
In this article, we will explore SLMs and demonstrate how to build a symptoms-based diagnosis system using all-MiniLM-L6-v2.
Getting Started
Table of contents
- What is a Small Language Model (SLM)
- What is all-MiniLM-L6-v2
- Experimenting with all-MiniLM-L6-v2
- Sentence similarity using all-MiniLM-L6-v2
- Building a symptoms-based diagnosis system
- Importing necessary libraries
- Importing dataset
- Initializing sentence transformers
- Finding conditions by symptoms
- Testing with sample input
- Resources
What is a Small Language Model (SLM)
Small Language Models (SLMs) are lightweight versions of large language models (LLMs) designed to be computationally efficient while retaining robust language processing capabilities. Unlike LLMs, which require substantial hardware resources and often operate in cloud-based environments, SLMs can run on less powerful devices, making them suitable for edge applications or scenarios with limited resources.
Key Characteristics of SLMs:
- Compact Size: SLMs have fewer parameters, making them smaller in storage and faster in inference time compared to their larger counterparts.
- Efficiency: Optimized for resource-constrained environments without significant loss of functionality for common tasks.
- Specific Use Cases: Often tailored for particular tasks, such as classification, summarization, or recommendation systems, to maximize efficiency and relevance.
- Transfer Learning: Many SLMs are pre-trained on large datasets and fine-tuned for specific tasks, similar to LLMs, ensuring task-specific performance.
Examples of SLMs:
- MiniLM: Known for its efficiency, MiniLM achieves near state-of-the-art performance in tasks like semantic similarity and text classification with fewer computational resources.
- DistilBERT: A smaller, faster, and cheaper variant of BERT, designed for general-purpose tasks while maintaining strong accuracy.
- TinyBERT: Focused on low-latency applications and mobile device compatibility.
- ALBERT: A lite version of BERT that achieves compactness through parameter sharing and factorization techniques.
Applications:
SLMs are widely used in:
- Mobile and embedded systems for on-device processing.
- Real-time applications, such as chatbots or recommendation systems.
- Domains where low latency and privacy are critical (e.g., healthcare or financial systems).
What is all-MiniLM-L6-v2
MiniLM is a family of lightweight transformer-based models designed for natural language understanding and retrieval tasks. Developed by Microsoft Research, it aims for performance close to that of larger models like BERT while being computationally efficient. MiniLM is particularly useful for scenarios requiring real-time processing or where resources are limited, such as mobile or edge devices.
all-MiniLM-L6-v2 is a version of MiniLM fine-tuned for sentence embeddings. It is distributed for use with the Sentence Transformers library and is widely used for generating high-quality sentence embeddings in tasks requiring semantic textual similarity.
Key Characteristics:
- Architecture: "L6" refers to the 6-layer version of MiniLM, and "v2" marks an updated release of the model.
- Optimization: Fine-tuned with a contrastive objective on more than one billion sentence pairs drawn from many datasets, including MS MARCO, which gives it strong semantic understanding for similarity and retrieval tasks.
- Output: Produces 384-dimensional sentence embeddings, balancing quality and efficiency.
- Applications: Semantic search, text clustering, question answering systems, recommendation engines.
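To make the 384-dimensional output concrete, the storage cost of a set of embeddings can be estimated with simple arithmetic. The sketch below assumes each value is stored as a 4-byte float32, which to our knowledge is what the library returns by default. The helper name `embedding_storage_bytes` and the count of 400 sentences are illustrative only.

```python
EMBEDDING_DIM = 384      # output size of all-MiniLM-L6-v2
BYTES_PER_FLOAT32 = 4    # assumption: each embedding value is a float32


def embedding_storage_bytes(num_sentences: int, dim: int = EMBEDDING_DIM) -> int:
    """Return the raw bytes needed to store `num_sentences` embeddings."""
    return num_sentences * dim * BYTES_PER_FLOAT32


if __name__ == "__main__":
    # 400 is an illustrative sentence count, not a measured value
    total = embedding_storage_bytes(400)
    print(f"{total} bytes (~{total / 1024:.0f} KiB)")
```

Under these assumptions, a few hundred sentences fit in well under a megabyte, which is why keeping all embeddings in memory is practical for small datasets.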
Experimenting with all-MiniLM-L6-v2
Let's start exploring all-MiniLM-L6-v2 by installing the sentence-transformers library.
Installing dependencies
- Create and activate a virtual environment by executing the following commands.
python -m venv venv
source venv/bin/activate  # for Linux/macOS
venv\Scripts\activate  # for Windows
- Install the sentence-transformers and pandas libraries using pip.
pip install -U sentence-transformers pandas
Sentence similarity using all-MiniLM-L6-v2
Let's create embeddings for an array of sentences and compute the similarities between them.
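For intuition about the similarity scores used in this section, cosine similarity is the dot product of two vectors divided by the product of their lengths. The following is a minimal plain-Python sketch of that formula, not the library's implementation; the function name `cosine_similarity` and the toy 3-dimensional vectors are assumptions made for illustration.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Toy vectors standing in for real 384-dimensional embeddings
    print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))  # same direction
    print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # orthogonal
```

Vectors pointing in the same direction score close to 1 and orthogonal vectors score 0, so a higher score means two sentences are closer in meaning according to the model.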
- Create a file named app.py and add the following code to it.
from sentence_transformers import SentenceTransformer, util
# Load the MiniLM model
model = SentenceTransformer('all-MiniLM-L6-v2')
# Define an array of sentences
sentences = [
"The quick brown fox jumps over the lazy dog.",
"A fast dark fox leaps across a sleepy canine.",
"The weather is sunny and warm today.",
"The forecast predicts a bright and hot day."
]
# Create embeddings for each sentence
embeddings = model.encode(sentences, convert_to_tensor=True)
# Calculate pairwise cosine similarity
similarity_matrix = util.cos_sim(embeddings, embeddings)
# Display the similarity scores
print("Sentence Similarity Scores:")
for i in range(len(sentences)):
    for j in range(i + 1, len(sentences)):
        print(f"Similarity between \"{sentences[i]}\" and \"{sentences[j]}\": {similarity_matrix[i][j]:.4f}")
- Run the code using the following command to see the output.
python app.py
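The script prints one similarity score for each of the six sentence pairs. The two fox sentences and the two weather sentences should score noticeably higher than the cross-topic pairs.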
Building a symptoms-based diagnosis system
A symptoms-based diagnosis system using all-MiniLM-L6-v2 converts medical text, such as symptoms or treatments, into embeddings that capture context. Comparing the embedding of a user's symptoms with the embeddings of known conditions lets the system suggest the closest-matching condition and its treatments, helping users discover relevant care options.
Importing necessary libraries
Import Sentence Transformers to use the all-MiniLM-L6-v2 model and pandas to load the dataset.
import pandas as pd
from sentence_transformers import SentenceTransformer, util
pd.set_option('display.max_columns', None)
Importing dataset
Kaggle provides a dataset with information on symptoms and treatments for over 400 medical conditions.
Disease and Symptoms | Explore Symptoms and Treatments for 400+ Medical Conditions! | www.kaggle.com
This dataset is loaded into a Pandas DataFrame named df, and the first few entries are displayed to understand its structure and content.
df = pd.read_csv('Diseases_Symptoms.csv')
print(df.head())
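The later steps read the `Name`, `Symptoms`, and `Treatments` columns, so it can help to confirm they exist before computing embeddings. The sketch below does this with the standard `csv` module. The helper `missing_columns` and the inline sample rows are made up for illustration and are not taken from the Kaggle file.

```python
import csv
import io

# Columns that the rest of this article reads from the dataset
REQUIRED_COLUMNS = ("Name", "Symptoms", "Treatments")


def missing_columns(csv_file, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []  # fieldnames is None for an empty file
    return [name for name in required if name not in header]


if __name__ == "__main__":
    # Hypothetical sample standing in for Diseases_Symptoms.csv
    sample = io.StringIO(
        "Name,Symptoms,Treatments\n"
        "Example condition,\"Fever, cough\",Rest and fluids\n"
    )
    print(missing_columns(sample))  # an empty list means nothing is missing
```

For the real file, the same function could be called on a handle opened with `open('Diseases_Symptoms.csv', newline='', encoding='utf-8')`, assuming the file is UTF-8 encoded.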
Initializing sentence transformers
The Sentence Transformer model all-MiniLM-L6-v2 is initialized to convert the symptom descriptions in the dataset's Symptoms column into vector embeddings. A new column, Symptom_Embedding, is added to the DataFrame to store the embeddings for each disease's symptoms.
model = SentenceTransformer('all-MiniLM-L6-v2')
df['Symptom_Embedding'] = df['Symptoms'].apply(lambda x: model.encode(x))
Finding conditions by symptoms
Define a function, find_condition_by_symptoms(), that identifies the best-matching medical condition for user-provided symptoms. It generates an embedding for the input symptoms and calculates its cosine similarity with the pre-computed embedding of each disease in the dataset. The similarity scores are stored in the Similarity column, and the row with the highest score is selected as the best match using .idxmax(). The function then returns the Name of the disease and its corresponding Treatments.
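def find_condition_by_symptoms(input_symptoms):
    # Embed the user's symptom description
    input_embedding = model.encode(input_symptoms)
    # Score every condition by cosine similarity to the input
    df['Similarity'] = df['Symptom_Embedding'].apply(lambda x: util.cos_sim(input_embedding, x).item())
    # Select the row with the highest similarity score
    best_match = df.loc[df['Similarity'].idxmax()]
    return best_match['Name'], best_match['Treatments']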
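find_condition_by_symptoms() returns only the single highest-scoring row. As a possible variation that is not part of the original code, the sketch below ranks precomputed similarity scores and keeps the top few above a cutoff, so that weak matches can be dropped. The function `top_matches`, the default `min_score` of 0.5, and the sample scores are all hypothetical.

```python
import heapq
from typing import List, Tuple


def top_matches(
    scores: List[Tuple[str, float]], k: int = 3, min_score: float = 0.5
) -> List[Tuple[str, float]]:
    """Return up to k (name, score) pairs with score >= min_score, best first."""
    eligible = [pair for pair in scores if pair[1] >= min_score]
    return heapq.nlargest(k, eligible, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Made-up similarity scores, not output from the model
    sample_scores = [
        ("Condition A", 0.82),
        ("Condition B", 0.31),
        ("Condition C", 0.67),
        ("Condition D", 0.58),
    ]
    for name, score in top_matches(sample_scores, k=2):
        print(f"{name}: {score:.2f}")
```

With the article's DataFrame, the input list could be built with `list(zip(df['Name'], df['Similarity']))` once the Similarity column has been computed. A suitable cutoff would need to be chosen by inspecting real scores.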
Testing with sample input
Provide an example symptom description and pass it to the find_condition_by_symptoms() function. The function returns the name of the matching condition along with the recommended treatments, which are then printed.
symptoms = "Fever, sore throat, and fatigue"
condition, treatments = find_condition_by_symptoms(symptoms)
print("Symptoms:", symptoms)
print("Condition:", condition)
print("Recommended Treatments:", treatments)
Final code
Below is the complete code for the app.
import pandas as pd
from sentence_transformers import SentenceTransformer, util
pd.set_option('display.max_columns', None)
# Load the data
df = pd.read_csv('Diseases_Symptoms.csv')
# print(df.head())
# Initialize a Sentence Transformer model to generate embeddings
model = SentenceTransformer('all-MiniLM-L6-v2')
# Generate embeddings for each condition's symptoms
df['Symptom_Embedding'] = df['Symptoms'].apply(lambda x: model.encode(x))
# Function to find matching condition based on input symptoms
def find_condition_by_symptoms(input_symptoms):
    input_embedding = model.encode(input_symptoms)
    df['Similarity'] = df['Symptom_Embedding'].apply(lambda x: util.cos_sim(input_embedding, x).item())
    best_match = df.loc[df['Similarity'].idxmax()]
    return best_match['Name'], best_match['Treatments']
# Sample input and output
symptoms = "Fever, sore throat, and fatigue"
condition, treatments = find_condition_by_symptoms(symptoms)
print("Symptoms:", symptoms)
print("Condition:", condition)
print("Recommended Treatments:", treatments)
Running the app prints the input symptoms, the best-matching condition, and its recommended treatments.
all-MiniLM-L6-v2 can help improve healthcare accessibility and efficiency through symptom-based condition matching. By generating embeddings for user-provided symptoms, the system can identify the closest-matching condition in the dataset and surface its treatment recommendations. However, challenges such as incomplete data, symptom variability, and data security need to be addressed to improve accuracy and user experience.
Thanks for reading this article!
Thanks Gowri M Bhatt for reviewing the content.
The full source code for this tutorial can be found here,