Using Python to Develop a Multi-Modal Data Fusion Framework
Multi-modal data fusion combines data from several sources so that a system can make better-informed decisions than it could from any single source. For example, an image recognition system benefits from images collected from various sources, giving it a wider range of evidence before reaching a final decision.
This article demonstrates the idea using three data types: numerical data, text, and images. We will first process each data set on its own, and finally merge the three to support a better decision.
Prerequisites
To fully understand this topic, one should have basic knowledge of the following:
- Pandas, the Natural Language Toolkit (NLTK), OpenCV, and scikit-learn
- Machine learning
- Python and its libraries.
Developing a Multi-Modal Data Fusion Framework
In order to achieve this, we will take the following three steps:
- Libraries installation
- Data Preparation
- Practical Examples
The code can run in various environments, e.g., PyCharm, Jupyter Notebook, and Python IDLE. In this case, we shall use VS Code, which can be downloaded and installed on your machine.
How to Install the Libraries
To install the libraries, do the following:
- Open your command prompt (cmd) and run it as an administrator.
- Optionally, move to the drive root by running cd .. repeatedly until the prompt shows the root directory. Note that pip installs packages machine-wide regardless of the current directory.
- Then run each of the commands below to install the corresponding library.
pip install pandas
This library will help process the numerical data in our example.
pip install nltk
This library will help perform the sentiment analysis on our text data.
pip install opencv-python
This library (imported as cv2) will help process the image data in our example.
This library provides machine-learning utilities such as feature scaling.
pip install scikit-learn
This library is used to plot the image histograms in Example 1.2.
pip install matplotlib
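Once the installs finish, a quick sanity check can confirm the packages are importable before running the examples. This is a minimal sketch of my own (not part of the original tutorial); note that the pip package names differ from the module names you import (opencv-python imports as cv2, scikit-learn as sklearn):

```python
import importlib.util

# Map each pip package name to the module name you actually import.
packages = {
    "pandas": "pandas",
    "nltk": "nltk",
    "opencv-python": "cv2",
    "scikit-learn": "sklearn",
}

# find_spec returns None when a module is not installed,
# without actually importing it.
for pip_name, module_name in packages.items():
    found = importlib.util.find_spec(module_name) is not None
    print(f"{pip_name}: {'installed' if found else 'missing'}")
```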
Data Preparation
In this project, three data types will be used: text, image, and numerical data. Combining data from multiple sources gives the model better decision-making capability than any single source alone.
Development of Multi-Modal Data Fusion
We shall now prepare a simple dataset for each modality.
Processing Text Data
A brief text will be subjected to sentiment analysis using NLTK (Natural Language Toolkit). The aim is to get a sentiment score out of the text.
Example 1.1
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer
nltk.download('vader_lexicon')
# Example
text = "Programming is sweet if you code everyday"
sia = SentimentIntensityAnalyzer()
# score
sentiment_score = sia.polarity_scores(text)
print("Sentiment Score:", sentiment_score)
See terminal output below:
Sentiment Score: {'neg': 0.0, 'neu': 0.667, 'pos': 0.333, 'compound': 0.4588}
The scores above are mixed: partly neutral, partly positive, with no negative component. Under the simple rule used here, a compound score less than or equal to 0 is treated as negative sentiment, while a compound score above 0 is treated as positive sentiment.
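The threshold rule above can be captured in a small helper. This is a sketch of my own (the function name classify_sentiment is hypothetical, not from NLTK), applied to the compound score from Example 1.1:

```python
# Classify a VADER compound score using the simple rule from the text:
# scores <= 0 are negative, scores > 0 are positive.
def classify_sentiment(compound: float) -> str:
    return "positive" if compound > 0 else "negative"

# The score dictionary printed in Example 1.1.
sentiment_score = {'neg': 0.0, 'neu': 0.667, 'pos': 0.333, 'compound': 0.4588}
print(classify_sentiment(sentiment_score['compound']))  # positive
```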
Processing Image data
In this section, we shall plot a histogram of the image's colour channels; in this case, the RGB channels are used, as shown below.
Example 1.2
import cv2
import numpy as np
from matplotlib import pyplot as plt
# Load an image (replace with a valid path on your machine)
img = cv2.imread('/home/thuo/Desktop/hero.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
# Compute a histogram for each colour channel
hist_r = cv2.calcHist([img], [0], None, [256], [0, 256])
hist_g = cv2.calcHist([img], [1], None, [256], [0, 256])
hist_b = cv2.calcHist([img], [2], None, [256], [0, 256])
# Histogram output
plt.figure()
plt.title('Color Output')
plt.xlabel('Pixel intensity')
plt.ylabel('Frequency')
plt.xlim(0, 256)
plt.plot(hist_r, color='red')
plt.plot(hist_g, color='green')
plt.plot(hist_b, color='blue')
plt.show()
NB: A correct image path must be used; otherwise, the code will raise an error.
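For fusion, each colour channel's histogram will later be reduced to a single number: its mean intensity. The idea can be sketched with NumPy alone, using a small synthetic image in place of hero.jpg (the pixel values here are arbitrary, chosen only for illustration):

```python
import numpy as np

# Synthetic 4x4 RGB image standing in for the loaded photo.
img = np.zeros((4, 4, 3), dtype=np.uint8)
img[..., 0] = 200  # red channel
img[..., 1] = 100  # green channel
img[..., 2] = 50   # blue channel

# Reduce each channel to its mean intensity: one number per channel.
img_feature = tuple(img[..., c].mean() for c in range(3))
print(img_feature)  # (200.0, 100.0, 50.0)
```

Example 1.4 performs the analogous reduction on the calcHist output, averaging each channel's histogram into a single value.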
Processing Numerical Data
Here, we shall process the third data category, numerical sensor readings (humidity, in this case), before finally combining all three data processing categories.
Example 1.3
NB: The humidity data is assumed to have been collected over several months of the same year.
import pandas as pd
sensor_data = {
"timestamp": ["2022-04-28 15:58", "2022-07-01 07:26", "2022-10-14 15:58", "2022-12-19 03:45"],
"humidity": [14, 18.3, 25.7, 20.8] # Humidity values
}
sensor_df = pd.DataFrame(sensor_data)
print("Sensor Data:")
print(sensor_df)
The terminal output is shown below:
Sensor Data:
timestamp humidity
0 2022-04-28 15:58 14.0
1 2022-07-01 07:26 18.3
2 2022-10-14 15:58 25.7
3 2022-12-19 03:45 20.8
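As with the other modalities, the humidity column will eventually be reduced to a single feature. A short sketch using the same data shows the reduction (Example 1.4 later does the equivalent with np.mean):

```python
import pandas as pd

# The sensor data from Example 1.3.
sensor_data = {
    "timestamp": ["2022-04-28 15:58", "2022-07-01 07:26",
                  "2022-10-14 15:58", "2022-12-19 03:45"],
    "humidity": [14.0, 18.3, 25.7, 20.8],
}
sensor_df = pd.DataFrame(sensor_data)

# One scalar feature for the numerical modality: the mean humidity.
sensor_feature = sensor_df["humidity"].mean()
print(sensor_feature)  # 19.7
```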
Merging the data entities
We shall now merge the three data categories described above into one entity for better outcomes, decision-making in particular.
When merging, the following inputs will be fed in to produce a combined feature vector as the output:
- Terminal output of Example 1.1: the sentiment score, entered as sentiment_score = {'neg': 0.0, 'neu': 0.667, 'pos': 0.333, 'compound': 0.4588}.
- Terminal output of Example 1.2: the processed image data. NB: The image path used in this section points to the saved histogram figure, not the originally uploaded image, e.g. image_path = "/home/thuo/Desktop/Figure_1.png".
- Terminal output of Example 1.3: the humidity readings 14.0, 18.3, 25.7, and 20.8, omitting the dates.
Example 1.4
import numpy as np
import cv2
import pandas as pd
sentiment_score = {'neg': 0.0, 'neu': 0.667, 'pos': 0.333, 'compound': 0.4588}
img_path = "/home/thuo/Desktop/Figure_1.png"
img = cv2.imread(img_path)
# cv2.imread returns BGR; convert to RGB so the channel labels below are correct
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
hist_r = cv2.calcHist([img], [0], None, [256], [0, 256])
hist_g = cv2.calcHist([img], [1], None, [256], [0, 256])
hist_b = cv2.calcHist([img], [2], None, [256], [0, 256])
sensor_data = {'humidity': [14.0, 18.3, 25.7, 20.8]}
sensor_df = pd.DataFrame(sensor_data)
text_feature = sentiment_score['compound'] # Sentiment score
img_feature = (np.mean(hist_r), np.mean(hist_g), np.mean(hist_b)) # Average color intensity
sensor_feature = np.mean(sensor_df['humidity'])
combined_features = np.array([text_feature, *img_feature, sensor_feature])
print("Combined Features Vector:", combined_features)
The terminal output is shown below:
Combined Features Vector: [4.58800000e-01 3.42895312e+03 3.42895312e+03 3.42895312e+03
1.97000000e+01]
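The combined vector mixes very different scales: the sentiment score is below 1 while the histogram means are in the thousands, which would let the image features dominate any downstream model. scikit-learn (installed earlier but not yet used) can rescale the entries; this is a sketch of my own using MinMaxScaler on the values printed above. In a real pipeline you would fit the scaler on many samples; here the single vector is reshaped so each entry is treated as one sample, mapping the vector's entries onto [0, 1]:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# The combined feature vector printed by Example 1.4.
combined_features = np.array([0.4588, 3428.95312, 3428.95312, 3428.95312, 19.7])

# MinMaxScaler scales column-wise over samples; reshape the vector into
# a single column so its smallest entry maps to 0 and its largest to 1.
scaler = MinMaxScaler()
scaled = scaler.fit_transform(combined_features.reshape(-1, 1)).ravel()
print("Scaled Features Vector:", scaled)
```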
Conclusion
In this tutorial, we first examined the output of each of the three data sets individually; taken alone, each makes it difficult to reach a well-informed decision about the data. We then merged the three data entities into a single feature vector to support a more informed final output. The same procedure can be extended with more complex computations to make even better-informed decisions.
Happy coding!