In this tutorial, we’ll build TopicVis, a desktop application that:
Loads multiple text files
Uses LDA topic modeling
Visualizes topics as word clouds
Exports results to CSV
Saves & loads projects
The app is built with Tkinter, ttkbootstrap, and scikit-learn, which makes it a good starting point for beginners who want to combine machine learning with GUI apps.
🛠 Prerequisites
Install the required libraries first:
pip install ttkbootstrap tkinterdnd2 scikit-learn wordcloud pillow
1️⃣ Importing Required Libraries
We start by importing everything we need: GUI tools, ML tools, file handling, and visualization libraries.
import sys, os, json, csv
import tkinter as tk
from tkinter import filedialog, messagebox
import ttkbootstrap as tb
from tkinterdnd2 import TkinterDnD, DND_FILES
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from wordcloud import WordCloud
from PIL import Image, ImageTk
🔍 What’s happening?
Tkinter / ttkbootstrap → UI
scikit-learn → topic modeling (LDA)
WordCloud + PIL → image generation
json / csv → saving & exporting results
2️⃣ App Metadata (Professional Touch)
Define product-level metadata for versioning and licensing.
APP_NAME = "TopicVis"
APP_VERSION = "2.1"
COMPANY = "Mate Technologies"
LICENSE_TEXT = (
    "This software is licensed for personal, academic, and commercial use.\n\n"
    "THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY."
)
This makes your app feel commercial-ready.
3️⃣ Creating the Main Application Window
Now we initialize the main window using TkinterDnD.
app = TkinterDnD.Tk()
app.title(f"{APP_NAME} v{APP_VERSION}")
app.geometry("1250x720")
tb.Style("darkly")
💡 Why ttkbootstrap?
It gives us modern, ready-made themes without having to configure ttk styles by hand.
4️⃣ Global App State
We store app-wide data here.
documents = []
topics_cache = []
wordcloud_images = []
These lists hold:
File paths
Topic modeling results
WordCloud images (to prevent garbage collection)
5️⃣ Utility Dialog Functions
Reusable helpers for showing messages.
def show_error(title, msg):
    messagebox.showerror(title, msg)

def show_info(title, msg):
    messagebox.showinfo(title, msg)
6️⃣ App Header UI
A subtle header for branding.
tb.Label(
    app,
    text="Commercial Topic Modeling & Visualization Tool",
    font=("Segoe UI", 10, "italic"),
    foreground="#AAAAAA"
).pack(fill="x", padx=12, pady=(6, 8))
7️⃣ Model Settings Panel
Users control the topic model here.
row1 = tb.Labelframe(app, text="Model Settings", padding=10)
row1.pack(fill="x", padx=10)
Reusable Input Field Helper
def field(parent, label, default, width=10):
    tb.Label(parent, text=label, width=14, anchor="w").pack(side="left")
    entry = tb.Entry(parent, width=width)
    entry.insert(0, default)
    entry.pack(side="left", padx=(0, 8))
    return entry
Fields
num_topics = field(row1, "Num Topics", "5", 6)
max_words = field(row1, "Words / Topic", "10", 6)
theme_name = field(row1, "Theme Name", "superhero", 16)
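The entries return plain strings, so a later call such as int(num_topics.get()) raises ValueError if the user types something non-numeric. Below is a minimal sketch of a parsing helper that falls back to a default instead. Note that parse_positive_int is a hypothetical name, not part of the original app, and the fallback behavior is an assumption about what you would want.

```python
def parse_positive_int(text, default):
    """Return text as a positive int, or default if it isn't one.

    Hypothetical helper: the original app calls int(...) directly.
    """
    try:
        value = int(text.strip())
    except ValueError:
        return default
    return value if value > 0 else default


# Example with the kind of strings an Entry widget returns
print(parse_positive_int("5", 5))    # valid input is used as-is
print(parse_positive_int("abc", 5))  # non-numeric input falls back to the default
print(parse_positive_int("-3", 5))   # non-positive input falls back to the default
```

In the app, this could wrap the entry values, for example parse_positive_int(num_topics.get(), 5).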
8️⃣ Action Buttons
Buttons for loading files, running the model, and exporting data.
row2 = tb.Labelframe(app, text="Actions", padding=10)
row2.pack(fill="x", padx=10, pady=6)
Add Text Files
tb.Button(
    row2,
    text="Add Text Files",
    command=lambda: add_documents(
        filedialog.askopenfilenames(filetypes=[("Text Files", "*.txt")])
    )
).pack(side="left", padx=4)
Run Model Button
tb.Button(
    row2,
    text="RUN MODEL",
    bootstyle="success",
    command=lambda: run_model()
).pack(side="right", padx=6)
9️⃣ Scrollable Word Cloud Gallery
This section displays generated topic visuals.
gallery = tb.Labelframe(app, text="Topic Word Clouds", padding=10)
gallery.pack(fill="both", expand=True, padx=10)
Canvas + Scrollbar Setup
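canvas = tk.Canvas(gallery)
scroll = tk.Scrollbar(gallery, command=canvas.yview)
frame = tk.Frame(canvas)
canvas.create_window((0, 0), window=frame, anchor="nw")
canvas.configure(yscrollcommand=scroll.set)
# Keep the scrollable area in sync with the frame's size
frame.bind("<Configure>", lambda e: canvas.configure(scrollregion=canvas.bbox("all")))
canvas.pack(side="left", fill="both", expand=True)
scroll.pack(side="right", fill="y")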
This lets us show many topics without resizing the window.
🔟 Loading Documents
def add_documents(paths):
    for p in app.tk.splitlist(paths):
        if p not in documents:
            documents.append(p)
Each selected .txt file is added once.
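One caveat: the check above compares path strings, so the same file reached through two different spellings (for example notes.txt and ./notes.txt) would be added twice. Here is a rough sketch of a stricter variant that compares normalized absolute paths. The name add_unique_paths and its return value are assumptions made for illustration, not part of the original app.

```python
import os


def add_unique_paths(existing, new_paths):
    """Append paths not already present, comparing normalized absolute paths.

    Hypothetical variant of add_documents; returns how many paths were added.
    """
    seen = {os.path.normcase(os.path.abspath(p)) for p in existing}
    added = 0
    for p in new_paths:
        key = os.path.normcase(os.path.abspath(p))
        if key not in seen:
            seen.add(key)
            existing.append(p)
            added += 1
    return added


docs = []
add_unique_paths(docs, ["notes.txt", "./notes.txt", "report.txt"])
print(docs)  # the second spelling of notes.txt is skipped
```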
1️⃣1️⃣ Running Topic Modeling (Core Logic)
Read Text Files
from pathlib import Path  # read_text opens and closes each file for us
texts = [Path(p).read_text(encoding="utf-8") for p in documents]
Convert Text → Numbers
vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(texts)
Train LDA Model
lda = LatentDirichletAllocation(
    n_components=int(num_topics.get()),
    random_state=42
)
lda.fit(X)
1️⃣2️⃣ Generating Word Clouds
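feature_names = vectorizer.get_feature_names_out()
for idx, topic in enumerate(lda.components_):
    # Keep the highest-weighted words for this topic
    words = {
        feature_names[i]: float(topic[i])
        for i in topic.argsort()[-int(max_words.get()):]
    }
    topics_cache.append(words)  # reused by the CSV export and project save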
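The argsort()[-n:] idiom used above can look cryptic at first. The sketch below walks through it on a tiny made-up weight vector; the vocabulary and the weights are invented for illustration.

```python
import numpy as np

# Invented vocabulary and topic weights, for illustration only
vocab = np.array(["apple", "bank", "cat", "dog", "egg"])
topic = np.array([0.2, 3.1, 0.5, 2.4, 1.0])

n = 3
# argsort() returns indices that sort the weights in ascending order,
# so the last n indices point at the n largest weights
top = topic.argsort()[-n:]

words = {str(vocab[i]): float(topic[i]) for i in top}
print(words)
```

One edge case to keep in mind: if n is 0, the slice [-0:] is the same as [0:] and returns every word rather than none, so it is worth validating the "Words / Topic" value before using it.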
Create WordCloud Image
    # Still inside the for loop from the previous step
    wc = WordCloud(
        width=420,
        height=300,
        background_color="white"
    ).generate_from_frequencies(words)
Display in UI
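    # Still inside the for loop
    img = wc.to_image()
    img.thumbnail((260, 200))
    tk_img = ImageTk.PhotoImage(img)
    wordcloud_images.append(tk_img)  # keep a reference so the image isn't garbage-collected
    lbl = tb.Label(
        frame,
        image=tk_img,
        text=f"Topic {idx+1}",
        compound="top"
    )
    lbl.grid(row=idx//4, column=idx%4, padx=6, pady=6)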
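The row=idx//4, column=idx%4 arithmetic lays the labels out four per row. A quick sketch of the same mapping, with grid_position as a hypothetical helper name used only for this illustration:

```python
COLUMNS = 4  # matches the idx//4 and idx%4 in the grid call above


def grid_position(idx, columns=COLUMNS):
    """Map a zero-based topic index to (row, column) in a fixed-width grid."""
    return divmod(idx, columns)


for idx in range(6):
    print(f"Topic {idx + 1} -> {grid_position(idx)}")
```

Topics 1 to 4 fill the first row and Topic 5 wraps to the start of the second row.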
1️⃣3️⃣ Export Topics to CSV
def export_csv():
    path = filedialog.asksaveasfilename(defaultextension=".csv")
    if not path:
        return
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Topic", "Word", "Weight"])
        for i, topic in enumerate(topics_cache, 1):
            for word, weight in topic.items():
                writer.writerow([i, word, round(weight, 4)])
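To check the CSV layout without opening a file dialog, the writing logic can be pulled into a function that accepts any file-like object. This is only a sketch: write_topics_csv and the sample topics are made up for illustration, and it assumes topics_cache holds one word-to-weight dict per topic, as the export code implies.

```python
import csv
import io


def write_topics_csv(topics, fileobj):
    """Write one row per (topic number, word, weight) to an open file-like object."""
    writer = csv.writer(fileobj)
    writer.writerow(["Topic", "Word", "Weight"])
    for i, topic in enumerate(topics, 1):
        for word, weight in topic.items():
            writer.writerow([i, word, round(weight, 4)])


# Invented sample data in the same shape as topics_cache
sample_topics = [{"market": 3.14159, "stocks": 2.5}, {"cats": 1.23456}]

buffer = io.StringIO()
write_topics_csv(sample_topics, buffer)
print(buffer.getvalue())
```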
1️⃣4️⃣ Save & Load Projects
Save
with open(path, "w", encoding="utf-8") as f:
    json.dump({
        "documents": documents,
        "theme": theme_name.get(),
        "topics": topics_cache
    }, f, indent=2)
Load
with open(path, encoding="utf-8") as f:
    data = json.load(f)
documents[:] = data.get("documents", [])
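The snippets above assume a path variable, which would typically come from a file dialog. As a rough sketch of how saving and loading fit together, here is a round trip through a temporary file. The function names save_project and load_project and the fallback theme are assumptions, not taken from the original app.

```python
import json
import os
import tempfile


def save_project(path, documents, theme, topics):
    """Write the project state to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(
            {"documents": documents, "theme": theme, "topics": topics},
            f,
            indent=2,
        )


def load_project(path):
    """Read a project file, filling in defaults for any missing keys."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    return {
        "documents": data.get("documents", []),
        "theme": data.get("theme", "superhero"),  # assumed fallback theme
        "topics": data.get("topics", []),
    }


# Round trip through a temporary directory with made-up project data
with tempfile.TemporaryDirectory() as tmp:
    project_path = os.path.join(tmp, "demo.json")
    save_project(project_path, ["a.txt"], "superhero", [{"cats": 1.5}])
    restored = load_project(project_path)

print(restored)
```

Using data.get with a default keeps older or hand-edited project files loadable even if a key is missing.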
1️⃣5️⃣ About & License Window
def show_about():
    win = tb.Toplevel(app)
    win.title("About")
    win.geometry("560x460")
This adds professional polish and licensing clarity.
🚀 Final Step: Run the App
app.mainloop()
🎉 What You Built
✅ A real desktop ML app
✅ Topic modeling with LDA
✅ Visual word clouds
✅ CSV export
✅ Save/load projects
