RAJAT ROY

Posted on Jul 29

Transforming Legacy Insurance with Neo4j, Redis and AI: Robust, Scalable, Low-Latency Smart Crawling of a Legacy Portal

#redischallenge #devchallenge #database #ai

Redis AI Challenge: Real-Time AI Innovators

This is a submission for the Redis AI Challenge: Real-Time AI Innovators.

📘 Story: Transforming Legacy Insurance with Neo4j, Redis & AI

The Problem: Legacy Hazard

Most traditional insurance systems built with JSPs (JavaServer Pages) more than 20 years ago suffer from:

Fragmented front-ends (200+ JSPs)
Poor documentation
Difficult user navigation and steep learning curves

Agents, underwriters, and claim processors often rely on tribal knowledge or IT support for simple queries like:

“Where do I upload KYC documents?”

Real‑World Use Case: Meet I‑Helper

I-Helper is a conversational AI bot that lets users ask questions in natural language:

“How to check policy cancellation reasons?”

It responds:

“Please check CancellationReason.jsp under Policy Actions tab → Cancel Policy section.”

No more wandering across JSPs — it understands intent and maps it directly to legacy screens.

🧠 Vision: AI-Powered Legacy Navigation (AWS)

The Core Stack Includes:

🕸 Neo4j: To represent JSP pages, fields, and their interlinked flow
⚡ Redis: Supercharged low-latency semantic search
🧠 AI models: To interpret and generate answers
🔌 WebSocket API: Real-time question answering

GitHub link:
https://github.com/roy777rajat/jsp-crawler-insurance

Demo (Live Playground)

https://hiinsurance.streamlit.app/

Architecture Overview

[Image: architecture diagram]

🚀 Redis: The Real-Time Vector Engine

🧩 Where Redis Fits

Once the JSP pages are crawled and structured:

Each label, field, and screen name is embedded into a vector using a model such as Titan Embeddings.
These vectors are stored in a Redis instance with Vector Similarity Search (VSS) enabled.
Redis acts as the first responder in the QnA pipeline:
- A user asks a question such as “Where to enter nominee name?”
- The system vectorizes the query and retrieves the top-K matches (e.g., NomineeName.jsp, ClaimantDetails.jsp) in under 10 milliseconds.
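
To make the retrieval step concrete, here is a minimal pure-Python sketch of what a top-K cosine lookup computes. It is an illustration only, not how Redis implements the search; the function names, the toy index, and the three-dimensional vectors are made-up stand-ins for the real 1024-dimension embeddings.

```python
import math


def cosine_distance(a, b):
    """Cosine distance = 1 - cosine similarity (0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query_vec, indexed, k=2):
    """Return the k (page_name, distance) pairs closest to query_vec."""
    scored = [(name, cosine_distance(query_vec, vec)) for name, vec in indexed.items()]
    return sorted(scored, key=lambda pair: pair[1])[:k]


# Hypothetical toy index: page name -> tiny embedding
toy_index = {
    "NomineeName.jsp": [0.9, 0.1, 0.0],
    "ClaimantDetails.jsp": [0.7, 0.3, 0.1],
    "CancellationReason.jsp": [0.0, 0.2, 0.9],
}
query = [1.0, 0.1, 0.0]  # stand-in for the embedded user question
hits = top_k(query, toy_index, k=2)
print([name for name, _ in hits])
```

The closest pages come first because the list is sorted by ascending distance, which is the same ordering the Redis query further down requests with its sort on vector_score.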

🏎 Why Redis?

Feature	Why It Matters
🕒 Low Latency	Responses are returned in <10ms, crucial for chat-based UX
💾 Memory-first	In-memory retrieval avoids disk I/O on the read path
💡 Semantic Matching	Redis vector index retrieves conceptually similar matches, not just keywords
🆓 Free Tier Advantage	The Redis Cloud Free Tier provides 30 MB, enough to hold a few thousand 1024-dimension vectors for a PoC
🔌 Easy Integration	Works well with Python (`redis-py`) and AWS Lambda in event-driven architectures
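
As a rough sizing check, the raw vector footprint can be estimated with simple arithmetic. This is a back-of-the-envelope sketch that assumes one FLOAT32 vector of 1024 dimensions per page (as in the upsert code later in this post) and ignores key names, the text field, and index overhead; the helper names are hypothetical.

```python
BYTES_PER_FLOAT32 = 4
VECTOR_DIM = 1024  # same dimension as the index defined in this post


def bytes_per_vector(dim=VECTOR_DIM):
    """Raw size of one FLOAT32 embedding."""
    return dim * BYTES_PER_FLOAT32


def raw_vector_footprint(num_vectors, dim=VECTOR_DIM):
    """Lower bound on memory: vectors only, no keys, text, or index overhead."""
    return num_vectors * bytes_per_vector(dim)


print(bytes_per_vector())         # 4096 bytes per vector
print(raw_vector_footprint(200))  # 819200 bytes for 200 pages
```

So a portal of about 200 JSPs needs under 1 MB for the vectors themselves; the real footprint will be higher once metadata and the index are counted.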

🧠 How Redis Complements Neo4j

Redis = “What’s most similar to the user’s query?”
Neo4j = “How is this concept related to other UI screens and fields?”

Redis brings fast recall, while Neo4j brings deep reasoning.
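
One way to picture this split is a two-step lookup: take the page names returned by the vector search, then attach the relationships the graph knows about for each page. The sketch below is an assumption about how the two could be combined, not the project's actual code; a plain list of relationship dicts (in the same from/relation/to shape used in the upsert metadata) stands in for a Neo4j query, and all page names are illustrative.

```python
def expand_with_graph(hits, relationships, max_related=3):
    """Attach outgoing relationships to each vector-search hit."""
    enriched = []
    for hit in hits:
        related = [
            f"-[{r['relation']}]-> {r['to']}"
            for r in relationships
            if r["from"] == hit["page_name"]
        ][:max_related]
        enriched.append({**hit, "related": related})
    return enriched


# Hypothetical data: one hit from Redis and two edges from the graph
redis_hits = [{"page_name": "CancelPolicy.jsp", "vector_score": 0.21}]
graph_edges = [
    {"from": "CancelPolicy.jsp", "relation": "LINKS_TO", "to": "CancellationReason.jsp"},
    {"from": "PolicyActions.jsp", "relation": "LINKS_TO", "to": "CancelPolicy.jsp"},
]
answer_context = expand_with_graph(redis_hits, graph_edges)
print(answer_context[0]["related"])
```

Only edges that start at the matched page are attached, so the answer can mention where the user can go next from that screen.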

✅ Steps followed

Use redis-py v5+ to access VSS (Python client):
pip install "redis>=5.0.0"

Query the index with VSS
(used during real-time QnA with I-Helper, backed by Redis):

import numpy as np
import redis
from redis.commands.search.query import Query

# get_secrets() is the AWS Secrets Manager helper defined in the ingestion script below
secret = get_secrets("dev/python/api")
REDIS_HOST = secret["REDIS_HOST"]
REDIS_PORT = int(secret["REDIS_PORT"])
REDIS_USER = secret["REDIS_USER"]
REDIS_PASS = secret["REDIS_PASS"]
REDIS_INDEX = "page_index"
VECTOR_DIM = 1024
TOP_K = 2

redis_conn = redis.Redis(
    host=REDIS_HOST,
    port=REDIS_PORT,
    username=REDIS_USER,
    password=REDIS_PASS,
    decode_responses=True
)

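def search_redis_vector(query_embedding):
    # KNN query: fetch the TOP_K nearest vectors and expose the distance as vector_score
    base_query = f'*=>[KNN {TOP_K} @embedding $vec AS vector_score]'
    q = (
        Query(base_query)
        .sort_by("vector_score")
        .return_fields("page_name", "vector_score", "text")
        .dialect(2)
    )
    # The index stores FLOAT32 vectors, so the query vector is serialized the same way
    vector_bytes = np.array(query_embedding, dtype=np.float32).tobytes()
    results = redis_conn.ft(REDIS_INDEX).search(q, query_params={"vec": vector_bytes})

    # vector_score is a cosine distance (the index uses the COSINE metric), so lower means closer.
    # This check keeps hits whose distance is above the threshold; use `<` instead if the
    # threshold is meant as a maximum distance.
    SCORE_THRESHOLD = 0.4
    docs = []
    for doc in results.docs:
        score = float(doc.vector_score)
        if score > SCORE_THRESHOLD:
            docs.append({"page_name": doc.page_name, "score": score, "vector_score": score, "text": getattr(doc, "text", "")})
            print(f"Found doc: {doc.page_name} with score: {score}")
    return docs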
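
Because the index uses the COSINE metric, the vector_score returned by Redis is a cosine distance (1 minus cosine similarity), where lower means closer. If a similarity cutoff is easier to reason about, the results can be post-processed as in this hedged sketch. The function name, the min_similarity value, and the sample documents are assumptions for illustration, not part of the original pipeline.

```python
def filter_by_similarity(docs, min_similarity=0.6):
    """Convert cosine distance to similarity and keep only strong matches."""
    kept = []
    for doc in docs:
        similarity = 1.0 - doc["vector_score"]  # distance -> similarity
        if similarity >= min_similarity:
            kept.append({**doc, "similarity": similarity})
    # Best match first
    return sorted(kept, key=lambda d: d["similarity"], reverse=True)


# Hypothetical search results in the shape returned by search_redis_vector
sample_docs = [
    {"page_name": "ClaimantDetails.jsp", "vector_score": 0.5},
    {"page_name": "NomineeName.jsp", "vector_score": 0.25},
]
strong = filter_by_similarity(sample_docs, min_similarity=0.6)
print([d["page_name"] for d in strong])
```

With these sample values only NomineeName.jsp survives, since its similarity is 0.75 while ClaimantDetails.jsp sits at 0.5.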

Populate the index for VSS
(one-time and batch insertion: insert plus update, with the index created dynamically, backed by Redis):

import json

import boto3
import numpy as np
import redis
from redis.commands.search.field import TextField, VectorField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType
from redis.commands.search.query import Query

# === Configs ===
secretsmanager = boto3.client("secretsmanager", region_name="eu-west-1")
def get_secrets(secret_name):
    try:
        response = secretsmanager.get_secret_value(SecretId=secret_name)
        secret_dict = json.loads(response['SecretString'])
        return secret_dict
    except Exception as e:
        print(f"Failed to retrieve {secret_name} credentials: {e}")
        raise

secret = get_secrets("dev/python/api")


# Redis connection
redis_conn = redis.Redis(
    host=secret["REDIS_HOST"],
    port=int(secret["REDIS_PORT"]),
    username=secret["REDIS_USER"],
    password=secret["REDIS_PASS"],
    decode_responses=True
)

REDIS_INDEX_NAME = "page_index"
REDIS_VECTOR_DIM = 1024  # Must match the dimension of the Titan embeddings

# === Redis Upsert ===
def create_redis_index():
    try:
        redis_conn.ft(REDIS_INDEX_NAME).create_index(
            fields=[
                TextField("page_name"),
                TextField("text"),  # Added at a later stage
                VectorField("embedding", "FLAT", {
                    "TYPE": "FLOAT32",
                    "DIM": REDIS_VECTOR_DIM,
                    "DISTANCE_METRIC": "COSINE"
                })
            ],
            definition=IndexDefinition(prefix=["doc:"], index_type=IndexType.HASH)
        )
        print("Redis vector index created.")
    except Exception as e:
        print(f"Redis index may already exist or failed: {e}")


def upsert_to_redis(doc_id: str, embedding: list[float], metadata: dict):
    key = f"doc:{doc_id}"
    vector_bytes = np.array(embedding, dtype=np.float32).tobytes()

    # Prepare plain text summary (all nodes from Neo4J)
    fields_text = f"Fields: {', '.join(metadata.get('fields', []))}"
    actions_text = f"Actions: {', '.join(metadata.get('actions', []))}"
    relationship_list = [
        f"{r['from']} -[{r['relation']}]-> {r['to']}"
        for r in metadata.get('relationships', [])
    ]
    relationships_text = f"Relationships: {', '.join(relationship_list)}"
    flat_text = f"{fields_text}\n{actions_text}\n{relationships_text}"

    # Store vector, page_name, and contextual text in one hash.
    # HSET overwrites existing fields, so the same call handles insert and update.
    redis_conn.hset(
        key,
        mapping={
            "embedding": vector_bytes,
            "page_name": metadata["page_name"],
            "text": flat_text,
        },
    )
    print(f"Upserted doc_id: {doc_id} into Redis")
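
To show what ends up in each hash, here is a small standalone sketch that mirrors the text summary built inside upsert_to_redis and checks the size of a serialized vector. It uses only the standard library; the helper names and the sample metadata are hypothetical, and struct with little-endian float32 is assumed to match what NumPy produces on a little-endian machine.

```python
import struct


def build_flat_text(metadata):
    """Same three-line summary that upsert_to_redis stores in the text field."""
    fields_text = f"Fields: {', '.join(metadata.get('fields', []))}"
    actions_text = f"Actions: {', '.join(metadata.get('actions', []))}"
    relationship_list = [
        f"{r['from']} -[{r['relation']}]-> {r['to']}"
        for r in metadata.get("relationships", [])
    ]
    relationships_text = f"Relationships: {', '.join(relationship_list)}"
    return f"{fields_text}\n{actions_text}\n{relationships_text}"


def pack_float32(embedding):
    """Serialize floats as little-endian float32: 4 bytes per dimension."""
    return struct.pack(f"<{len(embedding)}f", *embedding)


# Hypothetical metadata for one crawled page
sample_metadata = {
    "page_name": "CancellationReason.jsp",
    "fields": ["reasonCode", "effectiveDate"],
    "actions": ["submit"],
    "relationships": [
        {"from": "PolicyActions.jsp", "relation": "LINKS_TO", "to": "CancellationReason.jsp"}
    ],
}
flat_text = build_flat_text(sample_metadata)
blob = pack_float32([0.0] * 1024)
print(flat_text)
print(len(blob))  # 1024 dimensions * 4 bytes
```

The byte length matters because the index is declared with DIM 1024 and TYPE FLOAT32, so each stored embedding is expected to be exactly 4096 bytes.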
