RAJAT ROY

Transforming Legacy Insurance with Neo4j, Redis and AI (robust, scalable, low latency): Smart Crawling Legacy Portal


This is a submission for the Redis AI Challenge: Real-Time AI Innovators.

📘 Story: Transforming Legacy Insurance with Neo4j, Redis & AI

The Problem: Legacy Hazard

Many traditional insurance systems, built with JSPs (JavaServer Pages) more than 20 years ago, suffer from:

  • Fragmented front-ends (200+ JSPs)
  • Poor documentation
  • Difficult user navigation and steep learning curves

Agents, underwriters, and claim processors often rely on tribal knowledge or IT support for simple queries like:

“Where do I upload KYC documents?”


Real‑World Use Case: Meet I‑Helper

A conversational AI bot that allows users to ask questions in natural language:

“How to check policy cancellation reasons?”

It responds:

“Please check CancellationReason.jsp under Policy Actions tab → Cancel Policy section.”

No more wandering across JSPs: it understands intent and maps it directly to legacy screens.


🧠 Vision: AI-Powered Legacy Navigation (AWS)

The Core Stack Includes:

  • 🕸 Neo4j: Represents JSP pages, their fields, and the flow that links them
  • ⚡ Redis: Low-latency semantic (vector) search
  • 🧠 AI models: Interpret questions and generate answers
  • 🔌 WebSocket API: Real-time question answering

GitHub link:
https://github.com/roy777rajat/jsp-crawler-insurance


Demo (Live Playground)

https://hiinsurance.streamlit.app/

Architecture Overview

High Level Layout (AWS Cloud)

Output Image

Final output: I-Helper rocks

🚀 Redis: The Real-Time Vector Engine

🧩 Where Redis Fits

Once the JSP pages are crawled and structured:

  1. Each label, field, and screen name is embedded into a vector using a model such as Titan Embeddings.
  2. These vectors are stored in Redis with Vector Similarity Search (VSS) enabled.
  3. Redis acts as the first responder in the Q&A pipeline:
    • A user asks a question such as “Where to enter nominee name?”
    • The system vectorizes the query and retrieves the top-K matches (e.g., NomineeName.jsp, ClaimantDetails.jsp) in under 10 milliseconds.
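
To make the retrieval step concrete, here is a minimal pure-Python sketch of what a top-K cosine search does conceptually. It is not the Redis implementation: the three-dimensional vectors are made up for illustration, and the helper names `cosine_distance` and `top_k` are hypothetical. A FLAT vector index performs the same exhaustive comparison, only over real embeddings and in optimized native code.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity (0.0 means the vectors point the same way)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def top_k(query_vec, stored, k=2):
    """Brute-force search: score every stored vector and keep the k closest."""
    scored = [(cosine_distance(query_vec, vec), name) for name, vec in stored.items()]
    scored.sort()  # smallest distance first
    return [(name, dist) for dist, name in scored[:k]]

# Toy 3-dimensional "embeddings", invented purely for illustration.
stored_pages = {
    "NomineeName.jsp": [0.9, 0.1, 0.0],
    "ClaimantDetails.jsp": [0.7, 0.3, 0.1],
    "CancellationReason.jsp": [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.0]  # stand-in for the embedded user question
matches = top_k(query, stored_pages, k=2)
```

With these toy vectors, `matches` lists NomineeName.jsp first and ClaimantDetails.jsp second, because their vectors point in nearly the same direction as the query.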

🏎 Why Redis?

Feature | Why It Matters
🕒 Low Latency | Responses are returned in <10 ms, crucial for chat-based UX
💾 Memory-first | In-memory retrieval avoids disk I/O on the query path
💡 Semantic Matching | The Redis vector index retrieves conceptually similar matches, not just keywords
🆓 Free Tier Advantage | The Redis Cloud free tier (30 MB) is enough for a small PoC index
🔌 Easy Integration | Works well with Python (redis-py) and AWS Lambda in event-driven architectures

🧠 How Redis Complements Neo4j

  • Redis = “What’s most similar to the user’s query?”
  • Neo4j = “How is this concept related to other UI screens and fields?”

Redis brings fast recall, while Neo4j brings deep reasoning.
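
As a rough illustration of this division of labour, the sketch below takes a page name that the vector search has already recalled and enriches it with its outgoing links from a screen graph. Everything here is an assumption made for the example: the in-memory `SCREEN_GRAPH` dict stands in for Neo4j, and the `NEXT_STEP` relation and `UploadDocuments.jsp` page are hypothetical.

```python
# Hypothetical screen graph: page -> list of (relation, target page).
# In the real system this information would come from Neo4j.
SCREEN_GRAPH = {
    "NomineeName.jsp": [("NEXT_STEP", "ClaimantDetails.jsp")],
    "ClaimantDetails.jsp": [("NEXT_STEP", "UploadDocuments.jsp")],
    "CancellationReason.jsp": [],
}

def expand_with_graph(recalled_pages, graph):
    """Attach each recalled page's outgoing links so an answer can describe navigation."""
    context = []
    for page in recalled_pages:
        links = [f"{page} -[{rel}]-> {target}" for rel, target in graph.get(page, [])]
        context.append({"page_name": page, "links": links})
    return context

recalled = ["NomineeName.jsp"]  # stand-in for the vector search result
context = expand_with_graph(recalled, SCREEN_GRAPH)
```

The link strings use the same `from -[relation]-> to` shape as the relationship text stored alongside each vector in the ingestion code.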

✅ Steps followed

Use redis-py v5+ to access VSS (Python client). Quote the requirement so the shell does not treat > as a redirect:
pip install "redis>=5.0.0"

Querying the vector index
(used during real-time Q&A with I-Helper, backed by Redis):

import numpy as np
import redis
from redis.commands.search.query import Query

# get_secrets() is the Secrets Manager helper defined in the ingestion script below
secret = get_secrets("dev/python/api")
REDIS_HOST = secret["REDIS_HOST"]
REDIS_PORT = int(secret["REDIS_PORT"])
REDIS_USER = secret["REDIS_USER"]
REDIS_PASS = secret["REDIS_PASS"]
REDIS_INDEX = "page_index"
VECTOR_DIM = 1024
TOP_K = 2

redis_conn = redis.Redis(
    host=REDIS_HOST,
    port=REDIS_PORT,
    username=REDIS_USER,
    password=REDIS_PASS,
    decode_responses=True
)

def search_redis_vector(query_embedding):
    # KNN query: fetch the TOP_K nearest vectors and expose the distance as vector_score
    base_query = f'*=>[KNN {TOP_K} @embedding $vec AS vector_score]'
    q = Query(base_query).sort_by("vector_score").return_fields("page_name", "vector_score", "text").dialect(2)
    vector_bytes = np.array(query_embedding, dtype=np.float32).tobytes()
    results = redis_conn.ft(REDIS_INDEX).search(q, query_params={"vec": vector_bytes})

    # The index uses COSINE, so vector_score is a distance (0 = identical).
    # Convert it to a similarity before applying the relevance threshold.
    SCORE_THRESHOLD = 0.4
    docs = []
    for doc in results.docs:
        distance = float(doc.vector_score)
        score = 1.0 - distance
        if score > SCORE_THRESHOLD:
            docs.append({"page_name": doc.page_name, "score": score, "vector_score": distance, "text": getattr(doc, "text", "")})
            print(f"Found doc: {doc.page_name} with score: {score}")
    return docs

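
The search function expects a ready-made query embedding. The post does not show how it is produced, so the following is only a sketch of one way to obtain it through Amazon Bedrock. The model ID is an assumption (the post says only "Titan Embeddings"), and `embed_text` is a hypothetical helper. The client is passed in as a parameter so the function can be exercised without AWS access.

```python
import json

# Assumed model: the post names "Titan Embeddings" but not a specific version.
TITAN_MODEL_ID = "amazon.titan-embed-text-v2:0"

def embed_text(bedrock_client, text):
    """Call a Bedrock embedding model and return the embedding as a list of floats."""
    response = bedrock_client.invoke_model(
        modelId=TITAN_MODEL_ID,
        body=json.dumps({"inputText": text}),
        contentType="application/json",
        accept="application/json",
    )
    # The response body is a stream containing a JSON document with an "embedding" key.
    payload = json.loads(response["body"].read())
    return payload["embedding"]
```

In a Lambda handler this could be wired up as `client = boto3.client("bedrock-runtime")` followed by `search_redis_vector(embed_text(client, "Where to enter nominee name?"))`. The length of the returned vector has to match the dimension the index was created with (1024 in this post).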

Loading the vector index
(one-time, batch-based insertion: insert + update with dynamic index creation, backed by Redis):

import json

import boto3
import numpy as np
import redis
from redis.commands.search.field import TextField, VectorField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType
from redis.commands.search.query import Query

# === Configs ===
secretsmanager = boto3.client("secretsmanager", region_name="eu-west-1")
def get_secrets(secret_name):
    try:
        response = secretsmanager.get_secret_value(SecretId=secret_name)
        secret_dict = json.loads(response['SecretString'])
        return secret_dict
    except Exception as e:
        print(f"Failed to retrieve {secret_name} credentials: {e}")
        raise

secret =  get_secrets("dev/python/api")


# Redis connection
redis_conn = redis.Redis(
    host=secret["REDIS_HOST"],
    port=int(secret["REDIS_PORT"]),  # secrets are often stored as strings
    username=secret["REDIS_USER"],
    password=secret["REDIS_PASS"],
    decode_responses=True
)

REDIS_INDEX_NAME = "page_index"
REDIS_VECTOR_DIM = 1024  # Must match the Titan embedding dimension

# === Redis Upsert ===
def create_redis_index():
    try:
        redis_conn.ft(REDIS_INDEX_NAME).create_index(
            fields=[
                TextField("page_name"),
                TextField("text"),  # Added at a later stage
                VectorField("embedding", "FLAT", {
                    "TYPE": "FLOAT32",
                    "DIM": REDIS_VECTOR_DIM,
                    "DISTANCE_METRIC": "COSINE"
                })
            ],
            definition=IndexDefinition(prefix=["doc:"], index_type=IndexType.HASH)
        )
        print("Redis vector index created.")
    except Exception as e:
        print(f"Redis index may already exist or failed: {e}")


def upsert_to_redis(doc_id: str, embedding: list[float], metadata: dict):
    key = f"doc:{doc_id}"
    vector_bytes = np.array(embedding, dtype=np.float32).tobytes()

    # Prepare plain text summary (all nodes from Neo4J)
    fields_text = f"Fields: {', '.join(metadata.get('fields', []))}"
    actions_text = f"Actions: {', '.join(metadata.get('actions', []))}"
    relationship_list = [
        f"{r['from']} -[{r['relation']}]-> {r['to']}"
        for r in metadata.get('relationships', [])
    ]
    relationships_text = f"Relationships: {', '.join(relationship_list)}"
    flat_text = f"{fields_text}\n{actions_text}\n{relationships_text}"

    # Store the vector, page_name, and contextual text in a single hash write
    redis_conn.hset(key, mapping={
        "embedding": vector_bytes,
        "page_name": metadata["page_name"],
        "text": flat_text,
    })
    print(f"Upserted doc_id: {doc_id} into Redis")
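
Both scripts convert the embedding with `np.array(..., dtype=np.float32).tobytes()` before handing it to Redis. The standard-library sketch below shows what that byte layout amounts to: four bytes per dimension, packed back to back. The helper names are hypothetical, and the explicit little-endian format is an assumption that matches NumPy's output on little-endian machines.

```python
import struct

def to_float32_bytes(embedding):
    """Pack a list of floats as consecutive little-endian 32-bit values."""
    return struct.pack(f"<{len(embedding)}f", *embedding)

def from_float32_bytes(blob):
    """Inverse of to_float32_bytes: recover the list of floats."""
    count = len(blob) // 4  # 4 bytes per FLOAT32 component
    return list(struct.unpack(f"<{count}f", blob))

blob = to_float32_bytes([0.5, -1.0, 2.0])
restored = from_float32_bytes(blob)
```

By the same arithmetic, a 1024-dimensional FLOAT32 embedding occupies 4096 bytes, which is why the index DIM and the embedding model's output size have to agree.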

Features:

πŸ§‘β€πŸ’» Just me!

https://www.linkedin.com/in/royrajat

Thanks

RAJAT ROY
