<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: InterSystems</title>
    <description>The latest articles on DEV Community by InterSystems (@intersystems).</description>
    <link>https://dev.to/intersystems</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F2450%2F5c611adb-602d-4948-b84b-5fe47046fd5c.png</url>
      <title>DEV Community: InterSystems</title>
      <link>https://dev.to/intersystems</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/intersystems"/>
    <language>en</language>
    <item>
      <title>Step-by-Step Guide: Setting Up RAG for Gen AI Agents Using IRIS Vector DB in Python</title>
      <dc:creator>InterSystems Developer</dc:creator>
      <pubDate>Sun, 29 Mar 2026 12:14:04 +0000</pubDate>
      <link>https://dev.to/intersystems/step-by-step-guide-setting-up-rag-for-gen-ai-agents-using-iris-vector-db-in-python-3b00</link>
      <guid>https://dev.to/intersystems/step-by-step-guide-setting-up-rag-for-gen-ai-agents-using-iris-vector-db-in-python-3b00</guid>
      <description>&lt;p&gt;&lt;span&gt;&lt;strong&gt;How to set up RAG for OpenAI agents using IRIS Vector DB in Python&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt;In this article, I’ll walk you through an example of using InterSystems IRIS Vector DB to store embeddings and integrate them with an OpenAI agent.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt;To demonstrate this, we’ll create an OpenAI agent with knowledge of InterSystems technology. We’ll achieve this by storing embeddings of some InterSystems documentation in IRIS and then using IRIS vector search to retrieve relevant content—enabling a Retrieval-Augmented Generation (RAG) workflow.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt;Note: &lt;/span&gt;&lt;span&gt;Section 1 details how to process text into embeddings. If you are only interested in IRIS vector search, you can skip ahead to Section 2.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt; &lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;strong&gt;Section 1: Embedding Data&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt;Your embeddings are only as good as your data! To get the best results, you should prepare your data carefully. This may include:&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;Cleaning the text (removing special characters or excess whitespace)&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;Chunking the data into smaller pieces&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;Other preprocessing techniques&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt;For this example, the documentation is stored in simple text files that require minimal cleaning. However, we will divide the text into chunks to enable more efficient and accurate RAG.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt; &lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;strong&gt;Step 1: Chunking Text Files&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt;Chunking text into manageable pieces benefits RAG systems in two ways:&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li value="1"&gt;&lt;span&gt;&lt;span&gt;More accurate retrieval – embeddings represent smaller, more specific sections of text.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;More efficient retrieval – less text per query reduces cost and improves performance.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
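&lt;p&gt;&lt;span&gt;&lt;span&gt;To make the retrieval idea concrete, here is a minimal, self-contained sketch of how chunks are ranked by similarity to a query vector. The 3-dimensional vectors are made up for illustration; real embeddings have hundreds or thousands of dimensions.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

```python
# Toy example: rank "chunks" by dot-product similarity to a query vector.
# The 3-D vectors below are invented for illustration only.

def rank_chunks(query_vec, chunk_vecs, top_k=2):
    """Return the indices of the top_k chunks most similar to the query."""
    scores = [sum(q * c for q, c in zip(query_vec, vec)) for vec in chunk_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:top_k]

chunks = [
    [0.9, 0.1, 0.0],  # chunk 0: close to the query
    [0.0, 1.0, 0.0],  # chunk 1: unrelated
    [0.8, 0.2, 0.1],  # chunk 2: also close
]
query = [1.0, 0.0, 0.0]
print(rank_chunks(query, chunks))  # [0, 2]
```

Smaller chunks mean each vector represents a narrower slice of text, so the top-ranked matches are more specific to the question.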
&lt;p&gt;&lt;span&gt;&lt;span&gt;For this example, we’ll store the chunked text in Parquet files before uploading to IRIS (though you can use any approach, including direct upload).&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt; &lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;strong&gt;Chunking Function&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt;We’ll use RecursiveCharacterTextSplitter from langchain_text_splitters to split text strategically based on paragraph, sentence, and word boundaries.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;Chunk size: 300 tokens (larger chunks provide more context but increase retrieval cost)&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;Chunk overlap: 50 tokens (helps maintain context across chunks)&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;pre&gt;&lt;code&gt;from langchain_text_splitters import RecursiveCharacterTextSplitter

def chunk_text_by_tokens(text: str, chunk_size: int, chunk_overlap: int) -&amp;gt; list[str]:
    """
    Chunk text prioritizing paragraph and sentence boundaries using
    RecursiveCharacterTextSplitter. Returns a list of chunk strings.
    """
    splitter = RecursiveCharacterTextSplitter(
        # Prioritize larger semantic units first, then fall back to smaller ones
        separators=["\n\n", "\n", ". ", " ", ""],
        chunk_size=chunk_size,
        chunk_overlap=chunk_overlap,
        # Note: len measures characters; for exact token-based sizing, see
        # RecursiveCharacterTextSplitter.from_tiktoken_encoder
        length_function=len,
        is_separator_regex=False,
    )
    return splitter.split_text(text)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt;Next, we’ll use the chunking function to process one text file at a time and apply a tiktoken encoder to calculate token counts and generate metadata. This metadata will be useful later when creating embeddings and storing them in IRIS.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;from datetime import datetime, timezone
from pathlib import Path
import tiktoken

def chunk_file(path: Path, chunk_size: int, chunk_overlap: int, encoding_name: str = "cl100k_base") -&amp;gt; list[dict]:
    """
    Read a file, split its contents into token-aware chunks, and return metadata for each chunk.
    Returns a list of dicts with keys:
    - filename
    - relative_path
    - absolute_path
    - chunk_index
    - chunk_text
    - token_count
    - modified_time
    - size_bytes
    """
    p = Path(path)
    if not p.exists() or not p.is_file():
        raise FileNotFoundError(f"File not found: {path}")
    try:
        text = p.read_text(encoding="utf-8", errors="replace")
    except Exception as e:
        raise RuntimeError(f"Failed to read file {p}: {e}")
    # Prepare tokenizer for accurate token counts
    try:
        encoding = tiktoken.get_encoding(encoding_name)
    except Exception as e:
        raise ValueError(f"Invalid encoding name '{encoding_name}': {e}")
    # Create chunks using the chunking function defined above
    chunks = chunk_text_by_tokens(text, chunk_size, chunk_overlap)
    # File metadata
    stat = p.stat()
    modified_time = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc).isoformat()
    absolute_path = str(p.resolve())
    try:
        relative_path = str(p.resolve().relative_to(Path.cwd()))
    except Exception:
        relative_path = p.name
    # Build rows
    rows: list[dict] = []
    for idx, chunk in enumerate(chunks):
        token_count = len(encoding.encode(chunk))
        rows.append({
            "filename": p.name,
            "relative_path": relative_path,
            "absolute_path": absolute_path,
            "chunk_index": idx,
            "chunk_text": chunk,
            "token_count": token_count,
            "modified_time": modified_time,
            "size_bytes": stat.st_size,
        })
    return rows
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span&gt;&lt;strong&gt;Step 2: Creating embeddings&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt;You can generate embeddings using cloud providers (e.g., OpenAI) or local models via Ollama (e.g., nomic-embed-text). In this example, we’ll use OpenAI’s text-embedding-3-small model to embed each chunk and save the results back to Parquet for later ingestion into IRIS Vector DB.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;import os
import sys

from openai import OpenAI
import pandas as pd

def embed_and_save_parquet(input_parquet_path: str, output_parquet_path: str):
    """
    Loads a Parquet file, creates embeddings for the 'chunk_text' column using
    OpenAI's small embedding model, and saves the result to a new Parquet file.
    Args:
        input_parquet_path (str): Path to the input Parquet file containing 'chunk_text'.
        output_parquet_path (str): Path to save the new Parquet file with embeddings.
    """
    key = os.getenv("OPENAI_API_KEY")
    if not key:
        print("ERROR: OPENAI_API_KEY environment variable is not set.", file=sys.stderr)
        sys.exit(1)
    try:
        # Load the Parquet file
        df = pd.read_parquet(input_parquet_path)
        # Initialize OpenAI client
        client = OpenAI(api_key=key)
        # Generate embeddings for each chunk_text
        embeddings = []
        for text in df['chunk_text']:
            response = client.embeddings.create(
                input=text,
                model="text-embedding-3-small"  # Using the small embedding model
            )
            embeddings.append(response.data[0].embedding)
        # Add embeddings to the DataFrame
        df['embedding'] = embeddings
        # Save the new DataFrame to a Parquet file
        df.to_parquet(output_parquet_path, index=False)
        print(f"Embeddings generated and saved to {output_parquet_path}")
    except FileNotFoundError:
        print(f"Error: Input file not found at {input_parquet_path}")
    except KeyError:
        print("Error: 'chunk_text' column not found in the input Parquet file.")
    except Exception as e:
        print(f"An unexpected error occurred: {e}")
&lt;/code&gt;&lt;/pre&gt;
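&lt;p&gt;&lt;span&gt;&lt;span&gt;The loop above issues one API request per chunk. The OpenAI embeddings endpoint also accepts a list of inputs, so you can cut down on round trips by embedding chunks in batches. Below is a small sketch of a batching helper; the batch size of 100 is an arbitrary illustrative choice, not something prescribed by this example.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

```python
# Slice a list of texts into fixed-size batches so multiple chunks can be
# embedded per request. The batch size of 100 is an arbitrary choice.

def batched(texts, batch_size=100):
    """Yield successive slices of `texts` of at most `batch_size` items."""
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]

# Hypothetical usage inside embed_and_save_parquet:
#   for batch in batched(list(df['chunk_text'])):
#       response = client.embeddings.create(input=batch, model="text-embedding-3-small")
#       embeddings.extend(item.embedding for item in response.data)
```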
&lt;p&gt;&lt;span&gt;&lt;strong&gt;Step 3: Put the data processing together&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt;Now it’s time to run the pipeline. In this example, we’ll load and chunk the Business Service documentation, generate embeddings, and write the results to Parquet for IRIS ingestion.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;CHUNK_SIZE_TOKENS = 300
CHUNK_OVERLAP_TOKENS = 50
ENCODING_NAME = "cl100k_base"
current_file_path = Path(__file__).resolve()

# load_documentation_to_parquet applies chunk_file to each text file in
# input_dir and writes the resulting rows to a single Parquet file
load_documentation_to_parquet(input_dir=current_file_path.parent / "Documentation" / "BusinessService",
                              output_file=current_file_path.parent / "BusinessService.parquet",
                              chunk_size=CHUNK_SIZE_TOKENS,
                              chunk_overlap=CHUNK_OVERLAP_TOKENS,
                              encoding_name=ENCODING_NAME)
embed_and_save_parquet(input_parquet_path=current_file_path.parent / "BusinessService.parquet",
                       output_parquet_path=current_file_path.parent / "BusinessService_embedded.parquet")
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt;A row in our final business service Parquet file will look something like this:&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;{&lt;span class="mention"&gt;"filename"&lt;/span&gt;:&lt;span class="mention"&gt;"FileInboundAdapters.txt"&lt;/span&gt;,&lt;span class="mention"&gt;"relative_path"&lt;/span&gt;:&lt;span class="mention"&gt;"Documentation\BusinessService\Adapters\FileInboundAdapters.txt"&lt;/span&gt;,&lt;span class="mention"&gt;"absolute_path"&lt;/span&gt;:&lt;span class="mention"&gt;"C:\Users\…\Documentation\BusinessService\Adapters\FileInboundAdapters.txt"&lt;/span&gt;,&lt;span class="mention"&gt;"chunk_index"&lt;/span&gt;:&lt;span class="mention"&gt;0&lt;/span&gt;,&lt;span class="mention"&gt;"chunk_text"&lt;/span&gt;:&lt;span class="mention"&gt;"Settings for the File Inbound Adapter\nProvides reference information for settings of the file inbound adapter, EnsLib.File.InboundAdapterOpens in a new tab. You can configure these settings after you have added a business service that uses this adapter to your production.\nSummary"&lt;/span&gt;,&lt;span class="mention"&gt;"token_count"&lt;/span&gt;:&lt;span class="mention"&gt;52&lt;/span&gt;,&lt;span class="mention"&gt;"modified_time"&lt;/span&gt;:&lt;span class="mention"&gt;"2025-11-25T18:34:16.120336+00:00"&lt;/span&gt;,&lt;span class="mention"&gt;"size_bytes"&lt;/span&gt;:&lt;span class="mention"&gt;13316&lt;/span&gt;,&lt;span class="mention"&gt;"embedding"&lt;/span&gt;:[&lt;span class="mention"&gt;-0.02851865254342556&lt;/span&gt;,&lt;span class="mention"&gt;0.01860344596207142&lt;/span&gt;,…,&lt;span class="mention"&gt;0.0135544464207155&lt;/span&gt;]}&lt;br&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span&gt;&lt;strong&gt;Section 2: Using IRIS Vector Search&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt; &lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;strong&gt;Step 4: Upload Your Embeddings to IRIS&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt;Choose the IRIS namespace and table name you’ll use to store embeddings. (The script below will create the table if it doesn’t already exist.) Then use the InterSystems IRIS Python DB-API driver to insert the chunks and their embeddings.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt;The function below reads a Parquet file containing chunk text and embeddings, normalizes the embedding column to a JSON-serializable list of floats, connects to IRIS, creates the destination table if it doesn’t exist (with a VECTOR(FLOAT, 1536) column, where 1536 is the number of dimensions in the embedding), and then inserts each row using TO_VECTOR(?) in a parameterized SQL statement. It commits the transaction on success, logs progress, and cleans up the connection, rolling back on database errors.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
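&lt;p&gt;&lt;span&gt;&lt;span&gt;The normalization step described above can be sketched in isolation. TO_VECTOR(?) expects the bound parameter to be a string such as "[0.1, 0.2, ...]", so an embedding that arrives as a numpy array or a Python list is converted to a JSON array string first. The helper name below is ours, not part of any IRIS API.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

```python
# Convert an embedding (numpy array or plain list) into the JSON array string
# that the upload function binds to TO_VECTOR(?). Helper name is illustrative.
import json

def embedding_to_vector_param(embedding):
    """Return a JSON array string suitable for binding to TO_VECTOR(?)."""
    # numpy arrays expose .tolist(); plain lists and tuples pass through
    values = embedding.tolist() if hasattr(embedding, "tolist") else list(embedding)
    return json.dumps([float(v) for v in values])

print(embedding_to_vector_param([0.25, -0.5, 1.0]))  # [0.25, -0.5, 1.0]
```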
&lt;pre&gt;&lt;code&gt;import iris  # The InterSystems IRIS Python DB-API driver
import pandas as pd
import numpy as np
import json
from pathlib import Path

# --- Configuration ---
PARQUET_FILE_PATH = "your_embeddings.parquet"
IRIS_HOST = "localhost"
IRIS_PORT = 8881
IRIS_NAMESPACE = "VECTOR"
IRIS_USERNAME = "superuser"
IRIS_PASSWORD = "sys"
TABLE_NAME = "AIDemo.Embeddings"  # Must match the table created in IRIS
EMBEDDING_DIMENSIONS = 1536  # Must match the dimensions of the embeddings you used

def upload_embeddings_to_iris(parquet_path: str):
    """
    Reads a Parquet file with 'chunk_text' and 'embedding' columns
    and uploads them to an InterSystems IRIS vector database table.
    """
    # 1. Load data from the Parquet file using pandas
    try:
        df = pd.read_parquet(parquet_path)
        if 'chunk_text' not in df.columns or 'embedding' not in df.columns:
            print("Error: Parquet file must contain 'chunk_text' and 'embedding' columns.")
            return
    except FileNotFoundError:
        print(f"Error: The file at {parquet_path} was not found.")
        return
    # Ensure embeddings are in a format compatible with TO_VECTOR (list of floats)
    # Parquet often saves numpy arrays as lists
    if isinstance(df['embedding'].iloc[0], np.ndarray):
        df['embedding'] = df['embedding'].apply(lambda x: x.tolist())
    print(f"Loaded {len(df)} records from {parquet_path}.")
    # 2. Establish connection to InterSystems IRIS
    connection = None
    try:
        conn_string = f"{IRIS_HOST}:{IRIS_PORT}/{IRIS_NAMESPACE}"
        connection = iris.connect(conn_string, IRIS_USERNAME, IRIS_PASSWORD)
        cursor = connection.cursor()
        print("Successfully connected to InterSystems IRIS.")
        # Create the embedding table if it doesn't exist
        cursor.execute(f"""
            CREATE TABLE IF NOT EXISTS {TABLE_NAME} (
            ID INTEGER IDENTITY PRIMARY KEY,
            chunk_text VARCHAR(2500), embedding VECTOR(FLOAT, {EMBEDDING_DIMENSIONS})
            )"""
        )
        # 3. Prepare the SQL INSERT statement
        # InterSystems IRIS uses the TO_VECTOR function for inserting vector data via SQL
        insert_sql = f"""
        INSERT INTO {TABLE_NAME} (chunk_text, embedding)
        VALUES (?, TO_VECTOR(?))
        """
        # 4. Iterate and insert data
        count = 0
        for index, row in df.iterrows():
            text = row['chunk_text']
            # Convert the list of floats to a JSON string, which is required by TO_VECTOR when using DB-API
            vector_json_str = json.dumps(row['embedding'])
            cursor.execute(insert_sql, (text, vector_json_str))
            count += 1
            if count % 100 == 0:
                print(f"Inserted {count} rows...")
        # Commit the transaction
        connection.commit()
        print(f"Data upload complete. Total rows inserted: {count}.")
    except iris.DBAPIError as e:
        print(f"A database error occurred: {e}")
        if connection:
            connection.rollback()
    except Exception as e:
        print(f"An unexpected error occurred: {e}")
    finally:
        if connection:
            connection.close()
            print("Database connection closed.")
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt;Example usage:&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;current_file_path = Path(__file__).resolve()
upload_embeddings_to_iris(current_file_path.parent / "BusinessService_embedded.parquet")&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span&gt;&lt;strong&gt;Step 5: Create your embedding search functionality&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt;Next, we’ll create a search function that embeds the user’s query, runs a vector similarity search in IRIS via the Python DB-API, and returns the top-k matching chunks from our embeddings table.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt;The example function below first embeds the search string with the same OpenAI model used for the documents, then formats the resulting vector as a comma-separated string for IRIS’s TO_VECTOR function. After connecting to IRIS, it runs a SELECT that orders rows by VECTOR_DOT_PRODUCT between each stored embedding and the query vector, fetches the top-k results, and returns them, with error handling and cleanup of the database connection.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&lt;span class="mention"&gt;import&lt;/span&gt; iris&lt;br&gt;
&lt;span class="mention"&gt;from&lt;/span&gt; typing &lt;span class="mention"&gt;import&lt;/span&gt; List&lt;br&gt;
&lt;span class="mention"&gt;import&lt;/span&gt; os&lt;br&gt;
&lt;span class="mention"&gt;from&lt;/span&gt; openai &lt;span class="mention"&gt;import&lt;/span&gt; OpenAI

&lt;p&gt;&lt;span&gt;# --- Configuration ---&lt;/span&gt;&lt;br&gt;
PARQUET_FILE_PATH = &lt;span&gt;"your_embeddings.parquet"&lt;/span&gt;&lt;br&gt;
IRIS_HOST = &lt;span&gt;"localhost"&lt;/span&gt;&lt;br&gt;
IRIS_PORT = &lt;span&gt;8881&lt;/span&gt;&lt;br&gt;
IRIS_NAMESPACE = &lt;span&gt;"VECTOR"&lt;/span&gt;&lt;br&gt;
IRIS_USERNAME = &lt;span&gt;"superuser"&lt;/span&gt;&lt;br&gt;
IRIS_PASSWORD = &lt;span&gt;"sys"&lt;/span&gt;&lt;br&gt;
TABLE_NAME = &lt;span&gt;"AIDemo.Embeddings"&lt;/span&gt; &lt;span&gt;# Must match the table created in IRIS&lt;/span&gt;&lt;br&gt;
EMBEDDING_DIMENSIONS = &lt;span&gt;1536&lt;/span&gt;&lt;br&gt;
MODEL = &lt;span&gt;"text-embedding-3-small"&lt;/span&gt;&lt;br&gt;
&lt;span&gt;def&lt;/span&gt; &lt;span&gt;get_embedding&lt;/span&gt;&lt;span&gt;(text: str, model: str, client)&lt;/span&gt; -&amp;gt; List[float]:&lt;br&gt;
    &lt;span&gt;# Normalize newlines and coerce to str&lt;/span&gt;&lt;br&gt;
    payload = [(&lt;span&gt;""&lt;/span&gt; &lt;span&gt;if&lt;/span&gt; text &lt;span&gt;is&lt;/span&gt; &lt;span&gt;None&lt;/span&gt; &lt;span&gt;else&lt;/span&gt; str(text)).replace(&lt;span&gt;"\n"&lt;/span&gt;, &lt;span&gt;" "&lt;/span&gt;) &lt;span&gt;for&lt;/span&gt; _ &lt;span&gt;in&lt;/span&gt; range(&lt;span&gt;1&lt;/span&gt;)]&lt;br&gt;
    resp = client.embeddings.create(model=model, input=payload, encoding_format=&lt;span&gt;"float"&lt;/span&gt;)&lt;br&gt;
    &lt;span&gt;return&lt;/span&gt; resp.data[&lt;span&gt;0&lt;/span&gt;].embedding&lt;br&gt;
&lt;span&gt;def&lt;/span&gt; &lt;span&gt;search_embeddings&lt;/span&gt;&lt;span&gt;(search: str, top_k: int)&lt;/span&gt;:&lt;br&gt;
    print(&lt;span&gt;"-------RAG--------"&lt;/span&gt;)&lt;br&gt;
    print(&lt;span&gt;f"Searching IRIS vector store for: "&lt;/span&gt;, search)&lt;br&gt;
    key = os.getenv(&lt;span&gt;"OPENAI_API_KEY"&lt;/span&gt;)&lt;br&gt;
    client = OpenAI(api_key=key)&lt;br&gt;
 &lt;span&gt;# 2. Establish connection to InterSystems IRIS&lt;/span&gt;&lt;br&gt;
    connection = &lt;span&gt;None&lt;/span&gt;&lt;br&gt;
    &lt;span&gt;try&lt;/span&gt;:&lt;br&gt;
        conn_string = &lt;span&gt;f"&lt;/span&gt;&lt;span&gt;{IRIS_HOST}&lt;/span&gt;:&lt;span&gt;{IRIS_PORT}&lt;/span&gt;/&lt;span&gt;{IRIS_NAMESPACE}&lt;/span&gt;"&lt;br&gt;
        connection = iris.connect(conn_string, IRIS_USERNAME, IRIS_PASSWORD)&lt;br&gt;
        cursor = connection.cursor()&lt;br&gt;
        print(&lt;span&gt;"Successfully connected to InterSystems IRIS."&lt;/span&gt;)&lt;br&gt;
        &lt;span&gt;# Embed query for searching&lt;/span&gt;&lt;br&gt;
        &lt;span&gt;#emb_raw = str(test_embedding) # FOR TESTING&lt;/span&gt;&lt;br&gt;
        emb_raw = get_embedding(search, model=MODEL, client=client)&lt;br&gt;
        emb_raw = str(emb_raw)&lt;br&gt;
        &lt;span&gt;#print("EMB_RAW:", emb_raw)&lt;/span&gt;&lt;br&gt;
        emb_values = []&lt;br&gt;
        &lt;span&gt;for&lt;/span&gt; x &lt;span&gt;in&lt;/span&gt; emb_raw.replace(&lt;span&gt;'['&lt;/span&gt;, &lt;span&gt;''&lt;/span&gt;).replace(&lt;span&gt;']'&lt;/span&gt;, &lt;span&gt;''&lt;/span&gt;).split(&lt;span&gt;','&lt;/span&gt;):&lt;br&gt;
            &lt;span&gt;try&lt;/span&gt;:&lt;br&gt;
                emb_values.append(str(float(x.strip())))&lt;br&gt;
            &lt;span&gt;except&lt;/span&gt; ValueError:&lt;br&gt;
                &lt;span&gt;continue&lt;/span&gt;&lt;br&gt;
        emb_str = &lt;span&gt;", "&lt;/span&gt;.join(emb_values)&lt;br&gt;
        &lt;span&gt;# Prepare the SQL SELECT statement&lt;/span&gt;&lt;br&gt;
        search_sql = f"""&lt;br&gt;
        SELECT TOP &lt;span&gt;{top_k}&lt;/span&gt; ID, chunk_text FROM &lt;span&gt;{TABLE_NAME}&lt;/span&gt;&lt;br&gt;
        ORDER BY VECTOR_DOT_PRODUCT((embedding), TO_VECTOR(('&lt;span&gt;{emb_str}&lt;/span&gt;'), FLOAT)) DESC&lt;br&gt;
        """&lt;br&gt;
        cursor.execute(search_sql)&lt;br&gt;
        results = []&lt;br&gt;
        row = cursor.fetchone()&lt;br&gt;
        &lt;span&gt;while&lt;/span&gt; row &lt;span&gt;is&lt;/span&gt; &lt;span&gt;not&lt;/span&gt; &lt;span&gt;None&lt;/span&gt;:&lt;br&gt;
            results.append(row[:])&lt;br&gt;
            row = cursor.fetchone()&lt;/p&gt;

&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;span class="mention"&amp;gt;except&amp;lt;/span&amp;gt; iris.DBAPIError &amp;lt;span class="mention"&amp;gt;as&amp;lt;/span&amp;gt; e:
    print(&amp;lt;span class="mention"&amp;gt;f"A database error occurred: &amp;lt;/span&amp;gt;&amp;lt;span class="mention"&amp;gt;{e}&amp;lt;/span&amp;gt;")
    &amp;lt;span class="mention"&amp;gt;if&amp;lt;/span&amp;gt; connection:
        connection.rollback()
&amp;lt;span class="mention"&amp;gt;except&amp;lt;/span&amp;gt; Exception &amp;lt;span class="mention"&amp;gt;as&amp;lt;/span&amp;gt; e:
    print(&amp;lt;span class="mention"&amp;gt;f"An unexpected error occurred: &amp;lt;/span&amp;gt;&amp;lt;span class="mention"&amp;gt;{e}&amp;lt;/span&amp;gt;")
&amp;lt;span class="mention"&amp;gt;finally&amp;lt;/span&amp;gt;:
    &amp;lt;span class="mention"&amp;gt;if&amp;lt;/span&amp;gt; connection:
        connection.close()
        print(&amp;lt;span class="mention"&amp;gt;"Database connection closed."&amp;lt;/span&amp;gt;)
    print(&amp;lt;span class="mention"&amp;gt;"------------RAG Finished-------------"&amp;lt;/span&amp;gt;)
    &amp;lt;span class="mention"&amp;gt;return&amp;lt;/span&amp;gt; results
&lt;/code&gt;&lt;/pre&gt;


&lt;/code&gt;&lt;/pre&gt;
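As an aside, the string round-trip above (calling str() on the embedding and then stripping brackets and re-parsing each value) can be avoided when get_embedding returns a plain list of floats, which is the usual case. A minimal sketch of a hypothetical helper that formats an embedding directly for TO_VECTOR:

```python
def embedding_to_sql_vector(embedding):
    """Format a list of floats as the comma-separated string that
    TO_VECTOR(..., FLOAT) expects, e.g. "0.1, -0.2, 3.0"."""
    return ", ".join(str(float(x)) for x in embedding)

# Produces the same emb_str the parsing loop above builds.
emb_str = embedding_to_sql_vector([0.1, -0.2, 3.0])
```

The helper name and the assumption about get_embedding's return type are mine, not from the original code, but the output slots straight into the same SQL statement.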
&lt;p&gt;&lt;span&gt;&lt;strong&gt;Step 6: Add RAG context to your agent&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt;Now that you’ve:&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;

&lt;li&gt;&lt;span&gt;&lt;span&gt;Chunked and embedded your documentation,&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;

&lt;li&gt;&lt;span&gt;&lt;span&gt;Uploaded embeddings to IRIS and created a vector index,&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;

&lt;li&gt;&lt;span&gt;&lt;span&gt;Built a search function for IRIS vector queries,&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;

&lt;/ul&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt;it’s time to put it all together into an interactive Retrieval-Augmented Generation (RAG) chat using the OpenAI Responses API. For this example we will give the agent access to the search function directly (for more fine-grained control of the agent), but this could also be done with a library such as LangChain.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt;First, you will need to create your instructions for the agent, making sure to give it access to the search function:&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&lt;span class="mention"&gt;import&lt;/span&gt; os&lt;br&gt;&lt;br&gt;
&lt;span class="mention"&gt;# ---------------------------- Configuration ----------------------------&lt;/span&gt;&lt;br&gt;&lt;br&gt;
MODEL = os.getenv(&lt;span class="mention"&gt;"OPENAI_RESPONSES_MODEL"&lt;/span&gt;, &lt;span class="mention"&gt;"gpt-5-nano"&lt;/span&gt;)&lt;br&gt;&lt;br&gt;
SYSTEM_INSTRUCTIONS = (&lt;br&gt;&lt;br&gt;
    &lt;span class="mention"&gt;"You are a helpful assistant that answers questions about InterSystems "&lt;/span&gt;&lt;br&gt;&lt;br&gt;
    &lt;span class="mention"&gt;"business services and related integration capabilities. You have access "&lt;/span&gt;&lt;br&gt;&lt;br&gt;
    &lt;span class="mention"&gt;"to a vector database of documentation chunks about business services. "&lt;/span&gt;&lt;br&gt;&lt;br&gt;
    &lt;span class="mention"&gt;"\n\n"&lt;/span&gt;&lt;br&gt;&lt;br&gt;
    &lt;span class="mention"&gt;"Use the &lt;code&gt;search_business_docs&lt;/code&gt; tool whenever the user asks about specific "&lt;/span&gt;&lt;br&gt;&lt;br&gt;
    &lt;span class="mention"&gt;"settings, configuration options, or how to perform tasks with business "&lt;/span&gt;&lt;br&gt;&lt;br&gt;
    &lt;span class="mention"&gt;"services. Ground your answers in the retrieved context, quoting or "&lt;/span&gt;&lt;br&gt;&lt;br&gt;
    &lt;span class="mention"&gt;"summarizing relevant chunks. If nothing relevant is found, say so "&lt;/span&gt;&lt;br&gt;&lt;br&gt;
    &lt;span class="mention"&gt;"clearly and answer from your general knowledge with a disclaimer."&lt;/span&gt;&lt;br&gt;&lt;br&gt;
)&lt;br&gt;&lt;br&gt;
&lt;span class="mention"&gt;# ---------------------------- Tool Definition ----------------------------&lt;/span&gt;&lt;br&gt;&lt;br&gt;
TOOLS = [&lt;br&gt;&lt;br&gt;
    {&lt;br&gt;&lt;br&gt;
        &lt;span&gt;"type"&lt;/span&gt;: &lt;span&gt;"function"&lt;/span&gt;,&lt;br&gt;&lt;br&gt;
        &lt;span&gt;"name"&lt;/span&gt;: &lt;span&gt;"search_business_docs"&lt;/span&gt;,&lt;br&gt;&lt;br&gt;
        &lt;span&gt;"description"&lt;/span&gt;: (&lt;br&gt;&lt;br&gt;
            &lt;span&gt;"Searches a vector database of documentation chunks related to "&lt;/span&gt;&lt;br&gt;&lt;br&gt;
            &lt;span&gt;"business services and returns the most relevant snippets."&lt;/span&gt;&lt;br&gt;&lt;br&gt;
        ),&lt;br&gt;&lt;br&gt;
        &lt;span&gt;"parameters"&lt;/span&gt;: {&lt;br&gt;&lt;br&gt;
            &lt;span&gt;"type"&lt;/span&gt;: &lt;span&gt;"object"&lt;/span&gt;,&lt;br&gt;&lt;br&gt;
            &lt;span&gt;"properties"&lt;/span&gt;: {&lt;br&gt;&lt;br&gt;
                &lt;span&gt;"query"&lt;/span&gt;: {&lt;br&gt;&lt;br&gt;
                    &lt;span&gt;"type"&lt;/span&gt;: &lt;span&gt;"string"&lt;/span&gt;,&lt;br&gt;&lt;br&gt;
                    &lt;span&gt;"description"&lt;/span&gt;: (&lt;br&gt;&lt;br&gt;
                        &lt;span&gt;"Natural language search query describing what you want "&lt;/span&gt;&lt;br&gt;&lt;br&gt;
                        &lt;span&gt;"to know about business services."&lt;/span&gt;&lt;br&gt;&lt;br&gt;
                    ),&lt;br&gt;&lt;br&gt;
                },&lt;br&gt;&lt;br&gt;
                &lt;span&gt;"top_k"&lt;/span&gt;: {&lt;br&gt;&lt;br&gt;
                    &lt;span&gt;"type"&lt;/span&gt;: &lt;span&gt;"integer"&lt;/span&gt;,&lt;br&gt;&lt;br&gt;
                    &lt;span&gt;"description"&lt;/span&gt;: (&lt;br&gt;&lt;br&gt;
                        &lt;span&gt;"Maximum number of results to retrieve from the vector DB."&lt;/span&gt;&lt;br&gt;&lt;br&gt;
                    ),&lt;br&gt;&lt;br&gt;
                    &lt;span&gt;"minimum"&lt;/span&gt;: &lt;span&gt;1&lt;/span&gt;,&lt;br&gt;&lt;br&gt;
                    &lt;span&gt;"maximum"&lt;/span&gt;: &lt;span&gt;10&lt;/span&gt;,&lt;br&gt;&lt;br&gt;
                },&lt;br&gt;&lt;br&gt;
            },&lt;br&gt;&lt;br&gt;
            &lt;span&gt;"required"&lt;/span&gt;: [&lt;span&gt;"query"&lt;/span&gt;, &lt;span&gt;"top_k"&lt;/span&gt;],&lt;br&gt;&lt;br&gt;
            &lt;span&gt;"additionalProperties"&lt;/span&gt;: &lt;span&gt;False&lt;/span&gt;,&lt;br&gt;&lt;br&gt;
        },&lt;br&gt;&lt;br&gt;
        &lt;span&gt;"strict"&lt;/span&gt;: &lt;span&gt;True&lt;/span&gt;,&lt;br&gt;&lt;br&gt;
    }&lt;br&gt;&lt;br&gt;
]&lt;br&gt;&lt;br&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt;Now we need a small “router” method to let the model actually use our RAG tool.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt;call_rag_tool(name, args) receives a function call emitted by the OpenAI Responses API and routes it to our local implementation (the search_business_docs tool that wraps Search.search_embeddings). It takes the model’s query and top_k, runs the IRIS vector search, and returns a JSON&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span&gt;‑&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span&gt;encoded payload of the top matches (IDs and text snippets). This stringified JSON is important because the Responses API expects tool outputs as strings; by formatting the results predictably, we make it easy for the model to ground its final answer in the retrieved documentation. If an unknown tool name is requested, the function returns an error payload so the model can handle it gracefully.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&lt;span class="mention"&gt;def&lt;/span&gt; &lt;span class="mention"&gt;call_rag_tool&lt;/span&gt;&lt;span class="mention"&gt;(name: str, args: Dict[str, Any])&lt;/span&gt; -&amp;gt; str:&lt;br&gt;&lt;br&gt;
    """Route function calls from the model to our local Python implementations.&lt;br&gt;&lt;br&gt;
    Currently only supports the &lt;code&gt;search_business_docs&lt;/code&gt; tool, which wraps&lt;br&gt;&lt;br&gt;
    &lt;code&gt;Search.search_embeddings&lt;/code&gt;.&lt;br&gt;&lt;br&gt;
    The return value must be a string. We will JSON-encode a small structure&lt;br&gt;&lt;br&gt;
    so the model can consume the results reliably.&lt;br&gt;&lt;br&gt;
    """&lt;br&gt;&lt;br&gt;
    &lt;span class="mention"&gt;if&lt;/span&gt; name == &lt;span class="mention"&gt;"search_business_docs"&lt;/span&gt;:&lt;br&gt;&lt;br&gt;
        query = args.get(&lt;span class="mention"&gt;"query"&lt;/span&gt;, &lt;span class="mention"&gt;""&lt;/span&gt;)&lt;br&gt;&lt;br&gt;
        top_k = args.get(&lt;span class="mention"&gt;"top_k"&lt;/span&gt;, &lt;span class="mention"&gt;5&lt;/span&gt;)  &lt;span class="mention"&gt;# default to an integer; an empty string would break SELECT TOP&lt;/span&gt;&lt;br&gt;&lt;br&gt;
        results = search_embeddings(query, top_k)&lt;br&gt;&lt;br&gt;
        &lt;span class="mention"&gt;# Expecting each row to be something like (ID, chunk_text)&lt;/span&gt;&lt;br&gt;&lt;br&gt;
        formatted: List[Dict[str, Any]] = []&lt;br&gt;&lt;br&gt;
        &lt;span class="mention"&gt;for&lt;/span&gt; row &lt;span class="mention"&gt;in&lt;/span&gt; results:&lt;br&gt;&lt;br&gt;
            &lt;span class="mention"&gt;if&lt;/span&gt; &lt;span class="mention"&gt;not&lt;/span&gt; row:&lt;br&gt;&lt;br&gt;
                &lt;span class="mention"&gt;continue&lt;/span&gt;&lt;br&gt;&lt;br&gt;
            &lt;span class="mention"&gt;# Be defensive in case row length/structure changes&lt;/span&gt;&lt;br&gt;&lt;br&gt;
            doc_id = row[&lt;span class="mention"&gt;0&lt;/span&gt;] &lt;span class="mention"&gt;if&lt;/span&gt; len(row) &amp;gt; &lt;span class="mention"&gt;0&lt;/span&gt; &lt;span class="mention"&gt;else&lt;/span&gt; &lt;span class="mention"&gt;None&lt;/span&gt;&lt;br&gt;&lt;br&gt;
            text = row[&lt;span class="mention"&gt;1&lt;/span&gt;] &lt;span class="mention"&gt;if&lt;/span&gt; len(row) &amp;gt; &lt;span class="mention"&gt;1&lt;/span&gt; &lt;span class="mention"&gt;else&lt;/span&gt; &lt;span class="mention"&gt;None&lt;/span&gt;&lt;br&gt;&lt;br&gt;
            formatted.append({&lt;span class="mention"&gt;"id"&lt;/span&gt;: doc_id, &lt;span class="mention"&gt;"text"&lt;/span&gt;: text})&lt;br&gt;&lt;br&gt;
        payload = {&lt;span class="mention"&gt;"query"&lt;/span&gt;: query, &lt;span class="mention"&gt;"results"&lt;/span&gt;: formatted}&lt;br&gt;&lt;br&gt;
        &lt;span class="mention"&gt;return&lt;/span&gt; json.dumps(payload, ensure_ascii=&lt;span class="mention"&gt;False&lt;/span&gt;)&lt;br&gt;&lt;br&gt;
    &lt;span class="mention"&gt;# Unknown tool; return an error-style payload&lt;/span&gt;&lt;br&gt;&lt;br&gt;
    &lt;span class="mention"&gt;return&lt;/span&gt; json.dumps({&lt;span class="mention"&gt;"error"&lt;/span&gt;: &lt;span class="mention"&gt;f"Unknown tool name: &lt;/span&gt;&lt;span class="mention"&gt;{name}&lt;/span&gt;"})&lt;br&gt;&lt;br&gt;
&lt;/code&gt;&lt;/pre&gt;
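For illustration, here is the kind of string the router hands back to the model. The two rows below are invented stand-ins for what search_embeddings might return; the real IDs and chunk text come from your IRIS table:

```python
import json

# Hypothetical (ID, chunk_text) rows, standing in for search_embeddings output.
rows = [
    (1, "A business service receives requests from outside the production."),
    (2, "Business service settings include Call Interval and Pool Size."),
]

# Same shaping as call_rag_tool: a list of {"id", "text"} dicts, JSON-encoded.
formatted = [{"id": r[0], "text": r[1]} for r in rows]
payload = json.dumps(
    {"query": "what is a business service", "results": formatted},
    ensure_ascii=False,
)
```

Because the structure is predictable, the model can reliably cite the retrieved chunks when composing its grounded answer.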
&lt;p&gt;&lt;span&gt;&lt;span&gt;Now that we have our RAG tool, we can start work on the chat loop logic. First, we need a helper to reliably pull the model’s final answer and any tool outputs from the OpenAI Responses API. extract_answer_and_sources(response) walks the response.output items containing the model’s outputs and concatenates the text parts into a single answer string. It also collects the function_call_output payloads (the JSON we returned from our RAG tool), parses them, and exposes them as tool_context for transparency and debugging. The function parses the model output into a compact structure: {"answer": ..., "tool_context": [...]}.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&lt;span class="mention"&gt;def&lt;/span&gt; &lt;span class="mention"&gt;extract_answer_and_sources&lt;/span&gt;&lt;span class="mention"&gt;(response: Any)&lt;/span&gt; -&amp;gt; Dict[str, Any]:&lt;br&gt;&lt;br&gt;
    """Extract a structured answer and optional sources from a Responses API object.&lt;br&gt;&lt;br&gt;
    We don't enforce a global JSON response schema here. Instead, we:&lt;br&gt;&lt;br&gt;
    - Prefer the SDK's &lt;code&gt;output_text&lt;/code&gt; convenience when present&lt;br&gt;&lt;br&gt;
    - Fall back to concatenating any &lt;code&gt;output_text&lt;/code&gt; content parts&lt;br&gt;&lt;br&gt;
    - Also surface any tool-call-output payloads we got back this turn as&lt;br&gt;&lt;br&gt;
      &lt;code&gt;tool_context&lt;/code&gt; for debugging/inspection.&lt;br&gt;&lt;br&gt;
    """&lt;br&gt;&lt;br&gt;
    answer_text = &lt;span class="mention"&gt;""&lt;/span&gt;&lt;br&gt;&lt;br&gt;
    &lt;span class="mention"&gt;# Preferred: SDK convenience&lt;/span&gt;&lt;br&gt;&lt;br&gt;
    &lt;span class="mention"&gt;if&lt;/span&gt; hasattr(response, &lt;span class="mention"&gt;"output_text"&lt;/span&gt;) &lt;span class="mention"&gt;and&lt;/span&gt; response.output_text:&lt;br&gt;&lt;br&gt;
        answer_text = response.output_text&lt;br&gt;&lt;br&gt;
    &lt;span class="mention"&gt;else&lt;/span&gt;:&lt;br&gt;&lt;br&gt;
        &lt;span class="mention"&gt;# Fallback: walk output items&lt;/span&gt;&lt;br&gt;&lt;br&gt;
        parts: List[str] = []&lt;br&gt;&lt;br&gt;
        &lt;span class="mention"&gt;for&lt;/span&gt; item &lt;span class="mention"&gt;in&lt;/span&gt; getattr(response, &lt;span class="mention"&gt;"output"&lt;/span&gt;, []) &lt;span class="mention"&gt;or&lt;/span&gt; []:&lt;br&gt;&lt;br&gt;
            &lt;span class="mention"&gt;if&lt;/span&gt; getattr(item, &lt;span class="mention"&gt;"type"&lt;/span&gt;, &lt;span class="mention"&gt;None&lt;/span&gt;) == &lt;span class="mention"&gt;"message"&lt;/span&gt;:&lt;br&gt;&lt;br&gt;
                &lt;span class="mention"&gt;for&lt;/span&gt; c &lt;span class="mention"&gt;in&lt;/span&gt; getattr(item, &lt;span class="mention"&gt;"content"&lt;/span&gt;, []) &lt;span class="mention"&gt;or&lt;/span&gt; []:&lt;br&gt;&lt;br&gt;
                    &lt;span class="mention"&gt;if&lt;/span&gt; getattr(c, &lt;span class="mention"&gt;"type"&lt;/span&gt;, &lt;span class="mention"&gt;None&lt;/span&gt;) == &lt;span class="mention"&gt;"output_text"&lt;/span&gt;:&lt;br&gt;&lt;br&gt;
                        parts.append(getattr(c, &lt;span class="mention"&gt;"text"&lt;/span&gt;, &lt;span class="mention"&gt;""&lt;/span&gt;))&lt;br&gt;&lt;br&gt;
        answer_text = &lt;span class="mention"&gt;""&lt;/span&gt;.join(parts)&lt;br&gt;&lt;br&gt;
    &lt;span class="mention"&gt;# Collect any function_call_output items for visibility&lt;/span&gt;&lt;br&gt;&lt;br&gt;
    tool_context: List[Dict[str, Any]] = []&lt;br&gt;&lt;br&gt;
    &lt;span class="mention"&gt;for&lt;/span&gt; item &lt;span class="mention"&gt;in&lt;/span&gt; getattr(response, &lt;span class="mention"&gt;"output"&lt;/span&gt;, []) &lt;span class="mention"&gt;or&lt;/span&gt; []:&lt;br&gt;&lt;br&gt;
        &lt;span class="mention"&gt;if&lt;/span&gt; getattr(item, &lt;span class="mention"&gt;"type"&lt;/span&gt;, &lt;span class="mention"&gt;None&lt;/span&gt;) == &lt;span class="mention"&gt;"function_call_output"&lt;/span&gt;:&lt;br&gt;&lt;br&gt;
            &lt;span class="mention"&gt;try&lt;/span&gt;:&lt;br&gt;&lt;br&gt;
                tool_context.append({&lt;br&gt;&lt;br&gt;
                    &lt;span class="mention"&gt;"call_id"&lt;/span&gt;: getattr(item, &lt;span class="mention"&gt;"call_id"&lt;/span&gt;, &lt;span class="mention"&gt;None&lt;/span&gt;),&lt;br&gt;&lt;br&gt;
                    &lt;span class="mention"&gt;"output"&lt;/span&gt;: json.loads(getattr(item, &lt;span class="mention"&gt;"output"&lt;/span&gt;, &lt;span class="mention"&gt;""&lt;/span&gt;)),&lt;br&gt;&lt;br&gt;
                })&lt;br&gt;&lt;br&gt;
            &lt;span class="mention"&gt;except&lt;/span&gt; Exception:&lt;br&gt;&lt;br&gt;
                tool_context.append({&lt;br&gt;&lt;br&gt;
                    &lt;span class="mention"&gt;"call_id"&lt;/span&gt;: getattr(item, &lt;span class="mention"&gt;"call_id"&lt;/span&gt;, &lt;span class="mention"&gt;None&lt;/span&gt;),&lt;br&gt;&lt;br&gt;
                    &lt;span class="mention"&gt;"output"&lt;/span&gt;: getattr(item, &lt;span class="mention"&gt;"output"&lt;/span&gt;, &lt;span class="mention"&gt;""&lt;/span&gt;),&lt;br&gt;&lt;br&gt;
                })&lt;br&gt;&lt;br&gt;
    &lt;span class="mention"&gt;return&lt;/span&gt; {&lt;span class="mention"&gt;"answer"&lt;/span&gt;: answer_text.strip(), &lt;span class="mention"&gt;"tool_context"&lt;/span&gt;: tool_context}&lt;br&gt;&lt;br&gt;
&lt;/code&gt;&lt;/pre&gt;
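You can exercise the fallback traversal without calling the API by mimicking the shape of a Responses object. The stand-in below is hand-built for this example (real response objects carry the same type/content/output fields the helper inspects):

```python
import json
from types import SimpleNamespace

# Hand-built stand-in with no usable output_text, forcing the fallback walk.
fake_response = SimpleNamespace(
    output_text="",
    output=[
        SimpleNamespace(
            type="message",
            content=[
                SimpleNamespace(type="output_text", text="Hello "),
                SimpleNamespace(type="output_text", text="world"),
            ],
        ),
        SimpleNamespace(
            type="function_call_output",
            call_id="call_1",
            output=json.dumps({"results": []}),
        ),
    ],
)

# The same traversal extract_answer_and_sources performs in its fallback branch.
parts = [
    c.text
    for item in fake_response.output
    if getattr(item, "type", None) == "message"
    for c in getattr(item, "content", [])
    if getattr(c, "type", None) == "output_text"
]
answer = "".join(parts)  # "Hello world"
```

Feeding an object like this to extract_answer_and_sources is a quick way to unit-test the helper before wiring it into the live chat loop.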
&lt;p&gt;&lt;span&gt;&lt;span&gt;With the help of extract_answer_and_sources we can build the whole chat loop to orchestrate a two&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span&gt;‑&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span&gt;phase, tool&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span&gt;‑&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span&gt;calling conversation with the OpenAI Responses API. The chat_loop() function runs an interactive CLI: it collects the user’s question, sends a first request with system instructions and the search_business_docs tool, and then inspects any function_call items the model emits. For each function call, it executes our local RAG tool (call_rag_tool, which wraps search_embeddings) and appends the result back to the conversation as a function_call_output. It then makes a second request asking the model to use those tool outputs to produce a grounded answer, parses that answer via extract_answer_and_sources, and prints it. The loop maintains running context in input_items so each turn can build on prior messages and tool results.&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;&lt;span class="mention"&gt;def&lt;/span&gt; &lt;span class="mention"&gt;chat_loop&lt;/span&gt;&lt;span class="mention"&gt;()&lt;/span&gt; -&amp;gt; &lt;span class="mention"&gt;None&lt;/span&gt;:&lt;br&gt;&lt;br&gt;
    """Run an interactive CLI chat loop using the OpenAI Responses API.&lt;br&gt;&lt;br&gt;
    The loop supports multi-step tool-calling:&lt;br&gt;&lt;br&gt;
    - First call may return one or more &lt;code&gt;function_call&lt;/code&gt; items&lt;br&gt;&lt;br&gt;
    - We execute those locally (e.g., call search_embeddings)&lt;br&gt;&lt;br&gt;
    - We send the tool outputs back in a second &lt;code&gt;responses.create&lt;/code&gt; call&lt;br&gt;&lt;br&gt;
    - Then we print the model's final, grounded answer&lt;br&gt;&lt;br&gt;
    """&lt;br&gt;&lt;br&gt;
    key = os.getenv(&lt;span class="mention"&gt;"OPENAI_API_KEY"&lt;/span&gt;)&lt;br&gt;&lt;br&gt;
    &lt;span class="mention"&gt;if&lt;/span&gt; &lt;span class="mention"&gt;not&lt;/span&gt; key:&lt;br&gt;&lt;br&gt;
        &lt;span class="mention"&gt;raise&lt;/span&gt; RuntimeError(&lt;span class="mention"&gt;"OPENAI_API_KEY is not set in the environment."&lt;/span&gt;)&lt;br&gt;&lt;br&gt;
    client = OpenAI(api_key=key)&lt;br&gt;&lt;br&gt;
    print(&lt;span class="mention"&gt;"\nBusiness Service RAG Chat"&lt;/span&gt;)&lt;br&gt;&lt;br&gt;
    print(&lt;span class="mention"&gt;"Type 'exit' or 'quit' to stop.\n"&lt;/span&gt;)&lt;br&gt;&lt;br&gt;
    &lt;span class="mention"&gt;# Running list of inputs (messages + tool calls + tool outputs) for context&lt;/span&gt;&lt;br&gt;&lt;br&gt;
    input_items: List[Dict[str, Any]] = []&lt;br&gt;&lt;br&gt;
    &lt;span class="mention"&gt;while&lt;/span&gt; &lt;span class="mention"&gt;True&lt;/span&gt;:&lt;br&gt;&lt;br&gt;
        user_input = input(&lt;span class="mention"&gt;"You: "&lt;/span&gt;).strip()&lt;br&gt;&lt;br&gt;
        &lt;span class="mention"&gt;if&lt;/span&gt; &lt;span class="mention"&gt;not&lt;/span&gt; user_input:&lt;br&gt;&lt;br&gt;
            &lt;span class="mention"&gt;continue&lt;/span&gt;&lt;br&gt;&lt;br&gt;
        &lt;span class="mention"&gt;if&lt;/span&gt; user_input.lower() &lt;span class="mention"&gt;in&lt;/span&gt; {&lt;span class="mention"&gt;"exit"&lt;/span&gt;, &lt;span class="mention"&gt;"quit"&lt;/span&gt;}:&lt;br&gt;&lt;br&gt;
            print(&lt;span class="mention"&gt;"Goodbye."&lt;/span&gt;)&lt;br&gt;&lt;br&gt;
            &lt;span class="mention"&gt;break&lt;/span&gt;&lt;br&gt;&lt;br&gt;
        &lt;span class="mention"&gt;# Add user message&lt;/span&gt;&lt;br&gt;&lt;br&gt;
        input_items.append({&lt;span class="mention"&gt;"role"&lt;/span&gt;: &lt;span class="mention"&gt;"user"&lt;/span&gt;, &lt;span class="mention"&gt;"content"&lt;/span&gt;: user_input})&lt;br&gt;&lt;br&gt;
        &lt;span class="mention"&gt;# 1) First call: let the model decide whether to call tools&lt;/span&gt;&lt;br&gt;&lt;br&gt;
        response = client.responses.create(&lt;br&gt;&lt;br&gt;
            model=MODEL,&lt;br&gt;&lt;br&gt;
            instructions=SYSTEM_INSTRUCTIONS,&lt;br&gt;&lt;br&gt;
            tools=TOOLS,&lt;br&gt;&lt;br&gt;
            input=input_items,&lt;br&gt;&lt;br&gt;
        )&lt;br&gt;&lt;br&gt;
        &lt;span class="mention"&gt;# Save model output items to our running conversation&lt;/span&gt;&lt;br&gt;&lt;br&gt;
        input_items += response.output&lt;br&gt;&lt;br&gt;
        &lt;span class="mention"&gt;# 2) Execute any function calls&lt;/span&gt;&lt;br&gt;&lt;br&gt;
        &lt;span class="mention"&gt;# The Responses API returns &lt;code&gt;function_call&lt;/code&gt; items in &lt;code&gt;response.output&lt;/code&gt;.&lt;/span&gt;&lt;br&gt;&lt;br&gt;
        &lt;span class="mention"&gt;for&lt;/span&gt; item &lt;span class="mention"&gt;in&lt;/span&gt; response.output:&lt;br&gt;&lt;br&gt;
            &lt;span class="mention"&gt;if&lt;/span&gt; getattr(item, &lt;span class="mention"&gt;"type"&lt;/span&gt;, &lt;span class="mention"&gt;None&lt;/span&gt;) != &lt;span class="mention"&gt;"function_call"&lt;/span&gt;:&lt;br&gt;&lt;br&gt;
                &lt;span class="mention"&gt;continue&lt;/span&gt;&lt;br&gt;&lt;br&gt;
            name = getattr(item, &lt;span class="mention"&gt;"name"&lt;/span&gt;, &lt;span class="mention"&gt;None&lt;/span&gt;)&lt;br&gt;&lt;br&gt;
            raw_args = getattr(item, &lt;span class="mention"&gt;"arguments"&lt;/span&gt;, &lt;span class="mention"&gt;"{}"&lt;/span&gt;)&lt;br&gt;&lt;br&gt;
            &lt;span class="mention"&gt;try&lt;/span&gt;:&lt;br&gt;&lt;br&gt;
                args = json.loads(raw_args) &lt;span class="mention"&gt;if&lt;/span&gt; isinstance(raw_args, str) &lt;span class="mention"&gt;else&lt;/span&gt; raw_args&lt;br&gt;&lt;br&gt;
            &lt;span class="mention"&gt;except&lt;/span&gt; json.JSONDecodeError:&lt;br&gt;&lt;br&gt;
                args = {&lt;span class="mention"&gt;"query"&lt;/span&gt;: user_input}&lt;br&gt;&lt;br&gt;
            result_str = call_rag_tool(name, args &lt;span class="mention"&gt;or&lt;/span&gt; {})&lt;br&gt;&lt;br&gt;
            &lt;span class="mention"&gt;# Append tool result back as function_call_output&lt;/span&gt;&lt;br&gt;&lt;br&gt;
            input_items.append(&lt;br&gt;&lt;br&gt;
                {&lt;br&gt;&lt;br&gt;
                    &lt;span class="mention"&gt;"type"&lt;/span&gt;: &lt;span class="mention"&gt;"function_call_output"&lt;/span&gt;,&lt;br&gt;&lt;br&gt;
                    &lt;span class="mention"&gt;"call_id"&lt;/span&gt;: getattr(item, &lt;span class="mention"&gt;"call_id"&lt;/span&gt;, &lt;span class="mention"&gt;None&lt;/span&gt;),&lt;br&gt;&lt;br&gt;
                    &lt;span class="mention"&gt;"output"&lt;/span&gt;: result_str,&lt;br&gt;&lt;br&gt;
                }&lt;br&gt;&lt;br&gt;
            )&lt;br&gt;&lt;br&gt;
        &lt;span class="mention"&gt;# 3) Second call: ask the model to answer using tool outputs&lt;/span&gt;&lt;br&gt;&lt;br&gt;
        followup = client.responses.create(&lt;br&gt;&lt;br&gt;
            model=MODEL,&lt;br&gt;&lt;br&gt;
            instructions=(&lt;br&gt;&lt;br&gt;
                SYSTEM_INSTRUCTIONS&lt;br&gt;&lt;br&gt;
                + &lt;span class="mention"&gt;"\n\nYou have just received outputs from your tools. "&lt;/span&gt;&lt;br&gt;&lt;br&gt;
                + &lt;span class="mention"&gt;"Use them to give a concise, well-structured answer."&lt;/span&gt;&lt;br&gt;&lt;br&gt;
            ),&lt;br&gt;&lt;br&gt;
            tools=TOOLS,&lt;br&gt;&lt;br&gt;
            input=input_items,&lt;br&gt;&lt;br&gt;
        )&lt;br&gt;&lt;br&gt;
        structured = extract_answer_and_sources(followup)&lt;br&gt;&lt;br&gt;
        print(&lt;span class="mention"&gt;"Agent:\n"&lt;/span&gt; + structured[&lt;span class="mention"&gt;"answer"&lt;/span&gt;] + &lt;span class="mention"&gt;"\n"&lt;/span&gt;)&lt;br&gt;&lt;br&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt;That’s it! You’ve built a complete RAG pipeline powered by IRIS Vector Search. While this example focused on a simple use case, IRIS Vector Search opens the door to many more possibilities:&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;

&lt;li&gt;&lt;span&gt;&lt;span&gt;Knowledge store for more complex customer support agents&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;

&lt;li&gt;&lt;span&gt;&lt;span&gt;Conversational context storage for hyper-personalized agents &lt;/span&gt;&lt;/span&gt;&lt;/li&gt;

&lt;li&gt;&lt;span&gt;&lt;span&gt;Anomaly detection in textual data&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;

&lt;li&gt;&lt;span&gt;&lt;span&gt;Clustering analysis for textual data&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;

&lt;/ul&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt;I hope this walkthrough gave you a solid starting point for exploring vector search and building your own AI-driven applications with InterSystems IRIS!&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&lt;span&gt;The full codebase can be found here:&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;

&lt;li&gt;&lt;a href="https://openexchange.intersystems.com/portal/products/IRISVectorSearchRAGExample" rel="noopener noreferrer"&gt;&lt;span&gt;&lt;span&gt;&lt;a href="https://openexchange.intersystems.com/portal/products/IRISVectorSearchRAGExample" rel="noopener noreferrer"&gt;&lt;/a&gt;&lt;a href="https://openexchange.intersystems.com/portal/products/IRISVectorSearchRAGExample" rel="noopener noreferrer"&gt;https://openexchange.intersystems.com/portal/products/IRISVectorSearchRAGExample&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;/li&gt;

&lt;li&gt;&lt;a href="https://github.com/isc-epolakie/IRISVectorSearchRAGExample" rel="noopener noreferrer"&gt;&lt;span&gt;&lt;span&gt;&lt;a href="https://github.com/isc-epolakie/IRISVectorSearchRAGExample" rel="noopener noreferrer"&gt;&lt;/a&gt;&lt;a href="https://github.com/isc-epolakie/IRISVectorSearchRAGExample" rel="noopener noreferrer"&gt;https://github.com/isc-epolakie/IRISVectorSearchRAGExample&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;/li&gt;

&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>api</category>
      <category>python</category>
      <category>sql</category>
    </item>
    <item>
      <title>Virtualizing large databases - VMware CPU capacity planning</title>
      <dc:creator>InterSystems Developer</dc:creator>
      <pubDate>Wed, 25 Mar 2026 19:39:06 +0000</pubDate>
      <link>https://dev.to/intersystems/virtualizing-large-databases-vmware-cpu-capacity-planning-13b6</link>
      <guid>https://dev.to/intersystems/virtualizing-large-databases-vmware-cpu-capacity-planning-13b6</guid>
      <description>&lt;p&gt;I am often asked by customers, vendors or internal teams to explain CPU capacity planning for &lt;em&gt;large production databases&lt;/em&gt;  running on VMware vSphere. &lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;This post was originally written in 2017 for ESXi 6.0; I am updating it in February 2026. For context I have kept the original post, but highlighted the changes. The core principles remain valid for vSphere 7.x and 8.x, though there have been improvements to vNUMA handling, CPU scheduling (particularly for AMD EPYC), and CPU Hot Add compatibility with vNUMA in vSphere 8. Always consult the Performance Best Practices guide for your specific vSphere version. For a deeper dive see the "Additional links" section at the end of the post.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Changes are marked with:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;UPDATE 2026:&lt;/strong&gt; ...&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;In summary, there are a few simple best practices to follow when sizing CPU for large production databases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Plan for one vCPU per physical CPU core.&lt;/li&gt;
&lt;li&gt;Consider NUMA and ideally size VMs to keep CPU and memory local to a NUMA node. &lt;/li&gt;
&lt;li&gt;Right-size virtual machines. Add vCPUs only when needed. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Generally this leads to a couple of common questions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Because of hyper-threading, VMware lets me create VMs with 2x the number of physical CPU cores. Doesn’t that double capacity? Shouldn’t I create VMs with as many vCPUs as possible?&lt;/li&gt;
&lt;li&gt;What is a NUMA node? Should I care about NUMA?&lt;/li&gt;
&lt;li&gt;VMs should be right-sized, but how do I know when they are?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I answer these questions with examples below. But also remember: best practices are not written in stone, and sometimes you need to make compromises. For example, it is likely that large production database VMs will NOT fit in a NUMA node, and as we will see, that’s OK. Best practices are guidelines that you will have to evaluate and validate for your applications and environment.&lt;/p&gt;

&lt;p&gt;Although I am writing this with examples for databases running on InterSystems data platforms, the concepts and rules apply generally for capacity and performance planning for any large (Monster) VMs.&lt;/p&gt;



&lt;br&gt;
For virtualisation best practices and more posts on performance and capacity planning:&lt;br&gt;
&lt;a href="https://community.intersystems.com/post/capacity-planning-and-performance-series-index" rel="noopener noreferrer"&gt;A list of other posts in the InterSystems Data Platforms and performance series is here.&lt;/a&gt;



&lt;h1&gt;
  
  
  Monster VMs
&lt;/h1&gt;

&lt;p&gt;This post is mostly about deploying &lt;em&gt;Monster VMs&lt;/em&gt;, sometimes called &lt;em&gt;Wide VMs&lt;/em&gt;. The CPU resource requirements of high-transaction databases mean they are often deployed as Monster VMs.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A monster VM is a VM with more Virtual CPUs or memory than a physical NUMA node.&lt;/p&gt;
&lt;/blockquote&gt;



&lt;h1&gt;
  
  
  CPU architecture and NUMA
&lt;/h1&gt;

&lt;p&gt;Current Intel processors use a Non-Uniform Memory Access (NUMA) architecture. For example, the servers I am using to run tests for this post have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Two CPU sockets, each with a processor with 12 cores (Intel E5-2680 v3). &lt;/li&gt;
&lt;li&gt;256 GB memory (16 x 16GB RDIMM)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each 12-core processor has its own local memory (128 GB of RDIMMs plus local cache) and can also access memory attached to the other processor in the same host. Each package of 12 cores, CPU cache and 128 GB of RDIMM memory is a NUMA node. NUMA nodes are connected by a fast interconnect so that one processor can reach memory on another.&lt;/p&gt;

&lt;p&gt;Processes accessing local RDIMM and cache memory see lower latency than processes that must cross the interconnect to reach remote memory on another processor. Because remote access adds latency, performance is non-uniform. The same design applies to servers with more than two sockets; a four-socket Intel server has four NUMA nodes.&lt;/p&gt;

&lt;p&gt;ESXi understands physical NUMA, and the ESXi CPU scheduler is designed to optimise performance on NUMA systems. One way ESXi maximises performance is by creating data locality on a physical NUMA node. In our example, if you have a VM with 12 vCPUs and less than 128 GB of memory, ESXi will assign that VM to run on one of the physical NUMA nodes. This leads to the rule:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If possible size VMs to keep CPU and memory local to a NUMA node. &lt;/p&gt;
&lt;/blockquote&gt;
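The sizing rule is easy to express as a quick sanity check. The helper below is a hypothetical sketch, with this post's example host (two 12-core sockets, 128 GB per node) baked in as defaults:

```python
def fits_numa_node(vcpus, mem_gb, cores_per_node=12, mem_per_node_gb=128):
    """True when a VM's vCPU count and memory both fit inside one NUMA node.

    Defaults assume the example host in this post: 12 cores and 128 GB
    of RAM per NUMA node. Adjust them to match your own hardware.
    """
    too_wide = vcpus > cores_per_node or mem_gb > mem_per_node_gb
    return not too_wide

# A 12 vCPU / 96 GB VM stays local to one node;
# 16 vCPUs spills across nodes (a Monster VM, handled by vNUMA).
fits_numa_node(12, 96)   # True
fits_numa_node(16, 96)   # False
```

Hypervisor overhead also consumes a little CPU and memory per node, so in practice you would leave some headroom rather than sizing right to these limits.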

&lt;p&gt;If you need a Monster VM larger than a NUMA node, that is OK; ESXi does a very good job of automatically calculating and managing the requirements. For example, ESXi will create virtual NUMA nodes (vNUMA) that are intelligently scheduled onto the physical NUMA nodes for optimal performance. The vNUMA structure is exposed to the operating system. For example, on a host server with two 12-core processors and a VM with 16 vCPUs, ESXi may use eight physical cores on each of the two processors to schedule the VM’s vCPUs, and the operating system (Linux or Windows) will see two NUMA nodes.&lt;/p&gt;

&lt;p&gt;It is also important to right-size your VMs and not allocate more resources than are needed, as that can lead to wasted resources and loss of performance. As well as helping you size for NUMA, it is more efficient, and will result in better performance, to have a 12 vCPU VM with high (but safe) CPU utilisation than a 24 vCPU VM with low or middling CPU utilisation, especially if there are other VMs on the host needing to be scheduled and competing for resources. This also reinforces the rule:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Right-size virtual machines.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; There are differences between Intel and AMD implementations of NUMA. AMD has multiple NUMA nodes per processor. It’s been a while since I have seen AMD processors in a customer server, but if you have them, review the NUMA layout as part of your planning. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;UPDATE 2026:&lt;/strong&gt; AMD EPYC processors are now common in datacenter environments and have a different NUMA architecture than Intel. EPYC processors can have multiple NUMA nodes per socket, configured via NPS (NUMA Per Socket) BIOS settings. Starting with vSphere 7.0 Update 2, the ESXi CPU scheduler includes significant optimizations for AMD EPYC that can achieve up to 50% better performance out-of-the-box. For AMD EPYC, the default BIOS settings (NPS-1, CCX-as-NUMA disabled) provide optimal performance for most virtualization workloads. Review AMD's VMware vSphere Tuning Guides for your specific EPYC generation (7003, 8004, 9004 series) for detailed recommendations.&lt;/p&gt;
&lt;/blockquote&gt;



&lt;h2&gt;
  
  
  Wide VMs and Licensing
&lt;/h2&gt;

&lt;p&gt;For best NUMA scheduling, configure wide VMs.&lt;br&gt;
Correction June 2017: Configure VMs with 1 vCPU per socket. &lt;br&gt;
For example, by default a VM with 24 vCPUs should be configured as 24 CPU sockets, each with one core. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Follow VMware best practice rules.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Please see &lt;a href="https://blogs.vmware.com/performance/2017/03/virtual-machine-vcpu-and-vnuma-rightsizing-rules-of-thumb.html" rel="noopener noreferrer"&gt;this post on the VMware blogs&lt;/a&gt; for examples.&lt;/p&gt;

&lt;p&gt;The VMware blog post goes into detail, but the author, Mark Achtemichuk, recommends the following rules of thumb:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;While there are many advanced vNUMA settings, only in rare cases do they need to be changed from defaults.&lt;/li&gt;
&lt;li&gt;Always configure the virtual machine vCPU count to be reflected as Cores per Socket, until you exceed the physical core count of a single physical NUMA node.&lt;/li&gt;
&lt;li&gt;When you need to configure more vCPUs than there are physical cores in the NUMA node, evenly divide the vCPU count across the minimum number of NUMA nodes.&lt;/li&gt;
&lt;li&gt;Don’t assign an odd number of vCPUs when the size of your virtual machine exceeds a physical NUMA node.&lt;/li&gt;
&lt;li&gt;Don’t enable vCPU Hot Add unless you’re okay with vNUMA being disabled.&lt;/li&gt;
&lt;li&gt;Don’t create a VM larger than the total number of physical cores of your host.&lt;/li&gt;
&lt;/ul&gt;
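&lt;p&gt;The "evenly divide across the minimum number of NUMA nodes" rule above can be sketched as a small calculation. This is a simplified illustration of the rule of thumb, not VMware's actual placement algorithm, and the function name is my own.&lt;/p&gt;

```python
# Given a vCPU count and the physical cores per NUMA node, compute the
# minimum number of vNUMA nodes and the vCPUs per node (evenly divided).
import math

def vnuma_layout(vcpus, cores_per_node):
    """Return (numa_nodes, vcpus_per_node) for an even split across the
    minimum number of NUMA nodes."""
    nodes = math.ceil(vcpus / cores_per_node)
    if vcpus % nodes != 0:
        raise ValueError("choose a vCPU count that divides evenly across nodes")
    return nodes, vcpus // nodes

print(vnuma_layout(12, 12))  # (1, 12): fits inside one NUMA node
print(vnuma_layout(16, 12))  # (2, 8): 8 vCPUs on each of two nodes
```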

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;UPDATE 2026:&lt;/strong&gt; Starting with vSphere 6.5, vNUMA behavior was decoupled from the Cores per Socket setting. ESXi now automatically calculates and presents the optimal vNUMA topology to the guest OS. For most workloads, leaving the default settings is recommended. vSphere 8.0 introduced an enhanced virtual topology feature that automatically selects optimal coresPerSocket values for VMs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;UPDATE 2026:&lt;/strong&gt; For vSphere 8 and later: The limitation where CPU Hot Add disabled vNUMA has been lifted in vSphere 8 for VMs using virtual hardware version 20. VMs can now be configured to expose vNUMA topology even with CPU Hot-Add enabled. However, this requires the VM to use the latest virtual hardware compatibility and the new vSphere 8 API property to be configured. For earlier vSphere versions, the original guidance still applies: do not enable Hot Add for monster VMs.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;IRIS licensing counts cores, so this is not a problem. However, for software or databases other than IRIS, specifying that a VM has 24 sockets could make a difference to software licensing, so you must check with your vendors. &lt;/p&gt;



&lt;h1&gt;
  
  
  Hyper-threading and the CPU scheduler
&lt;/h1&gt;

&lt;p&gt;Hyper-threading (HT) often comes up in discussions. I hear; “hyper-threading doubles the number of CPU cores”, which obviously it can’t at the physical level; you have as many physical cores as you have. Hyper-threading should be enabled and will increase system performance. An expectation is maybe a 20%-30% or more application performance increase, but the actual amount is dependent on the application and the workload. It is certainly not double. &lt;/p&gt;

&lt;p&gt;As I posted in the &lt;a href="https://community.intersystems.com/post/intersystems-data-platforms-and-performance-%E2%80%93-part-9-cach%C3%A9-vmware-best-practice-guide" rel="noopener noreferrer"&gt;VMware best practice post&lt;/a&gt;, a good starting point for sizing &lt;em&gt;large production database VMs&lt;/em&gt; is to assume that each vCPU has full physical core dedication on the server; basically, ignore hyper-threading when capacity planning. For example: &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;For a 24-core host server plan for a total of up to 24 vCPU for production database VMs knowing there may be available headroom.  &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Once you have spent time monitoring the application, operating system and VMware performance during peak processing times you can decide if higher VM consolidation is possible. In the best practice post I stated the rule as;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;One physical CPU (includes hyper-threading) = One vCPU (includes hyper-threading).&lt;/p&gt;
&lt;/blockquote&gt;



&lt;h2&gt;
  
  
  Why Hyper-threading does not double CPU capacity
&lt;/h2&gt;

&lt;p&gt;HT on Intel Xeon processors is a way of creating two &lt;em&gt;logical&lt;/em&gt; CPUs on one physical core. The operating system can efficiently schedule against the two logical processors — if a process or thread on a logical processor is waiting, for example for IO, the physical CPU resources can be used by the other logical processor. Only one logical processor can be progressing at any point in time, so although the physical core is more efficiently utilised &lt;em&gt;performance is not doubled&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;With HT enabled in the host BIOS, when creating a VM you can configure a vCPU per HT logical processor. For example, on a 24-physical-core server with HT enabled you can create a VM with up to 48 vCPUs. The ESXi CPU scheduler will optimise processing by running VM processes on separate physical cores first (while still considering NUMA). I explore later in the post whether allocating more vCPUs than physical cores helps a Monster database VM scale.&lt;/p&gt;

&lt;h3&gt;
  
  
  Co-stop and CPU scheduling
&lt;/h3&gt;

&lt;p&gt;After monitoring host and application performance you may decide that some overcommitment of host CPU resources is possible. Whether this is a good idea is very dependent on the applications and workloads. An understanding of the scheduler, and a key metric to monitor, can help you be sure that you are not overcommitting host resources.&lt;/p&gt;

&lt;p&gt;I sometimes hear that for a VM to be progressing there must be the same number of free logical CPUs as there are vCPUs in the VM. For example, a 12 vCPU VM must ‘wait’ for 12 logical CPUs to be ‘available’ before execution progresses. However, since ESXi version 3 this has not been the case; ESXi uses relaxed co-scheduling of CPUs for better application performance.&lt;/p&gt;

&lt;p&gt;Because multiple cooperating threads or processes frequently synchronise with each other, not scheduling them together can increase latency in their operations; for example, a thread in a spin loop waiting on a thread that is not scheduled. For best performance ESXi tries to schedule as many sibling vCPUs together as possible, but the CPU scheduler can flexibly schedule vCPUs when there are multiple VMs competing for CPU resources in a consolidated environment. If there is too much time difference as some vCPUs make progress while siblings don’t (the time difference is called skew), then the leading vCPU will decide whether to stop itself (co-stop). Note that it is individual vCPUs that co-stop (or co-start), not the entire VM. This works very well even when there is some overcommitment of resources; however, as you would expect, too much overcommitment of CPU resources will inevitably impact performance. I show an example of overcommitment and co-stop later in Example 2.&lt;/p&gt;
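&lt;p&gt;As a much-simplified illustration of skew driving co-stop (the real ESXi scheduler is far more sophisticated, and the threshold below is invented purely for illustration):&lt;/p&gt;

```python
# Toy model of skew-based co-stop: sibling vCPUs that have progressed
# too far ahead of the slowest sibling stop themselves until the
# siblings catch up. The 5 ms threshold is a made-up figure.

SKEW_THRESHOLD_MS = 5  # hypothetical limit on sibling vCPU skew

def leading_vcpus_to_costop(vcpu_progress_ms, threshold=SKEW_THRESHOLD_MS):
    """Return indices of vCPUs far enough ahead of the slowest sibling
    that, in this toy model, they would co-stop."""
    slowest = min(vcpu_progress_ms)
    return [i for i, p in enumerate(vcpu_progress_ms)
            if p - slowest > threshold]

# vCPU 2 is 9 ms ahead of the slowest sibling (vCPU 1) and would co-stop.
print(leading_vcpus_to_costop([12, 10, 19, 13]))  # [2]
```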

&lt;p&gt;Remember it is not a flat-out race for CPU resources between VMs; the ESXi CPU scheduler’s job is to ensure that policies such as CPU shares, reservations and limits are followed while maximising CPU utilisation and ensuring fairness, throughput, responsiveness and scalability. A discussion of using reservations and shares to prioritise production workloads is beyond the scope of this post and dependent on your application and workload mix; I may revisit this at a later time if I find any IRIS-specific recommendations. There are many factors that come into play with the CPU scheduler, and this section just skims the surface. For a deep dive see the VMware white paper and other links in the references at the end of the post. &lt;/p&gt;



&lt;h1&gt;
  
  
  Examples
&lt;/h1&gt;

&lt;p&gt;To illustrate the different vCPU configurations, I ran a series of benchmarks using a high-transaction-rate, browser-based Hospital Information System application, similar in concept to the DVD Store database benchmark developed by VMware.&lt;/p&gt;

&lt;p&gt;The scripts for the benchmark are created based on observations and metrics from live hospital implementations and include high use workflows, transactions and components that use the highest system resources. Driver VMs on other hosts simulate web sessions (users) by executing scripts with randomised input data at set workflow transaction rates. A benchmark with a rate of 1x is the baseline. Rates can be scaled up and down in increments. &lt;/p&gt;

&lt;p&gt;Along with the database and operating system metrics, a good metric to gauge how the benchmark database VM is performing is component (or transaction) response time as measured on the server. An example of a component is part of an end user screen. An increase in component response time means users would start to see a change for the worse in application response time. A well performing database system must provide &lt;em&gt;consistent&lt;/em&gt; high performance for end users. In the following charts, I am measuring consistency of test performance, and an indication of end user experience, by averaging the response time of the 10 slowest high-use components. Average component response time is expected to be sub-second; a user screen may be made up of one component, or complex screens may have many components. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Remember you are always sizing for peak workload, plus a buffer for unexpected spikes in activity. I usually aim for average 80% peak CPU utilisation. &lt;/p&gt;
&lt;/blockquote&gt;
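&lt;p&gt;Working back from the 80% peak-utilisation target is simple arithmetic; the helper below is my own sketch of the idea, not a VMware or InterSystems formula.&lt;/p&gt;

```python
# Given a measured peak CPU demand (in cores' worth of work), find the
# smallest vCPU count that keeps that peak at or below the target
# utilisation (80% by default, matching the rule of thumb above).
import math

def vcpus_for_peak(peak_core_demand, target_utilisation=0.80):
    """Smallest vCPU count keeping the measured peak at or below target."""
    return math.ceil(peak_core_demand / target_utilisation)

# A measured peak of 18 cores of demand needs a 23 vCPU VM at an 80% target.
print(vcpus_for_peak(18))  # 23
```

&lt;p&gt;In practice you would then round to an even vCPU count that fits your NUMA layout, as discussed earlier.&lt;/p&gt;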

&lt;p&gt;A full list of benchmark hardware and software is at the end of the post.  &lt;/p&gt;



&lt;h2&gt;
  
  
  Example 1. Right-sizing - single monster VM per host
&lt;/h2&gt;

&lt;p&gt;It is possible to create a database VM that is sized to use all the physical cores of a host server, for example a 24 vCPU VM on the 24-physical-core host. Rather than running the server “bare-metal” in an IRIS database mirror for HA, or introducing the complication of operating system failover clustering, the database VM is included in a vSphere cluster for management and HA, for example DRS and VMware HA. &lt;/p&gt;

&lt;p&gt;I have seen customers follow old-school thinking and size a primary database VM for expected capacity at the end of five years of hardware life, but as we know from above it is better to right-size; you will get better performance and consolidation if your VMs are not oversized, and managing HA will be easier (think Tetris if there is maintenance or a host failure and the monster database VM has to migrate or restart on another host). If the transaction rate is forecast to increase significantly, vCPUs can be added ahead of time during planned maintenance. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Note: the CPU 'hot add' option disables vNUMA, so do not use it for monster VMs.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Consider the following chart showing a series of tests on the 24-core host. 3x transaction rate is the sweet spot and the capacity planning target for this 24-core system.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A single VM is running on the host.&lt;/li&gt;
&lt;li&gt;Four VM sizes were used to show performance at 12, 24, 36 and 48 vCPUs. &lt;/li&gt;
&lt;li&gt;Transaction rates (1x, 2x, 3x, 4x, 5x) were run for each VM size (if possible).&lt;/li&gt;
&lt;li&gt;Performance/user experience is shown as component response time (bars).&lt;/li&gt;
&lt;li&gt;Average CPU% utilisation in the guest VM (lines).&lt;/li&gt;
&lt;li&gt;Host CPU utilisation reached 100% (red dashed line) at 4x rate for all VM sizes.&lt;/li&gt;
&lt;/ul&gt;



&lt;br&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcommunity.intersystems.com%2Fsites%2Fdefault%2Ffiles%2Finline%2Fimages%2Fsingle_guest_vm.png" title="Single Guest VM" alt="24 Physical Core Host&amp;lt;br&amp;gt;
Single guest VM average CPU% and Component Response time " width="800" height="383"&gt;&lt;br&gt;


&lt;p&gt;There is a lot going on in this chart, but we can focus on a couple of interesting things. &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The 24 vCPU VM (orange) scaled up smoothly to the target 3x transaction rate. At 3x rate the in-guest VM is averaging 76% CPU (peaks were around 91%). Host CPU utilisation is not much more than the guest VM's. Component response time is pretty much flat up to 3x, so users are happy. As far as our target transaction rate is concerned, &lt;em&gt;this VM is right-sized&lt;/em&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So much for right-sizing; what about increasing vCPUs beyond physical cores, which means using hyper-threads? Is it possible to double performance and scalability? The short answer is &lt;em&gt;No!&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In this case the answer can be seen by looking at component response time from 4x onwards. While performance is ‘better’ with more logical cores (vCPUs) allocated, it is still not as flat and consistent as it was up to 3x. Users will be reporting slower response times at 4x no matter how many vCPUs are allocated. Remember, at 4x the &lt;em&gt;host&lt;/em&gt; is already flat-lined at 100% CPU utilisation as reported by vSphere. At higher vCPU counts, even though in-guest CPU metrics (vmstat) are reporting less than 100% utilisation, this is not the case for physical resources. Remember the guest operating system does not know it is virtualised and is just reporting on resources presented to it. Also note the guest operating system does not see HT threads; all vCPUs are presented as physical cores.&lt;/p&gt;

&lt;p&gt;The point is that database processes (there are more than 200 IRIS processes at 3x transaction rate) are very busy and make very efficient use of processors, there is not a lot of slack for logical processors to schedule more work, or consolidate more VMs to this host. For example, a large part of IRIS processing is happening in-memory so there is not a lot of wait on IO. So while you can allocate more vCPUs than physical cores there is not a lot to be gained because the host is already 100% utilised.&lt;/p&gt;

&lt;p&gt;IRIS is very good at handling high workloads. Even when the host and VM are at 100% CPU utilisation the application is still running, and the transaction rate is still increasing. Scaling is not linear, and as we can see response times are getting longer and user experience will suffer, but the application does not ‘fall off a cliff’; although it is not a good place to be, users can still work. If you have an application that is not so sensitive to response times, it is good to know you can push to the edge, and beyond, and IRIS still works safely.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Remember you do not want to run your database VM or your host at 100% CPU. You need capacity for unexpected spikes and growth in the VM, and ESXi hypervisor needs resources for all the networking, storage and other activities it does. &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I always plan for peaks of 80% CPU utilisation. Even then, sizing vCPUs only up to the number of physical cores leaves some headroom for the ESXi hypervisor on logical threads, even in extreme situations.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If you are running a hyper-converged (HCI) solution you MUST also factor in HCI CPU requirements at the host level. See my &lt;a href="https://community.intersystems.com/post/intersystems-data-platforms-and-performance-%E2%80%93-part-8-hyper-converged-infrastructure-capacity" rel="noopener noreferrer"&gt;previous post on HCI&lt;/a&gt; for more details. Basic CPU sizing of VMs deployed on HCI is the same as other VMs.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Remember, you must validate and test everything in your own environment and with your applications.&lt;/p&gt;



&lt;h2&gt;
  
  
  Example 2. Over committed resources
&lt;/h2&gt;

&lt;p&gt;I have seen customer sites reporting ‘slow’ application performance while the guest operating system reports there are CPU resources to spare. &lt;/p&gt;

&lt;p&gt;Remember the guest operating system does not know it is virtualised. Unfortunately in-guest metrics, for example as reported by vmstat (such as in pButtons), can be deceiving; you must also get host-level metrics and ESXi metrics (for example from &lt;code&gt;esxtop&lt;/code&gt;) to truly understand system health and capacity. &lt;/p&gt;

&lt;p&gt;As you can see in the chart above when the host is reporting 100% utilisation the guest VM can be reporting a lower utilisation. The 36 vCPU VM (red) is reporting 80% average CPU utilisation at 4x rate while the host is reporting 100%. Even a right-sized VM can be starved of resources, if for example, after go-live other VMs are migrated on to the host, or resources are over-committed through badly configured DRS rules.&lt;/p&gt;

&lt;p&gt;To show key metrics, for this series of tests I configured the following;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Two database VMs running on the host:
&lt;ul&gt;
&lt;li&gt;a 24 vCPU VM running at a constant 2x transaction rate (not shown on chart).&lt;/li&gt;
&lt;li&gt;a 24 vCPU VM running at 1x, 2x, 3x (these metrics are shown on the chart).&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With the other database VM using resources, at 3x rate the guest OS (RHEL 7) vmstat is only reporting 86% average CPU utilisation, and the run queue is only averaging 25. However, users of this system will be complaining loudly, as component response time shot up while processes were slowed.&lt;/p&gt;

&lt;p&gt;As shown in the following chart, Co-stop and Ready time tell the story of why user performance is so bad. Ready time (&lt;code&gt;%RDY&lt;/code&gt;) and Co-stop (&lt;code&gt;%CSTP&lt;/code&gt;) metrics show CPU resources are massively overcommitted at the target 3x rate. This should not really be a surprise, as the &lt;em&gt;host&lt;/em&gt; is running the other VM's 2x rate &lt;em&gt;and&lt;/em&gt; this database VM's 3x rate. &lt;/p&gt;
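&lt;p&gt;A rough screening of these esxtop CPU metrics can be sketched as below. The roughly 10% per-vCPU ready-time threshold is a common community rule of thumb rather than a hard VMware limit, and the function and thresholds are my own illustration; tune them for your environment.&lt;/p&gt;

```python
# Flag CPU contention from esxtop-style ready and co-stop percentages.
# Thresholds are illustrative starting points, not VMware-mandated limits.

def cpu_contention_warnings(pct_ready, pct_costop,
                            ready_limit=10.0, costop_limit=3.0):
    """Return warnings when per-vCPU ready or co-stop time exceeds thresholds."""
    warnings = []
    if pct_ready > ready_limit:
        warnings.append(f"%RDY {pct_ready:.1f} over {ready_limit}: CPU contention")
    if pct_costop > costop_limit:
        warnings.append(f"%CSTP {pct_costop:.1f} over {costop_limit}: sibling vCPU co-stop")
    return warnings

# Healthy VM: no warnings. Overcommitted VM: both metrics flagged.
print(cpu_contention_warnings(1.0, 0.5))
print(cpu_contention_warnings(22.5, 6.0))
```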



&lt;br&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcommunity.intersystems.com%2Fsites%2Fdefault%2Ffiles%2Finline%2Fimages%2Fovercommit_3.png" title="Over-committed host" width="800" height="492"&gt;&lt;br&gt;


&lt;p&gt;The chart shows Ready time increases when total CPU load on the host increases.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Ready time is time that a VM is ready to run but cannot because CPU resources are not available. &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Co-stop also increases; there are not enough free logical CPUs to allow the database VM's vCPUs to progress together (as I detailed in the HT section above). The end result is that processing is delayed due to contention for physical CPU resources. &lt;/p&gt;

&lt;p&gt;I have seen exactly this situation at a customer site where our support view from pButtons and vmstat only showed the virtualised operating system. While vmstat reported CPU headroom user performance experience was terrible. &lt;/p&gt;

&lt;p&gt;The lesson here is that the real problem was not diagnosed until ESXi metrics and a host-level view were made available: overcommitted CPU resources, caused by a general cluster CPU resource shortage and, making the situation worse, bad DRS rules causing high-transaction database VMs to migrate together and overwhelm host resources. &lt;/p&gt;



&lt;h2&gt;
  
  
  Example 3. Over committed resources - two monster VMs
&lt;/h2&gt;

&lt;p&gt;In this example I used a baseline 24 vCPU database VM running at 3x transaction rate, then two 24 vCPU database VMs at a constant 3x transaction rate. &lt;/p&gt;

&lt;p&gt;The average baseline CPU utilisation (see Example 1 above) was 76% for the VM and 85% for the host. A single 24 vCPU database VM is using all 24 physical processors. Running two 24 vCPU VMs means the VMs are competing for resources and are using all 48 logical execution threads on the server. &lt;/p&gt;



&lt;br&gt;
&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcommunity.intersystems.com%2Fsites%2Fdefault%2Ffiles%2Finline%2Fimages%2Fovercommit_2vm.png" title="Over-committed host" width="773" height="405"&gt;&lt;br&gt;


&lt;p&gt;Remembering that the host was not 100% utilised with a single VM at 3x, we can still see a significant drop in throughput and performance as two very busy 24 vCPU VMs attempt to use the 24 physical cores on the host (even with HT). Although IRIS is very efficient at using the available CPU resources, there is still a 16% drop in database throughput per VM and, more importantly, a more than 50% increase in component (user) response time. &lt;/p&gt;



&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;My aim for this post is to answer the common questions. See the reference section below for a deeper dive into host CPU resources and the VMware CPU scheduler.&lt;/p&gt;

&lt;p&gt;Even though there are many levels of nerd-knob twiddling and ESXi rat holes to go down to squeeze the last drop of performance out of your system, the basic rules are pretty simple.&lt;/p&gt;

&lt;p&gt;For &lt;em&gt;large production databases&lt;/em&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Plan for one vCPU per physical CPU core.&lt;/li&gt;
&lt;li&gt;Consider NUMA and ideally size VMs to keep CPU and memory local to a NUMA node. &lt;/li&gt;
&lt;li&gt;Right-size virtual machines. Add vCPUs only when needed. &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to consolidate VMs remember large databases are very busy and will heavily utilise CPUs (physical and logical) at peak times. Don't oversubscribe them until your monitoring tells you it is safe.&lt;/p&gt;



&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://blogs.vmware.com/vsphere/2014/02/overcommit-vcpupcpu-monster-vms.html" rel="noopener noreferrer"&gt;VMware Blog - When to Overcommit vCPU:pCPU for Monster VMs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://frankdenneman.nl/2016/07/06/introduction-2016-numa-deep-dive-series" rel="noopener noreferrer"&gt;Introduction 2016 NUMA Deep Dive Series&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/techpaper/vmware-vsphere-cpu-sched-performance-white-paper.pdf" rel="noopener noreferrer"&gt;The CPU Scheduler in VMware vSphere 5.1&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;



&lt;h2&gt;
  
  
  Tests
&lt;/h2&gt;

&lt;p&gt;I ran the examples in this post on a vSphere cluster made up of two-processor Dell R730s attached to an all-flash array. During the examples there were no bottlenecks on the network or storage.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;IRIS 2016.2.1.803.0&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;PowerEdge R730&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;2x Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz&lt;/li&gt;
&lt;li&gt;16x 16GB RDIMM, 2133 MT/s, Dual Rank, x4 Data Width&lt;/li&gt;
&lt;li&gt;SAS 12Gbps HBA External Controller &lt;/li&gt;
&lt;li&gt;HyperThreading (HT) on&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;PowerVault MD3420, 12G SAS, 2U-24 drive &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;24x 960GB Solid State Drive SAS Read Intensive MLC 12Gbps 2.5in Hot-plug Drive, PX04SR &lt;/li&gt;
&lt;li&gt;2 Controller, 12G SAS, 2U MD34xx, 8G Cache &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;VMware ESXi 6.0.0 build-2494585&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;VMs are configured for best practice; VMXNET3, PVSCSI, etc.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;RHEL 7&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Large pages&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The baseline 1x rate averaged 700,000 glorefs/second (database accesses/second). The 5x rate averaged more than 3,000,000 glorefs/second for 24 vCPUs. The tests were allowed to burn in until constant performance was achieved, and then 15-minute samples were taken and averaged. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;These examples are only to show the theory; you MUST validate with your own application!&lt;/p&gt;
&lt;/blockquote&gt;





&lt;h1&gt;
  
  
  Additional Links (Feb 2026)
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.vmware.com/docs/vsphere-esxi-vcenter-server-80U3-performance-best-practices" rel="noopener noreferrer"&gt;Performance Best Practices for VMware vSphere 8.0 Update 3&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.vmware.com/docs/vsphere80-virtual-topology-perf" rel="noopener noreferrer"&gt;VMware vSphere 8.0 Virtual Topology Performance Study&lt;/a&gt; (also referenced in the &lt;a href="https://blogs.vmware.com/cloud-foundation/2022/11/10/extreme-performance-series-automatic-vtopology-for-vms-vsphere8/" rel="noopener noreferrer"&gt;Extreme Performance Series blog post&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.vmware.com/techpapers/2021/vsphere70u2-cpu-sched-amd-epyc.html" rel="noopener noreferrer"&gt;Performance Optimizations in VMware vSphere 7.0 U2 CPU Scheduler for AMD EPYC Processors&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blogs.vmware.com/cloud-foundation/2017/03/09/virtual-machine-vcpu-and-vnuma-rightsizing-rules-of-thumb/" rel="noopener noreferrer"&gt;Virtual Machine vCPU and vNUMA Rightsizing – Guidelines (Mark Achtemichuk)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://frankdenneman.nl/2021/12/02/vsphere-7-cores-per-socket-and-virtual-numa/" rel="noopener noreferrer"&gt;vSphere 7 Cores per Socket and Virtual NUMA (Frank Denneman, 2021)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://frankdenneman.nl/2022/11/03/vsphere-8-cpu-topology-for-large-memory-footprint-vms-exceeding-numa-boundaries/" rel="noopener noreferrer"&gt;vSphere 8 CPU Topology for Large Memory Footprint VMs (Frank Denneman, 2022)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://frankdenneman.nl/2016/12/12/decoupling-cores-per-socket-virtual-numa-topology-vsphere-6-5/" rel="noopener noreferrer"&gt;Decoupling of Cores per Socket from Virtual NUMA Topology in vSphere 6.5 (Frank Denneman)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/tuning-guides/58003_amd-epyc-9004-tg-vmware-vsphere.pdf" rel="noopener noreferrer"&gt;AMD EPYC 9004 VMware vSphere Tuning Guide (AMD)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.vmware.com/docs/perf-latency-tuning-vsphere8" rel="noopener noreferrer"&gt;Performance Tuning for Latency-Sensitive Workloads in vSphere 8 (January 2025)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://williamlam.com/2022/11/virtual-numa-vnuma-and-cpu-hot-add-support-in-vsphere-8.html" rel="noopener noreferrer"&gt;vNUMA and CPU Hot-Add support in vSphere 8 (William Lam)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://frankdenneman.nl/category/numa/" rel="noopener noreferrer"&gt;NUMA Deep Dive Series (Frank Denneman)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>redis</category>
      <category>a11y</category>
      <category>beginners</category>
      <category>programming</category>
    </item>
    <item>
      <title>InterSystems Data Platforms and performance – VM Backups and IRIS freeze/thaw scripts</title>
      <dc:creator>InterSystems Developer</dc:creator>
      <pubDate>Wed, 25 Mar 2026 19:36:07 +0000</pubDate>
      <link>https://dev.to/intersystems/intersystems-data-platforms-and-performance-vm-backups-and-iris-freezethaw-scripts-7gi</link>
      <guid>https://dev.to/intersystems/intersystems-data-platforms-and-performance-vm-backups-and-iris-freezethaw-scripts-7gi</guid>
      <description>&lt;p&gt;Hi, this post was initially written for Caché. In June 2023, I finally updated it for IRIS. If you are revisiting the post since then, the only real change is substituting Caché for IRIS! I also updated the links for IRIS documentation and fixed a few typos and grammatical errors.  Enjoy :)&lt;/p&gt;

&lt;p&gt;In this post, I show strategies for backing up InterSystems IRIS using &lt;em&gt;External Backup&lt;/em&gt; with examples of integrating with snapshot-based solutions. Most solutions I see today are deployed on Linux on VMware, so a lot of the post shows how solutions integrate VMware snapshot technology as examples.&lt;/p&gt;

&lt;h2&gt;
  
  
  IRIS backup - batteries included?
&lt;/h2&gt;

&lt;p&gt;IRIS online backup is included with an IRIS install for uninterrupted backup of IRIS databases. But there are more efficient backup solutions you should consider as systems scale up. &lt;em&gt;External Backup&lt;/em&gt; integrated with snapshot technologies is the recommended solution for backing up systems, including IRIS databases. &lt;/p&gt;

&lt;h2&gt;
  
  
  Are there any special considerations for external backup?
&lt;/h2&gt;

&lt;p&gt;Online documentation for &lt;a href="http://docs.intersystems.com/irislatest/csp/docbook/DocBook.UI.Page.cls?KEY=GCDI_backup#GCDI_backup_methods_ext" rel="noopener noreferrer"&gt;External Backup&lt;/a&gt; has all the details. A key consideration is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"To ensure the integrity of the snapshot, IRIS provides methods to freeze writes to databases while the snapshot is created. Only physical writes to the database files are frozen during the snapshot creation, allowing user processes to continue performing updates in memory uninterrupted."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It is also important to note that part of the snapshot process on virtualised systems causes a short pause of the VM being backed up, often called stun time. This is usually less than a second, so it is not noticed by users and does not impact system operation; however, in some circumstances the stun can last longer. If the stun is longer than the quality of service (QoS) timeout for IRIS database mirroring, the backup node will think there has been a failure on the primary and will fail over. Later in this post, I explain how you can review stun times in case you need to change the mirroring QoS timeout.&lt;/p&gt;



&lt;br&gt;
&lt;a href="https://community.intersystems.com/post/capacity-planning-and-performance-series-index" rel="noopener noreferrer"&gt;A list of other InterSystems Data Platforms and performance series posts is here.&lt;/a&gt;

&lt;p&gt;You should also review the &lt;a href="http://docs.intersystems.com/irislatest/csp/docbook/DocBook.UI.Page.cls?KEY=GCDI_backup" rel="noopener noreferrer"&gt;IRIS online documentation Backup and Restore Guide&lt;/a&gt; alongside this post.&lt;br&gt;
&lt;/p&gt;

&lt;h1&gt;
  
  
  Backup choices
&lt;/h1&gt;
&lt;h2&gt;
  
  
  Minimal Backup Solution - IRIS Online Backup
&lt;/h2&gt;

&lt;p&gt;If you have nothing else, this comes in the box with the InterSystems data platform for zero-downtime backups. Remember, &lt;em&gt;IRIS online backup&lt;/em&gt; only backs up IRIS database files, capturing all blocks in the databases that are allocated for data, with the output written to a sequential file. IRIS Online Backup supports cumulative and incremental backups.&lt;/p&gt;

&lt;p&gt;In the context of VMware, an IRIS Online Backup is an in-guest backup solution. Like other in-guest solutions, IRIS Online Backup operations are essentially the same whether the application is virtualised or runs directly on a host. IRIS Online Backup must be coordinated with a system backup to copy the IRIS online backup output file to backup media and all other file systems used by your application. At a minimum, system backup must include the installation directory, journal and alternate journal directories, application files, and any directory containing external files the application uses. &lt;/p&gt;
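&lt;p&gt;As a minimal sketch of such a coordinated system backup (every path below is hypothetical, not a documented default -- substitute your own installation, journal and application directories), the online backup output file and the non-database directories might be archived together like this:&lt;/p&gt;

```shell
#!/bin/sh
# Minimal sketch only: archive the IRIS Online Backup output file together
# with non-database directories. All paths here are examples -- substitute
# your own installation, journal and application directories.

backup_system_files() {
    # $1 = output archive; remaining arguments = directories to include
    target=$1
    shift
    dirs=""
    for d in "$@"; do
        # Skip directories that do not exist so the sketch runs anywhere
        [ -d "$d" ] && dirs="$dirs $d"
    done
    if [ -n "$dirs" ]; then
        tar -czf "$target" $dirs && echo "Archived:$dirs -> $target"
    else
        echo "No candidate directories found; nothing archived."
    fi
}

# Example invocation (illustrative paths only):
backup_system_files /tmp/iris-system-backup.tar.gz \
    /opt/iris /iris/jnl/primary /iris/jnl/alternate /app/files
```

&lt;p&gt;On a real system, a wrapper like this would run after the IRIS Online Backup completes, so the sequential output file is included in the archive along with the other directories listed above.&lt;/p&gt;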

&lt;p&gt;IRIS Online Backup should be considered an entry-level approach for smaller sites wishing to implement a low-cost solution to back up only IRIS databases, or for ad-hoc backups; for example, it is helpful when setting up mirroring. However, as databases increase in size, and as IRIS is typically only part of a customer's data landscape, &lt;em&gt;External Backups&lt;/em&gt; combined with snapshot technology and third-party utilities are recommended as best practice, with advantages such as backup of non-database files, faster restore times, an enterprise-wide view of data, and better catalogue and management tools.&lt;/p&gt;



&lt;h2&gt;
  
  
  Recommended Backup Solution - External backup
&lt;/h2&gt;

&lt;p&gt;Using VMware as an example, virtualising adds functionality and choices for protecting entire VMs. Once you have virtualised a solution, you have effectively encapsulated your system — including the operating system, the application and the data — all within .vmdk (and some other) files. When required, these files are straightforward to manage and can be used to recover a whole system. This is very different from the same situation on a physical system, where you must recover and configure the components separately -- operating system, drivers, third-party applications, database and database files, etc.&lt;/p&gt;



&lt;h1&gt;
  
  
  VMware snapshot
&lt;/h1&gt;

&lt;p&gt;VMware’s vSphere Data Protection (VDP) and other third-party backup solutions for VM backup, such as Veeam or Commvault, take advantage of the functionality of VMware virtual machine snapshots to create backups. A high-level explanation of VMware snapshots follows; see the VMware documentation for more details.&lt;/p&gt;

&lt;p&gt;It is important to remember that snapshots are applied to the whole VM and that the operating system and any applications or the database engine are unaware that the snapshot is happening. Also, remember:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;By themselves, VMware snapshots are not backups!&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Snapshots &lt;em&gt;enable&lt;/em&gt; backup software to make backups, but they are not backups by themselves.&lt;/p&gt;

&lt;p&gt;VDP and third-party backup solutions use the VMware snapshot process in conjunction with the backup application to manage the creation and, very importantly, deletion of snapshots. At a high level, the process and sequence of events for an external backup using VMware snapshots are as follows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Third-party backup software requests the ESXi host to trigger a VMware snapshot.&lt;/li&gt;
&lt;li&gt;The VM's .vmdk files are put into a read-only state, and a child .vmdk delta file is created for each of the VM's .vmdk files.&lt;/li&gt;
&lt;li&gt;Copy-on-write is used: all changes to the VM are written to the delta files, and any reads check the delta file first.&lt;/li&gt;
&lt;li&gt;The backup software manages copying the read-only parent .vmdk files to the backup target.&lt;/li&gt;
&lt;li&gt;When the backup is complete, the snapshot is committed (writes to the VM disks resume, and the updated blocks in the delta files are written back to the parents).&lt;/li&gt;
&lt;li&gt;The VMware snapshot is now removed.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Backup solutions also use other features, such as Changed Block Tracking (CBT), to allow incremental or cumulative backups for speed and efficiency (especially important for space saving). They typically also add other important functions such as data deduplication and compression, scheduling, mounting VMs with changed IP addresses for integrity checks, full-VM and file-level restores, and catalogue management.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;VMware snapshots that are not appropriately managed or left to run for a long time can use excessive storage (as more and more data is changed, delta files continue to grow) and also slow down your VMs.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You should think carefully before running a manual snapshot on a production instance. Why are you doing this? What will happen if you revert &lt;em&gt;back in time&lt;/em&gt; to when the snapshot was created? What happens to all the application transactions between creation and rollback? &lt;/p&gt;

&lt;p&gt;It is OK if your backup software creates and deletes a snapshot. The snapshot should only be around for a short time. And a crucial part of your backup strategy will be to choose a time when the system has low usage to minimise any further impact on users and performance.&lt;/p&gt;

&lt;h2&gt;
  
  
  IRIS database considerations for snapshots
&lt;/h2&gt;

&lt;p&gt;Before the snapshot is taken, the database must be quiesced so that all pending writes are committed, and the database is in a consistent state. IRIS provides methods and an API to commit and then freeze (stop) writes to databases for a short period while the snapshot is created. This way, only physical writes to the database files are frozen during the creation of the snapshot, allowing user processes to continue performing updates in memory uninterrupted. Once the snapshot has been triggered, database writes are thawed, and the backup continues copying data to backup media. The time between freeze and thaw should be quick (a few seconds).&lt;/p&gt;

&lt;p&gt;In addition to pausing writes, the IRIS freeze also handles switching journal files and writing a backup marker to the journal. The journal file continues to be written normally while physical database writes are frozen. If the system were to crash while the physical database writes are frozen, data would be recovered from the journal as usual during start-up.&lt;/p&gt;

&lt;p&gt;The following diagram shows freeze and thaw with VMware snapshot steps to create a backup with a consistent database image.&lt;/p&gt;



&lt;h2&gt;
  
  
  VMware snapshot + IRIS freeze/thaw timeline (not to scale)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1zdgh6huxqq1nod38w7d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1zdgh6huxqq1nod38w7d.png" alt=" " width="800" height="731"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Note the short time between Freeze and Thaw -- only the time to create the snapshot, not the time to copy the read-only parent to the backup target.&lt;/em&gt;&lt;/p&gt;


&lt;/blockquote&gt;



&lt;h2&gt;
  
  
  Summary - Why do I need to freeze and thaw the IRIS database when VMware is taking a snapshot?
&lt;/h2&gt;

&lt;p&gt;The process of freezing and thawing the database is crucial to ensure data consistency and integrity. This is because:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data Consistency:&lt;/strong&gt; IRIS can be writing journals, the WIJ, or random database blocks at any time. A snapshot captures the state of the VM at a specific point in time. If the database files are actively being written to during the snapshot, the snapshot can contain partial or inconsistent data. Freezing the database ensures that all pending physical writes are completed and no new writes to the database files start during the snapshot, leading to a consistent on-disk state.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Quiescing the File System:&lt;/strong&gt; VMware's snapshot technology can quiesce the file system to ensure file system consistency. However, this does not account for the application or database level consistency. Freezing the database ensures that the database is in a consistent state at the application level, complementing VMware's quiescing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reducing Recovery Time:&lt;/strong&gt; Restoring from a snapshot that was taken without freezing the database might require additional steps like database repair or consistency checks, which can significantly increase recovery time. Freezing and thawing ensure the database is immediately usable upon restoration, reducing downtime.&lt;/p&gt;



&lt;h1&gt;
  
  
  Integrating IRIS Freeze and Thaw
&lt;/h1&gt;

&lt;p&gt;vSphere allows a script to be automatically called on either side of snapshot creation; this is when IRIS Freeze and Thaw are called. Note: For this functionality to work correctly, the ESXi host requests the guest operating system to quiesce the disks via &lt;em&gt;VMware Tools.&lt;/em&gt; &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;VMware Tools must be installed in the guest operating system.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The scripts must adhere to strict name and location rules. File permissions must also be set. For VMware on Linux, the script names are:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# /usr/sbin/pre-freeze-script
# /usr/sbin/post-thaw-script
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Below are examples of freeze and thaw scripts our team uses with Veeam backup for our internal test lab instances, but these scripts should also work with other solutions. These examples have been tested and used on vSphere 6 and Red Hat 7.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;While these scripts can be used as examples and illustrate the method, you must validate them for your environments!&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Example pre-freeze-script:
&lt;/h3&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#!/bin/sh
#
# Script called by VMWare immediately prior to snapshot for backup.
# Tested on Red Hat 7.2
#

LOGDIR=/var/log
SNAPLOG=$LOGDIR/snapshot.log

echo &amp;gt;&amp;gt; $SNAPLOG
echo "`date`: Pre freeze script started" &amp;gt;&amp;gt; $SNAPLOG
exit_code=0

# Only for running instances
for INST in `iris qall 2&amp;gt;/dev/null | tail -n +3 | grep '^up' | cut -c5-  | awk '{print $1}'`; do

    echo "`date`: Attempting to freeze $INST" &amp;gt;&amp;gt; $SNAPLOG

    # Detailed instances specific log    
    LOGFILE=$LOGDIR/$INST-pre_post.log

    # Freeze
    irissession $INST -U '%SYS' "##Class(Backup.General).ExternalFreeze(\"$LOGFILE\",,,,,,1800)" &amp;gt;&amp;gt; $SNAPLOG 2&amp;gt;&amp;amp;1
    status=$?

    case $status in
        5) echo "`date`:   $INST IS FROZEN" &amp;gt;&amp;gt; $SNAPLOG
           ;;
        3) echo "`date`:   $INST FREEZE FAILED" &amp;gt;&amp;gt; $SNAPLOG
           logger -p user.err "freeze of $INST failed"
           exit_code=1
           ;;
        *) echo "`date`:   ERROR: Unknown status code: $status" &amp;gt;&amp;gt; $SNAPLOG
           logger -p user.err "ERROR when freezing $INST"
           exit_code=1
           ;;
    esac
    echo "`date`:   Completed freeze of $INST" &amp;gt;&amp;gt; $SNAPLOG
done

echo "`date`: Pre freeze script finished" &amp;gt;&amp;gt; $SNAPLOG
exit $exit_code
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  Example thaw script:
&lt;/h3&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;#!/bin/sh
#
# Script called by VMWare immediately after backup snapshot has been created
# Tested on Red Hat 7.2
#

LOGDIR=/var/log
SNAPLOG=$LOGDIR/snapshot.log

echo &amp;gt;&amp;gt; $SNAPLOG
echo "`date`: Post thaw script started" &amp;gt;&amp;gt; $SNAPLOG
exit_code=0

if [ -d "$LOGDIR" ]; then

    # Only for running instances    
    for INST in `iris qall 2&amp;gt;/dev/null | tail -n +3 | grep '^up' | cut -c5-  | awk '{print $1}'`; do

        echo "`date`: Attempting to thaw $INST" &amp;gt;&amp;gt; $SNAPLOG

        # Detailed instances specific log
        LOGFILE=$LOGDIR/$INST-pre_post.log

        # Thaw
        irissession $INST -U%SYS "##Class(Backup.General).ExternalThaw(\"$LOGFILE\")" &amp;gt;&amp;gt; $SNAPLOG 2&amp;gt;&amp;amp;1
        status=$?

        case $status in
            5) echo "`date`:   $INST IS THAWED" &amp;gt;&amp;gt; $SNAPLOG
               irissession $INST -U%SYS "##Class(Backup.General).ExternalSetHistory(\"$LOGFILE\")" &amp;gt;&amp;gt; $SNAPLOG 2&amp;gt;&amp;amp;1
               ;;
            3) echo "`date`:   $INST THAW FAILED" &amp;gt;&amp;gt; $SNAPLOG
               logger -p user.err "thaw of $INST failed"
               exit_code=1
               ;;
            *) echo "`date`:   ERROR: Unknown status code: $status" &amp;gt;&amp;gt; $SNAPLOG
               logger -p user.err "ERROR when thawing $INST"
               exit_code=1
               ;;
        esac
        echo "`date`:   Completed thaw of $INST" &amp;gt;&amp;gt; $SNAPLOG
    done
fi

echo "`date`: Post thaw script finished" &amp;gt;&amp;gt; $SNAPLOG
exit $exit_code
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  Remember to set permissions:
&lt;/h3&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# sudo chown root.root /usr/sbin/pre-freeze-script /usr/sbin/post-thaw-script
# sudo chmod 0700 /usr/sbin/pre-freeze-script /usr/sbin/post-thaw-script
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
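&lt;p&gt;A quick sanity check that the hook scripts are in place, sketched below, reports each script's owner and mode so you can confirm root ownership and mode 700 (&lt;code&gt;stat -c&lt;/code&gt; assumes GNU coreutils, i.e. Linux):&lt;/p&gt;

```shell
#!/bin/sh
# Sketch: report whether each VMware Tools hook script exists, and print its
# owner and octal mode so you can confirm root ownership and mode 700.
check_hook_script() {
    f=$1
    if [ ! -f "$f" ]; then
        echo "$f: MISSING"
    else
        # GNU stat: %U = owner, %a = octal mode, e.g. "root 700"
        echo "$f: $(stat -c '%U %a' "$f")"
    fi
}

check_hook_script /usr/sbin/pre-freeze-script
check_hook_script /usr/sbin/post-thaw-script
```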



&lt;h2&gt;
  
  
  Testing Freeze and Thaw
&lt;/h2&gt;

&lt;p&gt;To test the scripts are running correctly, you can manually run a snapshot on a VM and check the script output. The following screenshot shows the "Take VM Snapshot" dialogue and options. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsolgyriys7js56de5epc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsolgyriys7js56de5epc.png" alt=" " width="427" height="324"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Deselect&lt;/strong&gt; - "Snapshot the virtual machine's memory".&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Select&lt;/strong&gt; - the "Quiesce guest file system (Needs VMware Tools installed)" check box to pause running processes on the guest operating system so that file system contents are in a known consistent state when you take the snapshot.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Important! After your test, remember to delete the snapshot!&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If the quiesce flag is true, and the virtual machine is powered on when the snapshot is taken, VMware Tools is used to quiesce the file system in the virtual machine. Quiescing a file system is a process of bringing the on-disk data into a state suitable for backups. This process might include such operations as flushing dirty buffers from the operating system's in-memory cache to disk. &lt;/p&gt;

&lt;p&gt;The following output shows the contents of the &lt;code&gt;$SNAPLOG&lt;/code&gt; log file set in the example freeze/thaw scripts above after running a backup that includes a snapshot as part of its operation. &lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Wed Jan  4 16:30:35 EST 2017: Pre freeze script started
Wed Jan  4 16:30:35 EST 2017: Attempting to freeze H20152
Wed Jan  4 16:30:36 EST 2017:   H20152 IS FROZEN
Wed Jan  4 16:30:36 EST 2017:   Completed freeze of H20152
Wed Jan  4 16:30:36 EST 2017: Pre freeze script finished

Wed Jan  4 16:30:41 EST 2017: Post thaw script started
Wed Jan  4 16:30:41 EST 2017: Attempting to thaw H20152
Wed Jan  4 16:30:42 EST 2017:   H20152 IS THAWED
Wed Jan  4 16:30:42 EST 2017:   Completed thaw of H20152
Wed Jan  4 16:30:42 EST 2017: Post thaw script finished
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This example shows 6 seconds of elapsed time between freeze and thaw (16:30:36-16:30:42). User operations are NOT interrupted during this period. &lt;em&gt;You will have to gather metrics from your own systems&lt;/em&gt;, but for some context, this example is from a system running an application benchmark on a VM with no IO bottlenecks and an average of more than 2 million Glorefs/sec, 170,000 Gloupds/sec, and an average 1,100 physical reads/sec and 3,000 writes per write daemon cycle. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Remember that memory is not part of the snapshot, so on restarting, the VM will reboot and recover. Database files will be consistent. You don’t want to "resume" a backup; you want the files at a known point in time. You can then roll forward journals and whatever other recovery steps are needed for the application and transactional consistency once the files are recovered.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;For additional data protection, a &lt;a href="http://docs.intersystems.com/irislatest/csp/docbook/DocBook.UI.Page.cls?KEY=GCDI_journal#GCDI_journal_util_JRNSWTCH" rel="noopener noreferrer"&gt;journal switch&lt;/a&gt; can be done by itself, and journals can be backed up or replicated to another location, for example, hourly.&lt;/p&gt;
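&lt;p&gt;As a sketch of that hourly journal copy (the directories, and running it from cron, are assumptions -- substitute your own journal directory and replication target):&lt;/p&gt;

```shell
#!/bin/sh
# Sketch only: copy journal files to a second location, e.g. run hourly
# from cron. Paths are illustrative, not documented defaults.
replicate_journals() {
    src=$1    # journal directory, e.g. /iris/jnl/primary
    dest=$2   # replication target, e.g. /backup/jnl
    mkdir -p "$dest"
    for f in "$src"/*; do
        [ -f "$f" ] || continue
        base=$(basename "$f")
        # Copy new files, and re-copy any file modified since the last run
        # (the current journal file is still growing between runs).
        if [ ! -e "$dest/$base" ] || [ "$f" -nt "$dest/$base" ]; then
            cp -p "$f" "$dest/$base"
        fi
    done
}

# Example cron entry (assumption -- adapt schedule and paths):
# 0 * * * * /usr/local/bin/replicate-journals.sh /iris/jnl/primary /backup/jnl
```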

&lt;p&gt;Below is the output of the &lt;code&gt;$LOGFILE&lt;/code&gt;  in the example freeze/thaw scripts above, showing journal details for the snapshot.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;01/04/2017 16:30:35: Backup.General.ExternalFreeze: Suspending system

Journal file switched to:
/trak/jnl/jrnpri/h20152/H20152_20170104.011
01/04/2017 16:30:35: Backup.General.ExternalFreeze: Start a journal restore for this backup with journal file: /trak/jnl/jrnpri/h20152/H20152_20170104.011

Journal marker set at
offset 197192 of /trak/jnl/jrnpri/h20152/H20152_20170104.011
01/04/2017 16:30:36: Backup.General.ExternalFreeze: System suspended
01/04/2017 16:30:41: Backup.General.ExternalThaw: Resuming system
01/04/2017 16:30:42: Backup.General.ExternalThaw: System resumed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  VM Stun Times
&lt;/h1&gt;

&lt;p&gt;At the creation of a VM snapshot, and again when the snapshot is committed after the backup is complete, the VM needs to be frozen for a short period. This short freeze is often referred to as stunning the VM. A good blog post on stun times is &lt;a href="http://cormachogan.com/2015/04/28/when-and-why-do-we-stun-a-virtual-machine/" rel="noopener noreferrer"&gt;here&lt;/a&gt;. I summarise the details below and put them in the context of IRIS database considerations.&lt;/p&gt;

&lt;p&gt;From the post on stun times: “To create a VM snapshot, the VM is “stunned” in order to (i) serialize device state to disk, and (ii) close the current running disk and create a snapshot point. … When consolidating, the VM is “stunned” in order to close the disks and put them in a state that is appropriate for consolidation.”&lt;/p&gt;

&lt;p&gt;Stun time is typically a few hundred milliseconds; however, if there is very high disk write activity during the commit phase, stun time can be several seconds. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If the VM is a Primary or Backup member participating in IRIS Database Mirroring and the stun time is longer than the mirror Quality of Service (QoS) timeout, the mirror will report the Primary VM as failed and initiate a mirror takeover.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Update March 2018:&lt;/strong&gt;&lt;br&gt;
My colleague, Peter Greskoff, pointed out that a backup mirror member could initiate failover in as little as just over half the QoS timeout during a VM stun, or at any other time the primary mirror member is unavailable. &lt;/p&gt;

&lt;p&gt;For a detailed description of QoS considerations and failover scenarios, see this great post: &lt;a href="https://community.intersystems.com/post/quality-service-timeout-guide-mirroring" rel="noopener noreferrer"&gt;Quality of Service Timeout Guide for Mirroring&lt;/a&gt;. However, the short story regarding VM stun times and QoS is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If the backup mirror does not receive any messages from the primary mirror within half of the QoS timeout, it will send a message to ensure the primary is still alive. The backup then waits an additional half QoS time for a response from the primary machine. If there is no response from the primary, it is assumed to be down, and the backup will take over.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;On a busy system, journals are continuously sent from the primary to the backup mirror, and the backup would not need to check if the primary is still alive. However, during a quiet time — when backups are more likely to happen — if the application is idle, there may be no messages between the primary and backup mirror for more than half the QoS time.&lt;/p&gt;

&lt;p&gt;Here is Peter’s example. Think about this timeline for an idle system with a QoS timeout of 8 seconds and a VM stun time of 7 seconds:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;:00 Primary pings the arbiter with a keepalive, arbiter responds immediately&lt;/li&gt;
&lt;li&gt;:01 backup member sends keepalive to the primary, primary responds immediately&lt;/li&gt;
&lt;li&gt;:02&lt;/li&gt;
&lt;li&gt;:03 VM stun begins&lt;/li&gt;
&lt;li&gt;:04 primary tries to send keepalive to the arbiter, but it doesn’t get through until stun is complete&lt;/li&gt;
&lt;li&gt;:05 backup member sends a ping to primary, as half of QoS has expired&lt;/li&gt;
&lt;li&gt;:06&lt;/li&gt;
&lt;li&gt;:07&lt;/li&gt;
&lt;li&gt;:08 arbiter hasn’t heard from the primary in a full QoS timeout, so it closes the connection&lt;/li&gt;
&lt;li&gt;:09 The backup hasn’t gotten a response from the primary and confirms with the arbiter that it also lost connection, so it takes over&lt;/li&gt;
&lt;li&gt;:10 VM stun ends, too late!!&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Please also read the section &lt;em&gt;Pitfalls and Concerns when Configuring your Quality of Service Timeout&lt;/em&gt; in the linked post above to understand why the QoS timeout should be only as long as necessary. Having QoS too long, especially more than 30 seconds, can also cause problems.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;End update March 2018:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For more information on Mirroring QoS, also see the &lt;a href="https://docs.intersystems.com/irislatest/csp/docbook/DocBook.UI.Page.cls?KEY=GHA_mirror#GHA_mirror_set_tunable_params_qos" rel="noopener noreferrer"&gt;documentation&lt;/a&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Strategies to keep stun time to a minimum include running backups when database activity is low and having well-set-up storage.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;As noted above, when creating a snapshot, there are several options you can specify; one option is to include the memory state in the snapshot. Remember, &lt;em&gt;memory state is NOT needed for IRIS database backups&lt;/em&gt;. If the memory flag is set, a dump of the internal state of the virtual machine is included in the snapshot. Memory snapshots take much longer to create, and are used to allow reversion to a running virtual machine state as it was when the snapshot was taken. This is NOT required for a database file backup.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;When taking a memory snapshot, the entire state of the virtual machine will be stunned, &lt;strong&gt;stun time is variable&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;As noted previously, for backups, the quiesce flag must be set to true for manual snapshots or by the backup software to guarantee a consistent and usable backup. &lt;/p&gt;

&lt;h2&gt;
  
  
  Reviewing VMware logs for stun times
&lt;/h2&gt;

&lt;p&gt;Starting from ESXi 5.0, snapshot stun times are logged in each virtual machine's log file (vmware.log) with messages similar to:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;2017-01-04T22:15:58.846Z| vcpu-0| I125: Checkpoint_Unstun: vm stopped for 38123 us&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Stun times are in microseconds, so in the above example, &lt;code&gt;38123 us&lt;/code&gt; is 38123/1,000,000 seconds or 0.038 seconds. &lt;/p&gt;
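&lt;p&gt;As a quick sketch of that conversion, the duration is the eighth whitespace-separated field of the log line:&lt;/p&gt;

```shell
# Sketch: extract the stun duration (field 8, in microseconds) from a
# Checkpoint_Unstun log line and convert it to seconds.
line='2017-01-04T22:15:58.846Z| vcpu-0| I125: Checkpoint_Unstun: vm stopped for 38123 us'
echo "$line" | awk '{ printf "%.3f seconds\n", $8 / 1000000 }'
# Prints: 0.038 seconds
```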

&lt;p&gt;To be sure that stun times are within acceptable limits or to troubleshoot if you suspect long stun times are causing problems, you can download and review the vmware.log files from the folder of the VM that you are interested in. Once downloaded, you can extract and sort the log using the example Linux commands below. &lt;/p&gt;

&lt;h3&gt;
  
  
  Example downloading vmware.log files
&lt;/h3&gt;

&lt;p&gt;There are several ways to download support logs, including creating a VMware support bundle through the vSphere management console or from the ESXi host command line. Consult the VMware documentation for all the details, but below is a simple method to create and gather a much smaller support bundle that includes the &lt;code&gt;vmware.log&lt;/code&gt; file so you can review stun times. &lt;/p&gt;

&lt;p&gt;You will need the long name of the directory where the VM files are located. Log on to the ESXi host where the database VM is running using ssh, and use the command &lt;code&gt;vim-cmd vmsvc/getallvms&lt;/code&gt; to list the .vmx files and the unique long names associated with them. &lt;/p&gt;

&lt;p&gt;For example, the long name for the example database VM used in this post is output as:&lt;br&gt;
&lt;code&gt;26     vsan-tc2016-db1              [vsanDatastore] e2fe4e58-dbd1-5e79-e3e2-246e9613a6f0/vsan-tc2016-db1.vmx              rhel7_64Guest           vmx-11&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Next, run the command to gather and bundle only log files:&lt;br&gt;
&lt;code&gt;vm-support -a VirtualMachines:logs&lt;/code&gt;. &lt;/p&gt;

&lt;p&gt;The command will echo the location of the support bundle, for example:&lt;br&gt;
 &lt;code&gt;To see the files collected, check '/vmfs/volumes/datastore1 (3)/esx-esxvsan4.iscinternal.com-2016-12-30--07.19-9235879.tgz'&lt;/code&gt;. &lt;/p&gt;

&lt;p&gt;You can now use sftp to transfer the file off the host for further processing and review. &lt;/p&gt;

&lt;p&gt;In this example, after uncompressing the support bundle, navigate to the path corresponding to the database VM's long name. For example, in this case:&lt;br&gt;
 &lt;code&gt;&amp;lt;bundle name&amp;gt;/vmfs/volumes/&amp;lt;host long name&amp;gt;/e2fe4e58-dbd1-5e79-e3e2-246e9613a6f0&lt;/code&gt;. &lt;/p&gt;

&lt;p&gt;You will see several numbered log files; the most recent log file has no number, i.e. &lt;code&gt;vmware.log&lt;/code&gt;. The log may be only a few hundred KB, but it contains a lot of information; however, we care about the stun/unstun times, which are easy enough to find with &lt;code&gt;grep&lt;/code&gt;. For example:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ grep Unstun vmware.log
2017-01-04T21:30:19.662Z| vcpu-0| I125: Checkpoint_Unstun: vm stopped for 1091706 us
--- 
2017-01-04T22:15:58.846Z| vcpu-0| I125: Checkpoint_Unstun: vm stopped for 38123 us
2017-01-04T22:15:59.573Z| vcpu-0| I125: Checkpoint_Unstun: vm stopped for 298346 us
2017-01-04T22:16:03.672Z| vcpu-0| I125: Checkpoint_Unstun: vm stopped for 301099 us
2017-01-04T22:16:06.471Z| vcpu-0| I125: Checkpoint_Unstun: vm stopped for 341616 us
2017-01-04T22:16:24.813Z| vcpu-0| I125: Checkpoint_Unstun: vm stopped for 264392 us
2017-01-04T22:16:30.921Z| vcpu-0| I125: Checkpoint_Unstun: vm stopped for 221633 us
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;We can see two groups of stun times in the example: one from snapshot creation, and a second set 45 minutes later for each disk when the snapshot is deleted/consolidated (e.g. after the backup software has completed copying the read-only parent .vmdk files). The above example shows that most stun times are sub-second, although the initial stun time is just over one second. &lt;/p&gt;

&lt;p&gt;Short stun times are not noticeable to an end user. However, system processes such as IRIS Database Mirroring continuously monitor whether an instance is ‘alive’. If the stun time exceeds the mirroring QoS timeout, the node may be considered uncontactable and ‘dead’, and a failover will be triggered. &lt;/p&gt;

&lt;p&gt;&lt;em&gt;Tip:&lt;/em&gt; To review all the logs or for troubleshooting, a handy approach is to grep all the &lt;code&gt;vmware*.log&lt;/code&gt; files and look for any outliers or instances where stun time is approaching the QoS timeout. The following command pipes the output to awk for formatting:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;grep Unstun vmware* | awk '{ printf ("%'"'"'d", $8)} {print " ---" $0}' | sort -nr&lt;/code&gt;&lt;/p&gt;
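&lt;p&gt;Building on the same field extraction, here is a small sketch that flags stun times above a chosen fraction of the QoS timeout (the sample log lines and the 50% warning level are illustrative only):&lt;/p&gt;

```shell
#!/bin/sh
# Sketch: flag stun times longer than a threshold in microseconds. With an
# 8-second QoS timeout and a 50% warning level, the threshold is 4,000,000 us.
flag_long_stuns() {
    # $1 = threshold in microseconds; log lines are read from stdin
    grep Unstun | awk -v t="$1" '
        $8 + 0 >  t + 0 { print "WARNING:", $8, "us" }
        $8 + 0 <= t + 0 { print "ok:", $8, "us" }'
}

# Example with sample lines (illustrative values):
flag_long_stuns 4000000 <<'EOF'
2017-01-04T21:30:19.662Z| vcpu-0| I125: Checkpoint_Unstun: vm stopped for 1091706 us
2017-01-04T22:15:58.846Z| vcpu-0| I125: Checkpoint_Unstun: vm stopped for 5123456 us
EOF
```

&lt;p&gt;Against real logs, the here-document would be replaced by piping &lt;code&gt;cat vmware*.log&lt;/code&gt; into the function.&lt;/p&gt;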



&lt;h1&gt;
  
  
  Summary
&lt;/h1&gt;

&lt;p&gt;You should monitor your system regularly during normal operations to understand stun times and how they may impact QoS timeout for HA, such as mirroring. As noted, strategies to keep stun/unstun time to a minimum include running backups when database and storage activity is low and having well-set-up storage. For constant monitoring, logs may be processed by using VMware Log Insight or other tools.&lt;/p&gt;

&lt;p&gt;In future posts, I will revisit backup and restore operations for InterSystems Data Platforms. But for now, if you have any comments or suggestions based on the workflows of your systems, please share them via the comments sections below.&lt;/p&gt;

</description>
      <category>backup</category>
      <category>beginners</category>
      <category>cache</category>
      <category>productivity</category>
    </item>
    <item>
      <title>InterSystems Data Platforms and performance – Part 9 InterSystems IRIS VMware Best Practice Guide</title>
      <dc:creator>InterSystems Developer</dc:creator>
      <pubDate>Wed, 25 Mar 2026 19:26:55 +0000</pubDate>
      <link>https://dev.to/intersystems/intersystems-data-platforms-and-performance-part-9-intersystems-iris-vmware-best-practice-guide-4ooi</link>
      <guid>https://dev.to/intersystems/intersystems-data-platforms-and-performance-part-9-intersystems-iris-vmware-best-practice-guide-4ooi</guid>
      <description>&lt;p&gt;This post provides guidelines for configuration, system sizing and capacity planning when deploying InterSystems IRIS on VMware ESXi. It is based on and replaces earlier guidance, and reflects current VMware and InterSystems recommendations.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Last update Jan 2026. These guidelines are a best effort; remember that the requirements and capabilities of VMware and IRIS can change.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I jump right in with recommendations, assuming you already have an understanding of the VMware vSphere virtualization platform. The recommendations in this guide are not specific to any particular hardware or site-specific implementation, and are not intended as a fully comprehensive guide to planning and configuring a vSphere deployment -- rather, this is a checklist of best-practice configuration choices you can make. I expect that the recommendations will be evaluated for a specific site by your expert VMware implementation team.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://community.intersystems.com/post/capacity-planning-and-performance-series-index" rel="noopener noreferrer"&gt;A list of other posts in the InterSystems Data Platforms and performance series is here.&lt;/a&gt;&lt;/p&gt;



&lt;h2&gt;
  
  
  Are InterSystems' products supported on ESXi?
&lt;/h2&gt;

&lt;p&gt;It is InterSystems policy and procedure to verify and release InterSystems’ products against processor types and operating systems, including when those operating systems are virtualised. For specifics see &lt;a href="https://docs.intersystems.com/irislatest/csp/docbook/DocBook.UI.Page.cls?KEY=ISP_technologies" rel="noopener noreferrer"&gt;InterSystems Supported Technologies&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Note: If you do not write your own applications, you must also check your application vendor's support policy.&lt;/p&gt;

&lt;h3&gt;
  
  
  Supported Hardware
&lt;/h3&gt;

&lt;p&gt;VMware virtualization works well for IRIS when used with current server and storage components. IRIS on VMware has been deployed successfully at customer sites and has been proven in benchmarks for performance and scalability. There is no significant performance impact from VMware virtualization on properly configured storage, networks, and servers with later-model Intel Xeon or AMD EPYC processors.&lt;/p&gt;

&lt;p&gt;Generally, IRIS and applications are installed and configured on the guest operating system in the same way as on a bare-metal installation of the same operating system. &lt;/p&gt;

&lt;p&gt;It is the customer's responsibility to check the &lt;a href="http://www.vmware.com/resources/compatibility/search.php" rel="noopener noreferrer"&gt;VMware compatibility guide&lt;/a&gt; for the specific servers and storage being used.&lt;/p&gt;



&lt;h1&gt;
  
  
  Virtualised architecture
&lt;/h1&gt;

&lt;p&gt;I see VMware commonly used in two standard configurations with IRIS applications:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Where primary production database operating system instances are on a ‘bare-metal’ cluster, and VMware is only used for additional production and non-production instances such as web servers, printing, test, training and so on.&lt;/li&gt;
&lt;li&gt;Where ALL operating system instances, including primary production instances are virtualized.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This post can be used as a guide for either scenario; however, the focus is on the second scenario, where all operating system instances, including production, are virtualised. The following diagram shows a typical physical server set-up for that configuration.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcommunity.intersystems.com%2Fsites%2Fdefault%2Ffiles%2Finline%2Fimages%2Fcachebestpractice2016_201.png" class="article-body-image-wrapper"&gt;&lt;img alt="" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcommunity.intersystems.com%2Fsites%2Fdefault%2Ffiles%2Finline%2Fimages%2Fcachebestpractice2016_201.png" width="514" height="556"&gt;&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;&lt;em&gt;Figure 1. Simple virtualised IRIS architecture&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;Figure 1 shows a common deployment with a minimum of three physical host servers to provide N+1 capacity and availability with host servers in a VMware HA cluster. Additional physical servers may be added to the cluster to scale resources. Additional physical servers may also be required for backup/restore media management and disaster recovery.&lt;/p&gt;




&lt;p&gt;For recommendations specific to &lt;em&gt;VMware vSAN&lt;/em&gt;, VMware's Hyper-Converged Infrastructure solution, see the following post: &lt;a href="https://community.intersystems.com/post/intersystems-data-platforms-and-performance-%E2%80%93-part-8-hyper-converged-infrastructure-capacity" rel="noopener noreferrer"&gt;Part 8 Hyper-Converged Infrastructure Capacity and Performance Planning&lt;/a&gt;. Most of the recommendations in this post can be applied to vSAN -- with the exception of some of the obvious differences in the Storage section below.&lt;/p&gt;



&lt;h1&gt;
  
  
  VMware versions
&lt;/h1&gt;

&lt;p&gt;The following table shows key recommendations for IRIS:&lt;/p&gt;




&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Item&lt;/th&gt;
&lt;th&gt;Recommendation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;ESXi:&lt;/td&gt;
&lt;td&gt;Minimum vSphere 7.x or 8.x&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;vCenter:&lt;/td&gt;
&lt;td&gt;Required (VCSA preferred)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Licensing:&lt;/td&gt;
&lt;td&gt;Enterprise Plus strongly recommended. Contact VMware.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;ul&gt;
&lt;li&gt;DRS, HA, vMotion, vDS, and storage APIs are mandatory for production IRIS.
&lt;/li&gt;
&lt;li&gt;“Free” ESXi is &lt;strong&gt;not suitable&lt;/strong&gt; for enterprise IRIS deployments.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;vSphere is a suite of products including vCenter Server that allows centralised system management of hosts and virtual machines via the vSphere client.&lt;/p&gt;

&lt;p&gt;VMware has several licensing models; ultimately choice of version is based on what best suits your current and future infrastructure planning. Contact Broadcom for the latest VMware licensing choices.&lt;/p&gt;



&lt;h1&gt;
  
  
  ESXi Host BIOS settings
&lt;/h1&gt;

&lt;p&gt;The ESXi host is the physical server. Before configuring BIOS you should:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Check with the hardware vendor that the server is running the latest BIOS&lt;/li&gt;
&lt;li&gt;Check whether there are any server/CPU model specific BIOS settings for VMware.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Default settings for server BIOS may not be optimal for VMware. The following settings can be used to optimize the physical host servers for best performance. Not all settings in the following table are available on all vendors’ servers.&lt;/p&gt;




&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Setting&lt;/th&gt;
&lt;th&gt;Required Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;All CPU cores:&lt;/td&gt;
&lt;td&gt;Enabled&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hyper-Threading:&lt;/td&gt;
&lt;td&gt;Enabled&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Turbo Boost:&lt;/td&gt;
&lt;td&gt;Enabled&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;NUMA:&lt;/td&gt;
&lt;td&gt;Enabled&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hardware Virtualization (VT-x / AMD-V):&lt;/td&gt;
&lt;td&gt;Enabled&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Power Management:&lt;/td&gt;
&lt;td&gt;OS / ESXi controlled&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Unused devices:&lt;/td&gt;
&lt;td&gt;Disabled&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;p&gt;&lt;strong&gt;AMD EPYC note (Zen 3/4/5):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Review &lt;strong&gt;NUMA Per Socket (NPS)&lt;/strong&gt; settings.
&lt;/li&gt;
&lt;li&gt;NPS=1 or NPS=2 is typically optimal for IRIS database workloads.&lt;/li&gt;
&lt;/ul&gt;



&lt;h1&gt;
  
  
  Memory
&lt;/h1&gt;

&lt;p&gt;The following key rules must be considered for memory allocation:&lt;/p&gt;




&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Rule&lt;/th&gt;
&lt;th&gt;Recommendation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;VM Memory Sizing:&lt;/td&gt;
&lt;td&gt;Size vRAM to fit within physical memory available&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Production Database VMs:&lt;/td&gt;
&lt;td&gt;Reserve 100% memory (full reservation)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Memory Overcommitment:&lt;/td&gt;
&lt;td&gt;Avoid for production; acceptable for non-production&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;NUMA Consideration:&lt;/td&gt;
&lt;td&gt;Ideally size VMs to keep memory local to NUMA node&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;VMware Tools:&lt;/td&gt;
&lt;td&gt;Must be installed for memory management features&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Large Pages:&lt;/td&gt;
&lt;td&gt;Enable for database VMs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Swap:&lt;/td&gt;
&lt;td&gt;Avoid any swapping for production database VMs&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Mandatory
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;All production IRIS database VMs MUST have 100% memory reservation.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Failure to do this causes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Shared memory swapping
&lt;/li&gt;
&lt;li&gt;Severe and unpredictable latency&lt;/li&gt;
&lt;li&gt;Database instability&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;When running multiple IRIS instances or other applications on a single physical host, VMware has several technologies for efficient memory management, such as transparent page sharing (TPS), ballooning, swap, and memory compression. For example, when multiple OS instances are running on the same host, TPS eliminates redundant copies of pages in memory, which allows memory to be overcommitted without performance degradation so that virtual machines can run with less memory than on a physical machine. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Note: VMware Tools must be installed in the operating system to take advantage of these and many other features of VMware.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Although these features exist to allow memory overcommitment, the recommendation is to always start by sizing the vRAM of all VMs to fit within the physical memory available. It is especially important in production environments to carefully consider the impact of overcommitting memory, and to overcommit only after collecting data to determine how much overcommitment is possible. To determine the effectiveness of memory sharing and the degree of acceptable overcommitment for a given IRIS instance, run the workload and use the VMware commands &lt;code&gt;resxtop&lt;/code&gt; or &lt;code&gt;esxtop&lt;/code&gt; to observe the actual savings. &lt;/p&gt;
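&lt;p&gt;As a back-of-the-envelope illustration of the "size vRAM to fit within physical memory" rule, here is a minimal sketch. The VM sizes, host memory, and hypervisor overhead allowance are hypothetical examples, not recommendations -- always verify actual behaviour with &lt;code&gt;esxtop&lt;/code&gt;:&lt;/p&gt;

```python
# Sketch: check that total vRAM, plus an allowance for hypervisor overhead,
# fits within host physical memory before considering any overcommitment.
# All figures are hypothetical examples.

def fits_in_host(vm_vram_gb, host_ram_gb, hypervisor_overhead_gb=16):
    """Return (fits, total_vram_gb) for a list of per-VM vRAM sizes."""
    total = sum(vm_vram_gb)
    return total + hypervisor_overhead_gb <= host_ram_gb, total

fits, total = fits_in_host([112, 64, 32, 24], host_ram_gb=256)
print(fits, total)  # 232 GB of vRAM + 16 GB overhead fits within 256 GB
```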

&lt;p&gt;When planning your IRIS instance memory requirements, it is worth revisiting the &lt;a href="https://community.intersystems.com/post/intersystems-data-platforms-and-performance-part-4-looking-memory" rel="noopener noreferrer"&gt;fourth post in this series on memory&lt;/a&gt;, especially the section "VMware Virtualisation considerations", where I point out:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Set VMware memory reservation on production systems.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You &lt;em&gt;must&lt;/em&gt; avoid any swapping of shared memory. &lt;strong&gt;Reserve the full memory of production database VMs (100% reservation)&lt;/strong&gt; to guarantee memory is available for your IRIS instance, so there will be no swapping or ballooning to degrade database performance.&lt;/p&gt;

&lt;p&gt;Notes: Large memory reservations will impact vMotion operations, so take this into consideration when designing the vMotion/management network. A virtual machine can only be live migrated, or restarted on another host by VMware HA, if the target host has free physical memory greater than or equal to the size of the reservation. This is especially important for production IRIS VMs; for example, pay particular attention to HA Admission Control policies.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Ensure capacity planning allows for distribution of VMs in event of HA failover. &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;For non-production environments (test, train, etc.) more aggressive memory overcommitment is possible; however, do not overcommit IRIS shared memory. Instead, limit shared memory in the IRIS instance by configuring fewer global buffers. &lt;/p&gt;

&lt;p&gt;Current processor architectures have a NUMA topology: processors have their own local memory and can also access memory attached to other processors in the same host. Not surprisingly, accessing local memory has lower latency than accessing remote memory. For a discussion of CPU, check out the &lt;a href="https://community.intersystems.com/post/intersystems-data-platforms-and-performance-%E2%80%93-part-3-focus-cpu" rel="noopener noreferrer"&gt;third post in this series&lt;/a&gt;, including a discussion about NUMA in the &lt;em&gt;comments section&lt;/em&gt;. &lt;/p&gt;

&lt;p&gt;As noted in the BIOS section above, a strategy for optimal performance is to size VMs up to, at most, the number of cores and amount of memory on a single processor. For example, if your capacity planning shows your biggest production IRIS database VM will be 14 vCPUs and 112 GB of memory, consider whether a cluster of servers with 2x 16-core processors and 256 GB of memory is a good fit.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Ideally&lt;/strong&gt; size VMs to keep memory local to a NUMA node. But don't get too hung up on this.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If you need a "Monster VM" bigger than a NUMA node, that is OK; VMware will manage NUMA for optimal performance. It is also important to right-size your VMs and not allocate more resources than are needed (see below).&lt;/p&gt;
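&lt;p&gt;The NUMA sizing rule above can be sketched as a simple check. The host shape (2 sockets, each with 16 cores and 128 GB of local memory) is a hypothetical example matching the capacity-planning scenario discussed earlier:&lt;/p&gt;

```python
# Sketch: does a planned VM fit within a single NUMA node?
# Hypothetical host: 2 sockets, each 16 cores and 128 GB local memory.

def fits_numa_node(vm_vcpus, vm_mem_gb, cores_per_socket=16, mem_per_socket_gb=128):
    """True if both vCPUs and memory fit on one socket (local NUMA node)."""
    return vm_vcpus <= cores_per_socket and vm_mem_gb <= mem_per_socket_gb

print(fits_numa_node(14, 112))  # True: fits on one socket
print(fits_numa_node(20, 112))  # False: spans NUMA nodes (a "Monster VM")
```

&lt;p&gt;A VM that fails this check is not wrong -- as noted above, VMware will still manage NUMA placement -- but it is worth knowing which of your VMs fall into that category.&lt;/p&gt;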



&lt;h2&gt;
  
  
  CPU
&lt;/h2&gt;

&lt;p&gt;The following key rules should be considered for virtual CPU allocation:&lt;/p&gt;




&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Rule&lt;/th&gt;
&lt;th&gt;Guidance&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Initial sizing:&lt;/td&gt;
&lt;td&gt;Match bare-metal core count&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;vCPU oversizing:&lt;/td&gt;
&lt;td&gt;Avoid&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hyper-Threading:&lt;/td&gt;
&lt;td&gt;Does &lt;strong&gt;not&lt;/strong&gt; double capacity&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CPU Ready:&lt;/td&gt;
&lt;td&gt;Must remain low&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Consolidation:&lt;/td&gt;
&lt;td&gt;Only after measurement&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;p&gt;&lt;strong&gt;Key rule&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;1 physical core (with HT) ≈ 1 vCPU&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Hyper-Threading typically provides ~20–30% uplift, workload-dependent.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Processor selection&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Prefer &lt;strong&gt;high-frequency CPUs&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AMD EPYC “F” series &lt;/li&gt;
&lt;li&gt;Intel Xeon Gold / Platinum high-GHz SKUs&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Avoid excessive core counts at low clock speeds for DB servers&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;




&lt;p&gt;Production IRIS systems should be sized based on benchmarks and measurements at live customer sites. For production systems, use a strategy of initially sizing the system with the same number of vCPUs as bare-metal CPU cores, then monitor as per best practice to see whether the vCPU count can be reduced. &lt;/p&gt;

&lt;h3&gt;
  
  
  Hyperthreading and capacity planning
&lt;/h3&gt;

&lt;p&gt;A good starting point for sizing &lt;strong&gt;production database&lt;/strong&gt; VMs, based on your rules for physical servers, is to calculate the physical server CPU requirements for the target processor with hyper-threading enabled and then simply make the translation:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;One physical CPU (includes hyperthreading) = One vCPU (includes hyperthreading).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A common misconception is that hyper-threading somehow doubles vCPU capacity. It does NOT, for physical servers or for logical vCPUs. Hyper-threading on a bare-metal server may give around a 30% uplift in performance over the same server without hyper-threading, but this varies with the application.&lt;/p&gt;

&lt;p&gt;For initial sizing, assume that each vCPU has full core dedication. For example, if you have a 32-core (2x 16-core) server, size for a total of up to 32 vCPUs, knowing there may be available headroom. This configuration assumes hyper-threading is enabled at the host level; VMware will manage the scheduling between all the applications and VMs on the host. Once you have spent time monitoring the application, operating system, and VMware performance during peak processing times, you can decide whether higher consolidation is possible.&lt;/p&gt;
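&lt;p&gt;The translation rule is deliberately trivial, which makes it easy to bake into a capacity-planning spreadsheet or script. A minimal sketch (the host shapes are hypothetical examples):&lt;/p&gt;

```python
# Sketch of the starting rule: one physical core (with hyper-threading
# enabled) = one vCPU. The ~30% HT uplift is treated as headroom, never
# as additional vCPU capacity.

def initial_vcpu_budget(sockets, cores_per_socket):
    """Initial production vCPU capacity: one vCPU per physical core."""
    return sockets * cores_per_socket

budget = initial_vcpu_budget(sockets=2, cores_per_socket=16)
print(budget)  # 32 vCPUs for a 2x 16-core host
```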

&lt;h3&gt;
  
  
  Licensing
&lt;/h3&gt;

&lt;p&gt;In vSphere you can configure a VM with a certain number of sockets or cores. For example, if you have a dual-processor VM (2 vCPUs), it can be configured as two CPU sockets, or as a single socket with two CPU cores. From an execution standpoint it does not make much of a difference because the hypervisor will ultimately decide whether the VM executes on one or two physical sockets. However, specifying that the dual-CPU VM really has two cores instead of two sockets could make a difference for software licenses.&lt;/p&gt;



&lt;h1&gt;
  
  
  Storage
&lt;/h1&gt;

&lt;blockquote&gt;
&lt;p&gt;This section applies to the more traditional storage model using a shared storage array. For &lt;em&gt;vSAN&lt;/em&gt; recommendations also see the following post: &lt;a href="https://community.intersystems.com/post/intersystems-data-platforms-and-performance-%E2%80%93-part-8-hyper-converged-infrastructure-capacity" rel="noopener noreferrer"&gt;Part 8 Hyper-Converged Infrastructure Capacity and Performance Planning&lt;/a&gt;&lt;/p&gt;


&lt;/blockquote&gt;

&lt;p&gt;The following key rules should be considered for storage:&lt;/p&gt;




&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Area&lt;/th&gt;
&lt;th&gt;Requirement&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Sizing metric:&lt;/td&gt;
&lt;td&gt;IOPS and latency, not GB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Production disks:&lt;/td&gt;
&lt;td&gt;Thick-provisioned, eager-zeroed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Disk controllers:&lt;/td&gt;
&lt;td&gt;Multiple &lt;strong&gt;PVSCSI&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;I/O separation:&lt;/td&gt;
&lt;td&gt;DB data vs journals vs backups&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;VMFS vs RDM:&lt;/td&gt;
&lt;td&gt;VMFS preferred&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;VAAI:&lt;/td&gt;
&lt;td&gt;Required where supported&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Best practice&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Separate physical disk groups (or tiers) for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Random DB I/O&lt;/li&gt;
&lt;li&gt;Sequential journal / backup I/O&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Datastore separation alone is insufficient without physical isolation.    &lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;NVMe is strongly recommended for IRIS journal performance and in general.&lt;/p&gt;&lt;/li&gt;

&lt;li&gt;&lt;p&gt;Avoid thin-on-thin provisioning (array + VM).&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;




&lt;h2&gt;
  
  
  Size storage for performance
&lt;/h2&gt;

&lt;p&gt;Storage bottlenecks are one of the most common problems affecting IRIS system performance, and the same is true for VMware vSphere configurations. The most common mistake is sizing storage simply for GB capacity rather than allocating a high enough number of IOPS. Storage problems can be even more severe with VMware because more hosts can be accessing the same storage over the same physical connections.&lt;/p&gt;

&lt;h2&gt;
  
  
  VMware Storage overview
&lt;/h2&gt;

&lt;p&gt;VMware storage virtualization can be categorized into three layers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The storage array is the bottom layer, consisting of physical storage presented as logical disks (storage array volumes or LUNs) to the layer above.&lt;/li&gt;
&lt;li&gt;The next layer is the virtual environment occupied by vSphere. Storage array LUNs are presented to ESXi hosts as datastores and are formatted as VMFS volumes.&lt;/li&gt;
&lt;li&gt;Virtual machines are made up of files in the datastore and include virtual disks, which are presented to the guest operating system as disks that can be partitioned and used in file systems.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;VMware offers two choices for managing disk access in a virtual machine -- VMware Virtual Machine File System (VMFS) and raw device mapping (RDM) -- and both offer similar performance. For simple management, VMware generally recommends VMFS, but there may be situations where RDMs are required. As a general recommendation, unless there is a particular reason to use RDM, choose VMFS; &lt;em&gt;new development by VMware is directed at VMFS, not RDM.&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Virtual Machine File System (VMFS)
&lt;/h3&gt;

&lt;p&gt;VMFS is a file system developed by VMware that is dedicated and optimized for clustered virtual environments (allows read/write access from several hosts) and the storage of large files. The structure of VMFS makes it possible to store VM files in a single folder, simplifying VM administration. VMFS also enables VMware infrastructure services such as vMotion, DRS and VMware HA.&lt;/p&gt;

&lt;p&gt;Operating systems, applications, and data are stored in virtual disk files (.vmdk files), which are stored in the datastore. A single VM can be made up of multiple vmdk files spread over several datastores, as the production VM in the diagram below shows. For production systems, best performance is achieved with one vmdk file per LUN; for non-production systems (test, training, etc.) multiple VMs' vmdk files can share a datastore and a LUN. &lt;/p&gt;

&lt;p&gt;When deploying IRIS, multiple VMFS volumes mapped to LUNs on separate disk groups are typically used to separate I/O patterns and improve performance -- for example, random versus sequential I/O disk groups, or production I/O separated from that of other environments. &lt;/p&gt;

&lt;p&gt;The following diagram shows an overview of an example VMware VMFS storage used with IRIS:&lt;/p&gt;


 &lt;br&gt;
&lt;img alt="" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcommunity.intersystems.com%2Fsites%2Fdefault%2Ffiles%2Finline%2Fimages%2Fcachebestpractice2016_206.png" width="567" height="424"&gt;

&lt;p&gt;&lt;em&gt;Figure 2. Example IRIS storage on VMFS&lt;/em&gt;&lt;/p&gt;



&lt;h3&gt;
  
  
  RDM
&lt;/h3&gt;

&lt;p&gt;RDM allows management and access of raw SCSI disks or LUNs as VMFS files. An RDM is a special file on a VMFS volume that acts as a proxy for a raw device. VMFS is recommended for most virtual disk storage, but raw disks might be desirable in some cases. RDM is only available for Fibre Channel or iSCSI storage. &lt;/p&gt;

&lt;h3&gt;
  
  
  VMware vStorage APIs for Array Integration (VAAI)
&lt;/h3&gt;

&lt;p&gt;For the best storage performance, customers should consider using VAAI-capable storage hardware. VAAI can improve the performance in several areas including virtual machine provisioning and of thin-provisioned virtual disks. VAAI may be available as a firmware update from the array vendor for older arrays.&lt;/p&gt;

&lt;h3&gt;
  
  
  Virtual Disk Types
&lt;/h3&gt;

&lt;p&gt;ESXi supports multiple virtual disk types:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Thick Provisioned&lt;/strong&gt; – where space is allocated at creation. There are further types:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Eager Zeroed – writes 0’s to the entire drive. This increases the time it takes to create the disk, but results in the best performance, even on the first write to each block.&lt;/li&gt;
&lt;li&gt;Lazy Zeroed – writes 0’s as each block is first written to. Lazy zero results in a shorter creation time, but reduced performance the first time a block is written to. Subsequent writes, however, have the same performance as on eager-zeroed thick disks.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Thin Provisioned&lt;/strong&gt; – where space is allocated and zeroed upon write. There is a higher I/O cost (similar to that of lazy-zeroed thick disks) during the first write to an unwritten file block, but on subsequent writes thin-provisioned disks have the same performance as eager-zeroed thick disks.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;For all disk types, VAAI can improve performance by offloading operations to the storage array.&lt;/em&gt; Some arrays also support thin provisioning at the array level; do not thin provision ESXi disks on thin-provisioned array storage, as there can be conflicts in provisioning and management. &lt;/p&gt;

&lt;h3&gt;
  
  
  Other Notes
&lt;/h3&gt;

&lt;p&gt;As noted above, best practice is to use the same strategies as bare-metal configurations; production storage may be separated at the array level into several disk groups:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Random access for IRIS production databases&lt;/li&gt;
&lt;li&gt;Sequential access for backups and journals, but also a place for other non-production storage such as test, train, and so on&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Remember that a datastore is an abstraction of the storage tier and, therefore, it is a logical representation not a physical representation of the storage. Creating a dedicated datastore to isolate a particular I/O workload (whether journal or database files), without isolating the physical storage layer as well, does not have the desired effect on performance.&lt;/p&gt;

&lt;p&gt;Although performance is key, the choice of shared storage depends more on the existing or planned infrastructure at the site than on the impact of VMware. As with bare-metal implementations, FC SAN is the best performing and is recommended; for FC, 8 Gbps adapters are the recommended minimum. iSCSI storage is only supported if appropriate network infrastructure is in place, including a minimum of 10Gb Ethernet, with jumbo frames (MTU 9000) supported on all components in the network between server and storage, and with separation from other traffic.&lt;/p&gt;

&lt;p&gt;Use multiple VMware Paravirtual SCSI (PVSCSI) controllers for the database virtual machines or virtual machines with high I/O load. PVSCSI can provide some significant benefits by increasing overall storage throughput while reducing CPU utilization.  The use of multiple PVSCSI controllers allows the execution of several parallel I/O operations inside the guest operating system. It is also recommended to separate journal I/O traffic from the database I/O traffic through separate virtual SCSI controllers. As a best practice, you can use one controller for the operating system and swap, another controller for journals, and one or more additional controllers for database data files (depending on the number and size of the database data files). &lt;/p&gt;

&lt;p&gt;Aligning file system partitions is a well-known storage best practice for database workloads. Partition alignment on both physical machines and VMware VMFS partitions prevents performance I/O degradation caused by I/O crossing track boundaries. VMware test results show that aligning VMFS partitions to 64KB track boundaries results in reduced latency and increased throughput. VMFS partitions created using vCenter are aligned on 64KB boundaries as recommended by storage and operating system vendors.&lt;/p&gt;
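&lt;p&gt;The alignment check itself is straightforward. A minimal sketch (the example offsets are hypothetical; vCenter-created VMFS partitions are already aligned as described above):&lt;/p&gt;

```python
# Sketch: verify a partition's starting offset is aligned to a 64 KB
# boundary, as recommended for VMFS partitions. Offsets are examples.

KB = 1024
ALIGNMENT = 64 * KB

def is_aligned(start_offset_bytes, alignment=ALIGNMENT):
    """True if the partition start falls on an alignment boundary."""
    return start_offset_bytes % alignment == 0

print(is_aligned(1048576))  # True: a 1 MiB offset is 64 KB aligned
print(is_aligned(32256))    # False: legacy 63-sector offset (512-byte sectors)
```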



&lt;h1&gt;
  
  
  Networking
&lt;/h1&gt;

&lt;p&gt;The following key rules should be considered for networking:&lt;/p&gt;




&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Requirement&lt;/th&gt;
&lt;th&gt;Guidance&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Adapter:&lt;/td&gt;
&lt;td&gt;VMXNET3 only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;VMware Tools:&lt;/td&gt;
&lt;td&gt;Mandatory&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Traffic separation:&lt;/td&gt;
&lt;td&gt;Mgmt / vMotion / Storage / App&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Switch type:&lt;/td&gt;
&lt;td&gt;Distributed vSwitch&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bandwidth:&lt;/td&gt;
&lt;td&gt;≥10 Gb minimum&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;ul&gt;
&lt;li&gt;For large-memory IRIS VMs, consider &lt;strong&gt;25 Gb+&lt;/strong&gt; for vMotion networks.
&lt;/li&gt;
&lt;li&gt;Intra-host VM traffic is significantly faster; use DRS affinity rules carefully.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;As noted above, VMXNET3 adapters have better capabilities than the default E1000 adapter. VMXNET3 supports 10Gb and uses less CPU, whereas the E1000 is only 1Gb. If there are only 1-gigabit network connections between hosts, there is not a lot of difference for client-to-VM communication. However, VMXNET3 allows 10Gb between VMs on the same host, which does make a difference, especially in multi-tier deployments or where there are high network I/O requirements between instances. Take this into consideration when planning DRS affinity and anti-affinity rules to keep VMs on the same or separate virtual switches.&lt;/p&gt;

&lt;p&gt;The E1000 uses universal drivers that work in Windows or Linux. Once VMware Tools is installed on the guest operating system, VMXNET virtual adapters can be installed.&lt;/p&gt;

&lt;p&gt;The following diagram shows a typical small server configuration with four physical NIC ports: two ports configured within VMware for infrastructure traffic (dvSwitch0 for Management and vMotion) and two ports for application use by VMs. NIC teaming and load balancing are used for best throughput and HA.&lt;/p&gt;


 &lt;br&gt;
&lt;img alt="" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcommunity.intersystems.com%2Fsites%2Fdefault%2Ffiles%2Finline%2Fimages%2Fcachebestpractice2016_207_1.png" width="497" height="359"&gt;

&lt;p&gt;&lt;em&gt;Figure 3. A typical small server configuration with four physical NIC ports.&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;h1&gt;
  
  
  Guest Operating Systems
&lt;/h1&gt;

&lt;p&gt;The following are recommended:&lt;/p&gt;




&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Item&lt;/th&gt;
&lt;th&gt;Recommendation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;OS:&lt;/td&gt;
&lt;td&gt;RHEL 8 / 9 (or equivalent)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Architecture:&lt;/td&gt;
&lt;td&gt;64-bit only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;VMware Tools:&lt;/td&gt;
&lt;td&gt;Installed and current&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Time sync:&lt;/td&gt;
&lt;td&gt;NTP (not VMware Tools)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OS tuning:&lt;/td&gt;
&lt;td&gt;Same as bare-metal IRIS&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;blockquote&gt;
&lt;p&gt;It is very important to load VMware tools in to all VM operating systems and keep the tools current. &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;VMware Tools is a suite of utilities that enhances the performance of the virtual machine's guest operating system and improves management of the virtual machine. Without VMware Tools installed in your guest operating system, you lose important functionality.&lt;/p&gt;

&lt;p&gt;It's vital that the time is set correctly on all ESXi hosts, because it affects the guest VMs. The default setting is for guests not to sync their time with the host, but under certain conditions the guest still does sync with the host, and an incorrect host time has been known to cause major issues. VMware recommends using NTP instead of VMware Tools periodic time synchronization; NTP is an industry standard and ensures accurate timekeeping in your guest. It may be necessary to open the firewall (UDP 123) to allow NTP traffic.&lt;/p&gt;



&lt;h1&gt;
  
  
  DNS Configuration
&lt;/h1&gt;

&lt;p&gt;If your DNS server is hosted on virtualized infrastructure and becomes unavailable, vCenter cannot resolve host names, making the virtual environment unmanageable; the virtual machines themselves, however, keep operating without problems.&lt;/p&gt;




&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Rule&lt;/th&gt;
&lt;th&gt;Requirement&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;DNS availability:&lt;/td&gt;
&lt;td&gt;Mandatory for vCenter&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DNS redundancy:&lt;/td&gt;
&lt;td&gt;Required&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Failure testing:&lt;/td&gt;
&lt;td&gt;Required&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;blockquote&gt;
&lt;p&gt;Virtual machines continue running without DNS, but &lt;strong&gt;management does not&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Best practice&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ensure at least one DNS resolver exists &lt;strong&gt;outside&lt;/strong&gt; the vSphere failure domain.&lt;/li&gt;
&lt;/ul&gt;
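&lt;p&gt;For example, a resolver configuration might pair one DNS server inside the cluster with one outside the failure domain (both addresses below are placeholders):&lt;/p&gt;

```conf
# /etc/resolv.conf - example only; both addresses are placeholders
nameserver 10.0.0.53      # virtualized DNS inside the vSphere cluster
nameserver 192.168.100.53 # physical DNS outside the failure domain
```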



&lt;h1&gt;
  
  
  High Availability
&lt;/h1&gt;

&lt;p&gt;High availability is provided by features such as VMware vMotion, VMware Distributed Resource Scheduler (DRS) and VMware High Availability (HA). IRIS Database mirroring can also be used to increase uptime.&lt;/p&gt;

&lt;p&gt;It is important that IRIS production systems are designed with N+1 physical hosts: there must be enough resources (e.g. CPU and memory) for all the VMs to run on the remaining hosts in the event of a single host failure. If VMware cannot allocate enough CPU and memory resources on the remaining servers after a server failure, VMware HA will not restart the VMs there.&lt;/p&gt;
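&lt;p&gt;As a back-of-envelope sketch (all host and VM figures below are invented, not sizing guidance), the N+1 check simply asks whether the VMs fit on one fewer host:&lt;/p&gt;

```python
# Hypothetical N+1 sizing check; every number here is an example, not guidance.
def can_survive_host_failure(hosts, host_cpu_ghz, host_mem_gb,
                             total_vm_cpu_ghz, total_vm_mem_gb):
    """Return True if all VMs fit on hosts - 1 servers (one host failed)."""
    surviving = hosts - 1
    return (surviving * host_cpu_ghz >= total_vm_cpu_ghz
            and surviving * host_mem_gb >= total_vm_mem_gb)

# Four hosts of 48 GHz / 512 GB each; the VMs need 120 GHz / 1200 GB in total.
print(can_survive_host_failure(4, 48, 512, 120, 1200))  # prints True
```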

&lt;h2&gt;
  
  
  vMotion
&lt;/h2&gt;

&lt;p&gt;vMotion can be used with IRIS. vMotion allows migration of a functioning VM from one ESXi host server to another in a fully transparent manner. The OS and applications such as IRIS running in the VM have no service interruption. &lt;/p&gt;

&lt;p&gt;When migrating using vMotion, only the state and memory of the VM, along with its configuration, move. The virtual disk does not need to move; it stays in the same shared-storage location. Once the VM has migrated, it is operating on the new physical host. &lt;/p&gt;

&lt;p&gt;vMotion can function only with a shared-storage architecture (such as a shared SAS array, FC SAN or iSCSI). As IRIS is usually configured to use a large amount of shared memory, it is important to have adequate network capacity available to vMotion: a 1 Gb network may be sufficient, but higher bandwidth may be required, or multi-NIC vMotion can be configured.&lt;/p&gt;
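&lt;p&gt;To get a rough feel for why bandwidth matters (the figures below are illustrative only, ignoring page re-dirtying and protocol overhead), the time to copy a VM's memory is roughly its size divided by effective throughput:&lt;/p&gt;

```python
# Back-of-envelope vMotion memory-copy time; ignores page re-dirtying and overhead.
def copy_time_seconds(memory_gb, link_gbps, efficiency=0.8):
    """Approximate seconds to move memory_gb of RAM over a link_gbps link."""
    bits = memory_gb * 8 * 1e9              # memory size in bits
    return bits / (link_gbps * 1e9 * efficiency)

for link_gbps in (1, 10):
    t = copy_time_seconds(64, link_gbps)    # a 64 GB shared-memory VM
    print(f"{link_gbps} Gb network: ~{t:.0f} s")
```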

&lt;h2&gt;
  
  
  DRS
&lt;/h2&gt;

&lt;p&gt;Distributed Resource Scheduler (DRS) is a method of automating the use of vMotion in a production environment by sharing the workload among the different host servers in a cluster.&lt;br&gt;
DRS also provides the ability to implement QoS for VM instances, protecting resources for production VMs by stopping non-production VMs from overusing resources. DRS collects information about the use of the cluster’s host servers and optimizes resources by distributing the VMs’ workload among the cluster’s different servers. This migration can be performed automatically or manually.&lt;/p&gt;

&lt;h2&gt;
  
  
  IRIS Database Mirror
&lt;/h2&gt;

&lt;p&gt;For mission-critical tier-1 IRIS database application instances requiring the highest availability, consider also using &lt;a href="http://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KEY=GHA_mirror#GHA_mirror_set_bp_vm" rel="noopener noreferrer"&gt;InterSystems synchronous database mirroring.&lt;/a&gt; Additional advantages of mirroring include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Separate copies of up-to-date data.&lt;/li&gt;
&lt;li&gt;Failover in seconds (much faster than restarting a VM, then the operating system, and then recovering IRIS).&lt;/li&gt;
&lt;li&gt;Failover in case of application/IRIS failure (not detected by VMware).&lt;/li&gt;
&lt;/ul&gt;



&lt;h1&gt;
  
  
  vCenter Appliance
&lt;/h1&gt;

&lt;p&gt;The vCenter Server Appliance is a preconfigured Linux-based virtual machine optimized for running vCenter Server and associated services. I have been recommending that sites with small clusters use the VMware vCenter Server Appliance as an alternative to installing vCenter Server on a Windows VM. In vSphere 6.5 the appliance is recommended for all deployments. &lt;/p&gt;



&lt;h1&gt;
  
  
  Summary
&lt;/h1&gt;

&lt;p&gt;This post is a rundown of key best practices you should consider when deploying IRIS on VMware. Most of these best practices are not unique to IRIS but can be applied to other tier-1 business critical deployments on VMware.&lt;/p&gt;

&lt;p&gt;If you have any questions please let me know via the comments below.&lt;/p&gt;

</description>
      <category>redis</category>
      <category>beginners</category>
      <category>performance</category>
      <category>programming</category>
    </item>
    <item>
      <title>Building a Medical History Chatbot - FHIR, Vector Search and RAG for beginners</title>
      <dc:creator>InterSystems Developer</dc:creator>
      <pubDate>Tue, 17 Mar 2026 17:24:11 +0000</pubDate>
      <link>https://dev.to/intersystems/building-a-medical-history-chatbot-fhir-vector-search-and-rag-for-beginners-27b8</link>
      <guid>https://dev.to/intersystems/building-a-medical-history-chatbot-fhir-vector-search-and-rag-for-beginners-27b8</guid>
      <description>&lt;h3&gt;
  
  
  Introduction
&lt;/h3&gt;

&lt;p&gt;Earlier this year, I set about creating a kit to introduce young techy folk at a Health Tech hackathon to using InterSystems IRIS for Health, particularly focusing on using FHIR and vector search.&lt;/p&gt;

&lt;p&gt;I wanted to publish this to the developer community because the tutorials included in the kit make a great introduction to using FHIR and to building a basic RAG system in IRIS. It's an all-inclusive set of tutorials showing in detail how to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Connect to IRIS with Python &lt;/li&gt;
&lt;li&gt;Use the InterSystems FHIR Server &lt;/li&gt;
&lt;li&gt;Convert FHIR data into relational data with the &lt;strong&gt;FHIR-SQL builder&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Use InterSystems &lt;strong&gt;Vector Search&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;As a bonus, use &lt;strong&gt;Ollama&lt;/strong&gt; to prompt local AI models &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This repo contains a full series of Jupyter Notebook tutorials for developing a medical history chatbot, as well as various other tutorials on using a FHIR server, so forgive me if this article is slightly light on technical detail; there's plenty of information in the linked Open Exchange package!&lt;/p&gt;

&lt;h3&gt;
  
  
  Designing the Demo
&lt;/h3&gt;

&lt;p&gt;The design brief I was given was to build a hackathon kit (which I defined as a fully worked-through, easy-to-follow demo app) that used FHIR data and AI. &lt;/p&gt;

&lt;p&gt;The first question with this kind of project is where the data is coming from. I needed &lt;strong&gt;FHIR Data&lt;/strong&gt; with some sort of &lt;strong&gt;plain text&lt;/strong&gt; which could be vectorized for Vector Search. Here I had two problems: &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Real patient data isn't easy to come by. 
    - &lt;strong&gt;Solution&lt;/strong&gt; - use synthetically generated patient data with Synthea&lt;/li&gt;
&lt;li&gt;Plain-text resources are generally clinical notes in DocumentReference FHIR resources.
    - &lt;strong&gt;Solution&lt;/strong&gt; - use GenAI to write my own clinical notes and load them into FHIR resource bundles&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Coming up with a source of plain-text clinical data suitable for vectorization was my first major stumbling point, as I struggled to find anything worthwhile. The inspiration for using clinical notes to create a patient chatbot did not appear from nowhere: I saw a similar demonstration by &lt;a class="mentioned-user" href="https://dev.to/simon"&gt;@simon&lt;/a&gt;.Sha in the 2025 Demo Games. This was a great demo, so I wanted to create something similar to use for a fully guided tutorial!&lt;/p&gt;

&lt;h3&gt;
  
  
  Simplifying FHIR server set-up
&lt;/h3&gt;

&lt;p&gt;The first step of the tutorial was running an instance of IRIS for Health with a FHIR server, ideally with data pre-loaded. For this, I decided to use an Open Exchange template. If you are unsure where to start on a project, the Open Exchange is often a great place to look! &lt;/p&gt;

&lt;p&gt;I found two FHIR templates, &lt;a href="https://openexchange.intersystems.com/package/iris-fhir-template" rel="noopener noreferrer"&gt;iris-fhir-template&lt;/a&gt; by &lt;a class="mentioned-user" href="https://dev.to/evgeny"&gt;@evgeny&lt;/a&gt;.Shvarov, and  &lt;a href="https://github.com/pjamiesointersystems/Dockerfhir" rel="noopener noreferrer"&gt;Dockerfhir&lt;/a&gt; by &lt;a class="mentioned-user" href="https://dev.to/patrick"&gt;@patrick&lt;/a&gt;.Jamieson3621. Both of these templates are excellent, and in my final version of the hackathon kit I ended up using a combination of them. If I were starting over, I would recommend the &lt;a href="https://openexchange.intersystems.com/package/iris-fhir-template" rel="noopener noreferrer"&gt;iris-fhir-template&lt;/a&gt; because it has a built-in user interface and Swagger UI for testing the FHIR endpoints. Trying to combine the two at a later date became a nightmare because the iris-fhir-template has the FHIR server endpoint hardcoded. &lt;/p&gt;

&lt;p&gt;On the bright side, the day I spent building and rebuilding Docker containers made me much more confident in how a Dockerfile, module.xml and iris.script setup works. If you haven't already, I recommend breaking one of the many dev templates available on the Open Exchange and learning how to rebuild or fix it. It's really useful to understand how these work when creating your own projects.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Vector Search
&lt;/h3&gt;

&lt;p&gt;In my eyes, the remarkable thing about vector search is how easy it is to set up and perform, particularly in IRIS. Sure, there's refinement that can be done later, like using a hybrid vector/keyword search or adding some sort of re-ranking system, but the basic steps of: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Importing a model&lt;/li&gt;
&lt;li&gt;Creating Vectors from plain text&lt;/li&gt;
&lt;li&gt;Inserting vectors into a table in IRIS&lt;/li&gt;
&lt;li&gt;Converting a query to a vector&lt;/li&gt;
&lt;li&gt;Querying the database with the query vector&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;can all be performed in ~50 lines of Python code. &lt;/p&gt;

&lt;p&gt;This makes it a great place for newcomers to IRIS to start developing, which is why it was chosen for this hackathon kit. &lt;/p&gt;
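&lt;p&gt;Those steps can be sketched end to end with a toy bag-of-words "embedding" over a fixed vocabulary, with an in-memory list standing in for the IRIS table (a real setup would use an embedding model and IRIS vector SQL instead; all the documents below are invented):&lt;/p&gt;

```python
import math

# Toy stand-in for an embedding model: bag-of-words over a tiny fixed vocabulary.
VOCAB = ["chest", "pain", "dental", "cleaning", "x-ray", "fracture"]

def embed(text):
    words = text.lower().split()
    vec = [float(words.count(w)) for w in VOCAB]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]          # normalized "embedding"

def cosine(a, b):
    return sum(x * y for x, y in zip(a, b))

docs = ["patient reports chest pain",
        "routine dental cleaning",
        "chest x-ray shows no fracture"]
table = [(doc, embed(doc)) for doc in docs]           # "insert" vectors into a table

query_vec = embed("pain in the chest")                # convert the query to a vector
best = max(table, key=lambda row: cosine(query_vec, row[1]))  # query the "database"
print(best[0])  # prints: patient reports chest pain
```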

&lt;h3&gt;
  
  
  Prompting with Ollama
&lt;/h3&gt;

&lt;p&gt;I've always liked the idea of prompting local models, knowing that it will always be free, doesn't need any API key set-up, and doesn't involve sending your data elsewhere. This last point can be particularly important with medical records, where it's important to keep data private and restrict third-party access. In the past, I used models with Hugging Face's Transformers module, and the results were incredibly slow and incredibly poor. &lt;/p&gt;

&lt;p&gt;For this project I tried Ollama, which was a great improvement on Hugging Face. Models that 'weigh' less than a gigabyte, like gemma-1b, give surprisingly coherent, and even accurate, responses. The speed of response (at least on my computer) can be quite slow, particularly for large context windows, but if you are patient (or like taking constant tea-breaks while waiting for a model response), they perform quite well! &lt;/p&gt;

&lt;p&gt;I enjoyed putting together the Ollama prompting section, even if at a real hackathon, all the competitors just did the sensible thing and used the OpenAI API...&lt;/p&gt;

&lt;h3&gt;
  
  
  Real-life use
&lt;/h3&gt;

&lt;p&gt;We shared this tutorial with teams at the Hackjak Brno Healthcare hackathon in November 2025 and received good feedback: 11 (out of 25) teams used aspects of the kit in their final solutions. &lt;/p&gt;

&lt;p&gt;The solutions built by hackathon teams were impressive and inspirational, with use cases ranging from using IRIS vector search in a RAG pipeline, to creating tools to fill out medical forms which connect directly to a FHIR server back-end. One of the teams (VIPIK) even uploaded their solution to &lt;a href="https://openexchange.intersystems.com/package/VIPIK" rel="noopener noreferrer"&gt;Open Exchange&lt;/a&gt;, which was really nice to see. &lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusions
&lt;/h3&gt;

&lt;p&gt;This demo was really fun to build and I'm really glad it proved useful at the hackathon in the Czech Republic. I hope it will be used more in the future, as it's a nice entry point to using FHIR data with IRIS, Python and Vector Search!&lt;/p&gt;

&lt;p&gt;Thanks for reading, and check out the full tutorial on Open Exchange! &lt;/p&gt;

&lt;h3&gt;
  
  
  Acknowledgements
&lt;/h3&gt;

&lt;p&gt;Thanks to @Ruby.Howard, @tomd , &lt;a class="mentioned-user" href="https://dev.to/daniel"&gt;@daniel&lt;/a&gt;.Kutac and &lt;a class="mentioned-user" href="https://dev.to/ondrej"&gt;@ondrej&lt;/a&gt;.Hoferek for working through the tutorial and providing feedback and &lt;a class="mentioned-user" href="https://dev.to/simon"&gt;@simon&lt;/a&gt;.Sha for the original inspiration with your entry to the Demo Games last year.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>beginners</category>
      <category>python</category>
      <category>ai</category>
    </item>
    <item>
      <title>OMOP Odyssey - Vanna AI ( The Underworld )</title>
      <dc:creator>InterSystems Developer</dc:creator>
      <pubDate>Tue, 17 Mar 2026 17:17:50 +0000</pubDate>
      <link>https://dev.to/intersystems/omop-odyssey-vanna-ai-the-underworld--4okh</link>
      <guid>https://dev.to/intersystems/omop-odyssey-vanna-ai-the-underworld--4okh</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu86ayuwlgqwahyzdoxrv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu86ayuwlgqwahyzdoxrv.png" alt=" " width="800" height="206"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Vanna.AI - Personalized AI InterSystems OMOP Agent&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu7v2ixddkr99hv3gi6gw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu7v2ixddkr99hv3gi6gw.png" alt=" " width="800" height="224"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Along this &lt;a href="https://community.intersystems.com/smartsearch?search=OMOP+Odyssey" rel="noopener noreferrer"&gt;OMOP Journey,&lt;/a&gt; from the OHDSI book to Achilles, you begin to understand the power of the OMOP Common Data Model when you see the mix of well-written R and SQL deriving results for large-scale analytics that are shareable across organizations.  I, however, do not have a third-normal-form brain, and about a month ago on the Journey &lt;a href="https://community.intersystems.com/post/omop-odyssey-no-code-cdm-exploration-databricks-aibi-genie-island-aeolus" rel="noopener noreferrer"&gt;we employed Databricks Genie&lt;/a&gt; to generate SQL for us using InterSystems OMOP and Python interoperability.  This was fantastic, but it left some magic under the hood in Databricks: how the RAG "model" was constructed and which LLM was used to pull it off. &lt;/p&gt;

&lt;p&gt;&lt;em&gt;At this point in the OMOP Journey we met Vanna.ai on the same beaten path...&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Vanna is a Python package that uses retrieval augmentation to help you generate accurate SQL queries for your database using LLMs. Vanna works in two easy steps - train a RAG “model” on your data, and then ask questions which will return SQL queries that can be set up to automatically run on your database. &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5ql5zu4r5e1zzzjzpzyr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5ql5zu4r5e1zzzjzpzyr.png" alt=" " width="501" height="157"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Vanna exposes all the pieces to do it ourselves with more control and our own stack against the OMOP Common Data Model.&lt;/p&gt;

&lt;p&gt;I found the approach from the Vanna camp particularly fantastic; conceptually it felt like a magic trick was being performed, and one could certainly argue that was exactly what was happening.&lt;/p&gt;

&lt;p&gt;Vanna needs three choices to pull off its magic trick: a SQL database, a vector database, and an LLM.  Just envision a dealer dealing out three piles and making you choose from each one.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fegzmhg7r25yktpsfzkxz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fegzmhg7r25yktpsfzkxz.png" alt=" " width="753" height="270"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So, if it's not obvious: our SQL database is InterSystems OMOP implementing the Common Data Model, our LLM of choice is Gemini, and for a quick-and-dirty evaluation we are using Chroma DB as the vector store to get to the point quickly in Python.&lt;/p&gt;

&lt;h2&gt;Gemini&lt;/h2&gt;

&lt;p&gt;So I cut a quick key and, growing up a little bit, actually paid for it. I tried the free route with its rate limits of 50 prompts a day and 1 per minute, and it was unsettling... I may be happier being completely broke anyway, so we will see.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4r0v0ines7osb94f0rn1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4r0v0ines7osb94f0rn1.png" alt=" " width="800" height="232"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;InterSystems OMOP&lt;/h2&gt;

&lt;p&gt;I am using the same fading trial as in the &lt;a href="https://community.intersystems.com/smartsearch?search=OMOP+Journey" rel="noopener noreferrer"&gt;other posts&lt;/a&gt;.  The CDM is loaded with a population of about 100 patients per United States region, with the practitioners and organizations to boot.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr1xadg2xhnej4ehfaooc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr1xadg2xhnej4ehfaooc.png" alt=" " width="800" height="252"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Vanna&lt;/h2&gt;

&lt;p&gt;Let's turn the letters (get it?) notebook style and spin the wheel (get it again?) and put Vanna to work...&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;pip3&lt;/span&gt; &lt;span class="n"&gt;install&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;vanna[chromadb,gemini,sqlalchemy-iris]&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's organize our Python imports.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;vanna.chromadb&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ChromaDB_VectorStore&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;vanna.google&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;GoogleGeminiChat&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sqlalchemy&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;create_engine&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ssl&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;sqlalchemy&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;create_engine&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Initialize the star of our show and introduce her to our model.  Kind of weird, right? Vanna (White) is a model.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyVanna&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ChromaDB_VectorStore&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;GoogleGeminiChat&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;ChromaDB_VectorStore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;GoogleGeminiChat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;api_key&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;shaazButt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;model&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-2.0-flash&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="n"&gt;vn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MyVanna&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let's connect to our InterSystems OMOP Cloud deployment using &lt;a href="https://github.com/caretdev/sqlalchemy-iris" rel="noopener noreferrer"&gt;sqlalchemy-iris&lt;/a&gt; from @caretdev.  The work done with this dialect is quickly becoming a key ingredient for modern data interoperability of IRIS products in the data world.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;engine&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_engine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;iris://SQLAdmin:LordFauntleroy!!!@k8s-0a6bc2ca-adb040ad-c7bf2ee7c6-e6b05ee242f76bf2.elb.us-east-1.amazonaws.com:443/USER&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;connect_args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sslcontext&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ssl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;SSLContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ssl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PROTOCOL_TLS_CLIENT&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;verify_mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ssl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CERT_OPTIONAL&lt;/span&gt;
&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;check_hostname&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load_verify_locations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vanna-omop.pem&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;engine&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You define a function that takes a SQL query as a string and returns a pandas DataFrame.  This gives Vanna a function that it can use to run SQL on the OMOP Common Data Model.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run_sql&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sql&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_sql_query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sql&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;df&lt;/span&gt;

&lt;span class="n"&gt;vn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run_sql&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;run_sql&lt;/span&gt;
&lt;span class="n"&gt;vn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run_sql_is_set&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
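&lt;p&gt;The same callback pattern can be tried standalone against an in-memory SQLite database (a stand-in here purely to illustrate the shape of the function Vanna expects; the table and rows below are invented):&lt;/p&gt;

```python
import sqlite3
import pandas as pd

# Stand-in for the IRIS connection: an in-memory SQLite DB with one toy table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (person_id INTEGER, year_of_birth INTEGER)")
conn.executemany("INSERT INTO person VALUES (?, ?)", [(1, 1980), (2, 1995)])

def run_sql(sql: str) -> pd.DataFrame:
    """Same shape as the function handed to Vanna: SQL string in, DataFrame out."""
    return pd.read_sql_query(sql, conn)

df = run_sql("SELECT COUNT(*) AS n FROM person")
print(df["n"].iloc[0])  # prints 2
```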



&lt;h2&gt;Feeding the Model with a Menu&lt;/h2&gt;

&lt;p&gt;The information schema query may need some tweaking depending on your database, but this is a good starting point.&lt;br&gt;
It breaks the information schema up into bite-sized chunks that can be referenced by the LLM.&lt;br&gt;
If you like the plan, run the training step to train Vanna.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;df_information_schema&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run_sql&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT * FROM INFORMATION_SCHEMA.COLUMNS&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;plan&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_training_plan_generic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df_information_schema&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;plan&lt;/span&gt;

&lt;span class="n"&gt;vn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;train&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;plan&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;plan&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;Training&lt;/h2&gt;

&lt;p&gt;The following are methods for adding training data; make sure you modify the examples to match your database.&lt;br&gt;
DDL statements are powerful because they specify table names, column names, types, and potentially relationships.  These DDLs are generated with the now-supported DatabaseConnector as outlined in this &lt;a href="https://community.intersystems.com/post/omop-odyssey-celebration-house-hades" rel="noopener noreferrer"&gt;post&lt;/a&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;vn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;train&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ddl&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
--iris CDM DDL Specification for OMOP Common Data Model 5.4
--HINT DISTRIBUTE ON KEY (person_id)
CREATE TABLE omopcdm54.person (
            person_id integer NOT NULL,
            gender_concept_id integer NOT NULL,
            year_of_birth integer NOT NULL,
            month_of_birth integer NULL,
            day_of_birth integer NULL,
            birth_datetime datetime NULL,
            race_source_concept_id integer NULL,
            ethnicity_source_value varchar(50) NULL,
            ethnicity_source_concept_id integer NULL );
--HINT DISTRIBUTE ON KEY (person_id)
CREATE TABLE omopcdm54.observation_period (
            observation_period_id integer NOT NULL,
            person_id integer NOT NULL,
            observation_period_start_date date NOT NULL,
            observation_period_end_date date NOT NULL,
            period_type_concept_id integer NOT NULL );
--HINT DISTRIBUTE ON KEY (person_id)
CREATE TABLE omopcdm54.visit_occurrence (
            visit_occurrence_id integer NOT NULL,
            discharged_to_source_value varchar(50) NULL,
            preceding_visit_occurrence_id integer NULL );
--HINT DISTRIBUTE ON KEY (person_id)
CREATE TABLE omopcdm54.visit_detail (
            visit_detail_id integer NOT NULL,
            person_id integer NOT NULL,
            visit_detail_concept_id integer NOT NULL,
            provider_id integer NULL,
            care_site_id integer NULL,
            visit_detail_source_value varchar(50) NULL,
            visit_detail_source_concept_id integer NULL );

--HINT DISTRIBUTE ON KEY (person_id)
CREATE TABLE omopcdm54.condition_occurrence (
            condition_occurrence_id integer NOT NULL,
            person_id integer NOT NULL,
            visit_detail_id integer NULL,
            condition_source_value varchar(50) NULL,
            condition_source_concept_id integer NULL,
            condition_status_source_value varchar(50) NULL );
--HINT DISTRIBUTE ON KEY (person_id)
CREATE TABLE omopcdm54.drug_exposure (
            drug_exposure_id integer NOT NULL,
            person_id integer NOT NULL,
            dose_unit_source_value varchar(50) NULL );
--HINT DISTRIBUTE ON KEY (person_id)
CREATE TABLE omopcdm54.procedure_occurrence (
            procedure_occurrence_id integer NOT NULL,
            person_id integer NOT NULL,
            procedure_concept_id integer NOT NULL,
            procedure_date date NOT NULL,
            procedure_source_concept_id integer NULL,
            modifier_source_value varchar(50) NULL );
--HINT DISTRIBUTE ON KEY (person_id)
CREATE TABLE omopcdm54.device_exposure (
            device_exposure_id integer NOT NULL,
            person_id integer NOT NULL,
            device_concept_id integer NOT NULL,
            unit_source_value varchar(50) NULL,
            unit_source_concept_id integer NULL );
--HINT DISTRIBUTE ON KEY (person_id)
CREATE TABLE omopcdm54.observation (
            observation_id integer NOT NULL,
            person_id integer NOT NULL,
            observation_concept_id integer NOT NULL,
            observation_date date NOT NULL,
            observation_datetime datetime NULL,
&amp;lt;SNIP&amp;gt;

&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Sometimes you may want to add documentation about your business terminology or definitions. Here I like to add the resource names from FHIR that were transformed to OMOP.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;vn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;train&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;documentation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Our business is to provide tools for generating evicence in the OHDSI community from the CDM&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;vn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;train&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;documentation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Another word for care_site is organization.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;vn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;train&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;documentation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Another word for provider is practitioner.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now let's add all the data from the InterSystems OMOP Common Data Model. There's probably a better way to do this, but I get paid by the byte.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;cdmtables&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;care_site&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cdm_source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cohort&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cohort_definition&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;concept&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;concept_ancestor&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;concept_class&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;concept_relationship&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;concept_synonym&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;condition_era&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;condition_occurrence&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span 
class="s"&gt;cost&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;death&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;device_exposure&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;domain&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dose_era&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;drug_era&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;drug_exposure&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;drug_strength&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;episode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;episode_event&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fact_relationship&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;location&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;measurement&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span 
class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;metadata&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;note&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;note_nlp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;observation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;observation_period&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;payer_plan_period&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;person&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;procedure_occurrence&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;provider&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;relationship&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source_to_concept_map&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;specimen&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;visit_detail&lt;/span&gt;&lt;span 
class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;visit_occurrence&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vocabulary&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;table&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;cdmtables&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;vn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;train&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sql&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SELECT * FROM  WHERE OMOPCDM54.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;table&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
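&lt;p&gt;Since a typo in that generated SQL is easy to miss, it can help to build the per-table training statements up front and eyeball them before calling &lt;code&gt;vn.train()&lt;/code&gt;. This is only a sketch: &lt;code&gt;build_training_sql&lt;/code&gt; is a hypothetical helper, and the schema name and row limit are assumptions.&lt;/p&gt;

```python
# Hypothetical helper: build one schema-qualified SELECT per CDM table
# so the statements can be reviewed before any training call is made.
SCHEMA = "OMOPCDM54"  # assumption: the schema used elsewhere in this post

def build_training_sql(tables, schema=SCHEMA, limit=100):
    """Return one SELECT per table, qualified with the schema name."""
    return [f"SELECT TOP {limit} * FROM {schema}.{t}" for t in tables]

stmts = build_training_sql(["care_site", "person"])
for s in stmts:
    print(s)  # then: vn.train(sql=s)
```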



&lt;blockquote&gt;
&lt;p&gt;I added the ability for Gemini to see the data here. Make sure this is something you actually want to do in your travels, or you will hand Google your OMOP data by sleight of hand.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Let's do our best &lt;a href="https://en.wikipedia.org/wiki/Pat_Sajak" rel="noopener noreferrer"&gt;Pat Sajak&lt;/a&gt; impression and boot the shiny Vanna app.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;vanna.flask&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;VannaFlaskApp&lt;/span&gt;
&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;VannaFlaskApp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;allow_llm_to_see_data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;debug&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj9caqb9y521v3zsz013h.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj9caqb9y521v3zsz013h.jpg" alt=" " width="800" height="565"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Skynet!&lt;/h2&gt;

&lt;p&gt;This is a bit hackish, but it's really where I want to go with integrating AI into apps going forward: we ask a question in natural language, Vanna returns a SQL query, and we immediately run that query against the InterSystems OMOP deployment using sqlalchemy-iris.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;io&lt;/span&gt;
    &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;sys&lt;/span&gt;
    &lt;span class="n"&gt;old_stdout&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stdout&lt;/span&gt;
    &lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stdout&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;io&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;StringIO&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# Redirect stdout to a dummy stream
&lt;/span&gt;
    &lt;span class="n"&gt;question&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;How Many Care Sites are there in Los Angeles?&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
    &lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stdout&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;old_stdout&lt;/span&gt;

    &lt;span class="n"&gt;sql_query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;question&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Ask Vanna to generate a query from a question of the OMOP database...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;#print(type(sql_query))
&lt;/span&gt;    &lt;span class="n"&gt;raw_sql_to_send_to_sqlalchemy_iris&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sql_query&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Vanna returns the query to use against the database.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;gar&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;raw_sql_to_send_to_sqlalchemy_iris&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;FROM care_site&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;FROM OMOPCDM54.care_site&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gar&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Now use sqlalchemy-iris with the generated query back to the OMOP database...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exec_driver_sql&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gar&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;#print(result)
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
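&lt;p&gt;The hard-coded &lt;code&gt;replace()&lt;/code&gt; above only rescues queries that hit &lt;code&gt;care_site&lt;/code&gt;. A slightly more general sketch qualifies any bare CDM table name that follows FROM or JOIN; the table set and schema name here are assumptions carried over from earlier in the post.&lt;/p&gt;

```python
import re

# Assumption: the same table list used for training earlier in the post.
CDM_TABLES = {"care_site", "person", "visit_occurrence", "condition_occurrence"}

def qualify(sql, schema="OMOPCDM54", tables=CDM_TABLES):
    """Prefix bare CDM table names following FROM/JOIN with the schema."""
    pattern = re.compile(
        r"\b(FROM|JOIN)\s+(" + "|".join(sorted(tables)) + r")\b", re.IGNORECASE
    )
    return pattern.sub(lambda m: f"{m.group(1)} {schema}.{m.group(2)}", sql)

print(qualify("SELECT COUNT(*) FROM care_site WHERE city = 'LA'"))
# SELECT COUNT(*) FROM OMOPCDM54.care_site WHERE city = 'LA'
```

Already-qualified names are left alone, since the schema prefix keeps the pattern from matching.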



&lt;h2&gt;Utilities&lt;/h2&gt;

&lt;p&gt;At any time you can inspect what OMOP data the Vanna package is able to reference. You can also remove training data if there's obsolete/incorrect information (you can do this through the UI too).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;training_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;vn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_training_data&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;training_data&lt;/span&gt;
&lt;span class="n"&gt;vn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;remove_training_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;omop-ddl&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
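&lt;p&gt;A small sketch of that cleanup loop: in Vanna, &lt;code&gt;vn.get_training_data()&lt;/code&gt; returns a DataFrame, but a plain list of dicts stands in for it here so the filtering logic is clear. The column names are assumptions based on how the ids are used above.&lt;/p&gt;

```python
# Stand-in for vn.get_training_data(); keys mirror the assumed columns.
rows = [
    {"id": "omop-ddl", "training_data_type": "ddl"},
    {"id": "fhir-doc", "training_data_type": "documentation"},
]

# Collect ids of entries considered obsolete (here: all DDL entries);
# each id would then be passed to vn.remove_training_data(id=...).
stale_ids = [r["id"] for r in rows if r["training_data_type"] == "ddl"]
print(stale_ids)  # ['omop-ddl']
```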



&lt;h2&gt;About Using IRIS Vectors&lt;/h2&gt;

&lt;p&gt;Wish me luck here: if I manage to crush all the things that need crushing and resist the sun coming out, I'll implement IRIS vectors in Vanna with the following repo.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/sween/vanna-iris-vector" rel="noopener noreferrer"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhx0v0nxt3hiam6pspap0.png" alt=" " width="519" height="164"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>beginners</category>
      <category>python</category>
    </item>
    <item>
      <title>Mini Tip of the Day - Preloading the License into the Docker IRIS Image</title>
      <dc:creator>InterSystems Developer</dc:creator>
      <pubDate>Sat, 28 Feb 2026 15:38:57 +0000</pubDate>
      <link>https://dev.to/intersystems/mini-tip-of-the-day-preloading-the-license-into-the-docker-iris-image-14n8</link>
      <guid>https://dev.to/intersystems/mini-tip-of-the-day-preloading-the-license-into-the-docker-iris-image-14n8</guid>
      <description>&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Who hasn't been developing a beautiful example using a Docker IRIS image and had the image generation process fail in the Dockerfile because the license under which the image was created doesn't contain certain privileges?&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;In my case, what I was deploying in Docker is a small application that uses the Vector data type. With the Community version, this isn't a problem because it already includes Vector Search and vector storage. However, when I changed the IRIS image to a conventional IRIS (the latest-cd), I found that when I built the image, including the classes it had generated, it returned this error:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;9.505 ERROR &lt;span class="hljs-comment"&gt;#15806: Vector Search not permitted with current license&lt;/span&gt;
9.505   &amp;gt; ERROR &lt;span class="hljs-comment"&gt;#5030: An error occurred while compiling class 'Inquisidor.Object.LicitacionOS'&lt;/span&gt;
9.505 Compiling class Inquisidor.Object.Licitacion
9.505 ERROR &lt;span class="hljs-comment"&gt;#15806: Vector Search not permitted with current license&lt;/span&gt;
9.505   &amp;gt; ERROR &lt;span class="hljs-comment"&gt;#5030: An error occurred while compiling class 'Inquisidor.Object.Licitacion'&lt;/span&gt;
9.538 Compiling class Inquisidor.Message.LicitacionResponse&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This error left me confused, because I, as an obedient person, had defined in my docker-compose.yml the parameter that indicates where my valid license is located:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&lt;span class="hljs-attr"&gt;  iris:&lt;/span&gt;
&lt;span class="hljs-attr"&gt;    init:&lt;/span&gt; &lt;span class="hljs-literal"&gt;true&lt;/span&gt;
&lt;span class="hljs-attr"&gt;    container_name:&lt;/span&gt; &lt;span class="hljs-string"&gt;iris&lt;/span&gt;
&lt;span class="hljs-attr"&gt;    build:&lt;/span&gt;
&lt;span class="hljs-attr"&gt;      context:&lt;/span&gt; &lt;span class="hljs-string"&gt;.&lt;/span&gt;
&lt;span class="hljs-attr"&gt;      dockerfile:&lt;/span&gt; &lt;span class="hljs-string"&gt;iris/Dockerfile&lt;/span&gt;
&lt;span class="hljs-attr"&gt;    ports:&lt;/span&gt;
&lt;span class="hljs-bullet"&gt;      -&lt;/span&gt; &lt;span class="hljs-number"&gt;52774&lt;/span&gt;&lt;span class="hljs-string"&gt;:52773&lt;/span&gt;
&lt;span class="hljs-bullet"&gt;      -&lt;/span&gt; &lt;span class="hljs-number"&gt;51774&lt;/span&gt;&lt;span class="hljs-string"&gt;:1972&lt;/span&gt;
&lt;span class="hljs-attr"&gt;    volumes:&lt;/span&gt;
&lt;span class="hljs-bullet"&gt;    -&lt;/span&gt; &lt;span class="hljs-string"&gt;./iris/shared:/iris-shared&lt;/span&gt;
&lt;span class="hljs-attr"&gt;    environment:&lt;/span&gt;
&lt;span class="hljs-bullet"&gt;    -&lt;/span&gt; &lt;span class="hljs-string"&gt;ISC_DATA_DIRECTORY=/iris-shared/durable&lt;/span&gt;
&lt;span class="hljs-attr"&gt;    command:&lt;/span&gt; &lt;span class="hljs-bullet"&gt;--check-caps&lt;/span&gt; &lt;span class="hljs-literal"&gt;false&lt;/span&gt; &lt;span class="hljs-bullet"&gt;--ISCAgent&lt;/span&gt; &lt;span class="hljs-literal"&gt;false&lt;/span&gt; &lt;span class="hljs-bullet"&gt;--key&lt;/span&gt; &lt;span class="hljs-string"&gt;/iris-shared/iris.key&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;It took me a while to realize that the problem was the original image I was using, not the license I had. As you can see, I'm not the sharpest pencil in the case.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;The problem was at the point where I imported my classes into the default IRIS image:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&lt;span class="hljs-keyword"&gt;RUN&lt;/span&gt;&lt;span class="bash"&gt; \
zn &lt;span&gt;"%SYS"&lt;/span&gt; \
&lt;span&gt;do&lt;/span&gt; &lt;span&gt;##class(SYS.Container).QuiesceForBundling() \&lt;/span&gt;
&lt;span&gt;do&lt;/span&gt; &lt;span&gt;##class(Security.Users).UnExpireUserPasswords("*") \&lt;/span&gt;
&lt;span&gt;set&lt;/span&gt; sc=&lt;span&gt;##class(%SYSTEM.OBJ).Load("/opt/irisapp/DemoSetup.Utilities.cls","ck") \&lt;/span&gt;
&lt;span&gt;set&lt;/span&gt; helper=&lt;span&gt;##class(DemoSetup.Utilities).%New() \ &lt;/span&gt;
&lt;/span&gt;do helper.EnableSSLSuperServer() \
do &lt;span class="hljs-comment"&gt;##class(Security.Applications).Import("/ApplicationInquisidor.xml",.n) \&lt;/span&gt;
zn &lt;span class="hljs-string"&gt;"INQUISIDOR"&lt;/span&gt; \
set sc = $SYSTEM.OBJ.LoadDir(&lt;span class="hljs-string"&gt;"/opt/irisapp/src/Inquisidor"&lt;/span&gt;, &lt;span class="hljs-string"&gt;"ck"&lt;/span&gt;, , &lt;span class="hljs-number"&gt;1&lt;/span&gt;) \
set production = &lt;span class="hljs-string"&gt;"Inquisidor.Production"&lt;/span&gt; \
set ^Ens.Configuration(&lt;span class="hljs-string"&gt;"csp"&lt;/span&gt;,&lt;span class="hljs-string"&gt;"LastProduction"&lt;/span&gt;) = production \
do &lt;span class="hljs-comment"&gt;##class(Ens.Director).SetAutoStart(production) \&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Compiling the code was returning the previous error. What should I do to fix it? It was very simple: I had to send the new license to the initial IRIS image and ask it to update the license on the first line of the commands I was using.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;The first step is to move the new license to the &lt;strong&gt;/mgr&lt;/strong&gt; directory of the installation, which I did with this code:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&lt;span class="hljs-keyword"&gt;COPY&lt;/span&gt;&lt;span class="bash"&gt; --chown=&lt;span&gt;$ISC_PACKAGE_MGRUSER&lt;/span&gt;:&lt;span&gt;$ISC_PACKAGE_IRISGROUP&lt;/span&gt; /iris/iris.key /usr/irissys/mgr
&lt;/span&gt;&lt;span class="hljs-keyword"&gt;RUN&lt;/span&gt;&lt;span class="bash"&gt; chmod +x /usr/irissys/mgr/iris.key&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The IRIS manager directory on our image is &lt;strong&gt;/usr/irissys/mgr&lt;/strong&gt;, and /iris/iris.key is the path in my local directory. With the license in the IRIS image, I just needed to tell IRIS to update its license, so I modified the previous commands by adding the following statement:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&lt;span class="hljs-keyword"&gt;RUN&lt;/span&gt;&lt;span class="bash"&gt; \
zn &lt;span&gt;"%SYS"&lt;/span&gt; \
&lt;span&gt;do&lt;/span&gt; &lt;span&gt;##class(%SYSTEM.License).Upgrade() \&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt; Et voila! &lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;I now have my IRIS image with my license loaded before importing and compiling my classes. No more compilation errors.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;I hope it is useful to you!&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

</description>
      <category>docker</category>
      <category>vectordatabase</category>
      <category>programming</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Multi-Layered Security Architecture for IRIS Deployments on AWS with InterSystems IAM</title>
      <dc:creator>InterSystems Developer</dc:creator>
      <pubDate>Sat, 28 Feb 2026 15:37:01 +0000</pubDate>
      <link>https://dev.to/intersystems/multi-layered-security-architecture-for-iris-deployments-on-aws-with-intersystems-iam-134k</link>
      <guid>https://dev.to/intersystems/multi-layered-security-architecture-for-iris-deployments-on-aws-with-intersystems-iam-134k</guid>
      <description>&lt;p&gt;Introduction&lt;/p&gt;

&lt;p&gt;In today's rapidly evolving threat landscape, organizations deploying mission-critical applications must implement robust security architectures that protect sensitive data while maintaining high availability and performance. This is especially crucial for enterprises utilizing advanced database management systems like InterSystems IRIS, which often powers applications handling highly sensitive healthcare, financial, or personal data.&lt;/p&gt;

&lt;p&gt;This article details a comprehensive, multi-layered security architecture for deploying InterSystems IRIS clusters on AWS using Kubernetes (EKS) and InterSystems IAM. By implementing defense-in-depth principles, this architecture provides protection at every level—from the network perimeter to the application layer and data storage.&lt;/p&gt;

&lt;h2&gt;Why a Multi-Layered Approach Matters&lt;/h2&gt;

&lt;p&gt;Single-layer security strategies are increasingly inadequate against sophisticated attack vectors. When one security control fails, additional layers must be in place to prevent a complete compromise. Our architecture implements security controls at five critical layers:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Perimeter Security:&lt;/strong&gt; Using AWS WAF and CloudFront to filter malicious traffic before it reaches your services&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Network Security:&lt;/strong&gt; Leveraging AWS VPC, Security Groups, and Kubernetes network policies&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API Security:&lt;/strong&gt; Implementing InterSystems IAM with advanced security plugins&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Application Security:&lt;/strong&gt; Hardening the Web Gateway with strict URI restrictions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Database Security:&lt;/strong&gt; Configuring IRIS Cluster with robust authentication and encryption&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;By the end of this article, you'll understand how to implement each security layer and how they work together to create a defense-in-depth strategy that protects your IRIS deployments against a wide range of threats while maintaining performance and scalability.&lt;/p&gt;




&lt;h3&gt;Architecture Overview&lt;/h3&gt;

&lt;p&gt;Our security architecture is built around the principle of defense-in-depth, with each layer providing complementary protection. Here's a high-level overview of the complete solution:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa12xh60ot0mrzzqfh5dy.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa12xh60ot0mrzzqfh5dy.jpeg" alt=" " width="800" height="202"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fya7f0v55nnd58b3d63gx.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fya7f0v55nnd58b3d63gx.jpeg" alt=" " width="800" height="960"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;


&lt;h3&gt;IRIS Cluster Deployment Structure&lt;/h3&gt;

&lt;p&gt;Our IRIS Cluster is deployed using the InterSystems Kubernetes Operator (IKO) with a carefully designed topology:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Data Tier:&lt;/strong&gt; Two IRIS instances in a mirrored configuration for high availability and data redundancy&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Application Tier:&lt;/strong&gt; Two IRIS application servers that access data via ECP (Enterprise Cache Protocol)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API Gateway:&lt;/strong&gt; InterSystems IAM (based on Kong) for API management and security&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Web Gateway:&lt;/strong&gt; Three Web Gateway instances (CSP+Nginx) for handling web requests&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Arbiter:&lt;/strong&gt; One arbiter instance for the mirrored data tier&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This architecture separates concerns and provides multiple layers of redundancy:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;data tier&lt;/strong&gt; handles the database operations with synchronous mirroring&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;application tier&lt;/strong&gt; focuses on processing business logic&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;IAM layer&lt;/strong&gt; manages API security&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;Web Gateway layer&lt;/strong&gt; handles HTTP/HTTPS requests&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each component plays a specific role in the security stack:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;AWS WAF (Web Application Firewall):&lt;/strong&gt; Filters malicious traffic using rule sets that protect against common web exploits, SQL injection, and cross-site scripting (XSS). It also implements URI whitelisting to restrict access to only legitimate application paths.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AWS CloudFront:&lt;/strong&gt; Acts as a Content Delivery Network (CDN) that caches static content, reducing the attack surface by handling requests at edge locations. It also provides an additional layer of DDoS protection.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AWS ALB (Application Load Balancer):&lt;/strong&gt; Configured as a Kubernetes Ingress controller, it performs TLS termination and routes traffic to the appropriate backend services based on URL paths.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;InterSystems IAM:&lt;/strong&gt; Built on Kong, this API gateway enforces authentication, authorization, rate limiting, and request validation before traffic reaches the application.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Web Gateway:&lt;/strong&gt; The InterSystems Web Gateway with hardened configuration restricts access to specific URI paths and provides additional validation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IRIS Cluster:&lt;/strong&gt; The IRIS database deployed in a Kubernetes cluster with secure configuration, TLS encryption, and role-based access controls.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This multi-layered approach ensures that even if one security control is bypassed, others remain in place to protect your applications and data.&lt;/p&gt;




&lt;h3&gt;Layer 1: Perimeter Security with AWS WAF and CloudFront&lt;/h3&gt;

&lt;p&gt;The first line of defense in our architecture is at the network perimeter, where we implement AWS WAF and CloudFront to filter malicious traffic before it reaches our services.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1.1 AWS WAF Implementation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;AWS Web Application Firewall is configured with custom rule sets to protect against common web exploits and restrict access to authorized URI paths only. Here's how we've configured it:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# WAF Configuration in Ingress
alb.ingress.kubernetes.io/wafv2-acl-arn: arn:aws:wafv2:region-1:ACCOUNT_ID:regional/webacl/app_uri_whitelisting/abcdef123456&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Our WAF rules include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;URI Path Whitelisting:&lt;/strong&gt; Only allowing traffic to specified application paths such as &lt;code&gt;/app/&lt;/code&gt;, &lt;code&gt;/csp/broker/&lt;/code&gt;, &lt;code&gt;/api/&lt;/code&gt;, and &lt;code&gt;/csp/appdata&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SQL Injection Protection:&lt;/strong&gt; Blocking requests containing SQL injection patterns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;XSS Protection:&lt;/strong&gt; Filtering requests with cross-site scripting payloads&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rate-Based Rules:&lt;/strong&gt; Automatically blocking IPs that exceed request thresholds&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Geo-Restriction Rules:&lt;/strong&gt; Limiting access to specific geographic regions when appropriate&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By implementing these rules at the perimeter, we prevent a significant portion of malicious traffic from ever reaching our application infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1.2 CloudFront Integration&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;AWS CloudFront works alongside WAF to provide additional security benefits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Edge Caching:&lt;/strong&gt; Static content is cached at edge locations, reducing the load on backend services and minimizing the attack surface&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DDoS Protection:&lt;/strong&gt; CloudFront's globally distributed infrastructure helps absorb DDoS attacks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TLS Enforcement:&lt;/strong&gt; All connections are secured with TLS 1.2+ and modern cipher suites&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Origin Access Identity:&lt;/strong&gt; Ensures that S3 buckets hosting static content can only be accessed through CloudFront&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;CloudFront is configured to forward specific headers to the backend services, ensuring that security contexts are preserved throughout the request flow:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;X-Forwarded-For&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;X-Real-IP&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This configuration allows downstream services to identify the original client IP address for rate limiting and logging purposes, even as requests pass through multiple layers.&lt;/p&gt;




&lt;h3&gt;Layer 2: Network Security with AWS VPC and Security Groups&lt;/h3&gt;

&lt;p&gt;The second layer of our security architecture focuses on network-level controls implemented through AWS VPC, Security Groups, and Kubernetes network policies.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2.1 VPC Design for Isolation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Our IRIS deployment runs within a custom VPC with the following characteristics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Private Subnets:&lt;/strong&gt; All IRIS and IAM pods run in private subnets with no direct internet access&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NAT Gateways:&lt;/strong&gt; Outbound internet access is controlled through NAT gateways&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multiple Availability Zones:&lt;/strong&gt; Resources are distributed across three AZs for high availability&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This design ensures that backend services are never directly exposed to the internet, requiring all traffic to flow through the controlled ingress points.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2.2 Security Group Configuration&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Security groups act as virtual firewalls controlling inbound and outbound traffic. Our implementation includes multiple security groups with tightly scoped rules:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Security Groups referenced in Ingress
alb.ingress.kubernetes.io/security-groups: sg-000000000, sg-0100000000, sg-012000000, sg-0130000000&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;These security groups implement:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Ingress Rules:&lt;/strong&gt; Allowing traffic only on required ports (443 for HTTPS)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Source IP Restrictions:&lt;/strong&gt; Limiting access to specific CIDR blocks for administrative interfaces&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Egress Rules:&lt;/strong&gt; Restricting outbound connections to only necessary destinations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This granular control ensures that even if a container is compromised, its ability to communicate with other resources is limited by the security group rules.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2.3 Kubernetes Network Policies&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Within the EKS cluster, we implement Kubernetes Network Policies to control pod-to-pod communication:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&lt;span class="hljs-attr"&gt;apiVersion:&lt;/span&gt; &lt;span class="hljs-string"&gt;networking.k8s.io/v1&lt;/span&gt;
&lt;span class="hljs-attr"&gt;kind:&lt;/span&gt; &lt;span class="hljs-string"&gt;NetworkPolicy&lt;/span&gt;
&lt;span class="hljs-attr"&gt;metadata:&lt;/span&gt;
  &lt;span class="hljs-attr"&gt;name:&lt;/span&gt; &lt;span class="hljs-string"&gt;allow-iam-webgateway&lt;/span&gt;
  &lt;span class="hljs-attr"&gt;namespace:&lt;/span&gt; &lt;span class="hljs-string"&gt;app1&lt;/span&gt;
&lt;span class="hljs-attr"&gt;spec:&lt;/span&gt;
  &lt;span class="hljs-attr"&gt;podSelector:&lt;/span&gt;
    &lt;span class="hljs-attr"&gt;matchLabels:&lt;/span&gt;
      &lt;span class="hljs-string"&gt;app.kubernetes.io/component:&lt;/span&gt; &lt;span class="hljs-string"&gt;webgateway&lt;/span&gt;
  &lt;span class="hljs-attr"&gt;policyTypes:&lt;/span&gt;
    &lt;span class="hljs-bullet"&gt;-&lt;/span&gt; &lt;span class="hljs-string"&gt;Ingress&lt;/span&gt;
  &lt;span class="hljs-attr"&gt;ingress:&lt;/span&gt;
    &lt;span class="hljs-bullet"&gt;-&lt;/span&gt; &lt;span class="hljs-attr"&gt;ports:&lt;/span&gt;
        &lt;span class="hljs-bullet"&gt;-&lt;/span&gt; &lt;span class="hljs-attr"&gt;protocol:&lt;/span&gt; &lt;span class="hljs-string"&gt;TCP&lt;/span&gt;
          &lt;span class="hljs-attr"&gt;port:&lt;/span&gt; &lt;span class="hljs-number"&gt;443&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;These policies ensure that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;IRIS pods&lt;/strong&gt; only accept connections from authorized sources (Web Gateway, IAM)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IAM pods&lt;/strong&gt; only accept connections from the Ingress controller&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Web Gateway pods&lt;/strong&gt; only accept connections from IAM&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This multi-layered network security approach creates isolation boundaries that contain potential security breaches and limit lateral movement within the application environment.&lt;/p&gt;
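&lt;p&gt;As an illustration of the first rule above, a minimal NetworkPolicy for the IRIS data pods might look like the following. This is a sketch: the pod labels and namespace are assumptions modeled on the policy shown earlier, and 1972 is the default IRIS superserver port used by ECP and Web Gateway connections.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-webgateway-to-iris
  namespace: app1
spec:
  # Assumed label for the IRIS data pods
  podSelector:
    matchLabels:
      app.kubernetes.io/component: data
  policyTypes:
    - Ingress
  ingress:
    - from:
        # Only Web Gateway pods may open connections to IRIS
        - podSelector:
            matchLabels:
              app.kubernetes.io/component: webgateway
      ports:
        - protocol: TCP
          port: 1972&lt;/code&gt;&lt;/pre&gt;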




&lt;h3&gt;Layer 3: API Security with InterSystems IAM&lt;/h3&gt;

&lt;p&gt;At the heart of our security architecture lies &lt;strong&gt;InterSystems IAM&lt;/strong&gt;, a powerful API management solution built on Kong. This component provides critical security capabilities including authentication, authorization, rate limiting, and request validation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3.1 InterSystems IAM Overview&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;InterSystems IAM serves as the API gateway for all requests to IRIS services, ensuring that only authorized and legitimate traffic reaches your application. In our implementation, IAM is deployed as a StatefulSet within the same Kubernetes cluster as the IRIS instances, allowing for seamless integration while maintaining isolation of concerns.&lt;/p&gt;

&lt;p&gt;The IAM gateway is configured with &lt;strong&gt;TLS/SSL termination&lt;/strong&gt; and is exposed only through secure endpoints. All communication between IAM and the IRIS Web Gateway is encrypted, ensuring data privacy in transit.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3.2 Advanced Rate Limiting Configuration&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To protect against denial-of-service attacks and abusive API usage, we've implemented advanced rate limiting through IAM's &lt;strong&gt;rate-limiting-advanced plugin&lt;/strong&gt;. This configuration uses Redis as a backend store to track request rates across distributed IAM instances.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F67qt34wmr1l40dr1rjq0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F67qt34wmr1l40dr1rjq0.png" alt=" " width="800" height="329"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvh9etv2a3tykv57j8y3g.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvh9etv2a3tykv57j8y3g.png" alt=" " width="598" height="369"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsga5rtv28zo9705kvnyc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsga5rtv28zo9705kvnyc.png" alt=" " width="593" height="467"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;{
  "name": "rate-limiting-advanced",
  "config": {
    "identifier": "ip",
    "strategy": "redis",
    "window_type": "sliding",
    "limit": [2000, 3000],
    "window_size": [60, 60],
    "redis": {
      "host": "my-release-redis-master.default.svc.cluster.local",
      "port": 6379,
      "timeout": 2000,
      "keepalive_pool_size": 30
    },
    "error_message": "API rate limit exceeded"
  }
}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This configuration provides two tiers of rate limiting:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tier 1:&lt;/strong&gt; 2,000 requests per minute with a sliding window&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tier 2:&lt;/strong&gt; 3,000 requests per minute with a sliding window&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The sliding window approach provides more accurate rate limiting compared to fixed windows, preventing traffic spikes at window boundaries. When a client exceeds these limits, they receive a 429 status code with a custom error message.&lt;/p&gt;
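&lt;p&gt;The sliding-window arithmetic can be sketched in a few lines of Python (a conceptual illustration, not Kong's actual implementation): the effective count is the current window's total plus the previous window's total, weighted by how much of the previous window still overlaps the sliding window.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import time

class SlidingWindowLimiter:
    """Conceptual sketch of sliding-window rate limiting."""

    def __init__(self, limit, window_size):
        self.limit = limit
        self.window_size = window_size
        self.counts = {}  # window start timestamp -&gt; request count

    def allow(self, now=None):
        now = time.time() if now is None else now
        current = int(now // self.window_size) * self.window_size
        previous = current - self.window_size
        # Fraction of the previous window still inside the sliding window
        overlap = 1.0 - (now - current) / self.window_size
        weighted = (self.counts.get(previous, 0) * overlap
                    + self.counts.get(current, 0))
        if weighted &gt;= self.limit:
            return False  # the caller would answer with HTTP 429
        self.counts[current] = self.counts.get(current, 0) + 1
        return True&lt;/code&gt;&lt;/pre&gt;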

&lt;p&gt;&lt;strong&gt;3.3 Secure Session Management&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For applications requiring user sessions, we've configured IAM's &lt;strong&gt;session plugin&lt;/strong&gt; with secure settings to prevent session hijacking and maintain proper session lifecycle:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;{
  "name": "session",
  "config": {
    "secret": "REDACTED",
    "cookie_secure": true,
    "cookie_same_site": "Strict",
    "cookie_http_only": true,
    "idling_timeout": 900,
    "absolute_timeout": 86400,
    "rolling_timeout": 14400
  }
}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Key security features implemented include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;HTTP-only cookies&lt;/strong&gt;: Prevents JavaScript access to session cookies, mitigating XSS attacks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Secure flag&lt;/strong&gt;: Ensures cookies are only sent over HTTPS connections&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Same-Site restriction&lt;/strong&gt;: Prevents CSRF attacks by restricting cookie usage to same-site requests&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multiple timeout mechanisms:&lt;/strong&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Idling timeout&lt;/strong&gt; (15 minutes): Expires sessions after inactivity&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rolling timeout&lt;/strong&gt; (4 hours): Requires re-authentication periodically&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Absolute timeout&lt;/strong&gt; (24 hours): Maximum session lifetime regardless of activity&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
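&lt;p&gt;The interplay of the three timeouts can be expressed as a small policy function (a Python sketch of the semantics, not the plugin's code; a session is rejected as soon as any one of the three limits is exceeded):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;IDLING_TIMEOUT = 900       # 15 minutes without activity
ROLLING_TIMEOUT = 14400    # 4 hours without cookie renewal
ABSOLUTE_TIMEOUT = 86400   # 24 hours since the session was created

def session_expired(now, created, last_activity, last_renewal):
    """Return True if any of the three timeout rules has been violated."""
    if now - last_activity &gt; IDLING_TIMEOUT:
        return True  # idling timeout: user inactive too long
    if now - last_renewal &gt; ROLLING_TIMEOUT:
        return True  # rolling timeout: periodic renewal overdue
    if now - created &gt; ABSOLUTE_TIMEOUT:
        return True  # absolute timeout: maximum lifetime reached
    return False&lt;/code&gt;&lt;/pre&gt;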

&lt;p&gt;&lt;strong&gt;3.4 Request Validation for Input Sanitization&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To protect against injection attacks and malformed requests, we've implemented strict &lt;strong&gt;request validation&lt;/strong&gt; using IAM's &lt;strong&gt;request-validator plugin&lt;/strong&gt;. This is particularly important for securing the CSP broker, which is a critical component that handles client-server communication in InterSystems applications.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5cbsdqedf2el3am7w1d9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5cbsdqedf2el3am7w1d9.png" alt=" " width="639" height="258"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;{
  "name": "request-validator",
  "config": {
    "version": "kong",
    "body_schema": [
      { "RequestParam": { "type": "integer", "required": true, "between": [1, 10] } },
      { "EventType": { "type": "string", "required": true, "match": "^[a-zA-Z0-9$_]{100}$" } },
      { "SessionID": { "type": "string", "required": true, "match": "^00b0[a-zA-Z0-9]{40}$" } }
    ],
    "verbose_response": true,
    "allowed_content_types": ["application/x-www-form-urlencoded"]
  }
}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This configuration enforces strict validation rules:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Input fields&lt;/strong&gt; must match exact data types and constraints&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;String inputs&lt;/strong&gt; must match specific regular expression patterns&lt;/li&gt;
&lt;li&gt;Only allowed content types are accepted&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The CSP broker is particularly sensitive because it serves as a communication channel between client browsers and the IRIS server. By validating all requests at the IAM layer before they reach the broker, we create an additional security barrier that protects against malformed or malicious requests targeting this critical component. When a request fails validation, IAM returns a detailed error response that helps identify the validation issue without revealing sensitive information about your backend systems.&lt;/p&gt;
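&lt;p&gt;The same schema checks are easy to reproduce for testing purposes. The sketch below mirrors two of the rules from the configuration above in Python (illustrative only; Kong evaluates the schema itself at the gateway):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import re

# Pattern copied from the request-validator configuration above
SESSION_ID = re.compile(r"^00b0[a-zA-Z0-9]{40}$")

def validate_body(params):
    """Mimic the gateway's body validation: every field must match
    its declared type and constraints."""
    param = params.get("RequestParam")
    ok_param = isinstance(param, int) and param in range(1, 11)
    sid = params.get("SessionID", "")
    ok_sid = isinstance(sid, str) and bool(SESSION_ID.match(sid))
    return ok_param and ok_sid&lt;/code&gt;&lt;/pre&gt;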

&lt;p&gt;&lt;strong&gt;3.5 Trusted IPs Configuration&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;To further enhance security, IAM is configured to recognize trusted proxies and properly determine client IP addresses:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;{ "trusted_ips": [ "10.0.0.0/24", "10.1.0.0/24", "10.0.3.0/24" ], "real_ip_header": "X-Forwarded-For", "real_ip_recursive": "on" } &lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This configuration ensures that:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Rate limiting correctly identifies client IPs even behind proxies&lt;/li&gt;
&lt;li&gt;Security rules using IP identification work properly&lt;/li&gt;
&lt;li&gt;Access logs record actual client IPs rather than proxy IPs&lt;/li&gt;
&lt;/ul&gt;
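&lt;p&gt;The effect of &lt;code&gt;real_ip_recursive&lt;/code&gt; can be illustrated with a short Python sketch: the &lt;code&gt;X-Forwarded-For&lt;/code&gt; chain is walked from right to left, and the first address that is not a trusted proxy is taken as the client. This is illustrative only; the CIDR ranges are the ones from the configuration above.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import ipaddress

TRUSTED = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/24", "10.1.0.0/24", "10.0.3.0/24")]

def client_ip(x_forwarded_for):
    """Resolve the real client IP from an X-Forwarded-For header,
    skipping trusted proxy hops from right to left."""
    hops = [hop.strip() for hop in x_forwarded_for.split(",")]
    for hop in reversed(hops):
        addr = ipaddress.ip_address(hop)
        if not any(addr in net for net in TRUSTED):
            return hop
    return hops[0]  # every hop was a trusted proxy&lt;/code&gt;&lt;/pre&gt;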

&lt;p&gt;By implementing these advanced security features in InterSystems IAM, we've created a robust API security layer that complements the perimeter and network security measures while protecting the application and database layers from malicious or excessive traffic.&lt;/p&gt;




&lt;h3&gt;Layer 4: Application Security with Web Gateway Hardening&lt;/h3&gt;

&lt;p&gt;The fourth layer of our security architecture focuses on hardening the InterSystems Web Gateway, which serves as the interface between the IAM API gateway and the IRIS database.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4.1 Web Gateway Configuration in Kubernetes&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The Web Gateway is deployed as part of the IrisCluster custom resource, with specific security-focused configuration:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&lt;span class="hljs-attr"&gt;webgateway:&lt;/span&gt;
  &lt;span class="hljs-attr"&gt;image:&lt;/span&gt; &lt;span class="hljs-string"&gt;containers.intersystems.com/intersystems/webgateway-nginx:2023.3&lt;/span&gt;
  &lt;span class="hljs-attr"&gt;type:&lt;/span&gt; &lt;span class="hljs-string"&gt;nginx&lt;/span&gt;
  &lt;span class="hljs-attr"&gt;replicas:&lt;/span&gt; &lt;span class="hljs-number"&gt;2&lt;/span&gt;
  &lt;span class="hljs-attr"&gt;applicationPaths:&lt;/span&gt;
    &lt;span class="hljs-bullet"&gt;-&lt;/span&gt; &lt;span class="hljs-string"&gt;/csp/app1&lt;/span&gt;
    &lt;span class="hljs-bullet"&gt;-&lt;/span&gt; &lt;span class="hljs-string"&gt;/csp/app2&lt;/span&gt;
    &lt;span class="hljs-bullet"&gt;-&lt;/span&gt; &lt;span class="hljs-string"&gt;/app3&lt;/span&gt;
    &lt;span class="hljs-bullet"&gt;-&lt;/span&gt; &lt;span class="hljs-string"&gt;/csp/app4&lt;/span&gt;
    &lt;span class="hljs-bullet"&gt;-&lt;/span&gt; &lt;span class="hljs-string"&gt;/app5&lt;/span&gt;
    &lt;span class="hljs-bullet"&gt;-&lt;/span&gt; &lt;span class="hljs-string"&gt;/csp/bin&lt;/span&gt;
  &lt;span class="hljs-attr"&gt;alternativeServers:&lt;/span&gt; &lt;span class="hljs-string"&gt;LoadBalancing&lt;/span&gt;
  &lt;span class="hljs-attr"&gt;loginSecret:&lt;/span&gt;
    &lt;span class="hljs-attr"&gt;name:&lt;/span&gt; &lt;span class="hljs-string"&gt;iris-webgateway-secret&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This configuration restricts the Web Gateway to serving only specific application paths, limiting the attack surface by preventing access to unauthorized endpoints.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4.2 CSP.ini Security Hardening&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The Web Gateway's CSP.ini configuration is hardened with several security measures:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&lt;span class="hljs-string"&gt;[SYSTEM]&lt;/span&gt;
&lt;span class="hljs-string"&gt;No_Activity_Timeout=480&lt;/span&gt;
&lt;span class="hljs-string"&gt;System_Manager=127.0.0.1&lt;/span&gt;
&lt;span class="hljs-string"&gt;Maximum_Logged_Request_Size=256K&lt;/span&gt;
&lt;span class="hljs-string"&gt;MAX_CONNECTIONS=4096&lt;/span&gt;
&lt;span class="hljs-string"&gt;Server_Response_Timeout=60&lt;/span&gt;
&lt;span class="hljs-string"&gt;Queued_Request_Timeout=60&lt;/span&gt;
&lt;span class="hljs-string"&gt;Default_Server=IRIS&lt;/span&gt;
&lt;span class="hljs-string"&gt;[APP_PATH:/app]&lt;/span&gt;
&lt;span class="hljs-string"&gt;Alternative_Servers=LoadBalancing&lt;/span&gt;
&lt;span class="hljs-string"&gt;Alternative_Server_0=1~~~~~~server-compute-0&lt;/span&gt;
&lt;span class="hljs-string"&gt;Response_Size_Notification=Chunked&lt;/span&gt; &lt;span class="hljs-string"&gt;Transfer&lt;/span&gt; &lt;span class="hljs-string"&gt;Encoding&lt;/span&gt; &lt;span class="hljs-string"&gt;and&lt;/span&gt; &lt;span class="hljs-string"&gt;Content&lt;/span&gt; &lt;span class="hljs-string"&gt;Length&lt;/span&gt;
&lt;span class="hljs-string"&gt;KeepAlive=No&lt;/span&gt; &lt;span class="hljs-string"&gt;Action&lt;/span&gt;
&lt;span class="hljs-string"&gt;GZIP_Compression=Enabled&lt;/span&gt;
&lt;span class="hljs-string"&gt;GZIP_Exclude_File_Types=jpeg&lt;/span&gt; &lt;span class="hljs-string"&gt;gif&lt;/span&gt; &lt;span class="hljs-string"&gt;ico&lt;/span&gt; &lt;span class="hljs-string"&gt;png&lt;/span&gt; &lt;span class="hljs-string"&gt;gz&lt;/span&gt; &lt;span class="hljs-string"&gt;zip&lt;/span&gt; &lt;span class="hljs-string"&gt;mp3&lt;/span&gt; &lt;span class="hljs-string"&gt;mp4&lt;/span&gt; &lt;span class="hljs-string"&gt;tiff&lt;/span&gt;
&lt;span class="hljs-string"&gt;GZIP_Minimum_File_Size=500&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;


&lt;p&gt;Key security features in this configuration include:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Disabled System Manager&lt;/strong&gt;: The System Manager interface is disabled except from localhost&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Manual Configuration Only&lt;/strong&gt;: Auto-configuration is disabled to prevent unauthorized changes&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Path Restrictions&lt;/strong&gt;: Each application path has specific security settings&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Authentication Enforcement&lt;/strong&gt;: AutheEnabled=64 enforces authentication&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Session Timeout&lt;/strong&gt;: 15-minute session timeout aligned with IAM settings&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Locked CSP Names&lt;/strong&gt;: Prevents path traversal attacks by locking CSP names&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;4.3 Advanced Nginx Security Configuration&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Our implementation uses a heavily hardened Nginx configuration for the Web Gateway, which provides several layers of defense:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&lt;span class="hljs-comment"&gt;# Define whitelist using map&lt;/span&gt;
&lt;span class="hljs-keyword"&gt;map&lt;/span&gt; $request_uri $whitelist_uri {
    &lt;span class="hljs-keyword"&gt;default&lt;/span&gt; &lt;span class="hljs-number"&gt;0&lt;/span&gt;;
    &lt;span class="hljs-string"&gt;"~^/app/.*$"&lt;/span&gt; &lt;span class="hljs-number"&gt;1&lt;/span&gt;;
    &lt;span class="hljs-string"&gt;"~^/app/.*\.(csp|css|ico|js|png|woff2|ttf|jpg|gif)$"&lt;/span&gt; &lt;span class="hljs-number"&gt;1&lt;/span&gt;;
    &lt;span class="hljs-string"&gt;"~^/csp/broker/cspxmlhttp.js$"&lt;/span&gt; &lt;span class="hljs-number"&gt;1&lt;/span&gt;;
    &lt;span class="hljs-string"&gt;"~^/csp/broker/cspbroker.js$"&lt;/span&gt; &lt;span class="hljs-number"&gt;1&lt;/span&gt;;
    &lt;span class="hljs-string"&gt;"~^/csp/app/.*$"&lt;/span&gt; &lt;span class="hljs-number"&gt;1&lt;/span&gt;;
    &lt;span class="hljs-string"&gt;"~^/csp/bin/Systems/Module.cxw.*$"&lt;/span&gt; &lt;span class="hljs-number"&gt;1&lt;/span&gt;;
}

&lt;span class="hljs-comment"&gt;# Block specific URIs globally&lt;/span&gt;
&lt;span class="hljs-keyword"&gt;map&lt;/span&gt; $request_uri $block_uri {
    &lt;span class="hljs-keyword"&gt;default&lt;/span&gt; &lt;span class="hljs-number"&gt;0&lt;/span&gt;;
    &lt;span class="hljs-string"&gt;"~*%25login"&lt;/span&gt; &lt;span class="hljs-number"&gt;1&lt;/span&gt;;
    &lt;span class="hljs-string"&gt;"~*%25CSP\.PasswordChange\.cls"&lt;/span&gt; &lt;span class="hljs-number"&gt;1&lt;/span&gt;;
    &lt;span class="hljs-string"&gt;"~*%25ZEN\.SVGComponent\.svgPage"&lt;/span&gt; &lt;span class="hljs-number"&gt;1&lt;/span&gt;;
}

&lt;span class="hljs-comment"&gt;# Custom error pages&lt;/span&gt;
error_page &lt;span class="hljs-number"&gt;403&lt;/span&gt; /403.html;

&lt;span class="hljs-comment"&gt;# URI whitelisting enforcement&lt;/span&gt;
&lt;span class="hljs-keyword"&gt;if&lt;/span&gt; ($whitelist_uri = &lt;span class="hljs-number"&gt;0&lt;/span&gt;) {
    &lt;span class="hljs-keyword"&gt;return&lt;/span&gt; &lt;span class="hljs-number"&gt;403&lt;/span&gt;;
}

&lt;span class="hljs-comment"&gt;# Deny access to forbidden file types&lt;/span&gt;
location ~* \.(ppt|pptx)$ {
    deny all;
    &lt;span class="hljs-keyword"&gt;return&lt;/span&gt; &lt;span class="hljs-number"&gt;403&lt;/span&gt;;
}

&lt;span class="hljs-comment"&gt;# Deny access to blocked URIs&lt;/span&gt;
&lt;span class="hljs-keyword"&gt;if&lt;/span&gt; ($block_uri) {
    &lt;span class="hljs-keyword"&gt;return&lt;/span&gt; &lt;span class="hljs-number"&gt;403&lt;/span&gt;;
}

&lt;span class="hljs-comment"&gt;# Comprehensive logging for security analysis&lt;/span&gt;
log_format security &lt;span class="hljs-string"&gt;'$real_client_ip - $remote_user [$time_local] '&lt;/span&gt;
                    &lt;span class="hljs-string"&gt;'"$request" $status $body_bytes_sent '&lt;/span&gt;
                    &lt;span class="hljs-string"&gt;'"$http_referer" "$http_user_agent" '&lt;/span&gt;
                    &lt;span class="hljs-string"&gt;'"$http_x_forwarded_for" "$request_body"'&lt;/span&gt;;&lt;/code&gt;&lt;/pre&gt;
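
&lt;p&gt;&lt;span&gt;Because any path not matched by the whitelist fails closed with a 403, it is worth unit-testing the patterns offline before deploying. The sketch below is a small Python mirror of the two maps' decision logic; it approximates Nginx's matching behavior and is not a substitute for a staging test:&lt;/span&gt;&lt;/p&gt;

```python
import re

# Mirrors the Nginx whitelist/blocklist maps above so the regexes
# can be unit-tested offline. Approximates nginx matching semantics.
WHITELIST = [
    r"^/app/.*$",
    r"^/app/.*\.(csp|css|ico|js|png|woff2|ttf|jpg|gif)$",
    r"^/csp/broker/cspxmlhttp\.js$",
    r"^/csp/broker/cspbroker\.js$",
    r"^/csp/app/.*$",
    r"^/csp/bin/Systems/Module\.cxw.*$",
]
BLOCKLIST = [  # "~*" patterns: case-insensitive, matched anywhere in the URI
    r"%25login",
    r"%25CSP\.PasswordChange\.cls",
    r"%25ZEN\.SVGComponent\.svgPage",
]

def decide(uri: str) -> int:
    """Return the HTTP status the config above would yield for a request URI."""
    if not any(re.search(p, uri) for p in WHITELIST):
        return 403  # not whitelisted
    if any(re.search(p, uri, re.IGNORECASE) for p in BLOCKLIST):
        return 403  # explicitly blocked
    return 200      # passed through to the Web Gateway
```

&lt;p&gt;&lt;span&gt;Running assertions such as decide("/app/%25login") == 403 in CI catches a broken pattern before it locks users out or exposes a blocked endpoint.&lt;/span&gt;&lt;/p&gt;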

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;This configuration implements several critical security controls:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;URI Whitelisting&lt;/b&gt;: Only explicitly allowed paths can be accessed&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Blocking Dangerous Paths&lt;/b&gt;: Automatically blocks access to dangerous endpoints&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Blocking Risky File Types&lt;/b&gt;: Prevents access to potentially dangerous file types&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Security Logging&lt;/b&gt;: Detailed logging of all requests for forensic analysis&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Client IP Extraction&lt;/b&gt;: Properly extracts real client IPs from X-Forwarded-For headers&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Custom Error Pages&lt;/b&gt;: Standardized error responses that don't leak system information&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
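
&lt;p&gt;&lt;span&gt;One note on the client IP extraction point: the log format above references $real_client_ip, which is not defined in the excerpt. A common way to derive the real client address behind an ALB is Nginx's realip module; the trusted CIDR below is a placeholder for your VPC or proxy range:&lt;/span&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Trust the load balancer range so X-Forwarded-For is honored
# (10.0.0.0/16 is a placeholder for your VPC/proxy CIDR)
set_real_ip_from 10.0.0.0/16;
real_ip_header X-Forwarded-For;
real_ip_recursive on;

# After the realip module runs, $remote_addr holds the real client address
map $remote_addr $real_client_ip {
    default $remote_addr;
}&lt;/code&gt;&lt;/pre&gt;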

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Additionally, we implement strong security headers and request limits:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&lt;span class="hljs-comment"&gt;# Security headers&lt;/span&gt;
add_header X-XSS-Protection &lt;span class="hljs-string"&gt;"1; mode=block"&lt;/span&gt; always;
add_header X-Content-Type-Options &lt;span class="hljs-string"&gt;"nosniff"&lt;/span&gt; always;
add_header X-Frame-Options &lt;span class="hljs-string"&gt;"SAMEORIGIN"&lt;/span&gt; always;
add_header Strict-Transport-Security &lt;span class="hljs-string"&gt;"max-age=31536000; includeSubDomains"&lt;/span&gt; always;

&lt;span class="hljs-comment"&gt;# Buffer and request size limits&lt;/span&gt;
client_max_body_size 50M;
client_body_buffer_size 128k;
client_header_buffer_size 1k;
large_client_header_buffers 4 4k;

&lt;span class="hljs-comment"&gt;# SSL/TLS security&lt;/span&gt;
ssl_protocols TLSv1.2 TLSv1.3;
ssl_prefer_server_ciphers on;
ssl_ciphers &lt;span class="hljs-string"&gt;'ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384'&lt;/span&gt;;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;These settings protect against:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Cross-site scripting (XSS)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;MIME type confusion attacks&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Clickjacking&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;SSL downgrade attacks&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Buffer overflow attempts&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Large payload attacks&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;4.4 TLS Configuration&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;The Web Gateway is configured to use modern TLS settings, ensuring secure communication:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&lt;span class="hljs-attr"&gt;tls:&lt;/span&gt;
&lt;span class="hljs-attr"&gt;  webgateway:&lt;/span&gt;
&lt;span class="hljs-attr"&gt;    secret:&lt;/span&gt;
&lt;span class="hljs-attr"&gt;      secretName:&lt;/span&gt; &lt;span class="hljs-string"&gt;iris-tls-secret&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Our TLS implementation ensures:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Only TLS 1.2+ protocols are allowed&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Strong cipher suites with forward secrecy are enforced&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Certificates are properly validated&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Session management is secure&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
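
&lt;p&gt;&lt;span&gt;The iris-tls-secret referenced by secretName must exist in the cluster before deployment. A typical way to create it from a certificate/key pair (the file names here are illustrative):&lt;/span&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Certificate and key file names are illustrative
kubectl create secret tls iris-tls-secret \
  --cert=tls.crt --key=tls.key \
  --namespace example-namespace&lt;/code&gt;&lt;/pre&gt;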

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;By implementing this extensive hardening of the Web Gateway, we create a robust security layer that protects the IRIS database from unauthorized access and common web application vulnerabilities.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Layer 5: Database Security in IRIS Clusters&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;The final layer of our security architecture focuses on securing the IRIS database itself, ensuring that even if all previous layers are compromised, the data remains protected.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;5.1 IrisCluster Secure Configuration with InterSystems Kubernetes Operator (IKO)&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;The IRIS cluster is deployed using the IrisCluster custom resource definition provided by the InterSystems Kubernetes Operator (IKO), with security-focused configuration:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&lt;span class="hljs-attr"&gt;apiVersion:&lt;/span&gt; &lt;span class="hljs-string"&gt;intersystems.com/v1alpha1&lt;/span&gt;
&lt;span class="hljs-attr"&gt;kind:&lt;/span&gt; &lt;span class="hljs-string"&gt;IrisCluster&lt;/span&gt;
&lt;span class="hljs-attr"&gt;metadata:&lt;/span&gt;
  &lt;span class="hljs-attr"&gt;name:&lt;/span&gt; &lt;span class="hljs-string"&gt;example-app&lt;/span&gt;
  &lt;span class="hljs-attr"&gt;namespace:&lt;/span&gt; &lt;span class="hljs-string"&gt;example-namespace&lt;/span&gt;
&lt;span class="hljs-attr"&gt;spec:&lt;/span&gt;
  &lt;span class="hljs-attr"&gt;tls:&lt;/span&gt;
    &lt;span class="hljs-attr"&gt;common:&lt;/span&gt;
      &lt;span class="hljs-attr"&gt;secret:&lt;/span&gt;
        &lt;span class="hljs-attr"&gt;secretName:&lt;/span&gt; &lt;span class="hljs-string"&gt;iris-tls-secret&lt;/span&gt;
    &lt;span class="hljs-attr"&gt;mirror:&lt;/span&gt;
      &lt;span class="hljs-attr"&gt;secret:&lt;/span&gt;
        &lt;span class="hljs-attr"&gt;secretName:&lt;/span&gt; &lt;span class="hljs-string"&gt;iris-tls-secret&lt;/span&gt;
    &lt;span class="hljs-attr"&gt;ecp:&lt;/span&gt;
      &lt;span class="hljs-attr"&gt;secret:&lt;/span&gt;
        &lt;span class="hljs-attr"&gt;secretName:&lt;/span&gt; &lt;span class="hljs-string"&gt;iris-tls-secret&lt;/span&gt;
  &lt;span class="hljs-attr"&gt;topology:&lt;/span&gt;
    &lt;span class="hljs-attr"&gt;data:&lt;/span&gt;
      &lt;span class="hljs-attr"&gt;image:&lt;/span&gt; &lt;span class="hljs-string"&gt;containers.intersystems.com/intersystems/iris:2023.3&lt;/span&gt;
      &lt;span class="hljs-attr"&gt;preferredZones:&lt;/span&gt;
        &lt;span class="hljs-bullet"&gt;-&lt;/span&gt; &lt;span class="hljs-string"&gt;region-1a&lt;/span&gt;
        &lt;span class="hljs-bullet"&gt;-&lt;/span&gt; &lt;span class="hljs-string"&gt;region-1b&lt;/span&gt;
      &lt;span class="hljs-attr"&gt;mirrored:&lt;/span&gt; &lt;span class="hljs-literal"&gt;true&lt;/span&gt;
      &lt;span class="hljs-attr"&gt;podTemplate:&lt;/span&gt;
        &lt;span class="hljs-attr"&gt;spec:&lt;/span&gt;
          &lt;span class="hljs-attr"&gt;securityContext:&lt;/span&gt;
            &lt;span class="hljs-attr"&gt;runAsUser:&lt;/span&gt; &lt;span class="hljs-number"&gt;51773&lt;/span&gt;  &lt;span class="hljs-comment"&gt;# irisowner&lt;/span&gt;
            &lt;span class="hljs-attr"&gt;runAsGroup:&lt;/span&gt; &lt;span class="hljs-number"&gt;51773&lt;/span&gt;
            &lt;span class="hljs-attr"&gt;fsGroup:&lt;/span&gt; &lt;span class="hljs-number"&gt;51773&lt;/span&gt;
      &lt;span class="hljs-attr"&gt;irisDatabases:&lt;/span&gt;
        &lt;span class="hljs-bullet"&gt;-&lt;/span&gt; &lt;span class="hljs-attr"&gt;name:&lt;/span&gt; &lt;span class="hljs-string"&gt;appdata&lt;/span&gt;
          &lt;span class="hljs-attr"&gt;mirrored:&lt;/span&gt; &lt;span class="hljs-literal"&gt;true&lt;/span&gt;
          &lt;span class="hljs-attr"&gt;ecp:&lt;/span&gt; &lt;span class="hljs-literal"&gt;true&lt;/span&gt;
      &lt;span class="hljs-attr"&gt;irisNamespaces:&lt;/span&gt;
        &lt;span class="hljs-bullet"&gt;-&lt;/span&gt; &lt;span class="hljs-attr"&gt;name:&lt;/span&gt; &lt;span class="hljs-string"&gt;APP&lt;/span&gt;
          &lt;span class="hljs-attr"&gt;routines:&lt;/span&gt; &lt;span class="hljs-string"&gt;appdata&lt;/span&gt;
          &lt;span class="hljs-attr"&gt;globals:&lt;/span&gt; &lt;span class="hljs-string"&gt;appdata&lt;/span&gt;
    &lt;span class="hljs-attr"&gt;compute:&lt;/span&gt;
      &lt;span class="hljs-attr"&gt;image:&lt;/span&gt; &lt;span class="hljs-string"&gt;containers.intersystems.com/intersystems/iris:2023.3&lt;/span&gt;
      &lt;span class="hljs-attr"&gt;replicas:&lt;/span&gt; &lt;span class="hljs-number"&gt;1&lt;/span&gt;
      &lt;span class="hljs-attr"&gt;compatibilityVersion:&lt;/span&gt; &lt;span class="hljs-string"&gt;"2023.3.0"&lt;/span&gt;
      &lt;span class="hljs-attr"&gt;webgateway:&lt;/span&gt;
        &lt;span class="hljs-attr"&gt;image:&lt;/span&gt; &lt;span class="hljs-string"&gt;containers.intersystems.com/intersystems/webgateway-nginx:2023.3&lt;/span&gt;
        &lt;span class="hljs-attr"&gt;replicas:&lt;/span&gt; &lt;span class="hljs-number"&gt;1&lt;/span&gt;
        &lt;span class="hljs-attr"&gt;type:&lt;/span&gt; &lt;span class="hljs-string"&gt;nginx&lt;/span&gt;
        &lt;span class="hljs-attr"&gt;applicationPaths:&lt;/span&gt;
          &lt;span class="hljs-bullet"&gt;-&lt;/span&gt; &lt;span class="hljs-string"&gt;/csp/sys&lt;/span&gt;
          &lt;span class="hljs-bullet"&gt;-&lt;/span&gt; &lt;span class="hljs-string"&gt;/csp/bin&lt;/span&gt;
          &lt;span class="hljs-bullet"&gt;-&lt;/span&gt; &lt;span class="hljs-string"&gt;/api/app&lt;/span&gt;
          &lt;span class="hljs-bullet"&gt;-&lt;/span&gt; &lt;span class="hljs-string"&gt;/app&lt;/span&gt;
    &lt;span class="hljs-attr"&gt;iam:&lt;/span&gt;
      &lt;span class="hljs-attr"&gt;image:&lt;/span&gt; &lt;span class="hljs-string"&gt;containers.intersystems.com/intersystems/iam:3.4&lt;/span&gt;
    &lt;span class="hljs-attr"&gt;arbiter:&lt;/span&gt;
      &lt;span class="hljs-attr"&gt;image:&lt;/span&gt; &lt;span class="hljs-string"&gt;containers.intersystems.com/intersystems/arbiter:2023.3&lt;/span&gt;
      &lt;span class="hljs-attr"&gt;preferredZones:&lt;/span&gt;
        &lt;span class="hljs-bullet"&gt;-&lt;/span&gt; &lt;span class="hljs-string"&gt;region-1c&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Our IKO deployment includes several critical security features:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;TLS Encryption&lt;/b&gt;: All communication between IRIS instances is encrypted using TLS&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Database Mirroring&lt;/b&gt;: High availability with synchronous mirroring ensures data integrity&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Non-Root Execution&lt;/b&gt;: IRIS runs as the non-privileged irisowner user&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;ECP Security&lt;/b&gt;: Enterprise Cache Protocol connections are secured with TLS&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Zone Distribution&lt;/b&gt;: Components are distributed across availability zones for fault tolerance&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Resource Isolation&lt;/b&gt;: Clear separation between data and compute nodes&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;IRIS Namespaces&lt;/b&gt;: Properly configured namespaces that map to secure databases&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Arbiter Node&lt;/b&gt;: Dedicated arbiter node in a separate availability zone&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;5.2 IRIS Database Security Settings&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Within the IRIS database itself, security best practice includes several key settings:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Delegated Authentication&lt;/b&gt;: Configure IRIS to use external authentication mechanisms for centralized identity management&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Audit Logging&lt;/b&gt;: Enable comprehensive auditing for security-relevant events like logins, configuration changes, and privilege escalation&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;System Security&lt;/b&gt;: Apply system-wide security settings that align with industry standards&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;These practices ensure that authentication is managed centrally, all security-relevant activities are logged for forensic purposes, and the system adheres to secure configuration standards.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;5.3 IRIS Resource-Based Security&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;IRIS provides a robust security framework based on resources and roles that allows for fine-grained access control. This framework can be used to implement the principle of least privilege, giving users and services only the permissions they need to perform their functions.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Resource-Based Security Model&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;The IRIS resource-based security model includes:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Resources&lt;/b&gt;: Secure objects such as databases, services, applications, and system operations&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Permissions&lt;/b&gt;: Different levels of access to resources (Read, Write, Use)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Roles&lt;/b&gt;: Collections of permissions on resources that can be assigned to users&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Users&lt;/b&gt;: Accounts that are assigned roles and can authenticate to the system&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;This model allows security administrators to create a granular security structure that restricts access based on job functions and needs. For example:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Database administrators might have full access to database resources but limited access to application resources&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Application users might have access only to specific application functions&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Service accounts for integrations might have narrow permissions tailored to their specific needs&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
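
&lt;p&gt;&lt;span&gt;As a sketch of how this maps to practice, roles and grants can be defined with IRIS SQL; every name below (role, table, user, password) is a hypothetical placeholder:&lt;/span&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;-- Hypothetical names, shown only to illustrate the resource/role/user model
CREATE ROLE AppReadOnly;
GRANT SELECT ON App_Data.Orders TO AppReadOnly;  -- permission on a resource
CREATE USER ReportSvc IDENTIFY BY 'REPLACE-ME';  -- narrow service account
GRANT AppReadOnly TO ReportSvc;                  -- role assignment&lt;/code&gt;&lt;/pre&gt;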

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;InterSystems Documentation&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;The implementation of role-based security in IRIS is well-documented in the InterSystems official documentation:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;a href="https://docs.intersystems.com/irislatest/csp/docbook/DocBook.UI.Page.cls?KEY=GSEC" rel="noopener noreferrer"&gt;Security Administration Guide&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;a href="https://docs.intersystems.com/irislatest/csp/docbook/DocBook.UI.Page.cls?KEY=ASECMGT" rel="noopener noreferrer"&gt;Security Management Portal&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;By leveraging IRIS's built-in security framework, organizations can create a security model that follows the principle of least privilege, significantly reducing the risk of unauthorized access or privilege escalation.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;5.4 Data Encryption&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;IRIS database files are encrypted at rest using AWS EBS encryption with customer-managed KMS keys:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&lt;span class="hljs-attr"&gt;kind:&lt;/span&gt; &lt;span class="hljs-string"&gt;StorageClass&lt;/span&gt;
&lt;span class="hljs-attr"&gt;apiVersion:&lt;/span&gt; &lt;span class="hljs-string"&gt;storage.k8s.io/v1&lt;/span&gt;
&lt;span class="hljs-attr"&gt;metadata:&lt;/span&gt;
&lt;span class="hljs-attr"&gt;  name:&lt;/span&gt; &lt;span class="hljs-string"&gt;iris-ssd-storageclass&lt;/span&gt;
&lt;span class="hljs-attr"&gt;provisioner:&lt;/span&gt; &lt;span class="hljs-string"&gt;kubernetes.io/aws-ebs&lt;/span&gt;
&lt;span class="hljs-attr"&gt;parameters:&lt;/span&gt;
&lt;span class="hljs-attr"&gt;  type:&lt;/span&gt; &lt;span class="hljs-string"&gt;gp3&lt;/span&gt;
&lt;span class="hljs-attr"&gt;  encrypted:&lt;/span&gt; &lt;span class="hljs-string"&gt;"true"&lt;/span&gt;
&lt;span class="hljs-attr"&gt;volumeBindingMode:&lt;/span&gt; &lt;span class="hljs-string"&gt;WaitForFirstConsumer&lt;/span&gt;
&lt;span class="hljs-attr"&gt;allowVolumeExpansion:&lt;/span&gt; &lt;span class="hljs-literal"&gt;true&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;
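
&lt;p&gt;&lt;span&gt;Note that this StorageClass enables encryption but does not pin a specific key; to use the customer-managed KMS key mentioned above, it can be referenced explicitly in the parameters (the ARN is a placeholder):&lt;/span&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;parameters:
  type: gp3
  encrypted: "true"
  # Placeholder ARN; substitute your customer-managed KMS key
  kmsKeyId: arn:aws:kms:us-east-1:111122223333:key/REPLACE-ME&lt;/code&gt;&lt;/pre&gt;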

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;The EKS cluster is configured to use encrypted EBS volumes for all persistent storage, ensuring that data at rest is protected with AES-256 encryption.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;5.5 Backup and Disaster Recovery&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;To protect against data loss and ensure business continuity, our architecture implements:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Journal Mirroring&lt;/b&gt;: IRIS journals are stored on separate volumes and mirrored&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Automated Backups&lt;/b&gt;: Regular backups to encrypted S3 buckets&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Cross-AZ Replication&lt;/b&gt;: Critical data is replicated to a secondary AWS AZ&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
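
&lt;p&gt;&lt;span&gt;A backup upload consistent with this approach might look as follows; the bucket, file, and key ID are placeholders:&lt;/span&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Upload an IRIS backup file to S3 with server-side KMS encryption
# (bucket, file, and key ID are placeholders)
aws s3 cp /backups/FullAllDatabases.cbk s3://example-iris-backups/ \
  --sse aws:kms \
  --sse-kms-key-id REPLACE-WITH-KEY-ID&lt;/code&gt;&lt;/pre&gt;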

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;This approach ensures that even in case of a catastrophic failure or security incident, data can be recovered with minimal loss.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Implementation Guide&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;To implement this multi-layered security architecture for your own IRIS deployments on AWS, follow these high-level steps:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Step 1: Set Up AWS Infrastructure&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Create a VPC with private and public subnets across multiple availability zones&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Set up NAT gateways for outbound connectivity from private subnets&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Create security groups with appropriate ingress and egress rules&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Deploy an EKS cluster in the private subnets&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
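
&lt;p&gt;&lt;span&gt;As one way to carry out Step 1, eksctl can provision the VPC, subnets, NAT gateway, and EKS cluster in a single command; the name, region, and zones below are placeholders:&lt;/span&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Placeholder name/region/zones; eksctl creates the VPC and subnets for you
eksctl create cluster \
  --name iris-cluster \
  --region us-east-1 \
  --zones us-east-1a,us-east-1b,us-east-1c \
  --node-private-networking \
  --nodes 3&lt;/code&gt;&lt;/pre&gt;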

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Step 2: Configure AWS Security Services&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Create an AWS WAF Web ACL with appropriate rule sets&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Set up CloudFront distribution with WAF association&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Configure AWS ALB for Kubernetes Ingress&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Step 3: Deploy InterSystems IAM&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Create necessary Kubernetes secrets for certificates and credentials&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Deploy the IAM StatefulSet using the IrisCluster operator&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Configure IAM security plugins (rate limiting, session management, request validation)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
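
&lt;p&gt;&lt;span&gt;The secrets in Step 3 can be created with kubectl; the secret name and file below are illustrative placeholders rather than names mandated by IKO:&lt;/span&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# Illustrative placeholder names
kubectl create secret generic iam-credentials \
  --from-file=iris.key=./iris.key \
  --namespace example-namespace&lt;/code&gt;&lt;/pre&gt;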

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Step 4: Deploy and Secure IRIS Cluster&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Create an IrisCluster custom resource with security configurations&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Configure TLS for all communication&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Deploy the Web Gateway with hardened configuration&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Set up database mirroring and ECP security&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
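&lt;p&gt;As a rough sketch, the IrisCluster custom resource from Step 4 might look like the following. The field names shown here (&lt;code&gt;licenseKeySecret&lt;/code&gt;, &lt;code&gt;tls&lt;/code&gt;, &lt;code&gt;topology&lt;/code&gt;, and so on) are illustrative and vary by InterSystems Kubernetes Operator (IKO) version, so treat this as a starting point and check the operator's documentation before applying it.&lt;/p&gt;

```yaml
apiVersion: intersystems.com/v1alpha1
kind: IrisCluster
metadata:
  name: iris-secure                # hypothetical cluster name
spec:
  licenseKeySecret:
    name: iris-key-secret          # hypothetical secret holding iris.key
  tls:                             # TLS for all communication (Step 4.2)
    common:
      secret:
        secretName: iris-tls-certs # hypothetical TLS certificate secret
  topology:
    data:
      replicas: 2
      mirrored: true               # database mirroring (Step 4.4)
    webgateway:
      replicas: 2                  # hardened Web Gateway tier (Step 4.3)
      type: apache
```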

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Step 5: Implement Monitoring and Logging&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Configure centralized logging with ElasticSearch&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Set up security monitoring with Datadog&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Implement alerting for security events&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Enable IRIS audit logging&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
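&lt;p&gt;As an illustration of item 4 above, IRIS audit logging can also be switched on programmatically from the &lt;code&gt;%SYS&lt;/code&gt; namespace. This is a sketch against the &lt;code&gt;Security.System&lt;/code&gt; API; verify the exact property name against the class reference for your IRIS version before relying on it.&lt;/p&gt;

```
 ; run in the %SYS namespace
 New $NAMESPACE
 Set $NAMESPACE = "%SYS"
 Set sc = ##class(Security.System).Get(,.props)    ; read current system security settings
 Set props("AuditEnabled") = 1                     ; assumption: property name per Security.System
 Set sc = ##class(Security.System).Modify(,.props) ; write the change back
```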

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Monitoring and Incident Response&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;A robust security architecture must include continuous monitoring and incident response capabilities. Our implementation includes:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;6.1 Security Monitoring&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;The architecture includes comprehensive monitoring using Datadog and ElasticSearch:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Real-time Log Analysis&lt;/b&gt;: All components send logs to a centralized ElasticSearch cluster&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Security Dashboards&lt;/b&gt;: Datadog dashboards visualize security metrics and anomalies&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Automated Alerting&lt;/b&gt;: Alerts are generated for suspicious activities or security violations&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;6.2 Incident Response&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;A defined incident response process ensures timely reaction to security events:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Detection&lt;/b&gt;: Automated detection of security incidents through monitoring&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Classification&lt;/b&gt;: Incidents are classified by severity and type&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Containment&lt;/b&gt;: Procedures to contain incidents, including automated responses&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Eradication&lt;/b&gt;: Steps to eliminate the threat and restore security&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Recovery&lt;/b&gt;: Procedures for restoring normal operations&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Lessons Learned&lt;/b&gt;: Post-incident analysis to improve security posture&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Performance Considerations&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Implementing multiple security layers can impact performance. Our architecture addresses this through:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;7.1 Caching Strategies&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;CloudFront Caching&lt;/b&gt;: Static content is cached at edge locations&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;API Gateway Caching&lt;/b&gt;: IAM implements response caching for appropriate endpoints&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Web Gateway Caching&lt;/b&gt;: CSP pages are cached when possible&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;7.2 Load Balancing&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Multi-AZ Deployment&lt;/b&gt;: Services are distributed across availability zones&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Horizontal Scaling&lt;/b&gt;: Components can scale horizontally based on load&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Affinity Settings&lt;/b&gt;: Pod anti-affinity ensures proper distribution&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;7.3 Performance Metrics&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;During our implementation, we observed the following performance impacts:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Latency&lt;/b&gt;: Average request latency increased by only 20-30 ms with all security layers enabled&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Throughput&lt;/b&gt;: The system handles over 2,000 requests per second with all security measures in place&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Resource Usage&lt;/b&gt;: The additional security components increased CPU usage by approximately 15%&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;These metrics demonstrate that a robust security architecture can be implemented without significant performance degradation.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Conclusion&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;The multi-layered security architecture described in this article provides comprehensive protection for InterSystems IRIS deployments on AWS. By implementing security controls at every layer—from the network perimeter to the database—we create a defense-in-depth strategy that significantly reduces the risk of successful attacks.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Key benefits of this approach include:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Comprehensive Protection&lt;/b&gt;: Multiple layers provide protection against a wide range of threats&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Defense in Depth&lt;/b&gt;: If one security control fails, others remain in place&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Scalability&lt;/b&gt;: The architecture scales horizontally to handle increased load&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Manageability&lt;/b&gt;: Infrastructure as Code approach makes security controls reproducible and versionable&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;Compliance&lt;/b&gt;: The architecture helps meet regulatory requirements for data protection&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;By leveraging AWS security services, InterSystems IAM, and secure IRIS configurations, organizations can build secure, high-performance applications while protecting sensitive data from evolving threats.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;b&gt;References&lt;/b&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;InterSystems Documentation: &lt;a href="https://docs.intersystems.com/irislatest/csp/docbook/DocBook.UI.Page.cls?KEY=PAGE_security" rel="noopener noreferrer"&gt;IRIS Security Guide&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;AWS Security Best Practices: &lt;a href="https://docs.aws.amazon.com/wellarchitected/latest/security-pillar/welcome.html" rel="noopener noreferrer"&gt;AWS Security Pillar&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Kubernetes Security: &lt;a href="https://aws.github.io/aws-eks-best-practices/security/docs/" rel="noopener noreferrer"&gt;EKS Best Practices Guide&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;OWASP API Security: &lt;a href="https://owasp.org/www-project-api-security/" rel="noopener noreferrer"&gt;Top 10 API Security Risks&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;InterSystems Container Registry: &lt;a href="https://containers.intersystems.com/" rel="noopener noreferrer"&gt;containers.intersystems.com&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;


</description>
      <category>security</category>
      <category>kubernetes</category>
      <category>cloud</category>
      <category>beginners</category>
    </item>
    <item>
      <title>How to run a process on an interval or schedule?</title>
      <dc:creator>InterSystems Developer</dc:creator>
      <pubDate>Wed, 25 Feb 2026 16:40:59 +0000</pubDate>
      <link>https://dev.to/intersystems/how-to-run-a-process-on-an-interval-or-schedule-32jk</link>
      <guid>https://dev.to/intersystems/how-to-run-a-process-on-an-interval-or-schedule-32jk</guid>
      <description>&lt;p&gt;When I started my journey with InterSystems IRIS, especially in Interoperability, one of the initial and common questions I had was: how can I run something on an interval or schedule? In this topic, I want to share two simple classes that address this issue. I'm surprised that some similar classes are not located somewhere in &lt;code&gt;EnsLib&lt;/code&gt;. Or maybe I didn't search well? Anyway, this topic is not meant to be complex work, just a couple of snippets for beginners.&lt;/p&gt;

&lt;p&gt;So let's assume we have a task &lt;em&gt;"Take some data from an API and put it into an external database"&lt;/em&gt;. To solve this task, we need:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;Ens.BusinessProcess&lt;/code&gt;, which contains an algorithm of our data flow: How to prepare a request for taking data, how to transform the API response to a request for DB, how to handle errors and other events through the data flow lifecycle&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;EnsLib.REST.Operation&lt;/code&gt; for making HTTP requests to the API using &lt;code&gt;EnsLib.HTTP.OutboundAdapter&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Ens.BusinessOperation&lt;/code&gt; with &lt;code&gt;EnsLib.SQL.OutboundAdapter&lt;/code&gt; for putting data into the external database via a JDBC connection&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The implementation details of these business hosts lie outside the scope of this article, so let's say we already have a process and two operations. But how do we run it all? The process can be triggered only by an inbound request... We need an &lt;strong&gt;Initiator&lt;/strong&gt;: a business service that runs on an interval and sends a dummy request to our process.&lt;/p&gt;

&lt;p&gt;Here is such an initiator class. I added a bit of extra functionality: a choice between sync and async calls, and whether or not to stop on the first error when there are several target hosts. The main setting, though, is the target list: a request is sent to each item (business host) on it. Pay attention to the &lt;code&gt;OnGetConnections&lt;/code&gt; callback: it is needed to build the connection links correctly in the Production UI.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&lt;span class="hljs-comment"&gt;/// Call targets by interval&lt;/span&gt;
&lt;span class="hljs-keyword"&gt;Class&lt;/span&gt; Util.Service.IntervalCall &lt;span class="hljs-keyword"&gt;Extends&lt;/span&gt; Ens.BusinessService
{

&lt;span class="hljs-comment"&gt;/// List of targets to call&lt;/span&gt;
&lt;span class="hljs-keyword"&gt;Property&lt;/span&gt; TargetConfigNames &lt;span class="hljs-keyword"&gt;As&lt;/span&gt; Ens.DataType.ConfigName&lt;span class="hljs-comment"&gt;;&lt;/span&gt;

&lt;span class="hljs-comment"&gt;/// If true, calls are made asynchronously (SendRequestAsync)&lt;/span&gt;
&lt;span class="hljs-keyword"&gt;Property&lt;/span&gt; AsyncCall &lt;span class="hljs-keyword"&gt;As&lt;/span&gt; &lt;span class="hljs-built_in"&gt;%Boolean&lt;/span&gt;&lt;span class="hljs-comment"&gt;;&lt;/span&gt;

&lt;span class="hljs-comment"&gt;/// If true, and the target list contains more than one target, the process will stop after the first error&lt;/span&gt;
&lt;span class="hljs-keyword"&gt;Property&lt;/span&gt; BreakOnError &lt;span class="hljs-keyword"&gt;As&lt;/span&gt; &lt;span class="hljs-built_in"&gt;%Boolean&lt;/span&gt; [ InitialExpression = &lt;span class="hljs-number"&gt;1&lt;/span&gt; ]&lt;span class="hljs-comment"&gt;;&lt;/span&gt;

&lt;span class="hljs-keyword"&gt;Property&lt;/span&gt; Adapter &lt;span class="hljs-keyword"&gt;As&lt;/span&gt; Ens.InboundAdapter&lt;span class="hljs-comment"&gt;;&lt;/span&gt;

&lt;span class="hljs-keyword"&gt;Parameter&lt;/span&gt; ADAPTER = &lt;span class="hljs-string"&gt;"Ens.InboundAdapter"&lt;/span&gt;&lt;span class="hljs-comment"&gt;;&lt;/span&gt;

&lt;span class="hljs-keyword"&gt;Parameter&lt;/span&gt; SETTINGS = &lt;span class="hljs-string"&gt;"TargetConfigNames:Basic:selector?multiSelect=1&amp;amp;context={Ens.ContextSearch/ProductionItems?targets=1&amp;amp;productionName=@productionId},AsyncCall,BreakOnError"&lt;/span&gt;&lt;span class="hljs-comment"&gt;;&lt;/span&gt;

Method OnProcessInput(pInput &lt;span class="hljs-keyword"&gt;As&lt;/span&gt; &lt;span class="hljs-built_in"&gt;%RegisteredObject&lt;/span&gt;, Output pOutput &lt;span class="hljs-keyword"&gt;As&lt;/span&gt; &lt;span class="hljs-built_in"&gt;%RegisteredObject&lt;/span&gt;, ByRef pHint &lt;span class="hljs-keyword"&gt;As&lt;/span&gt; &lt;span class="hljs-built_in"&gt;%String&lt;/span&gt;) &lt;span class="hljs-keyword"&gt;As&lt;/span&gt; &lt;span class="hljs-built_in"&gt;%Status&lt;/span&gt;
{
    &lt;span class="hljs-keyword"&gt;Set&lt;/span&gt; tSC = &lt;span class="hljs-built_in"&gt;$$$OK&lt;/span&gt;
    &lt;span class="hljs-keyword"&gt;Set&lt;/span&gt; targets = &lt;span class="hljs-built_in"&gt;$LISTFROMSTRING&lt;/span&gt;(&lt;span class="hljs-built_in"&gt;..TargetConfigNames&lt;/span&gt;)

    &lt;span class="hljs-keyword"&gt;Quit&lt;/span&gt;:&lt;span class="hljs-built_in"&gt;$LISTLENGTH&lt;/span&gt;(targets)=&lt;span class="hljs-number"&gt;0&lt;/span&gt; &lt;span class="hljs-built_in"&gt;$$$ERROR&lt;/span&gt;(&lt;span class="hljs-built_in"&gt;$$$GeneralError&lt;/span&gt;, &lt;span class="hljs-string"&gt;"TargetConfigNames are not defined"&lt;/span&gt;)

    &lt;span class="hljs-keyword"&gt;For&lt;/span&gt; i=&lt;span class="hljs-number"&gt;1&lt;/span&gt;:&lt;span class="hljs-number"&gt;1&lt;/span&gt;:&lt;span class="hljs-built_in"&gt;$LISTLENGTH&lt;/span&gt;(targets) {
        &lt;span class="hljs-keyword"&gt;Set&lt;/span&gt; target = &lt;span class="hljs-built_in"&gt;$LISTGET&lt;/span&gt;(targets, i)
        &lt;span class="hljs-keyword"&gt;Set&lt;/span&gt; pRequest = &lt;span class="hljs-keyword"&gt;##class&lt;/span&gt;(Ens.Request).&lt;span class="hljs-built_in"&gt;%New&lt;/span&gt;()

        &lt;span class="hljs-keyword"&gt;If&lt;/span&gt; &lt;span class="hljs-built_in"&gt;..AsyncCall&lt;/span&gt; {
            &lt;span class="hljs-keyword"&gt;Set&lt;/span&gt; tSC = &lt;span class="hljs-built_in"&gt;..SendRequestAsync&lt;/span&gt;(target, pRequest)
        } &lt;span class="hljs-keyword"&gt;Else&lt;/span&gt;  {
            &lt;span class="hljs-keyword"&gt;Set&lt;/span&gt; tSC = &lt;span class="hljs-built_in"&gt;..SendRequestSync&lt;/span&gt;(target, pRequest, .pResponse)
        }
        &lt;span class="hljs-keyword"&gt;Quit&lt;/span&gt;:(&lt;span class="hljs-built_in"&gt;$$$ISERR&lt;/span&gt;(tSC)&amp;amp;&amp;amp;&lt;span class="hljs-built_in"&gt;..BreakOnError&lt;/span&gt;)
    }

    &lt;span class="hljs-keyword"&gt;Quit&lt;/span&gt; tSC
}

&lt;span class="hljs-keyword"&gt;ClassMethod&lt;/span&gt; OnGetConnections(Output pArray &lt;span class="hljs-keyword"&gt;As&lt;/span&gt; &lt;span class="hljs-built_in"&gt;%String&lt;/span&gt;, pItem &lt;span class="hljs-keyword"&gt;As&lt;/span&gt; Ens.Config.Item)
{
    &lt;span class="hljs-keyword"&gt;If&lt;/span&gt; pItem.GetModifiedSetting(&lt;span class="hljs-string"&gt;"TargetConfigNames"&lt;/span&gt;, .tValue) {
        &lt;span class="hljs-keyword"&gt;Set&lt;/span&gt; targets = &lt;span class="hljs-built_in"&gt;$LISTFROMSTRING&lt;/span&gt;(tValue)
        &lt;span class="hljs-keyword"&gt;For&lt;/span&gt; i=&lt;span class="hljs-number"&gt;1&lt;/span&gt;:&lt;span class="hljs-number"&gt;1&lt;/span&gt;:&lt;span class="hljs-built_in"&gt;$LISTLENGTH&lt;/span&gt;(targets) &lt;span class="hljs-keyword"&gt;Set&lt;/span&gt; pArray(&lt;span class="hljs-built_in"&gt;$LISTGET&lt;/span&gt;(targets, i)) = &lt;span class="hljs-string"&gt;""&lt;/span&gt;
    }
}

}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;After that, you just need to add this class to the production and select our business process in the &lt;code&gt;TargetConfigNames&lt;/code&gt; setting. &lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;But what if the requirements change, and now we need to run our data grabber every Monday at 08:00 AM? The best way to do this is the &lt;a href="https://docs.intersystems.com/irislatest/csp/docbook/DocBook.UI.Page.cls?KEY=GSA_manage_taskmgr" rel="noopener noreferrer"&gt;Task Manager&lt;/a&gt;. For this, we need to create a custom task that runs our Initiator programmatically. Here is a simple class for this task:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;&lt;span class="hljs-comment"&gt;/// Launch selected business service on schedule&lt;/span&gt;
&lt;span class="hljs-keyword"&gt;Class&lt;/span&gt; Util.Task.ScheduleCall &lt;span class="hljs-keyword"&gt;Extends&lt;/span&gt; &lt;span class="hljs-built_in"&gt;%SYS.Task.Definition&lt;/span&gt;
{

&lt;span class="hljs-keyword"&gt;Parameter&lt;/span&gt; TaskName = &lt;span class="hljs-string"&gt;"Launch On Schedule"&lt;/span&gt;&lt;span class="hljs-comment"&gt;;&lt;/span&gt;

&lt;span class="hljs-comment"&gt;/// Business Service to launch&lt;/span&gt;
&lt;span class="hljs-keyword"&gt;Property&lt;/span&gt; ServiceName &lt;span class="hljs-keyword"&gt;As&lt;/span&gt; Ens.DataType.ConfigName&lt;span class="hljs-comment"&gt;;&lt;/span&gt;

Method OnTask() &lt;span class="hljs-keyword"&gt;As&lt;/span&gt; &lt;span class="hljs-built_in"&gt;%Status&lt;/span&gt;
{
    &lt;span class="hljs-keyword"&gt;#dim&lt;/span&gt; tService &lt;span class="hljs-keyword"&gt;As&lt;/span&gt; Ens.BusinessService
    &lt;span class="hljs-keyword"&gt;Set&lt;/span&gt; tSC = &lt;span class="hljs-keyword"&gt;##class&lt;/span&gt;(Ens.Director).CreateBusinessService(&lt;span class="hljs-built_in"&gt;..ServiceName&lt;/span&gt;, .tService)
    &lt;span class="hljs-keyword"&gt;Quit&lt;/span&gt;:&lt;span class="hljs-built_in"&gt;$$$ISERR&lt;/span&gt;(tSC) tSC
    
    &lt;span class="hljs-keyword"&gt;Set&lt;/span&gt; pRequest = &lt;span class="hljs-keyword"&gt;##class&lt;/span&gt;(Ens.Request).&lt;span class="hljs-built_in"&gt;%New&lt;/span&gt;()
    &lt;span class="hljs-keyword"&gt;Quit&lt;/span&gt; tService.ProcessInput(pRequest, .pResponse)
}

}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;Two important things here:&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;You must set the &lt;code&gt;Pool Size&lt;/code&gt; of the Initiator business service to 0 to prevent it from running on the call interval (the &lt;code&gt;Call Interval&lt;/code&gt; setting can be cleared or left as is; it is ignored when &lt;code&gt;Pool Size&lt;/code&gt; is 0)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd0ublikngrwkpnk24nmc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd0ublikngrwkpnk24nmc.png" alt=" " width="300" height="136"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;You need to create a task in the &lt;code&gt;Task Manager&lt;/code&gt;: choose "&lt;code&gt;Launch On Schedule&lt;/code&gt;" as the task type (don't forget to pick the correct namespace), set our Initiator business service name in the &lt;code&gt;ServiceName&lt;/code&gt; property, and set up the desired schedule. See: &lt;code&gt;System Operation &amp;gt; Task Manager &amp;gt; New Task&lt;/code&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/li&gt;
&lt;/ul&gt;
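&lt;p&gt;For those who prefer code over clicking, the same task can also be created programmatically with the &lt;code&gt;%SYS.Task&lt;/code&gt; API. A minimal sketch follows; the &lt;code&gt;Settings&lt;/code&gt; encoding as a name/value list is my assumption, and the schedule itself is easiest to configure in the portal, so verify against the class reference.&lt;/p&gt;

```
 Set task = ##class(%SYS.Task).%New()
 Set task.Name = "Run data grabber"              ; hypothetical task name
 Set task.NameSpace = "USER"                     ; assumption: the namespace of your production
 Set task.TaskClass = "Util.Task.ScheduleCall"   ; the custom task class above
 Set task.Settings = $LB("ServiceName", "Util.Service.IntervalCall")  ; assumption: name/value %List
 Set sc = task.%Save()                           ; then set the schedule in the Task Manager UI
```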


&lt;h3&gt;And a bonus&lt;/h3&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;I have often faced cases where we need to run something in a production only on demand. Of course, we could create a custom CSP UI for it, but reinventing the wheel is not our way. I believe it is better to use the standard UI of the Management Portal. So, the same task that we created previously can be run manually: just change the task's run type to &lt;code&gt;On Demand&lt;/code&gt;. The on-demand task list is available at &lt;code&gt;System &amp;gt; Task Manager &amp;gt; On-demand Tasks&lt;/code&gt;; see the &lt;code&gt;Run&lt;/code&gt; button. Furthermore, the &lt;code&gt;Run&lt;/code&gt; button (manual run) is available for any kind of task.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;&lt;span&gt;That's all. Now we have a neat interoperability architecture for our business hosts, and three ways to run our data grabber: on an interval, on a schedule, or manually.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;

</description>
      <category>tutorial</category>
      <category>coding</category>
      <category>codesnippet</category>
      <category>programming</category>
    </item>
    <item>
      <title>Namespaces and databases - basics of inner workings of InterSystems IRIS</title>
      <dc:creator>InterSystems Developer</dc:creator>
      <pubDate>Wed, 25 Feb 2026 16:32:06 +0000</pubDate>
      <link>https://dev.to/intersystems/namespaces-and-databases-basics-of-inner-workings-of-intersystems-iris-a79</link>
      <guid>https://dev.to/intersystems/namespaces-and-databases-basics-of-inner-workings-of-intersystems-iris-a79</guid>
      <description>&lt;p&gt;InterSystems IRIS is built on an architecture that separates the logical organization of data (namespaces) from its physical storage location (databases). Understanding this separation and the distinction between Namespaces and Databases is crucial for effective data management, security, and especially, high-performance data sharing.&lt;/p&gt;

&lt;p&gt;In this article, I will discuss these foundational components and provide a practical guide on leveraging global mappings to share native data structures (globals) across different logical environments.&lt;/p&gt;

&lt;h3&gt;Databases: Physical Reality&lt;/h3&gt;

&lt;p&gt;A database represents the physical reality of where the data is stored on disk. First and foremost, it is a file in the file system named IRIS.DAT (e.g., &amp;lt;Install folder&amp;gt;\mgr\user\IRIS.DAT). The maximum size of this file is 32 TB. It is the container for all the actual data and code. Databases are managed by the IRIS kernel, which handles caching, journaling, and transaction logging at the physical file level.&lt;/p&gt;

&lt;p&gt;When you install InterSystems IRIS DBMS, the following databases are installed automatically:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3io2uu8xqjq6f512a7zb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3io2uu8xqjq6f512a7zb.png" alt=" " width="462" height="178"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When you’re working in a production environment, it’s strongly recommended that you create a separate dedicated database or several to store your data and code.&lt;/p&gt;

&lt;p&gt;To create a new database, go to System &amp;gt; Configuration &amp;gt; Local Databases &amp;gt; Create New Database and provide a name and a directory where it should be stored:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fackpheej4xx3yoqk5k91.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fackpheej4xx3yoqk5k91.png" alt=" " width="800" height="223"&gt;&lt;/a&gt;&lt;/p&gt;
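&lt;p&gt;The same can be scripted. Below is a sketch using the &lt;code&gt;SYS.Database&lt;/code&gt; and &lt;code&gt;Config.Databases&lt;/code&gt; APIs in the &lt;code&gt;%SYS&lt;/code&gt; namespace; the database name and directory are placeholders, and the exact signatures should be checked against the class reference for your IRIS version.&lt;/p&gt;

```
 New $NAMESPACE
 Set $NAMESPACE = "%SYS"
 ; create the physical IRIS.DAT file in the given directory (placeholder path)
 Set sc = ##class(SYS.Database).CreateDatabase("C:\InterSystems\IRIS\mgr\clients\")
 ; register it in the configuration under a logical name (placeholder name)
 Set props("Directory") = "C:\InterSystems\IRIS\mgr\clients\"
 Set sc = ##class(Config.Databases).Create("CLIENTS", .props)
```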

&lt;h3&gt;Namespaces: Logical Sandbox&lt;/h3&gt;

&lt;p&gt;A namespace in InterSystems IRIS represents a self-contained, logical work environment. It is analogous to a schema in a relational database or a project workspace. It defines the scope of all application elements: persistent classes (objects), routines (code), and most importantly, globals (native data structures). Moreover, applications running in one namespace are logically isolated from those in another. This prevents conflicts between different applications or development environments (e.g., Development, Staging, Production). From the developer's perspective, everything, data and code, resides within the confines of the namespace to which they are connected. &lt;strong&gt;Each IRIS application process runs in one namespace.&lt;/strong&gt; If you want to access data or code in another namespace, you need to switch to it explicitly.&lt;/p&gt;
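&lt;p&gt;The explicit switch mentioned above is a one-liner in ObjectScript, via the &lt;code&gt;$NAMESPACE&lt;/code&gt; special variable. &lt;code&gt;New&lt;/code&gt;-ing it first makes the switch revert automatically when the current stack frame exits:&lt;/p&gt;

```
 New $NAMESPACE           ; restore the previous namespace when this frame exits
 Set $NAMESPACE = "USER"  ; from here on, globals and routines resolve in USER
 Write $NAMESPACE,!       ; prints the current namespace
```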

&lt;p&gt;When you install InterSystems IRIS DBMS, the following namespaces are set up automatically:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;%SYS&lt;/li&gt;
&lt;li&gt;USER&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;Mapping Logic to Physics&lt;/h3&gt;

&lt;p&gt;The link between a Namespace and a Database is established through a Mapping. Every Namespace has a defined set of mappings that specify which physical database(s) should be used to store its elements.&lt;/p&gt;

&lt;p&gt;For example, we have several databases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Clients – contains data about clients&lt;/li&gt;
&lt;li&gt;Finances – contains financial data that allows working with both clients and vendors&lt;/li&gt;
&lt;li&gt;Vendors – contains data about vendors&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;and two namespaces:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Billing – maps to Clients, plus some or all of the data from Finances, so that applications working in this namespace have all the information they need to deal with clients&lt;/li&gt;
&lt;li&gt;Purchasing – maps to Vendors, plus some or all of the data from Finances, so that applications working in this namespace have all the information they need to deal with vendors&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcbhkll4wh8hxj189bw5u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcbhkll4wh8hxj189bw5u.png" alt=" " width="800" height="251"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At the same time, each namespace has a default database for data and code.&lt;/p&gt;

&lt;p&gt;In this example, it could be:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl2yo9fbddsjgq1rbcf8w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl2yo9fbddsjgq1rbcf8w.png" alt=" " width="756" height="357"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You set up which database should be the default for the namespace when you create a namespace (System &amp;gt; Configuration &amp;gt; Namespaces &amp;gt; New Namespace):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhqkgadwcjvxjq1x2l7la.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhqkgadwcjvxjq1x2l7la.png" alt=" " width="800" height="458"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The database for temporary storage is IRISTEMP by default.&lt;/p&gt;

&lt;p&gt;The database for Globals stores data, while the database for Routines stores code and class descriptions.&lt;/p&gt;

&lt;p&gt;If you wish to add more mappings to data to other databases, go to System &amp;gt; Configuration &amp;gt; Namespaces &amp;gt; (Choose the namespace) &amp;gt; Global Mappings &amp;gt; New and add a new mapping:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe4l33m33ellim9rxeqcn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe4l33m33ellim9rxeqcn.png" alt=" " width="635" height="267"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As you can see, you can set up access in minute detail, down to the subscripts of particular globals.&lt;/p&gt;

&lt;p&gt;The same can be done for routine mappings.&lt;/p&gt;

&lt;p&gt;Apart from user-defined mappings, there are also system mappings. System-level code and data (like system class definitions, routines, and system-specific globals starting with ^%) are mapped to system databases (e.g., IRISLIB, IRISSYS). This ensures that application data doesn't interfere with the core system components.&lt;/p&gt;
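&lt;p&gt;To make the mapping idea concrete, here is a tiny Python model of how a global reference could be resolved to a database. This is purely conceptual: the names (&lt;code&gt;resolve_database&lt;/code&gt;, the &lt;code&gt;billing&lt;/code&gt; dictionary) are illustrative, and the real IRIS resolution also handles subscript-level ranges and routine mappings:&lt;/p&gt;

```python
# Conceptual model of namespace-to-database mapping resolution.
# Illustration only, NOT the actual IRIS algorithm: real mappings
# also support subscript ranges and routine mappings.
def resolve_database(namespace_config, global_name):
    # System globals (names starting with "%") live in a system database.
    if global_name.startswith("%"):
        return "IRISSYS"
    # An explicit global mapping wins over the namespace default.
    mapped = namespace_config["global_mappings"].get(global_name)
    if mapped is not None:
        return mapped
    return namespace_config["default_globals_db"]

billing = {
    "default_globals_db": "Clients",
    "global_mappings": {"Invoices": "Finances"},
}

print(resolve_database(billing, "Customers"))  # Clients (namespace default)
print(resolve_database(billing, "Invoices"))   # Finances (explicit mapping)
print(resolve_database(billing, "%SYS"))       # IRISSYS (system mapping)
```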




&lt;p&gt;By separating the logical Namespace from the physical Database, and by using mappings, you can gain fine-grained control over data location, security, and performance.&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>healthshare</category>
      <category>programming</category>
      <category>architecture</category>
    </item>
    <item>
      <title>AI Agents from Scratch Part 2: Giving the Brain a Body</title>
      <dc:creator>InterSystems Developer</dc:creator>
      <pubDate>Wed, 18 Feb 2026 17:33:34 +0000</pubDate>
      <link>https://dev.to/intersystems/ai-agents-from-scratch-part-2-giving-the-brain-a-body-3p4e</link>
      <guid>https://dev.to/intersystems/ai-agents-from-scratch-part-2-giving-the-brain-a-body-3p4e</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fca1vrlyb1kps0yv19pg2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fca1vrlyb1kps0yv19pg2.png" alt=" " width="800" height="345"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In &lt;a href="https://community.intersystems.com/post/ai-agents-scratch-part-1-forging-brain" rel="noopener noreferrer"&gt;&lt;strong&gt;Part 1&lt;/strong&gt;&lt;/a&gt;, we laid the technical foundation of &lt;strong&gt;MAIS&lt;/strong&gt; (Multi-Agent Interoperability Systems). We wired up the 'Brain': we built a robust Adapter using LiteLLM, locked down our API keys with IRIS Credentials, and finally cracked the tricky Python interoperability puzzle.&lt;/p&gt;

&lt;p&gt;However, right now our system is merely a raw pipe to an LLM. It processes text, but it lacks identity.&lt;/p&gt;

&lt;p&gt;Today, in Part 2, we will define the &lt;strong&gt;Anatomy of an Agent&lt;/strong&gt;. We will move from simple API calls to structured Personas. We will learn how to wrap the LLM in a layer of business logic, giving it a name, a role, and, most importantly, the ability to know its neighbors.&lt;/p&gt;

&lt;p&gt;Let’s build the "Soul" of our machine.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;The Anatomy of an Agent: More Than Just a Prompt&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Now that we have a connection to the "Brain" (the LLM), we need to grant it a personality. A common misconception is that an Agent is simply a system prompt, e.g., "You are a helpful assistant." That’s just a chatbot.&lt;/p&gt;

&lt;p&gt;True Agentic AI stands out because it does not require a babysitter. It combines autonomy with a serious drive to complete the job. It looks ahead, e.g., verifying inventory before booking a sale, and if it runs into a roadblock, it figures out a workaround instead of just giving up.&lt;/p&gt;

&lt;p&gt;To encapsulate this complexity within IRIS, I developed the &lt;code&gt;dc.mais.adapter.Agent&lt;/code&gt; class. It acts as a "Persona Definition," effectively wrapping the raw LLM within strict operational boundaries.&lt;/p&gt;

&lt;p&gt;Every agent is built on a specific configuration set. We always start with the &lt;strong&gt;Name&lt;/strong&gt; and &lt;strong&gt;Role&lt;/strong&gt; to establish a unique identifier and expertise domain (e.g., a "French Cuisine Expert"). To prevent hallucinations or scope creep, we set a hard &lt;strong&gt;Goal&lt;/strong&gt; and a detailed checklist of &lt;strong&gt;Tasks&lt;/strong&gt;. We also enforce communication standards via &lt;strong&gt;OutputInstructions&lt;/strong&gt;, telling the agent to be concise or avoid specific characters, and provide the &lt;strong&gt;Tools&lt;/strong&gt; (JSON definitions) it is authorized to execute.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Why "Target" Matters&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;No matter how you look at it, the most critical component in this setup is the &lt;strong&gt;Target&lt;/strong&gt;. This property enables what I call a &lt;strong&gt;Decentralized Handoff&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Instead of routing everything through the central supervisor, the &lt;code&gt;Target&lt;/code&gt; property provides a comma-separated list of valid next steps in the chain. For example, a &lt;em&gt;MenuExpert&lt;/em&gt; agent knows its job is to help choose food. Yet, thanks to the &lt;code&gt;Target&lt;/code&gt; property, it also understands that once the user says "I want the bill," it &lt;em&gt;must&lt;/em&gt; pass the ball to the &lt;em&gt;CashierAgent&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;It creates a "reasoning engine" where the LLM comprehends its own boundaries: "I am the food expert, but I am &lt;em&gt;not&lt;/em&gt; allowed to process payments. I need to call the &lt;em&gt;Cashier&lt;/em&gt;."&lt;/p&gt;

&lt;p&gt;Below, you can see how the class looks so far:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Class dc.mais.adapter.Agent Extends Ens.OutboundAdapter
{

/// Controls which properties are visible in Production settings
Parameter SETTINGS = "Name:Basic,Role:Basic,Goal:Basic,Tasks:Basic:textarea?rows=5&amp;amp;cols=50,OutputInstructions:Basic:textarea?rows=5&amp;amp;cols=50,Tools:Basic:textarea?rows=5&amp;amp;cols=50,Target:Basic,Model:Basic,MaxIterations,Verbose";

/// Unique identifier for the agent
Property Name As %String;

/// Primary objective the agent is designed to achieve
Property Goal As %String(MAXLEN = 100);

/// Description of the agent's function and expertise. i.e. "This assistant is knowledgeable, helpful, and suggests follow-up questions."
Property Role As %String(MAXLEN = 350);

/// Guidelines for how the agent should format and present responses
Property OutputInstructions As %String(MAXLEN = 1000);

/// Ordered list of responsibilities and actions the agent must perform
Property Tasks As %String(MAXLEN = 1000);

/// List of callable functions available to the agent
Property Tools As %String(MAXLEN = 10000);

/// Name of the next agents allowed (Comma-separated, e.g., "OrderTaker,OrderSubmitter")
Property Target As %String(MAXLEN = 1000);

// Comma-separated or JSON for extensibility

/// LLM model name (allows different agents to use different models)
Property Model As %String;

/// Maximum number of tool-calling iterations before stopping
Property MaxIterations As %Integer [ InitialExpression = 3 ];

}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This layering is the secret sauce. It elevates us from a generic "I hope the AI understands" approach to a structured "Plan, Assign, and Monitor" system. Essentially, we are wrapping the raw unpredictability of an LLM in a safety blanket of business logic.&lt;/p&gt;

&lt;p&gt;Still, having these properties in a database class is not sufficient. We also need to translate these strict configurations into natural-language instructions that the LLM respects.&lt;/p&gt;

&lt;p&gt;This is where &lt;strong&gt;Dynamic Prompt Engineering&lt;/strong&gt; enters the picture.&lt;/p&gt;

&lt;p&gt;I implemented a method called &lt;code&gt;GetAgentInstructions&lt;/code&gt; that acts as a factory for the agent's personality. It does not simply concatenate strings; it constructs the whole mental model for the AI layer by layer.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;The "Knowledge of Neighbors" (Handoff)&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Pay attention to the logic inside the &lt;code&gt;If (..Target '= "")&lt;/code&gt; block since it is the glue that holds the network together. In fact, we are telling the agent exactly who its neighbors are.&lt;/p&gt;

&lt;p&gt;This functions as an "Allow List." It prevents the &lt;em&gt;MenuExpert&lt;/em&gt; from attempting to transfer a customer to a non-existent &lt;em&gt;ParkingAttendant&lt;/em&gt;. It enforces the business process flow at the prompt level. While the actual transfer mechanism belongs to the Orchestrator (which we will cover soon), the &lt;em&gt;awareness&lt;/em&gt; of the transfer starts here.&lt;/p&gt;
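&lt;p&gt;The allow-list check itself amounts to a few lines of logic. Here is a Python sketch of the idea; &lt;code&gt;validate_handoff&lt;/code&gt; and the agent names are illustrative, not part of MAIS:&lt;/p&gt;

```python
# Sketch of the "Allow List" idea behind the Target property.
# The helper name and agent names are illustrative, not MAIS code.
def validate_handoff(target_setting, requested_agent):
    """Accept a handoff only if the requested agent appears in the
    comma-separated Target list configured for this agent."""
    allowed = [name.strip() for name in target_setting.split(",") if name.strip()]
    return requested_agent in allowed

menu_expert_target = "CashierAgent,Greeter"

print(validate_handoff(menu_expert_target, "CashierAgent"))      # True
print(validate_handoff(menu_expert_target, "ParkingAttendant"))  # False
```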

&lt;h3&gt;
  
  
  &lt;strong&gt;Defensive Prompting&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;You should also notice the section on &lt;strong&gt;Tool Usage Guidelines&lt;/strong&gt;. We explicitly command the model: &lt;em&gt;"Do NOT guess or invent data."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;It is defensive programming applied to English. We are pre-emptively stopping the model from hallucinating a menu or faking an order confirmation and forcing it to utilize the native tools we provide.&lt;/p&gt;

&lt;p&gt;Check out the implementation below:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Method GetAgentInstructions(Output oPrompt As %String) As %String
{
    Set tSC = $$$OK
    Set oPrompt = "" // Ensures it's not null
    Try {
        Set prompt = "You are "_..Name_", a specialized agent."_$C(10)
        Set:(..Role '= "") prompt = prompt_"## Your Role: "_..Role_$C(10)
        Set:(..Goal '= "") prompt = prompt_"## Your Goal: "_..Goal_$C(10)
        Set:(..Tasks '= "") prompt = prompt_"## Your Tasks: "_..Tasks_$C(10)
        Set:(..OutputInstructions'= "") prompt = prompt_"## Output Instructions: "_..OutputInstructions_$C(10)

        // --- Handoff Logic: Introducing the neighbors ---
        If (..Target '= "") {
            Set prompt = prompt_"## Handoff Capabilities:"_$C(10)
            Set prompt = prompt_"- You can transfer the conversation ONLY to the following agents: "_..Target_$C(10)
            Set prompt = prompt_"- Use the 'handoff_to_agent' tool with one of these exact names."_$C(10)
        }

        // --- Defensive Prompting for Tools ---
        If (..Tools '= "") {
            Set prompt = prompt_"## Tool Usage Guidelines:"_$C(10)
            Set prompt = prompt_"- You have access to functions (tools) to get real data."_$C(10)
            Set prompt = prompt_"- You MUST call the function natively when needed."_$C(10)
            Set prompt = prompt_"- Do NOT guess or invent data. Use the function."_$C(10)
            Set prompt = prompt_"- NEVER write the function call JSON in the response text. Just trigger the function."_$C(10)
        }

        Set prompt = prompt_"# Remember: You are part of a multi-agent system."
        Set oPrompt = prompt

    } Catch ex {
        Set tSC=ex.AsStatus()
        $$$LOGERROR("Error generating instructions: "_ex.DisplayString())
    }
    Return tSC
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now that our agents have their orders and recognize their neighbors, we need a Commander to ensure they actually stick to the script. So, let’s enter the &lt;strong&gt;Orchestrator&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;The Nervous System: Orchestrating the Bistro Crew&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;It is time for us to move to the nervous system: &lt;strong&gt;The Orchestrator&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;I find it significantly easier to grasp the Orchestrator by creating a tangible project rather than discussing abstract theory. So, let’s head to our &lt;code&gt;dc.samples&lt;/code&gt; package and build a "Bistro Crew" to validate our framework.&lt;/p&gt;

&lt;p&gt;The concept is straightforward: we will establish a team of attendants for a small bistro. We will need a &lt;strong&gt;Greeter&lt;/strong&gt; to welcome guests and a &lt;strong&gt;Menu Expert&lt;/strong&gt; to handle the culinary details.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;1. Hiring the Staff (Configuring Agents)&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Since we designed our &lt;code&gt;dc.mais.operation.Agent&lt;/code&gt; class for reusability, we do not need to write new code for these agents. We should simply add them to the Production and configure the settings.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffwsrfikagrugi0knscg0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffwsrfikagrugi0knscg0.png" alt=" " width="800" height="499"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Setting up the crew: Adding a reusable Agent Operation to the Production. No new code required, just configuration.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Let’s add the first one, &lt;code&gt;Agent.Greeter&lt;/code&gt;. In the &lt;strong&gt;Basic Parameters&lt;/strong&gt;, we should define its soul:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Name: Greeter
Role: Welcome customers and provide initial menu information
Goal: Make customers feel welcomed and guide them to the appropriate specialist
Tasks: 
- Welcome customers with a warm, professional greeting
- Provide brief overview of restaurant specialties
- Identify customer needs (menu info, ordering, or general questions)
- Handoff to MenuExpert when customer wants detailed menu information
OutputInstructions: 
- Greet customers warmly and professionally
- Keep responses concise and inviting (2-3 sentences max)
- Always end with a question to engage the customer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd6jvfxa6muoflv8zww8u.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd6jvfxa6muoflv8zww8u.gif" alt="noice" width="500" height="281"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We will return to the &lt;em&gt;MenuExpert&lt;/em&gt; shortly. First, let’s make sure they have a brain. Just as we did for the Agent, I created a generic Business Operation &lt;code&gt;dc.mais.operation.LLM&lt;/code&gt; using the adapter we built earlier. Simply add it to the production, and we are ready to go.&lt;/p&gt;

&lt;p&gt;Great!!&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;2. Building the Flow (The BPL)&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Now, let's create a Business Process named &lt;code&gt;Orchestrator&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This process requires a Request message containing the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Sender:&lt;/strong&gt; The name of the agent sending the message (if any).
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Assignee:&lt;/strong&gt; The specific agent we want to target.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Content:&lt;/strong&gt; The user’s interaction text.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For the response, a simple &lt;code&gt;Content&lt;/code&gt; property to hold the agent's reply is sufficient.&lt;/p&gt;

&lt;p&gt;I prefer using &lt;strong&gt;Context Variables&lt;/strong&gt; to keep the state clean. So, the first thing we do in the BPL is assign the incoming Request properties to the Context.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The "Cold Start" Logic:&lt;/strong&gt; If this is the very first execution, the &lt;code&gt;Assignee&lt;/code&gt; will be empty. We need to decide who initiates the conversation. For that reason, I added a simple &lt;code&gt;If&lt;/code&gt; condition: if &lt;code&gt;Assignee&lt;/code&gt; is empty, set the Target to &lt;code&gt;'greeter'&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Routing:&lt;/strong&gt; At this point, we trace the route using a &lt;strong&gt;Switch&lt;/strong&gt; based on the Target.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Case &lt;code&gt;'greeter'&lt;/code&gt;: Call &lt;code&gt;Agent.Greeter&lt;/code&gt;.
&lt;/li&gt;
&lt;li&gt;Case &lt;code&gt;'menu_expert'&lt;/code&gt;: Call &lt;code&gt;Agent.MenuExpert&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Important:&lt;/strong&gt; We are making synchronous calls here (disable the &lt;code&gt;Async&lt;/code&gt; flag). Why? Because we are not asking the Agent to &lt;em&gt;answer&lt;/em&gt; the user yet. We are requesting the Agent Operation to return its &lt;strong&gt;System Prompt&lt;/strong&gt; (its personality).&lt;/p&gt;

&lt;p&gt;Remember to save this result in &lt;code&gt;context.CurrentSystemPrompt&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Synapse:&lt;/strong&gt; Finally, after the route is determined and we have the correct prompt, we can call the &lt;strong&gt;LLM&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Request.Content:&lt;/strong&gt; &lt;code&gt;context.CurrentSystemPrompt&lt;/code&gt; (The Rules)
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Request.UserContent:&lt;/strong&gt; The actual text from the user.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fonmmbulexz4jpqt0xzeh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fonmmbulexz4jpqt0xzeh.png" alt=" " width="800" height="765"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;With this simple flow — Router -&amp;gt; Get Persona -&amp;gt; Call LLM — we can run our first test.&lt;/p&gt;

&lt;p&gt;I sent a "Hello" to the process, and...&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Et voilà!&lt;/em&gt; The Greeter responded with a warm, professional welcome, exactly as configured. It is alive!&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Giving Agents Hands: Tools and the ReAct Loop&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Let’s get back to our crew. We left off with the &lt;em&gt;Greeter&lt;/em&gt;, who is charming but, frankly, a bit useless when it comes to the actual food.&lt;/p&gt;

&lt;p&gt;Enter the &lt;strong&gt;MenuExpert&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The primary distinction between this agent and the Greeter is that the MenuExpert does not merely rely on its training data (which hallucinates prices). It requires access to real-time data. It needs a &lt;strong&gt;Tool&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;To keep things simple for this example, I created a standard Business Operation called &lt;code&gt;Tool.GetMenu&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;At this stage, in the &lt;strong&gt;MenuExpert&lt;/strong&gt; configuration settings, under &lt;code&gt;Tools&lt;/code&gt;, we should paste the function definition that follows the standard OpenAI JSON schema:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"function"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"function"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"get_menu"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Get the full bistro menu"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"parameters"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"object"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"properties"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"required"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It acts as the API documentation for the brain. We are basically telling the LLM: &lt;em&gt;"If you need the menu, there is a function called &lt;code&gt;get_menu&lt;/code&gt; that takes no arguments. Use it."&lt;/em&gt;&lt;/p&gt;
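&lt;p&gt;On the IRIS side, the same JSON can double as a checklist: every tool the LLM is offered should map to something we can actually execute. The Python sketch below shows the idea; the &lt;code&gt;REGISTRY&lt;/code&gt; dictionary and the &lt;code&gt;get_menu&lt;/code&gt; stub are illustrative assumptions, not MAIS code:&lt;/p&gt;

```python
import json

# Sketch: turning the tool JSON (the "API documentation for the brain")
# into a name-to-callable registry. REGISTRY and get_menu are illustrative.
TOOLS_SETTING = """
[{"type": "function",
  "function": {"name": "get_menu",
               "description": "Get the full bistro menu",
               "parameters": {"type": "object", "properties": {}, "required": []}}}]
"""

def get_menu():
    # Stand-in for the Tool.GetMenu Business Operation.
    return {"Coq au Vin": 28.00}

REGISTRY = {"get_menu": get_menu}

# Sanity check: never offer the LLM a tool we cannot execute.
for tool in json.loads(TOOLS_SETTING):
    name = tool["function"]["name"]
    assert name in REGISTRY, "LLM offered a tool we cannot execute: " + name

print(REGISTRY["get_menu"]())  # {'Coq au Vin': 28.0}
```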

&lt;h3&gt;
  
  
  &lt;strong&gt;A Quick Break: The ReAct Paradigm&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Prior to implementing the wiring, it is worth understanding the theory behind it. The entire system hinges on a framework known as ReAct (short for Reason plus Act). Introduced in a &lt;a href="https://arxiv.org/abs/2210.03629" rel="noopener noreferrer"&gt;2022 paper&lt;/a&gt;, it fundamentally changed the way we build AI agents.&lt;/p&gt;

&lt;p&gt;Before ReAct, LLMs were exclusively text completion engines. With ReAct, we force the model to alternate between &lt;strong&gt;verbal reasoning&lt;/strong&gt; (Thinking) and &lt;strong&gt;actions&lt;/strong&gt; (Tools). In a nutshell, it looks similar to the following internal monologue:&lt;/p&gt;

&lt;p&gt;Thought: The user wants the price of Coq au Vin. I don't know it.&lt;br&gt;&lt;br&gt;
Action: get_menu()&lt;br&gt;&lt;br&gt;
Observation: {"Coq au Vin": 28.00}&lt;br&gt;&lt;br&gt;
Thought: I have the price. I can answer now.&lt;br&gt;&lt;br&gt;
Final Answer: "The Coq au Vin costs $28.00."&lt;/p&gt;

&lt;p&gt;Think of &lt;strong&gt;Memory&lt;/strong&gt; as &lt;em&gt;"What I remember"&lt;/em&gt; (context history) and &lt;strong&gt;ReAct&lt;/strong&gt; as &lt;em&gt;"How I solve problems"&lt;/em&gt; (the loop of reasoning and acting). Without ReAct, the agent is just a passive observer. With it, it becomes an active problem solver.&lt;/p&gt;
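&lt;p&gt;To make the loop tangible, here is a minimal Python sketch of the Thought, Action, Observation cycle. The model's turns are hard-coded (a real agent would get them from the LLM), and the iteration budget mirrors the &lt;code&gt;MaxIterations&lt;/code&gt; property from earlier:&lt;/p&gt;

```python
# Minimal ReAct loop sketch with a scripted "model". A real agent asks
# the LLM for each thought/action; here they are hard-coded to show the
# shape of the cycle.
def get_menu():
    return {"Coq au Vin": 28.00}

TOOLS = {"get_menu": get_menu}

# Scripted model turns: (thought, action) pairs; action None means "answer".
SCRIPT = [
    ("The user wants the price of Coq au Vin. I don't know it.", "get_menu"),
    ("I have the price. I can answer now.", None),
]

def react_loop(question, max_iterations=3):
    observation = None
    for thought, action in SCRIPT[:max_iterations]:
        if action is None:
            price = observation["Coq au Vin"]
            return "The Coq au Vin costs $%.2f." % price
        observation = TOOLS[action]()  # Act, then observe the result.
    return "Iteration budget exhausted."

print(react_loop("How much is the Coq au Vin?"))  # The Coq au Vin costs $28.00.
```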

&lt;h2&gt;
  
  
  &lt;strong&gt;The Missing Piece: The Nervous System&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;We have covered considerable ground. We have a secure connection to the LLM (Part 1). We have also defined our Agents with strict Roles, Goals, and Tools, and explored the ReAct theory that drives their reasoning (Part 2).&lt;/p&gt;

&lt;p&gt;However, after reviewing our code, we have identified a problem: &lt;strong&gt;It is all static.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We have the &lt;em&gt;definitions&lt;/em&gt; of the &lt;code&gt;MenuExpert&lt;/code&gt; and the &lt;code&gt;get_menu&lt;/code&gt; tool, but nothing connects them. There is no loop to catch the tool call, execute the SQL, and feed the result back to the brain. There is also no mechanism to handle the Handoff when the agent says, "I need help."&lt;/p&gt;

&lt;p&gt;We have the talent (the Actors) and the instructions (the Script), but they have nowhere to perform: there is no Stage yet.&lt;/p&gt;

&lt;p&gt;In the final &lt;strong&gt;Part 3&lt;/strong&gt;, we will construct the &lt;strong&gt;Nervous System&lt;/strong&gt;. We will do the following:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Implement the &lt;strong&gt;Orchestrator&lt;/strong&gt; using InterSystems BPL.
&lt;/li&gt;
&lt;li&gt;Build the &lt;strong&gt;Double Loop Architecture&lt;/strong&gt; to manage autonomous lifecycles.
&lt;/li&gt;
&lt;li&gt;Execute the tools and handle the "Handoff" signal dynamically.
&lt;/li&gt;
&lt;li&gt;Utilize &lt;strong&gt;Visual Tracing&lt;/strong&gt; to watch our agents thinking in real time.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Get your coffee ready since the next part will bring it all to life!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>framework</category>
      <category>llm</category>
      <category>programming</category>
    </item>
    <item>
      <title>AI Agents from Scratch Part 1: Forging the Brain</title>
      <dc:creator>InterSystems Developer</dc:creator>
      <pubDate>Wed, 18 Feb 2026 17:31:19 +0000</pubDate>
      <link>https://dev.to/intersystems/ai-agents-from-scratch-part-1-forging-the-brain-4b83</link>
      <guid>https://dev.to/intersystems/ai-agents-from-scratch-part-1-forging-the-brain-4b83</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6liine8rcpdusbedscd9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6liine8rcpdusbedscd9.png" alt=" " width="800" height="345"&gt;&lt;/a&gt;&lt;br&gt;
Some concepts make perfect sense on paper, whereas others require you to get your hands dirty. &lt;br&gt;
Take driving, for example. You can memorize every component of the engine mechanics, but that does not mean you can actually drive. &lt;/p&gt;

&lt;p&gt;You cannot truly grasp it until you are in the driver's seat, physically feeling the friction point of the clutch and the vibration of the road beneath. &lt;br&gt;
While some computing concepts are intuitive, Intelligent Agents are different. To understand them, you have to get in the driver's seat.&lt;/p&gt;

&lt;p&gt;In my previous articles on AI agents, we discussed tools such as &lt;a href="https://community.intersystems.com/post/command-crew" rel="noopener noreferrer"&gt;CrewAI&lt;/a&gt; and &lt;a href="https://community.intersystems.com/post/building-ai-agents-zero-hero" rel="noopener noreferrer"&gt;LangGraph&lt;/a&gt;. In this guide, however, we are going to build an AI agent micro-framework from scratch. Writing an agent goes beyond mere syntax; it is a journey every developer should undertake to try and solve real-world problems.&lt;/p&gt;

&lt;p&gt;Still, beyond the experience itself, there is another fundamental reason to do this, best summarized by &lt;strong&gt;Richard Feynman&lt;/strong&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"What I cannot create, I do not understand."&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2&gt;
  
  
  &lt;strong&gt;So… What Is an AI Agent?&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Let's be specific. An agent is essentially code that pursues a goal. It does not just chat. It “reads the room” and executes various tasks, ranging from sorting emails to managing complex schedules.&lt;/p&gt;

&lt;p&gt;Unlike rigid scripts, agents possess agency. Conventional scripts break the moment reality deviates from hard-coded rules. Agents do not. They adapt. If a flight is cancelled, they do not crash with an error; they simply reroute.&lt;/p&gt;

&lt;p&gt;I like visualizing the architecture as a biological system:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The Hands: The Tools. Without execution environments or APIs, the brain is trapped in a jar.
&lt;/li&gt;
&lt;li&gt;The Nervous System: Your orchestration layer. It manages state and logs memory.
&lt;/li&gt;
&lt;li&gt;The Body: The deployment infrastructure that ensures reliability and uptime.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A single agent might be impressive, but &lt;strong&gt;Agentic AI&lt;/strong&gt; is where the real power lies. &lt;br&gt;
It is a whole system in which multiple specialized agents collaborate to achieve a shared objective.&lt;/p&gt;

&lt;p&gt;It is essentially a digital agency: you have one agent conducting research, another one drafting a copy, and a 'manager' node ensuring no one steps on each other's toes.&lt;/p&gt;

&lt;p&gt;Still, honestly, theory can only get us so far, but I am itching to actually build this thing. &lt;br&gt;
Let’s get our hands dirty. I have named this project MAIS, which serves a dual purpose: technically, it stands for Multi-Agent Interoperability Systems. However, in Portuguese, it simply means 'plus.' It is a nod to our constant search for extra capabilities.&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;strong&gt;The Brain: Agnostic Intelligence with LiteLLM&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;To power our agents, we need flexibility, but hardcoding a specific provider like OpenAI limits us. What if we want to test Gemini 3.0? What if a client prefers to run Llama 3 locally via Ollama?&lt;/p&gt;

&lt;p&gt;To accomplish that, I prefer to rely on a library that has become a staple in &lt;strong&gt;The Musketeers&lt;/strong&gt; projects: &lt;strong&gt;LiteLLM&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The beauty of LiteLLM lies in its standardization. It acts as a universal adapter, normalizing requests and responses across more than 100 providers. &lt;br&gt;
This abstraction is crucial for a Multi-Agent System because it allows us to mix and match models based on the agent's specific needs. &lt;br&gt;
Let’s imagine the following scenario:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The first agent uses a fast, cost-effective model (e.g., &lt;code&gt;gpt-4o-mini&lt;/code&gt;).
&lt;/li&gt;
&lt;li&gt;The second agent utilizes a model with high reasoning capabilities and a large context window (e.g., &lt;code&gt;claude-3-5-sonnet&lt;/code&gt;) to analyze complex data.&lt;/li&gt;
&lt;/ul&gt;
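&lt;p&gt;As a rough stand-alone sketch (the &lt;code&gt;AGENT_MODELS&lt;/code&gt; mapping and &lt;code&gt;resolve_model&lt;/code&gt; helper here are hypothetical illustrations, not part of MAIS), this per-agent mix-and-match boils down to plain configuration:&lt;/p&gt;

```python
# Hypothetical sketch: map each agent to a LiteLLM model string.
# Swapping providers is a configuration change; no agent code changes.
AGENT_MODELS = {
    "researcher": "gpt-4o-mini",     # fast, cost-effective
    "analyst": "claude-3-5-sonnet",  # strong reasoning, large context window
}

def resolve_model(agent_name, default="gpt-4o-mini"):
    """Return the configured model for an agent, falling back to a default."""
    return AGENT_MODELS.get(agent_name, default)
```

&lt;p&gt;Calling &lt;code&gt;resolve_model("analyst")&lt;/code&gt; yields the Claude model string, while an unknown agent falls back to the default; the resulting string is exactly what LiteLLM expects in its &lt;code&gt;model&lt;/code&gt; parameter.&lt;/p&gt;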

&lt;p&gt;With our architecture, we can define which model an agent will work with simply by changing a string in the settings. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security First: Handling API Keys&lt;/strong&gt; &lt;/p&gt;

&lt;p&gt;Connecting to these providers requires API Keys, and we certainly do not want to hardcode secrets in our source code. &lt;br&gt;
The "InterSystems’ way" to handle this is via &lt;strong&gt;Production Credentials&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;To ensure our keys remain protected, the &lt;code&gt;LLM Adapter&lt;/code&gt; acts as a bridge to IRIS's secure credential storage. &lt;br&gt;
We utilize a property named &lt;code&gt;APIKeysConfig&lt;/code&gt; to manage the handover. &lt;br&gt;
You should populate it with the provider-specific key names required by LiteLLM (&lt;strong&gt;e.g., OPENAI_API_KEY, AZURE_API_KEY&lt;/strong&gt;), separated by commas.&lt;/p&gt;

&lt;p&gt;When the adapter is initialized, it pulls the actual secrets from secure storage and assigns them as environment variables, allowing LiteLLM to authenticate without ever exposing raw keys in the code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Method OnInit() As %Status
{
    Set tSC = $$$OK
    Try {
        Do ..Initialize()
    } Catch e {
        Set tSC = e.AsStatus()
    }
    Quit tSC
}

/// Configure API Keys in Python Environment
Method Initialize() [ Language = python ]
{
    import iris
    import os
    # Each configured name doubles as the IRIS credential ID and the
    # environment variable LiteLLM expects (e.g. OPENAI_API_KEY)
    for tKeyName in self.APIKeysConfig.split(','):
        tKeyName = tKeyName.strip()
        credential = iris.cls("Ens.Config.Credentials")._OpenId(tKeyName)
        if not credential:
            continue
        os.environ[tKeyName] = credential.PasswordGet()
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
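&lt;p&gt;The same parse-and-export logic can be exercised outside IRIS by swapping the credential store for any lookup callable; in this sketch, &lt;code&gt;load_api_keys&lt;/code&gt; and the &lt;code&gt;lookup&lt;/code&gt; parameter are hypothetical stand-ins for the adapter method and &lt;code&gt;Ens.Config.Credentials&lt;/code&gt;:&lt;/p&gt;

```python
import os

def load_api_keys(api_keys_config, lookup):
    """Parse a comma-separated list of key names
    (e.g. "OPENAI_API_KEY, AZURE_API_KEY"), fetch each secret via
    `lookup`, and export it as an environment variable."""
    for name in api_keys_config.split(","):
        name = name.strip()
        secret = lookup(name)  # stand-in for the IRIS credential store
        if secret:
            os.environ[name] = secret
```

&lt;p&gt;Names with no matching credential are silently skipped, mirroring the &lt;code&gt;continue&lt;/code&gt; in the adapter above.&lt;/p&gt;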



&lt;p&gt;Now that the security layer is in place, let’s focus on the core reasoning of our Adapter. &lt;br&gt;
This is where we define which model to call and how to structure the message. &lt;br&gt;
The Adapter has a configuration property that determines the default model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/// Default model to use if not specified in request
Property DefaultModel As %String [ InitialExpression = "gpt-4o-mini" ]; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;However, the magic happens at runtime. The input message &lt;code&gt;dc.mais.messages.LLMRequest&lt;/code&gt; has an optional &lt;code&gt;Model&lt;/code&gt; property. If the orchestrator (BPL) sends this property filled in, the Adapter respects the dynamic choice. &lt;br&gt;
Otherwise, it falls back to &lt;code&gt;DefaultModel&lt;/code&gt;.&lt;/p&gt;
&lt;h3&gt;
  
  
  &lt;strong&gt;Separating Instructions from Input&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Another important design decision was how we send text to the LLM. Instead of sending just a raw string, I split the concept into two fields in the request:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Content:&lt;/strong&gt; This is where the “System Prompt” or the current Agent’s instructions go (e.g., “You are a waiter who is an expert in wines…”).
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;UserContent:&lt;/strong&gt; This is where the user’s actual input goes (e.g., “Which wine pairs well with fish?”).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;It allows us to build a clean messages array for LiteLLM, ensuring that the AI can clearly distinguish its &lt;em&gt;persona&lt;/em&gt; from the user’s &lt;em&gt;question&lt;/em&gt;.&lt;/p&gt;
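&lt;p&gt;As a minimal stand-alone sketch of that assembly (the &lt;code&gt;build_messages&lt;/code&gt; helper is my illustration, not part of the adapter), the split maps directly onto the roles LiteLLM understands:&lt;/p&gt;

```python
def build_messages(content, user_content):
    """Assemble the LiteLLM messages array: the agent's instructions
    (Content) as the system prompt, then the user's actual input
    (UserContent) as a user message."""
    messages = [{"role": "system", "content": content}]
    if user_content:
        messages.append({"role": "user", "content": user_content})
    return messages
```

&lt;p&gt;For the waiter example, &lt;code&gt;build_messages("You are a waiter who is an expert in wines.", "Which wine pairs well with fish?")&lt;/code&gt; produces a two-element array with the persona in the system slot and the question in the user slot.&lt;/p&gt;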

&lt;p&gt;Here is how the main &lt;code&gt;CallLiteLLM&lt;/code&gt; method assembles this puzzle using Python directly within IRIS:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
Method CallLiteLLM(pRequest As dc.mais.messages.LLMRequest) As dc.mais.messages.LLMResponse [ Language = python ]
{
    import litellm
    import json
    import time
    import iris

    t_attempt = 0
    max_retries = self.MaxRetries
    retry_delay = self.RetryDelay

    last_error = None
    pResponse = iris.cls("dc.mais.messages.LLMResponse")._New()

    while t_attempt &amp;lt;= max_retries:
        t_attempt += 1

        try:
            model = pRequest.Model
            if not model:
                model = self.GetDefaultModel()

            # Content carries the agent's instructions, so send it as the system prompt
            messages = [{"role": "system", "content": pRequest.Content}]

            if pRequest.UserContent:
                messages.append({"role": "user", "content": pRequest.UserContent})

            response = litellm.completion(model=model, messages=messages)

            pResponse.Model = response.model

            choices_list = []
            if hasattr(response, 'choices'):
                for choice in response.choices:
                    if hasattr(choice, 'model_dump'):
                        choices_list.append(choice.model_dump())
                    elif hasattr(choice, 'dict'):
                        choices_list.append(choice.dict())
                    else:
                        choices_list.append(dict(choice))

            pResponse.Choices = json.dumps(choices_list)
            if (len(response.choices) &amp;gt; 0 ):
                pResponse.Content = response.choices[0].message.content

            if hasattr(response, 'usage'):
                if hasattr(response.usage, 'model_dump'):
                    pResponse.Usage = json.dumps(response.usage.model_dump())
                else:
                    pResponse.Usage = json.dumps(dict(response.usage))

            if hasattr(response, 'error') and response.error:
                pResponse.Error = json.dumps(dict(response.error))

            return pResponse

        except Exception as e:
            last_error = str(e)

            class_name = "dc.mais.adapter.LiteLLM"
            iris.cls("Ens.Util.Log").LogError(class_name, "CallLiteLLM", f"LiteLLM call attempt {t_attempt} failed: {last_error}")

            if t_attempt &amp;gt; max_retries:
                break

            time.sleep(retry_delay)

    error_payload = {
        "message": "All LiteLLM call attempts failed",
        "details": last_error
    }
    pResponse.Error = json.dumps(error_payload)

    return pResponse
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
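&lt;p&gt;Stripped of the LLM specifics, the retry scaffolding in that method reduces to a generic pattern; this &lt;code&gt;call_with_retries&lt;/code&gt; helper is an illustrative reduction, not adapter code:&lt;/p&gt;

```python
import time

def call_with_retries(fn, max_retries, retry_delay):
    """Minimal sketch of the adapter's retry loop: try up to
    max_retries + 1 times, sleeping between failed attempts,
    and surface the last error if every attempt fails."""
    last_error = None
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception as e:
            last_error = str(e)
            if attempt != max_retries:
                time.sleep(retry_delay)
    raise RuntimeError(f"All attempts failed: {last_error}")
```

&lt;p&gt;The adapter follows the same shape but, instead of raising, packages the failure into the &lt;code&gt;Error&lt;/code&gt; property of the response message so the production can route it.&lt;/p&gt;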



&lt;p&gt;For those who prefer the classic syntax, I also included an ObjectScript version of the same method:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Method CallLiteLLMObjectScript(pRequest As dc.mais.messages.LLMRequest, Output pResponse As dc.mais.messages.LLMResponse) As %Status
{
    Set tSC = $$$OK
    Set tAttempt = 0
    Set pResponse = ##class(dc.mais.messages.LLMResponse).%New()

    While tAttempt &amp;lt;= ..MaxRetries {
        Set tAttempt = tAttempt + 1

        Try {
            Set model = $Select(pRequest.Model '= "": pRequest.Model, 1: ..GetDefaultModel())

            // Prepare History
            Set jsonHistory = [].%FromJSON(pRequest.History)
            Set:(jsonHistory="") jsonHistory = []

            // Inject Tool Output if present (Close the Loop logic)
            If (pRequest.ToolCallId '= "") &amp;amp;&amp;amp; (pRequest.ToolOutput '= "") {
                Set tToolMsg = {}
                Set tToolMsg.role = "tool"
                Set tToolMsg.content = pRequest.ToolOutput
                Set tToolMsg."tool_call_id" = pRequest.ToolCallId
                Do jsonHistory.%Push(tToolMsg)
            }

            // Add current user content/prompt only if not empty
            If (pRequest.Content '= "") {
                Do jsonHistory.%Push({"role": "user", "content": (pRequest.Content)})
            }

            // Add extra user content field if present
            If (pRequest.UserContent'="") {
                Do jsonHistory.%Push({"role": "user", "content": (pRequest.UserContent)})
            }

            Set strMessages = jsonHistory.%ToJSON()

            // Call Python Helper
            Set tResponse = ..PyCompletion(model, strMessages, pRequest.Parameter, 1)

            // Map Response
            Set pResponse.Model = tResponse.model

            // Convert Python choices to IRIS DynamicArray
            Set choices = []
            For i=0:1:tResponse.choices."__len__"()-1 {
                Do choices.%Push({}.%FromJSON(tResponse.choices."__getitem__"(i)."to_json"()))
            }
            Set pResponse.Choices = choices.%ToJSON()

            // Process the last choice
            If (choices.%Size()&amp;gt;0) {
                Set choice = choices.%Get(choices.%Size()-1)

                If ($IsObject(choice.message)){
                    Set pResponse.Content = choice.message.content

                    // Extract Tool Calls
                    // Check if 'tool_calls' exists and is a valid Object (DynamicArray)
                    Set tToolCalls = choice.message."tool_calls"

                    // Verify it is an Object (Array) and not empty string
                    If $IsObject(tToolCalls) {
                        Do ..GetToolCalls(tToolCalls, .pResponse)
                    }
                }
            }

            // Map Usage
            If ..hasattr(tResponse, "usage") {
                Set pResponse.Usage = {}.%FromJSON(tResponse.usage."to_json"()).%ToJSON()
            }

            // Success: exit the method (a bare Quit here would only leave the Try block and re-enter the While loop)
            Return tSC

        } Catch e {
            Set tSC = e.AsStatus()
            $$$LOGERROR("LiteLLM call attempt "_tAttempt_" failed: "_$System.Status.GetOneErrorText(tSC))
            If tAttempt &amp;gt; ..MaxRetries Quit
            Hang ..RetryDelay
        }
    }

    If ($$$ISERR(tSC)) {
        Set pResponse.Error = {
            "message": "All LiteLLM call attempts failed",
            "details": ($System.Status.GetOneErrorText(tSC))
        }.%ToJSON()
    }
    Quit tSC
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;You might have noticed the call to &lt;code&gt;..PyCompletion(...)&lt;/code&gt; inside the ObjectScript version of this logic. &lt;br&gt;
It is not a standard system method, but a custom helper designed to handle data marshaling between the two languages. &lt;br&gt;
While IRIS allows direct calls to Python, passing complex nested structures (e.g., lists of objects containing specific data types) can sometimes require manual conversion.&lt;br&gt;&lt;br&gt;
The PyCompletion method acts as a translation layer. It accepts the data from ObjectScript as serialized JSON strings. Then it deserializes them into native Python dictionaries and lists (using json.loads) inside the Python environment. Finally, it executes the &lt;code&gt;LiteLLM&lt;/code&gt; request.&lt;br&gt;&lt;br&gt;
This "Hybrid" approach keeps our ObjectScript code clean and readable, focusing purely on business logic (looping, history management), while offloading the heavy lifting of data type conversion and library interaction to a small, dedicated Python wrapper.&lt;/p&gt;
&lt;/blockquote&gt;
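&lt;p&gt;To make the marshaling concrete, here is a hypothetical sketch of the translation half of that helper; the &lt;code&gt;py_completion_args&lt;/code&gt; name and shape are my illustration (the real &lt;code&gt;PyCompletion&lt;/code&gt; would also perform the LiteLLM call with the resulting keyword arguments):&lt;/p&gt;

```python
import json

def py_completion_args(model, messages_json, params_json):
    """Deserialize the JSON strings handed over from ObjectScript into
    native Python structures, producing the keyword arguments that
    would be passed to litellm.completion(...)."""
    messages = json.loads(messages_json)
    params = json.loads(params_json) if params_json else {}
    # The real helper would now call:
    #   litellm.completion(model=model, messages=messages, **params)
    return {"model": model, "messages": messages, **params}
```

&lt;p&gt;ObjectScript only ever hands over plain strings; all the type conversion happens on the Python side with &lt;code&gt;json.loads&lt;/code&gt;, which is exactly the clean boundary the blockquote describes.&lt;/p&gt;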

&lt;p&gt;This simple structure gives us tremendous control. While BPL can swap out the brain of the operation (the Model) or the personality (Content) dynamically at each step of the flow, the Adapter takes care of the technical “plumbing.”&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;The Stage Is Set, but It Is Empty&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;We have covered a lot of ground so far. We have built a secure, provider-agnostic bridge to the LLM using Python and LiteLLM. We have solved the tricky interoperability issues with &lt;code&gt;**kwargs&lt;/code&gt; and established a secure way to handle credentials with IRIS's native credential storage.&lt;/p&gt;

&lt;p&gt;However, if you look closely, you will see that we have a beautiful car with a powerful engine, but &lt;strong&gt;no driver&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;We have established the link to the 'Brain,' but it lacks a defined persona. We can invoke GPT, but without specific instructions, it does not know whether it should act as a helpful Greeter or a technical Support Engineer. It is currently just a stateless processor devoid of memory, lacking a goal and disconnected from any tools.&lt;/p&gt;

&lt;p&gt;In &lt;strong&gt;Part 2&lt;/strong&gt;, we will give this brain a Soul. We will do the following:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Build the &lt;code&gt;dc.mais.adapter.Agent&lt;/code&gt; class to define personas.
&lt;/li&gt;
&lt;li&gt;Master &lt;strong&gt;Dynamic Prompt Engineering&lt;/strong&gt; to enforce business rules.
&lt;/li&gt;
&lt;li&gt;Implement the &lt;strong&gt;"Allow List"&lt;/strong&gt; logic for agent-to-agent communication.
&lt;/li&gt;
&lt;li&gt;Dive into the &lt;strong&gt;ReAct Paradigm&lt;/strong&gt; theory that makes agents truly smart.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Did I overcomplicate the adapter? Do you have a cleaner way to handle the environment variables? &lt;br&gt;
If so, or if you spot a flaw in my logic before we get to Part 2, call it out in the comments below! I am writing this to learn from you as much as to share.&lt;/p&gt;

&lt;p&gt;Acknowledgments: A special thanks to my fellow &lt;strong&gt;Musketeer&lt;/strong&gt;, &lt;a href="https://community.intersystems.com/user/jos%C3%A9-pereira" rel="noopener noreferrer"&gt;@José Pereira&lt;/a&gt;, who introduced me to the wonders of &lt;strong&gt;LiteLLM&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Stay tuned. We are just getting started. &lt;/p&gt;

</description>
      <category>ai</category>
      <category>tools</category>
      <category>ensemble</category>
      <category>programming</category>
    </item>
  </channel>
</rss>
