Author: Wanderson Freitas Batista
Contact: www.wbatista.com
Repository: https://github.com/wanderbatistaf/CodeCraftIDE
Live Demo: https://code-craft-ide-psi.vercel.app/
License: MIT
Abstract
This paper presents CodeCraft IDE, an open source platform developed in Python for interpretation, conversion, and modernization of legacy systems written in Informix 4GL. The proposed solution addresses one of the main challenges faced by organizations maintaining critical systems developed in the 1980s and 1990s: the need for modernization without operational disruption. CodeCraft IDE offers a complete ecosystem comprising a 4GL interpreter, a bidirectional 4GL↔Python converter, a modern web IDE, and an enterprise toolkit for large-scale migration. Results demonstrate that the tool can execute 4GL code, convert entire projects while preserving original code semantics, and provide detailed migration readiness reports. With over 470 automated tests and bilingual documentation, the project represents a significant contribution to the developer community working with legacy system modernization.
Keywords: Informix 4GL, Legacy System Modernization, Interpreter, Code Converter, Python, Open Source, IDE, Software Migration
1. Introduction
Legacy system modernization represents one of the greatest challenges in contemporary software engineering. It is estimated that trillions of lines of legacy code still support critical operations in sectors such as finance, healthcare, manufacturing, and government (Sommerville, 2016). Among the languages that compose this landscape, Informix 4GL (Fourth Generation Language) occupies a prominent place, having been widely adopted in the 1980s and 1990s for developing enterprise applications with strong relational database integration.
Informix 4GL, originally developed by Informix Software Inc. (later acquired by IBM), offered a high-level syntax that simplified database operations, form manipulation, and report generation. Its popularity resulted in thousands of systems that, decades later, still operate in production, representing software assets with incalculable value in terms of accumulated business rules.
The dilemma faced by organizations is complex: maintaining these systems implies growing maintenance costs, scarcity of qualified professionals, and integration limitations with modern technologies. On the other hand, complete rewriting presents significant risks, high costs, and potential loss of business knowledge embedded in legacy code.
This work presents CodeCraft IDE, an open source platform that offers an intermediate approach: allowing execution, analysis, and gradual conversion of 4GL code, enabling controlled and low-risk migration to modern technologies such as Python.
1.1 Objectives
The CodeCraft IDE project was developed with the following objectives:
Interpretation: Create an interpreter capable of executing Informix 4GL code in a Python environment, allowing testing and validation without the need for legacy infrastructure.
Conversion: Develop a bidirectional converter that transforms 4GL code into idiomatic Python and vice versa, preserving semantics and facilitating gradual migration.
Tooling: Provide a modern IDE and command-line tools that increase productivity when working with 4GL code.
Accessibility: Make the solution available as free software, allowing organizations of all sizes to benefit without licensing costs.
1.2 Article Structure
The remainder of this article is organized as follows: Section 2 presents related works and the state of the art; Section 3 describes the architecture and methodology; Section 4 details the implementation of the processing core; Section 5 presents the Codecraft Studio IDE; Section 6 discusses the migration toolkit; Section 7 presents results and evaluation; Section 8 discusses limitations and future work; and Section 9 concludes the article.
2. Related Works and State of the Art
2.1 Legacy System Modernization
The literature presents various approaches to legacy system modernization. Comella-Dorda et al. (2000) categorize strategies into three main groups: (1) complete replacement, (2) encapsulation/wrapping, and (3) migration/transformation. CodeCraft IDE falls into the third category, offering tools for gradual code transformation.
Seacord et al. (2003) highlight that successful migration depends on tools that preserve business knowledge embedded in code. This premise guided the development of the converter, which prioritizes maintaining original semantics over premature optimizations.
2.2 Automatic Code Conversion
Automatic code conversion between programming languages is an established field. Tools like 2to3 (Python 2 to Python 3) and various transpilers demonstrate the viability of this approach. However, converting fourth-generation languages like 4GL to general-purpose languages presents unique challenges due to strong coupling with databases and specific user interfaces.
Waters (1988) observed that fourth-generation languages often embed complex operations in simple syntactic constructions, making conversion to third-generation languages more of an "expansion" task than "translation". CodeCraft IDE addresses this challenge through configurable mappings and commented code generation.
2.3 Commercial Solutions for 4GL
Commercial solutions for 4GL modernization exist, such as those offered by Querix (Lycia) and Four Js (Genero). These solutions, while robust, present significant licensing costs and often create dependency on new proprietary environments. CodeCraft IDE differentiates itself by being completely open source and allowing conversion to pure Python, a widely adopted language with a vast ecosystem.
2.4 Interpreters and Compilers in Python
Python has been widely used for building educational and practical interpreters and compilers. Works such as the book "Crafting Interpreters" (Nystrom, 2021) and projects like PLY (Python Lex-Yacc) provide theoretical and practical foundations that influenced the development of CodeCraft IDE.
The choice of Python as both implementation and conversion target language was motivated by its readability, vast library ecosystem, and growing adoption in enterprise environments.
3. Architecture and Methodology
3.1 Architecture Overview
CodeCraft IDE was designed following a layered modular architecture, as illustrated in Figure 1.
The architecture comprises four main layers:
Presentation Layer: Responsible for the user interface. It includes the web IDE (Codecraft Studio) built with React and Next.js, the Monaco editor with multi-language support, form preview, an integrated SSH terminal, a database explorer, and command-line tools (CLI).
Service Layer (API): Implemented with FastAPI, it provides REST endpoints for all system operations, including file management (local and SFTP), database operations, authentication, and authorization.
Processing Core: Contains the fundamental components for 4GL code analysis and transformation: the lexical analyzer (Lexer), syntactic analyzer (Parser), interpreter, converter, dependency analyzer, SQL translator, and validation framework.
Data Layer: Responsible for persistence and data access. It includes adapters for different databases via JDBC and ORM, local and remote file system access, and session and authentication storage.
3.2 Processing Pipeline
One of the central contributions of CodeCraft IDE is the implementation of a complete 4GL code processing pipeline developed from scratch in Python. It's important to clarify that this pipeline does not use or depend on any component from the original Informix 4GL environment — it's an independent reimplementation designed specifically to enable interpretation, conversion, and analysis of legacy code in a modern environment.
The pipeline follows the classic compiler architecture, adapted to support multiple backends:
4GL Code → Lexer → Tokens → Parser → AST → Backend (Interpreter/Converter/Analyzer)
Why reimplement the pipeline? The native Informix 4GL compiler is a proprietary tool that generates executable code specific to the Informix runtime. It doesn't provide access to the code's internal structure (AST), doesn't allow extension, and requires licensing and specific infrastructure. CodeCraft IDE solves these limitations by implementing each pipeline stage as an independent Python component:
Lexer (Lexical Analyzer): Developed with regular expressions and state machine, transforms 4GL source code into typed tokens. This component "understands" 4GL lexical syntax — keywords, identifiers, literals, operators — without depending on any external Informix library.
Parser (Syntactic Analyzer): Implements a recursive descent parser that builds a strongly-typed AST (Abstract Syntax Tree). The parser targets the standard Informix 4GL grammar (Section 7.2 lists the constructions currently supported) and organizes tokens into a hierarchical structure representing program semantics.
AST (Abstract Syntax Tree): Intermediate data structure representing the program independently of original syntax. Each AST node is a Python dataclass with explicit types, facilitating programmatic manipulation.
Backends: The AST feeds different backends according to the desired operation:
- Interpreter: Executes code directly, evaluating expressions and maintaining state
- Converter: Generates equivalent Python code preserving semantics
- Analyzer: Extracts metrics, dependencies, and information for reports
This decoupled architecture allows CodeCraft IDE to process 4GL code on any machine with Python installed, without needing Informix licenses, database servers, or legacy infrastructure. A developer can simply install the package via pip install fglinterpreter and immediately start analyzing, executing, or converting 4GL code.
3.3 Design Patterns Used
Development employed several design patterns recognized by the software engineering community:
Visitor Pattern: Used extensively for AST traversal, allowing new operations (interpretation, conversion, analysis) to be added without modifying tree node classes.
Strategy Pattern: Implemented for different SQL backends (WBJDBC for direct queries, WBORM for object-relational mapping), allowing transparent strategy switching.
Factory Pattern: Used for AST node creation and component instantiation, facilitating testing and extensibility.
Observer Pattern: Implemented in the IDE for real-time preview and diagnostic updates as code is edited.
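To make the Visitor pattern concrete, the following is a minimal sketch of name-based dispatch over a toy AST. The node classes (`Number`, `Add`) and the two visitors are hypothetical and far smaller than anything in the project; the `ASTVisitor` shown here is an assumed shape, not the project's actual base class. The point is only that two different operations (evaluation and Python code generation) run over the same tree without touching the node classes.

```python
from dataclasses import dataclass


@dataclass
class Number:
    value: int


@dataclass
class Add:
    left: object
    right: object


class ASTVisitor:
    """Dispatches visit(node) to a method named visit_<NodeClassName>."""

    def visit(self, node):
        method = getattr(self, f"visit_{type(node).__name__}", None)
        if method is None:
            raise NotImplementedError(f"No visitor for {type(node).__name__}")
        return method(node)


class Evaluator(ASTVisitor):
    """Backend 1: computes the value of the expression."""

    def visit_Number(self, node):
        return node.value

    def visit_Add(self, node):
        return self.visit(node.left) + self.visit(node.right)


class PythonPrinter(ASTVisitor):
    """Backend 2: emits Python source for the same tree."""

    def visit_Number(self, node):
        return str(node.value)

    def visit_Add(self, node):
        return f"({self.visit(node.left)} + {self.visit(node.right)})"


tree = Add(Number(1), Add(Number(2), Number(3)))
value = Evaluator().visit(tree)        # 6
source = PythonPrinter().visit(tree)   # "(1 + (2 + 3))"
```

Adding a third operation, such as a metrics collector, would mean writing one more visitor class and leaving `Number` and `Add` unchanged.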
3.4 Design Decisions
Some design decisions deserve emphasis:
Typed AST: Each AST node is a Python dataclass with explicit types, facilitating validation and tooling.
Preferential Immutability: Whenever possible, immutable structures are used to avoid unexpected side effects.
Separation of Concerns: The parser doesn't know the interpreter, which doesn't know the converter. Each component has a single responsibility.
Extensibility: New node types, SQL commands, or backends can be added without modifying existing code.
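As an illustration of the typed AST and preferential immutability decisions, the sketch below uses frozen dataclasses. The node names and fields are assumptions made for the example and may not match the project's real node definitions.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import Tuple


@dataclass(frozen=True)
class Identifier:
    name: str


@dataclass(frozen=True)
class LetStatement:
    variable: Identifier
    value: int


@dataclass(frozen=True)
class Block:
    # A tuple rather than a list, so the container is immutable as well.
    statements: Tuple[LetStatement, ...]


stmt = LetStatement(Identifier("x"), 10)
block = Block((stmt,))

try:
    stmt.value = 11  # frozen dataclasses reject attribute assignment
    mutated = True
except FrozenInstanceError:
    mutated = False
```

With nodes like these, a backend cannot accidentally alter the tree that another backend is about to read.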
4. Processing Core Implementation
This section details the implementation of CodeCraft IDE's central components responsible for processing 4GL code. All components described below were specifically developed for this project in pure Python, without dependencies on the original Informix environment. This means organizations can use these tools to analyze, execute, and convert legacy 4GL code even without access to the original proprietary infrastructure.
4.1 Lexical Analysis (CodeCraft Lexer)
The CodeCraft Lexer is the first component of the processing pipeline. Its function is to perform lexical analysis (or tokenization) of 4GL source code: it reads the program text character by character and groups character sequences into meaningful units called tokens.
What is a token? A token is the smallest syntactic unit with its own meaning. For example, in the code snippet LET x = 10, the lexer identifies four tokens:
- LET → keyword
- x → identifier (variable name)
- = → assignment operator
- 10 → numeric literal (integer)
The CodeCraft lexer was implemented using regular expressions and a finite state machine. Regular expressions define the patterns each token type must follow (for example, an integer is a sequence of digits). The state machine controls recognition flow, especially for complex cases like strings with escape characters or block comments.
Why reimplement the lexer? The native 4GL compiler performs tokenization internally but doesn't expose this functionality. The CodeCraft lexer allows the IDE to offer features like real-time syntax highlighting, intelligent autocompletion, and precise error messages with line and column indication.
Main characteristics of the implemented lexer:
- Support for over 150 token types
- Case-insensitive keyword handling (4GL standard)
- Literal recognition: strings (single and double quotes), numbers (integer and decimal), dates
- Support for line comments (-- and #) and block comments ({ })
- Position tracking (line and column) for precise error messages
```python
from enum import Enum


class TokenType(Enum):
    # Keywords
    DEFINE = "DEFINE"
    LET = "LET"
    IF = "IF"
    THEN = "THEN"
    ELSE = "ELSE"
    END = "END"
    FUNCTION = "FUNCTION"
    RETURN = "RETURN"
    # SQL Keywords
    SELECT = "SELECT"
    INSERT = "INSERT"
    UPDATE = "UPDATE"
    DELETE = "DELETE"
    # ... over 150 token types
```
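The regex-driven tokenization described in this section can be sketched as follows. This is a simplified illustration, not the project's lexer: it uses a single combined regular expression instead of an explicit state machine, knows only a handful of keywords, and the names `TOKEN_SPEC`, `Token`, and `tokenize` are hypothetical. It does show case-insensitive keywords, the three comment styles, and line/column tracking.

```python
import re
from typing import Iterator, NamedTuple

KEYWORDS = {"LET", "IF", "THEN", "ELSE", "END", "DISPLAY"}  # tiny subset

# Order matters: comments must be tried before the "-" operator.
TOKEN_SPEC = [
    ("COMMENT", r"--[^\n]*|#[^\n]*|\{[^}]*\}"),
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("STRING", r'"[^"\n]*"' r"|'[^'\n]*'"),
    ("IDENT", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("OP", r"<=|>=|<>|!=|[=<>+\-*/(),.]"),
    ("NEWLINE", r"\n"),
    ("SKIP", r"[ \t\r]+"),
    ("MISMATCH", r"."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


class Token(NamedTuple):
    kind: str
    text: str
    line: int
    column: int


def tokenize(source: str) -> Iterator[Token]:
    line, line_start = 1, 0
    for match in MASTER.finditer(source):
        kind, text = match.lastgroup, match.group()
        column = match.start() - line_start + 1
        if kind == "NEWLINE":
            line, line_start = line + 1, match.end()
        elif kind == "COMMENT":
            # Block comments may span lines; keep the position counters in sync.
            newlines = text.count("\n")
            if newlines:
                line += newlines
                line_start = match.start() + text.rfind("\n") + 1
        elif kind == "SKIP":
            continue
        elif kind == "MISMATCH":
            raise SyntaxError(f"Unexpected {text!r} at line {line}, column {column}")
        else:
            if kind == "IDENT" and text.upper() in KEYWORDS:
                kind, text = "KEYWORD", text.upper()  # keywords are case-insensitive
            yield Token(kind, text, line, column)


tokens = list(tokenize("let x = 10 -- set x\nIF x > 0 THEN"))
```

For the first line of that input, the result starts with `Token("KEYWORD", "LET", 1, 1)` followed by the identifier `x` at column 5, the operator `=` at column 7, and the number `10` at column 9; the trailing comment produces no token.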
4.2 Syntactic Analysis (CodeCraft Parser)
While the lexer identifies what code elements are (tokens), the CodeCraft Parser determines how these elements relate, verifying they follow 4GL language grammar rules and building a structured program representation.
What is an AST? The Abstract Syntax Tree is a tree-shaped data structure representing the program's hierarchical structure. Each tree node corresponds to a language construction: a function declaration becomes a FunctionDeclaration node, an IF command becomes an IfStatement node with children representing the condition and then/else blocks, and so on.
Why is an AST necessary? The linear token sequence produced by the lexer doesn't capture program structure. For example, the tokens IF, x, >, 0, THEN, DISPLAY, "positive", END, IF are just a flat list. The AST organizes these tokens into a hierarchy representing semantics: a conditional command with a condition (x > 0) and an execution block (DISPLAY "positive"). This structure allows the interpreter to execute code, the converter to generate equivalent Python, and the analyzer to extract metrics — all operating on the same intermediate representation.
Implementation technique: The CodeCraft parser uses the recursive descent parsing technique, where each 4GL grammar rule is implemented as a Python function. For example, there's a parse_if_statement() function that recognizes IF commands, a parse_function() function that recognizes function declarations, and so on. This approach results in readable and easily extensible code — adding support for a new 4GL construction involves implementing a new parsing function.
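A minimal sketch of this technique is shown below, restricted to IF and DISPLAY and operating on a plain list of token strings. The class and node names are illustrative assumptions; in particular, the condition is kept as raw tokens here, whereas a real parser would build an expression subtree.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Display:
    value: str


@dataclass
class IfStatement:
    condition: List[str]  # raw tokens in this sketch
    then_branch: List[object]
    else_branch: Optional[List[object]] = None


class Parser:
    def __init__(self, tokens: List[str]):
        self.tokens = tokens
        self.pos = 0

    def peek(self) -> Optional[str]:
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def at(self, keyword: str) -> bool:
        tok = self.peek()
        return tok is not None and tok.upper() == keyword

    def expect(self, keyword: str) -> None:
        if not self.at(keyword):
            raise SyntaxError(f"Expected {keyword}, got {self.peek()!r}")
        self.pos += 1

    def parse_statement(self):
        # One branch per grammar rule; each rule has its own method.
        if self.at("IF"):
            return self.parse_if_statement()
        if self.at("DISPLAY"):
            return self.parse_display()
        raise SyntaxError(f"Unexpected token {self.peek()!r}")

    def parse_display(self) -> Display:
        self.expect("DISPLAY")
        value = self.peek()
        if value is None:
            raise SyntaxError("DISPLAY needs an argument")
        self.pos += 1
        return Display(value)

    def parse_block(self, terminators) -> List[object]:
        body = []
        while self.peek() is not None and self.peek().upper() not in terminators:
            body.append(self.parse_statement())
        return body

    def parse_if_statement(self) -> IfStatement:
        self.expect("IF")
        condition = []
        while self.peek() is not None and not self.at("THEN"):
            condition.append(self.peek())
            self.pos += 1
        self.expect("THEN")
        then_branch = self.parse_block({"ELSE", "END"})
        else_branch = None
        if self.at("ELSE"):
            self.pos += 1
            else_branch = self.parse_block({"END"})
        self.expect("END")
        self.expect("IF")
        return IfStatement(condition, then_branch, else_branch)


tokens = ["IF", "x", ">", "0", "THEN", "DISPLAY", '"positive"', "END", "IF"]
tree = Parser(tokens).parse_statement()
```

Because `parse_block` calls back into `parse_statement`, nested IF statements are handled by the same code path, which is the property that gives recursive descent its name.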
Difference from native compiler: The original Informix 4GL compiler also performs parsing internally, but its goal is to produce executable output for the Informix runtime (native code or p-code, depending on the product edition). The CodeCraft parser has different goals: generate an accessible AST that can be inspected, transformed, and used by multiple backends. This enables functionalities the native compiler doesn't offer, such as conversion to other languages or detailed static analysis.
4GL constructions supported by the parser:
- Variable Declarations: Primitive types (INTEGER, CHAR, DECIMAL, DATE, etc.), one-dimensional and multidimensional arrays, simple and nested records, LIKE for column type inheritance
- Control Structures: IF/THEN/ELSE, WHILE, FOR, FOREACH (cursor iteration), CASE/WHEN
- Functions and Procedures: Declaration, parameters, local variables, RETURN
- Embedded SQL Commands: SELECT (including INTO), INSERT, UPDATE, DELETE, cursors (DECLARE, OPEN, FETCH, CLOSE)
- Form Manipulation: OPEN FORM, DISPLAY, INPUT, validations
- Error Handling: WHENEVER ERROR CONTINUE/STOP/CALL
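```python
from dataclasses import dataclass
from typing import List, Optional


# ASTNode, Parameter, VariableDeclaration, Statement and TypeAnnotation
# are other node types of the AST, omitted here.
@dataclass
class FunctionDeclaration(ASTNode):
    name: str
    parameters: List[Parameter]
    local_variables: List[VariableDeclaration]
    body: List[Statement]
    return_type: Optional[TypeAnnotation] = None
```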
4.3 CodeCraft Interpreter
The CodeCraft Interpreter is the component that makes it possible to execute 4GL code directly in Python, without needing the original Informix runtime. It traverses the AST generated by the parser and executes each node, maintaining program state (variables, database connections, etc.) in memory.
Difference between interpreter and compiler: The native 4GL compiler translates source code once into an executable form (native machine code in the C Compiler version, p-code in the Rapid Development System), producing a program that can be executed repeatedly. The CodeCraft interpreter, on the other hand, executes the code directly at each invocation, without generating an intermediate executable. This approach is ideal for development, testing, and validation, where convenience trumps maximum performance.
Interpreter use cases:
- Quickly test 4GL code snippets without compiling
- Validate that code logic works as expected
- Execute migration and validation scripts
- Debugging and prototyping during modernization
The interpreter implements the Visitor pattern to traverse the AST and execute corresponding operations. It maintains an environment with nested scopes for local and global variables, simulating original 4GL runtime behavior.
```python
class Interpreter(ASTVisitor):
    # evaluate(), execute_block() and the _private helpers are omitted here.

    def __init__(self):
        self.global_env = Environment()
        self.current_env = self.global_env
        self.db_connection = None

    def visit_LetStatement(self, node: LetStatement):
        # LET <variable> = <expression>
        value = self.evaluate(node.value)
        self.current_env.set(node.variable, value)

    def visit_IfStatement(self, node: IfStatement):
        condition = self.evaluate(node.condition)
        if self._is_truthy(condition):
            return self.execute_block(node.then_branch)
        elif node.else_branch:
            return self.execute_block(node.else_branch)

    def visit_SelectStatement(self, node: SelectStatement):
        # Host variables are passed as parameters, never interpolated.
        sql = self._build_sql(node)
        params = self._extract_parameters(node)
        result = self.db_connection.execute(sql, params)
        if node.into_variables:
            # SELECT ... INTO copies the fetched columns into 4GL variables.
            self._assign_results(node.into_variables, result)
        return result
```
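The nested-scope environment that the interpreter relies on could look roughly like the sketch below. This `Environment` is an assumed, minimal shape written for illustration (the `define`, `get`, and `set` methods and the parent chain are hypothetical); the project's actual class may differ.

```python
from typing import Any, Dict, Optional


class Environment:
    """One scope of variables, linked to an optional enclosing scope."""

    def __init__(self, parent: Optional["Environment"] = None):
        self.parent = parent
        self.values: Dict[str, Any] = {}

    def define(self, name: str, value: Any = None) -> None:
        # Declares the variable in this scope (shadowing any outer one).
        self.values[name] = value

    def _owner(self, name: str) -> Optional["Environment"]:
        # Walk outward until a scope that declares the name is found.
        env: Optional[Environment] = self
        while env is not None:
            if name in env.values:
                return env
            env = env.parent
        return None

    def get(self, name: str) -> Any:
        owner = self._owner(name)
        if owner is None:
            raise NameError(f"Variable {name!r} is not defined")
        return owner.values[name]

    def set(self, name: str, value: Any) -> None:
        owner = self._owner(name)
        if owner is None:
            raise NameError(f"Variable {name!r} is not defined")
        owner.values[name] = value


global_env = Environment()
global_env.define("total", 0)

local_env = Environment(parent=global_env)  # e.g. created on function entry
local_env.define("i", 1)
local_env.set("total", 5)       # not declared locally, so the global is updated
local_env.define("total", 99)   # a local declaration now shadows the global
```

After these calls, `global_env.get("total")` is 5 while `local_env.get("total")` is 99, which mirrors how a local DEFINE hides a global of the same name.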
4.4 CodeCraft 4GL → Python Converter
The CodeCraft Converter represents perhaps the most valuable functionality for organizations in modernization processes: the ability to automatically transform 4GL code into equivalent Python code.
How does conversion work? The converter uses the same AST generated by the parser, but instead of executing each node (like the interpreter does), it generates equivalent Python code. Each AST node type has a corresponding translation rule. For example:
- DEFINE x INTEGER → x: int = 0
- LET x = y + 1 → x = y + 1
- IF condition THEN ... END IF → if condition: ...
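As a rough illustration of one such translation rule, the sketch below converts a simplified DEFINE node into a Python declaration. The `VariableDeclaration` fields, the `TYPE_MAP` table, and the `convert_define` function are hypothetical and cover only a few types; the defaults follow the examples given in this section.

```python
from dataclasses import dataclass


@dataclass
class VariableDeclaration:
    """Hypothetical, simplified AST node for DEFINE <name> <type>."""
    name: str
    type_name: str
    scale: int = 0  # digits after the decimal point, DECIMAL only


# Assumed mapping: 4GL type -> (Python annotation, default expression)
TYPE_MAP = {
    "INTEGER": ("int", "0"),
    "SMALLINT": ("int", "0"),
    "CHAR": ("str", '""'),
    "VARCHAR": ("str", '""'),
}


def convert_define(node: VariableDeclaration) -> str:
    type_name = node.type_name.upper()
    if type_name == "DECIMAL":
        zeros = "0" * node.scale
        literal = f"0.{zeros}" if zeros else "0"
        return f'{node.name}: Decimal = Decimal("{literal}")'
    if type_name in TYPE_MAP:
        annotation, default = TYPE_MAP[type_name]
        return f"{node.name}: {annotation} = {default}"
    # No direct mapping: emit a placeholder and flag it for manual review.
    return f"{node.name} = None  # TODO: no mapping for 4GL type {type_name}"


line_int = convert_define(VariableDeclaration("x", "INTEGER"))
line_dec = convert_define(VariableDeclaration("balance", "DECIMAL", scale=2))
```

Here `line_int` is `x: int = 0` and `line_dec` is `balance: Decimal = Decimal("0.00")`. The fallback branch reflects the idea of inserting a comment whenever a construction has no trivial translation.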
Semantic preservation: The goal isn't to generate "beautiful" Python code, but rather code that behaves exactly like the original 4GL. This means even idiomatic 4GL constructions that have no direct Python equivalent are translated to preserve behavior. Comments are automatically inserted when translation isn't trivial.
Pythonic code: While the priority is preserving semantics, the converter follows Pythonic conventions whenever possible: uses dataclasses for records, type hints to document types, and idiomatic Python structures. The result is code that Python developers can read, maintain, and evolve.
Conversion Example:
Original 4GL Code:
```
DEFINE l_customer RECORD
    id INTEGER,
    name CHAR(50),
    balance DECIMAL(10,2)
END RECORD

FUNCTION get_customer(p_id)
    DEFINE p_id INTEGER

    SELECT * INTO l_customer.*
      FROM customers
     WHERE id = p_id

    IF l_customer.balance > 1000 THEN
        RETURN "Premium"
    ELSE
        RETURN "Standard"
    END IF
END FUNCTION
```
Generated Python Code:
```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional
from wbjdbc import Database


@dataclass
class CustomerRecord:
    id: int = 0
    name: str = ""
    balance: Decimal = Decimal("0.00")


l_customer = CustomerRecord()


def get_customer(p_id: int) -> str:
    global l_customer
    db = Database.get_connection()
    result = db.execute_one(
        "SELECT * FROM customers WHERE id = ?",
        [p_id]
    )
    if result:
        l_customer.id = result['id']
        l_customer.name = result['name']
        l_customer.balance = Decimal(str(result['balance']))
    if l_customer.balance > Decimal("1000"):
        return "Premium"
    else:
        return "Standard"
```
4.5 SQL Translation
The SQL translation module supports two backends:
WBJDBC (Direct SQL): Generates code that executes parameterized SQL queries directly:
```python
# 4GL: SELECT * FROM customers WHERE status = l_status
db.execute("SELECT * FROM customers WHERE status = ?", [l_status])
```
WBORM (Object-Relational Mapping): Generates code using a lightweight ORM for more abstract operations:
```python
# 4GL: SELECT * FROM customers WHERE status = l_status
Customer.select().where(status=l_status).all()
```
The translator handles complex clauses including:
- WHERE with multiple conditions (AND, OR, NOT)
- ORDER BY with multiple columns and directions
- GROUP BY and HAVING
- Implicit and explicit JOINs
- Subqueries in WHERE clauses
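The core of the direct-SQL translation, replacing 4GL host variables with `?` placeholders, can be sketched as below. The `parameterize` function is hypothetical and works on raw text with a regular expression, which is a simplification: a converter working from the AST already knows which names are variables, whereas this sketch would misfire if a column shared its name with a host variable.

```python
import re
from typing import List, Set, Tuple

# String literals are matched first so that names inside them are left alone.
_WORD_OR_LITERAL = re.compile(r"""'[^']*'|"[^"]*"|[A-Za-z_][\w.]*""")


def parameterize(sql: str, host_variables: Set[str]) -> Tuple[str, List[str]]:
    """Returns the SQL with placeholders and the variable names, in order."""
    params: List[str] = []

    def replace(match):
        text = match.group(0)
        if text[0] in "'\"":
            return text  # string literal: keep as-is
        if text in host_variables:
            params.append(text)
            return "?"
        return text  # SQL keyword, table, or column name

    return _WORD_OR_LITERAL.sub(replace, sql), params


sql, params = parameterize(
    "SELECT * FROM customers WHERE status = l_status AND note <> 'l_status'",
    {"l_status"},
)
```

The result is the statement `SELECT * FROM customers WHERE status = ? AND note <> 'l_status'` with `params == ["l_status"]`; the generated code would then pass the current values of those variables in the same order.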
4.6 Dependency Analysis
For multi-file projects, the dependency analyzer:
- Extracts all function definitions from each file
- Identifies all function calls
- Builds a dependency graph
- Detects cycles (circular dependencies)
- Calculates topological order for conversion
```python
class DependencyAnalyzer:
    def analyze(self, project_path: str) -> DependencyGraph:
        functions = self._extract_all_functions(project_path)
        calls = self._extract_all_calls(project_path)
        graph = self._build_graph(functions, calls)
        cycles = graph.detect_cycles()
        return DependencyGraph(
            nodes=functions,
            edges=calls,
            cycles=cycles,
            conversion_order=graph.topological_sort()
        )
```
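The last two steps, cycle detection and topological ordering, can be illustrated with Python's standard `graphlib` module (Python 3.9+). This is a sketch under the assumption that the dependency information has already been reduced to a mapping from each file to the files it depends on; the `conversion_order` helper is hypothetical and the project may implement its own graph algorithms.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Optional, Set, Tuple


def conversion_order(
    depends_on: Dict[str, Set[str]],
) -> Tuple[List[str], Optional[List[str]]]:
    """Returns (order, None) on success or ([], cycle) if a cycle exists."""
    try:
        # static_order() yields each file only after all of its dependencies.
        return list(TopologicalSorter(depends_on).static_order()), None
    except CycleError as exc:
        # args[1] holds the nodes of the cycle (first and last are the same).
        return [], list(exc.args[1])


project = {
    "business_logic.4gl": {"utils.4gl", "database.4gl"},
    "database.4gl": {"utils.4gl", "globals.4gl"},
    "utils.4gl": {"globals.4gl"},
    "globals.4gl": set(),
}
order, cycle = conversion_order(project)

_, bad_cycle = conversion_order({"a.4gl": {"b.4gl"}, "b.4gl": {"a.4gl"}})
```

For this example graph the only valid order is globals.4gl, utils.4gl, database.4gl, business_logic.4gl, so files with no dependencies are converted first.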
5. Codecraft Studio: The Web IDE
One of CodeCraft IDE's differentiators is the integrated web IDE, called Codecraft Studio, developed with modern technologies.
5.1 Technology Stack
- Frontend: React 18, Next.js 15, TypeScript
- UI Components: shadcn/ui, Radix UI, Tailwind CSS
- Editor: Monaco Editor (same engine as VS Code)
- State: React Context API
- Communication: REST API, WebSocket for real-time updates
5.2 Interface Overview
The interface is organized into functional regions:
- Top Bar: Menus (File, Edit, View, Run, Database, Help), mode indicators (Local/Remote), session information
- Left Sidebar: File explorer with project support (.ccp), database explorer
- Central Area: Code editor with tab system, support for multiple simultaneous files
- Bottom Panel: Console (execution output), SSH Terminal, Debug, SQL Query
5.3 Code Editor
The editor uses Monaco Editor with custom configurations.
Implemented features:
- Syntax Highlighting: Complete support for 4GL, PER (forms), SQL, Python, CSS
- Diagnostics: Syntax errors displayed in real-time with underlining and tooltips
- Quick Fixes: Correction suggestions for common errors
- Find & Replace: Search with regular expression support
- Multiple Selections: Simultaneous editing of multiple occurrences
5.4 Form Preview
For form files (.per, and .fm2 for Lycia), Studio offers a real-time preview.
The preview interprets:
- Field definitions and types
- Positioning (row/column)
- Visual attributes (colors, styles)
- Declarative validations
- Grid layouts (for Lycia)
5.5 Database Explorer
The IDE includes a database explorer for viewing and interacting with the database structure.
Features:
- Connection to multiple databases (Informix, PostgreSQL, MySQL, Oracle)
- Schema and table visualization
- Column details (name, type, constraints)
- SQL Query panel for ad-hoc query execution
- Result export
5.6 Integrated SSH Terminal
For environments where 4GL code resides on remote servers, Studio provides an integrated SSH terminal. The terminal uses credentials from the configured database connection, simplifying access to development and production environments.
5.7 Remote Mode (SFTP)
Studio supports editing files on remote servers via SFTP:
- Remote file system navigation
- Opening and editing files directly on the server
- Automatic synchronization on save
- Visual remote mode indicator in top bar
5.8 Visual Conversion
The conversion process can be initiated directly from the IDE.
The converted file is automatically:
- Generated in the same directory as the original
- Opened in a new tab
- Formatted with Black (for Python)
5.9 Integrated Documentation
The documentation system is accessible via the Help menu or the Ctrl+Shift+H shortcut.
Documentation includes:
- FAQ (Frequently Asked Questions)
- Interface overview
- Files and projects guide
- Shortcut reference
- Troubleshooting
6. Enterprise Migration Toolkit
For large-scale migration projects, CodeCraft IDE offers a complete toolkit via command line.
6.1 Batch Conversion
The batch-convert command processes entire directories preserving structure:
```bash
# Basic conversion
fgl batch-convert ./src_4gl ./output_python

# With dependency resolution
fgl batch-convert ./src_4gl ./output --resolve-dependencies

# Specifying SQL backend
fgl batch-convert ./src_4gl ./output --sql-backend wborm
```
Features:
- Directory structure preservation
- Parallel processing (configurable)
- Real-time progress reporting
- Robust error handling (continues despite individual failures)
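The combination of structure preservation, parallel processing, and continue-on-failure behaviour can be sketched as follows. Everything here is hypothetical: `batch_convert` is not the tool's internal function, `convert_source` stands in for the real converter, and a thread pool is used only for brevity (a process pool may suit CPU-bound conversion better).

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Dict, List


def batch_convert(src_root: Path, dst_root: Path,
                  convert_source: Callable[[str], str],
                  max_workers: int = 4) -> Dict[str, List[str]]:
    sources = sorted(src_root.rglob("*.4gl"))

    def convert_one(src: Path) -> str:
        rel = src.relative_to(src_root)
        dst = (dst_root / rel).with_suffix(".py")  # mirror the source tree
        dst.parent.mkdir(parents=True, exist_ok=True)
        dst.write_text(convert_source(src.read_text()))
        return rel.as_posix()

    report: Dict[str, List[str]] = {"converted": [], "failed": []}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(convert_one, src): src for src in sources}
        for future, src in futures.items():
            try:
                report["converted"].append(future.result())
            except Exception as exc:
                # One bad file is recorded and the rest of the batch continues.
                rel = src.relative_to(src_root).as_posix()
                report["failed"].append(f"{rel}: {exc}")
    return report


def fake_convert(text: str) -> str:
    """Stand-in converter that rejects sources containing 'BAD'."""
    if "BAD" in text:
        raise ValueError("unsupported construction")
    return "# converted\n"


with tempfile.TemporaryDirectory() as tmp:
    src, dst = Path(tmp, "src"), Path(tmp, "out")
    (src / "lib").mkdir(parents=True)
    (src / "main.4gl").write_text("MAIN\nEND MAIN\n")
    (src / "lib" / "utils.4gl").write_text("BAD")
    report = batch_convert(src, dst, fake_convert)
    main_exists = (dst / "main.py").exists()
    utils_exists = (dst / "lib" / "utils.py").exists()
```

In this demo run, main.4gl is converted to out/main.py while lib/utils.4gl is reported as failed without aborting the batch.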
6.2 Dependency Analysis
```bash
fgl analyze-deps ./project
```
Example output:
```
╔══════════════════════════════════════════════════╗
║            Dependency Analysis Report            ║
╠══════════════════════════════════════════════════╣
║  Files analyzed: 47                              ║
║  Functions found: 312                            ║
║  Cross-file calls: 89                            ║
║  Circular dependencies: 0 ✓                      ║
╠══════════════════════════════════════════════════╣
║  Recommended conversion order:                   ║
║    1. globals.4gl                                ║
║    2. utils.4gl                                  ║
║    3. database.4gl                               ║
║    4. business_logic.4gl                         ║
║    ...                                           ║
╚══════════════════════════════════════════════════╝
```
6.3 Migration Reports
The toolkit generates detailed reports in multiple formats:
```bash
# Interactive HTML report
fgl migration-report ./project --format html --output report.html

# JSON report for integration
fgl migration-report ./project --format json --output report.json

# Markdown report
fgl migration-report ./project --format markdown --output report.md
```
The report includes:
| Metric | Description |
|---|---|
| Migration Readiness Score | 0-100 score indicating readiness |
| Mapped Constructions | Percentage of code with direct mapping |
| Warnings per File | Count of potential issues |
| Unsupported Constructions | List of what requires manual intervention |
| Recommendations | Suggested prioritized actions |
6.4 Validation Framework
To ensure conversion preserves original semantics:
```bash
# Individual validation
fgl validate original.4gl converted.py

# Batch validation with JUnit report
fgl validate-batch ./tests --output results.xml --format junit
```
Validation modes:
| Mode | Description | Use |
|---|---|---|
| exact | Outputs must be byte-for-byte identical | Deterministic tests |
| normalized | Ignores whitespace and formatting differences | Most cases |
| semantic | Allows configurable numerical tolerance | Floating-point calculations |
Custom validators can be registered for specific cases:
```python
@validator_registry.register("numeric_tolerance")
def numeric_validator(expected, actual, tolerance=0.001):
    # Passes when the two values differ by less than the tolerance.
    return abs(float(expected) - float(actual)) < tolerance
```
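To show how the three validation modes differ, the sketch below compares two program outputs under each of them. The `outputs_match` function and its exact rules are assumptions made for illustration (for example, "semantic" is approximated here as "same text, numbers within a tolerance"); the framework's real comparison logic may be more elaborate.

```python
import re

_NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def outputs_match(expected: str, actual: str,
                  mode: str = "normalized", tolerance: float = 0.001) -> bool:
    if mode == "exact":
        return expected == actual

    # Collapse every run of whitespace into a single space.
    norm_expected = " ".join(expected.split())
    norm_actual = " ".join(actual.split())
    if mode == "normalized":
        return norm_expected == norm_actual

    if mode == "semantic":
        # The text around the numbers must be identical...
        if _NUMBER.sub("\x00", norm_expected) != _NUMBER.sub("\x00", norm_actual):
            return False
        # ...and each pair of numbers must agree within the tolerance.
        nums_expected = [float(n) for n in _NUMBER.findall(norm_expected)]
        nums_actual = [float(n) for n in _NUMBER.findall(norm_actual)]
        return len(nums_expected) == len(nums_actual) and all(
            abs(e - a) <= tolerance for e, a in zip(nums_expected, nums_actual)
        )

    raise ValueError(f"Unknown validation mode: {mode!r}")


spacing_exact = outputs_match("Total: 10.50\n", "Total:   10.50", mode="exact")
spacing_normalized = outputs_match("Total: 10.50\n", "Total:   10.50", mode="normalized")
rounding_normalized = outputs_match("Total: 10.5000", "Total: 10.5004", mode="normalized")
rounding_semantic = outputs_match("Total: 10.5000", "Total: 10.5004", mode="semantic")
```

A pure spacing difference fails only in exact mode, while a small rounding difference fails in exact and normalized modes and passes in semantic mode.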
6.5 CI/CD Integration
Example GitHub Actions workflow for automated migration:
```yaml
name: 4GL Migration Pipeline

on:
  push:
    paths:
      - 'src/**/*.4gl'

jobs:
  migrate-and-validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install CodeCraft
        run: pip install "fglinterpreter[all]"

      - name: Convert 4GL to Python
        run: |
          fgl batch-convert ./src ./dist \
            --resolve-dependencies \
            --sql-backend wbjdbc

      - name: Validate Conversion
        run: |
          fgl validate-batch ./tests \
            --output results.xml \
            --format junit

      - name: Generate Migration Report
        run: |
          fgl migration-report ./src \
            --format html \
            --output migration-report.html

      - name: Upload Report
        uses: actions/upload-artifact@v4
        with:
          name: migration-report
          path: migration-report.html

      - name: Publish Test Results
        uses: EnricoMi/publish-unit-test-result-action@v2
        if: always()
        with:
          files: results.xml
```
7. Results and Evaluation
7.1 Test Coverage
The project maintains an extensive automated test suite:
| Component | Tests | Coverage |
|---|---|---|
| Lexer | 45 | 98% |
| Parser | 78 | 95% |
| Interpreter | 112 | 92% |
| Converter | 89 | 94% |
| Batch Converter | 21 | 100% |
| Dependency Resolver | 26 | 100% |
| SQL Translator | 34 | 100% |
| Validation Framework | 48 | 100% |
| Total | 470+ | ~96% |
Tests run automatically on every commit via GitHub Actions, guarding against regressions.
7.2 Supported 4GL Constructions
| Category | Constructions | Support |
|---|---|---|
| Variables | DEFINE, LET, primitive types | ✅ Complete |
| Arrays | ARRAY OF, multiple dimensions | ✅ Complete |
| Records | RECORD, LIKE, nested | ✅ Complete |
| Control | IF, WHILE, FOR, FOREACH, CASE | ✅ Complete |
| Functions | FUNCTION, RETURN, parameters | ✅ Complete |
| SQL | SELECT, INSERT, UPDATE, DELETE | ✅ Complete |
| Cursors | DECLARE, OPEN, FETCH, CLOSE | ✅ Complete |
| Transactions | BEGIN WORK, COMMIT, ROLLBACK | ✅ Complete |
| Errors | WHENEVER ERROR | ✅ Complete |
| Forms | OPEN FORM, DISPLAY, INPUT | ⚠️ Partial |
| Reports | START REPORT, OUTPUT TO | ⚠️ Partial |
| Menus | MENU, COMMAND | ⚠️ Partial |
7.3 Conversion Benchmark
Tests were performed on an anonymized production codebase:
| Metric | Value |
|---|---|
| Files processed | 234 |
| 4GL lines of code | 47,832 |
| Functions converted | 1,247 |
| Conversion time | 12.4s |
| Success rate | 97.8% |
| Warnings generated | 156 |
| Blocking errors | 5 |
The 5 blocking errors were related to proprietary extensions specific to the client's environment, not standard 4GL syntax.
7.4 Interpreter Performance
Execution time comparison between native 4GL environment and CodeCraft IDE:
| Operation | Native 4GL | CodeCraft | Overhead |
|---|---|---|---|
| Simple loop (10k iterations) | 0.8s | 1.2s | 1.5x |
| Array processing (1k elements) | 1.5s | 2.1s | 1.4x |
| String operations (concatenation) | 0.5s | 0.7s | 1.4x |
| SQL query (100 rows) | 0.3s | 0.4s | 1.3x |
| Average | - | - | 1.4x |
The average 1.4x overhead is acceptable considering that:
- The interpreter is primarily intended for testing and validation
- Production execution would use converted Python code
- Flexibility and portability offset the performance difference
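The overhead column and its average follow directly from the timings in the table, as the short calculation below shows. The dictionary simply restates the table's values; no new measurements are introduced.

```python
# (native seconds, CodeCraft seconds), copied from the table above
timings = {
    "Simple loop (10k iterations)": (0.8, 1.2),
    "Array processing (1k elements)": (1.5, 2.1),
    "String operations (concatenation)": (0.5, 0.7),
    "SQL query (100 rows)": (0.3, 0.4),
}

# Overhead is the ratio of interpreted time to native time.
overheads = {
    operation: codecraft / native
    for operation, (native, codecraft) in timings.items()
}
average_overhead = sum(overheads.values()) / len(overheads)

for operation, ratio in overheads.items():
    print(f"{operation}: {ratio:.1f}x")
print(f"Average: {average_overhead:.1f}x")
```

The individual ratios round to 1.5x, 1.4x, 1.4x, and 1.3x, and their arithmetic mean rounds to the reported 1.4x.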
7.5 Comparison with Existing Solutions
| Feature | CodeCraft IDE | Solution A (Commercial) | Solution B (Commercial) |
|---|---|---|---|
| License | MIT (Free) | Proprietary ($$$) | Proprietary ($$) |
| 4GL Interpreter | ✅ | ❌ | ✅ |
| Python Conversion | ✅ | ❌ | ❌ |
| Integrated IDE | ✅ (Web) | ✅ (Desktop) | ✅ (Desktop) |
| Open Source | ✅ | ❌ | ❌ |
| Migration Reports | ✅ | ✅ | ⚠️ |
| CI/CD Integration | ✅ | ⚠️ | ❌ |
8. Limitations and Future Work
8.1 Current Limitations
Interactive Forms: Support for form constructions (OPEN FORM, INPUT, DISPLAY) is functional for syntactic validation and static preview, but generation of equivalent web interfaces is not yet complete.
Reports: Report commands (START REPORT, OUTPUT TO REPORT) are parsed and partially interpreted, but complex report generation requires refinement.
Proprietary Extensions: Vendor-specific extensions (Querix, Four Js) are not supported, focusing on standard Informix 4GL syntax.
Debugging: While the interpreter supports execution, there's no visual debugger yet with breakpoints and variable inspection.
8.2 Roadmap
The following developments are planned:
Desktop Application (Electron): Desktop version of Codecraft Studio for offline use, with better operating system integration.
Visual Debugger: Implementation of step-by-step debugging with breakpoints, watches, and call stack visualization.
Web Interface Generator: Conversion of 4GL forms to React components, enabling complete presentation layer modernization.
IDE Plugins: Extensions for VS Code and IntelliJ with syntax highlighting and CodeCraft backend integration.
AI-Assisted Migration: Use of language models for refactoring suggestions and automatic resolution of unmapped constructions.
8.3 Community Contributions
As an open source project, contributions are welcome in several areas:
- Support for new databases
- Coverage of specific 4GL dialects
- Documentation translation to other languages
- Testing with real (sanitized) codebases
- Bug reports and feature requests
- Documentation and example improvements
9. Conclusion
This article presented CodeCraft IDE, a comprehensive open source platform for the interpretation, conversion, and modernization of legacy systems written in Informix 4GL. The solution offers a functional interpreter implemented in Python, a bidirectional 4GL↔Python converter that preserves semantics, a modern web IDE with professional features, and an enterprise toolkit for large-scale migration.
Results obtained demonstrate that the tool achieves its main objectives:
- Interpretation: Correct 4GL code execution with acceptable 1.4x overhead
- Conversion: 97.8% success rate on real production code
- Quality: Over 470 automated tests with 96% coverage
- Usability: Modern web IDE with experience comparable to commercial tools
The choice of an open source license (MIT) reflects the conviction that modernization tools should be accessible to organizations of all sizes. The barrier to entry for modernization shouldn't be financial, especially when organizations already face the inherent technical challenges of the process.
CodeCraft IDE doesn't claim to be the definitive solution for all 4GL modernization scenarios, but rather a solid tool that can be adapted, extended, and improved by the community. I invite developers, architects, and organizations to experiment, contribute, and help evolve the project.
References
AHO, A. V.; LAM, M. S.; SETHI, R.; ULLMAN, J. D. Compilers: Principles, Techniques, and Tools. 2nd ed. Boston: Pearson, 2006.
COMELLA-DORDA, S.; WALLNAU, K.; SEACORD, R.; ROBERT, J. A Survey of Legacy System Modernization Approaches. Pittsburgh: Carnegie Mellon University, Software Engineering Institute, 2000.
FOWLER, M. Refactoring: Improving the Design of Existing Code. 2nd ed. Boston: Addison-Wesley, 2018.
FOUR JS DEVELOPMENT TOOLS. Genero Business Development Language Documentation. Four Js, 2024. Available at: https://4js.com/documentation/
IBM CORPORATION. IBM Informix 4GL Reference Manual. IBM, 2020. Available at: https://www.ibm.com/docs/en/informix-servers
NYSTROM, R. Crafting Interpreters. Genever Benning, 2021. Available at: https://craftinginterpreters.com/
QUERIX LTD. Lycia Documentation. Querix, 2024. Available at: https://querix.com/lycia/
SEACORD, R. C.; PLAKOSH, D.; LEWIS, G. A. Modernizing Legacy Systems: Software Technologies, Engineering Processes, and Business Practices. Boston: Addison-Wesley, 2003.
SOMMERVILLE, I. Software Engineering. 10th ed. Boston: Pearson, 2016.
WATERS, R. C. Program Translation via Abstraction and Reimplementation. IEEE Transactions on Software Engineering, v. 14, n. 8, p. 1207-1228, 1988.







