Yangholmes

How to Use GDAL in Web Applications (Part 3)

This article focuses on optimization.

The Necessity of Optimization

The previous article introduced a complete compilation script that successfully builds the WebAssembly version of GDAL.

GDAL Build Artifacts

However, the compilation results are not suitable for production environments because:

  1. Excessive file sizes: Core wasm file (27MB), glue code (272KB), data file (11MB)
  2. Redundant glue code: Contains Node.js and shell environment code that bundlers cannot tree-shake
  3. Debug info in production: Debug information is unnecessary in production environments

File size is the most critical issue—total artifacts exceed 38MB, which is unacceptable for any web application.

Additionally, the Makefile contains misconfigurations. Since emsdk silently ignores unsupported compilation options during build, these errors don't halt compilation. This article will also interpret the original author's intent and fix incorrect compilation parameters.

Disclaimer

Through compilation optimization, GDAL 3.x WebAssembly artifacts can be reduced, but likely not by enough for comfortable web delivery. The same techniques work well for GDAL 2.x and OpenCV 4.x. The deeper reasons lie in GDAL's source code and build mechanisms, which are beyond this series' scope.

TODO: Add OpenCV optimization comparison

Optimization Approaches

For web applications, smaller resources are better. Classical frontend workflows use modern build tools and modular design to shrink JavaScript via lazy loading and tree-shaking. Non-JS resources are transformed by "loaders" into JS modules for optimization. However, these methods don't work for WebAssembly:

  1. Loader limitations: Wasm files can be compressed but can't be used client-side without extra code.
  2. No tree-shaking: Wasm is binary code; dead code elimination can't be done like with JS ASTs.

Could a *.wasm loader exist? Tools like Vite support loading Wasm via the ?init import suffix, but this doesn't suit Emscripten output, where the wasm file has to be instantiated by its own glue code.
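
To make the "extra code" point concrete, here is a minimal sketch of what any consumer of a raw .wasm file has to do by hand. The byte array is a hand-assembled module exporting a single add function; it is purely illustrative and unrelated to GDAL. Emscripten's glue code performs the same instantiation step, plus memory setup, imports, and the runtime.

```javascript
// A tiny hand-assembled wasm module:
// (func (export "add") (param i32 i32) (result i32))
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic "\0asm" + version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add" = func 0
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // local.get 0, local.get 1, i32.add
]);

// The binary is inert until JS compiles and instantiates it.
function instantiateAdd(bytes) {
  const module = new WebAssembly.Module(bytes);
  const instance = new WebAssembly.Instance(module, {}); // no imports needed here
  return instance.exports.add;
}

const add = instantiateAdd(wasmBytes);
console.log(add(2, 3)); // 5
```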

Thus, we optimize during the wasm compilation phase.

Code Separation

WASM

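Options:

  • 0: Output JavaScript only (the module is translated to JS by wasm2js; no .wasm file)
  • 1: Output a .wasm binary plus JS glue code (default)
  • 2: Output both: the .wasm binary and a wasm.js fallback

wasm.js serves legacy browsers without WebAssembly support. -sWASM=2 outputs both, but if all target browsers support wasm, wasm.js is unnecessary. Because wasm.js re-expresses the whole module as JavaScript, it is considerably larger than the binary.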
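
Whether the fallback is worth shipping depends on the browsers you target. A rough way to check at runtime, assuming nothing beyond the standard WebAssembly JS API, is to validate the smallest legal module (the 8-byte header). The function name is made up for this sketch.

```javascript
// Returns true when the current JS engine can validate a WebAssembly binary.
function supportsWasm() {
  if (typeof WebAssembly !== 'object' || typeof WebAssembly.validate !== 'function') {
    return false;
  }
  // Smallest valid module: magic "\0asm" followed by version 1.
  const emptyModule = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
  return WebAssembly.validate(emptyModule);
}

console.log(supportsWasm() ? 'wasm supported: -sWASM=1 is enough' : 'no wasm: fallback needed');
```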

Demand-Driven Compilation

"Compile only what you use"

1. Library Functions

Projects typically use only a small subset of a library. Dead code elimination is controlled by:

EXPORTED_FUNCTIONS # List of exported functions
EXPORT_ALL # Export all functions

Note: Exported functions require a _ prefix. For example:

-sEXPORTED_FUNCTIONS="['_add']"
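
On the JS side, the same underscore-prefixed name appears as a property of the module object. The sketch below assumes a hypothetical C function add(int, int) and an already-initialized Emscripten module; the helper name callExported is not part of Emscripten.

```javascript
// Look up a C function by its unprefixed name and call it.
// Functions missing from EXPORTED_FUNCTIONS are not attached to the module
// (and may have been removed entirely), so fail loudly.
function callExported(module, name, ...args) {
  const fn = module['_' + name];
  if (typeof fn !== 'function') {
    throw new Error(`${name} is not exported; add '_${name}' to EXPORTED_FUNCTIONS`);
  }
  return fn(...args);
}

// Stand-in for an initialized module built with -sEXPORTED_FUNCTIONS="['_add']".
const fakeModule = { _add: (a, b) => a + b };
console.log(callExported(fakeModule, 'add', 2, 3)); // 5
```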

2. Emscripten Runtime Functions

EXPORTED_RUNTIME_METHODS

The default is empty. Export only the methods you need. For example, to reach the virtual filesystem from JS:

-sEXPORTED_RUNTIME_METHODS="['FS']"

The original gdal3.js exports nearly all GDAL functions, a key reason for large artifacts.

Debug Information

emcc parameters resemble gcc's. Disable debug info in production using:

-gsource-map
--source-map-base
-O<level>
-g<level>

1. -gsource-map and --source-map-base

Control sourcemap generation. -gsource-map emits a .map file alongside the wasm file. Debuggers then load it from <base-url>/<wasm-file-name>.map, with <base-url> set by --source-map-base (default: same as wasm path).
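
As a sketch of how that location is derived, the helper below joins a base URL and a wasm file name the way the paragraph describes. The function name, the example URL, and the file name gdal3.wasm are made up for illustration.

```javascript
// Build the URL a debugger would request for the source map.
function sourceMapUrl(baseUrl, wasmFileName) {
  // Ensure exactly one slash between the base and the file name.
  const base = baseUrl.endsWith('/') ? baseUrl : baseUrl + '/';
  return base + wasmFileName + '.map';
}

console.log(sourceMapUrl('http://localhost:8080/dist', 'gdal3.wasm'));
// http://localhost:8080/dist/gdal3.wasm.map
```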

2. -O<level>

Optimization levels:

  • -O0: No optimization; runtime assertions are included
  • -O1: Basic optimizations, remove runtime asserts
  • -O2: More optimizations, including dead code elimination (beyond -O1)
  • -O3: Additional optimizations that take longer to run (beyond -O2)
  • -Og: Similar to -O1, aimed at debuggability
  • -Os: Similar to -O3, but favors code size
  • -Oz: Smaller than -Os, possibly at some cost in speed

The default is -O0, which applies no optimization and keeps runtime assertions.

Higher optimization levels increase compilation time.

3. -g<level>

Debug levels:

  • -g0: No debug info
  • -g1: Preserve whitespace in JS
  • -g2: Preserve function names
  • -g3: Full debug info (DWARF + LLVM metadata)

Omitting the number (e.g., -g) defaults to -g3.

Environment Configuration

By default, emscripten generates environment-detection code for multiple targets. For fixed environments, this is redundant. Use:

ENVIRONMENT

Valid values:

  • node: Node.js
  • web: Web browsers
  • webview: Same as web (embedded webviews)
  • worker: Web Worker
  • shell: Command-line JS shells such as d8, js, or jsc

For web apps that run GDAL inside a Web Worker, compile with only -sENVIRONMENT=worker. Also configure:

EXPORT_ES6

Set to 1 to output the glue code as an ES module. The default output is an IIFE followed by CommonJS/AMD export sniffing, which cannot be loaded with a static import. Compare the tail of the generated file:

// -sEXPORT_ES6=1

;return moduleRtn}export default CModule;

// -sEXPORT_ES6=0

;return moduleRtn}})();if(typeof exports==="object"&&typeof module==="object"){module.exports=CModule;module.exports.default=CModule}else if(typeof define==="function"&&define["amd"])define([],()=>CModule);
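
With EXPORT_ES6=1, the default export is a factory function that returns a promise for the initialized module. A minimal sketch of consuming it follows. The file name ./gdal3.js, the wasmDir parameter, and the helper name initGdal are assumptions; locateFile is a standard Emscripten module option.

```javascript
// In an ES module you would obtain the factory with a static import, e.g.
//   import createModule from './gdal3.js'; // hypothetical file name
// Here the factory is passed in so the sketch does not depend on a real build.
async function initGdal(createModule, wasmDir) {
  return createModule({
    // Tell the glue code where the .wasm and .data files are served from.
    locateFile: (path) => wasmDir + path,
  });
}

// Stand-in factory that mimics the shape of the generated default export.
const fakeFactory = async (options) => ({ wasmUrl: options.locateFile('gdal3.wasm') });

initGdal(fakeFactory, '/static/').then((Module) => {
  console.log(Module.wasmUrl); // /static/gdal3.wasm
});
```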

Filesystem

Libraries like GDAL rely on OS filesystems. Emscripten emulates this in JS. Disable if unused:

FILESYSTEM

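Filesystem support is included automatically when the code uses files (for example, includes stdio.h or calls fprintf). For pure computation, disable it manually with -sFILESYSTEM=0.

When enabled, access it from JS via Module.FS (FS must be listed in EXPORTED_RUNTIME_METHODS).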
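
A minimal sketch of moving data through the virtual filesystem, assuming an initialized module with FS exported. FS.writeFile and FS.readFile are part of Emscripten's File System API; the path, the helper name, and the stand-in module are illustrative.

```javascript
// Copy bytes into the in-memory filesystem so C code can open them by path,
// then read them back to confirm the write.
function stageFile(Module, path, bytes) {
  Module.FS.writeFile(path, bytes);
  return Module.FS.readFile(path).length; // readFile returns a Uint8Array by default
}

// Stand-in for Module.FS so the sketch runs without a real build.
const files = new Map();
const fakeModule = {
  FS: {
    writeFile: (path, data) => files.set(path, Uint8Array.from(data)),
    readFile: (path) => files.get(path),
  },
};

// The four bytes are the start of a little-endian TIFF header ("II*\0").
console.log(stageFile(fakeModule, '/input.tif', new Uint8Array([73, 73, 42, 0]))); // 4
```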

Other Options

1. Polyfill

POLYFILL

Default true. Disable if polyfills are handled elsewhere.

2. Use JS Math Library

JS_MATH

Set to 1 to use the JS engine's built-in Math functions instead of compiling the libc implementations into the wasm. This may reduce precision, so it is recommended only for precision-insensitive tasks.

3. Minimal Runtime

MINIMAL_RUNTIME

Emits a minimal runtime (no POSIX emulation, no Module object, no built-in XHR loading). It may break functionality and is not recommended.

Practical Optimization

Fixing gdal3.js Build Script Errors

1. Invalid Debug Level

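Line 4 of the Makefile uses -g4, which is not a valid debug level in current Emscripten. Older releases used it to request source maps, which is now done with -gsource-map.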

Fix for type=debug:

GDAL_EMCC_FLAGS += -O0 -g3

2. Sourcemap Misconfiguration

The same line passes --source-map-base without -gsource-map, so there is no source map for the base URL to point at. Fix:

GDAL_EMCC_FLAGS += -gsource-map --source-map-base $(BASE_URL)

Optimizing gdal3.js Build Script

1. Disable Debug in Production

Line 6:

GDAL_EMCC_FLAGS += -Oz -g0

2. Specify Environment

GDAL_EMCC_FLAGS += -s ENVIRONMENT=worker -s EXPORT_ES6=1

3. Reduce Exported Functions

For the use case in Part 2 (only GDALOpen, GDALInfo, GDALClose):

GDAL_EMCC_FLAGS += -s EXPORTED_FUNCTIONS="[\
'_malloc',\
'_free',\
'_CSLCount',\
'_GDALOpen',\
'_GDALClose',\
'_GDALInfo'\
]"

Minimal runtime methods:

GDAL_EMCC_FLAGS += -s EXPORTED_RUNTIME_METHODS="[\
'ccall',\
'cwrap',\
'FS'\
]"
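
With only these symbols exported, the JS side can still run the Part 2 workflow. The sketch below is an assumption about how that might look with cwrap, not the gdal3.js implementation: the helper name is invented, driver registration is assumed to happen elsewhere, and returning 'string' from GDALInfo leaves the C buffer unfreed for brevity.

```javascript
const GA_ReadOnly = 0; // value of GA_ReadOnly in GDAL's GDALAccess enum

function describeDataset(Module, path) {
  // cwrap(name, returnType, argTypes) builds a JS wrapper around an exported C function.
  const GDALOpen = Module.cwrap('GDALOpen', 'number', ['string', 'number']);
  const GDALInfo = Module.cwrap('GDALInfo', 'string', ['number', 'number']);
  const GDALClose = Module.cwrap('GDALClose', null, ['number']);

  const dataset = GDALOpen(path, GA_ReadOnly); // dataset handle (a pointer), 0 on failure
  if (dataset === 0) throw new Error(`GDALOpen failed for ${path}`);
  try {
    return GDALInfo(dataset, 0); // 0 = NULL options, i.e. the default report
  } finally {
    GDALClose(dataset); // always release the dataset handle
  }
}
```

The path passed in must exist in the virtual filesystem, which is why FS stays in the runtime export list.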

Results

Optimization Results

  • Wasm file: Reduced by 6,177,075 bytes (22.44%)
  • JS file: Reduced by 18,299 bytes (10.21%)
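
For reference, figures like these come from a simple before/after comparison. The helper below sketches the arithmetic; the byte counts in the example call are made-up placeholders, not the measurements from this build.

```javascript
// Compute absolute and relative size reduction between two builds.
function sizeReduction(beforeBytes, afterBytes) {
  const saved = beforeBytes - afterBytes;
  const percent = ((saved / beforeBytes) * 100).toFixed(2);
  return { saved, percent };
}

// Placeholder numbers for illustration only.
console.log(sizeReduction(1000000, 750000)); // { saved: 250000, percent: '25.00' }
```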

Conclusion

Future articles will cover:

  1. Emscripten's virtual filesystem
  2. Purpose and optimization of *.data files
