Speaker: Francois Vernet @ AWS FSI Meetup 2025 Q4
Leveraging AWS and EC2 Graviton for System Transformation
Overview of Numora
Established roughly 100 years ago, headquartered in Tokyo
Five divisions: wealth management, investment management, sales, global markets, investment banking
Slogan: "Connect markets east and west"
Tradition of discipline, entrepreneurship, creative solutions, and thought leadership
Supports enterprise risk function
Runs market and counterparty risk models for global businesses
Compute and data-heavy operations, ideal for cloud use
Minimal concern around latency
Scale of deployments
Deploys over 65,000 cores daily for pricing batches using EC2 Spot
Generates around two terabytes of data per day
Data retention up to seven years
Current S3 footprint is approximately two petabytes, utilizing S3 Intelligent-Tiering for cost-effective historical data storage
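As a rough sanity check on these figures (assuming a flat 2 TB/day and the full seven-year retention, with no compression or expiry; these are simplifying assumptions, not stated in the talk), the cumulative footprint lands in the petabyte range:

```python
# Back-of-the-envelope S3 footprint from the stated figures.
# Assumes a flat 2 TB/day and full 7-year retention (no compression/expiry).
DAILY_TB = 2
RETENTION_YEARS = 7

total_tb = DAILY_TB * 365 * RETENTION_YEARS
total_pb = total_tb / 1024

print(f"{total_tb} TB ≈ {total_pb:.1f} PB")  # 5110 TB ≈ 5.0 PB at steady state
```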
Aggregation and summarization of data
Numora performs aggregation in memory
Uses over 100 very large EC2 instances to deliver sub-second aggregation for credit and market risk models
Agenda
Impact of public cloud on computing risk within financial institutions
High-level architecture of risk systems at Numora
Graviton case study for pricing and aggregation use cases
Additional considerations and wisdom from the migration process
High-level architecture of Numora's systems
Data lakehouse stores all inputs and outputs
Inputs include trades, historical market data, and reference data (accounts, currencies, countries)
Hybrid data lakehouse: primarily processes data on-premises and exports to S3 for cloud deployments
Pricing engines
Consist of C++ pricing calculators
Load input data, run models, and output results at the transaction level
Aggregation engines
Load results alongside reference data for slicing and dicing by book, country, currency, or counterparty in sub-second time
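The slice-and-dice step can be sketched as a simple in-memory group-by over transaction-level results joined with reference data (all field names and figures below are hypothetical, not Numora's schema):

```python
from collections import defaultdict

# Hypothetical transaction-level risk results (output of the pricing engines).
results = [
    {"trade_id": 1, "book": "RATES-TKY", "counterparty": "ACME", "pv": 120.0},
    {"trade_id": 2, "book": "RATES-TKY", "counterparty": "GLOBEX", "pv": -40.0},
    {"trade_id": 3, "book": "FX-LDN", "counterparty": "ACME", "pv": 75.0},
]

# Hypothetical reference data mapping books to countries and currencies.
book_ref = {
    "RATES-TKY": {"country": "JP", "currency": "JPY"},
    "FX-LDN": {"country": "GB", "currency": "GBP"},
}

def aggregate(rows, ref, dimension):
    """Sum PV along one dimension (book, counterparty, country, currency)."""
    totals = defaultdict(float)
    for row in rows:
        enriched = {**row, **ref[row["book"]]}  # join with reference data
        totals[enriched[dimension]] += row["pv"]
    return dict(totals)

print(aggregate(results, book_ref, "book"))     # {'RATES-TKY': 80.0, 'FX-LDN': 75.0}
print(aggregate(results, book_ref, "country"))  # {'JP': 80.0, 'GB': 75.0}
```

Keeping the enriched rows resident in memory is what makes re-aggregating along a new dimension a sub-second operation rather than a fresh scan.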
In-house custom BI tool
Event-based platform
Provides set views and allows users to create their own views
Creates virtual tables for SQL access and pivoting in applications and spreadsheets
Pricing use case and Graviton implementation
Went live on AWS with pricing around 5 years ago, initially on Intel instances in Northern Virginia (us-east-1), then added Ohio (us-east-2) a year later
Migrated all pricing engines to Graviton about 3 years ago
Required full recompilation of the C++ pricing engines and a full numerical regression, given the importance of number accuracy
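A numerical regression of this kind typically compares per-trade outputs from the x86 baseline build against the Graviton rebuild within tolerances; a minimal sketch with hypothetical values and tolerances (the talk does not specify Numora's thresholds):

```python
import math

# Hypothetical per-trade PVs from the x86 baseline and the Graviton rebuild.
baseline = {"T1": 1052.730001, "T2": -98.400120, "T3": 0.0}
candidate = {"T1": 1052.730001, "T2": -98.40012000001, "T3": 0.0005}

def regress(base, cand, rel_tol=1e-9, abs_tol=1e-6):
    """Return trade IDs whose values drifted beyond tolerance."""
    breaks = []
    for trade_id, expected in base.items():
        if not math.isclose(cand[trade_id], expected, rel_tol=rel_tol, abs_tol=abs_tol):
            breaks.append(trade_id)
    return breaks

print(regress(baseline, candidate))  # ['T3'] -> flagged for investigation
```

Tiny last-bit differences (like T2 here) are expected when changing CPU architecture and compiler; the regression exists to separate those from genuine breaks like T3.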
Runs the pricing engines on 100% Spot Auto Scaling groups across Northern Virginia and Ohio
Leverages multiple instance types and t-shirt sizes
Utilizes all Graviton generations (Graviton2, Graviton3, and Graviton4)
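A fleet like this is commonly expressed as an Auto Scaling group mixed-instances policy; a hedged sketch of the shape such a request can take (the structure boto3's `create_auto_scaling_group` accepts, built as plain data so it can be inspected without AWS credentials; the group name and instance types are illustrative, not Numora's actual configuration):

```python
# Illustrative MixedInstancesPolicy for an EC2 Auto Scaling group.
mixed_instances_policy = {
    "LaunchTemplate": {
        "LaunchTemplateSpecification": {
            "LaunchTemplateName": "pricing-engine",  # hypothetical name
            "Version": "$Latest",
        },
        # Several Graviton generations and t-shirt sizes widen the Spot pools
        # the group can draw from, reducing interruption risk.
        "Overrides": [
            {"InstanceType": "c6g.4xlarge"},   # Graviton2
            {"InstanceType": "c7g.4xlarge"},   # Graviton3
            {"InstanceType": "c7g.16xlarge"},
            {"InstanceType": "c8g.24xlarge"},  # Graviton4
        ],
    },
    "InstancesDistribution": {
        # 100% Spot above base capacity, as in the talk.
        "OnDemandPercentageAboveBaseCapacity": 0,
        "SpotAllocationStrategy": "price-capacity-optimized",
    },
}

spot_pct = 100 - mixed_instances_policy["InstancesDistribution"][
    "OnDemandPercentageAboveBaseCapacity"
]
print(f"Spot share above base capacity: {spot_pct}%")  # 100%
```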
Results of Graviton migration for pricing layer
Runs one engine per 4xlarge instance and four engines per 24xlarge instance
50% reduction in cost
No loss of performance, marginal improvement observed
Lessons learned: choose deployment regions carefully, leverage multiple regions, and optimize the batch tail to control cost
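The talk does not spell out how Numora shrinks the batch tail, but one standard technique is longest-processing-time-first scheduling, so the heavy trades start early instead of straggling at the end while the rest of the fleet sits idle. A sketch with hypothetical task durations:

```python
import heapq

def makespan(durations, workers, longest_first):
    """Greedily assign task durations to the least-loaded of `workers` cores."""
    order = sorted(durations, reverse=longest_first)
    loads = [0.0] * workers
    heapq.heapify(loads)
    for d in order:
        heapq.heappush(loads, heapq.heappop(loads) + d)
    return max(loads)

# Hypothetical pricing-task durations (minutes): a few heavy exotics, many vanillas.
tasks = [90, 70, 60] + [5] * 40
print(makespan(tasks, workers=4, longest_first=False))  # 140.0 -> long tail
print(makespan(tasks, workers=4, longest_first=True))   # 105.0 -> near-ideal packing
```

With Spot fleets billed per instance-hour, cutting the makespan from 140 to 105 minutes here is a direct 25% saving on the batch.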
Aggregation use case and Graviton implementation
Aggregation consists of large JVMs
Switched the aggregation layer to Graviton once Java vendors provided ARM-compatible JVMs
No recompilation needed due to Java-based system, but full regression was performed
Utilizes On-Demand Instances for stateful data loading and slice-and-dice analysis (slicing breaks data into smaller, manageable subsets; dicing rearranges those subsets to surface patterns and trends, enabling detailed, multi-dimensional analysis)
Implemented a Compute Savings Plan covering 80% of usage to reduce costs
Switched to a vendor providing additional optimizations
Implemented custom Spot-style handling for the aggregation engine, pausing or recycling unused instances to further reduce costs
Currently using X2gd instances (memory-optimized Graviton instances with local NVMe storage), but can leverage other instance types as well
Planning to migrate to Graviton 4 in the near future
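The 80% Savings Plan coverage translates into a blended hourly rate; a worked sketch with hypothetical prices (the actual rates and discount percentage are not from the talk):

```python
# Hypothetical numbers: 100 instance-hours at a $1.00/hr on-demand rate,
# with a Compute Savings Plan giving a 40% discount on 80% of usage.
hours = 100
on_demand_rate = 1.00
coverage = 0.80
sp_discount = 0.40

covered_cost = hours * coverage * on_demand_rate * (1 - sp_discount)
uncovered_cost = hours * (1 - coverage) * on_demand_rate
blended = (covered_cost + uncovered_cost) / hours

print(f"effective rate: ${blended:.2f}/hr")  # $0.68/hr vs $1.00 on demand
```

Covering only 80% rather than 100% leaves headroom: usage above the commitment simply bills on demand, instead of the commitment going unused during quiet periods.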
Results of Graviton migration for aggregation platform
Estimated 3x ramp-up of aggregation platform for the same cost
Enabled deployment of new business features, such as the FRTB Internal Models Approach (IMA), while keeping costs flat
Key learnings and recommendations from Numora's cloud migration journey
Invest early in deployment pipelines and enforce resource tagging
Use a centralized deployment pipeline integrated with an engine for instance and OS control
Helps maintain cost efficiency and reduce mistakes
Instill a sense of entrepreneurship and ownership within teams
Savings from AWS can be reinvested in other use cases or added to the bottom line
Implement top-notch observability and guard rails
Focus on resiliency
Start with multi-AZ deployments within regions
Implement multi-region deployments for added resilience
Stay current with AWS innovation and build your own innovations on top of it
AWS provides a powerful environment for system building and problem-solving