DEV Community

Nouha Bel haj youssef


Agentic AI in chemistry

I’ve been reading “LangChain for Life Sciences and Healthcare” by Ivan Reznikov, published by O'Reilly, and here’s what stood out to me:
In chemistry AI, the way we represent molecules may shape how models “understand” chemistry.
Chemistry-tuned LLMs don’t interpret molecules like chemists do. They interpret them as sequences of tokens.
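To make "sequences of tokens" concrete, here is a minimal Python sketch of a regex-based SMILES tokenizer. The pattern and the name `tokenize_smiles` are illustrative assumptions in the spirit of regexes used in reaction-prediction work. This is not the tokenizer of any model discussed in the book.

```python
import re
from typing import List

# Illustrative pattern (an assumption, not taken from the book):
# bracket atoms first, then two-letter halogens, then single-character
# atoms, bonds, branches and ring-closure labels.
SMILES_TOKEN_PATTERN = re.compile(
    r"(\[[^\]]+\]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p"
    r"|\(|\)|\.|=|#|-|\+|\\|/|:|~|@|\?|>|\*|\$|%[0-9]{2}|[0-9])"
)


def tokenize_smiles(smiles: str) -> List[str]:
    """Split a SMILES string into the tokens a sequence model would see."""
    tokens = SMILES_TOKEN_PATTERN.findall(smiles)
    # Guard against characters the pattern silently skipped.
    if "".join(tokens) != smiles:
        raise ValueError(f"Unrecognized characters in SMILES: {smiles!r}")
    return tokens


print(tokenize_smiles("CC(=O)O"))           # acetic acid
# ['C', 'C', '(', '=', 'O', ')', 'O']
print(tokenize_smiles("C[C@H](N)C(=O)O"))   # alanine, with a stereocenter
# ['C', '[C@H]', '(', 'N', ')', 'C', '(', '=', 'O', ')', 'O']
```

Note that the stereocenter ends up as one opaque token, `[C@H]`. The model sees a symbol, not a 3D arrangement.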
Those tokens can come in different molecular representations:
• SMILES
• SELFIES
• InChI identifiers
This creates a fascinating challenge for generative AI:
👉 Which molecular representation gives LLMs the best ability to reason about chemistry?
SMILES is compact and widely used, but it is ambiguous (the same molecule can be written as many different strings) and it struggles with stereochemistry and incomplete molecular context.
SELFIES is more robust because every generated sequence maps to a valid molecule.
InChI provides standardization, but its complexity makes sequence generation harder.
Molecular representation is not just a formatting choice. It directly influences how AI models learn chemical relationships.
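Here is a small Python sketch of why generated SMILES can be fragile: a model that emits tokens one at a time can easily leave a branch or ring open. The function `smiles_syntax_problems` is a hypothetical helper written for this illustration. It only checks parentheses and ring-closure labels, so it is far from a full SMILES validator (valence and aromaticity are not checked). A real pipeline would rely on a cheminformatics toolkit.

```python
from typing import List


def smiles_syntax_problems(smiles: str) -> List[str]:
    """Report unbalanced branches and unpaired ring closures (partial check only)."""
    problems = []
    depth = 0            # current nesting depth of '(' branches
    open_rings = set()   # ring-closure labels seen an odd number of times
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if ch == "[":
            # Skip bracket atoms: digits inside them are isotopes, counts or charges.
            close = smiles.find("]", i)
            if close == -1:
                problems.append("unclosed bracket atom")
                break
            i = close + 1
            continue
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                problems.append("')' without matching '('")
                depth = 0
        elif ch == "%":
            # Two-digit ring labels are written as %nn.
            label = smiles[i + 1:i + 3]
            if len(label) == 2 and label.isdigit():
                open_rings ^= {str(int(label))}
                i += 3
                continue
            problems.append("malformed %nn ring label")
        elif ch in "0123456789":
            open_rings ^= {ch}  # first occurrence opens, second closes
        i += 1
    if depth > 0:
        problems.append("unclosed '('")
    if open_rings:
        problems.append("unpaired ring closure: " + ", ".join(sorted(open_rings)))
    return problems


print(smiles_syntax_problems("c1ccccc1"))  # benzene: []
print(smiles_syntax_problems("c1ccccc"))   # ['unpaired ring closure: 1']
print(smiles_syntax_problems("CC(=O"))     # ["unclosed '('"]
```

SELFIES is designed so that this class of error cannot occur, which is the robustness advantage described above.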
This is where initiatives like GT4SD (Generative Toolkit for Scientific Discovery) become important.
GT4SD explores chemistry-focused generative models capable of:
🧪 Chemical reaction prediction
🧪 Retrosynthesis prediction
🧪 Description → SMILES generation
🧪 SMILES → caption generation
🧪 Paragraph → laboratory actions
The book evaluated several chemistry-tuned models, including:
• GT4SD multitask T5 models
• MolT5
• CHEMLLM-2b
What I found most interesting was not the successes, but the failures.
During reaction prediction tasks such as Fischer esterification, several models generated chemically implausible molecules, sometimes introducing atoms never present in the reactants.
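This kind of failure is cheap to catch automatically. Below is a minimal Python sketch of an element-conservation check: it flags any element that shows up in the predicted products but not in the reactants. The helper names `elements_in` and `unexpected_elements` are hypothetical, the parsing is a simplified regex over SMILES rather than a real parser, and hydrogen is ignored because it is usually implicit. Catalysts and reagents would need to be listed among the reactants.

```python
import re
from typing import Iterable, Set


def elements_in(smiles: str) -> Set[str]:
    """Collect heavy-atom element symbols from a SMILES string (simplified)."""
    found = set()
    # Bracket atoms: optional isotope digits, then the element symbol.
    for symbol in re.findall(r"\[\d*([A-Z][a-z]?|[a-z]{1,2})", smiles):
        found.add(symbol.capitalize())
    # Outside brackets only the organic subset may appear.
    outside = re.sub(r"\[[^\]]*\]", "", smiles)
    for symbol in re.findall(r"Cl|Br|[BCNOPSFI]|[bcnops]", outside):
        found.add(symbol.capitalize())  # aromatic 'c' counts as carbon
    return found


def unexpected_elements(reactants: Iterable[str], products: Iterable[str]) -> Set[str]:
    """Return elements present in the products but absent from the reactants."""
    allowed = set().union(*(elements_in(s) for s in reactants))
    produced = set().union(*(elements_in(s) for s in products))
    return produced - allowed


# Fischer esterification: acetic acid + ethanol -> ethyl acetate + water
reactants = ["CC(=O)O", "CCO"]
print(unexpected_elements(reactants, ["CC(=O)OCC", "O"]))  # set()
# A hypothetical bad prediction that invents a nitrogen atom
print(unexpected_elements(reactants, ["CC(=O)NCC"]))       # {'N'}
```

Passing this check does not make a prediction correct. It only rules out the most obvious violations of mass conservation.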
And that exposes the deeper issue:
⚠️ Predicting molecular tokens is not the same as understanding chemistry.
Real chemistry depends on:
โ€ข Reaction mechanisms
โ€ข Thermodynamics
โ€ข Electron movement
โ€ข Stereochemistry
โ€ข 3D spatial interactions
A 1D token sequence can only capture part of that reality.
One takeaway from the chapter was:
“Without understanding the science behind chemical reactions, models are just guessing possible solutions.”
The future of AI-driven chemistry likely won’t come from LLMs alone.
It will come from hybrid systems combining:
🔹 Language models
🔹 Graph neural networks
🔹 3D molecular representations
🔹 Physics-informed AI
🔹 Symbolic chemical reasoning
We are moving from models that generate chemistry to models that may eventually understand chemistry.
