Research & Development World

AI agent mines 3,000+ papers to create comprehensive lithium metal battery database

By Julia Rock-Torcivia | May 12, 2026

Researchers at the Korea Advanced Institute of Science and Technology have developed an AI agent, LLMB, designed to accelerate the development-validation cycle of lithium metal batteries (LMBs). They published their work in ACS Central Science, and the agent is available in a GitHub repository.

(a) The LLMB agent automatically extracts text and graph data from the literature. Each stage collects data related to battery materials and properties, including cell components, material compositions, operating conditions, and cyclability. (b) The constructed database was utilized for machine learning, molecular simulation, and material analysis. Credit: doi: 10.1021/acscentsci.5c02433

LLMB integrates a large language model for hierarchical text mining with a specialized graph mining tool, Material Graph Digitizer (MatGD), to enable large-scale extraction and synthesis of battery material data and performance metrics from scientific literature. 

LLMB automated the mining of 3,606 papers, producing a database of 8,074 battery cells with component details and cyclability data. The agent achieved an F1 score of 96.4% for cell name extraction during text mining and 99.3% for data merging.
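For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below shows the arithmetic only; the counts are made-up illustrative numbers, not figures from the study, and the function name is an assumption.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the F1 score: the harmonic mean of precision and recall."""
    if true_positives == 0:
        return 0.0
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)


# Illustrative counts only (hypothetical, not taken from the paper).
example = f1_score(true_positives=90, false_positives=5, false_negatives=10)
print(f"F1 = {example:.3f}")
```

A high F1 requires both few spurious extractions (precision) and few missed ones (recall), which is why it is a common choice for evaluating text mining.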

It also features machine-learning models that predict initial capacity and 50th-cycle capacity for NCM-based batteries, with R² scores of 0.75 and 0.69, respectively. LLMB can also identify the relationship between solvent polarity and battery performance.

Multimodal data extraction

The agent uses a modular architecture where specialized LLMs perform specific tasks like cell name extraction, categorization and value extraction across 29 distinct entities. 
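One way to picture such a modular design is a chain of small functions, each wrapping a single model call. The sketch below is a hypothetical illustration, not the LLMB code: the `LLM` callable, the prompt format, the `CellRecord` fields, the entity names and the canned replies are all assumptions, and a stand-in function replaces the real model so the example runs offline.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical interface: any callable that maps a prompt string to a reply string.
LLM = Callable[[str], str]


@dataclass
class CellRecord:
    name: str
    category: str = ""
    values: Dict[str, str] = field(default_factory=dict)


def extract_cell_names(llm: LLM, text: str) -> List[str]:
    """Stage 1: ask the model for cell names, one per line."""
    reply = llm("NAMES\n" + text)
    return [line.strip() for line in reply.splitlines() if line.strip()]


def categorize(llm: LLM, text: str, name: str) -> str:
    """Stage 2: ask the model which category a named cell belongs to."""
    return llm(f"CATEGORY {name}\n{text}").strip()


def extract_values(llm: LLM, text: str, name: str, entities: List[str]) -> Dict[str, str]:
    """Stage 3: query one entity at a time and keep only non-empty answers."""
    values = {}
    for entity in entities:
        answer = llm(f"VALUE {entity} {name}\n{text}").strip()
        if answer:
            values[entity] = answer
    return values


def run_pipeline(llm: LLM, text: str, entities: List[str]) -> List[CellRecord]:
    return [
        CellRecord(name, categorize(llm, text, name), extract_values(llm, text, name, entities))
        for name in extract_cell_names(llm, text)
    ]


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model: returns canned, made-up replies keyed on the prompt header."""
    canned = {
        "NAMES": "Li||NCM811",
        "CATEGORY Li||NCM811": "full cell",
        "VALUE c_rate Li||NCM811": "1C",
    }
    return canned.get(prompt.splitlines()[0], "")


records = run_pipeline(fake_llm, "paper text goes here", ["c_rate", "cathode_loading"])
print(records)
```

Keeping each stage behind its own function makes it possible to evaluate or swap one step without touching the others, which is the usual motivation for this kind of layout.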

The Material Graph Digitizer (MatGD) uses the YOLOv8 architecture to identify and remove non-data elements, such as text, legends or arrows. It then applies the DBSCAN algorithm to separate data lines based on their RGB color vectors.
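To illustrate the color-separation idea, here is a minimal pure-Python DBSCAN applied to RGB triples. It is a sketch under stated assumptions, not MatGD's implementation: the pixel values, the `eps` radius and the `min_samples` threshold are invented for the example, and a production tool would more likely rely on an optimized library.

```python
from math import dist


def dbscan(points, eps, min_samples):
    """Minimal DBSCAN: returns one cluster label per point; -1 marks noise."""
    labels = [None] * len(points)
    cluster = -1

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point joins as border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_samples:
                queue.extend(nbrs)  # j is a core point, so keep expanding
    return labels


# Hypothetical pixel colors sampled from a two-curve plot.
pixels = [
    (250, 10, 12), (247, 14, 9), (252, 8, 15),   # reddish curve
    (12, 20, 245), (9, 25, 250), (15, 18, 240),  # bluish curve
    (128, 128, 128),                              # isolated gray pixel
]
labels = dbscan(pixels, eps=20.0, min_samples=2)
print(labels)  # → [0, 0, 0, 1, 1, 1, -1]
```

Because DBSCAN does not need the number of clusters in advance, it suits plots where the number of curves is unknown, and stray pixels fall out as noise rather than forming spurious lines.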

To ensure the database is usable, a post-processing model performs SMILES conversion and unit standardization. 
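As a rough picture of what such post-processing involves, the sketch below maps solvent names to SMILES strings and converts capacity values to a single unit. The lookup table, the choice of mAh/g as the target unit and the function names are assumptions made for illustration; the article does not describe how LLMB implements these steps.

```python
# Hypothetical lookup table; a real pipeline would need a much larger mapping
# or a chemical name resolver.
SOLVENT_SMILES = {
    "diethyl ether": "CCOCC",
    "dipropyl ether": "CCCOCCC",
    "diethylene glycol dimethyl ether": "COCCOCCOC",
}

# Multipliers that convert each unit to mAh/g (target unit is an assumed convention).
CAPACITY_TO_MAH_PER_G = {
    "mAh/g": 1.0,
    "Ah/kg": 1.0,     # 1 Ah/kg = 1000 mAh per 1000 g
    "Ah/g": 1000.0,
}


def to_smiles(name: str):
    """Return the SMILES string for a known solvent name, or None if unknown."""
    return SOLVENT_SMILES.get(name.strip().lower())


def standardize_capacity(value: float, unit: str) -> float:
    """Convert a specific-capacity value to mAh/g."""
    try:
        factor = CAPACITY_TO_MAH_PER_G[unit]
    except KeyError:
        raise ValueError(f"unsupported capacity unit: {unit!r}")
    return value * factor


print(to_smiles(" Diethyl Ether "))
print(standardize_capacity(0.2, "Ah/g"))
```

Raising an error on an unknown unit, rather than passing the value through, keeps silently mis-scaled entries out of the database.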

Machine learning predictive modeling 

Using the synthesized database, the researchers developed Random Forest (RF) and Gradient Boosting Regressor (GBR) models to predict battery performance based on material composition and operating conditions. 

The RF model achieved an R² score of 0.75 for NCM cathodes. SHapley Additive exPlanations (SHAP) analysis identified cathode composition, operating conditions and electrolyte descriptors as influential features. The analysis reproduced known materials trends: the stoichiometric ratio of Ni, Mn and Co was the most influential factor, a Ni fraction greater than 0.8 correlated with higher capacity, and Mn content showed a negative correlation.
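The R² score quoted here is the coefficient of determination: one minus the ratio of the residual sum of squares to the total sum of squares. A minimal sketch of the calculation follows; the capacity values are invented for illustration and do not come from the study.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical measured vs. predicted initial capacities (mAh/g).
measured = [180, 190, 200, 210]
predicted = [185, 188, 203, 206]
print(round(r2_score(measured, predicted), 3))  # → 0.892
```

An R² of 1 means the predictions match the measurements exactly, while 0 means the model does no better than always predicting the mean.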

The analysis also found that higher C-rates and cathode loading correlated with lower initial capacity, and that molecular descriptors such as EState_VSA6 and Kappa3 significantly affected capacity.

Experimental validation

The researchers validated the framework by designing new solvent systems. The analysis suggested that low-polarity solvents enhance performance. 

The study compared three nonfluorinated ether solvents: diethyl ether (DEE), dipropyl ether (DPE) and diethylene glycol dimethyl ether (DEGDME). It found that DEE and DPE showed lower polarity and weaker Li⁺ binding energy than DEGDME.

The study concluded that Li||NCM811 cells using DEE and DPE electrolytes delivered higher initial capacity and more stable cycling at 1C and 5C rates, while DEGDME exhibited rapid capacity fade. 

LLMB could allow for the identification of previously hidden physicochemical correlations. The researchers suggest that more standardized reporting in literature and the integration of self-driving laboratories would strengthen the predictive capabilities of AI frameworks like LLMB. 

Copyright © 2026 WTWH Media LLC. All Rights Reserved. The material on this site may not be reproduced, distributed, transmitted, cached or otherwise used, except with the prior written permission of WTWH Media