LLM4RTL System Boosts RTL Code Generation with Tool-Assisted Pipeline

A new research paper proposes LLM4RTL, a tool-assisted large language model system for RTL code generation. The system uses a judge-renew-check-renew-check (JRCRC) pipeline to filter and refine training datasets, and adds pre-processing tools to compensate for LLM weaknesses in rule-based reasoning. On the VerilogEval benchmark, LLM4RTL reportedly performs on par with GPT-4o while using a smaller model.

iGEN Editorial

June 16, 2026

Large language models (LLMs) have made remarkable progress in software engineering and code generation, but applying them to hardware design, specifically RTL (register-transfer level) code generation, remains challenging because it depends on high-quality training samples. A new research paper, "LLM4RTL: Tool-Assisted LLM for RTL Generation" (arXiv, June 2026), introduces a system that combines a novel data refinement pipeline with a tool-assisted architecture to improve RTL code generation performance.

The JRCRC Pipeline

The authors propose a "judge-renew-check-renew-check" (JRCRC) pipeline that iteratively updates a public dataset using a hierarchy of state-of-the-art commercial LLMs that differ in cost and in capability for RTL code generation. According to the paper, the pipeline is a cost-effective way to filter and refine code-generation samples into a higher-quality training dataset. That dataset can then be used to build specialized LLM systems through fine-tuning or low-rank adaptation, without massive manual curation.
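The article does not spell out what each JRCRC stage does, so the following Python sketch is only one plausible reading of the name: a judge accepts or rejects each sample, a cheaper model rewrites rejected samples, a check validates the rewrite, and a stronger model gets a second attempt if the first check fails. The `Sample` type and the `judge`, `cheap_renew`, `strong_renew`, and `check` callables are hypothetical stand-ins for LLM or tool calls, not the authors' interfaces.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Sample:
    """One training example: a natural-language prompt and its RTL code."""
    prompt: str
    code: str


# Hypothetical stage signatures (assumptions, not taken from the paper).
Judge = Callable[[Sample], bool]    # True if the sample is acceptable as-is
Renew = Callable[[Sample], Sample]  # returns a rewritten sample
Check = Callable[[Sample], bool]    # True if a rewritten sample passes


def jrcrc(sample: Sample, judge: Judge, cheap_renew: Renew,
          strong_renew: Renew, check: Check) -> Optional[Sample]:
    """Judge, then renew and check up to twice, escalating model cost."""
    if judge(sample):
        return sample                      # judge: keep good samples untouched
    candidate = cheap_renew(sample)        # renew with the cheaper model
    if check(candidate):                   # check
        return candidate
    candidate = strong_renew(sample)       # renew with the stronger model
    if check(candidate):                   # check
        return candidate
    return None                            # still failing: drop the sample


def refine_dataset(samples: List[Sample], judge: Judge, cheap_renew: Renew,
                   strong_renew: Renew, check: Check) -> List[Sample]:
    """Apply the pipeline to every sample and keep only the survivors."""
    refined = []
    for sample in samples:
        result = jrcrc(sample, judge, cheap_renew, strong_renew, check)
        if result is not None:
            refined.append(result)
    return refined
```

Under this reading, the cost saving would come from escalation: the expensive model is only called for samples that the judge rejected and the cheaper model could not repair.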

Weaknesses and Tool-Assisted Architecture

The research team's experiments identified common LLM weaknesses in rule-based reasoning and logic, both of which are critical for RTL code generation. To address them, the authors developed an architecture in which pre-processing tools dynamically help the LLM infer logical relationships from tabular data formats. According to the paper, this tool-assisted approach mitigates the identified weaknesses and improves the model's ability to generate correct Verilog code.
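The article does not describe the tools themselves. As an illustration of the general idea only, the sketch below shows one hypothetical kind of pre-processing: converting a truth table into an explicit sum-of-products expression, so that the logical relationship is stated outright instead of left for the model to infer from rows of 0s and 1s. The function names and the table layout are assumptions made for this example.

```python
from typing import Sequence, Tuple

# A row is (input bits, output bit), e.g. ((0, 1), 1).
Row = Tuple[Sequence[int], int]


def truth_table_to_sop(inputs: Sequence[str], rows: Sequence[Row]) -> str:
    """Build a sum-of-products expression from the rows whose output is 1."""
    terms = []
    for bits, out in rows:
        if out != 1:
            continue
        # One product term per true row: plain name for 1, negated name for 0.
        literals = [name if bit else f"~{name}"
                    for name, bit in zip(inputs, bits)]
        terms.append("(" + " & ".join(literals) + ")")
    # A table with no true rows is the constant 0.
    return " | ".join(terms) if terms else "1'b0"


def to_assign(output: str, inputs: Sequence[str], rows: Sequence[Row]) -> str:
    """Wrap the expression in a Verilog continuous assignment."""
    return f"assign {output} = {truth_table_to_sop(inputs, rows)};"


# Example: a two-input XOR given as a truth table.
xor_rows = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
print(to_assign("y", ["a", "b"], xor_rows))
# prints: assign y = (~a & b) | (a & ~b);
```

The expression is not minimized; a real tool would likely simplify it, but even the unminimized form replaces a table lookup with explicit logic that can be placed in the prompt.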

Performance Results

With the JRCRC pipeline and the tool-assisted architecture, LLM4RTL achieves overall performance gains on the VerilogEval benchmark that the authors describe as significant. The system outperforms many state-of-the-art methods and delivers performance comparable to GPT-4o while using a much smaller LLM. The following table summarizes the key comparison:

Metric	LLM4RTL	GPT-4o
Model size	Significantly smaller	Larger
VerilogEval performance	Comparable to GPT-4o	Reference point
Pipeline	JRCRC + tool-assisted	None (standard)

Implications for Hardware Design Automation

LLM4RTL demonstrates that targeted pipeline design and tool integration can unlock LLM capabilities for specialized domains like chip design. By combining a data refinement pipeline with pre-processing tools that address logical reasoning gaps, the system achieves high performance without relying on the largest available models. This has practical implications for cost and deployment in electronic design automation (EDA) workflows.

The research was conducted by Jing, Chu, Robert, Yan, Ning, Mortazavi, and Masood S., and published on arXiv under the Computer Science > Hardware Architecture category. The full paper is available for review.
