PreLort: Prefix-Nested LoRA Enables Federated Fine-Tuning Across Heterogeneous Hardware Ranks

A new method called PreLort addresses the challenge of aggregating federated LoRA adapters whose ranks differ because of heterogeneous client hardware. By organizing adapter dimensions into a prefix hierarchy and introducing segment-wise aggregation and prefix-nested training, PreLort consistently outperforms existing heterogeneous federated LoRA methods in accuracy and ROUGE-L while achieving lower or comparable perplexity.

iGEN Editorial

June 16, 2026

Federated fine-tuning of large language models (LLMs) with parameter-efficient methods such as LoRA (Low-Rank Adaptation) enables privacy-preserving adaptation of foundation models. However, heterogeneous hardware introduces a critical challenge: adapters trained at different ranks have mismatched shapes and cannot be averaged directly. Existing methods that do allow aggregation under heterogeneous ranks fail to control how information is distributed across rank dimensions, leading to suboptimal use of shared low-rank representations. To address this, researchers from multiple institutions have proposed PreLort, a nested low-rank formulation for federated LoRA that organizes adapter dimensions into a prefix hierarchy.

The Challenge of Heterogeneous Ranks

In federated learning, clients often have different computational capabilities, so they fine-tune LoRA adapters at different ranks. Averaging these adapters directly requires zero-padding the lower-rank ones to a common shape, and the padded zeros dilute the higher-rank dimensions that only some clients actually trained. According to the paper, existing heterogeneous federated LoRA methods also do not control how information is distributed across rank dimensions, so the shared low-rank representation is used suboptimally. PreLort addresses this by encouraging lower-rank dimensions to encode the most task-relevant information while higher-rank dimensions capture additional capacity.
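The dilution effect can be seen in a small numeric sketch. This is an illustration only, not code from the paper: the function name `zero_pad_average`, the three toy clients, and the uniform client weighting are all assumptions, and a single adapter factor stands in for the full LoRA update.

```python
def zero_pad_average(client_rows, max_rank):
    """Naive baseline (assumed for illustration): treat missing rank rows as zeros,
    then average every row over ALL clients."""
    num_clients = len(client_rows)
    width = len(client_rows[0][0])
    totals = [[0.0] * width for _ in range(max_rank)]
    for rows in client_rows:
        for i, row in enumerate(rows):  # rows beyond a client's rank contribute zero
            for j, value in enumerate(row):
                totals[i][j] += value
    return [[value / num_clients for value in row] for row in totals]


# Three hypothetical clients with ranks 1, 1, and 2; every trained row is [1.0, 1.0].
clients = [
    [[1.0, 1.0]],
    [[1.0, 1.0]],
    [[1.0, 1.0], [1.0, 1.0]],
]

naive = zero_pad_average(clients, max_rank=2)
print(naive[0])  # first rank dimension: trained by all three clients, stays at 1.0
print(naive[1])  # second rank dimension: trained by one client, shrunk to about 1/3
```

Only one of the three clients trained the second rank dimension, yet its values are divided by three, which is the dilution described above.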

How PreLort Works

PreLort introduces three key components that together encourage a consistent low-rank prefix capturing the most task-relevant information, while higher-rank dimensions learn additional capacity. The first is a segment-wise aggregation rule that averages only over clients contributing to each rank segment, avoiding dilution from zero-padded lower-rank clients. The second is a prefix-nested training strategy that optimizes each adapter under multiple rank truncations, encouraging useful signal to concentrate in low-rank prefix dimensions. The third is the overall nested low-rank formulation that organizes adapter dimensions into a prefix hierarchy. These components allow low-rank clients to benefit from richer information contributed by higher-rank clients, as prefix dimensions are consistently learned and aggregated.
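The segment-wise aggregation rule can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the helper names `segment_wise_average` and `slice_for_client` are hypothetical, clients are weighted uniformly, and only one adapter factor is aggregated. Averaging each rank row over the clients that own it is equivalent to averaging per segment, because the set of contributing clients is constant within a segment.

```python
def segment_wise_average(client_rows):
    """Average each rank row only over the clients whose adapter includes that row."""
    max_rank = max(len(rows) for rows in client_rows)
    width = len(client_rows[0][0])
    aggregated = []
    for i in range(max_rank):
        # Clients with rank <= i do not contribute to row i at all.
        contributors = [rows[i] for rows in client_rows if len(rows) > i]
        aggregated.append(
            [sum(row[j] for row in contributors) / len(contributors) for j in range(width)]
        )
    return aggregated


def slice_for_client(aggregated, rank):
    """A client of a given rank receives the matching prefix of the global adapter."""
    return aggregated[:rank]


# Hypothetical clients with ranks 1, 1, and 2.
clients = [
    [[1.0, 2.0]],
    [[3.0, 4.0]],
    [[5.0, 6.0], [7.0, 8.0]],
]

aggregated = segment_wise_average(clients)
print(aggregated)                       # [[3.0, 4.0], [7.0, 8.0]]
print(slice_for_client(aggregated, 1))  # [[3.0, 4.0]]
```

The second row keeps the full value contributed by the only rank-2 client, while the rank-1 clients still receive a prefix that every client helped to train.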

Component	Description	Benefit
Segment-wise aggregation	Averages only over clients contributing to each rank segment	Avoids dilution from zero-padded lower-rank clients
Prefix-nested training	Optimizes each adapter under multiple rank truncations	Encourages useful signal to concentrate in low-rank prefix dimensions
Nested low-rank formulation	Organizes adapter dimensions into a prefix hierarchy	Ensures lower-rank dimensions encode task-relevant information while higher-rank dimensions capture additional capacity
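The idea behind prefix-nested training can be illustrated with a toy objective. The article does not give the paper's actual loss, truncation ranks, or weighting, so everything here is an assumption: a squared error stands in for the task loss, the truncation ranks are 1 and 2, and the function names are hypothetical. The sketch evaluates one adapter under several rank truncations and averages the resulting losses.

```python
def truncated_update(B, A, r):
    """LoRA update using only the first r rank dimensions: B[:, :r] @ A[:r, :]."""
    return [
        [sum(B[i][k] * A[k][j] for k in range(r)) for j in range(len(A[0]))]
        for i in range(len(B))
    ]


def squared_error(update, target):
    """Toy stand-in for a task loss (assumption, not the paper's objective)."""
    return sum(
        (u - t) ** 2
        for update_row, target_row in zip(update, target)
        for u, t in zip(update_row, target_row)
    )


def prefix_nested_loss(B, A, target, truncation_ranks):
    """Average the toy loss over several prefix truncations of the same adapter."""
    losses = [squared_error(truncated_update(B, A, r), target) for r in truncation_ranks]
    return sum(losses) / len(losses)


target = [[2.0, 0.0], [0.0, 1.0]]

# Both adapters reproduce the target at full rank (r = 2).
# This one stores the larger component in the first rank dimension.
B_prefix, A_prefix = [[1.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 1.0]]
# This one stores the larger component in the second rank dimension.
B_late, A_late = [[0.0, 1.0], [1.0, 0.0]], [[0.0, 1.0], [2.0, 0.0]]

print(prefix_nested_loss(B_prefix, A_prefix, target, [1, 2]))  # 0.5
print(prefix_nested_loss(B_late, A_late, target, [1, 2]))      # 2.0
```

Both adapters are equally good at full rank, but the nested objective prefers the one whose first rank dimension carries the larger share of the signal, which is the behavior the prefix-nested strategy is meant to encourage.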

Experimental Results

The researchers evaluate PreLort against prior heterogeneous federated LoRA methods on accuracy, ROUGE-L (a text generation metric based on the longest common subsequence between generated and reference text), and perplexity across multiple base models. The paper states that "our method consistently outperforms prior heterogeneous federated LoRA methods in accuracy and ROUGE-L, while achieving lower or comparable perplexity across multiple base models."

Implications for Enterprise AI

For enterprise technology decision-makers, PreLort represents a step toward more efficient and effective federated learning deployments. Edge devices and regional servers often have varying hardware capabilities, a situation common in global supply chains and logistics. In such settings, aggregating adapters of different ranks without diluting the contributions of higher-rank clients can improve model performance without centralizing sensitive data. While the research is still at the academic stage, its focus on rank heterogeneity directly addresses a practical barrier to deploying federated LLM fine-tuning across mixed hardware.

The authors of the paper are Muhammad Waseem, Nurbek Tastan, Andrej Jovanovic, Nicholas D. Lane, Nils Lukas, Karthik Nandakumar, and Samuel Horvath. The work is available on arXiv, where it is listed under the computer science category Distributed, Parallel, and Cluster Computing (cs.DC).
