openFuyao v25.09 Released

Release-management Maintainer · September 30, 2025

The openFuyao community is committed to building an open software ecosystem for diverse computing clusters, focusing on efficient collaboration of AI-native technologies and on maximizing the utilization of available compute resources.

Community Release v25.09 introduces many new features and optimizes several existing features. The following sections detail the specific additions and changes:

openFuyao Kubernetes Upgrade

SIG-orchestration-engine is the core SIG of the openFuyao community and is dedicated to building core container orchestration engine components for diverse clusters.

In v25.09, the container orchestration SIG introduces multiple Kubernetes enhancements to comprehensively improve performance, O&M capabilities, and reliability.

Kubernetes Upgrade from 1.28 to 1.33

openFuyao upgrades its Kubernetes base once a year; this release moves from Kubernetes 1.28 to 1.33.

Performance Optimization: CPU Scaling-Up During Service Startup Supported by kubelet to Accelerate Java Program Startup

Some applications require more resources during startup than their steady-state limit allows, which can result in long startup durations; Java services, foundation model inference, and scientific computing workloads are typical examples. This feature accelerates application startup by temporarily scaling up the CPU limit during startup while keeping resource usage low at steady state. This helps Java programs reach readiness faster in scenarios such as installation and deployment, restarts, and fault recovery.

If resources are adequate, this feature can improve the startup speed of test applications by over 50% and that of specific applications by over 100%.
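The idea can be sketched as a small control loop: run the container under a boosted CPU limit, then drop back to the steady-state limit once the container reports ready. The class and limit values below are illustrative, not the openFuyao kubelet API:

```python
# Illustrative sketch of a startup CPU boost (not openFuyao's actual
# implementation): the container starts under a boosted CPU limit and is
# resized in place to its steady-state limit once it reports ready.
from dataclasses import dataclass

@dataclass
class StartupBoost:
    steady_cpu_millis: int   # steady-state CPU limit (millicores)
    boost_cpu_millis: int    # temporary limit granted during startup
    ready: bool = False      # flipped when the readiness probe succeeds

    def current_limit(self) -> int:
        """CPU limit the node agent should enforce right now."""
        return self.steady_cpu_millis if self.ready else self.boost_cpu_millis

    def mark_ready(self) -> None:
        # In-place resize back to the steady-state limit; no restart needed.
        self.ready = True

boost = StartupBoost(steady_cpu_millis=500, boost_cpu_millis=2000)
print(boost.current_limit())  # 2000 during startup
boost.mark_ready()
print(boost.current_limit())  # 500 at steady state
```

Reverting without a restart is what makes this safe for Java workloads: the JVM sizes its thread pools at startup and then simply sees less CPU time afterwards.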

O&M Capability Improvement: Hot Loading for Enhanced Kubernetes Certificate Management

Kubernetes, etcd, and CoreDNS natively lack certificate hot loading. openFuyao Kubernetes implements hot loading for CoreDNS, reducing the impact of certificate rotation on running services.

O&M Capability Improvement: Capacity Expansion for StatefulSet PVC Template

In a Kubernetes cluster, stateful storage services are typically deployed as StatefulSet workloads. As the cluster's service scale grows, the storage space of these workloads must grow with it. Natively, expansion requires modifying each PVC object associated with the StatefulSet individually. This feature instead lets users modify the PVC template in the StatefulSet specification, which triggers automatic expansion of all associated PVC objects and simplifies O&M.
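The reconciliation behind this can be sketched in a few lines: compare the storage request in the template against each associated PVC and emit a patch for every PVC that is smaller. The function below is an illustrative simplification (sizes reduced to plain GiB integers), not the controller's actual code:

```python
# Illustrative reconciliation for StatefulSet PVC expansion: when the storage
# request in volumeClaimTemplates is raised, patch every associated PVC whose
# current request is smaller. Sizes are plain GiB integers for simplicity.
def pvc_expansion_patches(template_gib: int,
                          pvc_sizes_gib: dict[str, int]) -> dict[str, int]:
    """Return {pvc_name: new_size_gib} for PVCs that need to grow.

    PVCs already at or above the template size are left alone: PVC
    storage can only be expanded, never shrunk.
    """
    return {
        name: template_gib
        for name, size in pvc_sizes_gib.items()
        if size < template_gib
    }

pvcs = {"data-db-0": 10, "data-db-1": 10, "data-db-2": 20}
print(pvc_expansion_patches(20, pvcs))  # {'data-db-0': 20, 'data-db-1': 20}
```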

Reliability Enhancement: Log Rotation

The native kube-log-runner provides only log redirection. fuyao-log-runner extends this mechanism with byte-precise log rotation: services keep running even when log disk space is insufficient, and if log files are deleted, they are automatically recreated so log output continues uninterrupted.
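The two guarantees above can be sketched as follows: rotate before a write would push the file past its byte cap, and reopen in append mode so a deleted file is recreated transparently. This is an illustrative sketch in the spirit of fuyao-log-runner, not its actual implementation:

```python
import os

# Illustrative byte-precise log rotation (not fuyao-log-runner's real code):
# rotate before a write would exceed the size cap, and recreate the log file
# automatically if it was deleted out from under the writer.
class RotatingLog:
    def __init__(self, path: str, max_bytes: int):
        self.path = path
        self.max_bytes = max_bytes

    def write(self, data: bytes) -> None:
        size = os.path.getsize(self.path) if os.path.exists(self.path) else 0
        if size > 0 and size + len(data) > self.max_bytes:
            # Keep exactly one rotated file; a real tool would keep N.
            os.replace(self.path, self.path + ".1")
        # Append mode recreates the file if it no longer exists.
        with open(self.path, "ab") as f:
            f.write(data)
```

Checking the size before each write (rather than after) is what makes the cap byte-precise: the active file never exceeds `max_bytes` unless a single record is itself larger than the cap.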

Performance Optimization: High-Density Container Deployment

In high-specification bare metal container scenarios, a large number of Pods are deployed on a single node. Kubernetes natively supports 100 to 300 Pods per node. This feature reduces resources consumed by the container runtime and kubelet probes, reducing the base overhead and runtime costs of the container infrastructure. As a result, the single-node Pod deployment density is increased to over 1,000 Pods.

Environment OS Compatibility Verification

This feature conducts comprehensive environment diagnostics, delivering non-intrusive compatibility check results for openFuyao within the current OS.
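A non-intrusive check of this kind typically reads host metadata such as `/etc/os-release` and compares it against a support matrix. The sketch below is illustrative; the support matrix shown is made up and is not openFuyao's actual compatibility list:

```python
# Illustrative, non-intrusive OS compatibility check: parse /etc/os-release
# content and compare it against a supported list. The supported set below
# is a placeholder, not openFuyao's real support matrix.
def parse_os_release(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, stripping optional surrounding quotes."""
    info = {}
    for line in text.splitlines():
        if "=" in line and not line.startswith("#"):
            key, _, value = line.partition("=")
            info[key.strip()] = value.strip().strip('"')
    return info

def is_supported(info: dict[str, str],
                 supported: set[tuple[str, str]]) -> bool:
    return (info.get("ID", ""), info.get("VERSION_ID", "")) in supported

sample = 'ID="openEuler"\nVERSION_ID="24.03"\nPRETTY_NAME="openEuler 24.03"'
print(is_supported(parse_os_release(sample), {("openEuler", "24.03")}))  # True
```

Because the check only reads files and emits a report, it can run on a production host without altering its state.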

AI Inference Optimization

SIG-ai-inference is dedicated to building an open, efficient, and future-oriented cloud-native LLM inference acceleration system. The ai-inference-integration project cultivated by this SIG is officially released in v25.09. It provides an end-to-end acceleration solution for AI inference, incorporating an intelligent routing module, an inference backend module, and a global KV cache management module. It significantly improves inference throughput and reduces latency, providing efficient and reliable technical support for AI service deployment.

End-to-End AI Acceleration: AI Inference Integrated Deployment

AI inference integrated deployment is an end-to-end solution for optimizing AI inference services in a cloud-native environment. Based on a Helm chart, it seamlessly integrates three acceleration modules: intelligent routing (hermes-router), the high-performance inference backend, and global KV cache management (cache-indexer). This offers a complete deployment pipeline for AI inference acceleration components, covering request ingestion, inference execution, and resource management for a one-stop deployment experience.

Intelligent Routing

The intelligent routing module adopts a KV-aware routing policy. Compared with round-robin scheduling, KV-aware routing dynamically balances KV cache hits against the load on each vLLM instance, significantly improving inference performance:

  • Average first-token latency reduced by 56.9%
  • Output throughput increased by 133.0%
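The trade-off behind these numbers can be sketched as a scoring function: prefer the instance whose KV cache already holds the longest prefix of the request, penalized by that instance's current load. The weights and data shapes below are illustrative; hermes-router's real policy is more elaborate:

```python
# Illustrative KV-aware routing score (hermes-router's actual policy is more
# elaborate): reward the longest cached prompt prefix, penalize current load.
def pick_instance(prompt_tokens: list[int],
                  instances: dict[str, dict]) -> str:
    """instances: name -> {"cached_prefix": list[int], "load": float in [0,1]}."""
    def prefix_hit(cached: list[int]) -> int:
        n = 0
        for a, b in zip(prompt_tokens, cached):
            if a != b:
                break
            n += 1
        return n

    def score(inst: dict) -> float:
        # Cache hits save prefill work; the load penalty keeps traffic balanced.
        return prefix_hit(inst["cached_prefix"]) - 100.0 * inst["load"]

    return max(instances, key=lambda name: score(instances[name]))

instances = {
    "vllm-0": {"cached_prefix": [1, 2, 3, 4], "load": 0.9},
    "vllm-1": {"cached_prefix": [1, 2], "load": 0.1},
}
# vllm-1 wins: its lighter load outweighs vllm-0's longer cache hit.
print(pick_instance([1, 2, 3, 4, 5], instances))  # vllm-1
```

Pure round-robin ignores both terms, so it frequently sends a request to an instance that must recompute a prefix another instance already has cached, which is where the first-token latency savings come from.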

Best Practice Integration: AI Inference Software Suite

The AI inference software suite provides an integrated solution for AI appliances, initially supporting full-stack foundational LLM inference and DeepSeek. It delivers out-of-the-box usability, with full-link integration from hardware drivers through inference frameworks to inference models. Currently, it offers driver support for certain NPU and GPU models.

Existing Feature Optimization

Colocation: QoS Assurance Capability Improvement

In v25.09, the QoS assurance capability of colocation is further enhanced. Multiple capabilities, such as Rubik elastic throttling and asynchronous tiered memory reclamation, are integrated. The QoS fluctuation is further reduced while a 30–50% improvement in resource utilization is maintained. Additionally, SIG-colocation has led structured code refactoring to simplify the repository architecture and enhance the code quality.

Cluster-API: Installation Mode Optimization

SIG-installation provides a new installation experience in this release.

  • Simplified creation of the offline installation package: Users can select which extensions to include in the offline installation package through configuration.
  • Lightweight installation specifications: Bootstrap nodes and management clusters can be co-deployed, so a cluster can be created even with a single node.

This article was first published by the openFuyao Community. Reproduction is permitted in accordance with the terms of the CC BY-SA 4.0 license.