
V2.2.0 Released

LastUpdate: 2026-05-12, Author: HAO022

After a period of development and testing, v2.2.0 is now available. This release provides a more stable and robust foundation for large-scale production environments.

Key Features

  • Hardware: Hardware fault detection has been added for AI and general-purpose computing scenarios. This includes support for RDMA PFC congestion detection.
  • Multi-Cloud: Native adapters have been implemented for major cloud platforms, including Alibaba Cloud, Google Cloud, and AWS.
  • MetaX: MetaX contributed GPU hardware fault detection capabilities to the community.
  • Kubernetes: Support for additional cgroup driver types has been added, along with an automated detection mechanism.
  • IOtracing: The iotracing tool has been open-sourced. It supports rapid execution and automated tracing from the command line, addressing intermittent I/O performance issues.
  • Configuration: All configuration options have been standardized, which significantly reduces configuration complexity.
  • Testing: The test suite has been expanded to cover unit, integration, and end-to-end tests.
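
To illustrate the idea behind automated cgroup driver detection, here is a minimal sketch. It is not HUATUO's implementation. It assumes detection can be based on the well-known kubelet path layouts: the systemd driver produces paths such as kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod<uid>.slice, while the cgroupfs driver produces paths such as kubepods/burstable/pod<uid>. The function name detect_cgroup_driver and both regular expressions are hypothetical and exist only for illustration.

```python
import re

# Hypothetical patterns, not taken from HUATUO.
# systemd driver: pod slices are named kubepods[-<qos>]-pod<uid>.slice,
# with dashes in the pod UID replaced by underscores.
_SYSTEMD_POD = re.compile(r"kubepods(?:-[a-z]+)?-pod([0-9a-f_]+)\.slice")

# cgroupfs driver: pod directories are kubepods/[<qos>/]pod<uid>.
_CGROUPFS_POD = re.compile(
    r"/kubepods/(?:(?:burstable|besteffort)/)?pod([0-9a-f-]+)"
)


def detect_cgroup_driver(cgroup_path: str) -> str:
    """Guess the kubelet cgroup driver from a pod cgroup path.

    Returns "systemd", "cgroupfs", or "unknown" when the path does not
    look like a kubelet-managed pod cgroup.
    """
    if _SYSTEMD_POD.search(cgroup_path):
        return "systemd"
    if _CGROUPFS_POD.search(cgroup_path):
        return "cgroupfs"
    return "unknown"


if __name__ == "__main__":
    # Example paths are made up for demonstration.
    for path in (
        "/kubepods.slice/kubepods-burstable.slice/"
        "kubepods-burstable-pod12ab_34cd.slice/cri-containerd-abc.scope",
        "/kubepods/besteffort/pod12ab-34cd/abc",
        "/system.slice/sshd.service",
    ):
        print(path, "->", detect_cgroup_driver(path))
```

A path-based check like this avoids requiring the operator to declare the driver up front. A production detector would likely also need to handle custom cgroup roots and runtime-specific container scopes.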

Documentation

The documentation for version 2.2.0 has been restructured and expanded. These improvements are intended to speed up onboarding and help users make fuller use of HUATUO.

  • Configuration: Detailed explanations of configuration parameters and recommended best practices have been added.

  • Metrics: All monitoring metrics have been categorized and are accompanied by detailed descriptions.

  • Events: The documented events span the system, application, and hardware layers and provide data for issue diagnosis.

  • AutoTracing: Documentation for the automatic tracing feature has been added, covering supported trace types, core characteristics, and usage recommendations.


HUATUO is an operating system observability project open-sourced by DiDi and incubated under the China Computer Federation (CCF).
