238 / 2025-06-15 16:21:44
Rolling Bearing Life Prediction with Multi-Scale Feature Fusion and Attention Mechanism Enhancement
RUL prediction, Rolling bearing, Dual attention mechanism, Multi-scale residual pyramid, Relative position encoding
Final draft
Yingming Yang / Kunming University of Science and Technology
Zhihai Wang / Kunming University of Science and Technology
Xiaoqin Liu / Kunming University of Science and Technology
Tao Liu / Kunming University of Science and Technology
Meiwang Meng / Kunming University of Science and Technology
Jun Zhou / Kunming University of Science and Technology
Rolling bearings are critical components of rotating machinery, and their condition is vital to equipment safety and operational reliability. Existing methods for predicting the remaining useful life (RUL) of rolling bearings mine degradation information insufficiently and capture critical time-step information poorly, which causes the loss of key degradation cues and limits prediction accuracy and generalization. To address these issues, this study proposes a deep multiscale feature fusion network with dual attention mechanisms. The method first extracts multi-domain features from rolling bearing vibration signals, including traditional time-domain features such as variance, mean, energy, root mean square, and crest factor; frequency-domain features such as the frequency centroid; and two newly introduced features, the arctangent standard deviation and the inverse hyperbolic sine standard deviation. The feature set is then standardized with maximum-absolute-value normalization to provide high-quality input for the subsequent deep learning model. The overall network comprises four core components that form an end-to-end prediction framework: an SE attention mechanism, a multiscale feature extraction and fusion module, dual attention-enhanced Transformer encoder layers, and a regression prediction layer.
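As a concrete illustration of the feature extraction and normalization steps, the sketch below computes the listed multi-domain features for one vibration segment and applies maximum-absolute-value normalization. It is a minimal NumPy example with illustrative function names, not the authors' implementation, and it assumes the sampling frequency fs is known.

```python
import numpy as np

def extract_features(segment, fs):
    """Compute the multi-domain features described above for one vibration
    segment sampled at fs Hz (illustrative sketch)."""
    x = np.asarray(segment, dtype=float)
    rms = np.sqrt(np.mean(x ** 2))
    feats = {
        "mean": np.mean(x),
        "variance": np.var(x),
        "energy": np.sum(x ** 2),
        "rms": rms,
        "crest_factor": np.max(np.abs(x)) / rms,   # peak value / RMS
        "arctan_std": np.std(np.arctan(x)),        # arctangent standard deviation
        "asinh_std": np.std(np.arcsinh(x)),        # inverse hyperbolic sine standard deviation
    }
    # Frequency-domain feature: frequency (spectral) centroid
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    feats["freq_centroid"] = np.sum(freqs * spectrum) / np.sum(spectrum)
    return feats

def max_abs_normalize(feature_matrix):
    """Maximum-absolute-value normalization: scale each feature column
    into [-1, 1] by its maximum absolute value."""
    F = np.asarray(feature_matrix, dtype=float)
    scale = np.max(np.abs(F), axis=0)
    scale[scale == 0] = 1.0  # avoid division by zero for constant features
    return F / scale
```

Stacking these feature vectors over a bearing's life yields a (time steps × features) matrix that is normalized column-wise before being fed to the network.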

The core contributions of this study lie in three technical advances. First, a multiscale residual pyramid layer is designed: feature maps are downsampled to 1/2, 1/4, and 1/8 of their original size, 1×3, 1×5, and 1×7 convolution kernels extract features at each scale, and adjacent levels are fused through upsampling, so that the network captures both local features and global degradation trends (see the sketch below). Second, a two-stage attention enhancement mechanism is proposed: an SE attention mechanism calibrates and balances the temporal importance of the feature sequence, using global average pooling to compress global spatial information into channel descriptors, while a relative-position-encoding-enhanced multi-head self-attention mechanism establishes a two-dimensional coordinate system over time steps, constructs a relative position index matrix, and combines relative position weights with the attention weights, strengthening the model's perception of time-series structure. Finally, a time-aware enhanced Transformer architecture is constructed, employing a three-layer cascaded multiscale residual pyramid with 2 encoder layers, 4 attention heads, and a 64-dimensional embedding; it extracts and fuses features at different scales while avoiding vanishing or exploding gradients, enhances critical time-step information, and strengthens temporal dependencies between time steps.
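A minimal sketch of the multiscale residual pyramid layer, assuming a PyTorch implementation operating on feature sequences of shape (batch, channels, time); the pooling operator, fusion order, and activation are assumptions rather than the authors' exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleResidualPyramid(nn.Module):
    """Sketch of the multiscale residual pyramid: the input sequence is
    downsampled to 1/2, 1/4, and 1/8 of its length, filtered with 1x3, 1x5,
    and 1x7 convolutions, and adjacent levels are fused by upsampling."""

    def __init__(self, channels):
        super().__init__()
        # One branch per scale; kernel sizes 3, 5, 7 with 'same' padding.
        self.conv_half    = nn.Conv1d(channels, channels, kernel_size=3, padding=1)
        self.conv_quarter = nn.Conv1d(channels, channels, kernel_size=5, padding=2)
        self.conv_eighth  = nn.Conv1d(channels, channels, kernel_size=7, padding=3)

    def forward(self, x):                         # x: (batch, channels, time)
        t = x.size(-1)
        # Downsample the feature map to 1/2, 1/4, and 1/8 of its length.
        x2 = F.avg_pool1d(x, kernel_size=2, stride=2)
        x4 = F.avg_pool1d(x, kernel_size=4, stride=4)
        x8 = F.avg_pool1d(x, kernel_size=8, stride=8)
        # Multiscale convolution on each level.
        f2 = torch.relu(self.conv_half(x2))
        f4 = torch.relu(self.conv_quarter(x4))
        f8 = torch.relu(self.conv_eighth(x8))
        # Fuse adjacent levels bottom-up by upsampling to the finer resolution.
        f4 = f4 + F.interpolate(f8, size=f4.size(-1), mode="linear", align_corners=False)
        f2 = f2 + F.interpolate(f4, size=f2.size(-1), mode="linear", align_corners=False)
        fused = F.interpolate(f2, size=t, mode="linear", align_corners=False)
        # Residual connection back to the original scale.
        return x + fused
```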

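The two-stage attention enhancement can be sketched in the same spirit: an SE block that squeezes the sequence with global average pooling and re-weights each channel, and a self-attention layer whose scores receive a learned relative-position bias looked up from a relative position index matrix. The modules below are illustrative (single attention head for brevity); the reduction ratio and bias parameterization are assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class SEAttention(nn.Module):
    """SE attention: global average pooling compresses the sequence into a
    channel descriptor, which then gates each channel of the input."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):                  # x: (batch, channels, time)
        s = x.mean(dim=-1)                 # squeeze: global average pooling
        w = self.fc(s).unsqueeze(-1)       # excitation: per-channel weights
        return x * w

class RelPosSelfAttention(nn.Module):
    """Self-attention with a learned relative-position bias added to the
    attention scores (sketch of the relative position encoding idea)."""
    def __init__(self, dim, max_len):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.scale = dim ** -0.5
        self.max_len = max_len
        # One learnable weight per relative offset in [-(max_len-1), max_len-1].
        self.rel_bias = nn.Parameter(torch.zeros(2 * max_len - 1))

    def forward(self, x):                  # x: (batch, time, dim), time <= max_len
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        scores = q @ k.transpose(-2, -1) * self.scale        # (b, t, t)
        # Relative position index matrix: offset of each key step from each query step.
        idx = torch.arange(t, device=x.device)
        rel = idx[None, :] - idx[:, None] + self.max_len - 1
        scores = scores + self.rel_bias[rel]                  # combine position and attention weights
        attn = scores.softmax(dim=-1)
        return attn @ v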
Experimental validation uses the Xi'an Jiaotong University rolling bearing accelerated fatigue full-life dataset, which covers complete bearing life-cycle data under three operating conditions. The proposed method performs well across multiple evaluation metrics, with an average RMSE of 0.064, MAE of 0.052, and R² of 0.947, significantly outperforming existing mainstream methods. Compared with the Transformer-Encoder, the overall average error is reduced by 27.10%-39.83%; compared with the SE-Transformer-Encoder, by 43.73%-48.09%; and compared with the GCU-Transformer, by 36.34%-45.28%. Ablation experiments confirm that relative position encoding significantly improves prediction performance, and generalization experiments show good adaptability and robustness under different operating conditions. This work moves beyond single-scale modeling by jointly optimizing deep multiscale feature fusion and dual attention enhancement, providing a theoretical foundation and technical support for rolling bearing failure prediction and maintenance planning. Future work will focus on model lightweighting, improved generalization, and refined multiscale feature fusion strategies to strengthen adaptability and engineering practicality under varying operating conditions.
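For reference, the evaluation metrics quoted above follow their standard definitions, where \hat{y}_i is the predicted RUL, y_i the true RUL, \bar{y} its mean, and N the number of samples:

```latex
\mathrm{RMSE}=\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i-y_i\right)^2},\qquad
\mathrm{MAE}=\frac{1}{N}\sum_{i=1}^{N}\left|\hat{y}_i-y_i\right|,\qquad
R^2=1-\frac{\sum_{i=1}^{N}\left(\hat{y}_i-y_i\right)^2}{\sum_{i=1}^{N}\left(y_i-\bar{y}\right)^2}
```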
Important Dates
  • Conference dates: August 1-4, 2025
  • Initial draft submission deadline: July 4, 2025
Organizer
Equipment Intelligent Operation and Maintenance Branch, Chinese Mechanical Engineering Society
Host
Xinjiang University