With its omnidirectional field of view, panoramic depth estimation has become a central topic in 3D reconstruction. However, panoramic RGB-D cameras remain scarce, so panoramic RGB-D datasets are difficult to obtain, which limits the practicality of supervised panoramic depth estimation. Self-supervised learning from RGB stereo image pairs can overcome this limitation thanks to its lower dependence on labeled data. This paper introduces SPDET, an edge-aware self-supervised panoramic depth estimation network that integrates a transformer with spherical geometry features. We incorporate the panoramic geometry feature into our panoramic transformer to produce accurate, high-resolution depth maps. We further introduce a pre-filtered depth-image rendering method that synthesizes novel-view images for self-supervision. In parallel, we design an edge-aware loss function to improve self-supervised depth estimation on panoramic images. Finally, comparison and ablation experiments demonstrate the effectiveness of SPDET, which achieves state-of-the-art self-supervised monocular panoramic depth estimation. Our code and models are available at https://github.com/zcq15/SPDET.
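To make the edge-aware objective concrete, the following is a minimal PyTorch sketch of a standard edge-aware smoothness term of the kind commonly used in self-supervised depth estimation; it is illustrative only, not SPDET's actual loss, and the assumed tensor shapes are our own convention.

```python
import torch

def edge_aware_smoothness(depth, image):
    """Edge-aware smoothness: penalize depth gradients except where the
    RGB image itself has strong gradients (likely object edges).
    depth: (B, 1, H, W), image: (B, 3, H, W)."""
    # Mean-normalize depth so the loss does not simply shrink it toward zero.
    d = depth / (depth.mean(dim=[2, 3], keepdim=True) + 1e-7)

    # Horizontal and vertical depth gradients.
    dd_x = torch.abs(d[:, :, :, :-1] - d[:, :, :, 1:])
    dd_y = torch.abs(d[:, :, :-1, :] - d[:, :, 1:, :])

    # Image gradients averaged over color channels.
    di_x = torch.mean(torch.abs(image[:, :, :, :-1] - image[:, :, :, 1:]), 1, keepdim=True)
    di_y = torch.mean(torch.abs(image[:, :, :-1, :] - image[:, :, 1:, :]), 1, keepdim=True)

    # Down-weight the depth-gradient penalty wherever the image has edges.
    return (dd_x * torch.exp(-di_x)).mean() + (dd_y * torch.exp(-di_y)).mean()
```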
Generative, data-free quantization is a promising compression technique that quantizes deep neural networks to low bit-widths without access to real data. It synthesizes data by exploiting the batch normalization (BN) statistics of the full-precision network and uses that data to quantize the network. In practice, however, it suffers from a substantial drop in accuracy. Our theoretical analysis shows that a diverse synthetic dataset is essential for successful data-free quantization, whereas in existing approaches the synthetic data, being constrained by the BN statistics, exhibits pronounced homogenization at both the sample level and the distribution level. This paper presents a generic Diverse Sample Generation (DSG) scheme for generative data-free quantization that mitigates this detrimental homogenization. We first slack the statistical alignment of features in the BN layers to relax the distribution constraint. We then weight the loss contributions of specific BN layers differently for individual samples and suppress sample-to-sample correlations during generation, diversifying the synthetic data from the statistical and spatial perspectives, respectively. Extensive image classification experiments confirm that DSG consistently achieves strong quantization performance across different neural architectures, especially at ultra-low bit-widths. Moreover, the data diversification introduced by DSG benefits both quantization-aware training and post-training quantization approaches, demonstrating its generality and effectiveness.
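As a rough illustration of relaxing the BN-statistics constraint, the PyTorch sketch below (with hypothetical names such as slacked_bn_alignment, margin, and layer_weight) only requires the synthetic-feature statistics to fall within a slack margin of the stored running statistics rather than matching them exactly; the actual DSG formulation may differ.

```python
import torch
import torch.nn.functional as F

def slacked_bn_alignment(feat, bn, margin=0.1, layer_weight=1.0):
    """Relaxed BN-statistics alignment for one layer.
    feat: (B, C, H, W) synthetic features entering this BN layer.
    bn:   the corresponding nn.BatchNorm2d of the full-precision network.
    The batch statistics only need to come within `margin` of the running
    statistics; `layer_weight` can be varied per sample/layer to diversify."""
    mu = feat.mean(dim=[0, 2, 3])
    var = feat.var(dim=[0, 2, 3], unbiased=False)
    mean_gap = F.relu((mu - bn.running_mean).abs() - margin)
    var_gap = F.relu((var - bn.running_var).abs() - margin)
    return layer_weight * (mean_gap.mean() + var_gap.mean())
```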
In this paper, we propose a nonlocal multidimensional low-rank tensor transformation (NLRT) method for denoising MRI images. We first design a nonlocal MRI denoising method within a nonlocal low-rank tensor recovery framework. A multidimensional low-rank tensor constraint is then employed to impose a low-rank prior while exploiting the three-dimensional structure of MRI image cubes. Our NLRT denoises effectively while retaining substantial image detail. The resulting model is optimized and updated using the alternating direction method of multipliers (ADMM) algorithm. Comparative experiments against several state-of-the-art denoising methods were conducted, in which Rician noise of different strengths was added to assess denoising performance. The experimental results show that our NLRT substantially reduces noise in MRI images and yields high-quality reconstructions.
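The low-rank step inside such an ADMM scheme can be illustrated with singular value thresholding applied to each mode unfolding of a group of similar MRI cubes. The NumPy sketch below is a simplified stand-in for the paper's multidimensional low-rank tensor constraint; the function names and the equal weighting of the three modes are assumptions.

```python
import numpy as np

def svt(mat, tau):
    """Singular value thresholding: proximal operator of the nuclear norm."""
    u, s, vt = np.linalg.svd(mat, full_matrices=False)
    return (u * np.maximum(s - tau, 0.0)) @ vt

def multidim_lowrank(group, tau):
    """One low-rank step on a 3-D group of similar MRI cubes: apply SVT to
    each mode unfolding and average the refolded estimates."""
    est = np.zeros_like(group)
    for mode in range(3):
        moved = np.moveaxis(group, mode, 0)
        unfolded = moved.reshape(group.shape[mode], -1)
        refolded = svt(unfolded, tau).reshape(moved.shape)
        est += np.moveaxis(refolded, 0, mode)
    return est / 3.0
```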
Medication combination prediction (MCP) helps experts analyze the intricate mechanisms that govern health and disease. Most current studies represent patients solely from their historical medical records and neglect valuable medical knowledge, such as prior experience and pharmacological information. Drawing on such medical knowledge, this article develops a graph neural network (MK-GNN) model that integrates patient representations with medical knowledge. Specifically, patient features are extracted from the medical records and partitioned into separate feature sub-spaces, which are then concatenated into a unified patient representation. Using prior knowledge of the correlation between medications and diagnoses, heuristic medication features are inferred from the diagnostic results; these features guide the MK-GNN model toward suitable parameters. In addition, the medication co-occurrence relations in prescriptions are organized into a drug network, embedding medication knowledge into the medication vector representations. The results show that the MK-GNN model outperforms state-of-the-art baselines across various evaluation metrics, and a case study demonstrates its practical applicability.
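A toy PyTorch sketch of this overall flow is given below, with hypothetical names (MedScorer, fuse, drug_emb): patient feature sub-spaces are concatenated and fused, medication embeddings are refined over a drug co-occurrence graph, and every candidate medication is scored. It is a simplified illustration, not the MK-GNN architecture itself.

```python
import torch
import torch.nn as nn

class MedScorer(nn.Module):
    """Illustrative medication-combination scorer."""
    def __init__(self, subspace_dims, n_drugs, dim=64):
        super().__init__()
        self.fuse = nn.Linear(sum(subspace_dims), dim)   # fuse patient sub-spaces
        self.drug_emb = nn.Embedding(n_drugs, dim)       # learnable drug embeddings
        self.gcn = nn.Linear(dim, dim)                   # one graph-conv step

    def forward(self, subspaces, adj):
        # subspaces: list of (batch, d_i) patient feature tensors.
        # adj: (n_drugs, n_drugs) row-normalized drug co-occurrence adjacency.
        patient = torch.relu(self.fuse(torch.cat(subspaces, dim=-1)))
        drugs = torch.relu(self.gcn(adj @ self.drug_emb.weight))
        # Multi-label probability for every candidate medication.
        return torch.sigmoid(patient @ drugs.t())
```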
Cognitive research indicates that humans segment ongoing experience into events by anticipating what comes next. Motivated by this finding, we devise a simple yet effective end-to-end self-supervised learning framework for event segmentation and boundary detection. Unlike conventional clustering-based methods, our framework exploits a transformer-based feature reconstruction scheme and detects event boundaries from the reconstruction error. Humans discover new events through the mismatch between what they predict and what they actually observe; analogously, boundary frames straddle different semantic contexts and are therefore hard to reconstruct (typically yielding large errors), which aids boundary detection. Moreover, because reconstruction is performed at the semantic rather than the pixel level, we develop a temporal contrastive feature embedding (TCFE) module to learn the semantic visual representation used for frame feature reconstruction (FFR), mirroring how humans accumulate knowledge in long-term memory. Our goal is to segment generic events rather than localize specific ones, with an emphasis on accurately determining where events begin and end. We therefore adopt the F1 score, computed from boundary precision and recall, as the primary metric for a fair comparison with prior approaches, and we also report the conventional frame-wise mean over frames (MoF) accuracy and the intersection over union (IoU) metric. Benchmarking on four public datasets shows that our method achieves substantially better results. The source code of CoSeg is available at https://github.com/wang3702/CoSeg.
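The boundary-detection step can be illustrated as follows: given per-frame features and their reconstructions, frames whose reconstruction error forms a local peak well above the sequence average are marked as boundaries. The NumPy sketch below uses an assumed relative threshold (rel_thresh) and is not the exact CoSeg decision rule.

```python
import numpy as np

def detect_boundaries(features, recon, rel_thresh=1.5):
    """Mark frames whose feature-reconstruction error is a local peak
    well above the sequence mean.
    features, recon: (T, D) per-frame features and their reconstructions."""
    err = np.linalg.norm(features - recon, axis=1)   # (T,) per-frame error
    mean_err = err.mean() + 1e-8
    boundaries = []
    for t in range(1, len(err) - 1):
        is_peak = err[t] > err[t - 1] and err[t] > err[t + 1]
        if is_peak and err[t] > rel_thresh * mean_err:
            boundaries.append(t)
    return boundaries
```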
This article addresses incomplete tracking control with non-uniform run lengths, a common situation in industrial processes such as chemical engineering that arises from artificial or environmental changes. Because iterative learning control (ILC) relies on strict repetition, such variability strongly affects its design and application. Accordingly, a dynamic neural network (NN) predictive compensation scheme is developed within a point-to-point ILC framework. Since an accurate mechanistic model of a real process is difficult to establish, a data-driven approach is adopted: an iterative dynamic predictive data model (IDPDM) is constructed from input-output (I/O) signals using iterative dynamic linearization (IDL) and radial basis function neural networks (RBFNNs), and extended variables are defined to compensate for the incomplete operation lengths. A learning algorithm based on an objective function with multiple iterative error assessments is then proposed, and the NN continually updates the learning gain to adapt to changes in the system. Convergence is established using a composite energy function (CEF) and compression mapping. Finally, two numerical simulation examples are provided.
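For intuition, the sketch below shows a classical P-type ILC update in which error feedback is applied only at the time steps that the previous trial actually reached, a crude stand-in for the paper's extended variables and NN-adapted learning gain; the constant gain and the names are assumptions.

```python
import numpy as np

def ilc_update(u_prev, err_prev, valid, gain=0.5):
    """P-type ILC update with missing-output compensation.
    u_prev:   (N,) input applied on the previous trial.
    err_prev: (N,) tracking error from the previous trial.
    valid:    (N,) 1 where the previous trial actually ran, 0 where it
              ended early (error there is treated as zero).
    gain:     constant learning gain (the paper adapts this with an NN)."""
    err = np.where(valid.astype(bool), err_prev, 0.0)
    return u_prev + gain * err
```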
Graph convolutional networks (GCNs), which can be viewed as an encoder-decoder architecture, achieve strong performance on graph classification tasks. However, existing methods seldom account for both global and local information during decoding, thereby losing global context or overlooking local details in complex graphs. Moreover, the widely used cross-entropy loss acts as a global objective over the encoder and decoder jointly and cannot supervise their individual training states. To address these issues, we propose a multichannel convolutional decoding network (MCCD). MCCD first employs a multi-channel GCN encoder, which generalizes better than a single-channel one because multiple channels extract graph information from different viewpoints. We then propose a decoder that learns in a global-to-local fashion to decode the graph information, enabling it to extract both global and local features. We also introduce a balanced regularization loss that supervises the training states of the encoder and decoder so that both are sufficiently trained. Experiments on standard datasets demonstrate the effectiveness of MCCD in terms of accuracy, runtime, and computational complexity.
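A minimal PyTorch sketch of a multi-channel encoder is shown below, assuming a dense normalized adjacency matrix and a simple mean-pooling readout (both our own simplifications, not MCCD's exact design).

```python
import torch
import torch.nn as nn

class MultiChannelGCNEncoder(nn.Module):
    """Illustrative multi-channel encoder: several independent graph-conv
    branches view the same graph and their readouts are concatenated."""
    def __init__(self, in_dim, hid_dim, n_channels=3):
        super().__init__()
        self.channels = nn.ModuleList(
            [nn.Linear(in_dim, hid_dim) for _ in range(n_channels)]
        )

    def forward(self, x, adj):
        # x: (n_nodes, in_dim) node features,
        # adj: (n_nodes, n_nodes) normalized adjacency.
        outs = [torch.relu(lin(adj @ x)) for lin in self.channels]
        # Mean-pool each channel's node features into a graph-level vector.
        return torch.cat([o.mean(dim=0) for o in outs], dim=-1)
```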