A two-stage approach to agricultural products authentication by mining subtle local features

Dat Tran Anh¹
¹ Thuyloi University

Abstract

Recognizing counterfeit agricultural products is challenging because genuine and fake items are visually very similar, and existing methods often fail to capture the subtle details needed to tell them apart reliably. This paper presents Focus on Detail (FoD), an approach that automatically discovers trustworthy 'authenticity cues' on the products. Using custom-designed loss functions and a semi-supervised training strategy, FoD learns to suppress distracting background regions and focus exclusively on the most critical local features. Experiments on standard benchmark datasets show that FoD outperforms existing methods, establishing a new state of the art in both accuracy and speed.
