Computer science
Artificial intelligence
Image quality
Computer vision
Image (mathematics)
Authors
Bolong Liu, Hao Zhang, Jie Liu, Qiang Wang
Identifier
DOI:10.1109/secon58729.2023.10287530
Abstract
Smart agriculture requires a deep integration of information technology and agriculture, and intelligent systems need large amounts of data to train models. However, crop image data is difficult to acquire in quantity, which limits the application and growth of computer vision technology in agriculture. To address this problem, we designed a crop image generation system that combines a large language model with visual-language multimodal large models to increase the scale, variety, and resolution of crop image data. First, the system feeds existing real crop images into the visual-language multimodal model, which extracts features and represents each crop image as text. Next, the system passes this text representation to the language model for cleaning and processing, which produces prompts for creating crop images. The prompts are then input into the visual-language multimodal model, which generates crop images from the text representation. The generated images undergo image quality evaluation in the visual-language multimodal model, and only high-quality images are saved to the crop image dataset. These steps produce the final generated crop image dataset. The experimental results indicate that the images generated by the proposed system resemble the example images without duplicating them. This characteristic allows crop data to be expanded without redundancy and with control over resolution, which is crucial for dense segmentation tasks. Using this method, existing data can be enlarged up to 7.5 times.
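The four stages described in the abstract (describe, refine, generate, filter by quality) can be sketched as a small loop. This is only an illustration under stated assumptions, not the authors' implementation: the function `expand_dataset`, its parameters `threshold` and `per_prompt`, and the `toy_*` stand-ins are all hypothetical, and the real system would call actual vision-language and language models in their place.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class GeneratedImage:
    prompt: str    # prompt used to create the image
    data: str      # placeholder for real image data
    score: float   # quality score assigned by the evaluator


def expand_dataset(
    real_images: List[str],
    describe: Callable[[str], str],        # image -> raw text description
    refine: Callable[[str], str],          # raw text -> cleaned prompt
    generate: Callable[[str, int], str],   # (prompt, seed) -> image
    score: Callable[[str], float],         # image -> quality score
    threshold: float = 0.5,                # hypothetical quality cutoff
    per_prompt: int = 2,                   # hypothetical candidates per prompt
) -> List[GeneratedImage]:
    """Run the describe/refine/generate/filter loop over the real images."""
    kept: List[GeneratedImage] = []
    for image in real_images:
        prompt = refine(describe(image))
        for seed in range(per_prompt):
            candidate = generate(prompt, seed)
            quality = score(candidate)
            # Only candidates that pass the quality check are saved.
            if quality >= threshold:
                kept.append(GeneratedImage(prompt, candidate, quality))
    return kept


# Toy stand-ins so the sketch runs without any model; all are assumptions.
def toy_describe(image: str) -> str:
    return f"  a photo of   {image}  "


def toy_refine(text: str) -> str:
    # Collapse stray whitespace and append a resolution hint.
    return " ".join(text.split()) + ", high resolution"


def toy_generate(prompt: str, seed: int) -> str:
    return f"<image seed={seed} from '{prompt}'>"


def toy_score(candidate: str) -> float:
    # Pretend that only the first candidate per prompt is high quality.
    return 0.9 if "seed=0" in candidate else 0.2


result = expand_dataset(
    ["wheat", "maize"], toy_describe, toy_refine, toy_generate, toy_score
)
```

Passing the four stages in as callables keeps the loop independent of any particular model, so the quality cutoff and the number of candidates per prompt can be tuned separately from the models themselves.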