Open Daily: 10am - 10pm
Alley-side Pickup: 10am - 7pm

3038 Hennepin Ave, Minneapolis, MN
612-822-4611

Large Vision-Language Models: Pre-Training, Prompting, and Applications

Hardcover

Series: Advances in Computer Vision and Pattern Recognition

Databases, General Computers, Probability & Statistics

ISBN10: 3031949684
ISBN13: 9783031949685
Publisher: Springer
Published: Aug 31 2025
Pages: 429
Weight: 1.76
Height: 1.00 Width: 6.14 Depth: 9.21
Language: English

Rapid progress in large multimodal foundation models, especially vision-language models, has transformed machine learning, computer vision, and natural language processing. These models, trained on vast amounts of multimodal data that combines images and text, have demonstrated remarkable capabilities in tasks ranging from image classification and object detection to visual content generation and question answering. This book provides a comprehensive and up-to-date exploration of large vision-language models, covering the key aspects of their pre-training, prompting techniques, and diverse real-world computer vision applications. It is an essential resource for researchers, practitioners, and students in computer vision, natural language processing, and artificial intelligence.
