{"id":37220,"date":"2025-09-09T18:53:19","date_gmt":"2025-09-09T10:53:19","guid":{"rendered":"https:\/\/aicats.wiki\/?p=37220"},"modified":"2025-09-09T18:53:19","modified_gmt":"2025-09-09T10:53:19","slug":"2025%e5%b9%b4vlm%e5%a4%9a%e6%a8%a1%e6%85%8b%e5%a4%a7%e6%a8%a1%e5%9e%8b%e6%8e%a8%e8%96%a6%ef%bc%9aai%e8%a6%96%e8%a6%ba%e8%aa%9e%e8%a8%80%e8%9e%8d%e5%90%88%e6%87%89%e7%94%a8%e7%9a%847%e5%a4%a7%e6%9c%80","status":"publish","type":"post","link":"https:\/\/aicats.wiki\/tw\/2025\/09\/09\/37220-html","title":{"rendered":"2025 VLM Multimodal Large Model Recommendations: The 7 Best Tools for AI Vision-Language Applications"},"content":{"rendered":"<p><strong>In 2025, <a href=\"https:\/\/aicats.wiki\/tw\/2025\/07\/14\/9309-html\/\" title=\"What is Google Gemini? A guide to the core features and use cases of Google's new-generation AI model\">multimodal large models<\/a> (VLM, Vision-Language Model) have become a new frontier of <a class=\"external\" href=\"https:\/\/aicats.wiki\/tw\/tag\/ai\" title=\"View articles related to AI\" target=\"_blank\">AI<\/a> development.<\/strong> This article surveys seven leading VLM products worldwide, <strong>comparing open-source and closed-source options, technical approaches, use cases, and localization capabilities, and analyzing their current strengths and weaknesses<\/strong>. It is intended for developers, enterprise decision-makers, and researchers who want a single overview of selection trends and deployment advice for vision-language AI.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img fetchpriority=\"high\" decoding=\"async\" width=\"1797\" height=\"909\" 
src=\"https:\/\/aicats.wiki\/wp-content\/uploads\/2025\/09\/image-168.jpg\" alt=\"2025 VLM Multimodal Large Model Recommendations: The 7 Best Tools for AI Vision-Language Applications\" class=\"wp-image-42380\"\/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>VLM 2025 Overview of Multimodal Large Model Tools<\/strong><\/h2>\n\n\n\n<p>Before the detailed recommendations, the table below summarizes the <strong>seven most closely watched VLM tools of 2025<\/strong>:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><th>Name<\/th><th>Open\/closed source<\/th><th>Key features<\/th><th>Text context window<\/th><th>API\/self-hosted<\/th><th>Link<\/th><\/tr><tr><td><strong>Gemini 2.5 Pro<\/strong><\/td><td>Closed source<\/td><td>General-purpose multimodal tasks, highly flexible<\/td><td>About 1M tokens<\/td><td>Official platform<\/td><td><a href=\"https:\/\/ai.google.com\/\" target=\"_blank\" rel=\"noopener\" class=\"external\" >Google AI Studio<\/a><\/td><\/tr><tr><td><strong>GPT-5<\/strong><\/td><td>Closed source<\/td><td>Unified Transformer, efficient multimodal fusion<\/td><td>128k<\/td><td>Official platform<\/td><td><a href=\"https:\/\/openai.com\/gpt-4o\/\" target=\"_blank\" rel=\"noopener\" class=\"external\" >OpenAI<\/a><\/td><\/tr><tr><td><strong>Claude 4.1<\/strong><br><strong> Vision<\/strong><\/td><td>Closed source<\/td><td>Specialized in OCR and charts, strong scientific reasoning<\/td><td>200k<\/td><td>Official platform<\/td><td><a href=\"https:\/\/www.anthropic.com\/\" target=\"_blank\" rel=\"noopener\" class=\"external\" >Anthropic<\/a><\/td><\/tr><tr><td><strong>Qwen 
2.5-VL-72B<\/strong><\/td><td>Open source<\/td><td>Arbitrary resolution, long video, complex multimodal tasks<\/td><td>128k<\/td><td>API\/self-hosted<\/td><td><a href=\"https:\/\/github.com\/QwenLM\/Qwen-VL\" target=\"_blank\" rel=\"noopener\" class=\"external\" >Qwen-VL<\/a><\/td><\/tr><tr><td><strong>Llama 4 Scout<\/strong><\/td><td>Open source<\/td><td>Mixture-of-experts design, highly scalable<\/td><td>Up to 10M tokens<\/td><td>API\/self-hosted<\/td><td><a href=\"https:\/\/ai.meta.com\/llama\/\" target=\"_blank\" rel=\"noopener\" class=\"external\" >Llama 4<\/a><\/td><\/tr><tr><td><strong>MiniCPM-V 8B<\/strong><\/td><td>Open source<\/td><td>Low-parameter on-device inference, broad video and image understanding<\/td><td>32k+<\/td><td>API\/self-hosted<\/td><td><a href=\"https:\/\/github.com\/OpenBMB\/MiniCPM-V\" target=\"_blank\" rel=\"noopener\" class=\"external\" >MiniCPM-V<\/a><\/td><\/tr><tr><td><strong>CogVLM 17B<\/strong><\/td><td>Open source<\/td><td>SOTA performance after fine-tuning, high scores on vision and language benchmarks<\/td><td>16k<\/td><td>API\/self-hosted<\/td><td><a href=\"https:\/\/github.com\/THUDM\/CogVLM\" target=\"_blank\" rel=\"noopener\" class=\"external\" >CogVLM<\/a><\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p><em>Table note: all of these tools cover mainstream multimodal needs, and some are available through low-cost APIs on platforms such as <a href=\"https:\/\/novita.ai\/models\/llm\/qwen-qwen2.5-vl-72b-instruct\" target=\"_blank\" rel=\"noopener\" class=\"external\" >Novita AI<\/a>.<\/em><\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Global View: Definition and Application Value of Multimodal VLM Models<\/strong><\/h2>\n\n\n\n<h3 
class=\"wp-block-heading\"><strong>What Is a Multimodal Large Model (VLM)?<\/strong><\/h3>\n\n\n\n<p><strong>A multimodal large model (Vision-Language Model, VLM) is an AI system that processes images and text together and produces natural-language output.<\/strong> VLMs can describe images, follow and generate instructions, and carry out complex reasoning, which makes them core infrastructure for intelligent Q&amp;A, document quality inspection, visual analysis, OCR, and legal or research assistants.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>How a VLM Works<\/strong><\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Visual feature extractor (ViT, CLIP, etc.):<\/strong> converts image and video pixels into high-level representations.<\/li>\n\n\n\n<li><strong>Language model backbone (Llama, Qwen, etc.):<\/strong> fuses the visual representations with text and generates the response.<\/li>\n\n\n\n<li><strong>Cross-modal fusion:<\/strong> techniques such as cross-attention or unified sequence encoding couple vision and text tightly.<\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img decoding=\"async\" width=\"1797\" height=\"909\" src=\"https:\/\/aicats.wiki\/wp-content\/uploads\/2025\/09\/image-168-1.jpg\" alt=\"Introduction to multimodal large models (VLM)\" class=\"wp-image-42383\" style=\"width:1077px;height:auto\"\/><figcaption class=\"wp-element-caption\">Image: <a 
href=\"https:\/\/blog.csdn.net\/2401_84495872\/article\/details\/149836506\" title=\"\" target=\"_blank\"  rel=\"nofollow noopener\"  class=\"external\" >Introduction to multimodal large models (VLM)<\/a><\/figcaption><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Why Are VLMs an Industry Focus in 2025?<\/strong><\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Data fusion across modalities:<\/strong> text, images, and structured data can be used together in a single workflow.<\/li>\n\n\n\n<li><strong>High-value applications:<\/strong> medical imaging, report analysis, and smart office tools for government and enterprise.<\/li>\n\n\n\n<li><strong>Enterprise deployment:<\/strong> mainstream open-source models can be deployed in the cloud, on premises, or on device, and run on a range of hardware.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>In-Depth Recommendations: The 7 Best Vision-Language Model Tools<\/strong><\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">1. 
Gemini 2.5 Pro<\/h3>\n\n\n\n<p><strong>Developer:<\/strong> Google DeepMind<br><strong>Features:<\/strong> <strong>Highly general-purpose<\/strong>, suited to large-scale, demanding multimodal workloads; supports text, images, video, and structured data.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Architecture:<\/strong> Google has not published component-level architecture details; the model is described as natively multimodal, with flexible task switching and fast inference.<\/li>\n\n\n\n<li><strong>Context window:<\/strong> about 1M tokens.<\/li>\n\n\n\n<li><strong>Openness:<\/strong> available only through Google AI Studio or the API; there is no open-source version.<\/li>\n\n\n\n<li><strong>Best for:<\/strong> cloud-native SaaS and international deployments with high security requirements.<\/li>\n\n\n\n<li><strong>Application highlights:<\/strong> interpreting Excel sheets, multilingual document OCR, video Q&amp;A.<\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/aicats.wiki\/wp-content\/uploads\/2025\/08\/my_prefix_1756198744.png\" alt=\"Gemini 2.5 Pro interface\" class=\"wp-image-51824\"\/><figcaption class=\"wp-element-caption\">Image: <a title=\"\" href=\"https:\/\/ai.google.com\/\" target=\"_blank\"  rel=\"nofollow noopener\"  class=\"external\" >Gemini 2.5 Pro interface<\/a><\/figcaption><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">2. 
GPT-5<\/h3>\n\n\n\n<p><strong>Developer:<\/strong> OpenAI<br><strong>Features:<\/strong> <strong>Unified Transformer<\/strong> that handles image, audio, and text input and output in one model.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Architecture:<\/strong> all inputs are processed as sequences, so information flows across modalities.<\/li>\n\n\n\n<li><strong>Context window:<\/strong> 128k tokens.<\/li>\n\n\n\n<li><strong>Openness:<\/strong> closed source, available only through the OpenAI API.<\/li>\n\n\n\n<li><strong>Best for:<\/strong> intelligent customer service, multimodal interaction, real-time image analysis combined with speech recognition.<\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img decoding=\"async\" width=\"1797\" height=\"909\" src=\"https:\/\/aicats.wiki\/wp-content\/uploads\/2025\/09\/image-168.png\" alt=\"GPT-5 interface\" class=\"wp-image-42386\" style=\"width:1087px;height:auto\"\/><figcaption class=\"wp-element-caption\">Image: <a href=\"https:\/\/chatgpt.com\/zh-Hans-CN\/overview\" title=\"\" target=\"_blank\"  rel=\"nofollow noopener\"  class=\"external\" >GPT-5 interface<\/a><\/figcaption><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">3. 
Claude 4.1 Vision<\/h3>\n\n\n\n<p><strong>Developer:<\/strong> Anthropic<br><strong>Features:<\/strong> <strong>Leading scientific reasoning and OCR<\/strong>; handles very large PDFs and structured documents.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Architecture:<\/strong> Anthropic has not published architectural details; the emphasis is on high-accuracy OCR over long documents.<\/li>\n\n\n\n<li><strong>Context window:<\/strong> 200k tokens.<\/li>\n\n\n\n<li><strong>Best for:<\/strong> academic research, financial reports, legal document analysis.<\/li>\n\n\n\n<li><strong>Application highlights:<\/strong> intelligent processing of PDFs and tables.<\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1797\" height=\"909\" src=\"https:\/\/aicats.wiki\/wp-content\/uploads\/2025\/09\/image-169.png\" alt=\"Image\" class=\"wp-image-42389\"\/><figcaption class=\"wp-element-caption\">Image: <a href=\"https:\/\/www.anthropic.com\/\" title=\"\" target=\"_blank\"  rel=\"nofollow noopener\"  class=\"external\" >Claude 4.1 Vision interface<\/a><\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><th>Name<\/th><th>Main focus<\/th><th>General-purpose use<\/th><th>Long-text support<\/th><th>OCR capability<\/th><th>Interaction capability<\/th><th>Openness<\/th><\/tr><tr><td>Gemini 2.5 Pro<\/td><td>General<\/td><td>Strong<\/td><td>Strong<\/td><td>Strong<\/td><td>Very strong<\/td><td>Closed source<\/td><\/tr><tr><td>GPT-5<\/td><td>Fusion<\/td><td>Strong<\/td><td>Very strong<\/td><td>Strong<\/td><td>Very strong<\/td><td>Closed source<\/td><\/tr><tr><td>Claude 4.1 
Vision<\/td><td>PDF\/OCR<\/td><td>Fairly strong<\/td><td>Very strong<\/td><td>Excellent<\/td><td>Strong<\/td><td>Closed source<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">4. Qwen 2.5-VL-72B<\/h3>\n\n\n\n<p><strong>Developer:<\/strong> Alibaba Cloud Tongyi Qianwen (Qwen)<br><strong>Features:<\/strong> <strong>One of the most versatile open-source multimodal models of 2025<\/strong>, flexible on long-video and arbitrary-resolution tasks.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Architecture:<\/strong> Window-Attention ViT + MRoPE + a 72B dense language model, which handles complex tasks efficiently.<\/li>\n\n\n\n<li><strong>Context window:<\/strong> 128k tokens.<\/li>\n\n\n\n<li><strong>Openness:<\/strong> fully open source; can be self-hosted or called through an API.<\/li>\n\n\n\n<li><strong>Best for:<\/strong> private enterprise document AI, long-text and long-video understanding.<\/li>\n\n\n\n<li><strong>Application highlights:<\/strong> lower GPU compute needs, controllable cost, long-video understanding.<\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1797\" height=\"909\" src=\"https:\/\/aicats.wiki\/wp-content\/uploads\/2025\/09\/image-170.png\" alt=\"Image\" class=\"wp-image-42396\"\/><figcaption class=\"wp-element-caption\">Image: <a href=\"https:\/\/github.com\/QwenLM\/Qwen-VL\" target=\"_blank\" rel=\"noreferrer noopener\" class=\"external\" ><br>Qwen-VL<\/a><\/figcaption><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">5. 
Llama 4 Scout \/ Llama 4 Vision<\/h3>\n\n\n\n<p><strong>Developer:<\/strong> Meta AI<br><strong>Features:<\/strong> <strong>Advanced mixture-of-experts multimodal architecture<\/strong>, flexible across tasks, with an active developer community.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Architecture:<\/strong> dynamic ViT patches with multi-expert activation; supports high concurrency and horizontal scaling.<\/li>\n\n\n\n<li><strong>Context window:<\/strong> up to 10M tokens (Llama 4 Scout).<\/li>\n\n\n\n<li><strong>Openness:<\/strong> open weights under Meta's Llama license; hosted APIs are ready to use.<\/li>\n\n\n\n<li><strong>Best for:<\/strong> customized SaaS, office automation assistants, edge deployment.<\/li>\n\n\n\n<li><strong>Application highlights:<\/strong> multilingual support and low-latency inference.<\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1797\" height=\"909\" src=\"https:\/\/aicats.wiki\/wp-content\/uploads\/2025\/09\/image-171.jpg\" alt=\"Image\" class=\"wp-image-42397\"\/><figcaption class=\"wp-element-caption\">Image: <a href=\"https:\/\/ai.meta.com\/llama\/\" target=\"_blank\" rel=\"noopener\" class=\"external\" >Llama 4<\/a><\/figcaption><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">6. 
MiniCPM-V 8B<\/h3>\n\n\n\n<p><strong>Developer:<\/strong> OpenBMB &amp; Tsinghua NLP<br><strong>Features:<\/strong> <strong>A rising on-device multimodal VLM<\/strong> that runs smoothly on limited compute and suits IoT and mobile devices.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Architecture:<\/strong> a simplified vision tower built for on-device use plus an 8B small language model.<\/li>\n\n\n\n<li><strong>Context window:<\/strong> 32k+ tokens.<\/li>\n\n\n\n<li><strong>Openness:<\/strong> fully open source.<\/li>\n\n\n\n<li><strong>Best for:<\/strong> local low-power deployments and industrial applications.<\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1797\" height=\"909\" src=\"https:\/\/aicats.wiki\/wp-content\/uploads\/2025\/09\/image-171.png\" alt=\"Image\" class=\"wp-image-42400\"\/><figcaption class=\"wp-element-caption\">Image: <a href=\"https:\/\/github.com\/OpenBMB\/MiniCPM-V\" target=\"_blank\" rel=\"noopener\" class=\"external\" >MiniCPM-V<\/a><\/figcaption><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">7. 
CogVLM 17B<\/h3>\n\n\n\n<p><strong>Developer:<\/strong> THUDM (Tsinghua)<br><strong>Features:<\/strong> <strong>High-quality pretraining and SOTA results on multiple cross-modal benchmarks<\/strong>, with fine-grained matching and low hallucination.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Architecture:<\/strong> a ViT encoder with an MLP adapter and visual expert modules added to the language model; the weights are open, which makes custom secondary development easier.<\/li>\n\n\n\n<li><strong>Context window:<\/strong> 16k tokens.<\/li>\n\n\n\n<li><strong>Openness:<\/strong> fully open source.<\/li>\n\n\n\n<li><strong>Best for:<\/strong> research-oriented secondary development and image captioning.<\/li>\n<\/ul>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1797\" height=\"909\" src=\"https:\/\/aicats.wiki\/wp-content\/uploads\/2025\/09\/image-172.png\" alt=\"Image\" class=\"wp-image-42401\"\/><figcaption class=\"wp-element-caption\">Image: <a href=\"https:\/\/github.com\/THUDM\/CogVLM\" target=\"_blank\" rel=\"noopener\" class=\"external\" >CogVLM<\/a><\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>VLM Tools by Use Case<\/strong><\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><th>Main use case<\/th><th>Recommended VLM<\/th><th>Features\/notes<\/th><\/tr><tr><td>Multilingual long documents + images<\/td><td><strong>Claude 4.1 Vision<\/strong>, <strong>Qwen 2.5-VL-72B<\/strong><\/td><td>Very long context window; PDFs, tables, legal drafts<\/td><\/tr><tr><td>OCR + charts<\/td><td><strong>Qwen 
2.5-VL-72B<\/strong>, <strong>GPT-5<\/strong><\/td><td>High-accuracy structured data analysis<\/td><\/tr><tr><td>Video\/image understanding<\/td><td><strong>Gemini 2.5 Pro<\/strong>, <strong>Llama 4 Vision<\/strong><\/td><td>Complex multimodal tasks<\/td><\/tr><tr><td>On-device inference<\/td><td><strong>MiniCPM-V 8B<\/strong><\/td><td>IoT and industrial on-device inference<\/td><\/tr><tr><td>Custom training<\/td><td><strong>CogVLM 17B<\/strong>, <strong>Llama 4 Vision<\/strong><\/td><td>Augmentation with private data<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p><em>Recommended API platform: mainstream VLMs can be called directly through the API on platforms such as <a href=\"https:\/\/novita.ai\/models\/llm\/qwen-qwen2.5-vl-72b-instruct\" target=\"_blank\" rel=\"noopener\" class=\"external\" >Novita AI<\/a>.<\/em><\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Industry Trends and Selection Advice<\/strong><\/h2>\n\n\n\n<h3 class=\"wp-block-heading\">* Why are open-source VLMs (Qwen 2.5-VL, Llama 4) likely to become mainstream in the Chinese market?<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Policy compliance:<\/strong> data privacy stays under the operator's own control, in line with domestic regulations.<\/li>\n\n\n\n<li><strong>Cost-effectiveness:<\/strong> large-scale deployment and local inference remove the dependence on overseas cloud services.<\/li>\n\n\n\n<li><strong>Active technical ecosystem:<\/strong> plenty of plugins and thorough documentation.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">* When should you use closed-source models (GPT-5, Gemini 2.5 Pro)?<\/h3>\n\n\n\n<ul 
class=\"wp-block-list\">\n<li><strong>Global requirements:<\/strong> multinational companies and international research teams that need the highest available capability.<\/li>\n\n\n\n<li><strong>Extreme data-scale scenarios:<\/strong> very long context or real-time synchronization across multiple endpoints.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">* Future trends<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>VLMs are advancing on two tracks: lightweight on-device models and cloud-native APIs<\/strong><\/li>\n\n\n\n<li><strong>Hallucination control, chart understanding, and mixed-language handling are the main battlegrounds<\/strong><\/li>\n\n\n\n<li><strong>API platforms are gaining ground, and the barrier to adoption keeps falling<\/strong><\/li>\n<\/ul>\n\n\n\n<p><strong>Summary:<\/strong> In 2025, VLMs have entered a stage where <strong>open source and APIs converge, and domestically developed models advance alongside international closed-source ones<\/strong>. Any industry can choose among these seven tools according to its requirements, budget, need for private deployment, or demand for peak performance. Multimodal models are expected to keep driving product innovation and productivity gains. More open-source VLMs can be tried in the <a href=\"https:\/\/novita.ai\/model-api\" target=\"_blank\" rel=\"noopener\" class=\"external\" >Novita AI model library<\/a>.<\/p>\n\n\n\n<p><\/p>","protected":false},"excerpt":{"rendered":"<p>In 2025, multimodal large models (VLM, Vision-Language Model) have become a new frontier of AI development. This article 
[&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_crsspst_to_aicatswiki":true,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"footnotes":""},"categories":[305],"tags":[247,336,853,324],"content_visibility":[262],"class_list":["post-37220","post","type-post","status-publish","format-standard","hentry","category-ai-tools-platforms","tag-ai"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/aicats.wiki\/tw\/wp-json\/wp\/v2\/posts\/37220","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/aicats.wiki\/tw\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/aicats.wiki\/tw\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/aicats.wiki\/tw\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/aicats.wiki\/tw\/wp-json\/wp\/v2\/comments?post=37220"}],"version-history":[{"count":1,"href":"https:\/\/aicats.wiki\/tw\/wp-json\/wp\/v2\/posts\/37220\/revisions"}],"predecessor-version":[{"id":42404,"href":"https:\/\/aicats.wiki\/tw\/wp-json\/wp\/v2\/posts\/37220\/revisions\/42404"}],"wp:attachment":[{"href":"https:\/\/aicats.wiki\/tw\/wp-json\/wp\/v2\/media?parent=37220"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/aicats.wiki\/tw\/wp-json\/wp\/v2\/categories?post=37220"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/aicats.wiki\/tw\/wp-json\/wp\/v2\/tags?post=37220"},{"taxonomy":"content_visibility","embeddable":true,"href":"https:\/\/aicats.wiki\/tw\/wp-json\/wp\/v2\/content_visibility?post=37220"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}