{"id":751,"date":"2024-04-17T13:28:59","date_gmt":"2024-04-17T05:28:59","guid":{"rendered":"http:\/\/madapapa.com\/wordpress\/?p=751"},"modified":"2024-04-17T20:04:22","modified_gmt":"2024-04-17T12:04:22","slug":"nim-yi-yu-he-wei","status":"publish","type":"post","link":"http:\/\/madapapa.com\/wordpress\/?p=751","title":{"rendered":"What Is NIM Up To?"},"content":{"rendered":"<p>To understand what NIM is up to, I excerpted the points I found interesting from the articles below. First, here are the photos I took during the GTC online conference.<\/p>\n<p><img decoding=\"async\" src=\"http:\/\/madapapa.com\/wordpress\/wp-content\/uploads\/2024\/04\/17133394338980-1.jpg\" alt=\"\" \/><\/p>\n<p><img decoding=\"async\" src=\"http:\/\/madapapa.com\/wordpress\/wp-content\/uploads\/2024\/04\/17133394340519-1.jpg\" alt=\"\" \/><\/p>\n<h2><a id=\"%E8%8B%B1%E4%BC%9F%E8%BE%BE%E5%88%A9%E7%94%A8nim%E5%8A%A0%E9%80%9F%E7%94%9F%E6%88%90%E5%BC%8Fai\" class=\"anchor\" aria-hidden=\"true\"><span class=\"octicon octicon-link\"><\/span><\/a>Nvidia Uses NIM to Accelerate Generative AI<\/h2>\n<p><a href=\"https:\/\/www.hpcwire.com\/2024\/03\/19\/nvidia-looks-to-accelerate-genai-adoption-with-nim\/\">Nvidia Looks to Accelerate GenAI Adoption with NIM<\/a><\/p>\n<p>Kimi's reply:<\/p>\n<pre><code class=\"language-plain_text\">If you are only using Nvidia Inference Microservice (NIM) on a single machine or an edge device, you may not need Helm. Helm is a package manager for Kubernetes, used mainly to automate application deployment in Kubernetes environments. When you run NIM on a single machine or an edge device, your environment may not depend on Kubernetes, so you do not need Helm for deployment and management.\n\nIn that case, you can deploy NIM directly as a containerized microservice, or integrate it into your existing infrastructure. NIM is designed to adapt flexibly to different deployment scenarios, including but not limited to Kubernetes clusters, cloud environments, virtual machines, and physical machines. This way, whether you are in a data center, a private cloud, or an edge computing environment, you can use NIM to accelerate AI inference tasks.\n\nIn addition, because NIM is containerized, it can be migrated and run across different environments easily, without extensive changes to the underlying infrastructure. This flexibility makes NIM an ideal choice for deploying AI models across a variety of computing environments.\n\nIn short, if your use case does not involve a Kubernetes cluster, you do not need Helm. You can deploy and run NIM directly, or integrate it into your single-machine or edge-device environment, to achieve efficient AI inference.\n<\/code><\/pre>\n<h2><a id=\"%E4%BB%8E%E2%80%9C%E6%8E%A8%E7%90%86%E6%9C%8D%E5%8A%A1%E2%80%9D%E5%88%B0%E2%80%9C%E6%8E%A8%E7%90%86%E5%BE%AE%E6%9C%8D%E5%8A%A1%E2%80%9D\" class=\"anchor\" aria-hidden=\"true\"><span class=\"octicon octicon-link\"><\/span><\/a>From \u201cInference Service\u201d to \u201cInference Microservice\u201d<\/h2>\n<p><a href=\"https:\/\/venturebeat.com\/ai\/whats-a-nim-nvidia-inference-manager-is-new-approach-to-gen-ai-model-deployment-that-could-change-the-industry\/\">What\u2019s a NIM? Nvidia Inference Microservices is new approach to gen AI model deployment that could change the industry<\/a><\/p>\n<ul>\n<li>Nvidia Inference Microservices (NIM),<br \/>\nwhich packages optimized inference engines, industry-standard APIs and support for AI models into containers for easy deployment. While NIM provides prebuilt models, it also allows organizations to bring their own proprietary data and will support and help to accelerate Retrieval Augmented Generation (RAG) deployment.<\/li>\n<\/ul>\n<pre><code class=\"language-plain_text\">What exactly is Nvidia NIM?\n\nAt the most basic level, a NIM is a container full of microservices. \n\nThe container can include any type of model, ranging from open to proprietary models, that can run anywhere there is an Nvidia GPU \u2014 be that in the cloud, or even just in a laptop. In turn, that container can be deployed anywhere a container can run, \n* which could be a Kubernetes deployment in the cloud, \n* a Linux server or \n* even a serverless Function-as-a-Service model. 
Nvidia will have the serverless function approach on its new ai.nvidia.com website, where developers can go to begin working with NIM prior to deployment.\n\nTo be clear, a NIM isn\u2019t a replacement for any prior approach to model delivery from Nvidia. It\u2019s a container that includes a highly optimized model for Nvidia GPUs along with the necessary technologies to improve inference.\n<\/code><\/pre>\n<ul>\n<li>Better support for RAG<\/li>\n<\/ul>\n<pre><code class=\"language-plain_text\">The RAG approach will benefit from the integration of NVIDIA NeMo Retriever microservices inside of NIM deployments. NeMo Retriever is a technology that Nvidia announced in November 2023 as an approach to help enable RAG with an optimized approach for data retrieval.\n<\/code><\/pre>\n<p>Besides LangChain and LlamaIndex, yet another application-framework company has now emerged: <a href=\"https:\/\/www.deepset.ai\">deepset<\/a><\/p>\n<p>Nvidia NIM website<br \/>\n<a href=\"https:\/\/www.nvidia.com\/en-us\/ai\/\">https:\/\/www.nvidia.com\/en-us\/ai\/<\/a><\/p>\n<p><a href=\"https:\/\/build.nvidia.com\/mistralai\/mixtral-8x7b-instruct\">https:\/\/build.nvidia.com\/mistralai\/mixtral-8x7b-instruct<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>To understand what NIM is up to, I excerpted the points I found interesting from the articles below. First, here are the photos I took during the GTC online conference. Nvidia Uses NIM to Accelerate Generative AI Nvidia Looks to Accelerate GenAI Adoption with NIM Kimi's reply: If you are only using Nvidia Inference Microservice (NIM) on a single machine or an edge device, you may not need Helm. Helm is a package manager for Kubernetes, used mainly to automate application deployment in Kubernetes environments. When you run NIM on a single machine or an edge device, your environment may not depend on Kubernetes, so you do not need Helm for deployment and management. In that case, you can deploy NIM directly as a containerized microservice, or integrate it into your existing infrastructure. NIM is designed to adapt flexibly to different deployment scenarios, including but not limited to Kubernetes clusters, cloud environments, virtual machines, and physical machines. This way, whether you are in a data center, a private cloud, or an edge computing environment, you can use NIM to accelerate AI inference tasks. In addition, because NIM is containerized, it can be migrated and run across different environments easily, without extensive changes to the underlying infrastructure. This flexibility makes NIM an ideal choice for deploying AI models across a variety of computing environments. 
In short, if your use case does not involve a Kubernetes cluster, you do not need Helm. You can deploy and run NIM directly, or integrate it into your single-machine or edge-device environment, to achieve efficient AI inference. From \u201cInference Service\u201d to \u201cInference Microservice\u201d What\u2019s a NIM? Nvidia Inference Microservices is new approach to gen AI model deployment that could change the industry Nvidia Inference Microservices (NIM), which packages optimized inference engines, industry-standard APIs and support for AI models into containers &hellip; <a href=\"http:\/\/madapapa.com\/wordpress\/?p=751\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">What Is NIM Up To?<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_bbp_topic_count":0,"_bbp_reply_count":0,"_bbp_total_topic_count":0,"_bbp_total_reply_count":0,"_bbp_voice_count":0,"_bbp_anonymous_reply_count":0,"_bbp_topic_count_hidden":0,"_bbp_reply_count_hidden":0,"_bbp_forum_subforum_count":0,"footnotes":""},"categories":[48,47],"tags":[],"class_list":["post-751","post","type-post","status-publish","format-standard","hentry","category-artificial-intelligence","category-datascience"],"_links":{"self":[{"href":"http:\/\/madapapa.com\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/751","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/madapapa.com\/wordpress\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/madapapa.com\/wordpress\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/madapapa.com\/wordpress\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embedda
ble":true,"href":"http:\/\/madapapa.com\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=751"}],"version-history":[{"count":3,"href":"http:\/\/madapapa.com\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/751\/revisions"}],"predecessor-version":[{"id":758,"href":"http:\/\/madapapa.com\/wordpress\/index.php?rest_route=\/wp\/v2\/posts\/751\/revisions\/758"}],"wp:attachment":[{"href":"http:\/\/madapapa.com\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=751"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/madapapa.com\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=751"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/madapapa.com\/wordpress\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=751"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}