English is the key that opens the world; comrades, more effort is still needed

While using Colab to test a Chinese word-segmentation model, the editor automatically showed a hint for the train_test_split function. I wasn't clear about this kind of hint, so I asked ChatGPT to explain it.

ChatGPT's first explanation

The image you uploaded shows the function signature and description for a function used to split arrays or matrices into random train and test subsets. This function is likely train_test_split from the sklearn.model_selection module in the scikit-learn library. Here’s a breakdown of the parameters and how to use them:

Function Signature
(*arrays: Any, test_size: Any | None = None, train_size: Any | None = None, random_state: Any | None = None, shuffle: bool = True, stratify: Any | None = None) -> list
Description
Purpose: The function splits arrays or matrices into random train and test subsets.
Usage: It wraps input validation and application to input data into a single call, so that splitting (and optionally subsampling) the data becomes a one-liner.
Parameters
*arrays: Sequence of indexables with the same length/shape[0]. This is where you pass the data you want to split. It can be features (X) and labels (y).
test_size: The proportion of the dataset to include in the test split. Can be a float (e.g., 0.25 for 25% test data), an integer (number of samples), or None (in which case the value is set to the complement of the train size).
train_size: The proportion of the dataset to include in the train split. Can be a float, an integer, or None (in which case the value is automatically set to the complement of the test size).
random_state: Controls the shuffling applied to the data before splitting. Pass an integer for reproducible output across multiple function calls.
shuffle: Whether or not to shuffle the data before splitting. Default is True.
stratify: If not None, data is split in a stratified fashion, using this as the class labels.
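
For example, here is a minimal sketch (with toy data of my own, not from the original session) showing how stratify preserves the class ratio in both splits:

from sklearn.model_selection import train_test_split

# 10 samples, labels split 50/50 between two classes
X = [[i] for i in range(10)]
y = [0] * 5 + [1] * 5

# stratify=y keeps the 50/50 label ratio in train and test alike
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.4, random_state=0, stratify=y)

print(y_te)  # four test labels, two from each class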

ChatGPT's second explanation

I understand that the concept can be a bit confusing, so let me clarify it further.

The random_state parameter in functions like train_test_split controls the randomness of the operation. It ensures that the random processes involved (such as shuffling the data) are repeatable. By setting random_state to a specific value, you ensure that the split will be the same each time you run the code, as long as the data and the code remain unchanged.

Here’s a step-by-step explanation and demonstration:

  1. What does random_state do?

    • random_state sets the seed for the random number generator. Different seeds will produce different sequences of random numbers, and therefore different shuffles and splits of the data.
  2. Why does it matter?

    • Using the same random_state value allows you to reproduce your results exactly. This is useful for debugging, sharing your results with others, and ensuring consistent results when rerunning experiments.
  3. Different values for random_state:

    • Different random_state values will initialize the random number generator differently, leading to different shuffles of the data and, consequently, different splits.

Demonstration

Let’s demonstrate this with a simple example using train_test_split with different random_state values.

Using random_state=1

from sklearn.model_selection import train_test_split

# Sample data
X = [[1, 2], [3, 4], [5, 6], [7, 8]]
y = [0, 1, 0, 1]

# Split the data with random_state=1
X_train_1, X_test_1, y_train_1, y_test_1 = train_test_split(X, y, test_size=0.25, random_state=1)

print("random_state=1")
print("X_train:", X_train_1)
print("X_test:", X_test_1)
print("y_train:", y_train_1)
print("y_test:", y_test_1)

Using random_state=42

# Split the data with random_state=42
X_train_42, X_test_42, y_train_42, y_test_42 = train_test_split(X, y, test_size=0.25, random_state=42)

print("\nrandom_state=42")
print("X_train:", X_train_42)
print("X_test:", X_test_42)
print("y_train:", y_train_42)
print("y_test:", y_test_42)

Expected Output

For random_state=1

random_state=1
X_train: [[5, 6], [1, 2], [7, 8]]
X_test: [[3, 4]]
y_train: [0, 0, 1]
y_test: [1]

For random_state=42

random_state=42
X_train: [[7, 8], [3, 4], [1, 2]]
X_test: [[5, 6]]
y_train: [1, 1, 0]
y_test: [0]

As you can see, using random_state=1 and random_state=42 results in different splits of the data. This is because the random number generator is initialized differently with each value, resulting in different shuffling of the data before it is split.

Consistency Across Runs

If you run the above code multiple times with the same random_state value, you will get the same split every time. However, changing the random_state value will change the split.

Running Multiple Times for the Same random_state

To verify this, you can run the split multiple times with the same random_state:

for i in range(3):
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)
    print(f"Run {i+1}:")
    print("X_train:", X_train)
    print("X_test:", X_test)
    print("y_train:", y_train)
    print("y_test:", y_test)
    print("-" * 30)

Expected Output for Each Run with random_state=42

Run 1:
X_train: [[7, 8], [3, 4], [1, 2]]
X_test: [[5, 6]]
y_train: [1, 1, 0]
y_test: [0]
------------------------------
Run 2:
X_train: [[7, 8], [3, 4], [1, 2]]
X_test: [[5, 6]]
y_train: [1, 1, 0]
y_test: [0]
------------------------------
Run 3:
X_train: [[7, 8], [3, 4], [1, 2]]
X_test: [[5, 6]]
y_train: [1, 1, 0]
y_test: [0]
------------------------------

This demonstrates that the split is consistent across multiple runs with the same random_state. Changing the random_state value will produce a different but consistent split for that specific value.

A good description of the current state of the software industry

The original article is here; the full text is quoted below

We have used too many levels of abstractions and now the future looks bleak

Published on 2023-10-21. Modified on 2023-11-06.

A big percentage of so-called experts today only know how to configure tools, but they understand nothing about how things work at the deeper level. This is a real challenge and a big problem for the future.

A steering wheel is an abstraction that makes it easier for me to drive my car. Power steering is yet another level of abstraction that further improves the driving experience. Abstractions are nice, they generally improve the quality of life. However, in Denmark we have a proverb that says:

Too little and too much spoils everything.

What good does an abstraction do when it breaks and nobody any longer understands how the technology works under the hood?

Everything in the tech industry is driven with a very hardcore eye for profit and very little interest in anything else. So you need to be able to push out new products or new services as fast as possible. This means more abstraction and more automation, fewer and fewer people, and less deep understanding.

Today programmers and system administrators no longer exist; instead we have DevOps and even DevSecOps, in which the industry is trying very hard to stuff every single task into the job description of a single individual. The tech person needs to do development (Dev), security (Sec) and operations (Ops), i.e. system administration, but since no single individual can truly master all that, we need to automate as much as possible in order to save money and avoid the complexities of human social interaction between different tech departments. As a result, the modern tech person is only taught how to use specific tools, and he or she then knows very little about the technology under the hood.

It doesn’t help that technology has become increasingly difficult to understand, yet more and more of modern life depends heavily upon the tech we’re using. So what is going to happen when the level of understanding in the tech industry reaches such a low point that the majority of people don’t even know how to fix the tools they are using?

"Manual scene" from the WALL-E movie.

People have become accustomed to the state of abstraction and they think it’s the correct approach and they happily contribute to the mess by adding even more abstraction.

Yes, let’s all go back to coding in assembly!

― Sarcastic comment by arrogant developer

We need abstractions, no doubt about it, but every level of abstraction comes with a heavy price which, ironically enough, eventually can cause a massive loss in profit.

Modern programming scares me in many respects, where they will just build layer after layer after layer that does nothing except translate.

― Ken Thompson

Already now a majority of "security people" know very little about security and only about how to use some kind of pre-made penetration testing tool. The penetration testing tool shows a bunch of green lights in its web GUI board and all is assumed well. Yet, a real security expert with evil intentions has broken the system long ago and keeps selling valuable data on the darknet. Nothing is leaked and nothing is discovered. This can go on for years without anyone finding out because, well, the GUI board says that all is OK.

Some students today apparently don’t even know what files and folders are.

Advice to people studying technology

Never just follow hype or trends.
Be curious. Don’t just learn tools, try to understand how the underlying technology works.
If possible, try at least once to manually do what e.g. a configuration tool does for you.
If possible, try to look at the code for the tool. Even a basic understanding of the code can be very valuable.
Stay curious. Keep learning. Experiment. Dive deeper into the technology that interests you. If possible, set up a homelab and use it as a playground for learning and breaking things.
Question everything. Especially things that don’t make any sense to you. Don’t just assume that someone else knows better – that’s how you quickly turn into a blind follower. Sometimes someone else truly knows better, but don’t just assume that to be the case by default. And be brave! Stand by the truth and your convictions even if that makes you feel like you stand alone.

People blindly following each other
The point I am making with this post is not that everything needs to be understood from first principles by everyone, or that you shouldn’t use any tools. As I said, we need abstractions. Furthermore, we have people who specialize in different fields such that e.g. the mechanic fixes the truck while the driver drives the truck.

Rather, what I am addressing is the important value of an engineer’s attitude toward technology by people working with technology.

In e.g. software development, too many specialists have been abstracted away and replaced by tools and automation and fewer and fewer people understand anything even one layer directly beneath the layer they are working on.

This is a major problem because we will eventually reach a point in which very few people can fix anything in the layers below. And the fact of the matter is that we have already partly reached that point!

About half a year ago I stumbled upon some front-end web developers who did not know that you can create a website without a deployment tool and that you don’t need any JavaScript at all, even when the website takes payment. I asked a friend of mine about it, who at the time was teaching a programming class in Python, and he said:

Don’t be surprised by this. This is the level today. The industry wants us to mass produce people who know how to "push buttons", not people who understand anything at a deeper level.

I know that there will always be some people who take an interest in the deeper levels, that is not the point. The point is that in software development specifically, we have long reached the point where we have added too many layers of abstraction and too few people understand what they are doing. The industry is shooting itself in the foot.

If e.g. I am a web developer, whether front-end or back-end, or am doing so-called "integration work", and I am producing websites without much coding or any knowledge of TCP/IP, DNS, HTTP, TLS, security, etc., using only pre-made tools or frameworks, then that will make me about as useful as a monkey with a torque wrench when something goes wrong.

If you have any comments or corrections please feel free to email them to me. Also, if you found any of the content on this website useful consider supporting me on Patreon 😉

The best food for losing weight

In the end, carbohydrates (bread, french fries, boiled potatoes) turned out to work best, and boiled potatoes had the highest satiety index of all.
This is because the more fiber, protein, and water a food contains, the longer it takes to digest and the longer you feel full. In addition, carbohydrates raise blood sugar immediately, so the body knows right away that it is full.
Boiled potatoes contain plenty of fiber and water, are bulky for the same number of calories, and raise blood sugar, which is why they produce such long-lasting satiety.

Original article: http://www.mendosa.com/satiety.htm
ruanyifeng's translation: http://www.ruanyifeng.com/blog/2024/05/weekly-issue-299.html

Using Kimi and ChatGPT to explain a decision tree

Below is Kimi's analysis of the image above
The file content you provided appears to be a text representation of a decision tree generated by a decision tree algorithm. The tree is used for a classification task, in which each node is a decision point and each leaf node represents a classification result. Here is an explanation of the content:

  1. Root node: the tree starts at the root, whose decision condition is sub_grade <= 7.5, meaning the sub-grade (probably a loan risk grade) is at most 7.5. The Gini index of this node is 0.5, meaning the data at this node is completely mixed: positive and negative samples each make up half.

  2. Left subtree: if sub_grade <= 7.5 is true, we enter the left subtree. Its first decision point is total_rec_late_fee <= 9.0, meaning the total recorded late fees are at most 9.0. This node's Gini index is 0.436, slightly purer than the root.

  3. Right subtree: if sub_grade <= 7.5 is false, we enter the right subtree. Its first decision point is sub_grade <= 15.5. This node's Gini index is 0.49, meaning this branch's data is slightly less pure than the left subtree's.

  4. Further branching: each child node may keep splitting until a stopping condition is met. For example, one branch of the left subtree splits further on purpose <= 1.5, while a branch of the right subtree may split on total_rec_late_fee <= 0.5.

  5. Leaf nodes: eventually every branch reaches a leaf node, which gives a classification result, e.g. class = safe or class = not safe. The Gini index of these leaves is usually low, meaning the data under them is classified very cleanly.

  6. Sample counts and class values: every node records its number of samples (samples) and class values (value); value is a two-element array whose first number is the positive class (e.g. "safe") and whose second is the negative class (e.g. "not safe").

  7. Gini index: the Gini index measures the purity of a dataset; the lower it is, the purer the data. In a decision tree, we want to pick, at each node, the feature split that maximizes information gain and thereby lowers the Gini index.

This decision tree is probably used for credit scoring or risk assessment: sub_grade likely denotes the loan's sub-grade, total_rec_late_fee the total recorded late fees, and purpose the loan purpose. The final class result likely indicates whether a loan is considered safe or not safe.

Below is ChatGPT's explanation

The image you’ve uploaded depicts a decision tree, which is used to model decisions and their possible consequences, in this case categorizing something as "safe" or "not safe" based on various features. Let’s break down this specific tree step-by-step:

Root Node

  • Feature: total_grade <= 7.5
  • Gini: 0.5
  • Samples: 34881
  • Class Value: [17348, 17533]
  • Decision: This is the initial decision point. If the total_grade is less than or equal to 7.5, we follow the left branch. Otherwise, we follow the right branch. The class distribution is nearly equally split between safe and not safe, indicating initial uncertainty (Gini = 0.5 shows maximum impurity).

First Level

Left Branch

  • Feature: grade <= 0.5
  • Gini: 0.436
  • Samples: 10295
  • Class Value: [3303, 6992]
  • Decision: In this branch, we evaluate another feature, grade. If grade is less than or equal to 0.5, we move further left; otherwise, further right. The Gini coefficient has decreased, showing a decrease in impurity from the root.

Right Branch

  • Feature: sub_grade <= 15.5
  • Gini: 0.49
  • Samples: 24586
  • Class Value: [14045, 10541]
  • Decision: On this branch, the decision is based on sub_grade. A lower Gini than the root node, but still high, showing substantial impurity.

Further Branching

Let’s detail one more depth level:

Further Left on the First Left Branch (following grade <= 0.5)

  • Left Sub-branch:
    • Feature: None (Leaf Node)
    • Gini: 0.352
    • Samples: 4621
    • Class Value: [1055, 3566]
    • Class: safe
  • Right Sub-branch:
    • Feature: purpose <= 1.5
    • Gini: 0.371
    • Samples: 346
    • Class Value: [261, 85]
    • Class: not safe

These nodes represent further decisions and classifications based on additional splits.

Observations

  1. Path to Decision: Each path from the root to a leaf node represents a set of conditions that lead to a final classification (either "safe" or "not safe").
  2. Gini Impurity: A Gini index of 0 indicates perfect purity (all samples in a node fall into a single category). Nodes are created by selecting splits that decrease the Gini impurity among child nodes compared to their parent node.
  3. Class Labels: These show the decision at each node, based on the majority of samples’ classification if no further split can achieve better purity or other stopping criteria have been met (like maximum depth or minimum sample size at a node).

This decision tree helps in making predictions by tracing through the tree with the values of the features in the dataset, following branches according to these values until a leaf node is reached, which gives the prediction.
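
As a hedged illustration, here is a minimal sketch of how a tree like this can be trained and printed in text form with scikit-learn; the toy data and the column names (borrowed from the tree above) are made up for the example:

from sklearn.tree import DecisionTreeClassifier, export_text
import pandas as pd

# stand-in data; the real project uses the loan dataframe
df = pd.DataFrame({
    "sub_grade": [3, 8, 16, 5, 20, 1],
    "total_rec_late_fee": [0.0, 10.0, 0.5, 9.5, 0.0, 2.0],
    "purpose": [1, 2, 0, 1, 3, 0],
})
y = [1, 0, 0, 1, 0, 1]  # 1 = safe, 0 = not safe

clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(df, y)
print(export_text(clf, feature_names=list(df.columns)))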

Data processing in a decision tree project

Since the decision tree cannot handle non-numeric values, some of the data in the dataframe had to be converted to numbers, using the following code

from sklearn.preprocessing import LabelEncoder
from collections import defaultdict

# one LabelEncoder per column, created automatically on first access
d = defaultdict(LabelEncoder)

# fit an encoder on each column and replace its labels with integer codes
X_trans = X.apply(lambda x: d[x.name].fit_transform(x))
X_trans.head()

ChatGPT-4's explanation is as follows

The expression lambda x: d[x.name].fit_transform(x) is a compact way of applying a function to each column of a pandas DataFrame. Let’s dissect this expression further to understand how it works in the context of transforming categorical data into numerical format using LabelEncoder in a Python environment.

The lambda Function

A lambda function in Python is a small anonymous function defined with the keyword lambda. The syntax is:

lambda arguments: expression

In this case, the lambda function has one argument x (representing a column of the DataFrame) and the expression involves applying the fit_transform method from a LabelEncoder instance to x.

How the lambda Function is Applied

  • x: Within the context of the apply() method, x represents each column in the DataFrame one at a time. For instance, if the DataFrame has columns "Color" and "Size", x will be the "Color" column data in one iteration and the "Size" column data in the next.

Usage of d[x.name]

  • d: This is a defaultdict containing instances of LabelEncoder. The use of defaultdict ensures that if a LabelEncoder for a particular column name does not exist, it will automatically create a new LabelEncoder instance without raising an error.

  • x.name: This attribute fetches the name of the column currently being processed (like "Color" or "Size").

  • d[x.name]: This retrieves the LabelEncoder instance for the column named x.name. If it’s the first time this column is being accessed, defaultdict initializes a new LabelEncoder for this key.

The fit_transform Method

  • fit_transform(): This method is part of LabelEncoder and serves a dual purpose:
    • fit: It learns the mapping from the labels in the column to integers. For example, for a column with labels ["Red", "Blue", "Green"], it might learn the mapping Red -> 0, Blue -> 1, Green -> 2.
    • transform: It then transforms the labels into these integers based on the mapping it has learned.

Putting It All Together

When you call X.apply(lambda x: d[x.name].fit_transform(x)), it performs the following steps for each column in DataFrame X:

  • Accesses each column as x.
  • Fetches the LabelEncoder for that column by x.name.
  • Applies fit_transform to the column, converting text labels into integers.
  • Returns a new DataFrame where each text label has been replaced by an integer.

This method is particularly useful when preparing data for machine learning models, which generally require numeric input features rather than text labels. The transformation is applied efficiently, column by column, and returns a new DataFrame rather than modifying X in place, utilizing the flexibility of defaultdict and the functionality of LabelEncoder.
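
One convenient property of keeping the fitted encoders in the defaultdict is that the mapping can be reversed later. A usage sketch, assuming d and X_trans from the code above are still in scope:

# decode the integer codes back into the original text labels
X_restored = X_trans.apply(lambda x: d[x.name].inverse_transform(x))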

Using ChatGPT-4 to compare server performance differences

CPU servers

Comparing these two groups of servers, we can analyze them in terms of performance, features, and suitable scenarios:

CPU performance
First group of servers:
The CPU is a single-die design, which helps lower internal latency and raise processing efficiency.
Each CPU has at least 28 cores, a 2.6 GHz clock, and at least 42 MB of cache.
Supports the AVX-512 instruction set, suited to compute-heavy tasks such as scientific computing and engineering simulation.
Domestic (Chinese-made) servers:
Each CPU has at least 32 cores, a 2.2 GHz clock, and at least 64 MB of cache, more than the first group.
Supports 8 DDR4 memory channels, on par with the first group.
Analysis: the domestic servers' CPUs beat the first group on core count and cache size, but the slightly lower clock may hurt single-core performance.

Memory configuration
First group of servers:
At least 4 DIMMs, 32 GB each, at 3200 MHz, with ECC.
Domestic servers:
At least 8 DIMMs of the same capacity and frequency, with ECC.
Analysis: the domestic servers have the edge in DIMM count, which may help with large databases or memory-intensive applications.

Storage configuration
First group of servers:
System disk: >=960 GB SSD, SATA interface.
Data disk: PCIe 4.0 interface, >=3.84 TB SSD, sequential read >=6500 MB/s.
Domestic servers:
System disk: >=480 GB SSD, SATA interface.
Data disk: SATA interface, >=1.92 TB SSD.
Analysis: the first group's data disks offer higher capacity and markedly faster reads, better suited to scenarios that demand fast data processing, such as high-performance computing and big-data analytics.

Network and RAID cards
The two groups have identical network configurations: both come with dual-port 10 GbE fiber NICs supporting bond0, fit for high-speed networks.
RAID cards:
First group: RAID card cache above 1 GB.
Domestic servers: RAID card cache above 2 GB, more favorable for large-scale data processing and buffering.
Suitable scenarios:

First group of servers: good for applications needing high compute performance and fast data access, such as scientific research, graphics rendering, and financial modeling.
Domestic servers: with more memory and a larger RAID cache, better suited to data-intensive tasks and large-scale multitasking, such as cloud services and large enterprise applications.
Overall, each group has its strengths; the choice should follow actual needs and budget. The domestic servers' advantages in memory and cache may offer better value for particular uses.

GPU servers

Continuing with these two groups of servers, we can dig deeper into their performance, features, and potential use cases:

CPU performance
The first and second groups are essentially identical on CPU specs:
Both have two CPUs, each with at least 24 cores, at least 2.2 GHz, and at least 64 MB of cache.
This configuration provides balanced multi-core and single-core capability, suitable for a broad range of applications.
Memory configuration
The memory specs are identical:
Both groups have 8 x 32 GB DDR4 DIMMs at 3200 MHz, with ECC.
Such a memory configuration can effectively support large-scale data processing and complex computation.
GPU configuration
First group of servers:
Fitted with 8 H800 GPUs, but without detailed performance specs.
Second group of servers:
Explicitly requires at least 768 TFLOPS of half-precision compute for the whole machine, at least 128 GB of total GPU memory, at least 800 GB/s of memory bandwidth, and high-bandwidth GPU-to-GPU interconnect (at least 192 GB/s).
A GPU configuration like this is very well suited to high-performance computing, deep learning, and complex graphics workloads.
Storage configuration
System and data disks:
The two groups are nearly identical: SATA SSDs, system disk >=480 GB, data disk >=1.92 TB.
Network and RAID cards
Identical network configuration:
Both have dual-port 10 GbE fiber NICs with bond0 support, ensuring fast, stable networking.
RAID cards differ slightly:
First group: RAID card cache above 1 GB.
Second group: RAID card cache above 2 GB; for workloads with heavy read/write traffic, the larger cache delivers better performance.
Suitable scenarios
First group of servers:
With the GPU performance left unspecified, these machines probably fit general data-center applications such as virtualization, database management, and mid-scale computing.
Second group of servers:
The explicit high-end GPU requirements make them ideal for AI training, scientific computing, and any workload needing massive parallelism.
In sum, the second group clearly beats the first on GPU performance, especially for compute-intensive, high-load tasks. Machines configured like this suit professional fields that demand extensive compute, such as artificial intelligence, simulation, and high-performance computing (HPC).

A strange problem

The paid version of ChatGPT-4 seemed unable to obtain concrete specs for the H800, the export-cut variant, via web search, whereas Kimi showed a flexible, pragmatic side and gave the answer below.
Kimi's answer

Based on the search results, the exact half-precision (FP16) compute of the H800 GPU is not given directly. However, some related information can be inferred from them: one search result mentions that the H800's compute is roughly 50% of the A100's and 15% of the H100's. Since the H100 delivers 4 petaflops of FP16 compute, we can use that to estimate the H800's half-precision compute.

If the H800 sits at 15% of the H100, its FP16 compute would be:

\[ 4 \text{ petaflops} \times 0.15 = 0.6 \text{ petaflops} \]

That is, the H800's half-precision compute is roughly 0.6 petaflops. Note that this is an estimate based on available information, not a figure taken directly from the search results.

What is NIM up to?

To understand what NIM is for, I excerpted points of interest from the articles below; first, a photo I took during the GTC online session.

Nvidia uses NIM to accelerate generative AI

Nvidia Looks to Accelerate GenAI Adoption with NIM

Kimi's reply:

If you are only using the Nvidia Inference Microservice (NIM) on a single machine or an edge device, you probably do not need Helm. Helm is a package manager for Kubernetes, used mainly to automate application deployment in Kubernetes environments. When you run NIM on a single machine or edge device, your environment likely does not depend on Kubernetes, so Helm is not needed for deployment and management.

In that case, you can deploy NIM directly as a containerized microservice, or integrate it into your existing infrastructure. NIM is designed to adapt flexibly to different deployment scenarios, including but not limited to Kubernetes clusters, cloud environments, virtual machines, and physical machines. Whether you are in a data center, a private cloud, or an edge environment, you can use NIM to accelerate AI inference.

Moreover, NIM's containerized nature means it can migrate and run across different environments without large changes to the underlying infrastructure. This flexibility makes NIM a good choice for deploying AI models across many kinds of computing environments.

In short, if your use case does not involve a Kubernetes cluster, you do not need Helm. You can deploy and run NIM directly, or integrate it into your single-machine or edge environment, to achieve efficient AI inference.
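
As a rough, non-authoritative illustration of that non-Kubernetes path: a containerized microservice of this kind would be launched with plain Docker on a GPU host. The image path, port, and API-key variable below are placeholders, not an official NIM invocation:

docker run --gpus all -p 8000:8000 \
    -e NGC_API_KEY=<your-key> \
    nvcr.io/nim/<publisher>/<model-image>:<tag>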

From "inference service" to "inference microservice"

What’s a NIM? Nvidia Inference Microservices is new approach to gen AI model deployment that could change the industry

  • Nvidia Inference Microservices (NIM), which packages optimized inference engines, industry-standard APIs and support for AI models into containers for easy deployment. While NIM provides prebuilt models, it also allows organizations to bring their own proprietary data and will support and help to accelerate Retrieval Augmented Generation (RAG) deployment.
What exactly is Nvidia NIM?

At the most basic level, a NIM is a container full of microservices. 

The container can include any type of model, ranging from open to proprietary models, that can run anywhere there is an Nvidia GPU — be that in the cloud, or even just in a laptop. In turn, that container can be deployed anywhere a container can run:
  • a Kubernetes deployment in the cloud,
  • a Linux server, or
  • even a serverless Function-as-a-Service model. Nvidia will have the serverless function approach on its new ai.nvidia.com website, where developers can go to begin working with NIM prior to deployment.

To be clear, a NIM isn’t a replacement for any prior approach to model delivery from Nvidia. It’s a container that includes a highly optimized model for Nvidia GPUs along with the necessary technologies to improve inference.
  • Better support for RAG
The RAG approach will benefit from the integration of NVIDIA NeMo Retriever microservices inside of NIM deployments. NeMo Retriever is a technology that Nvidia announced in November 2023 as an approach to help enable RAG with an optimized approach for data retrieval.

Besides langchain and llamaindex, yet another application-framework company has now emerged: deepset

Nvidia NIM website
https://www.nvidia.com/en-us/ai/

https://build.nvidia.com/mistralai/mixtral-8x7b-instruct

The difference between GCP free credits and free services

GCP spot VMs do not support paying with free tier credits, which gave me a scare.

It turned out I was worrying over nothing, because I had conflated the free tier with free credits.

  • free tier: a set of Google services (e.g. Maps, compute) that users may consume for free, in limited quantities and of limited kinds
  • free credits: US$300 that users receive after signing up, valid for 90 days, to spend on any of Google's services.

Recreating the GCP instance and installing

The short-lived GCP instance

  • For reasons unknown to me at the time, my first instance on Google Cloud (created 20240215, during the Spring Festival) was shut down and could not be restarted. The official explanation is given below:
  • Analysis: when setting the firewall rules of the VPC network, following the EVE installation guide, I added a custom egress (outbound) rule covering all TCP ports (0-65535) and the entire address range (IPv4, 0.0.0.0/0), so the instance was forcibly stopped after running barely two days.
  • With no better option, I emailed Google Cloud support while also thinking about a fix. A reply came quickly,
    but it was of no use whatsoever; beyond a polite thank-you there was nothing I could do.
  • Luckily this is the cloud: just delete the instance and redeploy (a pity that CA$85 evaporated in just a few days).

Recreating the GCP instance

  • Following EVE's requirements, recreate a VM with VMX capability, i.e. a nested ("matryoshka") VM;
  • This time I deliberately picked a spot VM, the cheapest of the cheap among all VMs, reportedly around 60% off, at the cost of possible shutdown at any moment. Here is the official introduction:
This page describes Spot VMs: virtual machine (VM) instances that are excess Compute Engine capacity. Spot VMs have significant discounts, but Compute Engine might preemptively stop or delete (preempt) Spot VMs to reclaim the capacity at any time. Spot VMs are the latest version of preemptible VMs.

The provisioning model of a VM affects its discounts and availability; Compute Engine VMs can use either the standard provisioning model (standard VMs) (default) or the spot provisioning model (Spot VMs). In contrast to standard VMs, Spot VMs are available at much lower prices but have lower availability because they are subject to preemption. Use Spot VMs to reduce costs for fault-tolerant workloads.

To learn more about Compute Engine VMs in general, read the Virtual machine instances documentation. To learn how to create Spot VMs, read Create and use Spot VMs.

What are Spot VMs?

Spot VMs are available at much lower prices—60-91% discounts for most machine types and GPUs as well as smaller discounts for local SSDs and A3 VMs—compared to the on-demand price for standard VMs. However, Compute Engine might preempt Spot VMs at any time, such as when it needs the resources for other tasks. At this uncertain preemption time, Compute Engine either stops (default) or deletes your Spot VMs depending on your specified termination action for each Spot VM. Spot VMs are excess Compute Engine capacity, so their availability varies based on Compute Engine usage. Spot VMs do not have a minimum or maximum runtime unless you specifically limit the runtime.

If your workloads are fault-tolerant and can withstand possible VM preemption, Spot VMs can reduce your Compute Engine costs significantly. For example, batch processing jobs can run on Spot VMs. If some of those VMs stop during processing, the job slows but does not completely stop. Spot VMs complete your batch processing tasks without placing additional load on your existing VMs and without requiring you to pay full price for additional standard VMs.
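
For reference, here is a sketch of creating such a machine with gcloud; the instance name and zone are my own placeholders, while --enable-nested-virtualization and --provisioning-model=SPOT are the two flags that matter for this setup:

gcloud compute instances create eve-ng-lab \
    --zone=us-central1-a \
    --machine-type=n2-standard-4 \
    --min-cpu-platform="Intel Cascade Lake" \
    --enable-nested-virtualization \
    --provisioning-model=SPOT \
    --instance-termination-action=STOP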

  • As you can see, in spot mode a Google VM can cost as little as US$0.01 per hour
  • Below is my selection
  • It can be pushed even further: a stopped spot VM is not billed for compute (disk and static IP are still billed), so adding a schedule that stops and starts the machine makes everything perfect.
    When the schedule runs to start or stop the machine, the following warning appears:

Compute Engine System service account service-111462968063@compute-system.iam.gserviceaccount.com needs to have [compute.instances.start] permissions applied in order to perform this operation.

The solution below fixes it:
Solution:

  • Login to the Google Cloud Console
  • Go to IAM & Admin
  • Select IAM in the left panel
  • Click the box Include Google-provided role grants on the right side of the window. This enables showing Google-managed service accounts.
  • Locate the service account: press CTRL+F to open a search box and enter the text compute-system.
  • Click the pencil icon to edit the service account.
  • Click ADD ANOTHER ROLE
  • In the role filter enter Compute. Select Compute Instance Admin (v1).
  • Click SAVE

Wait a few minutes before trying to use the new permissions.
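
The same grant can also be done in one gcloud command; a sketch, with the project ID as a placeholder and the service-account address taken from the warning above:

gcloud projects add-iam-policy-binding <PROJECT_ID> \
    --member="serviceAccount:service-111462968063@compute-system.iam.gserviceaccount.com" \
    --role="roles/compute.instanceAdmin.v1"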

  • Once that's done, upload the switch OS files (see the section below)
  • Install EVE and start it
  • After the EVE UI came up, adding nodes and starting switches worked fine, but telnet would not work (ssh was fine)
    • added an ingress firewall rule for TCP 23: no effect
    • added an egress firewall rule for TCP 23: no effect
    • added an ingress rule for TCP 0-65535, IP 0.0.0.0/0: telnet worked! (See the sketch after this list.)
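
Below is a sketch of the rule that finally worked, expressed as a gcloud command (the rule name is made up). A plausible explanation for why opening port 23 alone failed: EVE-NG exposes each node's console on its own high TCP port, so the telnet sessions never actually arrive on port 23. In practice the source range should also be narrowed to your own IP rather than 0.0.0.0/0:

gcloud compute firewall-rules create allow-eve-consoles \
    --direction=INGRESS --action=ALLOW \
    --rules=tcp:0-65535 --source-ranges=0.0.0.0/0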

Uploading the switch and router system images

  1. Cisco IOL/IOS image files

    1. Upload the IOL file (suffix .bin) to the directory: /opt/unetlab/addons/iol/bin/
    2. Use the following command to fix the permissions (do not forget this):
    /opt/unetlab/wrappers/unl_wrapper -a fixpermissions

    3. Run the license generator. There is no need to fix the permissions of the license file, since we run it with Python:
    python3 CiscoIOUKeygen3f.py

    4. Use the following commands to test whether the IOL/IOS image is installed correctly:
    cd /opt/unetlab/addons/iol/bin
    touch NETMAP
    LD_LIBRARY_PATH=/opt/unetlab/addons/iol/lib /opt/unetlab/addons/iol/bin/<iosname.bin> 1
    
    ***************************************************************
    IOS On Unix - Cisco Systems confidential, internal use only
    Under no circumstances is this software to be provided to any
    non Cisco staff or customers.  To do so is likely to result
    in disciplinary action. Please refer to the IOU Usage policy at
    wwwin-iou.cisco.com for more information.
    ***************************************************************
    
                  Restricted Rights Legend
    
    Use, duplication, or disclosure by the Government is
    subject to restrictions as set forth in subparagraph
    (c) of the Commercial Computer Software - Restricted
    Rights clause at FAR sec. 52.227-19 and subparagraph
    (c) (1) (ii) of the Rights in Technical Data and Computer
    Software clause at DFARS sec. 252.227-7013.
    
               cisco Systems, Inc.
               170 West Tasman Drive
               San Jose, California 95134-1706
    
  2. Other Cisco image files
    1.

  3. Huawei image files
    1.

  4. H3C image files
    1.

oracle cloud and google cloud

I stayed busy at home from Spring Festival all the way to the fifth day of the New Year, mainly fiddling with EVE-NG: I set it up on two Mac machines plus one Google Cloud instance. In the end GCP was the most capable; after all, US$300 in credits is not a joke.

oracle cloud

Before getting to Google Cloud, I have to mention the Oracle Cloud account I signed up for last year, the so-called forever-free cloud. The specs are nothing special, but it wins on being free, and it has worked quite well. The only annoyance was having to log in through Oracle's own shell. Today I suddenly realized that token-based login and ssh password login are not mutually exclusive; I can only blame my own poor skills.

vi /etc/ssh/sshd_config

In the configuration file above, change two settings:

  • PasswordAuthentication yes
  • PermitRootLogin yes

Then restart the ssh service

systemctl restart sshd

and ssh password login will work.
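
A small safeguard for next time: validate the edited file before restarting, since a syntax error in sshd_config can lock you out of a remote VM:

sudo sshd -t   # prints nothing when the configuration is valid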

google cloud

Why bother with Google Cloud? My own goal is to find a network engineering job at AWS, so that in this bitterly cold land I won't have to wash dishes or stock shelves in a supermarket. While looking into how to deploy the EVE-NG software, I discovered it can be deployed on a cloud host. The only special requirement is nested virtualization support, i.e. running a VM inside a VM, which needs a particular technology: VMX, which lets a VM expose the hardware virtualization features to its own guests, quite similar to GPU passthrough. Unfortunately, Alibaba Cloud in China does not allow this; from what I can see, only Google Cloud supports it well.

So this morning I hurried to take advantage of Google's current promotion and got a free host for three months; in that time I should be able to make some real progress in networking.

Here are a few takeaways from this round:

  1. When requesting the instance, choose 4 vCPUs and 16 GB of RAM, because EVE may need to support 100 nodes, each perhaps carrying a ~1 GB image; with weak hardware it is unplayable;
  2. The disk can be lower-end: I went with standard rather than SSD, which at 50 GB saves US$7 per month; I felt genuinely thrifty about it;
  3. The CPU platform must be Intel Cascade Lake or Skylake: it is cheaper, and VMX support is better (see the check after this list);
  4. Three months of billing comes to roughly CA$350, about to blow through the US$300 credit, which feels a little urgent;
  5. After the VM is created, you will find that the ssh environment differs from the Cloud Shell environment: ssh lands in Ubuntu 20, the nested VM's OS installed with the command from the EVE website, while Cloud Shell runs Debian with a much newer kernel. I have not fully figured out why the two differ; after all, nothing in the instance-creation flow asked me to pick an OS.
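
To confirm that nested virtualization is actually exposed inside the guest, a standard check on any Linux VM:

grep -c -E 'vmx|svm' /proc/cpuinfo   # a non-zero count means hardware virtualization is available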