I've been testing sensenova's API recently, and uploading everything in code every time was getting tedious, so I decided to build a small app.
Step one: system design.
I used the Software Architecture Visualiser tool inside ChatGPT.
The result of the first generation:
For the second pass I reworked the input and, after a few syntax errors, it generated the following:
Overall assessment: the developer manual is fairly comprehensive, but some key concepts are not very beginner-friendly, and in many places it falls well short of ChatGPT. Then I looked at the price, and every one of those drawbacks turned into an advantage. :)
There is an official Python sensenova library; installing it with pip is all it takes (a minimal setup sketch follows).
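For reference, a minimal install-and-configure sketch. The package name comes from the docs; the two credential attribute names below are my reading of the SDK and worth verifying against the official reference:

pip3 install sensenova

import sensenova

# Credentials come from the SenseNova console; attribute names assumed from the SDK docs.
sensenova.access_key_id = "YOUR_ACCESS_KEY_ID"
sensenova.secret_access_key = "YOUR_SECRET_ACCESS_KEY"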
1. With the official library, though, building a knowledge base currently only supports the knowledge 1 style of parameter, i.e. JSON format. If you want knowledge 2, i.e. uploading PDF, Word, and similar files directly, you have to call the API directly, which means using the Python requests library. A small difference, but a small trap all the same.
For files, be clear about whether they are meant for fine-tuning or for a knowledge base; the file formats for the two purposes are completely different, as follows:
The second kind is supported only through the raw API; a rough upload sketch follows below.
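Since knowledge 2 has to go through the raw HTTP API, here is a rough requests sketch. The endpoint URL, auth header, and form fields are placeholders (my assumptions, not from the official docs), so take the real values from the SenseNova API reference:

import requests

API_TOKEN = "YOUR_API_TOKEN"  # token derived from your access key pair
UPLOAD_URL = "https://file.sensenova.cn/v1/files"  # assumed endpoint; verify in the docs

# Upload a PDF as a knowledge-base source file.
with open("incident_manual.pdf", "rb") as f:
    resp = requests.post(
        UPLOAD_URL,
        headers={"Authorization": "Bearer " + API_TOKEN},
        files={"file": f},
        data={"description": "knowledge base source file"},  # assumed field
    )
print(resp.json())  # the returned file id is what you attach to the knowledge base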
I ran into a problem on the project: starting from an uploaded knowledge base (you upload files first and they are converted into a knowledge base, presumably vectorized), I needed to classify a particular emergency incident. But ChatGPT and SenseNova gave the same wrong answer.
OpenAI is still not much good at classical Chinese poetry.
Here is the sample code for SenseTime's SenseNova large language model (chat completion, no conversation history):
import sys

import sensenova

stream = True  # streaming output or not
model_id = "nova-ptc-xl-v1"  # fill in a real model ID

resp = sensenova.ChatCompletion.create(
    messages=[{"role": "user", "content": "人生天地间,下一句是啥"}],  # asks for the next line of a classical Chinese poem
    model=model_id,
    stream=stream,
    max_new_tokens=1024,
    n=1,
    repetition_penalty=1.05,
    temperature=0.9,
    top_p=0.7,
    know_ids=[],
    user="sensenova-python-test-user",
    knowledge_config={
        "control_level": "normal",
        "knowledge_base_result": True,
        "knowledge_base_configs": [],
    },
    plugins={
        "associated_knowledge": {
            "content": "需要注入给模型的知识",  # knowledge to inject into the model
            "mode": "concatenate",
        },
        "web_search": {
            "search_enable": True,
            "result_enable": True,
        },
    },
)

# A non-streaming call returns a single response; wrap it so both paths share one loop.
if not stream:
    resp = [resp]
for part in resp:
    choices = part["data"]["choices"]
    for c_idx, c in enumerate(choices):
        if len(choices) > 1:
            sys.stdout.write("===== Chat Completion {} =====\n".format(c_idx))
        if stream:
            delta = c.get("delta")
            if delta:
                sys.stdout.write(delta)
        else:
            sys.stdout.write(c["message"])
        if len(choices) > 1:
            sys.stdout.write("\n")
sys.stdout.flush()
Now let's walk through it line by line.
This code seems to be a part of a chatbot implementation using the Sensenova framework, where it sends a message to a chat model and processes the response, either in a streaming manner or as a complete response.
top_p: the nucleus sampling parameter, used while decoding generated tokens.
The sentence "nucleus sampling parameter: when decoding generated tokens, sample from the smallest token set whose cumulative probability is at least top_p" describes a mechanism called top-p (nucleus) sampling, used when a language model generates its reply. In detail:
Nucleus sampling (top-p sampling): a technique used in natural language processing to generate text. When producing the next token, nucleus sampling considers only the most probable tokens whose cumulative probability reaches the threshold top_p.
Decoding generated tokens: a language model produces its reply one token at a time; decoding is the process of deciding which token to emit at each step.
Sampling from the smallest token set whose cumulative probability is at least top_p: at each step the model looks at every candidate next token and its probability, accumulates those probabilities, and keeps the smallest set whose cumulative sum reaches top_p. The next token is then sampled from that set only, rather than from the whole vocabulary.
For example, with top_p set to 0.7, the model considers just the tokens whose cumulative probability reaches 70%. This keeps the generated text reasonably diverse (the most likely token is not always chosen) while preserving coherence and readability.
Nucleus sampling is an effective way to balance diversity against predictability in generated text, and is widely used in deep-learning-based language generation models.
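To make the mechanism concrete, here is a tiny self-contained sketch of top-p sampling over a made-up five-token distribution (toy numbers, not tied to any real model):

import numpy as np

def top_p_sample(tokens, probs, top_p=0.7, rng=np.random.default_rng()):
    probs = np.asarray(probs)
    order = np.argsort(probs)[::-1]          # tokens sorted by descending probability
    cumulative = np.cumsum(probs[order])
    # smallest prefix whose cumulative probability reaches top_p
    cutoff = int(np.searchsorted(cumulative, top_p)) + 1
    nucleus = order[:cutoff]
    nucleus_probs = probs[nucleus] / probs[nucleus].sum()  # renormalize inside the nucleus
    return tokens[rng.choice(nucleus, p=nucleus_probs)]

tokens = ["the", "a", "this", "that", "xyzzy"]
probs = [0.45, 0.25, 0.15, 0.10, 0.05]
# With top_p=0.7, only "the" and "a" (0.45 + 0.25 = 0.70) form the nucleus.
print(top_p_sample(tokens, probs))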
Instance:
In Python, an instance is a specific object created from a class. A class is a blueprint that defines the members (attributes and methods) it supports, and an instance is a concrete object created based on this blueprint. Each instance has the attributes and methods defined in the class.
Example: If you have a Dog class, when you create a Dog object, that object is an instance of the Dog class.
class Dog:
    def __init__(self, name):
        self.name = name

my_dog = Dog("Buddy")  # my_dog is an instance of the Dog class
Object:
In Python, an object is the fundamental unit in the program, and almost everything is an object, including numbers, strings, data structures, functions, and even classes themselves. Each object has a type (indicating what it is) and identity (a unique identifier).
Example: In the Dog class above, my_dog is not just an instance, but also an object.
Item:
In Python, an item typically refers to a single element within a container type like a list, tuple, or dictionary.
Example: In a list, each element at a position is an item.
numbers = [1, 2, 3]
first_item = numbers[0] # first_item is an item in the list, with a value of 1
Attribute:
In Python, an attribute is a variable that belongs to an object. In a class definition, attributes are variables defined outside of methods and are used to store the state of an instance.
Example: In the Dog class, name is an attribute.
class Dog:
    def __init__(self, name):
        self.name = name  # name is an attribute

my_dog = Dog("Buddy")
print(my_dog.name)  # Accessing the name attribute of my_dog
In this example, Dog is a class, my_dog is an instance of the Dog class (and also an object), name is an attribute of the my_dog object, and [1, 2, 3] is a list where 1, 2, and 3 are items in the list.
After upgrading openai a couple of days ago, I spent the following two days fighting the error below: the openai module has no error attribute.
File /usr/local/lib/python3.11/site-packages/langchain/chat_models/openai.py:77, in _create_retry_decorator(llm, run_manager)
68 def _create_retry_decorator(
69 llm: ChatOpenAI,
70 run_manager: Optional[
71 Union[AsyncCallbackManagerForLLMRun, CallbackManagerForLLMRun]
72 ] = None,
73 ) -> Callable[[Any], Any]:
74 import openai
76 errors = [
---> 77 openai.error.Timeout,
78 openai.error.APIError,
79 openai.error.APIConnectionError,
80 openai.error.RateLimitError,
81 openai.error.ServiceUnavailableError,
82 ]
83 return create_base_retry_decorator(
84 error_types=errors, max_retries=llm.max_retries, run_manager=run_manager
85 )
AttributeError: module 'openai' has no attribute 'error'
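The root cause: in openai>=1.0 the exception classes moved from the openai.error submodule to the package's top level, so any library that still imports openai.error (as this langchain version does) breaks. A quick check, using names I know exist in the 1.x SDK:

import openai

# openai>=1.0 exposes exceptions at the top level; Timeout is now APITimeoutError.
print(openai.APIError, openai.APIConnectionError, openai.RateLimitError, openai.APITimeoutError)
# The old path is gone:
# openai.error.Timeout  # AttributeError: module 'openai' has no attribute 'error'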
I started to get a bit anxious: openai was already at the latest version and I didn't want to downgrade, so I began commenting out the offending code in openai.py. That only traded one problem for several, and all sorts of strange errors popped up.
I had no choice but to check the versions of the installed packages:
pip3 list --outdated
It turned out langchain needed upgrading too; that last move settled it:
pip3 install --upgrade langchain
After upgrading to the new version, the problem was solved:
pip3 show langchain
Name: langchain
Version: 0.0.352
Summary: Building applications with LLMs through composability
Home-page: https://github.com/langchain-ai/langchain
Author:
Author-email:
License: MIT
Location: /usr/local/lib/python3.11/site-packages
Requires: aiohttp, dataclasses-json, jsonpatch, langchain-community, langchain-core, langsmith, numpy, pydantic, PyYAML, requests, SQLAlchemy, tenacity
If I had followed the advice circulating online, a forced downgrade would have been cutting off my own limbs:
pip install openai==0.28.1
pip install langchain==0.0.330
This is a course co-taught by Andrew Ng and the founder of LangChain. I'm currently taking two of the courses: one introduces the application framework, the other covers using LangChain with your own data.
langchain application course
## My initial thoughts
# Plain OpenAI SDK call (the pre-1.0 style used in the course):
response = openai.ChatCompletion.create(
    model=model,
    messages=messages,
    temperature=0,
)

# The LangChain equivalent wraps the same model behind a chat-model object:
chat = ChatOpenAI(temperature=0.0, model=llm_model)
Wittgenstein said: how you use language is how you think, and how you think is how you use language.
How you use language comes down to which keywords your language contains, and the order in which those keywords appear.
"Language is not merely a vehicle for conveying thought; it also shapes and limits what we are able to think." – This comes from Wittgenstein's Tractatus Logico-Philosophicus. Wittgenstein is making a basic point about language and thought: language is not just used to express thoughts we already have; it actually shapes and constrains how we think. In other words, what we are able to think, and how we think it, is largely bounded by the structure and vocabulary of the language we use. For example, if a language has no words for a particular concept, its speakers may find that concept hard to grasp or to reason about. Language is thus not only the carrier of thought but also its shaper and its limiter.
"Our language is the mirror of our thinking." – From Culture and Value. Here Wittgenstein stresses the reflection and interplay between language and thought: language is treated as a tool that mirrors how we think. Like a mirror, it displays our patterns of thought and our mental processes, which means that by observing and analyzing our own use of language we can learn about our habits and ways of thinking. The remark again underlines the inseparability of language and thought: our language reveals the character and tendencies of our thinking.
VS Code's Jupyter notebook output area is only one line high by default, which is very inconvenient when displaying long output.
Here is how to adjust it:
1 Open Code / Settings / Settings
2 Search for "notebook output"
3 Tick the word wrap checkbox
4 To enlarge the output area further, set line height to any number from 1 to 7 (see the settings.json sketch after this list).
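The same change can be made directly in settings.json; the two setting IDs below are what I believe VS Code uses for these options, so verify them in your version:

{
    // wrap long lines in notebook cell outputs
    "notebook.output.wordWrap": true,
    // a larger line height enlarges the output area (the UI above offers 1-7)
    "notebook.output.lineHeight": 4
}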
The adjusted result:
% pip3 install --upgrade openai
openai.ChatCompletion.create() -> client.chat.completions.create()
# openai>=1.0 style. Wrapped in a (hypothetical) helper so the return statement is valid;
# the default model name is illustrative only.
def get_completion(messages, model="gpt-3.5-turbo"):
    response = openai.chat.completions.create(
        model=model,
        messages=messages,
        temperature=0,
    )
    return response.choices[0].message.content
I didn't actually switch over to a client, and it still worked. But judging from Andrew Ng's notes below, the response should also be read through attribute access rather than subscript access.
Still, after ChatGPT's explanation I settled on calling the API through a client instance, for the reasons below (more flexible, better support for asynchronous operation, more in line with industry convention).
The shift from using openai.chat to client.chat in the OpenAI API reflects a change in the design of the OpenAI Python library. Here are the key reasons for this change:
Client Instance Configuration:
Using a client instance allows for more flexible configuration. You can create multiple client instances with different configurations (e.g., different API keys, different base URLs for testing or production environments) within the same application. This is more scalable and adaptable for complex applications or those that need to interact with multiple environments or configurations.
Asynchronous Support:
The new API design is likely structured to better support asynchronous operations. By using a client object, it's easier to manage asynchronous requests and responses, which are becoming increasingly important for efficient handling of network operations, especially in web applications and services.
Consistency with Other SDKs:
Many modern API SDKs (Software Development Kits) use a similar pattern where you create a client instance and then use methods on this instance to interact with the API. This design pattern is familiar to many developers and aligns with best practices in software development.
Encapsulation and Extensibility:
By using a client object, the OpenAI library can encapsulate functionality and state more effectively. This makes the library easier to extend and maintain. It also allows for better handling of resources like network connections.
Error Handling and Debugging:
A client-based approach can offer improved error handling and debugging capabilities. Since each client instance can manage its own state and configuration, it's easier to trace and handle errors, log activities, and debug issues.
In summary, the shift to using a client instance in the OpenAI Python library is likely driven by a desire for greater flexibility, support for asynchronous operations, consistency with modern software development practices, improved encapsulation and extensibility, and enhanced error handling and debugging capabilities.
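To make the difference concrete, a minimal sketch of the client-instance style in openai>=1.0 (the API key is read from the OPENAI_API_KEY environment variable by default):

from openai import OpenAI

client = OpenAI()  # each client instance can carry its own key / base URL

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello"}],
    temperature=0,
)
print(response.choices[0].message.content)  # attribute access, not dict indexing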
In more detail, the benefits of asynchronous calls:
The emphasis on asynchronous support in the new API design, particularly with the use of a client object, reflects a broader trend in software development towards more efficient and scalable network operations. Here's a deeper look into why this is beneficial:
1. Improved Performance and Scalability:
Non-blocking Calls: Asynchronous operations allow your application to make non-blocking network calls. This means that your application doesn't have to wait for a response from the OpenAI server before continuing with other tasks. This is especially beneficial in web applications where multiple users might be making requests at the same time.
Handling Multiple Requests: Asynchronous programming is more efficient at handling multiple simultaneous network requests. This is crucial for high-load applications that need to maintain responsiveness under heavy request volumes.
2. Better Resource Utilization:
Concurrency: Asynchronous operations enable better utilization of system resources. While waiting for a response from an API call, your application can perform other tasks, thereby making better use of the CPU and other resources.
Reduced Latency: In a synchronous model, each operation must complete before the next one starts, potentially leading to higher latency. Asynchronous operations can overlap, which can reduce overall latency in the application.
3. Enhanced User Experience:
Responsive Applications: In a web or mobile application, asynchronous operations can significantly enhance the user experience. Users aren't left waiting for operations to complete and can continue interacting with other parts of the application.
Real-time Updates: Asynchronous programming facilitates real-time updates to the user interface, which can be crucial for applications that require immediate feedback, such as chatbots or live data monitoring.
4. Simplified Error Handling:
Asynchronous workflows: Asynchronous programming often comes with more sophisticated ways to handle errors and exceptions. For instance, in Python's asyncio, you can use try/except blocks within asynchronous functions to manage exceptions effectively.
5. Alignment with Modern Web Standards:
WebSockets and HTTP/2: Modern web protocols like WebSockets and HTTP/2 are designed to work efficiently with asynchronous communication, making it a natural fit for applications that leverage these technologies.
Implementation in Python:
asyncio Library: Python’s asyncio library is a popular choice for writing asynchronous code. It provides a framework for writing single-threaded concurrent code using coroutines, multiplexing I/O access, and running network clients and servers.
Integration with Frameworks: Many modern Python web frameworks (like FastAPI, Sanic, etc.) are built with native support for asynchronous operations, making it easier to integrate with asynchronous APIs like OpenAI's.
In summary, the shift towards asynchronous support with a client object in API design is a response to the growing need for more efficient, scalable, and responsive applications, particularly in the context of web and network services. It aligns well with modern software practices and technological advancements.
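And a small sketch of the async variant using AsyncOpenAI from the same 1.x SDK; while one request is awaiting the network, the event loop is free to run the others:

import asyncio

from openai import AsyncOpenAI

client = AsyncOpenAI()

async def ask(prompt: str) -> str:
    resp = await client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

async def main():
    # fire several requests concurrently instead of one by one
    answers = await asyncio.gather(ask("ping"), ask("pong"))
    print(answers)

asyncio.run(main())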
Yesterday I went to the Longhua Funeral Parlor to see Prof. Tang off on his final journey.
Yesterday morning a light, icy rain was drifting down. We skipped the company shuttle and the three of us made our own way there. A great many people had come to pay their respects, and we were stopped at the entrance. At first we waited obediently, but we soon noticed the temporary staff had no principles at all about who had to queue and who could go straight in; it was chaos. When we asked why other people didn't need to queue, they just murmured, "That's a leader." A rather comic little episode.
Fortunately I've always been casual about rules, so instead of waiting I walked straight in, and found it wasn't crowded inside at all.
Prof. Tang's memorial hall was deeply solemn, and the wreaths from state leaders stood out. Receiving the mourners on one side were those closest to him: Prof. Yang, their son, Xu Li, Xiaogang, Xu Bing, and others.
After leaving Longhua we had lunch at 日月光. Around the table everyone sighed over the loss, and worried somewhat about the company's future.
I was more worried about myself, quite concretely: how to keep my body and spirit in good shape. I hope not merely to live long but to live well in the years ahead; only then will I have the capital to keep exploring the universe for another fifty years, a universe with AI for company.
A colleague at the company, probably a designer on 秒画, used AI to generate a painting and posted it anonymously on maimai.
I think that is the best farewell to Prof. Tang.