API Request
URLs
POST zstack/v1/ai/model-services
Headers
Authorization: OAuth the-session-uuid
Body
{
"param": {
"name": "text to text model service",
"description": "This is text to text model service you can chose model",
"yaml": "services:\n - ports:\n - 3000\n name: qwen1.5-7b-chat:2b34xhrmqwhomjkd\n livez: /livez\n readyz: /readyz\n serviceBootupTime: 30\nenv:\n - key:value\n - key:value\ndistro:\n packages: vim,nfs-utils\npython:\n requirements_txt: ./requirements.txt\n index_url: https://pypi.tuna.tsinghua.edu.cn/simple\n trusted_host: pypi.tuna.tsinghua.edu.cn",
"modelCenterUuid": "modelCenterUuidExample",
"installPath": "/opt/zstack/ai/model-services/qwen1.5-7b-chat",
"requestCpu": 4,
"requestMemory": 1024,
"gpuComputeCapability": "7.5",
"startCommand": "python3 app.py",
"pythonVersion": "3.8.10",
"condaVersion": "4.9.2",
"type": "Endpoint",
"source": "Other",
"architectureImages": [
{
"cpuArchitecture": "X86_64",
"vmImageUuid": "vmImageUuidX86Example",
"dockerImage": "registry.example.com/x86_64/myimage:latest"
},
{
"cpuArchitecture": "AARCH64",
"vmImageUuid": "vmImageUuidAarchExample",
"dockerImage": "registry.example.com/aarch64/myimage:latest"
}
],
"supportDistributed": false
},
"systemTags": [],
"userTags": []
}
Note: The systemTags and userTags fields in the example above may be omitted. They are listed only to show that the body can contain these two fields.
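The request body can also be assembled in code, which avoids hand-escaping the newlines inside the yaml string. The following is a minimal sketch using only the Python standard library; the helper name `build_request` is hypothetical, the host and port are assumed to match the curl example in this document, and the YAML is shortened to two lines. The request is built but not sent.

```python
import json
import urllib.request

# Assumption: same host and port as the curl example in this document.
ENDPOINT = "http://localhost:8080/zstack/v1/ai/model-services"


def build_request(session_uuid, param, system_tags=None, user_tags=None):
    """Build (but do not send) the POST request that adds a model service."""
    body = {"param": param}
    # systemTags and userTags may be omitted, so add them only when supplied.
    if system_tags is not None:
        body["systemTags"] = system_tags
    if user_tags is not None:
        body["userTags"] = user_tags
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json;charset=UTF-8",
            "Authorization": "OAuth " + session_uuid,
        },
        method="POST",
    )


# Shortened YAML for illustration; json.dumps converts real newlines to \n escapes.
yaml_text = "\n".join(["distro:", " packages: vim,nfs-utils"])

param = {
    "name": "text to text model service",
    "yaml": yaml_text,
    "modelCenterUuid": "modelCenterUuidExample",
    "installPath": "/opt/zstack/ai/model-services/qwen1.5-7b-chat",
    "requestCpu": 4,
    "requestMemory": 1024,
    "startCommand": "python3 app.py",
}

request = build_request("the-session-uuid", param)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out here so the sketch runs without a server.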
Curl Example
curl -H "Content-Type: application/json;charset=UTF-8" \
-H "Authorization: OAuth b86c9016b4f24953a9edefb53ca0678c" \
-X POST -d '{"param":{"name":"text to text model service","description":"This is text to text model service you can chose model","yaml":"services:\n - ports:\n - 3000\n name: qwen1.5-7b-chat:2b34xhrmqwhomjkd\n livez: /livez\n readyz: /readyz\n serviceBootupTime: 30\nenv:\n - key:value\n - key:value\ndistro:\n packages: vim,nfs-utils\npython:\n requirements_txt: ./requirements.txt\n index_url: https://pypi.tuna.tsinghua.edu.cn/simple\n trusted_host: pypi.tuna.tsinghua.edu.cn","modelCenterUuid":"modelCenterUuidExample","installPath":"/opt/zstack/ai/model-services/qwen1.5-7b-chat","requestCpu":4,"requestMemory":1024,"gpuComputeCapability":"7.5","startCommand":"python3 app.py","pythonVersion":"3.8.10","condaVersion":"4.9.2","type":"Endpoint","source":"Other","architectureImages":[{"cpuArchitecture":"X86_64","vmImageUuid":"vmImageUuidX86Example","dockerImage":"registry.example.com/x86_64/myimage:latest"},{"cpuArchitecture":"AARCH64","vmImageUuid":"vmImageUuidAarchExample","dockerImage":"registry.example.com/aarch64/myimage:latest"}],"supportDistributed":false}}' \
http://localhost:8080/zstack/v1/ai/model-services
Parameter List
| Name | Type | Location | Description | Allowed Values | Since Version |
|---|---|---|---|---|---|
| name | String | body (inside the param structure) | Resource name | | 5.1.8 |
| description (optional) | String | body (inside the param structure) | Detailed description of the resource | | 5.1.8 |
| yaml | String | body (inside the param structure) | Configuration file in YAML format | | 5.1.8 |
| requestCpu | Integer | body (inside the param structure) | Number of CPUs requested | | 5.1.8 |
| requestMemory | Long | body (inside the param structure) | Amount of memory requested | | 5.1.8 |
| zoneUuid (optional) | String | body (inside the param structure) | Zone UUID | | 5.1.8 |
| systemTags (optional) | List | body | System tags | | 5.1.8 |
| userTags (optional) | List | body | User tags | | 5.1.8 |
| modelCenterUuid | String | body (inside the param structure) | | | 5.1.8 |
| gpuComputeCapability (optional) | String | body (inside the param structure) | | | 5.1.8 |
| installPath | String | body (inside the param structure) | | | 5.1.8 |
| system (optional) | Boolean | body (inside the param structure) | | | 5.1.8 |
| startCommand | String | body (inside the param structure) | | | 5.1.8 |
| pythonVersion (optional) | String | body (inside the param structure) | | | 5.1.8 |
| condaVersion (optional) | String | body (inside the param structure) | | | 5.1.8 |
| type (optional) | String | body (inside the param structure) | | | 5.1.8 |
| framework (optional) | String | body (inside the param structure) | | | 5.1.8 |
| resourceUuid (optional) | String | body (inside the param structure) | Resource UUID | | 5.1.8 |
| tagUuids (optional) | List | body (inside the param structure) | List of tag UUIDs | | 5.1.8 |
| source (optional) | String | body (inside the param structure) | | | 5.3.28 |
| modelUuids (optional) | List | body (inside the param structure) | | | 5.3.28 |
| architectureImages (optional) | List | body (inside the param structure) | | | 5.3.28 |
| supportDistributed (optional) | Boolean | body (inside the param structure) | Whether distributed inference deployment is supported | | 5.3.46 |
| containerCommand (optional) | String | body (inside the param structure) | Container startup command. | | 5.5.22 |
| containerArgs (optional) | String | body (inside the param structure) | Container startup arguments. | | 5.5.22 |
| vendorToSpecUuidsMap (optional) | Map | body (inside the param structure) | Mapping from GPU vendor to a list of GPU specification UUIDs. | | 5.5.22 |
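The parameter list distinguishes required fields from optional ones, so a client can check a param structure before sending it. The sketch below is one possible client-side check, not part of the API: `REQUIRED_PARAMS` and `check_param` are hypothetical names, the required set is read from the rows above that are not marked optional, and Integer and Long are both mapped to Python `int`.

```python
# Required fields of the param structure and the Python type assumed for each.
REQUIRED_PARAMS = {
    "name": str,
    "yaml": str,
    "requestCpu": int,
    "requestMemory": int,
    "modelCenterUuid": str,
    "installPath": str,
    "startCommand": str,
}


def check_param(param):
    """Return a list of problems found in a param dict; an empty list means none."""
    problems = []
    for key, expected in REQUIRED_PARAMS.items():
        if key not in param:
            problems.append("missing required field: " + key)
            continue
        value = param[key]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append("wrong type for " + key + ": expected " + expected.__name__)
    return problems


incomplete = {"name": "text to text model service", "requestCpu": "4"}
for problem in check_param(incomplete):
    print(problem)
```

The check covers only presence and basic type; value constraints are enforced by the server and are not modelled here.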
API Response
Response Example
{
"inventory": {
"name": "text to text model service",
"description": "This is text to text model service you can chose model",
"yaml": "model service parameters",
"requestCpu": 4,
"requestMemory": 1024
}
}
#error
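| Name | Type | Description | Since Version |
|---|---|---|---|
| code | String | Error code, a globally unique identifier of the error, for example SYS.1000, HOST.1001 | 5.1.8 |
| description | String | Brief description of the error | 5.1.8 |
| details | String | Detailed information about the error | 5.1.8 |
| elaboration | String | Reserved field, null by default | 5.1.8 |
| opaque | LinkedHashMap | Reserved field, null by default | 5.1.8 |
| cause | ErrorCode | Root error, the source error that triggered the current error; null if there is no source error | 5.1.8 |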
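Because cause is itself an ErrorCode, errors can nest. A short sketch of how a client might walk that chain to the innermost error follows; `root_cause` is a hypothetical helper, and the sample error (including its description and details text) is invented for illustration, reusing only the example codes from the table.

```python
def root_cause(error):
    """Follow the cause chain and return the innermost error dict."""
    current = error
    while current.get("cause") is not None:
        current = current["cause"]
    return current


# Invented sample: an outer error wrapping one source error.
sample_error = {
    "code": "SYS.1000",
    "description": "outer error (illustrative text)",
    "details": "illustrative details",
    "elaboration": None,
    "opaque": None,
    "cause": {
        "code": "HOST.1001",
        "description": "source error (illustrative text)",
        "details": "illustrative details",
        "elaboration": None,
        "opaque": None,
        "cause": None,
    },
}

print(root_cause(sample_error)["code"])
```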
#inventory
| Name | Type | Description | Since Version |
|---|---|---|---|
| uuid | String | Resource UUID, which uniquely identifies the resource | 5.1.8 |
| name | String | Resource name | 5.1.8 |
| description | String | Detailed description of the resource | 5.1.8 |
| readme | String | README content | 5.1.8 |
| yaml | String | Service YAML configuration | 5.1.8 |
| requestCpu | Integer | Number of CPUs requested | 5.1.8 |
| requestMemory | Long | Amount of memory requested | 5.1.8 |
| modelCenterUuid | String | Model center UUID | 5.1.8 |
| type | String | Model service type | 5.1.8 |
| framework | String | Model service framework | 5.1.8 |
| source | String | Model service source | 5.1.8 |
| size | Long | Model service size | 5.1.8 |
| system | Boolean | Whether this is a system model service | 5.1.8 |
| hasNewVersion | Boolean | Whether a new version exists | 5.1.8 |
| gpuComputeCapability | String | Required GPU compute capability | 5.1.8 |
| installPath | String | Model service installation path | 5.1.8 |
| pythonVersion | String | Python version | 5.1.8 |
| condaVersion | String | Conda version | 5.1.8 |
| version | String | Model service version | 5.1.8 |
| startCommand | String | Startup command | 5.1.8 |
| containerCommand | String | Container startup command | 5.1.8 |
| containerArgs | String | Container startup arguments | 5.1.8 |
| supportDistributed | Boolean | Whether distributed deployment is supported | 5.1.8 |
| cpuArchitectures | List | Supported CPU architectures | 5.1.8 |
| vendorToSpecUuidsMap | Map | Mapping from GPU vendor to specification UUIDs | 5.1.8 |
| modelServiceRefs | List | Bindings between models and the model service | 5.1.8 |
| modelServiceImages | List | List of model service images | 5.1.8 |
| createDate | Timestamp | Creation time | 5.1.8 |
| lastOpDate | Timestamp | Time of the last modification | 5.1.8 |
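A client typically needs to tell a successful response from a failed one. The sketch below assumes the response JSON carries the two structures documented here under the keys `inventory` and `error`, and that `error` is absent or null on success; `parse_response` is a hypothetical helper, and the sample text is the response example from this document.

```python
import json


def parse_response(text):
    """Return the inventory dict, or raise RuntimeError if an error is present."""
    payload = json.loads(text)
    error = payload.get("error")
    if error is not None:
        raise RuntimeError("{}: {}".format(error.get("code"), error.get("description")))
    return payload["inventory"]


response_text = """
{
  "inventory": {
    "name": "text to text model service",
    "description": "This is text to text model service you can chose model",
    "yaml": "model service parameters",
    "requestCpu": 4,
    "requestMemory": 1024
  }
}
"""

inventory = parse_response(response_text)
print(inventory["name"], inventory["requestCpu"], inventory["requestMemory"])
```

Fields that the response example does not show, such as uuid or createDate, are better read with `inventory.get(...)` so that a missing key does not raise.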
SDK Examples
Java SDK
AddModelServiceAction action = new AddModelServiceAction();
action.name = "text to text model service";
action.description = "This is text to text model service you can chose model";
action.yaml = "services:\n"
        + " - ports:\n"
        + " - 3000\n"
        + " name: qwen1.5-7b-chat:2b34xhrmqwhomjkd\n"
        + " livez: /livez\n"
        + " readyz: /readyz\n"
        + " serviceBootupTime: 30\n"
        + "env:\n"
        + " - key:value\n"
        + " - key:value\n"
        + "distro:\n"
        + " packages: vim,nfs-utils\n"
        + "python:\n"
        + " requirements_txt: ./requirements.txt\n"
        + " index_url: https://pypi.tuna.tsinghua.edu.cn/simple\n"
        + " trusted_host: pypi.tuna.tsinghua.edu.cn";
action.modelCenterUuid = "modelCenterUuidExample";
action.installPath = "/opt/zstack/ai/model-services/qwen1.5-7b-chat";
action.requestCpu = 4;
action.requestMemory = 1024L; // requestMemory is documented as Long
action.gpuComputeCapability = "7.5";
action.startCommand = "python3 app.py";
action.pythonVersion = "3.8.10";
action.condaVersion = "4.9.2";
action.type = "Endpoint";
action.source = "Other";
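// Assumption: each architecture image is passed as a field-name-to-value map;
// the element type the SDK expects is not shown in this reference.
action.architectureImages = java.util.Arrays.asList(
        java.util.Map.of("cpuArchitecture", "X86_64", "vmImageUuid", "vmImageUuidX86Example", "dockerImage", "registry.example.com/x86_64/myimage:latest"),
        java.util.Map.of("cpuArchitecture", "AARCH64", "vmImageUuid", "vmImageUuidAarchExample", "dockerImage", "registry.example.com/aarch64/myimage:latest"));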
action.supportDistributed = false;
action.sessionId = "b86c9016b4f24953a9edefb53ca0678c";
AddModelServiceAction.Result res = action.call();
Python SDK
action = AddModelServiceAction()
action.name = "text to text model service"
action.description = "This is text to text model service you can chose model"
action.yaml = (
    "services:\n"
    " - ports:\n"
    " - 3000\n"
    " name: qwen1.5-7b-chat:2b34xhrmqwhomjkd\n"
    " livez: /livez\n"
    " readyz: /readyz\n"
    " serviceBootupTime: 30\n"
    "env:\n"
    " - key:value\n"
    " - key:value\n"
    "distro:\n"
    " packages: vim,nfs-utils\n"
    "python:\n"
    " requirements_txt: ./requirements.txt\n"
    " index_url: https://pypi.tuna.tsinghua.edu.cn/simple\n"
    " trusted_host: pypi.tuna.tsinghua.edu.cn"
)
action.modelCenterUuid = "modelCenterUuidExample"
action.installPath = "/opt/zstack/ai/model-services/qwen1.5-7b-chat"
action.requestCpu = 4
action.requestMemory = 1024
action.gpuComputeCapability = "7.5"
action.startCommand = "python3 app.py"
action.pythonVersion = "3.8.10"
action.condaVersion = "4.9.2"
action.type = "Endpoint"
action.source = "Other"
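# Assumption: each architecture image is passed as a dict; the element type
# the SDK expects is not shown in this reference.
action.architectureImages = [
    {"cpuArchitecture": "X86_64", "vmImageUuid": "vmImageUuidX86Example", "dockerImage": "registry.example.com/x86_64/myimage:latest"},
    {"cpuArchitecture": "AARCH64", "vmImageUuid": "vmImageUuidAarchExample", "dockerImage": "registry.example.com/aarch64/myimage:latest"},
]
action.supportDistributed = False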
action.sessionId = "b86c9016b4f24953a9edefb53ca0678c"
res = action.call()