API Request
URLs
POST zstack/v1/ai/model-services/{uuid}
Headers
Authorization: OAuth the-session-uuid
Body
{
"params": {
"name": "example",
"description": "This is an example description.",
"size": 0,
"gpuComputeCapability": "7.5",
"startCommand": "python start.py",
"pythonVersion": "3.8",
"system": false,
"type": "Endpoint",
"yaml": "example-yaml-configuration",
"framework": "HuggingFace",
"requestCpu": 4,
"requestMemory": 8192,
"cpuArchitectures": [
"X86_64",
"AARCH64"
],
"architectureImages": [
{
"cpuArchitecture": "X86_64",
"vmImageUuid": "21cb9f654e9d3feab962ce3f5eec5796",
"dockerImage": "registry.example.com/x86_64/myimage:latest"
},
{
"cpuArchitecture": "AARCH64",
"vmImageUuid": "1d919fda666b3bdcb5f4925f1cf17c71",
"dockerImage": "registry.example.com/aarch64/myimage:latest"
}
]
},
"systemTags": [],
"userTags": []
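}
Note: In the example above, the systemTags and userTags fields can be omitted. They are listed only to show that the body may contain these two fields.
Curl Example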
curl -H "Content-Type: application/json;charset=UTF-8" \
-H "Authorization: OAuth b86c9016b4f24953a9edefb53ca0678c" \
-X POST -d '{"params":{"name":"example","description":"This is an example description.","size":0,"gpuComputeCapability":"7.5","startCommand":"python start.py","pythonVersion":"3.8","system":false,"type":"Endpoint","yaml":"example-yaml-configuration","framework":"HuggingFace","requestCpu":4,"requestMemory":8192,"cpuArchitectures":["X86_64","AARCH64"],"architectureImages":[{"cpuArchitecture":"X86_64","vmImageUuid":"21cb9f654e9d3feab962ce3f5eec5796","dockerImage":"registry.example.com/x86_64/myimage:latest"},{"cpuArchitecture":"AARCH64","vmImageUuid":"1d919fda666b3bdcb5f4925f1cf17c71","dockerImage":"registry.example.com/aarch64/myimage:latest"}]}}' \
http://localhost:8080/zstack/v1/ai/model-services/d76744cb53f9378abf42b907df3efb6e
Parameters
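| Name | Type | Location | Description | Allowed Values | Since Version |
|---|---|---|---|---|---|
| uuid | String | url | UUID of the resource; uniquely identifies the resource | | 5.3.20 |
| name (optional) | String | body (in the params structure) | Resource name | | 5.3.20 |
| description (optional) | String | body (in the params structure) | Detailed description of the resource | | 5.3.20 |
| readme (optional) | String | body (in the params structure) | Readme document of the model service | | 5.3.20 |
| size (optional) | long | body (in the params structure) | Size of the model service | | 5.3.20 |
| gpuComputeCapability (optional) | String | body (in the params structure) | GPU compute capability | | 5.3.20 |
| installPath (optional) | String | body (in the params structure) | URL where the model service is stored | | 5.3.20 |
| startCommand (optional) | String | body (in the params structure) | Start command of the model service | | 5.3.20 |
| pythonVersion (optional) | String | body (in the params structure) | Python version of the model service | | 5.3.20 |
| condaVersion (optional) | String | body (in the params structure) | Conda version of the model service | | 5.3.20 |
| system (optional) | Boolean | body (in the params structure) | Whether this is a system default model service | | 5.3.20 |
| type (optional) | String | body (in the params structure) | Type of the model service | | 5.3.20 |
| yaml (optional) | String | body (in the params structure) | YAML configuration content of the model service | | 5.3.20 |
| source (optional) | String | body (in the params structure) | Source of the model service | | 5.3.20 |
| framework (optional) | String | body (in the params structure) | Framework of the model service | | 5.3.20 |
| requestCpu (optional) | Integer | body (in the params structure) | CPU requested by the model service | | 5.3.20 |
| requestMemory (optional) | Long | body (in the params structure) | Memory requested by the model service | | 5.3.20 |
| resourceUuid (optional) | String | body (in the params structure) | Resource UUID | | 5.3.20 |
| tagUuids (optional) | List | body (in the params structure) | List of tag UUIDs | | 5.3.20 |
| systemTags (optional) | List | body | System tags | | 5.3.20 |
| userTags (optional) | List | body | User tags | | 5.3.20 |
| cpuArchitectures (optional) | List | body (in the params structure) | | | 5.3.40 |
| architectureImages (optional) | List | body (in the params structure) | Image configuration for each CPU architecture. | | 5.3.40 |
| supportDistributed (optional) | Boolean | body (in the params structure) | Whether distributed inference is supported | | 5.3.46 |
| containerCommand (optional) | String | body (in the params structure) | Container start command. | | 5.5.22 |
| containerArgs (optional) | String | body (in the params structure) | Container start arguments. | | 5.5.22 |
| vendorToSpecUuidsMap (optional) | Map | body (in the params structure) | Mapping from GPU vendor to a list of GPU spec UUIDs. | | 5.5.22 |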
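As a rough sketch of how this request could be assembled without the SDK, the Python snippet below builds the POST request with the standard library only. The helper name `build_clone_request` and its `base_url` argument are illustrative assumptions and not part of the ZStack API; the URL path, headers, and body layout follow the Curl example. The request is constructed but not sent.

```python
import json
import urllib.request


def build_clone_request(base_url, uuid, session_id, params,
                        system_tags=None, user_tags=None):
    """Build (but do not send) the POST request for a model service.

    Hypothetical helper for illustration; not part of any ZStack SDK.
    """
    body = {"params": params}
    # systemTags and userTags are optional, so add them only when given.
    if system_tags:
        body["systemTags"] = system_tags
    if user_tags:
        body["userTags"] = user_tags
    return urllib.request.Request(
        url=f"{base_url}/zstack/v1/ai/model-services/{uuid}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json;charset=UTF-8",
            "Authorization": f"OAuth {session_id}",
        },
        method="POST",
    )


# Values copied from the Curl example; only a subset of params is shown.
request = build_clone_request(
    "http://localhost:8080",
    "d76744cb53f9378abf42b907df3efb6e",
    "b86c9016b4f24953a9edefb53ca0678c",
    {
        "name": "example",
        "type": "Endpoint",
        "requestCpu": 4,
        "requestMemory": 8192,
        "cpuArchitectures": ["X86_64", "AARCH64"],
    },
)
# To send it: urllib.request.urlopen(request)
```

Leaving out the tag fields when they are empty keeps the body minimal, which matches the note that systemTags and userTags can be omitted.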
API Response
Response Example
{
"inventory": {
"uuid": "90d425a802ff44eeb530f21587633700",
"name": "example",
"description": "Example description for modelService",
"yaml": "services:\n - ports:\n - 3000\n name: qwen1.5-7b-chat:2b34xhrmqwhomjkd\n livez: /livez\n readyz: /readyz\n serviceBootupTime: 30\nenv:\n - key:value\n - key:value\ndistro:\n packages: vim,nfs-utils\npython:\n requirements_txt: ./requirements.txt\n index_url: https://pypi.tuna.tsinghua.edu.cn/simple\n trusted_host: pypi.tuna.tsinghua.edu.cn\n",
"requestCpu": 1,
"requestMemory": 1024,
"modelCenterUuid": "432c5fdd49374bb0a2fd7877f0a877cf",
"type": "Endpoint",
"system": true,
"gpuComputeCapability": "3.7",
"installPath": "/example/install/path",
"pythonVersion": "3.8.10",
"condaVersion": "23.7.4",
"startCommand": "python3 app.py",
"supportDistributed": true,
"modelServiceImages": [
{
"uuid": "c13336388d574fec824752525ae05a21",
"modelServiceUuid": "90d425a802ff44eeb530f21587633700",
"cpuArchitecture": "x86_64",
"vmImageUuid": "b31bc4574f22426a89902edac5d14a72",
"dockerImage": "registry.example.com/x86_64/myimage:latest",
"createDate": "Nov 25, 2025 11:51:50 AM",
"lastOpDate": "Nov 25, 2025 11:51:50 AM"
},
{
"uuid": "46c0e4bb46e849059e6d6bc8881d86f3",
"modelServiceUuid": "90d425a802ff44eeb530f21587633700",
"cpuArchitecture": "aarch64",
"vmImageUuid": "b2085e2ac2d84029a7947cb09178daaa",
"dockerImage": "registry.example.com/aarch64/myimage:latest",
"createDate": "Nov 25, 2025 11:51:50 AM",
"lastOpDate": "Nov 25, 2025 11:51:50 AM"
}
],
"createDate": "Nov 25, 2025 11:51:50 AM",
"lastOpDate": "Nov 25, 2025 11:51:50 AM"
}
}
| Name | Type | Description | Since Version |
|---|---|---|---|
| success | boolean | | 5.3.20 |
| error | ErrorCode | Error code. If it is not null, the operation failed; it is null when the operation succeeds. See error for details. | 5.3.20 |
#error
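| Name | Type | Description | Since Version |
|---|---|---|---|
| code | String | Error code, the globally unique identifier of the error, for example SYS.1000, HOST.1001 | 5.2.1 |
| description | String | Brief description of the error | 5.2.1 |
| details | String | Detailed information about the error | 5.2.1 |
| elaboration | String | Reserved field, null by default | 5.2.1 |
| opaque | LinkedHashMap | Reserved field, null by default | 5.2.1 |
| cause | ErrorCode | Root error, that is, the underlying error that triggered the current one; null if there is no underlying error | 5.2.1 |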
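Because `cause` is itself an ErrorCode, errors can nest. The following Python sketch shows one way a client might check a response and walk the `cause` chain to the innermost error. The function names `root_cause` and `check_response` and the sample payload are hypothetical; only the field names come from the tables above.

```python
import json


def root_cause(error):
    """Follow the `cause` links of an ErrorCode dict to the innermost error."""
    while error.get("cause") is not None:
        error = error["cause"]
    return error


def check_response(text):
    """Parse a response body and raise RuntimeError if `error` is not null."""
    response = json.loads(text)
    error = response.get("error")
    if error is not None:
        root = root_cause(error)
        raise RuntimeError(
            f"{error['code']}: {error['description']} "
            f"(root cause: {root['code']})"
        )
    return response


# Hypothetical error payload; the codes reuse the examples from the table.
sample_error = {
    "code": "SYS.1000",
    "description": "hypothetical top-level error",
    "details": "hypothetical details",
    "elaboration": None,
    "opaque": None,
    "cause": {
        "code": "HOST.1001",
        "description": "hypothetical underlying error",
        "details": "hypothetical details",
        "elaboration": None,
        "opaque": None,
        "cause": None,
    },
}
```

Calling `check_response` on a successful body returns the parsed dict unchanged, so the `inventory` field can be read from the result.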
SDK Examples
Java SDK
CloneModelServiceAction action = new CloneModelServiceAction();
action.uuid = "d76744cb53f9378abf42b907df3efb6e";
action.name = "example";
action.description = "This is an example description.";
action.size = 0;
action.gpuComputeCapability = "7.5";
action.startCommand = "python start.py";
action.pythonVersion = "3.8";
action.system = false;
action.type = "Endpoint";
action.yaml = "example-yaml-configuration";
action.framework = "HuggingFace";
action.requestCpu = 4;
action.requestMemory = 8192L;
action.cpuArchitectures = asList("X86_64","AARCH64");
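// architectureImages entries are written as maps here (java.util.Map/HashMap); the element type is an assumption, so check the SDK for the actual type.
Map<String, String> x86Image = new HashMap<>();
x86Image.put("cpuArchitecture", "X86_64");
x86Image.put("vmImageUuid", "21cb9f654e9d3feab962ce3f5eec5796");
x86Image.put("dockerImage", "registry.example.com/x86_64/myimage:latest");
Map<String, String> armImage = new HashMap<>();
armImage.put("cpuArchitecture", "AARCH64");
armImage.put("vmImageUuid", "1d919fda666b3bdcb5f4925f1cf17c71");
armImage.put("dockerImage", "registry.example.com/aarch64/myimage:latest");
action.architectureImages = asList(x86Image, armImage);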
action.sessionId = "b86c9016b4f24953a9edefb53ca0678c";
CloneModelServiceAction.Result res = action.call();
Python SDK
action = CloneModelServiceAction()
action.uuid = "d76744cb53f9378abf42b907df3efb6e"
action.name = "example"
action.description = "This is an example description."
action.size = 0
action.gpuComputeCapability = "7.5"
action.startCommand = "python start.py"
action.pythonVersion = "3.8"
action.system = False
action.type = "Endpoint"
action.yaml = "example-yaml-configuration"
action.framework = "HuggingFace"
action.requestCpu = 4
action.requestMemory = 8192
action.cpuArchitectures = ["X86_64", "AARCH64"]
# Each entry maps one CPU architecture to its VM image and Docker image.
action.architectureImages = [
    {"cpuArchitecture": "X86_64", "vmImageUuid": "21cb9f654e9d3feab962ce3f5eec5796", "dockerImage": "registry.example.com/x86_64/myimage:latest"},
    {"cpuArchitecture": "AARCH64", "vmImageUuid": "1d919fda666b3bdcb5f4925f1cf17c71", "dockerImage": "registry.example.com/aarch64/myimage:latest"},
]
action.sessionId = "b86c9016b4f24953a9edefb53ca0678c"
res = action.call()