
ZStack AIOS

Clone Model Service (CloneModelService)

API usage conventions, SDK invocation methods, and AIOS interface descriptions for developers.

API Request

URLs
POST zstack/v1/ai/model-services/{uuid}
Headers
Authorization: OAuth the-session-uuid
Body
{
  "params": {
    "name": "example",
    "description": "This is an example description.",
    "size": 0,
    "gpuComputeCapability": "7.5",
    "startCommand": "python start.py",
    "pythonVersion": "3.8",
    "system": false,
    "type": "Endpoint",
    "yaml": "example-yaml-configuration",
    "framework": "HuggingFace",
    "requestCpu": 4,
    "requestMemory": 8192,
    "cpuArchitectures": [
      "X86_64",
      "AARCH64"
    ],
    "architectureImages": [
      {
        "cpuArchitecture": "X86_64",
        "vmImageUuid": "21cb9f654e9d3feab962ce3f5eec5796",
        "dockerImage": "registry.example.com/x86_64/myimage:latest"
      },
      {
        "cpuArchitecture": "AARCH64",
        "vmImageUuid": "1d919fda666b3bdcb5f4925f1cf17c71",
        "dockerImage": "registry.example.com/aarch64/myimage:latest"
      }
    ]
  },
  "systemTags": [],
  "userTags": []
}
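Note: The systemTags and userTags fields in the example above can be omitted. They are listed only to show that the body may contain these two fields.
Curl Example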
curl -H "Content-Type: application/json;charset=UTF-8" \
-H "Authorization: OAuth b86c9016b4f24953a9edefb53ca0678c" \
-X POST -d '{"params":{"name":"example","description":"This is an example description.","size":0,"gpuComputeCapability":"7.5","startCommand":"python start.py","pythonVersion":"3.8","system":false,"type":"Endpoint","yaml":"example-yaml-configuration","framework":"HuggingFace","requestCpu":4,"requestMemory":8192,"cpuArchitectures":["X86_64","AARCH64"],"architectureImages":[{"cpuArchitecture":"X86_64","vmImageUuid":"21cb9f654e9d3feab962ce3f5eec5796","dockerImage":"registry.example.com/x86_64/myimage:latest"},{"cpuArchitecture":"AARCH64","vmImageUuid":"1d919fda666b3bdcb5f4925f1cf17c71","dockerImage":"registry.example.com/aarch64/myimage:latest"}]}}' \
http://localhost:8080/zstack/v1/ai/model-services/d76744cb53f9378abf42b907df3efb6e
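The same request can be assembled from Python. The following is a minimal sketch using only the standard library; the helper name `build_clone_request` and its arguments are hypothetical and are not part of any ZStack SDK. It builds the request object without sending it, and it leaves out systemTags and userTags unless they are supplied, since both are optional.

```python
import json
import urllib.request


def build_clone_request(base_url, model_service_uuid, session_uuid, params,
                        system_tags=None, user_tags=None):
    """Build (but do not send) the POST request that clones a model service."""
    body = {"params": params}
    # systemTags and userTags are optional, so include them only when given.
    if system_tags is not None:
        body["systemTags"] = system_tags
    if user_tags is not None:
        body["userTags"] = user_tags

    url = f"{base_url.rstrip('/')}/zstack/v1/ai/model-services/{model_service_uuid}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json;charset=UTF-8",
            "Authorization": f"OAuth {session_uuid}",
        },
        method="POST",
    )


# Values taken from the curl example above; only a few params are shown.
request = build_clone_request(
    "http://localhost:8080",
    "d76744cb53f9378abf42b907df3efb6e",
    "b86c9016b4f24953a9edefb53ca0678c",
    {"name": "example", "type": "Endpoint", "requestCpu": 4, "requestMemory": 8192},
)
# To actually send it: urllib.request.urlopen(request)
```

Keeping construction separate from sending makes it possible to inspect the URL, headers, and body before any call reaches the management node.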
参数列表
名字 类型 位置 描述 可选值 起始版本
uuid String url 资源的UUID,唯一标示该资源 5.3.20
name (可选) String body(包含在params结构中) 资源名称 5.3.20
description (可选) String body(包含在params结构中) 资源的详细描述 5.3.20
readme (可选) String body(包含在params结构中) 模型服务的说明文档 5.3.20
size (可选) long body(包含在params结构中) 模型服务的大小 5.3.20
gpuComputeCapability (可选) String body(包含在params结构中) GPU计算能力 5.3.20
installPath (可选) String body(包含在params结构中) 模型服务存储的url 5.3.20
startCommand (可选) String body(包含在params结构中) 模型服务的启动命令 5.3.20
pythonVersion (可选) String body(包含在params结构中) 模型服务的python版本 5.3.20
condaVersion (可选) String body(包含在params结构中) 模型服务的conda版本 5.3.20
system (可选) Boolean body(包含在params结构中) 是否为系统默认模型服务 5.3.20
type (可选) String body(包含在params结构中) 模型服务的类型
  • Endpoint
  • FineTune
  • App
  • ModelEval
5.3.20
yaml (可选) String body(包含在params结构中) 模型服务的yaml配置内容 5.3.20
source (可选) String body(包含在params结构中) 模型服务的源
  • Other
  • Bentoml
  • HuggingFace
5.3.20
framework (可选) String body(包含在params结构中) 模型服务的框架 5.3.20
requestCpu (可选) Integer body(包含在params结构中) 模型服务需求的CPU 5.3.20
requestMemory (可选) Long body(包含在params结构中) 模型服务需求的内存 5.3.20
resourceUuid (可选) String body(包含在params结构中) 资源UUID 5.3.20
tagUuids (可选) List body(包含在params结构中) 标签UUID列表 5.3.20
systemTags (可选) List body 系统标签 5.3.20
userTags (可选) List body 用户标签 5.3.20
cpuArchitectures (可选) List body(包含在params结构中) 5.3.40
architectureImages (可选) List body(包含在params结构中) 不同CPU架构对应的镜像配置。 5.3.40
supportDistributed (可选) Boolean body(包含在params结构中) 是否支持分布式推理 5.3.46
containerCommand (可选) String body(包含在params结构中) 容器启动命令。 5.5.22
containerArgs (可选) String body(包含在params结构中) 容器启动参数。 5.5.22
vendorToSpecUuidsMap (可选) Map body(包含在params结构中) GPU厂商到GPU规格UUID列表的映射。 5.5.22
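Because type and source accept only the values listed in the table, a client may want to check them before sending a request. The sketch below is one possible client-side check, not the server's own validation; the names `ALLOWED_TYPES`, `ALLOWED_SOURCES`, and `check_clone_params` are hypothetical.

```python
# Allowed values copied from the parameter table above.
ALLOWED_TYPES = {"Endpoint", "FineTune", "App", "ModelEval"}
ALLOWED_SOURCES = {"Other", "Bentoml", "HuggingFace"}


def check_clone_params(params):
    """Return a list of problems found in a params dict (empty if none)."""
    problems = []
    # Both fields are optional, so only check them when they are present.
    if "type" in params and params["type"] not in ALLOWED_TYPES:
        problems.append(f"type must be one of {sorted(ALLOWED_TYPES)}")
    if "source" in params and params["source"] not in ALLOWED_SOURCES:
        problems.append(f"source must be one of {sorted(ALLOWED_SOURCES)}")
    return problems


ok = check_clone_params({"name": "example", "type": "Endpoint"})
bad = check_clone_params({"type": "Inference", "source": "Local"})
```

Here `ok` is an empty list, while `bad` holds one message for each rejected field.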

API Response

Response Example
{
  "inventory": {
    "uuid": "90d425a802ff44eeb530f21587633700",
    "name": "example",
    "description": "Example description for modelService",
    "yaml": "services:\n  - ports:\n      - 3000\n  name: qwen1.5-7b-chat:2b34xhrmqwhomjkd\n  livez: /livez\n  readyz: /readyz\n  serviceBootupTime: 30\nenv:\n  - key:value\n  - key:value\ndistro:\n  packages: vim,nfs-utils\npython:\n  requirements_txt: ./requirements.txt\n  index_url: https://pypi.tuna.tsinghua.edu.cn/simple\n  trusted_host: pypi.tuna.tsinghua.edu.cn\n",
    "requestCpu": 1,
    "requestMemory": 1024,
    "modelCenterUuid": "432c5fdd49374bb0a2fd7877f0a877cf",
    "type": "Endpoint",
    "system": true,
    "gpuComputeCapability": "3.7",
    "installPath": "/example/install/path",
    "pythonVersion": "3.8.10",
    "condaVersion": "23.7.4",
    "startCommand": "python3 app.py",
    "supportDistributed": true,
    "modelServiceImages": [
      {
        "uuid": "c13336388d574fec824752525ae05a21",
        "modelServiceUuid": "90d425a802ff44eeb530f21587633700",
        "cpuArchitecture": "x86_64",
        "vmImageUuid": "b31bc4574f22426a89902edac5d14a72",
        "dockerImage": "registry.example.com/x86_64/myimage:latest",
        "createDate": "Nov 25, 2025 11:51:50 AM",
        "lastOpDate": "Nov 25, 2025 11:51:50 AM"
      },
      {
        "uuid": "46c0e4bb46e849059e6d6bc8881d86f3",
        "modelServiceUuid": "90d425a802ff44eeb530f21587633700",
        "cpuArchitecture": "aarch64",
        "vmImageUuid": "b2085e2ac2d84029a7947cb09178daaa",
        "dockerImage": "registry.example.com/aarch64/myimage:latest",
        "createDate": "Nov 25, 2025 11:51:50 AM",
        "lastOpDate": "Nov 25, 2025 11:51:50 AM"
      }
    ],
    "createDate": "Nov 25, 2025 11:51:50 AM",
    "lastOpDate": "Nov 25, 2025 11:51:50 AM"
  }
}
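Name | Type | Description | Since version
success | boolean | | 5.3.20
error | ErrorCode | Error code. If it is not null, the operation failed; it is null when the operation succeeds. See error for details. | 5.3.20
#error
Name | Type | Description | Since version
code | String | Error code number, a globally unique identifier of the error, for example SYS.1000, HOST.1001 | 5.2.1
description | String | Brief description of the error | 5.2.1
details | String | Detailed information about the error | 5.2.1
elaboration | String | Reserved field, null by default | 5.2.1
opaque | LinkedHashMap | Reserved field, null by default | 5.2.1
cause | ErrorCode | Root error, that is, the source error that triggered the current error; null if there is no source error | 5.2.1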
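A caller can tell success from failure by checking whether error is null, and can follow the nested cause fields to reach the source error. The sketch below works on a response that has already been decoded into a dict; `root_cause` and `inventory_or_raise` are hypothetical helper names, and the failed response is made up for illustration, reusing the example codes from the table.

```python
def root_cause(error):
    """Follow the nested `cause` fields and return the innermost error."""
    while error.get("cause") is not None:
        error = error["cause"]
    return error


def inventory_or_raise(response):
    """Return the inventory on success, or raise with the error details."""
    error = response.get("error")
    if error is not None:
        root = root_cause(error)
        raise RuntimeError(
            f"{error.get('code')}: {error.get('description')} "
            f"(root cause: {root.get('code')})"
        )
    return response["inventory"]


# Illustrative failed response, not captured from a real system.
failed = {
    "error": {
        "code": "SYS.1000",
        "description": "outer error",
        "cause": {"code": "HOST.1001", "description": "inner error", "cause": None},
    }
}
innermost = root_cause(failed["error"])
```

For the sample above, `innermost` is the HOST.1001 entry, and passing `failed` to `inventory_or_raise` raises a RuntimeError that names both codes.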

SDK Examples

Java SDK
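import static java.util.Arrays.asList;
import java.util.LinkedHashMap;
import java.util.Map;

// Assumption: each architectureImages entry is a map keyed like the request body;
// check the SDK class for the exact element type.
Map<String, String> x86Image = new LinkedHashMap<>();
x86Image.put("cpuArchitecture", "X86_64");
x86Image.put("vmImageUuid", "21cb9f654e9d3feab962ce3f5eec5796");
x86Image.put("dockerImage", "registry.example.com/x86_64/myimage:latest");
Map<String, String> armImage = new LinkedHashMap<>();
armImage.put("cpuArchitecture", "AARCH64");
armImage.put("vmImageUuid", "1d919fda666b3bdcb5f4925f1cf17c71");
armImage.put("dockerImage", "registry.example.com/aarch64/myimage:latest");

CloneModelServiceAction action = new CloneModelServiceAction();
action.uuid = "d76744cb53f9378abf42b907df3efb6e";
action.name = "example";
action.description = "This is an example description.";
action.size = 0L; // long literal, since size is a long field
action.gpuComputeCapability = "7.5";
action.startCommand = "python start.py";
action.pythonVersion = "3.8";
action.system = false;
action.type = "Endpoint";
action.yaml = "example-yaml-configuration";
action.framework = "HuggingFace";
action.requestCpu = 4;
action.requestMemory = 8192L; // long literal, since requestMemory is a Long field
action.cpuArchitectures = asList("X86_64", "AARCH64");
action.architectureImages = asList(x86Image, armImage);
action.sessionId = "b86c9016b4f24953a9edefb53ca0678c";
CloneModelServiceAction.Result res = action.call();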
Python SDK
# Assumes CloneModelServiceAction has already been imported from the ZStack Python SDK.
action = CloneModelServiceAction()
action.uuid = "d76744cb53f9378abf42b907df3efb6e"
action.name = "example"
action.description = "This is an example description."
action.size = 0
action.gpuComputeCapability = "7.5"
action.startCommand = "python start.py"
action.pythonVersion = "3.8"
action.system = False
action.type = "Endpoint"
action.yaml = "example-yaml-configuration"
action.framework = "HuggingFace"
action.requestCpu = 4
action.requestMemory = 8192
action.cpuArchitectures = ["X86_64", "AARCH64"]
# Assumption: each entry is a dict with the same keys as the request body.
action.architectureImages = [
    {"cpuArchitecture": "X86_64", "vmImageUuid": "21cb9f654e9d3feab962ce3f5eec5796", "dockerImage": "registry.example.com/x86_64/myimage:latest"},
    {"cpuArchitecture": "AARCH64", "vmImageUuid": "1d919fda666b3bdcb5f4925f1cf17c71", "dockerImage": "registry.example.com/aarch64/myimage:latest"},
]
action.sessionId = "b86c9016b4f24953a9edefb53ca0678c"
res = action.call()