19 commits
4494336
ADd kling
sunqirui1987 Mar 7, 2026
3cf6b2c
kling ai UPDATE
sunqirui1987 Mar 8, 2026
c722586
xai kling
sunqirui1987 Mar 8, 2026
3263b8a
feat(kling): add multi_prompt support for v3 and fix shot indices in …
sunqirui1987 Mar 8, 2026
5e3b54d
docs(examples/openai): rewrite README with complete block-structured …
sunqirui1987 Mar 8, 2026
316d731
feat: OpenAI response refactor, Qiniu image support, Gemini provider …
sunqirui1987 Mar 8, 2026
945de0a
feat(gemini): Backend abstraction, Qiniu provider and examples
sunqirui1987 Mar 8, 2026
88fcb54
refactor(gemini): program to interfaces, simplify docs
sunqirui1987 Mar 8, 2026
a3aa1f4
Merge: add restrict_gen and spec updates
sunqirui1987 Mar 8, 2026
71d1225
feat(vidu): add Vidu video generation specification and examples
sunqirui1987 Mar 9, 2026
9d71aaf
feat(veo): add Veo video generation support via Qiniu provider
sunqirui1987 Mar 9, 2026
7bd1b48
feat: add Sora video generation support with Qiniu provider
sunqirui1987 Mar 9, 2026
955e176
feat: add Sora video generation support with Qiniu provider
sunqirui1987 Mar 9, 2026
601f812
feat(audio): add ASR/TTS spec and Qiniu provider with examples
sunqirui1987 Mar 9, 2026
540f429
feat: support wrapped Qiniu services with runtime API key updates
sunqirui1987 Mar 10, 2026
07e6aa5
add viduq2-turbo | viduq2-pro model
sunqirui1987 Mar 10, 2026
810722d
feat(spec): introduce unified VideoSchema for video generation
sunqirui1987 Mar 11, 2026
7262fff
add vidu audio support
sunqirui1987 Mar 16, 2026
a4b79a7
add vidu q3
sunqirui1987 Mar 16, 2026
3 changes: 3 additions & 0 deletions .gitignore
@@ -32,3 +32,6 @@ go.work.sum
.vscode/

.DS_Store


results.txt
112 changes: 110 additions & 2 deletions README.md
@@ -1,2 +1,110 @@
xai
=====
# xai

Unified Go SDK for AI chat, image generation, and video generation. Supports multiple providers (OpenAI-compatible, Gemini, Kling, Sora, Veo, Vidu) through a common API.

## Features

- **Chat Completions**: Multimodal chat (text, image, video) with streaming
- **Function Calling**: Tool use round-trip with `tool_use` / `tool_result`
- **Image Generation**: Text-to-image, image-to-image, image edit
- **Video Generation**: Text-to-video, image-to-video, remix, keyframe
- **Long-running Operations**: `CallSync` + `TaskID` + `GetTask` for async task persistence

## Prerequisites

- Go 1.24+
- `QINIU_API_KEY` for real API calls (omit for mock mode)

## Installation

```bash
go get github.com/goplus/xai
```

## Quick Start

```bash
# Set API key for real calls
export QINIU_API_KEY=your-key

# OpenAI-compatible chat (text, image, video, function calling)
go run ./examples/openai text
go run ./examples/openai image video function-call

# Gemini chat + image generation
go run ./examples/gemini chat-text image-generate

# Kling image & video
go run ./examples/kling/images kling-v2-1
go run ./examples/kling/video kling-v2-6

# Sora video
go run ./examples/sora text-to-video image-to-video

# Veo video
go run ./examples/veo veo-3.0-generate-preview

# Vidu video
go run ./examples/vidu/video q2-image-pro-audio
```

## Examples Overview

| Example | Description |
|---------|-------------|
| [examples/openai](examples/openai) | OpenAI-compatible chat: text, image, video, multi-video, function calling, thinking mode |
| [examples/gemini](examples/gemini) | Gemini chat + image generation / edit |
| [examples/kling](examples/kling) | Kling image & video: text2image, image2image, text2video, img2video, keyframe |
| [examples/sora](examples/sora) | Sora text-to-video, image-to-video, remix |
| [examples/veo](examples/veo) | Veo text-to-video, image-to-video, first+last frame, reference images |
| [examples/vidu](examples/vidu) | Vidu Q1/Q2/Q2 Pro/Turbo text-to-video, reference-to-video, image-to-video, audio-video |

## Backend Mode

- **Mock** (default): No API key needed. Returns placeholder URLs, so it works in CI.
- **Real**: Set `QINIU_API_KEY` to call the Qnagic API.
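
The mode switch described above can be sketched roughly as follows. This is a minimal illustration, assuming the choice depends only on whether `QINIU_API_KEY` is non-empty. `Backend`, `mockBackend`, `realBackend`, and `newBackend` are hypothetical names, not part of the xai API, and the placeholder URL is invented for the example.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Backend is a hypothetical interface; the SDK's real abstraction may differ.
type Backend interface {
	Name() string
	Generate(prompt string) (string, error)
}

// mockBackend returns a fixed placeholder URL and never touches the network.
type mockBackend struct{}

func (mockBackend) Name() string { return "mock" }

func (mockBackend) Generate(prompt string) (string, error) {
	return "https://example.com/placeholder.mp4", nil
}

// realBackend would call the remote API; that part is left out of this sketch.
type realBackend struct{ apiKey string }

func (realBackend) Name() string { return "real" }

func (realBackend) Generate(prompt string) (string, error) {
	return "", errors.New("real backend is not implemented in this sketch")
}

// newBackend picks mock mode when no key is set (assumed selection rule).
func newBackend(apiKey string) Backend {
	if apiKey == "" {
		return mockBackend{}
	}
	return realBackend{apiKey: apiKey}
}

func main() {
	b := newBackend(os.Getenv("QINIU_API_KEY"))
	fmt.Println("backend:", b.Name())
}
```

Keeping the decision in one constructor means example code does not need to branch on the environment itself.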

## Supported Models

| Category | Models |
|----------|--------|
| Image (Kling) | kling-v1, kling-v1-5, kling-v2, kling-v2-new, kling-v2-1, kling-image-o1 |
| Video (Kling) | kling-v2-1, kling-v2-5-turbo, kling-v2-6, kling-video-o1, kling-v3, kling-v3-omni |
| Veo | veo-2.0-generate-001, veo-2.0-generate-exp, veo-3.0-generate-preview, veo-3.1-generate-preview, ... |
| Sora | sora-2, sora-2-pro |
| Vidu | vidu-q1, vidu-q2, viduq2-pro, viduq2-turbo |
| Chat | gemini-3.0-pro-preview, deepseek-v3.2, etc. |

## API Usage

```go
package main

import (
	"context"
	"fmt"
	"log"
	"os"

	xai "github.com/goplus/xai/spec"
	"github.com/goplus/xai/spec/openai/provider/qiniu"
)

func main() {
	svc := qiniu.NewService(os.Getenv("QINIU_API_KEY"))
	ctx := context.Background()

	// Chat (OpenAI-compatible)
	resp, err := svc.Gen(ctx, svc.Params().
		Model(xai.Model("gemini-3.0-pro-preview")).
		Messages(svc.UserMsg().Text("Hello")), svc.Options())
	if err != nil {
		log.Fatal("chat: ", err)
	}
	fmt.Println(resp)

	// Video generation (Sora): CallSync starts the task, Wait polls until it finishes
	op, err := svc.Operation(xai.Model("sora-2"), xai.GenVideo)
	if err != nil {
		log.Fatal("operation: ", err)
	}
	op.Params().Set("Prompt", "A cat walking on the beach").Set("Seconds", "4")
	opResp, err := xai.CallSync(ctx, svc, op, svc.Options())
	if err != nil {
		log.Fatal("call: ", err)
	}
	results, err := xai.Wait(ctx, svc, opResp, nil)
	if err != nil {
		log.Fatal("wait: ", err)
	}
	fmt.Println(results)
}
```

## License

Apache-2.0
157 changes: 157 additions & 0 deletions examples/README.md
@@ -0,0 +1,157 @@
# Examples

Runnable demos for multiple providers/models via the xai API.

## Quick Start

```bash
# Run Kling examples
go run ./examples/kling

# Run Audio examples (ASR/TTS)
go run ./examples/audio
go run ./examples/audio all

# List models, actions, and schema only
go run ./examples/kling models

# Run Veo examples
go run ./examples/veo
go run ./examples/veo all

# Run Sora examples
go run ./examples/sora
go run ./examples/sora all

# Run Vidu examples
go run ./examples/vidu/video
go run ./examples/vidu/video all

# Run by model (Kling)
go run ./examples/kling kling-v2-1
go run ./examples/kling/images kling-v2-1
go run ./examples/kling/video kling-v2-6
```

## Backend Mode

- **Mock** (default): No API key needed. Returns placeholder URLs. Works in CI.
- **Real**: Set `QINIU_API_KEY` to use the Qnagic API for actual generation.

```bash
export QINIU_API_KEY=your-key
go run ./examples/kling kling-v2-1
```

## Directory Structure

```
examples/
├── README.md
├── audio/
│   ├── README.md
│   ├── main.go
│   ├── service.go
│   ├── asr.go
│   ├── tts.go
│   └── list_voices.go
├── vidu/
│   ├── README.md
│   ├── output/
│   │   └── output.go
│   ├── shared/
│   │   └── service.go
│   └── video/
│       ├── main.go
│       ├── urls.go
│       ├── helpers.go
│       ├── call_sync_example.go
│       ├── vidu_q1_text_to_video.go
│       ├── vidu_q1_reference_urls.go
│       ├── vidu_q1_reference_subjects.go
│       ├── vidu_q1_reference_subjects_audio.go
│       ├── vidu_q2_text_to_video.go
│       ├── vidu_q2_reference_urls.go
│       ├── vidu_q2_reference_subjects.go
│       ├── vidu_q2_image_to_video_pro.go
│       ├── vidu_q2_image_to_video_pro_audio.go
│       ├── vidu_q2_image_to_video_turbo.go
│       └── vidu_q2_start_end_to_video_pro.go
├── sora/
│   ├── README.md
│   ├── main.go
│   └── urls.go
├── veo/
│   ├── README.md
│   ├── main.go
│   ├── veo_2_0_generate_001.go
│   ├── veo_2_0_generate_exp.go
│   ├── veo_2_0_generate_preview.go
│   ├── veo_3_0_generate_preview.go
│   ├── veo_3_0_fast_generate_preview.go
│   ├── veo_3_1_generate_preview.go
│   └── veo_3_1_fast_generate_preview.go
├── shared/
│   └── service.go  # NewService, NewServiceForModels
└── kling/
    ├── main.go  # Dispatches to images/ and video/ by model
    ├── models.go  # RunModels: list models, actions, schema
    ├── example_test.go
    ├── images/
    │   ├── main.go
    │   ├── urls.go  # DemoImageURLs, printImageResults
    │   ├── call_sync_example.go  # CallSync + TaskID + GetTask
    │   ├── kling_v1.go
    │   ├── kling_v15.go
    │   ├── kling_v2.go
    │   ├── kling_v2_new.go
    │   ├── kling_v21.go
    │   └── kling_image_o1.go
    └── video/
        ├── main.go
        ├── urls.go  # DemoVideoURLs, printVideoResults
        ├── kling_v21.go
        ├── kling_v25_turbo.go
        ├── kling_v26.go
        ├── kling_video_o1.go
        ├── kling_v3.go
        └── kling_v3_omni.go
```

## Models

**Kling image models**: kling-v1, kling-v1-5, kling-v2, kling-v2-new, kling-v2-1, kling-image-o1

**Kling video models**: kling-v2-1, kling-v2-5-turbo, kling-v2-6, kling-video-o1, kling-v3, kling-v3-omni

**Veo models**: veo-2.0-generate-001, veo-2.0-generate-exp, veo-2.0-generate-preview, veo-3.0-generate-preview, veo-3.0-fast-generate-preview, veo-3.1-generate-preview, veo-3.1-fast-generate-preview

**Sora models**: sora-2, sora-2-pro

**Vidu models**: vidu-q1, vidu-q2, viduq2-pro, viduq2-turbo

**Audio models**: asr (ASR), tts-v1 (TTS)

## CallSync + TaskID

The `call-sync` demo shows async task persistence:

- `CallSync` starts the operation and returns `resp`, an `OperationResponse`
- `resp.TaskID()` returns the task ID, which you can persist (for example, in a database)
- `xai.GetTask(ctx, svc, model, action, taskID)` restores the `OperationResponse` from a saved task ID
- `xai.Wait` polls until the operation is done

```bash
go run ./examples/kling/images call-sync
```
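
The persistence flow listed above can be illustrated without the SDK. The sketch below uses stand-in types only: `fakeService`, `response`, `callSync`, `getTask`, and `wait` are hypothetical and merely mimic the roles of `CallSync`, `TaskID`, `GetTask`, and `Wait`; the real signatures live in the xai packages and may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// task is a stand-in for a provider-side job.
type task struct {
	remaining int    // polls left before the job reports done
	result    string // output returned once the job is done
}

// fakeService is a hypothetical in-memory provider, not the xai Service.
type fakeService struct {
	tasks map[string]*task
	next  int
}

// response plays the role of an operation response: it carries a task ID.
type response struct{ taskID string }

func (r response) TaskID() string { return r.taskID }

// callSync starts a task and returns a handle without waiting for the result.
func (s *fakeService) callSync(prompt string) response {
	s.next++
	id := fmt.Sprintf("task-%d", s.next)
	s.tasks[id] = &task{remaining: 2, result: "video for: " + prompt}
	return response{taskID: id}
}

// getTask rebuilds a handle from a previously saved task ID.
func (s *fakeService) getTask(id string) (response, error) {
	if _, ok := s.tasks[id]; !ok {
		return response{}, errors.New("unknown task: " + id)
	}
	return response{taskID: id}, nil
}

// wait polls until the task is done, then returns its result.
func (s *fakeService) wait(r response) (string, error) {
	t, ok := s.tasks[r.taskID]
	if !ok {
		return "", errors.New("unknown task: " + r.taskID)
	}
	for t.remaining > 0 {
		t.remaining-- // a real client would sleep between polls
	}
	return t.result, nil
}

func main() {
	svc := &fakeService{tasks: map[string]*task{}}

	resp := svc.callSync("a cat on the beach")
	saved := resp.TaskID() // persist this string, e.g. in a database row

	// Later, possibly in another process: restore the handle and resume.
	restored, err := svc.getTask(saved)
	if err != nil {
		fmt.Println("restore error:", err)
		return
	}
	out, err := svc.wait(restored)
	if err != nil {
		fmt.Println("wait error:", err)
		return
	}
	fmt.Println(saved, "->", out)
}
```

Only the task ID needs to survive a restart; everything else is rebuilt from it.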

## Tests

```bash
go test ./examples/kling/... -v -run Example
```

## See Also

- [spec/kling/kling_image.md](../spec/kling/kling_image.md)
- [spec/kling/kling_video.md](../spec/kling/kling_video.md)
53 changes: 53 additions & 0 deletions examples/audio/README.md
@@ -0,0 +1,53 @@
# Audio Examples

ASR (speech-to-text) and TTS (text-to-speech) examples via `spec/audio` and the Qiniu provider.

## Quick Start

```bash
# List available demos
go run ./examples/audio

# Run a single demo (mock mode, no API key)
go run ./examples/audio list-voices
go run ./examples/audio asr
go run ./examples/audio tts
go run ./examples/audio tts-voice

# Run all demos
go run ./examples/audio all
```

## Backend Mode

- **Mock** (default): No API key needed. Returns placeholder results. Works in CI.
- **Real**: Set `QINIU_API_KEY` to use the Qiniu API for actual ASR/TTS.

```bash
export QINIU_API_KEY=your-key
go run ./examples/audio asr
go run ./examples/audio tts
go run ./examples/audio list-voices
```

## Demos

| Demo | Description |
|------|-------------|
| list-voices | List available TTS voices (voice_type, category, sample URL) |
| asr | ASR: Transcribe audio URL to text |
| tts | TTS: Synthesize text to audio (default voice) |
| tts-voice | TTS with a specific voice (qiniu_zh_female_wwxkjx) |

## API Mapping

| xai | Qiniu API |
|-----|-----------|
| `Operation(asr, Transcribe)` | POST /v1/voice/asr |
| `Operation(tts-v1, Synthesize)` | POST /v1/voice/tts |
| `Service.ListVoices(ctx)` | GET /v1/voice/list |
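
One way to picture the mapping in the table is a lookup keyed by model and action. The sketch below is illustrative only: `endpoint`, `opKey`, `routes`, `listVoices`, and `resolve` are hypothetical names, and the provider may route requests quite differently. The model names, action names, methods, and paths are taken from the table and the model list.

```go
package main

import "fmt"

// endpoint is an HTTP method plus path, as listed in the mapping table.
type endpoint struct {
	Method string
	Path   string
}

// opKey identifies an operation by model and action (hypothetical key type).
type opKey struct {
	Model  string
	Action string
}

// routes mirrors the two Operation rows of the table.
var routes = map[opKey]endpoint{
	{"asr", "Transcribe"}:    {"POST", "/v1/voice/asr"},
	{"tts-v1", "Synthesize"}: {"POST", "/v1/voice/tts"},
}

// listVoices mirrors the ListVoices row, which is not tied to a model.
var listVoices = endpoint{"GET", "/v1/voice/list"}

// resolve returns the endpoint for a model/action pair, or an error if none exists.
func resolve(model, action string) (endpoint, error) {
	ep, ok := routes[opKey{model, action}]
	if !ok {
		return endpoint{}, fmt.Errorf("no route for model %q action %q", model, action)
	}
	return ep, nil
}

func main() {
	ep, err := resolve("asr", "Transcribe")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ep.Method, ep.Path)
	fmt.Println(listVoices.Method, listVoices.Path)
}
```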

## See Also

- [spec/audio/README.md](../../spec/audio/README.md)
- [spec/audio/provider/qiniu/audio.md](../../spec/audio/provider/qiniu/audio.md)
61 changes: 61 additions & 0 deletions examples/audio/asr.go
@@ -0,0 +1,61 @@
/*
* Copyright (c) 2026 The XGo Authors (xgo.dev). All rights reserved.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

package main

import (
	"context"
	"fmt"

	xai "github.com/goplus/xai/spec"
	"github.com/goplus/xai/spec/audio"
)

// runASR transcribes DemoAudioURL with the ASR model and prints the text.
func runASR() {
	svc := newService()
	ctx := context.Background()

	op, err := svc.Operation(xai.Model(audio.ModelASR), xai.Transcribe)
	if err != nil {
		fmt.Println("Operation error:", err)
		return
	}
	op.Params().Set(audio.ParamAudio, DemoAudioURL)
	op.Params().Set(audio.ParamFormat, "mp3")

	resp, err := xai.CallSync(ctx, svc, op, svc.Options())
	if err != nil {
		fmt.Println("Call error:", err)
		return
	}

	// ASR is synchronous, so the response should already be complete.
	if !resp.Done() {
		fmt.Println("Unexpected: ASR is sync, should be done immediately")
		return
	}

	results := resp.Results()
	if results.Len() == 0 {
		fmt.Println("No results")
		return
	}

	// Use a checked assertion so an unexpected output type does not panic.
	out, ok := results.At(0).(*xai.OutputText)
	if !ok {
		fmt.Printf("Unexpected result type: %T\n", results.At(0))
		return
	}
	fmt.Println("Transcribed text:", out.Text)
	if out.Duration != nil {
		fmt.Printf("Duration: %.2f seconds\n", *out.Duration)
	}
}