Feature Proposal
Move the mediaType field from FilePart to the base Part object so that every part type can specify a media type. Currently, the only way to attach a media type is to send the content as a FilePart, which requires base64 encoding if the content is included inline. Base64 inflates the payload by roughly a third and makes the content unreadable when debugging.
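To make the overhead concrete, here is a rough sketch comparing the two ways of sending the same markdown inline. The `bytes` key and the `note.md` file name are assumptions made for illustration; the FilePart definition in this issue does not show the inline payload field. Base64 emits 4 characters for every 3 input bytes, which is where the size growth comes from.

```python
import base64
import json

markdown = "# Hello\n\nThis is **formatted** markdown.\n" * 50

# Current approach: inline content travels as base64 inside a file part.
# "bytes" is a hypothetical name for the inline payload field.
raw = markdown.encode("utf-8")
file_part = {
    "file": {
        "name": "note.md",
        "mediaType": "text/markdown",
        "bytes": base64.b64encode(raw).decode("ascii"),
    }
}

# Proposed approach: the text stays readable and carries its own media type.
text_part = {"mediaType": "text/markdown", "text": markdown}

encoded_len = len(file_part["file"]["bytes"])
print(f"raw bytes:      {len(raw)}")
print(f"base64 chars:   {encoded_len}")
print(f"file part JSON: {len(json.dumps(file_part))}")
print(f"text part JSON: {len(json.dumps(text_part))}")
```

The text part also stays greppable in logs, whereas the base64 string has to be decoded before anyone can read it.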
Current Structure
Currently, only FilePart has a mediaType field:
message Part {
oneof part {
string text = 1;
FilePart file = 2;
DataPart data = 3;
}
}
message FilePart {
string name = 1;
string uri = 2;
string mediaType = 3; // Only available on FilePart
}
message DataPart {
google.protobuf.Struct data = 1;
}
Proposed Structure
Move mediaType to the base Part object and make data an any type to hold structured data that conforms to the JSON object model:
message Part {
string mediaType = 4; // Available for all part types
oneof part {
string text = 1;
FilePart file = 2;
google.protobuf.Any data = 3;
}
}
message FilePart {
string name = 1;
string uri = 2;
// mediaType removed from here
}
Benefits
1. TextPart with Media Type
Text parts can specify their format explicitly:
- text/plain - Plain text
- text/markdown - Markdown-formatted text
- text/html - HTML content
- text/csv - CSV data
{
"mediaType": "text/markdown",
"text": "# Hello\n\nThis is **formatted** markdown."
}
2. DataPart with Media Type
Data parts can indicate their content type:
- application/json - JSON data
- application/city+json - Standardized vocabularies based on JSON, e.g. https://www.cityjson.org/specs/2.0.1/
{
"mediaType": "application/json",
"data": {
"x": 24,
"y": 32
}
}
3. Consistent API
All parts have the same interface for specifying content type, making it easier for:
- Clients to handle content type detection uniformly
- Agents to specify content types for any part
- Middleware to process or transform content based on type
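As a rough illustration of the uniform handling described above, a client could dispatch on the part-level mediaType without first checking whether the part is text, file, or data. The handler table and the `handle_part` function are hypothetical names for this sketch, not part of any existing SDK.

```python
from typing import Any, Callable, Dict

Part = Dict[str, Any]

# Hypothetical handlers keyed by media type; names are for illustration only.
HANDLERS: Dict[str, Callable[[Part], str]] = {
    "text/markdown": lambda p: f"render markdown ({len(p['text'])} chars)",
    "application/json": lambda p: f"parse JSON object with keys {sorted(p['data'])}",
    "image/png": lambda p: f"fetch image from {p['file']['uri']}",
}


def handle_part(part: Part) -> str:
    """Dispatch on the part-level mediaType, regardless of text/file/data."""
    media_type = part.get("mediaType", "")
    handler = HANDLERS.get(media_type)
    if handler is None:
        return f"no handler for {media_type or 'unspecified media type'}"
    return handler(part)


print(handle_part({"mediaType": "text/markdown", "text": "# Hi"}))
print(handle_part({"mediaType": "application/json", "data": {"x": 24, "y": 32}}))
```

With the current structure, the same lookup would have to reach into `file.mediaType` for files and guess a type for text and data parts.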
4. Better Content Negotiation
Clients and agents can negotiate content types for all message parts, not just files:
- Request text in markdown vs plain text
- Specify preferred data serialization format
- Handle multiple representations of the same content
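One way a receiver might pick among multiple representations is sketched below. How preferences are communicated is not defined in this proposal, so the `accepted` list (ordered from most to least preferred) and the `choose_representation` helper are assumptions for illustration.

```python
from typing import Any, Dict, List, Optional

Part = Dict[str, Any]


def choose_representation(parts: List[Part], accepted: List[str]) -> Optional[Part]:
    """Return the part whose mediaType appears earliest in `accepted`."""
    for media_type in accepted:
        for part in parts:
            if part.get("mediaType") == media_type:
                return part
    return None  # no representation matches the receiver's preferences


# Two representations of the same content, distinguished only by mediaType.
representations = [
    {"mediaType": "text/plain", "text": "Results: Item 1, Item 2"},
    {"mediaType": "text/markdown", "text": "## Results\n\n- Item 1\n- Item 2"},
]

best = choose_representation(representations, ["text/markdown", "text/plain"])
print(best["mediaType"] if best else "no acceptable representation")
```

This only works for text parts because mediaType lives on the base Part; with the current structure, the two text representations would be indistinguishable.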
Use Cases
Use Case 1: Formatted Text Responses
{
"role": "agent",
"parts": [
{
"mediaType": "text/markdown",
"text": "Here's the analysis:\n\n## Results\n\n- Item 1\n- Item 2"
}
]
}
Use Case 2: Structured Data
{
"role": "agent",
"parts": [
{
"mediaType": "application/city+json",
"data": {
"CityObjects": {
"LondonTower": {
"type": "Bridge"
}
}
}
}
]
}
Use Case 3: Mixed Content
{
"role": "agent",
"parts": [
{
"mediaType": "text/html",
"text": "<p>See the attached file:</p>"
},
{
"mediaType": "image/png",
"file": {
"name": "chart.png",
"uri": "https://example.com/chart.png"
}
}
]
}
See also #1088 for the recommendation to change data from struct to any.
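For payloads that already use the current structure, the change amounts to lifting mediaType from the file object up to the part. A minimal sketch of such a shim follows; `lift_media_type` is a hypothetical helper and assumes parts are plain JSON-style dictionaries.

```python
import copy
from typing import Any, Dict


def lift_media_type(part: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `part` with file.mediaType moved to the part level."""
    migrated = copy.deepcopy(part)  # leave the caller's object untouched
    file_part = migrated.get("file")
    if isinstance(file_part, dict) and "mediaType" in file_part:
        media_type = file_part.pop("mediaType")
        # Keep an existing part-level value if one is already present.
        migrated.setdefault("mediaType", media_type)
    return migrated


old_part = {
    "file": {
        "name": "chart.png",
        "uri": "https://example.com/chart.png",
        "mediaType": "image/png",
    }
}
new_part = lift_media_type(old_part)
print(new_part)
```

Text and data parts pass through unchanged, since they carry no mediaType in the current structure.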