Python: Azure AI Search Citation Extraction - V2 #2588
Pull request overview
This PR enhances the Azure AI Search sample by adding citation extraction functionality to demonstrate how to retrieve and display citation information from agent responses.
Key Changes:
- Added `extract_citations_from_response` helper function to parse citations from agent responses
- Modified the query to request detailed winter hotel information
- Changed `query_type` from "simple" to "vector" mode
- Added citation display logic to show extracted citation URLs
    if citations:
        for i, citation in enumerate(citations, 1):
            print(f"Citation {i}:")
            print(f" URL: {citation['url']}")
Copilot AI, Dec 2, 2025
The citation output for `extract_citations_from_response` is incomplete. Only the URL is printed (line 94), but the function also extracts `title`, `file_id`, and `positions`, none of which are displayed. Users who need those details never see them.
Consider printing all extracted citation information:
    print(f"Citation {i}:")
    print(f" URL: {citation['url']}")
    if citation.get('title'):
        print(f" Title: {citation['title']}")
    if citation.get('file_id'):
        print(f" File ID: {citation['file_id']}")
    if citation.get('positions'):
        print(f" Positions: {citation['positions']}")
    """Extract citation information from an AgentRunResponse."""
    citations: list[dict[str, Any]] = []

    if hasattr(response, "messages") and response.messages:
You can skip a bunch of these checks (it's a sample, so we can be less strict on typing to improve readability) and invert the logic for the rest:
Suggested change:

    from itertools import chain

    if not response.messages:
        return citations
    for content in chain.from_iterable(message.contents for message in response.messages):
        if not content.annotations:
            continue
        # parse the annotations
Also, why not just return the citations instead of this dict?
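For illustration, here is a minimal sketch of how the flattened helper could look with both suggestions applied: early returns instead of nested `hasattr` checks, and a plain list as the return value. The annotation field names (`url`, `title`, `file_id`, `positions`) are taken from the first review comment. The `SimpleNamespace` objects are hypothetical stand-ins for the real response, message, and content types, which this sketch does not attempt to reproduce.

```python
from itertools import chain
from types import SimpleNamespace
from typing import Any


def extract_citations_from_response(response: Any) -> list[dict[str, Any]]:
    """Collect one dict per annotation found on the response's contents."""
    citations: list[dict[str, Any]] = []
    if not response.messages:
        return citations
    # Flatten all contents across all messages into a single iterable.
    for content in chain.from_iterable(m.contents for m in response.messages):
        if not getattr(content, "annotations", None):
            continue
        for annotation in content.annotations:
            # Field names are assumptions based on the review discussion.
            citations.append(
                {
                    "url": getattr(annotation, "url", None),
                    "title": getattr(annotation, "title", None),
                    "file_id": getattr(annotation, "file_id", None),
                    "positions": getattr(annotation, "positions", None),
                }
            )
    return citations


# Hypothetical stand-in objects, used only to exercise the helper.
annotation = SimpleNamespace(
    url="https://example.com/hotels", title="Winter hotels", file_id=None, positions=None
)
response = SimpleNamespace(
    messages=[
        SimpleNamespace(
            contents=[
                SimpleNamespace(annotations=[annotation]),
                SimpleNamespace(annotations=None),  # content without citations
            ]
        )
    ]
)
citations = extract_citations_from_response(response)
```

Returning the list directly means the caller can iterate over it in the display loop without unpacking a wrapper dict, and an empty list covers the "no citations" case.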
Motivation and Context
Adds extraction method to sample
Description
Contribution Checklist