feat(uploads): guide agent using agentic search for uploaded documents (#1816)

* feat(uploads): guide agent to use grep/glob/read_file for uploaded documents

Add workflow guidance to the <uploaded_files> context block so the agent
knows to use grep and glob (added in #1784) alongside read_file when
working with uploaded documents, rather than falling back to web search.

This is the final piece of the three-PR PDF agentic search pipeline:
- PR1 (#1727): pymupdf4llm converter produces structured Markdown with headings
- PR2 (#1738): document outline injected into agent context with line numbers
- PR3 (this):  agent guided to use outline + grep + read_file workflow
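
The outline + grep + read_file workflow named above can be sketched with local stand-ins. This is a minimal illustration, not the project's tool implementation: `outline`, `grep`, and `read_lines` are hypothetical helpers written for this example, and the real tools' signatures and return shapes may differ.

```python
import re
import tempfile
from pathlib import Path


def outline(path: Path) -> list[tuple[int, str]]:
    """Return (line_number, heading) pairs for Markdown headings, 1-indexed."""
    result = []
    text = path.read_text(encoding="utf-8")
    for number, line in enumerate(text.splitlines(), start=1):
        if line.startswith("#"):
            result.append((number, line.lstrip("#").strip()))
    return result


def grep(pattern: str, path: Path) -> list[tuple[str, int, str]]:
    """Hypothetical stand-in for a grep tool: regex search over *.md files under path."""
    hits = []
    for file in sorted(path.rglob("*.md")):
        text = file.read_text(encoding="utf-8")
        for number, line in enumerate(text.splitlines(), start=1):
            if re.search(pattern, line):
                hits.append((file.name, number, line))
    return hits


def read_lines(path: Path, start: int, end: int) -> str:
    """Hypothetical stand-in for read_file with a 1-indexed, inclusive line range."""
    lines = path.read_text(encoding="utf-8").splitlines()
    return "\n".join(lines[start - 1:end])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        doc = root / "report.md"
        doc.write_text(
            "# Summary\nintro\n## Revenue\nrevenue grew 12%\n## Costs\ncosts flat\n",
            encoding="utf-8",
        )
        print(outline(doc))                # headings with their line numbers
        print(grep("revenue", root))       # keyword hits when the section is unknown
        print(read_lines(doc, 3, 4))       # read only the relevant section
```

The point of the ordering is that the outline narrows the search to a line range first, grep covers the case where no heading obviously matches, and the ranged read keeps the amount of text pulled into context small.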

* feat(uploads): add file-first priority and fallback guidance to uploaded_files context
SHIYAO ZHANG 2026-04-04 11:08:31 +08:00 committed by GitHub
parent fd310582bd
commit bbd0866374
GPG Key ID: B5690EEEBB952194

@@ -102,7 +102,13 @@ class UploadsMiddleware(AgentMiddleware[UploadsMiddlewareState]):
             for file in historical_files:
                 self._format_file_entry(file, lines)
-        lines.append("You can read these files using the `read_file` tool with the paths shown above.")
+        lines.append("To work with these files:")
+        lines.append("- Read from the file first — use the outline line numbers and `read_file` to locate relevant sections.")
+        lines.append("- Use `grep` to search for keywords when you are not sure which section to look at")
+        lines.append(" (e.g. `grep(pattern='revenue', path='/mnt/user-data/uploads/')`).")
+        lines.append("- Use `glob` to find files by name pattern")
+        lines.append(" (e.g. `glob(pattern='**/*.md', path='/mnt/user-data/uploads/')`).")
+        lines.append("- Only fall back to web search if the file content is clearly insufficient to answer the question.")
         lines.append("</uploaded_files>")
         return "\n".join(lines)
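
To see roughly what the agent receives after this change, the block assembly can be sketched in isolation. This is a simplified stand-in under stated assumptions, not the middleware's code: `render_uploaded_files` is a hypothetical function, the entry format is invented for illustration (the output of `_format_file_entry` is not shown in this diff), and only some of the guidance lines from the diff are passed in.

```python
def render_uploaded_files(entries: list[str], guidance: list[str]) -> str:
    """Wrap file entries and guidance lines in an <uploaded_files> block."""
    lines = ["<uploaded_files>"]
    lines.extend(entries)
    lines.append("To work with these files:")
    lines.extend(guidance)
    lines.append("</uploaded_files>")
    return "\n".join(lines)


block = render_uploaded_files(
    # Hypothetical entry format; the real one comes from _format_file_entry.
    entries=["- report.md (path: /mnt/user-data/uploads/report.md)"],
    # Guidance strings copied from the diff above.
    guidance=[
        "- Use `grep` to search for keywords when you are not sure which section to look at",
        " (e.g. `grep(pattern='revenue', path='/mnt/user-data/uploads/')`).",
        "- Only fall back to web search if the file content is clearly insufficient to answer the question.",
    ],
)
print(block)
```

Keeping the web search fallback as the last guidance line mirrors the file-first priority described in the commit message.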