Intelligence Layer for 1688/Alibaba Supplier Outreach
April 5, 2026 · Eric San — for Sourcy
The V2 pipeline is a tool-calling conversation engine built on a two-agent design:
| Agent | Model | Role | Tools |
|---|---|---|---|
| Conversation bot | Gemini 3.1 Pro | Real-time capture during supplier chat | log_data, note_blocker, note_file_request, acknowledge_media |
| Summary agent | Gemini 3.1 Pro | Post-conversation signal extraction | emit_signal, extract_timing, flag_escalation |
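The split of tools between the two agents can be illustrated with a small dispatcher. This is a minimal sketch, not the code in conversation-engine.js or tool-definitions.js: the tool names come from the table above, while `AGENT_TOOLS`, `dispatchToolCall`, the agent keys, and the argument shapes are hypothetical names chosen for illustration.

```javascript
// Tool names are taken from the agent table; the structure around them is assumed.
const AGENT_TOOLS = {
  conversationBot: ["log_data", "note_blocker", "note_file_request", "acknowledge_media"],
  summaryAgent: ["emit_signal", "extract_timing", "flag_escalation"],
};

// Hypothetical dispatcher: records a tool call only if the tool belongs to the calling agent.
function dispatchToolCall(agent, call, trace) {
  const allowed = AGENT_TOOLS[agent];
  if (!allowed) {
    throw new Error(`Unknown agent: ${agent}`);
  }
  if (!allowed.includes(call.name)) {
    throw new Error(`Tool ${call.name} is not available to ${agent}`);
  }
  const entry = { agent, tool: call.name, args: call.args ?? {} };
  trace.push(entry);
  return entry;
}

// Example arguments are placeholders, not real tool schemas.
const trace = [];
dispatchToolCall("conversationBot", { name: "note_blocker", args: { topic: "moq" } }, trace);
dispatchToolCall("summaryAgent", { name: "emit_signal", args: { severity: "high" } }, trace);
console.log(trace.map((e) => `${e.agent}:${e.tool}`));
```

Keeping the allow-list per agent means a real-time capture tool cannot be invoked during post-conversation extraction, or the reverse, and the resulting trace is the kind of record a grounding check could be run against.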
Scores below reflect the latest hardened benchmark (Gemini Pro on the original ten cases; three additional cases exercised on Claude Sonnet where noted). Pass threshold aligns with the dyn-v3 rubric suite (14 scored dimensions per case).
| Case ID | Category | Score | Result |
|---|---|---|---|
| mes2-01 | Escalation — moq_rejection | 10/14 | PASS |
| mes2-03 | Escalation — moq_escalation_deep | 10/14 | PASS |
| mes2-04 | Baseline — authority_boundary | 12/14 | PASS |
| mes2-05 | Media read — image_burst | 14/14 | PASS |
| mes2-06 | Media send — file_request | 11/14 | PASS |
| mes2-07 | Scheduler — timing_calculating | 11/14 | PASS |
| mes2-08 | Scheduler — timing_tomorrow | 14/14 | PASS |
| mes2-09 | Media read — image_read_then_file_send | 12/14 | PASS |
| mes2-10 | Media send — logo_mockup_request | 13/14 | PASS |
| mes2-11 | Baseline — regression_baseline | 14/14 | PASS |
| mes2-12 | Scheduler — supplier_silence | 6/14 | FAIL (Sonnet) |
| mes2-13 | Auto-response — auto_response_loop | 13/14 | PASS |
| mes2-14 | Escalation — wechat_redirect | — | PENDING |
Grounding (hardened-v2-final, 10-case Gemini Pro run): Conversation bot tool-trace grounding 82.1% (32 supported / 39 trace checks). Summary-agent signal grounding 81.8% (9 supported / 11 tool rows). Figures aggregate strict transcript-alignment checks from the benchmark report.
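The two grounding percentages follow directly from the supported/total counts quoted above. A minimal sketch of that arithmetic, assuming the rate is simply supported checks divided by total checks and rounded to one decimal place (`groundingRate` is a hypothetical helper name, not a function from the benchmark code):

```javascript
// Hypothetical helper: share of checks supported by the transcript, as a percentage
// rounded to one decimal place.
function groundingRate(supported, total) {
  if (!Number.isInteger(supported) || !Number.isInteger(total)) {
    throw new TypeError("supported and total must be integers");
  }
  if (total <= 0 || supported < 0 || supported > total) {
    throw new RangeError("expected 0 <= supported <= total and total > 0");
  }
  return Math.round((supported / total) * 1000) / 10;
}

const botGrounding = groundingRate(32, 39);    // conversation bot trace checks
const summaryGrounding = groundingRate(9, 11); // summary-agent tool rows
console.log(botGrounding, summaryGrounding);   // 82.1 81.8
```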
The send-to-brain integration test ran three cases with signal delivery enabled, and five signals landed as expected.
Each signal carries sessionId, supplierId, severity, reason (with the exact supplier quote where applicable), plus followUpTiming or goalId as appropriate.
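The signal shape described above can be checked before delivery. The following is a minimal sketch under stated assumptions: the field names come from the text, and `receiveSignals(signals)` is the entry point named in this report, but the validation rules, the return shape, the dummy-log body, and all example values are hypothetical and stand in for whatever the production brain API ends up doing.

```javascript
// Fields every signal carries according to the report.
const REQUIRED_FIELDS = ["sessionId", "supplierId", "severity", "reason"];
// followUpTiming and goalId are optional ("as appropriate"), so they are not enforced here.

// Hypothetical validator: reports which required fields are missing or empty.
function validateSignal(signal) {
  const missing = REQUIRED_FIELDS.filter(
    (field) => typeof signal?.[field] !== "string" || signal[field].trim() === ""
  );
  return { ok: missing.length === 0, missing };
}

// Dummy-log stand-in for the brain endpoint (assumed behavior, not the real API).
function receiveSignals(signals, log = console.log) {
  const accepted = [];
  const rejected = [];
  for (const signal of signals) {
    const { ok, missing } = validateSignal(signal);
    if (ok) {
      accepted.push(signal);
      log(`[brain] ${signal.severity} ${signal.sessionId}: ${signal.reason}`);
    } else {
      rejected.push({ signal, missing });
    }
  }
  return { accepted, rejected };
}

// Placeholder values for illustration only.
const result = receiveSignals([
  {
    sessionId: "session-001",
    supplierId: "supplier-001",
    severity: "high",
    reason: "Supplier rejected the requested MOQ",
    goalId: "goal-001",
  },
  { sessionId: "session-002", severity: "low" },
]);
console.log(result.accepted.length, result.rejected[0].missing);
```

Returning rejected signals with their missing fields, rather than throwing on the first bad one, keeps a single malformed signal from blocking delivery of the rest of the batch.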
receiveSignals(signals) as a production API.
dyn-v6.md, conversation-engine.js, tool-definitions.js, summary-agent-llm.js, prompt-builder.js, context-builder.js, llm.js, goal-generator.js.
receiveSignals() on the brain API.
| Gap | Severity | Owner | Status |
|---|---|---|---|
| Vision pass-through | Low | Sourcy | Bot handles image tokens correctly (mes2-05, mes2-09 PASS). Actual image-content description via Gemini vision is a separate user story, not yet eval-pinned. |
| Brain endpoint | High | Sourcy | Currently a dummy log; must be replaced with the production API |
| chatServer format | Medium | Sourcy / Tek | Untested with full production payload |
| Preview model stability | Low | Sourcy | Document model pinning procedure |
| 1688 connector | Medium | Tek | Alibaba path verified; 1688 probe failed |
| Scheduling backend contract | Medium | Sourcy (Lokesh/Awsaf) | Bot emits schedule_follow_up signals correctly. Sourcy must define: when runs trigger, what state from prior runs is available, how follow-ups fire. |
| Name | URL |
|---|---|
| Integration spec | report.ericsan.io/sourcy/sourcy_supplier_bot_integration_spec.html |
| Transcript viewer | report.ericsan.io/sourcy/sourcy_supplier_bot_v2_transcript_viewer.html |
| Weekly report | report.ericsan.io/sourcy/sourcy_supplier_bot_v2_weekly_2026_04_02.html |
| Eval methodology | report.ericsan.io/sourcy/sourcy_eval_methodology_v1.html |
| GitHub repo | github.com/neicras/sourcy-supplier-bot-eval |
The V2 intelligence layer is eval-proven and handoff-ready. It scores 90.7% on the 14-dimension benchmark, and the tool-calling pipeline is validated end-to-end, including signal delivery to the downstream brain. The remaining work is integration: port the Tier 1 files, implement the brain API, and validate against chatServer. Eric remains available for eval support and prompt iteration after porting.