Verify that the intranet DeepSeek V4 Flash API is available and test its high-concurrency performance
api_testing · public · 5/2/2026, 3:17:05 PM
mimo-v2.5-pro · openai-compat · 2 turn annotations · average confidence 0.70
outcome ×0.35: 0.00
intent ×0.2: 0.00
execution ×0.2: 0.00
orchestration ×0.1: 0.00
expression ×0.15: 0.00
weighted trajectory_score: 0.000
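The score above appears to be a weighted sum of the five dimension scores. The following is a minimal sketch under that assumption; the function name `weighted_trajectory_score` and the dictionary layout are hypothetical and not taken from the labeling tool, which may aggregate differently (for example, per turn before averaging).

```python
# Dimension weights as listed in the record; they sum to 1.0.
WEIGHTS = {
    "outcome": 0.35,
    "intent": 0.20,
    "execution": 0.20,
    "orchestration": 0.10,
    "expression": 0.15,
}


def weighted_trajectory_score(scores):
    """Assumed aggregation: sum of weight * score over the five dimensions."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Every dimension in this record is scored 0.00.
record_scores = {name: 0.0 for name in WEIGHTS}
print(f"{weighted_trajectory_score(record_scores):.3f}")  # prints 0.000
```

Because every dimension is 0.00, any weighting yields 0.000, so this record alone cannot confirm the aggregation rule.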
- turn 1 ↤ user #0 · conf 0.90 · 5/2/2026, 4:10:33 PM by auto-label · outcome 0 · intent 0 · execution 0 · orchestration 0 · expression 0
“The assistant provided no response to the user's request to test a curl command and run a 128-concurrency test, completely failing to address the task.”
- turn 19 ↤ user #18 · conf 0.50 · 5/2/2026, 4:10:33 PM by auto-label · outcome 0 · intent 0 · execution 0 · orchestration 0 · expression 0
“The assistant provided no response to the user's request to research a repository's git conversion with emphasis on .gitignore, making evaluation impossible.”