not much happened today

📝 Summary

**z.ai's glm-5.2** leads in coding and agent benchmarks with top scores like **1595** on code arena: frontend and **34.29%** reasoning accuracy with zero failures. databricks improved glm-5.2 speed to **392 tok/s** using hardware and optimizations. **ornith-1.0**, a new mit-licensed coding model family, spans **9b to 397b parameters** with strong benchmark results and a self-improving rl training method. **liquid ai** released a small model for low-latency robotics/e-commerce use. **google** integrated computer use into **gemini 3.5 flash** with safety controls and developer tools for device control. startups like **sail** and **hyperagent** focus on long-running agents with persistent execution and cost efficiency. **openai** reports growing internal codex use for complex, cross-functional tasks, highlighting agent skill concurrency.

✍️ Editor's summary

The core topic of this item is "not much happened today".

Judging from the current aggregated summary, the items most worth attention first are: **z.ai's glm-5.2** leads in coding and agent benchmarks with top scores like **1595** on code arena: frontend and **34.29%** reasoning accuracy with zero failures. databricks improved glm-5.2 speed to **392 tok/s** using hardware and optimizations. **ornith-1.0**, a new mit-licensed coding model family, spans **9b to 397b parameters** with strong benchmark results and a self-improving rl training method. **liquid ai** released a small model for low-latency robotics/e-commerce use. **google** integrated computer use into **gemini 3.5 flash** with safety controls and developer tools for device control. startups like **sail** and **hyperagent** focus on long-running agents with persistent execution and cost efficiency. **openai** reports growing internal codex use for complex, cross-functional tasks, highlighting agent skill concurrency.

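If you read this only once, the point most relevant to later judgment is the set of models involved (glm-5.2, glm-5.2-max, opus-4.8), which makes this item useful for tracking changes in model capability, pricing, or product strategy.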

📌 Key information

- **z.ai's glm-5.2** leads in coding and agent benchmarks with top scores like **1595** on code arena: frontend and **34.29%** reasoning accuracy with zero failures.
- databricks improved glm-5.2 speed to **392 tok/s** using hardware and optimizations.
- **ornith-1.0**, a new mit-licensed coding model family, spans **9b to 397b parameters** with strong benchmark results and a self-improving rl training method.
- **liquid ai** released a small model for low-latency robotics/e-commerce use.
- **google** integrated computer use into **gemini 3.5 flash** with safety controls and developer tools for device control.
- startups like **sail** and **hyperagent** focus on long-running agents with persistent execution and cost efficiency.
- **openai** reports growing internal codex use for complex, cross-functional tasks, highlighting agent skill concurrency.

🧭 Why it matters

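Models involved: glm-5.2, glm-5.2-max, opus-4.8. Useful for tracking changes in model capability, pricing, or product strategy.
Companies involved: z.ai, databricks, liquid-ai. This usually signals competition, partnerships, or commercialization moves worth continued watching.
Related tags: coding-benchmarks, agentic-ai, reinforcement-learning, model-optimization. Useful for following later coverage of the same topics.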

🗂 Topic cards

Models involved

glm-5.2 glm-5.2-max opus-4.8 claude-fable-5 ornith-1.0 gemma-4 qwen-3.5 lfm2.5-230m gemini-3.5-flash codex

Companies involved

z.ai databricks liquid-ai google-deepmind google sail hyperagent openai langchain

Related tags

coding-benchmarks agentic-ai reinforcement-learning model-optimization speculative-decoding hardware-optimization long-running-agents agent-persistence cost-efficiency computer-use safety-controls developer-tools token-consumption concurrent-agents
