Xtracker

海外 AI 大 V 动态 · 左右滑动切换人 更新 2026-07-02 06:07 JST · 9 人 336 条
Elon Musk
@elonmusk
大佬
1 天前
Neuralink has solved through-dura electrode implantation!

This is a very big deal, as it greatly improves the safety and ease of interfacing with the brain.
Neuralink @neuralink
The dura is the brain's armor: a membrane so tough that a surgeon normally cuts through it with a scalpel. For the first time in our clinical trials, we inserted the electrode threads of our implant straight through the dura and into the cortex, keeping the dura intact.

Here's h
Neuralink已经解决了硬脑膜电极植入问题!
♥ 23,275 · ↻ 4,507
2025-09-22
For Charlie
致查理
♥ 1,182,973 · ↻ 96,608
2025-09-22
Every seat in this giant arena that isn’t roped off for security is packed to the ceiling. Honored to be here.

All for Charlie Kirk.
这座巨大体育馆里所有没有被安全隔离绳围起来的座位都坐满了。很荣幸能在这里。
♥ 1,064,655 · ↻ 132,036
2025-07-06
😢
😢
♥ 1,091,304 · ↻ 96,915
2025-06-11
I regret some of my posts about President @realDonaldTrump last week. They went too far.
我对我上周关于@特朗普总统的一些推文感到后悔。它们太过分了。
♥ 1,016,999 · ↻ 85,484
2025-06-09
This is not ok
这不行
♥ 1,273,563 · ↻ 111,885
2025-01-30
♥ 1,291,030 · ↻ 138,513
2025-01-17
Success is uncertain, but entertainment is guaranteed! ✨
成功不确定,但娱乐保证!✨
♥ 1,240,133 · ↻ 119,228
2025-01-01
I have a good feeling about 2025
我对2025年有很好的感觉
♥ 1,071,357 · ↻ 50,737
2024-12-26
♥ 1,076,917 · ↻ 97,379
2024-12-07
♥ 1,203,793 · ↻ 79,069
2024-11-29
♥ 1,034,270 · ↻ 50,128
2024-11-18
Guess who?
猜猜是谁?
♥ 1,311,120 · ↻ 66,146
2024-11-11
Must be a coincidence 🙄
一定是巧合 🙄
♥ 1,032,074 · ↻ 161,179
2024-11-09
♥ 1,106,675 · ↻ 58,884
2024-11-08
They say red light helps you sleep better
他们说红光能帮助你更好地睡眠
♥ 1,617,466 · ↻ 168,594
2024-11-07
Huge thank you to everyone who supported this platform and our mutual quest to support freedom of speech!
非常感谢所有支持这个平台和我们共同追求支持言论自由的人!
♥ 1,306,264 · ↻ 101,246
2024-11-07
It is morning in America again
美国再次迎来了早晨
♥ 2,264,337 · ↻ 199,963
2024-11-06
The future is gonna be fantastic
未来将会非常精彩
♥ 1,357,914 · ↻ 123,958
2024-11-06
The people of America gave @realDonaldTrump a crystal clear mandate for change tonight
今晚,美国人民向@realDonaldTrump发出了明确的变革授权
♥ 1,027,534 · ↻ 102,299
2024-11-06
You are the media now
现在你就是媒体
♥ 1,206,644 · ↻ 120,597
2024-11-06
Let that sink in
好好想想这句话的含义
♥ 2,125,439 · ↻ 213,459
2024-11-06
🇺🇸🇺🇸The future is gonna be so 🔥 🇺🇸🇺🇸
🇺🇸🇺🇸未来将会如此火爆🔥 🇺🇸🇺🇸
♥ 2,504,125 · ↻ 195,240
2024-11-06
Game, set and match
比赛结束,胜负已定
♥ 1,034,345 · ↻ 92,906
2024-11-05
♥ 1,123,979 · ↻ 151,832
2024-10-31
♥ 1,002,619 · ↻ 77,700
2024-10-14
♥ 1,042,859 · ↻ 89,873
2024-10-13
The tower has caught the rocket!!
塔楼抓住了火箭!!
♥ 1,120,213 · ↻ 132,844
2024-10-06
♥ 1,227,095 · ↻ 101,937
2024-08-20
I am willing to serve
我愿意服务
♥ 1,095,656 · ↻ 109,022
2024-08-15
Haters will say this is AI 🕺🕺
讨厌的人会说这是AI 🕺🕺
♥ 1,736,513 · ↻ 204,735
2024-08-13
♥ 1,202,973 · ↻ 103,495
2024-08-06
♥ 2,019,005 · ↻ 126,900
2024-08-05
😬
😬
♥ 1,283,659 · ↻ 176,303
2024-08-02
Good
BRICS News @BRICSinfo
JUST IN: 🇺🇸 🏳️‍⚧️ Donald Trump says he will ban biological males from competing in women's sports. https://t.co/6gxS1dJf9d
好的
♥ 1,710,519 · ↻ 151,685
2024-08-01
Absolutely
Riley Gaines @Riley_Gaines_
Men don't belong in women's sports #IStandWithAngelaCarini

Let's get it trending 🔥 https://t.co/ljlJJwE0hM
当然
♥ 1,732,752 · ↻ 273,303
2024-07-31
This is real
这是真的
♥ 1,067,264 · ↻ 195,467
2024-07-30
El burro sabe mas que Maduro
Ian Miles Cheong @ianmiles
Illegitimate Venezuelan leader Nicolás Maduro, who very likely won reelection through fraud, is declaring war on Elon Musk, whom he calls his "arch-enemy." He claims Elon Musk wants to invade Venezuela with his space rockets. https://t.co/4t1GkKhimz
驴知道的比马杜罗还多
♥ 958,912 · ↻ 147,189
2024-07-30
The people of Venezuela have had enough of this clown 🤡
Mario Nawfal @MarioNawfal
🚨🇻🇪POSTERS OF MADURO ARE BEING TORN DOWN IN VENEZUELA

https://t.co/K5gny3bW2r
委内瑞拉人民受够了这个小丑🤡
♥ 1,017,085 · ↻ 143,205
2024-07-30
Adios Dictatora Maduro
End Wokeness @EndWokeness
Hugo Chávez's statue was just torn down in Venezuela. The people are rising up.

https://t.co/GvzyVUne9M
再见,独裁者马杜罗
♥ 1,086,804 · ↻ 148,488
2024-07-29
Wow, Google has a search ban on President Donald Trump!

Election interference?
哇,谷歌对唐纳德·特朗普总统实施了搜索禁令!
♥ 959,475 · ↻ 154,595
2024-07-22
High time for an AI fashion show
是时候举办一场AI时装秀了
♥ 1,106,778 · ↻ 138,614
2024-07-18
Texas
德克萨斯州
♥ 1,008,915 · ↻ 51,960
2024-07-16
♥ 1,162,977 · ↻ 50,044
2024-07-14
♥ 3,199,283 · ↻ 375,263
2024-07-14
I fully endorse President Trump and hope for his rapid recovery
我完全支持特朗普总统,并希望他早日康复
♥ 2,198,596 · ↻ 333,143
2024-06-22
♥ 1,237,569 · ↻ 105,040
2024-05-25
♥ 1,087,089 · ↻ 84,739
2024-04-13
To an exciting & inspiring future!
向着令人兴奋和鼓舞人心的未来!
♥ 1,048,789 · ↻ 93,287
在 X 上查看 @elonmusk 更多 →
Andrej Karpathy
@karpathy
大佬
2025-11-14
I am unreasonably excited about self-driving. It will be the first technology in many decades to visibly terraform outdoor physical spaces and way of life. Less parked cars. Less parking lots. Much greater safety for people in and out of cars. Less noise pollution. More space
我对自动驾驶感到异常兴奋。这将是几十年来第一个能够明显改变户外物理空间和生活方式的技术。更少的停车车辆。更少的停车场。车内和车外的人有更大的安全性。更少的噪音污染。更多的空间
♥ 21,705 · ↻ 2,013
2025-11-13
I took delivery of a beautiful new shiny HW4 Tesla Model X today, so I immediately took it out for an FSD test drive, a bit like I used to do almost daily for 5 years. Basically... I'm amazed - it drives really, really well, smooth, confident, noticeably better than what I'm used
Ashok Elluswamy @aelluswamy
Full video of the ICCV '25 presentation https://t.co/x7xWvYEUIa
今天我收到了一辆漂亮崭新的HW4特斯拉Model X,所以我立即带它出去进行了FSD测试驾驶,有点像我过去5年里几乎每天都会做的事情。基本上...我感到惊讶 - 它开得真的非常好,平稳、自信,明显比我习惯的要好
♥ 27,597 · ↻ 2,779
2025-10-21
I quite like the new DeepSeek-OCR paper. It's a good OCR model (maybe a bit worse than dots), and yes data collection etc., but anyway it doesn't matter.

The more interesting part for me (esp as a computer vision at heart who is temporarily masquerading as a natural language
vLLM @vllm_project
🚀 DeepSeek-OCR — the new frontier of OCR from @deepseek_ai , exploring optical context compression for LLMs, is running blazingly fast on vLLM ⚡ (~2500 tokens/s on A100-40G) — powered by vllm==0.8.5 for day-0 model support.

🧠 Compresses visual contexts up to 20× while keeping ht
我挺喜欢新的DeepSeek-OCR论文。这是一个很好的OCR模型(可能比dots差一点),是的,数据收集等等,但无论如何这并不重要。
♥ 13,271 · ↻ 1,557
2025-10-19
My pleasure to come on Dwarkesh last week, I thought the questions and conversation were really good.

I re-watched the pod just now too. First of all, yes I know, and I'm sorry that I speak so fast :). It's to my detriment because sometimes my speaking thread out-executes my
Dwarkesh Patel @dwarkesh_sp
The @karpathy interview

0:00:00 – AGI is still a decade away
0:30:33 – LLM cognitive deficits
0:40:53 – RL is terrible
0:50:26 – How do humans learn?
1:07:13 – AGI will blend into 2% GDP growth
1:18:24 – ASI
1:33:38 – Evolution of intelligence & culture
1:43:43 - Why self https:
上周我很高兴参加Dwarkesh的节目,我认为问题和对话都非常好。
♥ 16,775 · ↻ 1,966
2025-10-16
TV in the 90s: you turn it on, you watch.

TV 2025:
- turn on, wait for it to load
- popup: TV wants to update, 1.5GB. No.
- scroll sideways, find prime video app or etc
- popup: now app wants to update, 500MB. No!!
- App launching... App loading…
- select account screen
- 🫠
90年代的电视:你打开它,你观看。
♥ 22,225 · ↻ 1,193
2025-10-14
Excited to release new repo: nanochat!
(it's among the most unhinged I've written).

Unlike my earlier similar repo nanoGPT which only covered pretraining, nanochat is a minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single,
很高兴发布新的仓库:nanochat!
♥ 24,120 · ↻ 3,353
2025-10-02
Finally had a chance to listen through this pod with Sutton, which was interesting and amusing.

As background, Sutton's "The Bitter Lesson" has become a bit of biblical text in frontier LLM circles. Researchers routinely talk about and ask whether this or that approach or idea
Dwarkesh Patel @dwarkesh_sp
.@RichardSSutton, father of reinforcement learning, doesn’t think LLMs are bitter-lesson-pilled.

My steel man of Richard’s position: we need some new architecture to enable continual (on-the-job) learning.

And if we have continual learning, we don't need a special training http
终于有机会听完了Sutton的这个播客,很有趣且令人愉快。
♥ 9,493 · ↻ 1,234
2025-09-25
"AI isn't replacing radiologists" good article

Expectation: rapid progress in image recognition AI will delete radiology jobs (e.g. as famously predicted by Geoff Hinton now almost a decade ago). Reality: radiology is doing great and is growing.

There are a lot of imo naive
Deena Mousa @deenamousa
In 2016 Geoffrey Hinton said “we should stop training radiologists now" since AI would soon be better at their jobs.

He was right: models have outperformed radiologists on benchmarks for ~a decade.

Yet radiology jobs are at record highs, with an average salary of $520k.

Why? h
"AI并没有取代放射科医生"好文章
♥ 8,584 · ↻ 1,229
2025-09-06
I think congrats again to OpenAI for cooking with GPT-5 Pro. This is the third time I've struggled on something complex/gnarly for an hour on and off with CC, then 5 Pro goes off for 10 minutes and comes back with code that works out of the box. I had CC read the 5 Pro version
我想再次祝贺OpenAI在GPT-5 Pro上的出色表现。这是我第三次与CC断断续续地在某个复杂/棘手的问题上挣扎了一个小时,然后5 Pro花10分钟离开,回来就能直接使用的代码。我让CC阅读了5 Pro版本
♥ 12,534 · ↻ 711
2025-08-19
I get ~10 spam calls per day (various automated voicemails, "loan pre-approval" etc) and ~5 spam messages per day (usually phishing).

- I have AT&T Active Armor, all of the above still slips through.
- All of the above is always from new, unique numbers so blocking doesn't work.
我每天收到约10个垃圾电话(各种自动语音留言,"贷款预批准"等)和约5条垃圾短信(通常是钓鱼)。
♥ 16,654 · ↻ 570
2025-08-17
I am (slowly) re-reading the Tolkien legendarium (of which Lord of the Rings is a small part). The whole body of work is so incredible and there's nothing else like it... it dilutes other worlds of fiction. Wait - your story doesn't have a comprehensive history/mythology spanning
我正在(慢慢地)重读托尔金的传说集(《指环王》只是其中的一小部分)。整个作品集是如此令人难以置信,没有其他作品可以与之相比...它稀释了其他虚构世界。等等 - 你的故事没有跨越
♥ 15,518 · ↻ 1,260
2025-08-10
I'm noticing that due to (I think?) a lot of benchmarkmaxxing on long horizon tasks, LLMs are becoming a little too agentic by default, a little beyond my average use case.

For example in coding, the models now tend to reason for a fairly long time, they have an inclination to
我注意到,由于(我认为?)在长期任务上的大量基准测试最大化,LLMs默认情况下变得有点太智能,有点超出了我的平均用例。
♥ 10,188 · ↻ 709
2025-07-24
Love this! Supercharger, diner, … but really a kind of exhibit for the future. Plotting a road trip SF -> LA to charge Shadowfax
Tesla @Tesla
Tesla Diner & Supercharger in Hollywood, LA

Open 24/7, starting now https://t.co/nISRNoV89Y
太喜欢这个了!超级充电站、餐厅……但实际上是未来的一种展览。正在计划从旧金山到洛杉矶的公路旅行,给Shadowfax充电
♥ 13,615 · ↻ 1,068
2025-07-07
Knowledge makes the world so much more beautiful.
知识让世界变得更加美丽。
♥ 9,197 · ↻ 968
2025-07-06
How to build a thriving open source community by writing code like bacteria do 🦠. Bacterial code (genomes) are:

- small (each line of code costs energy)
- modular (organized into groups of swappable operons)
- self-contained (easily "copy paste-able" via horizontal gene
如何像细菌一样编写代码,建立一个繁荣的开源社区 🦠。细菌代码(基因组)是:
♥ 8,666 · ↻ 1,076
2025-06-28
The race for LLM "cognitive core" - a few billion param model that maximally sacrifices encyclopedic knowledge for capability. It lives always-on and by default on every computer as the kernel of LLM personal computing.
Its features are slowly crystalizing:

- Natively multimodal
Omar Sanseviero @osanseviero
I’m so excited to announce Gemma 3n is here! 🎉

🔊Multimodal (text/audio/image/video) understanding
🤯Runs with as little as 2GB of RAM
🏆First model under 10B with @lmarena_ai score of 1300+

Available now on @huggingface, @kaggle, llama.cpp, https://t.co/CNDy479EEv, and more https
争夺LLM"认知核心"的竞赛——一个最大化牺牲百科知识以换取能力的数十亿参数模型。它始终保持开启状态,默认存在于每台计算机上,作为LLM个人计算的核心。
♥ 10,640 · ↻ 1,242
2025-06-26
+1 for "context engineering" over "prompt engineering".

People associate prompts with short task descriptions you'd give an LLM in your day-to-day use. When in every industrial-strength LLM app, context engineering is the delicate art and science of filling the context window
tobi lutke @tobi
I really like the term “context engineering” over prompt engineering.

It describes the core skill better: the art of providing all the context for the task to be plausibly solvable by the LLM.
支持"context engineering"(上下文工程)而非"prompt engineering"(提示工程)。
♥ 14,335 · ↻ 2,050
2025-06-19
Nice - my AI startup school talk is now up! Chapters:

0:00 Imo fair to say that software is changing quite fundamentally again. LLMs are a new kind of computer, and you program them *in English*. Hence I think they are well deserving of a major version upgrade in terms of
Y Combinator @ycombinator
Andrej Karpathy's (@karpathy) keynote yesterday at AI Startup School in San Francisco. https://t.co/UM1wfFs98S
太好了 - 我的AI创业学校演讲现已上线!章节:
♥ 8,888 · ↻ 1,233
2025-06-19
Part 2 of this mystery. Spotted on reddit.
In my test not 100% reproducible but still quite reproducible.
🤔
Andrej Karpathy @karpathy
Not fully sure why all the LLMs sound about the same - over-using lists, delving into “multifaceted” issues, over-offering to assist further, about same length responses, etc. Not something I had predicted at first because of many independent companies doing the finetuning.
这个谜题的第二部分。在reddit上发现的。
♥ 9,179 · ↻ 685
2025-06-08
My sleep scores during recent travel were in the 90s. Now back in SF I am consistently back down to 70s, 80s.

I am increasingly convinced that this is due to traffic noise from a nearby road/intersection where I live - every ~10min, a car, truck, bus, or motorcycle with a very
我最近旅行期间的睡眠得分在90多分。现在回到旧金山,我的得分稳定回落到70多分、80多分。
♥ 12,102 · ↻ 755
2025-06-07
Making slides manually feels especially painful now that you know Cursor for slides should exist but doesn’t.
既然你知道应该存在用于幻灯片的Cursor但实际上没有,那么手动制作幻灯片就特别痛苦。
♥ 12,154 · ↻ 471
2025-06-03
An attempt to explain (current) ChatGPT versions.

I still run into many, many people who don't know that:
- o3 is the obvious best thing for important/hard things. It is a reasoning model that is much stronger than 4o and if you are using ChatGPT professionally and not using o3
尝试解释(当前)ChatGPT版本。
♥ 13,316 · ↻ 1,575
2025-05-11
We're missing (at least one) major paradigm for LLM learning. Not sure what to call it, possibly it has a name - system prompt learning?

Pretraining is for knowledge.
Finetuning (SL/RL) is for habitual behavior.

Both of these involve a change in parameters but a lot of human
我们缺少(至少一个)LLM学习的主要范式。不确定该称之为什么,可能它已经有了一个名称——系统提示学习?
♥ 10,331 · ↻ 1,041
2025-05-06
A major mistake I made in my undergrad is that I focused way too much on mathematical lens of computing - computability, decidability, asymptotic complexity etc. And too little on physical lens - energy/heat of state change, data locality, parallelism, computer architecture. The
我本科时犯的一个重大错误是我过分关注计算数学视角——可计算性、可判定性、渐近复杂度等。而对物理视角关注太少——状态变化的能量/热量、数据局部性、并行性、计算机架构。
♥ 13,332 · ↻ 976
2025-04-25
Noticing myself adopting a certain rhythm in AI-assisted coding (i.e. code I actually and professionally care about, contrast to vibe code).

1. Stuff everything relevant into context (this can take a while in big projects. If the project is small enough just stuff everything
将所有相关内容放入上下文(在大项目中这可能需要一段时间。如果项目足够小,只需放入所有内容
♥ 12,253 · ↻ 1,046
2025-03-27
The reality of building web apps in 2025 is that it's a bit like assembling IKEA furniture. There's no "full-stack" product with batteries included, you have to piece together and configure many individual services:

- frontend / backend (e.g. React, Next.js, APIs)
- hosting
2025年构建网络应用的现实情况有点像组装宜家家具。没有"全栈"产品即插即用,你必须组装和配置许多独立服务:
♥ 18,952 · ↻ 1,550
2025-03-23
I just vibe coded a whole iOS app in Swift (without having programmed in Swift before, though I learned some in the process) and now ~1 hour later it's actually running on my physical phone. It was so ez... I had my hand held through the entire process. Very cool.
我刚刚用Swift编写了一个完整的iOS应用(虽然之前没有用Swift编程过,但在过程中学了一些),大约1小时后它实际上在我的实体手机上运行了。太简单了...整个过程都有人引导。非常酷。
♥ 22,135 · ↻ 1,211
2025-03-19
I wrote a quick new post on "Digital Hygiene".

Basically there are some no-brainer decisions you can make in your life to dramatically improve the privacy and security of your computing and this post goes over some of them. Blog post link in the reply, but copy pasting below
我写了一篇关于"数字卫生"的快速新文章。
♥ 26,487 · ↻ 3,479
2025-03-13
It's 2025 and most content is still written for humans instead of LLMs. 99.9% of attention is about to be LLM attention, not human attention.

E.g. 99% of libraries still have docs that basically render to some pretty .html static pages assuming a human will click through them.
现在是2025年,但大多数内容仍然是为人类而不是为大型语言模型(LLM)编写的。99.9%的注意力即将是LLM的注意力,而不是人类的注意力。
♥ 12,647 · ↻ 1,319
2025-02-28
New 2h11m YouTube video: How I Use LLMs

This video continues my general audience series. The last one focused on how LLMs are trained, so I wanted to follow up with a more practical guide of the entire LLM ecosystem, including lots of examples of use in my own life.

Chapters
新的2小时11分钟YouTube视频:我如何使用大型语言模型
♥ 13,760 · ↻ 1,603
2025-02-27
This is interesting as a first large diffusion-based LLM.

Most of the LLMs you've been seeing are ~clones as far as the core modeling approach goes. They're all trained "autoregressively", i.e. predicting tokens from left to right. Diffusion is different - it doesn't go left to
Inception @_inception_ai
We are excited to introduce Mercury, the first commercial-grade diffusion large language model (dLLM)! dLLMs push the frontier of intelligence and speed with parallel, coarse-to-fine text generation. https://t.co/HfjDdoSvIC
作为首个基于扩散的大型语言模型,这很有趣。
♥ 11,436 · ↻ 1,503
2025-02-25
Agency > Intelligence

I had this intuitively wrong for decades, I think due to a pervasive cultural veneration of intelligence, various entertainment/media, obsession with IQ etc. Agency is significantly more powerful and significantly more scarce. Are you hiring for agency? Are
Garry Tan @garrytan
Intelligence is on tap now so agency is even more important
能力 > 智力
♥ 49,943 · ↻ 9,392
2025-02-18
I was given early access to Grok 3 earlier today, making me I think one of the first few who could run a quick vibe check.

Thinking
✅ First, Grok 3 clearly has an around state of the art thinking model ("Think" button) and did great out of the box on my Settler's of Catan
今天早些时候我获得了Grok 3的早期访问权限,让我成为我认为最早能进行快速氛围检查的人之一。
♥ 16,710 · ↻ 2,184
2025-02-08
Part of the reason for my 3hr general audience LLM intro video is I hope to inspire others to make equivalents in their own domains of expertise, as I’d love to watch them.
我制作3小时面向普通观众的大型语言模型介绍视频的部分原因是我希望激励其他人在他们自己的专业领域制作类似内容,因为我很乐意观看它们。
♥ 9,255 · ↻ 445
2025-02-06
New 3h31m video on YouTube:
"Deep Dive into LLMs like ChatGPT"

This is a general audience deep dive into the Large Language Model (LLM) AI technology that powers ChatGPT and related products. It is covers the full training stack of how the models are developed, along with mental
YouTube上的新视频3小时31分钟:
♥ 20,156 · ↻ 2,907
2025-02-03
There's a new kind of coding I call "vibe coding", where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. It's possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too good. Also I just talk to Composer with SuperWhisper
有一种我称之为"氛围编码"的新编程方式,在这种方式中,你完全顺应氛围,拥抱指数级增长,甚至忘记代码的存在。这是可能的,因为大型语言模型(例如带有Sonnet的Cursor Composer)变得太好了。而且我只是用SuperWhisper与Composer对话
♥ 34,130 · ↻ 3,640
2025-01-31
We have to take the LLMs to school.

When you open any textbook, you'll see three major types of information:

1. Background information / exposition. The meat of the textbook that explains concepts. As you attend over it, your brain is training on that data. This is equivalent
背景信息/阐述。教科书中解释概念的核心内容。当你阅读它时,你的大脑正在训练这些数据。这相当于...
♥ 11,861 · ↻ 1,749
2025-01-30
For friends of open source: imo the highest leverage thing you can do is help construct a high diversity of RL environments that help elicit LLM cognitive strategies. To build a gym of sorts. This is a highly parallelizable task, which favors a large community of collaborators.
对于开源的朋友们:imo你能做的最高杠杆的事情是帮助构建多样化的强化学习环境,这些环境有助于激发大语言模型的认知策略。建立某种形式的"训练场"。这是一个高度可并行化的任务,有利于大型协作社区。
♥ 8,385 · ↻ 812
2025-01-29
"Move 37" is the word-of-day - it's when an AI, trained via the trial-and-error process of reinforcement learning, discovers actions that are new, surprising, and secretly brilliant even to expert humans. It is a magical, just slightly unnerving, emergent phenomenon only
"第37步"是今日热词 - 它指的是通过强化学习的试错过程训练的AI,发现了对专家人类来说新颖、令人惊讶且秘密 brilliant 的行动。这是一种神奇的、略带不安的、仅有的涌现现象...
♥ 9,526 · ↻ 1,397
2025-01-28
I don't have too too much to add on top of this earlier post on V3 and I think it applies to R1 too (which is the more recent, thinking equivalent).

I will say that Deep Learning has a legendary ravenous appetite for compute, like no other algorithm that has ever been developed
Andrej Karpathy @karpathy
DeepSeek (Chinese AI co) making it look easy today with an open weights release of a frontier-grade LLM trained on a joke of a budget (2048 GPUs for 2 months, $6M).

For reference, this level of capability is supposed to require clusters of closer to 16K GPUs, the ones being
对于这篇关于V3的早期帖子,我没有太多可以补充的内容,我认为它也适用于R1(这是更近期的思维等效版本)。
♥ 14,265 · ↻ 2,114
2025-01-09
I still do this most days and I think it works great. My morning brain (right after 1hr exercise and 1 coffee) is quite eager to work and I go directly to the one top priority item. The energy decreases over time and with every distracting item loaded into the context window.
broker. @usualbroker
I read this often. https://t.co/4fzSs8AaNe
我大多数日子仍然这样做,而且效果很好。我早晨的大脑(在1小时锻炼和1杯咖啡之后)非常渴望工作,我会直接处理那个最高优先级的任务。能量会随着时间推移而下降,并且随着每个分散注意力的事项加载到上下文窗口中而减少。
♥ 9,797 · ↻ 777
2024-12-27
DeepSeek (Chinese AI co) making it look easy today with an open weights release of a frontier-grade LLM trained on a joke of a budget (2048 GPUs for 2 months, $6M).

For reference, this level of capability is supposed to require clusters of closer to 16K GPUs, the ones being
DeepSeek @deepseek_ai
🚀 Introducing DeepSeek-V3!

Biggest leap forward yet:
⚡ 60 tokens/second (3x faster than V2!)
💪 Enhanced capabilities
🛠 API compatibility intact
🌍 Fully open-source models & papers

🐋 1/n https://t.co/p1dV9gJ2Sd
DeepSeek(中国AI公司)今天发布了一个前沿级别的大语言模型的开放权重,这个模型是在一个笑话般的预算上训练的(2048个GPU,2个月,600万美元)。
♥ 18,801 · ↻ 2,349
2024-12-17
AI video generation today. When I was back in school, the story of the field of computer graphics (and physically based rendering etc.) was that we will carefully study and model all the object/scene geometry, physics, rendering etc., and after 1000 PhDs and 50 SIGGRAPHs get
Agrim Gupta @agrimgupta92
"A pair of hands skillfully slicing a ripe tomato on a wooden cutting board"

#veo https://t.co/VDuxnkvIa0
当前的AI视频生成。当我上学时,计算机图形学领域(以及基于物理的渲染等)的故事是,我们将仔细研究和建模所有对象/场景几何、物理、渲染等,在1000个博士学位和50个SIGGRAPH会议之后才能...
♥ 8,343 · ↻ 556
2024-12-15
The most bullish AI capability I'm looking for is not whether it's able to solve PhD grade problems. It's whether you'd hire it as a junior intern.

Not "solve this theorem" but "get your slack set up, read these onboarding docs, do this task and let's check in next week".
我最看好的AI能力不是它能否解决博士级别的问题。而是你是否会雇佣它作为初级实习生。
♥ 9,319 · ↻ 644
2024-12-09
"I love traveling the world" 😂
(I think I reference this meme a lot so)
"我爱环游世界" 😂
♥ 10,430 · ↻ 521
2024-12-09
Of ~200 books I've read, the few that stayed with me over time and I find myself often thinking back to or referring to, in ~random order:

All short stories by Ted Chiang, especially Exhalation, Division By Zero, Understand, The Story of Your Life, Liking What You See, The
在我读过的约200本书中,少数几本随着时间的推移一直留在我心中,我发现自己经常回顾或引用它们,顺序大致随机:
♥ 11,990 · ↻ 1,049
2024-12-02
The reality of the Turing test
图灵测试的现实
♥ 15,539 · ↻ 1,206
2024-11-30
People have too inflated sense of what it means to "ask an AI" about something. The AI are language models trained basically by imitation on data from human labelers. Instead of the mysticism of "asking an AI", think of it more as "asking the average data labeler" on the
人们对"向AI询问某事"的含义过于夸大。AI基本上是通过模仿人类标注者的数据进行训练的语言模型。与其将"向AI询问"神秘化,不如更多地将其视为"向普通数据标注者"询问...
♥ 13,190 · ↻ 1,820
2024-11-24
My Gladiator 2 review.
我的《角斗士2》影评。
♥ 11,198 · ↻ 1,003
2024-10-13
By chance I happened to watch this with the music of Interstellar playing in the background. Incredible. Huge 👏 to the team at SpaceX!!
SpaceX @SpaceX
Mechazilla has caught the Super Heavy booster! https://t.co/6R5YatSVJX
偶然间我观看了这部电影,背景音乐是《星际穿越》的配乐。太不可思议了。向SpaceX团队致以巨大的👏!!
♥ 11,669 · ↻ 431
在 X 上查看 @karpathy 更多 →
DeepSeek
@deepseek_ai
中国模型
2026-04-27
🔥DeepSeek Input Cache Price Drop!

Effective immediately, the price for input cache hits across the ENTIRE DeepSeek API series is reduced to just 1/10th of the original price! Build more efficiently for less.

📌Reminder: The DeepSeek-V4-Pro 75% OFF promotion is still active
🔥DeepSeek 输入缓存价格下调!

即刻起,整个 DeepSeek API 系列的输入缓存命中价格降至原价的十分之一!以更低成本构建更高效的应用。

📌提醒:DeepSeek-V4-Pro 75% 折扣促销仍在进行中
♥ 8,373 · ↻ 776
2025-09-29回复
💻 API Update

🎉 Lower costs, same access!
💰 DeepSeek API prices drop 50%+, effective immediately.

🔹 For comparison testing, V3.1-Terminus remains available via a temporary API until Oct 15th, 2025, 15:59 (UTC Time). Details: https://t.co/3RNKA89gHR
🔹 Feedback welcome:
💻 API更新

🎉 成本降低,访问权限不变!
💰 DeepSeek API价格降低50%+,立即生效。

🔹 为了进行对比测试,V3.1-Terminus将通过临时API提供,直到2025年10月15日15:59(UTC时间)。详情:https://t.co/3RNKA89gHR
🔹 欢迎提供反馈:
♥ 1,075 · ↻ 94
2025-09-29回复
⚡️ Efficiency Gains

🤖 DSA achieves fine-grained sparse attention with minimal impact on output quality — boosting long-context performance & reducing compute cost.
📊 Benchmarks show V3.2-Exp performs on par with V3.1-Terminus.

2/n
⚡️ 效率提升

🤖 DSA 实现了细粒度稀疏注意力,对输出质量影响最小 — 提升长上下文性能并降低计算成本。
📊 基准测试显示 V3.2-Exp 的表现与 V3.1-Terminus 相当。

2/n
♥ 679 · ↻ 49
2025-09-29
🚀 Introducing DeepSeek-V3.2-Exp — our latest experimental model!

✨ Built on V3.1-Terminus, it debuts DeepSeek Sparse Attention(DSA) for faster, more efficient training & inference on long context.
👉 Now live on App, Web, and API.
💰 API prices cut by 50%+!

1/n
🚀 介绍 DeepSeek-V3.2-Exp — 我们最新的实验模型!

✨ 基于 V3.1-Terminus 构建,首次推出 DeepSeek 稀疏注意力(DSA),实现长上下文更快、更高效的训练和推理。
👉 现已在 App、Web 和 API 上线。
💰 API 价格降低 50%+!

1/n
♥ 6,993 · ↻ 887
2025-09-22回复
📊 DeepSeek-V3.1-Terminus delivers more stable & reliable outputs across benchmarks compared to the previous version.

👉 Available now on: App / Web / API
🔗 Open-source weights here: https://t.co/Jh4RudofKm

Thanks to everyone for your feedback. It drives us to keep improving
📊 与前一个版本相比,DeepSeek-V3.1-Terminus 在各项基准测试中提供了更稳定和可靠的输出。

👉 现已可用:App / Web / API
🔗 开源模型权重地址:https://t.co/Jh4RudofKm

感谢大家的反馈。它推动我们不断改进
♥ 878 · ↻ 66
2025-09-22
🚀 DeepSeek-V3.1 → DeepSeek-V3.1-Terminus
The latest update builds on V3.1’s strengths while addressing key user feedback.

✨ What’s improved?
🌐 Language consistency: fewer CN/EN mix-ups & no more random chars.
🤖 Agent upgrades: stronger Code Agent & Search Agent performance.
🚀 DeepSeek-V3.1 → DeepSeek-V3.1-Terminus
最新更新基于V3.1的优势,同时解决了关键用户反馈。

✨ 有哪些改进?
🌐 语言一致性:减少中英文混用 & 不再出现随机字符。
🤖 Agent升级:更强的Code Agent和Search Agent性能。
♥ 4,591 · ↻ 521
2025-08-21回复
Pricing Changes 💳

🔹 New pricing starts & off-peak discounts end at Sep 5th, 2025, 16:00 (UTC Time)
🔹 Until then, APIs follow current pricing
📝 Pricing page: https://t.co/IyYitNzedg

5/5
价格变动 💳

🔹 新价格将于2025年9月5日16:00(UTC时间)开始,非高峰期折扣同时结束
🔹 在此之前,API将遵循当前价格
📝 价格页面:https://t.co/IyYitNzedg

5/5
♥ 944 · ↻ 56
2025-08-21回复
Model Update 🤖

🔹 V3.1 Base: 840B tokens continued pretraining for long context extension on top of V3
🔹 Tokenizer & chat template updated — new tokenizer config: https://t.co/r3y717EVFp
🔗 V3.1 Base Open-source weights: https://t.co/5wlDui34hH
🔗 V3.1 Open-source weights:
模型更新 🤖

🔹 V3.1 Base: 在V3基础上继续进行840B tokens的预训练,以扩展长文本上下文
🔹 Tokenizer 和聊天模板已更新 — 新的tokenizer配置:https://t.co/r3y717EVFp
🔗 V3.1 Base 开源权重:https://t.co/5wlDui34hH
🔗 V3.1 开源权重:
♥ 880 · ↻ 48
2025-08-21回复
Tools & Agents Upgrades 🧰

📈 Better results on SWE / Terminal-Bench
🔍 Stronger multi-step reasoning for complex search tasks
⚡️ Big gains in thinking efficiency

3/5
工具与代理升级 🧰

📈 SWE / Terminal-Bench 上获得更好的结果
🔍 复杂搜索任务的多步推理能力增强
⚡️ 思维效率大幅提升

3/5
♥ 822 · ↻ 49
2025-08-21回复
API Update ⚙️

🔹 deepseek-chat → non-thinking mode
🔹 deepseek-reasoner → thinking mode
🧵 128K context for both
🔌 Anthropic API format supported: https://t.co/DcWmJMA1CP
✅ Strict Function Calling supported in Beta API: https://t.co/jFhJQ4wyN3
🚀 More API resources, smoother
API更新 ⚙️

🔹 deepseek-chat → 非思考模式
🔹 deepseek-reasoner → 思考模式
🧵 两者都支持128K上下文
🔌 支持Anthropic API格式:https://t.co/DcWmJMA1CP
✅ Beta API支持严格函数调用:https://t.co/jFhJQ4wyN3
🚀 更多API资源,更流畅
♥ 798 · ↻ 40
2025-08-21
Introducing DeepSeek-V3.1: our first step toward the agent era! 🚀

🧠 Hybrid inference: Think & Non-Think — one model, two modes
⚡️ Faster thinking: DeepSeek-V3.1-Think reaches answers in less time vs. DeepSeek-R1-0528
🛠️ Stronger agent skills: Post-training boosts tool use and
推出DeepSeek-V3.1:迈向智能体时代的第一步!🚀

🧠 混合推理:思考与非思考 — 一个模型,两种模式
⚡️ 更快的思考速度:DeepSeek-V3.1-Think相比DeepSeek-R1-0528能更快得出答案
🛠️ 更强的智能体技能:训练后提升了工具使用和
♥ 14,818 · ↻ 1,769
2025-05-29
🚀 DeepSeek-R1-0528 is here!

🔹 Improved benchmark performance
🔹 Enhanced front-end capabilities
🔹 Reduced hallucinations
🔹 Supports JSON output & function calling

✅ Try it now: https://t.co/IMbTch8Pii
🔌 No change to API usage — docs here: https://t.co/Qf97ASptDD
🔗
🚀 DeepSeek-R1-0528 来了!
♥ 9,708 · ↻ 1,474
2025-03-25
🚀 DeepSeek-V3-0324 is out now!

🔹 Major boost in reasoning performance
🔹 Stronger front-end development skills
🔹 Smarter tool-use capabilities

✅ For non-complex reasoning tasks, we recommend using V3 — just turn off “DeepThink”
🔌 API usage remains unchanged
📜 Models are
🚀 DeepSeek-V3-0324 现已发布!
♥ 11,680 · ↻ 1,891
2025-03-01
🚀 Day 6 of #OpenSourceWeek: One More Thing – DeepSeek-V3/R1 Inference System Overview

Optimized throughput and latency via:
🔧 Cross-node EP-powered batch scaling
🔄 Computation-communication overlap
⚖️ Load balancing

Statistics of DeepSeek's Online Service:
⚡ 73.7k/14.8k
🚀 #OpenSourceWeek 第6天:还有一件事 – DeepSeek-V3/R1 推理系统概述
♥ 9,108 · ↻ 1,207
2025-02-28
🚀 Day 5 of #OpenSourceWeek: 3FS, Thruster for All DeepSeek Data Access

Fire-Flyer File System (3FS) - a parallel file system that utilizes the full bandwidth of modern SSDs and RDMA networks.

⚡ 6.6 TiB/s aggregate read throughput in a 180-node cluster
⚡ 3.66 TiB/min
🚀 #OpenSourceWeek 第5天:3FS,所有DeepSeek数据访问的加速器
♥ 10,175 · ↻ 1,234
2025-02-27
🚀 Day 4 of #OpenSourceWeek: Optimized Parallelism Strategies

✅ DualPipe - a bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.
🔗 https://t.co/GBtxSvWLT4

✅ EPLB - an expert-parallel load balancer for V3/R1.
🔗
🚀 #OpenSourceWeek 第4天:优化的并行策略
♥ 5,816 · ↻ 797
2025-02-26
🚨 Off-Peak Discounts Alert!

Starting today, enjoy off-peak discounts on the DeepSeek API Platform from 16:30–00:30 UTC daily:

🔹 DeepSeek-V3 at 50% off
🔹 DeepSeek-R1 at a massive 75% off

Maximize your resources smarter — save more during these high-value hours!
🚨 非高峰时段折扣提醒!
♥ 6,622 · ↻ 628
2025-02-26
🚀 Day 3 of #OpenSourceWeek: DeepGEMM

Introducing DeepGEMM - an FP8 GEMM library that supports both dense and MoE GEMMs, powering V3/R1 training and inference.

⚡ Up to 1350+ FP8 TFLOPS on Hopper GPUs
✅ No heavy dependency, as clean as a tutorial
✅ Fully Just-In-Time compiled
🚀 #OpenSourceWeek 第3天:DeepGEMM
♥ 6,387 · ↻ 857
2025-02-25
🚀 Day 2 of #OpenSourceWeek: DeepEP

Excited to introduce DeepEP - the first open-source EP communication library for MoE model training and inference.

✅ Efficient and optimized all-to-all communication
✅ Both intranode and internode support with NVLink and RDMA
🚀 #OpenSourceWeek 第2天:DeepEP
♥ 8,101 · ↻ 1,052
2025-02-24
🚀 Day 1 of #OpenSourceWeek: FlashMLA

Honored to share FlashMLA - our efficient MLA decoding kernel for Hopper GPUs, optimized for variable-length sequences and now in production.

✅ BF16 support
✅ Paged KV cache (block size 64)
⚡ 3000 GB/s memory-bound & 580 TFLOPS
🚀 #OpenSourceWeek 第1天:FlashMLA
♥ 10,148 · ↻ 1,325
2025-02-21
🚀 Day 0: Warming up for #OpenSourceWeek!

We're a tiny team @deepseek_ai exploring AGI. Starting next week, we'll be open-sourcing 5 repos, sharing our small but sincere progress with full transparency.

These humble building blocks in our online service have been documented,
🚀 第0天:为#OpenSourceWeek预热!
♥ 20,412 · ↻ 2,594
2025-02-18
🚀 Introducing NSA: A Hardware-Aligned and Natively Trainable Sparse Attention mechanism for ultra-fast long-context training & inference!

Core components of NSA:
• Dynamic hierarchical sparse strategy
• Coarse-grained token compression
• Fine-grained token selection

💡 With
🚀 介绍NSA:一种硬件对齐且原生可训练的稀疏注意力机制,用于超快速长上下文训练和推理!
♥ 15,318 · ↻ 2,101
2025-02-14
🎉 Excited to see everyone’s enthusiasm for deploying DeepSeek-R1! Here are our recommended settings for the best experience:

• No system prompt
• Temperature: 0.6
• Official prompts for search & file upload: https://t.co/TtjEvldTz5
• Guidelines to mitigate model bypass
🎉 很高兴看到大家部署DeepSeek-R1的热情!以下是我们为获得最佳体验推荐的设置:
♥ 15,656 · ↻ 1,649
2025-02-05回复
📢 Terminology Correction: DeepSeek-R1’s code and models are released under the MIT License.
📢 术语更正:DeepSeek-R1的代码和模型是以MIT许可证发布的。
♥ 878 · ↻ 72
2025-01-28回复
@dapangdun This is a typical impersonation account. Please do not trust any information from this account.
@dapangdun 这是一个典型的冒充账户。请不要相信此账户的任何信息。
♥ 3,620 · ↻ 64
2025-01-28
To prevent any potential harm, we reiterate that @deepseek_ai is our sole official account on Twitter/X.

Any accounts:
- representing us
- using identical avatars
- using similar names
are impersonations.

Please stay vigilant to avoid being misled!
为防止任何潜在危害,我们重申@deepseek_ai是我们在Twitter/X上的唯一官方账户。
♥ 74,596 · ↻ 5,950
2025-01-20回复
🌐 API Access & Pricing

⚙️ Use DeepSeek-R1 by setting model=deepseek-reasoner
💰 $0.14 / million input tokens (cache hit)
💰 $0.55 / million input tokens (cache miss)
💰 $2.19 / million output tokens

📖 API guide: https://t.co/Qf97ASptDD

🐋 5/n
🌐 API访问与定价
♥ 3,418 · ↻ 324
2025-01-20回复
🛠️ DeepSeek-R1: Technical Highlights

📈 Large-scale RL in post-training
🏆 Significant performance boost with minimal labeled data
🔢 Math, code, and reasoning tasks on par with OpenAI-o1
📄 More details: https://t.co/jWMxMVhGAQ

🐋 4/n
🛠️ DeepSeek-R1:技术亮点
♥ 4,897 · ↻ 770
2025-01-20回复
📜 License Update!

🔄 DeepSeek-R1 is now MIT licensed for clear open access
🔓 Open for the community to leverage model weights & outputs
🛠️ API outputs can now be used for fine-tuning & distillation

🐋 3/n
📜 许可证更新!
♥ 4,731 · ↻ 395
2025-01-20回复
🔥 Bonus: Open-Source Distilled Models!

🔬 Distilled from DeepSeek-R1, 6 small models fully open-sourced
📏 32B & 70B models on par with OpenAI-o1-mini
🤝 Empowering the open-source community

🌍 Pushing the boundaries of **open AI**!

🐋 2/n
🔥 额外奖励:开源蒸馏模型!
♥ 3,176 · ↻ 325
2025-01-20
🚀 DeepSeek-R1 is here!

⚡ Performance on par with OpenAI-o1
📖 Fully open-source model & technical report
🏆 MIT licensed: Distill & commercialize freely!

🌐 Website & API are live now! Try DeepThink at https://t.co/v1TFy7LHNy today!

🐋 1/n
🚀 DeepSeek-R1来了!
♥ 35,434 · ↻ 6,771
2025-01-15回复
⚠️ Important Notice:

✅ 100% FREE - No ads, no in-app purchases
🛡️ Download only from official channels to avoid being misled
📲 Search "DeepSeek" in your app store or visit our website for direct links

🌟 3/3
⚠️ 重要通知:
♥ 1,508 · ↻ 120
2025-01-15回复
✨ Key Features of DeepSeek App:

🔐 Easy login: E-mail/Google Account/Apple ID
☁️ Cross-platform chat history sync
🔍 Web search & Deep-Think mode
📄 File upload & text extraction

🌟 2/3
✨ DeepSeek应用的主要功能:
♥ 1,070 · ↻ 121
2025-01-15
🎉 Introducing DeepSeek App!

💡 Powered by world-class DeepSeek-V3
🆓 FREE to use with seamless interaction
📱 Now officially available on App Store & Google Play & Major Android markets
🔗Download now: https://t.co/DIwqqkbK93

🌟 1/3
🎉 推出DeepSeek应用!
♥ 4,051 · ↻ 731
2025-01-10回复
DeepSeek has not issued any cryptocurrency. Currently, there is only one official account on the Twitter platform. We will not contact anyone through other accounts.Please stay vigilant and guard against potential scams.
DeepSeek未发行任何加密货币。目前,在Twitter平台上只有一个官方账户。我们不会通过其他账户联系任何人。请保持警惕,防范潜在诈骗。
♥ 6,409 · ↻ 671
2024-12-26回复
💰 API Pricing Update

🎉 Until Feb 8: same as V2!
🤯 From Feb 8 onwards:
Input: $0.27/million tokens ($0.07/million tokens with cache hits)
Output: $1.10/million tokens

🔥 Still the best value in the market!

🐋 3/n
💰 API定价更新
♥ 1,603 · ↻ 126
2024-12-26回复
🌌 Open-source spirit + Longtermism to inclusive AGI

🌟 DeepSeek’s mission is unwavering. We’re thrilled to share our progress with the community and see the gap between open and closed models narrowing.

🚀 This is just the beginning! Look forward to multimodal support and
🌌 开源精神 + 长期主义 = 包容性AGI
♥ 1,221 · ↻ 91
2024-12-26回复
🎉 What’s new in V3?

🧠 671B MoE parameters
🚀 37B activated parameters
📚 Trained on 14.8T high-quality tokens

🔗 Dive deeper here:
Model 👉 https://t.co/9iwEF6aLuk
Paper 👉 https://t.co/ruzwMFYAAH

🐋 2/n
🎉 V3有什么新功能?
♥ 977 · ↻ 78
2024-12-26
🚀 Introducing DeepSeek-V3!

Biggest leap forward yet:
⚡ 60 tokens/second (3x faster than V2!)
💪 Enhanced capabilities
🛠 API compatibility intact
🌍 Fully open-source models & papers

🐋 1/n
🚀 介绍DeepSeek-V3!
♥ 13,006 · ↻ 2,110
2024-12-13回复
🎉 What Else Can DeepSeek-VL2 Do?

😂 Understand memes like a pro!
🔍 Identify objects from image context!
📖 Craft stories from images—let it be your storyteller!

🧵 5/n
🎉 DeepSeek-VL2还能做什么?
♥ 289 · ↻ 34
2024-12-13回复
💼 Let DeepSeek-VL2 Boost Your Productivity!

🧠 Easily understand complex charts and diagrams
💻 Instantly generate plot code with precision

🧵 4/n
💼 让DeepSeek-VL2提升你的生产力!
♥ 274 · ↻ 41
2024-12-13回复
🔧 Technical Improvements

📊 Data: 2x high-quality training data vs. DeepSeek-VL1, unlocking new capabilities
🏗️ Architecture: Dynamic image tiling for flexible resolutions + efficient DeepSeek-MoE for language
⚙️ Training: Retains 3-stage training + new multi-modal parallel
🔧 技术改进
♥ 59 · ↻ 4
2024-12-13回复
💡 Still Fully Open-Source! Come explore our model and check out the code—technical report coming soon!

💾 Hugging Face: https://t.co/5CnlWKCq5t
💻 Github Page: https://t.co/2skukDeIjW

🧵 2/n
💡 仍然完全开源!来探索我们的模型并查看代码——技术报告即将发布!
♥ 88 · ↻ 7
2024-12-13
🎉 DeepSeek-VL2 is here! Our next-gen vision-language model enters the MoE era.

🤖 DeepSeek-MoE arch + dynamic image tilling
⚡ 3B/16B/27B sizes for flexible use
🏆 Outstanding performance across all benchmarks

🧵 1/n
🎉 DeepSeek-VL2来了!我们的下一代视觉语言模型进入MoE时代。
♥ 1,591 · ↻ 263
2024-12-10回复
🙌 With the release of DeepSeek-V2.5-1210, the V2 series comes to an end.
💪 Since May, the DeepSeek V2 series has brought 5 impactful updates, earning your trust and support along the way.
✨ As V2 closes, it’s not the end—it’s the beginning of something greater. DeepSeek is
🙌 随着DeepSeek-V2.5-1210的发布,V2系列结束。
♥ 323 · ↻ 24
2024-12-10回复
📊 DeepSeek-V2.5-1210 raises the bar across benchmarks like math, coding, writing, and roleplay—built to serve all your work and life needs.

🔧 Explore the open-source model on Hugging Face: https://t.co/PaHvAE92ly

🧵(2/3)
📊 DeepSeek-V2.5-1210在数学、编程、写作和角色扮演等基准测试中提高了标准——旨在满足您所有工作和生活需求。
♥ 353 · ↻ 43
2024-12-10
🚀 DeepSeek-V2.5-1210: The Grand Finale 🎉

🌐 Internet Search is now live on the web! Visit https://t.co/IMbTch8Pii and toggle “Internet Search” for real-time answers. 🕒

🧵(1/3)
🚀 DeepSeek-V2.5-1210:盛大收官 🎉
♥ 1,692 · ↻ 221
2024-11-20回复
🌟 Inference Scaling Laws of DeepSeek-R1-Lite-Preview
Longer Reasoning, Better Performance. DeepSeek-R1-Lite-Preview shows steady score improvements on AIME as thought length increases.
🌟 DeepSeek-R1-Lite-Preview的推理扩展定律
♥ 739 · ↻ 87
2024-11-20回复
🌟 Impressive Results of DeepSeek-R1-Lite-Preview Across Benchmarks!
🌟 DeepSeek-R1-Lite-Preview在各项基准测试中取得的令人印象深刻的结果!
♥ 720 · ↻ 79
在 X 上查看 @deepseek_ai 更多 →
Kimi.ai
@Kimi_Moonshot
中国模型
7 天前
The Kimi API is now live on AWS Marketplace. 🚀

If your team is already running on AWS, you can now access Kimi with consolidated billing. Plus, eligible customers can apply Kimi API usage directly toward their AWS EDP commitments.

Build and scale with Kimi today:
Kimi API现已登陆AWS市场。🚀
♥ 541 · ↻ 42
13 天前
Introducing Goal Mode in Kimi Work

Goal lets your desktop agent run 24/7 until the task is done, built for long-horizon tasks and complex multi-step workflows.
Kimi Work推出目标模式
♥ 2,663 · ↻ 183
14 天前转推
RT @prz_chojecki: Kimi 2.7 ranked 2nd after Fable 5 and before GPT-5 xhigh

We have re-run our ErdosBench smoke test on 14 problems with Ki…
RT @prz_chojecki: Kimi 2.7在Fable 5之后、GPT-5 xhigh之前排名第2。
♥ 0 · ↻ 561
16 天前
🌘 Meet Kimi K2.7 Code HighSpeed!
A high-speed mode of our latest open-source multimodal coding model, Kimi K2.7 Code.

⚡️ Up to 6× faster: Around 180 tok/s on coding tasks with median-length inputs, and up to 260 tok/s on shorter-context tasks.

🔷 Rolling out to Kimi Code Beta
🌘 见识一下Kimi K2.7 Code HighSpeed!
♥ 3,912 · ↻ 326
18 天前
Extra API quota for Kimi K2.7 Code builders 🎉

If you're building with Kimi API, get 20%–30% extra quota when you top up $100+ by July 2!

🔷 $100–$299 → +20% quota
🔷 $300–$999 → +25% quota
🔷 $1,000+ → +30% quota

(One bonus per account.)

- Details: https://t.co/HLRaFecpoN
Kimi K2.7 Code开发者额外API配额 🎉
♥ 382 · ↻ 24
19 天前转推
RT @vllm_project: 🎉 Congrats to @Kimi_Moonshot on Kimi K2.7-Code, a coding-focused agentic model built on K2.6.

✨ 1T-parameter Mixture-of-…
RT @vllm_project: 🎉 祝贺@Kimi_Moonshot推出Kimi K2.7-Code,这是一个基于K2.6构建的专注于编码的代理模型。
♥ 0 · ↻ 45
19 天前回复
🎸 We're also launching the Kimi Code Beta Program today. Apply now if you'd like to try upcoming models and features before public release 👉 https://t.co/ArGpJcaYma
🎸 我们今天也推出了Kimi Code测试计划。如果您想在公开发布前试用即将推出的模型和功能,请立即申请 👉 https://t.co/ArGpJcaYma
♥ 519 · ↻ 23
19 天前
🌘 Kimi-K2.7-Code, our latest coding model, is now released and open-sourced!

🔷 Improved coding & agent performance over K2.6: +21.8% on Kimi Code Bench v2, +11.0% on Program Bench, and +31.5% on MLS Bench Lite.
🔷 Reasoning efficiency: Less overthinking, with 30% lower
🌘 Kimi-K2.7-Code,我们最新的编码模型,现已发布并开源!
♥ 13,971 · ↻ 1,747
23 天前回复
Agent Swarm & Instant Document Delivery

Kimi will automatically coordinate 300 sub-agents to break down and execute your tasks. Delivers production-ready output in PPTX, Word, PDF, and Excel, straight to your desktop.
代理集群与即时文档交付
♥ 86 · ↻ 4
23 天前回复
Built for Finance, with Data Source Tool Call

Kimi Desktop paired with agent swarm and data tool call, 300 specialized agents in parallel, each with access to tools:
• Yahoo Finance for market data
• World Bank for economic research
• Binance for crypto analysis
And more
专为金融打造,配备数据源工具调用
♥ 101 · ↻ 7
23 天前回复
Pair it with WebBridge and your agent will navigate websites in your browser: search, scroll, click, type and complete tasks.

Try it 👉 https://t.co/dFMx5atf2e
将其与WebBridge配对,您的代理将在浏览器中导航网站:搜索、滚动、点击、输入并完成任务。
♥ 117 · ↻ 6
23 天前
Meet Kimi Work - a local AI agent on your desktop that does the work for you.

🔹Native agent swarm: Up to 300 AI agents running in parallel on your local machine.
🔹Browser use: Paired with WebBridge extension, your agent will navigate websites in your browser: search, scroll,
介绍Kimi Work - 您桌面上的本地AI代理,为您完成工作。
♥ 2,770 · ↻ 282
23 天前转推
RT @KimiDevs: Kimi Code, our open-source coding agent, just got a major upgrade!

🔹One-line CLI install, zero setup, fast startup​
🔹Drag in…
RT @KimiDevs: Kimi Code,我们的开源编码代理,刚刚进行了重大升级!
♥ 0 · ↻ 243
2026-05-25转推
RT @Designarena: An open source model has returned to #1 on the 3D Design leaderboard by Design Arena.

Kimi K2.6 has reached the top of th…
RT @Designarena: 一个开源模型已回到Design Arena的3D设计排行榜榜首。
♥ 0 · ↻ 57
2026-05-20转推
RT @cerebras: Cerebras is now running Kimi K2.6 – a trillion parameter model – in enterprise trials.

At ~1,000 tokens/s, this is the fast…
RT @cerebras: Cerebras现在在企业试用中运行Kimi K2.6 - 一个万亿参数模型。
♥ 0 · ↻ 325
2026-05-19转推
RT @cursor_ai: Composer 2.5 is built on the same open-source base as Composer 2, Moonshot’s Kimi K2.5.
RT @cursor_ai: Composer 2.5与Composer 2、Moonshot的Kimi K2.5建立在相同的开源基础上。
♥ 0 · ↻ 62
2026-05-14回复
Supports Kimi Code CLI, Claude Code, Cursor, Codex, Hermes, and more.

Try it at: https://t.co/sUqDpi0HQr and the Chrome Web Store.
支持Kimi Code CLI、Claude Code、Cursor、Codex、Hermes等。
♥ 100 · ↻ 14
在 X 上查看 @Kimi_Moonshot 更多 →
MiniMax (official)
@MiniMax_AI
中国模型
14 小时前
Day two of @aiDotEngineer started with a conversation anyone serious about open weights should be paying attention to.

@olive_jy_song, research lead RL, joined @Thom_Wolf to dig into sparse attention, native multimodal training from day zero, and why open-weights matter for
Olive Song @olive_jy_song
Backstage and onstage with @Thom_Wolf and @swyx . I really enjoyed the fireside chat! Thanks for having me back at @aiDotEngineer!

And always proud to be part of these conversations to share our work on sparse attention and native multimodality trained from the start, and why h
@aiDotEngineer 的第二天开始了一场对话,任何认真对待开源权重的人都应该关注。
♥ 56 · ↻ 3
1 天前
Finallyyy with @LambdaAPI
Zach Mueller @TheZachMueller
New model card up, @MiniMax_AI M3! (Working through the Colorado backlog)

At 400B+ parameters, using the unquantized weights ends up needing a full HGX B200 (and I don't believe we can run the MXFP4 on hopper)

Nice addition (on top of performance) is the multi-modality 😍 http
终于与@LambdaAPI在一起了
♥ 66 · ↻ 4
2 天前
Catch us tomorrow at 1 PM UTC as we recap the inaugural BGI Sprint.
Artificial Superintelligence Alliance @ASI_Alliance
Join us this Tuesday, June 30th, 2026, at 1 pm UTC for a new Technical Tuesdays session to reflect on the inaugural BGI Sprint, a multi-track hackathon to build the future of Beneficial General Intelligence. https://t.co/czIA6EQAnK
明天UTC下午1点来找我们,我们将回顾首届BGI冲刺活动。
♥ 52 · ↻ 1
2 天前
This is a glimpse of where local AI is heading and we are glad to be part of it.

Really impressive work by all the teams involved @Gradient_HQ, @tryParallax, and @GA_agent_ai
Gradient @Gradient_HQ
A self-evolving agent + a 428B model + 3 Macs = ?

Your own AI lab.

We ran @MiniMax_AI M3 locally with @tryParallax, right on our desk.

Then @GA_agent_ai took over to create a 5-stock portfolio and write it to disk.

No cloud. No API bills. Nothing left the machine.

Wild to ht
这是本地AI发展方向的一瞥,我们很高兴能成为其中一部分。
♥ 612 · ↻ 51
2 天前
we’ll be at Lab #1 during @aiDotEngineer World’s Fair.

hope to see you there!
AI Engineer @aiDotEngineer
The 2026 World's Fair is completely sold out 🫡

✅ The largest AI industry expo on earth
✅ Sold out on Leadership track for CTOs & VP AI's
✅ Sold out on Workshops tomorrow
✅ Sold out on ALL late bird tickets
🙌 65 side events still FREE all over SF (see website)
What we will https:
在@aiDotEngineer世界博览会期间,我们将在实验室#1。
♥ 36 · ↻ 0
2 天前
Congrats to all the winners of our cohosted hackathon with @cysic_xyz

Check out the stellar projects built with M3 👇
Cysic @cysic_xyz
1/ CyOps Arena has officially ended.

Over the past two weeks, 450+ builders put CyOps to the test, using AI agents to build, iterate, debug, and ship real software.

43 completed projects made it to the leaderboard.

Today, we’re announcing the top 10. 🧵 https://t.co/7knHkvM6ex
祝贺我们与@cysic_xyz联合主办的黑客马拉松的所有获奖者
♥ 54 · ↻ 1
3 天前
next week at @aiDotEngineer, we are joining @togethercompute for a conversation on what goes into running agents at scale.

@olive_jy_song, Research Lead, RL at MiniMax, and
@realDanFu, VP of Kernels at Together AI, will walk through both sides of M3: the training decisions
下周在@aiDotEngineer,我们将加入@togethercompute,讨论大规模运行代理所需考虑的因素。
♥ 59 · ↻ 3
4 天前转推
RT @questflow: New free model on Questflow 🎁

MiniMax M3 is now live on the Questflow and in our AI Trader Arena.

Access is 100% free, ful…
RT @questflow: Questflow上新免费模型 🎁
♥ 0 · ↻ 6
4 天前
👀 Looking forward to seeing builders give it a try tomorrow.

Curious what model is powering it, @browser_use
Alexander Yue @Alezander907
Our new cloud agent (live tomorrow) can make posters!

Its so much nicer to see information in a styled page than plaintext. It can actually do a lot more than just posters, but more on that later...

Try it tomorrow! https://t.co/YZYZmmGIPq
👀 期待看到构建者们明天尝试它。
♥ 106 · ↻ 1
5 天前
we’ll be at AI Engineer After Dark on July 1st with @vercel , @merge_api, @FactoryAI , and a room full of people building the AI engineering stack.

our Research Lead, RL Training @olive_jy_song will be giving a lightning talk on post-training MiniMax M3 as part of a lineup
7月1日,我们将与@vercel、@merge_api、@FactoryAI以及一屋子构建AI工程栈的人一起参加AI Engineer After Dark活动。
♥ 82 · ↻ 9
6 天前转推
RT @CreaoAI: Last night in SF, we gave a room full of creators one challenge: come with an idea, leave with a working demo.

Our CTO @intui
RT @CreaoAI: 昨晚在旧金山,我们给一屋子创作者们出了一个挑战:带着想法来,带着可运行的演示离开。
♥ 0 · ↻ 5
2026-06-01
Introducing MiniMax M3: The First Open-Weights Model to Combine Three Frontier Capabilities

- Coding & Agentic Frontier: 59.0% SWE-Bench Pro, 66.0% Terminal Bench 2.1, 34.8% SWE-fficiency, 28.8% KernelBench Hard, 74.2% MCP Atlas
- MiniMax Sparse Attention scales context to 1M
-
介绍MiniMax M3:首个结合三种前沿能力的开放权重模型
♥ 11,787 · ↻ 1,154
在 X 上查看 @MiniMax_AI 更多 →
jietang
@jietang
智谱/清华
2025-11-13
for GLM-4.6, which features do you want most? Speed up to 100t/s? Stability? Lower price?
Arena.ai @arena
🚀Introducing Code Arena: the next generation of live coding evals for frontier AI models. Built to test how models plan, scaffold, debug, and build real web apps step-by-step.

Try Claude, GPT-5, GLM-4.6 and Gemini in Code Arena today! https://t.co/0OU57FOI8V
对于GLM-4.6,你最想要哪些功能?速度提升到100t/s?稳定性?更低的价格?
♥ 349 · ↻ 18
2025-11-12
We need to try our best to close the gap
Bindu Reddy @bindureddy
Open source achievements this year!

- Deepseek launched!
- GLM and Kimi are top notch agentic models
- Cost of inference is 10x lower than closed SOTA models
- only 8 points behind closed source

Probability of open source closing the gap - 72.34%
我们需要尽力缩小差距
♥ 80 · ↻ 4
2025-10-28
add oil
AICodeKing @aicodeking
Minimax M2 on KingBench:

M2 scores 48%, ranking #12 on the leaderboard.

GLM-4.6 still scores slightly higher.

In my Agentic Benchmarks, M2 ranks #5 just 0.3% below GLM-4.6 and it handles long-running tasks much better than GLM.

Similar to GPT-5 (rather than Claude). h
加油
♥ 115 · ↻ 4
2025-10-27
try 4.6 very soon
mconcat @monoidconcat
GLM 4.5 air running on 4x 3090 with 28 tokens/sec
Pipeline parallelism over PCIe(no nvlink) https://t.co/ARcv2G8seP
很快就会尝试4.6版本
♥ 375 · ↻ 11
2025-10-26
Great job
0xSero @0xSero
Running GLM-4.6 at home on my 6x 3090s using exllamav3! 120k context window at FP16 kv_cache, 200k context at Q8 kv_cache!

This is actually a breakthrough, cerebras are doing god's work

https://t.co/aCtgxbpTAm https://t.co/HxGS7LqqxV
干得好
♥ 25 · ↻ 0
2025-10-24
今天是1024程序员节,在此智谱为大家带来一个「为期 8 天」的小惊喜。

GLM Coding Plan 限时特惠,将SOTA的 Agentic Coding能力,带给更多个人/企业开发者:
- 新人首单 5 折;
- 成功邀请新人下单即可返 40% 平台赠金;
- 被邀好友通过链接下单还可获 10% 优惠。
今天是1024程序员节,在此智谱为大家带来一个「为期 8 天」的小惊喜。
♥ 76 · ↻ 4
2025-10-23
great
Dr. Daniel Bender @drdanielbender
GLM 4.6 is the most powerful open-weight model for coding.

With 375B params, it's likely too large to run it on your own hardware.

Subscriptions from z. ai start at $2.70 (code in comment) for the first month.

Use it on platforms like Claude Code and OpenCode. 👇 https://t.co/d
很棒
♥ 46 · ↻ 2
2025-10-22
Glyph: Scaling Context Windows via Visual-Text Compression
Xiao Liu (Shaw) @ShawLiu12
Why feed 1M tokens when ~250k visual tokens do? 🚀👀

Concurrent to DeepSeek-OCR, today we’re releasing Glyph, a visual-text compression paradigm that turns long text into images and lets a VLM read them.

Paper: https://t.co/dvYaKjWoXW

@karpathy may be you will be also https://t.
Glyph:通过视觉文本压缩扩展上下文窗口
♥ 12 · ↻ 0
2025-10-22
run in home
Ahmad @TheAhmadOsman
best 3 opensource Agentic LLMs you can run at home

> GLM 4.5 Air

> GPT OSS 120B

> GPT OSS 20B

these model excel at executing commands and running tasks in the background on your behalf

and they can run on hardware ranging from

4x RTX 3090s (~$3k) to 1x RTX 3090 (~$700)
在家运行
♥ 9 · ↻ 0
2025-10-21
The best open model
Arena.ai @arena
🚨 WebDev Arena: Top 15 Disrupted!

4 new models have been added to the WebDev leaderboard:

🔸 #4 Claude Sonnet 4.5 Thinking 32k by @AnthropicAI
🔸 #4 GLM 4.6 (the new #1 open model) by @Zai_org
🔸 #11 Qwen3 235B A22B Instruct (and #7 open model) by @Alibaba_Qwen
🔸 #14 Claude http
最好的开源模型
♥ 11 · ↻ 0
2025-10-21
hello world
Melvin Vivas @donvito
Tutorial on how to configure GLM 4.6 in Claude Code (bookmark it)

1. Install the latest Claude Code
npm install -g @anthropic-ai/claude-code

2. Create an account in https://t.co/3OuDd3Y5j1 and buy a coding plan at $3/mo. You can always upgrade later! Use my link https://t.co/5f
hello world
♥ 12 · ↻ 1
2025-10-21
GLM 4.6 #1 open model
Arena.ai @arena
🚨 WebDev Arena: Top 15 Disrupted!

4 new models have been added to the WebDev leaderboard:

🔸 #4 Claude Sonnet 4.5 Thinking 32k by @AnthropicAI
🔸 #4 GLM 4.6 (the new #1 open model) by @Zai_org
🔸 #11 Qwen3 235B A22B Instruct (and #7 open model) by @Alibaba_Qwen
🔸 #14 Claude http
GLM 4.6 #1 开源模型
♥ 40 · ↻ 5
2025-10-15
cool
Md Riyazuddin @riyazmd774
SORA + GLM-4.6 is next-level 🔥

This duo is insane for producing cinematic, high-quality videos no studios, no teams, no $1K/hour directors.

Just two AI tools working together like a full creative department.

Here’s how it works:

→ GLM-4.6 expands your concept into full https:
♥ 12 · ↻ 0
2025-10-15
On webdev arena, GLM is ranked #4 and #1 in all open models
在webdev竞技场中,GLM排名第4,在所有开源模型中排名第1
♥ 234 · ↻ 13
2025-10-15
a picture from LMArena
来自LMArena的一张图片
♥ 11 · ↻ 2
2025-10-14
GLM
Melvin Vivas @donvito
glm code is not real. just playing around 🤣

just use @opencode or claude code for GLM coding plans

that's better than reinventing the wheel https://t.co/z43tYva2sw
GLM
♥ 28 · ↻ 1
2025-10-13
cool
Kai @hqmank
🤖Claude Code vs Droid

I tried a real dev test today building a React Flow mind map. Guess what?

Droid felt way smarter, even though they’re based on the same model. Really curious, why is Droid so much better?

Droid>CC>Codex

@FactoryAI https://t.co/d7LK2dUU8y
♥ 29 · ↻ 0
2025-10-13
GLM写生产级代码
karminski-牙医 @karminski3
都是开放权重,GLM-4.6 能写生产级代码?

开放权重大模型的前端差距正在逐渐拉大, 我最近在打磨大模型测试 Q4 的内容, 给大家带来测试前瞻, 本次包括 DeepSeek-V3.2 和 GLM-4.6, 来看下两个模型差距有多大.

#GLM #智谱 #国产大模型 #GLM46 https://t.co/Rvar81uKGD
GLM写生产级代码
♥ 23 · ↻ 0
2025-10-11
great
Lincoln 🇿🇦 @Presidentlin
GLM is going to make Factory (Droids) so much money, 4.6 is so token efficient.

4.6 Air will be an interesting model to watch.

TLDR one of my predictions for 2026 is that AI coding agent companies are all going to have to become tech shrek.

===

Differentiation for AI Coding
太棒了
♥ 21 · ↻ 1
2025-10-11
Who invented residual neural networks?
残差神经网络是谁发明的?
♥ 24 · ↻ 3
2025-10-11
RL is probably one of the only things that can be easily published ...
Siva Reddy @sivareddyg
Lot of insights in @YejinChoinka's talk on RL training. Rip for next token prediction training (NTP) and welcome to Reinforcement Learning Pretraining (RLP). #COLM2025

No place to even stand in the room. https://t.co/lLMN2YTf0o
强化学习可能是少数能够轻易发表的研究领域之一...
♥ 195 · ↻ 11
2025-10-10
coool
Z.ai @Zai_org
The World’s Top-Ranked AI Coding Solution is coming to Dubai!

Meet https://t.co/IQMnfBczAT at #GITEXGLOBAL 2025 - Booth CC1-89
Explore: Sovereign AI • AI Coding • Industry Agents

📍 Dubai World Trade Centre | Oct 13–17
📧 enterprise@z.ai

#SovereignAI #AICoding https://t.co/6VcjD
♥ 19 · ↻ 0
2025-10-10
less is more
Sebastian Raschka @rasbt
From the Hierarchical Reasoning Model (HRM) to a new Tiny Recursive Model (TRM).

A few months ago, the HRM made big waves in the AI research community as it showed really good performance on the ARC challenge despite its small 27M size. (That's about 22x smaller than the https:/
少即是多
♥ 39 · ↻ 0
2025-10-09
glm #1
Lisan al Gaib @scaling01
DeepSeek is no longer the #1 chinese lab
glm #1
♥ 148 · ↻ 3
2025-10-09
Will do. Thanks a lot.
0xWulf @hexawulf
@Zai_org GLM-4.6 is strong, but these will make it stronger:
1️⃣ Function calling (JSON schemas)
2️⃣ Async SDKs (not just sync requests)
3️⃣ Reasoning trace access (thinking flag exists but output hidden)
4️⃣ Vision API (200K context, but text only)
5️⃣ Batch processing
会做的。非常感谢。
♥ 61 · ↻ 2
2025-10-09
Cool
Melvin Vivas @donvito
I did a simple test of GLM 4.6 in Droid and Claude Code

Here are some observations:
- @FactoryAI Core(GLM 4.6) is faster than the @Zai_org GLM coding plan in Droid
- @Zai_org GLM coding plan is faster in Claude Code compared to Droid

My suggestions:
Droid - use Factory Core(GLM
♥ 19 · ↻ 2
2025-10-09
We are developing the next version of GLM. What features do you want the most?
我们正在开发GLM的下一个版本。你最想要哪些功能?
♥ 1,057 · ↻ 41
2025-10-08
great
Emamul Andalib @Emamul_Andalib
Today, I tried @FactoryAI’s Droid and @Zai_org GLM-4.6. I bought the GLM Coding Max Plan and am satisfied with it. I mainly use the setup to debug issues. Recently, I tested @microsoft’s Playwright and @ChromeDevTools MCP sever to automate some tasks using the browser. Satisfied!
太棒了
♥ 12 · ↻ 0
2025-10-08
coool
Bessi @LLMpsycho
I'm assembling a team.
I just recruited the 2nd member.
Meet my new devops: GLM 4.6 on Droid.
One shooted the deployment. https://t.co/Kjsmj3DmtU
超酷
♥ 31 · ↻ 1
2025-10-08
self-evolution is the next step
Siwei Han @lillianwei423
🚨 Introducing ATP — Alignment Tipping Process!
🔥 Beware! Self-Evolution is gradually pushing LLM Agents off the rails! Even perfect alignment at deployment can gradually forget human alignment and shift toward self-serving strategies.

#AI #LLM #Agents #SelfEvolving #Alignment ht
自我进化是下一步
♥ 16 · ↻ 1
2025-10-08
Thanks for the support. Shall we have the model on Cerebras soon?
Cerebras @cerebras
It's a good model
感谢支持。我们很快就能在Cerebras上使用这个模型吗?
♥ 27 · ↻ 1
2025-10-08
GLM-4.6 the highest score
Factory @FactoryAI
Starting today, you can use any open-source model to power your Droids.

Droids achieve the highest scores across all open-source models on Terminal-Bench. We find GLM 4.6 to be the most performant, remarkably achieving a score in Droid that beats Sonnet 4 in Claude Code. https:/
GLM-4.6 最高分
♥ 224 · ↻ 8
2025-10-07
We are #1 on the trending of OpenRouter today. Thanks for the support.
今天我们在OpenRouter趋势榜上排名第一。感谢支持。
♥ 484 · ↻ 17
2025-10-07
"GLM-4.6 - top of the charts, open-source"
Bindu Reddy @bindureddy
We're dropping four new models today!

GLM- 4.6 - top of the charts, open-source
GPT-5 Pro - experience the luxury
Sora 2 - the best video model in the world
image-gen mini - cheap image model

Will be available on ChatLLM by the end of the day.

MOST EXCITED FOR SORA 2!
"GLM-4.6 - 排行榜第一,开源"
♥ 25 · ↻ 2
2025-10-07
GLM-4.6 is #1 in all open models!
Arena.ai @arena
🚨 New Top Open Model Update!

A relative newcomer to the Arena, @zai_org's GLM-4.6 takes the clear, undisputed #1 spot for Top Open Model. 🏆

It also ranks #4 overall, which is not an easy feat! The next top open model, DeepSeek R1 0528, has been the standing champion for https:/
GLM-4.6 在所有开源模型中排名第一!
♥ 201 · ↻ 10
2025-10-06
Finally, our open-source GLM-4.6 are trending no. 1 on HF. Thanks to all for the support. We are working on the next version and stay tuned!
终于,我们的开源GLM-4.6在HF上成为趋势第一名。感谢大家的支持。我们正在开发下一个版本,敬请期待!
♥ 1,318 · ↻ 61
在 X 上查看 @jietang 更多 →
Hugging Face
@huggingface
海外基建
2025-11-15
This might be the biggest AI hackathon ever:

* >6,300 registrants
* Runs for 2 weeks (Nov. 14-30)
* Open to anyone, anywhere virtually
* $20,000 in cash prizes + $3.5M+ in sponsor credits

Hosted by @Anthropic and @Gradio, along with 10 sponsors, join kickoff in 30 minutes 👇
Gradio @Gradio
Join us LIVE at MCP's first Birthday kickoff at 10 am PT today!🎂

Don't miss out on details about the celebration from the co-hosts, @Gradio and @AnthropicAI.

🔥 We've also got an exciting lineup of speakers from @Huggingface, @OpenAI, @GoogleDeepMind, @modal, @blaxelAI, https://
这可能是史上最大的AI黑客马拉松:
♥ 479 · ↻ 48
2025-08-20
Sun's out, models out. 😎
@IBM & @NASA dropped Surya, an open-source heliophysics model trained on 14 years of observations from NASA’s Solar Dynamics Observatory, and it's 🔥🔥🔥.
太阳出来了,模型也出来了。😎
♥ 578 · ↻ 115
2025-07-13
Kimi K2 is number one trending on HF, congrats!
Kimi K2在HF上排名第一趋势,恭喜!
♥ 970 · ↻ 85
2025-04-14
Super happy to announce that we are acquiring @pollenrobotics to bring open-source robots to the world! 🤖

Since @RemiCadene joined us from Tesla, we’ve become the most widely used software platform for open robotics thanks to @LeRobotHF and the Hugging Face Hub. Now, we’re
非常高兴地宣布,我们正在收购@pollenrobotics,向世界带来开源机器人!🤖
♥ 906 · ↻ 172
2025-04-06
We are excited to partner with @AIatMeta to welcome Llama 4 Maverick (402B) & Scout (109B) natively multimodal Language Models on the Hugging Face Hub with Xet 🤗

Both MoE models trained on up-to 40 Trillion tokens, pre-trained on 200 languages and significantly outperforms its
我们很荣幸与@AIatMeta合作,通过Xet在Hugging Face Hub上原生支持Llama 4 Maverick(402B)和Scout(109B)等多模态语言模型🤗
♥ 686 · ↻ 114
2024-11-30
QwQ is #1 trending!
QwQ是#1趋势!
♥ 768 · ↻ 83
2024-10-03
GoogleからGemma-2-JPNがリリースされました!このモデルはGemma 2 2Bを日本語でfine-tuneしたものです。Gemma 2の英語での性能と同レベルの性能で日本語をサポートします。
モデル一覧:

https://t.co/LgyHREpCOf
Google发布了Gemma-2-JPN!这个模型是对Gemma 2 2B进行日语微调的结果。它支持日语,性能与Gemma 2的英语性能处于同一水平。
♥ 567 · ↻ 169
2024-08-20
We passed 5 million users.

🥳That's 5 million of you who have signed up on the Hub 🚀 thank you for contributing to the ecosystem and making open Machine Learning happen!

We're just getting started 🤗
我们突破了500万用户。
♥ 1,956 · ↻ 236
2024-08-16
Which is your favorite open LLM? Why? 🤗
你最喜欢的开源LLM是什么?为什么?🤗
♥ 461 · ↻ 33
2024-03-04
The Open Source community is amazing 🤗
开源社区太棒了🤗
♥ 469 · ↻ 44
2024-02-29
We're having some infra issues; we're working on it. Please send hugs! 🤗

In the meantime,

import os
os.environ['HF_HUB_OFFLINE']=1
我们遇到了一些基础设施问题;我们正在处理中。请发送拥抱!🤗
♥ 590 · ↻ 55
2023-12-21
Hugging Face 🫶 @GoogleColab

With the latest release of huggingface_hub, you don't need to manually log in anymore. Create a secret once and share it with every notebook you run. 🤗

pip install --upgrade huggingface_hub

Check it out!👇
Hugging Face 🫶 @GoogleColab
♥ 553 · ↻ 105
2023-09-29
How to train a Llama 2 chatbot (a step-by-step guide, designed for non-coders)

https://t.co/hBy35eihxE
如何训练Llama 2聊天机器人(面向非程序员的分步指南)
♥ 948 · ↻ 221
2023-08-28
Code Llama: Now on Hugging Chat 💻🦙

Try out the 34B Instruct model for free with super fast inference!

👉 https://t.co/TQ0ZaVZcdi
Code Llama:现已登陆Hugging Chat 💻🦙
♥ 1,240 · ↻ 265
2023-08-10
TRL 🤗 Hugging Face

Excited to announce that we're doubling down on our efforts to democratize RLHF and reinforcement learning with TRL, new addition to the @huggingface family, developed and led by team member @lvwerra 🎉🎉

Train your first RLHF model 👉https://t.co/sQRUPllJWT
TRL 🤗 Hugging Face
♥ 510 · ↻ 119
2023-08-04
Hugging Face is now part of the PyTorch Foundation as a premier member 🤝

We have been collaborating with the PyTorch team for the past four years and are committed to supporting the project.

We share an objective: to lower the barrier of entry to ML.

https://t.co/Sq3eL9xhVl
Hugging Face现在是PyTorch基金会的高级成员 🤝
♥ 780 · ↻ 129
2023-07-20
Llama 2: Now on Hugging Chat 🤗🦙

Try out the 70B Chat model for free with super fast inference, web search, and powered by open-source tools!

👉 https://t.co/TQ0ZaVZcdi
Llama 2:现已登陆Hugging Chat 🤗🦙
♥ 1,691 · ↻ 426
2023-07-17
At Hugging Face, we are working to enable you to easily build and serve your own LLMs 🧑‍💻👨‍💻👩‍💻
In this blog, we talk about the amazing world of open-source LLMs, the challenges, and how the Hugging Face ecosystem can help you 🪐
Read about them here 👉https://t.co/cOgLmtBQS7
在Hugging Face,我们正努力让您能够轻松构建和部署自己的LLM 🧑‍💻👨‍💻👩‍💻
♥ 465 · ↻ 111
2023-07-02
We are looking into an incident where a malicious user took control over the Hub organizations of Meta/Facebook & Intel via reused employee passwords that were compromised in a data breach on another site. We will keep you updated 🤗
我们正在调查一起事件,恶意用户通过在其他网站数据泄露中被泄露的重复使用员工密码,控制了Meta/Facebook和Intel的Hub组织。我们将向您更新最新情况 🤗
♥ 783 · ↻ 135
2023-06-14
📣 Calling all game dev and AI enthusiasts!🎮

Already 400 people signed up for the first Open Source AI Game Jam, where you'll use AI tools to make a game in a weekend🔥

Sign up here 👉 https://t.co/iDyTq9aXmA

What AI tools? Let's focus today on Audio tools 🔊
⬇️
📣 呼唤所有游戏开发和AI爱好者!🎮
♥ 496 · ↻ 116
2023-05-17
🤗 Transformers has been built by, with, and for the community.

Reaching 100k ⭐ on GitHub is a testament to ML's reach and the community's will to innovate and contribute.

To celebrate, we highlight 100 incredible projects in transformers' vicinity.

https://t.co/gMI1WGRhTm
一行代码加载,一行代码运行
♥ 1,522 · ↻ 277
2023-05-15
The first RNN in transformers! 🤯
Announcing the integration of RWKV models in transformers with @BlinkDL_AI and RWKV community!
RWKV is an attention free model that combines the best from RNNs and transformers.
Learn more about the model in this blogpost: https://t.co/0FQmsaRVZw
高效的批处理支持以生成多个掩码
♥ 1,105 · ↻ 254
2023-05-11
We just released Transformers' boldest feature: Transformers Agents.

This removes the barrier of entry to machine learning

Control 100,000+ HF models by talking to Transformers and Diffusers

Fully multimodal agent: text, images, video, audio, docs...🌎

https://t.co/OILVxIX44I
管道支持以便更轻松的使用
♥ 3,223 · ↻ 780
2023-04-21
SAM, the groundbreaking segmentation model from @Meta is now in available in 🤗 Transformers!
What does this mean?

1. One line of code to load it, one line to run it
2. Efficient batching support to generate multiple masks
3. pipeline support for easier usage

More details: 🧵
来自@Meta的开创性分割模型SAM现已可在🤗 Transformers中使用!
♥ 1,284 · ↻ 236
2023-03-29
THIS IS BIG! 👀

It's now possible to take any of the >30,000 ML apps from Spaces and run them locally (or on your own infrastructure) with the new "Run with @Docker" feature. 🔥🐳

See an app you like? Run it yourself in just 2 clicks🤯
这很重大!👀
♥ 1,616 · ↻ 327
2023-02-22
Today we are excited to announce a new partnership with @awscloud! 🔥

Together, we will accelerate the availability of open-source machine learning 🤝

Read the post 👉 https://t.co/fb77d1J2qX
今天,我们很高兴地宣布与@awscloud建立新的合作伙伴关系!🔥
♥ 691 · ↻ 151
2022-12-31
It's been an exciting year for 🤗Transformers. We tripled the number of weekly active users over 2022, with over 1M users most weeks now and 300k daily pip installs on average🤯
对于🤗Transformers来说,这是令人兴奋的一年。我们使每周活跃用户数量在2022年翻了两番,现在大多数周都有超过100万用户,平均每天有30万次pip安装🤯
♥ 613 · ↻ 82
2022-11-15
Hugging Faceから日本へのお知らせです!

Hugging Faceコースの日本語翻訳を始めました。東北大学のStudent Ambassadorsの皆さんのお陰で第一章の翻訳が終了しました。
今後もコツコツと翻訳していきます。
是非コースを読んでHugging Face Tranformersについて学んで、使ってみてください!
这是Hugging Face给日本的公告!
♥ 786 · ↻ 218
2022-10-18
Scikit-Learn and 🤗 join forces!

With a growing number of tabular classification & regression checkpoints, we believe statistical ML has its place on the HF Hub.

We're excited to partner with sklearn, statistical ML champion, and move forward together.

https://t.co/j3IVGYjFPv
Scikit-Learn和🤗联手!
♥ 483 · ↻ 88
2022-09-20
日本からの嬉しいお知らせです!rinnaが日本語で学習したJapanese Stable DiffusionがHugging Face Spacesでデモ化されました! https://t.co/Mp8sXdGDV0
来自日本的好消息!rinna用日语训练的Japanese Stable Diffusion已在Hugging Face Spaces上进行了演示! https://t.co/Mp8sXdGDV0
♥ 915 · ↻ 264
2022-09-16
Transformers v4.22 is out, and includes the first VIDEO models! 🎥

💥VideoMAE: masked auto-encoders for video
💥X-CLIP: CLIP for video-language

Other nice goodies:
💥Swin Transformer v2
💥Pegasus-X
💥Donut
💥MobileViT

... and MacOS support (device="mps")!
Transformers v4.22已发布,并包含第一个视频模型!🎥
♥ 526 · ↻ 93
2022-09-04
Open Source
开源
♥ 465 · ↻ 42
2022-08-17
🖌️ Stable Diffusion meets 🧨Diffusers!

Releasing diffusers==0.2.2 with full support of @StabilityAI's Stable Diffusion & schedulers 🔥

Google colab:
👉 https://t.co/H3cnpuHVXN

Code snippet 👇
🖌️ Stable Diffusion 遇见 🧨Diffusers!
♥ 564 · ↻ 115
2022-07-22
🧨Diffusion models have been powering impressive ML apps, enabling DALL-E or Imagen

Introducing 🤗 diffusers: a modular toolbox for diffusion techniques, with a focus on:

🚄Inference pipelines
⏰Schedulers
🏭Models
📃Training examples

https://t.co/BEboa1yQyE
🧨扩散模型一直在为令人印象深刻的 ML 应用提供支持,使 DALL-E 或 Imagen 成为可能
♥ 892 · ↻ 190
2022-07-21
Last week, @MetaAI introduced NLLB-200: a massive translation model supporting 200 languages.

Models are now available through the Hugging Face Hub, using 🤗Transformers' main branch.

Models on the Hub: https://t.co/DTz4kXcUuR

Learn about NLLB-200: https://t.co/9R5Mh5JPSO
上周,@MetaAI 推出了 NLLB-200:一个支持 200 种语言的大型翻译模型。
♥ 453 · ↻ 137
2022-05-18
Machine learning demos are increasingly a vital part of releasing a model. Demos allow anyone, not just ML engineers, to try a model, give feedback on predictions, and build trust

That's why we are thrilled to announce @Gradio 3.0: a grounds-up redesign of the Gradio library 🥳
机器学习演示越来越成为发布模型的重要部分。演示让任何人,而不仅仅是 ML 工程师,能够尝试模型、对预测提供反馈并建立信任
♥ 451 · ↻ 91
2022-05-13
Last week @MetaAI publicly released huge LMs, with up to ☄️30B parameters. Great win for Open-Source🎉

These checkpoints are now in 🤗transformers!
But how to use such big checkpoints?

Introducing Accelerate and
⚡️BIG MODEL INFERENCE⚡️

Load & USE the 30B model in colab (!)⬇️
上周 @MetaAI 公开了大型语言模型,参数高达 ☄️300 亿。开源的伟大胜利🎉
♥ 1,132 · ↻ 231
2021-12-16
💫 Perceiver IO by @DeepMind is now available in 🤗 Transformers!

A general purpose deep learning model that works on any modality and combinations thereof
📜text
🖼️ images
🎥 video
🔊 audio
☁️ point clouds
...
Read more in our blog post: https://t.co/5rXpKkgbh2
💫 @DeepMind 的 Perceiver IO 现已在 🤗 Transformers 中可用!
♥ 585 · ↻ 108
2021-12-10
Transformers v4.13.0 is out and it is *big*:

Vision:
- 🖼️ SegFormer
- 🖨️ ImageGPT

Audio:
- 🔡 Language model support for ASR

Multimodal:
- ⚖️ Vision-Text dual encoders

NLP:
- 🔣 mLUKE
- 🏅 DeBERTa-v3

Trainer:
- 1⃣6⃣ The Trainer now supports BF16/TF32!

🌠New doc frontend 🌠
Transformers v4.13.0 已发布,而且它非常 *大*:
♥ 483 · ↻ 87
2021-11-05
TODAY'S A BIG DAY

Spaces are now publicly available

Build, host, and share your ML apps on @huggingface in just a few minutes.

There's no limit to what you can build. Be creative, and share what you make with the community.

🙏 @streamlit and @gradio

https://t.co/KyehQt3Z8u
今天是个大日子
♥ 486 · ↻ 135
2021-10-21
We're thrilled to partner with https://t.co/6jcElMyqbj to create some great new content for their NLP Specialization on Coursera!

With this update, you can access exciting new material and lectures that cover the state of the art in NLP 🧑‍🏫

https://t.co/iMJQxmGvNU
我们很高兴与 https://t.co/6jcElMyqbj 合作,为他们在 Coursera 上的 NLP 专业课程创建一些精彩的新内容!
♥ 464 · ↻ 76
2021-09-30
EleutherAI's GPT-J is now in 🤗 Transformers: a 6 billion, autoregressive model with crazy generative capabilities!

It shows impressive results in:
- 🧮Arithmetics
- ⌨️Code writing
- 👀NLU
- 📜Paper writing
- ...

Play with it to see how powerful it is:
https://t.co/v784T0qNOT
EleutherAI 的 GPT-J 现已进入 🤗 Transformers:一个拥有 60 亿参数、具有惊人生成能力的自回归模型!
♥ 711 · ↻ 168
2021-09-18
🥁 We can't wait to share our new inference product with you! 🤩

- it achieves 1ms latency on Transformer models 🏎
- you can deploy it in your own infrastructure ⚡️
- we call it: 🤗 Infinity 🚀

📅 Join us for a live event and demo on 9/28!
https://t.co/fvhb86gsG7
🥁 我们迫不及待想与您分享我们的新推理产品!🤩
♥ 475 · ↻ 62
2021-08-31
Document parsing meets 🤗 Transformers!

📄#LayoutLMv2 and #LayoutXLM by @MSFTResearch are now available! 🔥

They're capable of parsing document images (like PDFs) by incorporating text, layout, and visual information, as in the @gradio demo below ⬇️

https://t.co/Hr0kFIPHXW
文档解析遇见 🤗 Transformers!
♥ 672 · ↻ 184
在 X 上查看 @huggingface 更多 →
Unsloth AI
@UnslothAI
海外基建
6 天前
What’s your go-to local model right now?
你目前首选的本地模型是什么?
♥ 582 · ↻ 19
8 天前
1-bit GLM-5.2 GGUF vs. Claude 4.8 Opus vs. GPT-5.5

We gave 3 models the same prompt and compared one-shot outputs.

The 1-bit GLM-5.2 GGUF ran locally on a Mac Studio M3 Ultra with 256GB RAM at ~21.6 tok/s.

Which output do you like best?
GGUF: https://t.co/BMkxswdj5N
Unsloth AI @UnslothAI
GLM-5.2 can now be run locally!🔥

The 2-bit model retains ~82% accuracy after we shrunk it from 1.51TB to 238GB (-84% size).

Run on a 256GB Mac or RAM/VRAM setups.

GLM-5.2 is the strongest open model to date.

Guide: https://t.co/bI7FeeKHDd
GGUF: https://t.co/BMkxswdj5N https:/
1-bit GLM-5.2 GGUF 与 Claude 4.8 Opus 与 GPT-5.5 的对比

我们给三个模型提供了相同的提示,并比较了它们的单次输出结果。

1-bit GLM-5.2 GGUF 在配备256GB RAM的Mac Studio M3 Ultra上本地运行,速度约为21.6 tok/s。

你最喜欢哪个输出结果?
GGUF: https://t.co/BMkxswdj5N
♥ 3,564 · ↻ 411
13 天前
GLM-5.2 can now be run locally!🔥

The 2-bit model retains ~82% accuracy after we shrunk it from 1.51TB to 238GB (-84% size).

Run on a 256GB Mac or RAM/VRAM setups.

GLM-5.2 is the strongest open model to date.

Guide: https://t.co/bI7FeeKHDd
GGUF: https://t.co/BMkxswdj5N
GLM-5.2 现在可以在本地运行了!🔥

2位模型在从1.51TB压缩到238GB(体积减少84%)后,仍保留了约82%的准确率。

可在256GB Mac或RAM/VRAM配置上运行。

GLM-5.2 是迄今为止最强大的开源模型。

指南:https://t.co/bI7FeeKHDd
GGUF:https://t.co/BMkxswdj5N
♥ 7,363 · ↻ 896
16 天前
You can now run Kimi K2.7 Code locally! 🌘

We shrank the 1T model to 325GB (-48%) via Dynamic 2-bit where important layers are upcasted.

Run at >40 tok/s on 330GB RAM/VRAM setups.

Run full precision on 610 GB.

Guide: https://t.co/SXZJ3IHMpY
GGUF: https://t.co/2lpUx7u0r8
Kimi.ai @Kimi_Moonshot
🌘 Kimi-K2.7-Code, our latest coding model, is now released and open-sourced!

🔷 Improved coding & agent performance over K2.6: +21.8% on Kimi Code Bench v2, +11.0% on Program Bench, and +31.5% on MLS Bench Lite.
🔷 Reasoning efficiency: Less overthinking, with 30% lower https://t.
您现在可以在本地运行 Kimi K2.7 Code!🌘

我们通过动态2位技术将1T模型压缩到325GB(减少48%),其中重要层进行了上转换。

在330GB RAM/VRAM配置下运行速度可达>40 tok/s。

在610 GB上可运行全精度版本。

指南:https://t.co/SXZJ3IHMpY
GGUF:https://t.co/2lpUx7u0r8
♥ 2,911 · ↻ 303
18 天前转推
RT @ivanfioravanti: Local AI in action! MiniMax M3 unning locally on a single M3 Ultra 512GB in Unsloth Studio! 🔥

Here UD-Q5_K_XL decoding…
RT @ivanfioravanti: 本地AI实战!MiniMax M3在Unsloth Studio的单个M3 Ultra 512GB上本地运行!🔥

这里UD-Q5_K_XL解码…
♥ 0 · ↻ 15
19 天前
MiniMax M3 can now be run locally!🔥

MiniMax-M3 is a new 428B (23B active) open model with 1M context that performs on par with Gemini 3.1 Pro.

Run Dynamic 2-bit GGUF on 138GB RAM/VRAM or 3-bit on 165GB.

GGUF: https://t.co/lwfWsOBNKl
Guide: https://t.co/EP62nmKK0R
MiniMax (official) @MiniMax_AI
MiniMax M3, Open-Weight, Now On Hugging Face , with only ~428B parameters and ~23B activated parameters

Weights:
https://t.co/g4Ybfa2kWH
MiniMax Sparse Attention:
https://t.co/HcTlWRotG3
MiniMax M3 现在可以在本地运行了!🔥

MiniMax-M3 是一个新型 428B(23B 活跃参数)开放模型,具有 1M 上下文能力,性能与 Gemini 3.1 Pro 相当。

在 138GB RAM/VRAM 上运行动态 2-bit GGUF,或在 165GB 上运行 3-bit。

GGUF: https://t.co/lwfWsOBNKl
指南: https://t.co/EP62nmKK0R
♥ 800 · ↻ 99
19 天前
DiffusionGemma can now run at 2000+ tokens/sec! ⚡

We made local DiffusionGemma inference 1.8× faster.

Run it on 18GB RAM via Unsloth Studio.

GitHub: https://t.co/aZWYAtakBP
Guide: https://t.co/wYLfJWE6kG
Unsloth AI @UnslothAI
Google releases DiffusionGemma.✨
The new 26B-A4B diffusion text model runs locally on 18GB RAM.

It supports high-speed text generation, thinking, image, video and 256K context.

Run and train via Unsloth Studio.

GGUF: https://t.co/ZH0dCJQ59P
Guide: https://t.co/wYLfJWE6kG https
DiffusionGemma 现在可以以 2000+ tokens/sec 的速度运行!⚡

我们将本地 DiffusionGemma 推理速度提高了 1.8 倍。

通过 Unsloth Studio 在 18GB RAM 上运行它。

GitHub: https://t.co/aZWYAtakBP
指南: https://t.co/wYLfJWE6kG
♥ 1,733 · ↻ 186
20 天前
Gemma 4 now runs 2x faster with MTP GGUFs! Run locally on just 6GB RAM. ⚡️

MTP enables Google Gemma 4 run ~1.4–2.2× faster with no accuracy loss.

Gemma 4 12B MTP can run at 162 t/s vs. 52 t/s without MTP. 31B reaches 101 t/s.

GGUFs + Guide: https://t.co/c4gAUlb6YE
Gemma 4 现在使用 MTP GGUFs 运行速度提升 2 倍!仅需 6GB RAM 即可在本地运行。⚡️

MTP 使 Google Gemma 4 的运行速度提升约 1.4-2.2 倍,且不会损失准确性。

Gemma 4 12B MTP 的运行速度可达 162 t/s,而未使用 MTP 时为 52 t/s。31B 版本可达 101 t/s。

GGUFs + 指南:https://t.co/c4gAUlb6YE
♥ 2,165 · ↻ 259
21 天前
Google releases DiffusionGemma.✨
The new 26B-A4B diffusion text model runs locally on 18GB RAM.

It supports high-speed text generation, thinking, image, video and 256K context.

Run and train via Unsloth Studio.

GGUF: https://t.co/ZH0dCJQ59P
Guide: https://t.co/wYLfJWE6kG
Google发布DiffusionGemma。✨
新的26B-A4B扩散文本模型可在18GB RAM上本地运行。

它支持高速文本生成、思考、图像、视频和256K上下文。

通过Unsloth Studio运行和训练。

GGUF: https://t.co/ZH0dCJQ59P
指南: https://t.co/wYLfJWE6kG
♥ 1,848 · ↻ 250
26 天前
Google releases Gemma 4 QAT. ✨
You can now run Gemma 4 at 3x less memory with near original performance.

Quantization-Aware Training (QAT) makes it possible to run Gemma 4 26B-A4B on 16GB RAM.

GGUFs: https://t.co/wQgEocxUId
QAT Guide: https://t.co/Nsm1yeGEHx
Google Gemma @googlegemma
We just dropped Gemma 4 Quantization-Aware Training (QAT) checkpoints on Hugging Face!

All Gemma 4 model sizes and their drafters are now optimized with QAT to cut memory requirements and maximize on-device performance!
Google 发布了 Gemma 4 QAT。✨
现在您可以用少 3 倍的内存运行 Gemma 4,性能接近原始水平。

量化感知训练 (QAT) 使在 16GB RAM 上运行 Gemma 4 26B-A4B 成为可能。

GGUFs: https://t.co/wQgEocxUId
QAT 指南: https://t.co/Nsm1yeGEHx
♥ 2,896 · ↻ 411
27 天前
You can now run NVIDIA Nemotron 3 Ultra, a new 550B open model.

Nemotron-3-Ultra-550B-A55B is NVIDIA's largest LLM yet, with 1M context, frontier coding & chat.

Run 2-bit on 200GB RAM, 3-bit on 256GB, 8-bit on 600GB.

GGUF: https://t.co/5WLT7AqUBA
Guide: https://t.co/APXPndWg5V
NVIDIA AI @NVIDIAAI
Today we're shipping Nemotron 3 Ultra.

A 550B MoE frontier-intelligence open model built for long-running agents.

It delivers 5x faster inference and lowers the cost of complex agentic tasks by up to 30% versus other open frontier models. https://t.co/FEXqvfzQFO
你现在可以运行 NVIDIA Nemotron 3 Ultra,这是一个新的 550B 开源模型。

Nemotron-3-Ultra-550B-A55B 是 NVIDIA 目前最大的 LLM,具有 1M 上下文、前沿编程和聊天功能。

在 200GB RAM 上运行 2-bit,在 256GB 上运行 3-bit,在 600GB 上运行 8-bit。

GGUF: https://t.co/5WLT7AqUBA
指南: https://t.co/APXPndWg5V
♥ 414 · ↻ 46
27 天前
2-bit Gemma 4 12B GGUF, only 4.66 GB on disk, managed to cite 15 sites from a single prompt.

Try this locally on >6GB RAM via Unsloth Studio.

GitHub: https://t.co/aZWYAtakBP
Unsloth AI @UnslothAI
Gemma 4 12B can now run locally on just 8GB RAM via Dynamic GGUFs.

Google's new model, Gemma 4 12B Unified supports image, audio and 256K context.

You can run and train the model via Unsloth Studio.

GGUF: https://t.co/8cL321pVDh
Guide: https://t.co/odRo9WjRpA https://t.co/Ax09
2-bit Gemma 4 12B GGUF,仅需4.66 GB磁盘空间,成功从单个提示中引用了15个网站。
♥ 1,627 · ↻ 193
27 天前回复
Vision and audio support for Gemma 4 12B GGUF is now added.

Please update to the latest version of Unsloth and llama.cpp. 🙏
Gemma 4 12B GGUF现已添加视觉和音频支持。
♥ 70 · ↻ 7
28 天前
Gemma 4 12B can now run locally on just 8GB RAM via Dynamic GGUFs.

Google's new model, Gemma 4 12B Unified supports image, audio and 256K context.

You can run and train the model via Unsloth Studio.

GGUF: https://t.co/8cL321pVDh
Guide: https://t.co/odRo9WjRpA
Gemma 4 12B现在只需8GB RAM即可通过Dynamic GGUFs在本地运行。
♥ 2,814 · ↻ 379
28 天前
Local models are coming to your laptop soon! 🚀

We're excited to partner with @Microsoft to enable millions of developers run local models on Windows!
Windows Developer @windowsdev
Aion 1.0 Plan represents an evolution of what the Windows on-device AI platform is capable of at scale!

Thrilled to partner with @UnslothAI on optimization across our silicon ecosystem.

More #MSBuild news here: https://t.co/EYyEFuBze7 https://t.co/tZyxLJUtJk
本地模型即将登陆您的笔记本电脑!🚀
♥ 542 · ↻ 49
2026-03-18
Introducing Unsloth Studio ✨
A new open-source web UI to train and run LLMs.

• Run models locally on Mac, Windows, Linux
• Train 500+ models 2x faster with 70% less VRAM
• Supports GGUF, vision, audio, embedding models
• Auto-create datasets from PDF, CSV, DOCX
推出Unsloth Studio ✨
♥ 5,453 · ↻ 888
在 X 上查看 @UnslothAI 更多 →
Arena.ai
@arena
海外基建
13 分钟前回复
Fable 5 is now in Agent Mode - we’ll see if it holds the lead as new traces come in. https://t.co/8ujN06t7FN
Fable 5 现已进入代理模式 - 我们将随着新线索的加入,看看它是否能保持领先地位。https://t.co/8ujN06t7FN
♥ 2 · ↻ 0
13 分钟前
Fable 5 is back in the Arena!

When it first debuted, Fable 5 ranked #1 in Agent Arena: our benchmark for real-world, long-horizon agentic performance. Agent Arena evaluates models on millions of real tasks submitted by a global community of users, with access to web search,
Claude @claudeai
Fable 5 is back. https://t.co/9RTGUCcPHy
Fable 5 回到了竞技场!
♥ 18 · ↻ 2
2 小时前转推
RT @felicis: What began as a UC Berkeley research project has become one of the fastest-growing and most trusted AI infrastructure companie…
RT @felicis:始于加州大学伯克利分校的一个研究项目,现已成为增长最快、最受信赖的AI基础设施公司之一...
♥ 0 · ↻ 13
16 小时前转推
RT @arena: Check out first impressions of Claude Sonnet 5 in the Agent Arena on YouTube with @petergostev.

Scores coming soon.

https://…
RT @arena: 在YouTube上与@petergostev一起查看Agent Arena中Claude Sonnet 5的初步印象。
♥ 0 · ↻ 3
17 小时前回复
Check out first impressions of Claude Sonnet 5 in the Agent Arena on YouTube with @petergostev.

Scores coming soon.

https://t.co/ULnLYdYjFX
在YouTube上与@petergostev一起查看Agent Arena中Claude Sonnet 5的初步印象。
♥ 20 · ↻ 3
1 天前
Gemini Omni Flash ranks #2 for Video Edit in the Video Arena!

With only seven models ranked for this capability, @GoogleDeepMind delivers a strong model (1347) that is nearly +40 points above the next best model: HappyHorse 1.0 (1308)

Congrats to @GoogleDeepMind on the release
Google DeepMind @GoogleDeepMind
We’re shipping 2 major releases:

🔘 Nano Banana 2 Lite: our fastest and cheapest Gemini Image model
🔘 Gemini Omni Flash: now available via the Gemini API and in @GoogleAIStudio to help developers generate and edit high-quality videos. https://t.co/fqB2sA5Xyl
Gemini Omni Flash在视频编辑竞技场中排名第2!
♥ 223 · ↻ 15
1 天前
Claude Sonnet 5 is ready for you in the Agent Arena!

In Agent Arena, we measure models on millions of real-world, long-horizon agentic tasks from a global community of users. Models can access web search, filesystem, and terminal tools to complete complex workflows. The
Claude @claudeai
Introducing Claude Sonnet 5, our most agentic Sonnet yet.

It makes plans, uses tools like browsers and terminals, and runs autonomously at a level that just a few months ago required larger and more expensive models. https://t.co/UKK8G7ww5h
Claude Sonnet 5已在Agent Arena中为您准备就绪!
♥ 211 · ↻ 10
1 天前回复
Nano Banana 2 Lite (Gemini 3.1 Flash Lite Image) ranks #5 overall for Text-to-Image. It has also landed in the Image Edit Arena as:
- #9 Multi-Image Edit
- #15 Single-Image Edit
Nano Banana 2 Lite (Gemini 3.1 Flash Lite Image)在文本到图像总体排名中位列第5。它也已进入图像编辑竞技场:
♥ 23 · ↻ 2
1 天前
Gemini 3.1 Flash Lite Image (Nano Banana 2 Lite) has entered the Text-to-Image Arena. It ranks #5 overall.

Google's latest Gemini-3.1-Flash-Lite-Image lands on the Pareto frontier: scoring 1251 at just $0.034/image. This is near flagship quality at a fraction of the price.

The
Google DeepMind @GoogleDeepMind
We’re shipping 2 major releases:

🔘 Nano Banana 2 Lite: our fastest and cheapest Gemini Image model
🔘 Gemini Omni Flash: now available via the Gemini API and in @GoogleAIStudio to help developers generate and edit high-quality videos. https://t.co/fqB2sA5Xyl
Gemini 3.1 Flash Lite Image (Nano Banana 2 Lite)已进入文本到图像竞技场。总体排名第5。
♥ 303 · ↻ 22
1 天前转推
RT @ml_angelopoulos: So, how does Arena make money?

Our platform serves as a real-world CI/CD system for AI models on post-deployment user…
RT @ml_angelopoulos:那么,Arena如何赚钱?
♥ 0 · ↻ 2
2 天前转推
RT @ml_angelopoulos: Arena has crossed $100M in annualized revenue run rate, eight months after launching our evaluation product.

With our…
RT @ml_angelopoulos:Arena的年化收入运行率已突破1亿美元,在我们发布评估产品八个月后。
♥ 0 · ↻ 23
2 天前
Arena reached a $100M annual revenue run rate just 8 months after launching our evaluation product. We started as a research project at UC Berkeley with a simple mission: measure AI progress through real-world use. As AI shifts from chatbots to agents taking on longer,
Anastasios Nikolas Angelopoulos @ml_angelopoulos
Arena has crossed $100M in annualized revenue run rate, eight months after launching our evaluation product.

With our recent release of Agent Mode, millions of users on Arena are doing real work with agents, from coding to document analysis, in long-running, multi-turn sessions
Arena在推出我们的评估产品仅8个月后,就达到了1亿美元的年收入运行率。我们最初是加州大学伯克利分校的一个研究项目,有着简单的使命:通过实际使用来衡量AI的进步。随着AI从聊天机器人转向承担更长任务的智能体,
♥ 466 · ↻ 40
4 天前
HappyHorse 1.1 by @HappyHorseATH is in the Video Arena. (Text-to-Video, Image-to-Video & Video Edit)

HappyHorse 1.0 currently holds top 2-4 ranks across the Video Arena, so let's see how the latest version stacks up. Bring your most creative prompts and get voting. Scores coming
@HappyHorseATH 开发的 HappyHorse 1.1 已进入视频竞技场。(文本转视频、图像转视频和视频编辑)
♥ 110 · ↻ 6
4 天前回复
Listen in as our engineering team walks through best practices around how to handle billions of data points to keep Arena running: https://t.co/UhLaQlncIE
请听我们的工程团队讲解如何处理数十亿个数据点以保持Arena运行的最佳实践:https://t.co/UhLaQlncIE
♥ 3 · ↻ 0
4 天前
Millions of people worldwide bring real-world tasks to Arena - and at that scale, hot/cold storage becomes a hard problem fast.

In this clip, our engineering team walks through some best practices: CDC replication, ephemeral storage tradeoffs, and what it takes to build data
全球数百万人将现实世界的任务带到Arena上 - 在这个规模下,热/冷存储很快就会成为一个难题。
♥ 57 · ↻ 4
27 天前
Introducing Agent Mode: Agentic AI is now measured in the Arena.

Agent Mode can do deep research, create reports, generate images, build websites, debug code, and more.

It completes more complex tasks by using tools like web search, bash in a sandbox environment, image
推出智能体模式:智能体AI现在在竞技场中进行评估。
♥ 529 · ↻ 55
在 X 上查看 @arena 更多 →