Ask HN: How does my use of LLMs affect the training of the underlying models?
I understand that using tools like ChatGPT, Cursor, and Claude Code for software development is likely providing training data to help these LLMs get better at coding (the irony isn't lost on me that I might be contributing to making myself obsolete...).

But I'm curious about the actual mechanics: how exactly does this feedback loop work? When I accept, reject, or modify the code that these models spit out, is that signal fed directly back into training?

Not necessarily against this, just genuinely curious about how the sausage is made.