Hacker News Summary

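Article Summary

The article exposes a fake-star economy on GitHub: 6 million fake stars have been bought and sold at $0.06 per click, while venture capital firms treat GitHub popularity as proof of a project's traction. The author found traces of this pattern by analyzing 20 repositories.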

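Scale of fake stars
- Carnegie Mellon University research found 6 million fake stars on GitHub, spread across 18,617 repositories and 301,000 accounts. AI/LLM projects are the largest non-malicious category of star buying.
- Star buying surged in 2024: 16.66% of repositories with 50 or more stars were involved in fake-star campaigns, compared with almost none before 2022.

The black market
- Openly listed prices: stars sell for $0.03 to $0.85 each through platforms such as Fiverr and Telegram, with no dark web required.
- Account tiers: cheap services ($0.03-$0.10 per star) use newly registered empty accounts, while premium services ($0.80-$0.90 per star) supply aged accounts with a history of activity.

VCs fuel the trend
- Investment criteria: according to Redpoint data, the median star count of projects raising a seed round is 2,850, and VC firms use automated tools to track fast-growing repositories.
- Notable case: the blockchain project Union Labs was named a "hottest open-source startup" by Runa Capital, yet 47.4% of its stars are suspected to be fake.

Detection signals
- Fork-to-star ratio: about 0.16 for healthy projects, as low as 0.02 for projects with purchased stars (for example, FreeDomain has 157k stars but only 2,676 forks).
- Account characteristics: in repositories with purchased stars, 36-76% of stargazers have zero followers, and 28% are completely empty accounts.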
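
The two detection signals above can be turned into a quick screening check. The following is a minimal sketch, not the method used by the article's author or the CMU researchers. The function names and the two cut-off constants are assumptions chosen for illustration, loosely placed between the healthy and manipulated figures quoted above. The inputs (fork count, star count, and one follower count per stargazer) are assumed to have been collected already.

```python
# Illustrative cut-offs (assumptions, not calibrated values). The summary
# quotes about 0.16 for healthy projects and about 0.02 for manipulated
# ones, and 36-76% zero-follower stargazers in manipulated repositories.
LOW_FORK_STAR_RATIO = 0.05
HIGH_ZERO_FOLLOWER_SHARE = 0.36


def fork_star_ratio(forks: int, stars: int) -> float:
    """Forks divided by stars; 0.0 when the repository has no stars."""
    return forks / stars if stars else 0.0


def zero_follower_share(follower_counts: list[int]) -> float:
    """Fraction of stargazers whose follower count is zero."""
    if not follower_counts:
        return 0.0
    zero = sum(1 for count in follower_counts if count == 0)
    return zero / len(follower_counts)


def suspicion_flags(forks: int, stars: int, follower_counts: list[int]) -> list[str]:
    """Return the names of the signals that look suspicious for one repository."""
    flags = []
    if stars and fork_star_ratio(forks, stars) < LOW_FORK_STAR_RATIO:
        flags.append("low fork/star ratio")
    if follower_counts and zero_follower_share(follower_counts) >= HIGH_ZERO_FOLLOWER_SHARE:
        flags.append("many zero-follower stargazers")
    return flags
```

With the FreeDomain figures quoted above, `fork_star_ratio(2676, 157_000)` comes out at roughly 0.017, which falls below the illustrative cut-off. A flag is only a prompt for a closer look, since some legitimate projects are starred far more often than they are forked.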

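Although GitHub removed 90% of the repositories involved, it cleaned up only 57% of the accounts used for star buying, and it has not implemented fundamental reforms such as the "weighted popularity algorithm" the researchers recommended.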
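
The summary does not say how the recommended weighted metric would be computed; a comment quoted further down describes it as based on network centrality. As a rough illustration of that idea only, the sketch below weights each star by the stargazer's PageRank in a "who follows whom" graph. PageRank is swapped in here as one familiar centrality measure, not as the researchers' actual proposal, and the function names and account names are hypothetical.

```python
def pagerank(follows: dict[str, list[str]], damping: float = 0.85,
             iterations: int = 50) -> dict[str, float]:
    """Power-iteration PageRank; follows[a] lists the accounts that a follows."""
    nodes = set(follows)
    for targets in follows.values():
        nodes.update(targets)
    if not nodes:
        return {}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by accounts that follow nobody is spread evenly.
        dangling = sum(rank[node] for node in nodes if not follows.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for node, targets in follows.items():
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


def weighted_stars(stargazers: list[str], rank: dict[str, float]) -> float:
    """Sum of stargazer ranks, scaled so an average account counts as 1 star."""
    n = len(rank)
    return sum(rank.get(user, 0.0) * n for user in stargazers)


# Hypothetical graph: three connected accounts and two isolated ones.
follows = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "carol"],
    "carol": ["alice", "bob"],
    "bot1": [],
    "bot2": [],
}
rank = pagerank(follows)
organic = weighted_stars(["alice", "bob"], rank)
purchased = weighted_stars(["bot1", "bot2"], rank)
```

Both repositories in the example have two raw stars, but the stars from the isolated accounts carry less weight than the stars from the connected ones. A production metric would need a far larger graph and defenses against bots following each other, which this sketch does not attempt.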


Summary of the comments:

Main point: Star counts are easy to manipulate and do not accurately reflect a project's quality or activity.
Key quotes:
- "Seen this firsthand, repos with hundreds of stars and zero meaningful commits or issues." (comment 1)
- "GitHub stars are a meaningless metric... People star things because they want to be seen as part of the in-crowd." (comment 19)

Main point: Venture capital (VC) firms rely too heavily on stars when evaluating projects and overlook more important factors.
Key quotes:
- "Many VCs write internal scraping programs to identify fast growing github projects for sourcing, and the most common metric they look toward is stars." (comment 18)
- "VCs explicitly use stars as sourcing signals... GitHub's own ratings are easily manipulated." (comment 11)

Main point: Developers should look at a project's actual activity, code quality, and ability to solve problems.
Key quotes:
- "I look at the list of contributors, their activities and the bug reports / issues." (comment 4)
- "The real metric is: does it solve my problem, and is the maintainer still responding to issues?" (comment 10)

Main point: The star system is widely manipulated and has even become part of a black market.
Key quotes:
- "Seen this happen first-hand with mid-to-large open source projects that sometimes 'sponsor' hackathons, literally setting a task to 'star the repo'." (comment 8)
- "All the numbers seem fake; whether it's number of users, number of likes, number of stars..." (comment 16)

Main point: Commenters suggest more sophisticated evaluation methods, such as a weighted metric based on network centrality.
Key quotes:
- "The CMU researchers recommended GitHub adopt a weighted popularity metric based on network centrality rather than raw star counts." (comment 17)
- "It’s time we focus on qualitative metrics instead." (comment 16)

Main point: Some developers use stars as bookmarks or as a first-pass filter, not as a quality indicator.
Key quotes:
- "I usually use stars as a bookmark list to visit later." (comment 9)
- "I look at the starts when choosing dependencies, it's a first filter for sure." (comment 5)

Main point: Star manipulation is only one part of a broader problem of signal manipulation.
Key quotes:
- "If a metric or signal matters, there is already an ecosystem built to fake it." (comment 29)
- "This all smells like BS... This smells like bait for hating on people that get investment." (comment 18)

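Conclusion: Commenters broadly agree that the GitHub star system is seriously flawed: it is easy to manipulate and does not accurately reflect project quality. Developers prefer to look at actual activity, code quality, and whether a project solves their problem, while investors' reliance on stars is seen as a lazy or unprofessional way to evaluate projects. Commenters recommend more sophisticated evaluation methods and caution that signal manipulation is a widespread problem.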