Ask HN: Feedback on "QSS", a quantized vector search engine written in C
Hi HN,

I've been working on a vector search engine called QSS (Quantized Similarity Search). It's written in C and explores the idea of aggressively quantizing embedding vectors to 1 bit per dimension. It uses XOR + popcount for fast approximate search, followed by re-ranking with cosine similarity on the original vectors.

The main goal is to see how far quantization can be pushed without sacrificing too much search quality, while gaining significantly in memory usage and speed.
How it works:
Embeddings are quantized to 1 bit per dimension (e.g. 300D → 300 bits → ~40 bytes).
Search is done using bitwise XOR and popcount (Hamming distance).
A shortlist is re-ranked using cosine similarity on the original (float) embeddings.
Supports GloVe, Word2Vec, and fastText formats.
Goals:
Analyze the trade-offs between quantization and search accuracy.
Measure the potential speed and memory gains.
Explore how this approach scales to larger datasets.
Preliminary tests:
I've only run a few small-scale tests so far, but the early signs are encouraging:
For some queries (e.g. "hello", "italy"), the top 30 results matched the full-precision cosine search.
On Word2Vec embeddings, the quantized pipeline was up to 18x faster than the standard cosine similarity loop.
These results are anecdotal for now; I'm sharing the project early to get feedback before going deeper into benchmarks.
Other notes:
Word lookup is currently linear and unoptimized; the focus is on the similarity search logic.
Testing has been done single-threaded on a 2018 iMac (3.6 GHz Intel i3).
If you're interested in vector search, quantization, or low-level performance tricks, I'd love your thoughts:
Do you think this kind of aggressive quantization could work at scale?
Are there other fast approximate search techniques you'd recommend exploring?
The repo is here: https://github.com/buddyspencer/QSS
Thanks for reading!
Word lookup is linear and unoptimized for now—focus is on the similarity search logic.<p>Testing has been done single-threaded on a 2018 iMac (3.6 GHz Intel i3).<p>If you're interested in vector search, quantization, or just low-level performance tricks, I'd love your thoughts:<p>Do you think this kind of aggressive quantization could work at scale?<p>Are there other fast approximate search techniques you'd recommend exploring?<p>Repo is here: https://github.com/buddyspencer/QSS<p>Thanks for reading!