Ask HN: Feedback on "QSS", a quantized vector search engine written in C

3 points by wmolino, 7 days ago
Hi HN,

I've been working on a vector search engine called QSS (Quantized Similarity Search). It's written in C and explores the idea of aggressively quantizing embedding vectors to 1 bit per dimension. It uses XOR + popcount for fast approximate search, followed by re-ranking using cosine similarity on the original vectors.

The main goal is to see how far you can push quantization without sacrificing too much search quality, while significantly reducing memory usage and improving speed.

How it works:

Embeddings are quantized to 1 bit per dimension (e.g. 300D → 300 bits → ~40 bytes).

Search is done using bitwise XOR and popcount (Hamming distance).

A shortlist is re-ranked using cosine similarity on the original (float) embeddings.

Supports GloVe, Word2Vec, and fastText formats.

Goals:

Analyze the trade-offs between quantization and search accuracy.

Measure potential speed and memory gains.

Explore how this approach scales with larger datasets.

Preliminary tests:

I've only run a few small-scale tests so far, but the early signs are encouraging:

For some queries (e.g. "hello", "italy"), the top 30 results matched the full-precision cosine search.

On Word2Vec embeddings, the quantized pipeline was up to 18× faster than the standard cosine similarity loop.

These results are anecdotal for now. I'm sharing the project early to get feedback before going deeper into benchmarks.

Other notes:

Word lookup is linear and unoptimized for now; the focus is on the similarity search logic.

Testing has been done single-threaded on a 2018 iMac (3.6 GHz Intel i3).

If you're interested in vector search, quantization, or just low-level performance tricks, I'd love your thoughts:

Do you think this kind of aggressive quantization could work at scale?

Are there other fast approximate search techniques you'd recommend exploring?

Repo is here: https://github.com/buddyspencer/QSS

Thanks for reading!