Ask HN: Feedback on "QSS", a quantized vector search engine written in C
Hi HN,

I've been working on a vector search engine called QSS (Quantized Similarity Search). It's written in C and explores the idea of aggressively quantizing embedding vectors to 1 bit per dimension. It uses XOR + popcount for fast approximate search, followed by re-ranking with cosine similarity on the original vectors.

The main goal is to see how far quantization can be pushed without sacrificing too much search quality, while gaining significantly in memory usage and speed.
How it works:
Embeddings are quantized to 1 bit per dimension (e.g. 300D → 300 bits → ~40 bytes).
Search is done using bitwise XOR and popcount (Hamming distance).
A shortlist is re-ranked using cosine similarity on the original (float) embeddings.
Supports GloVe, Word2Vec, and fastText formats.
Goals:
Analyze the trade-offs between quantization and search accuracy.
Measure the potential speed and memory gains.
Explore how this approach scales to larger datasets.
Preliminary tests:
I've only run a few small-scale tests so far, but the early signs are encouraging:
For some queries (e.g. "hello", "italy"), the top 30 results matched the full-precision cosine search.
On Word2Vec embeddings, the quantized pipeline was up to 18x faster than the standard cosine similarity loop.
These results are anecdotal for now; I'm sharing the project early to get feedback before going deeper into benchmarks.
Other notes:
Word lookup is currently linear and unoptimized; the focus is on the similarity search logic.
Testing has been done single-threaded on a 2018 iMac (3.6 GHz Intel i3).
If you're interested in vector search, quantization, or low-level performance tricks, I'd love your thoughts:
Do you think this kind of aggressive quantization could work at scale?
Are there other fast approximate search techniques you'd recommend exploring?
The repo is here: https://github.com/buddyspencer/QSS
Thanks for reading!
Word lookup is linear and unoptimized for now—focus is on the similarity search logic.<p>Testing has been done single-threaded on a 2018 iMac (3.6 GHz Intel i3).<p>If you're interested in vector search, quantization, or just low-level performance tricks, I'd love your thoughts:<p>Do you think this kind of aggressive quantization could work at scale?<p>Are there other fast approximate search techniques you'd recommend exploring?<p>Repo is here: https://github.com/buddyspencer/QSS<p>Thanks for reading!