Built a memory-efficient Python library for large-scale TF-IDF
Posted by jspuri 13 hours ago
Comments
Comment by jspuri 13 hours ago
I've been playing around with C++ for the last few months and wanted to scale this specific library that we usually use for NLP and text analysis.
The library is very useful, but it often fails on datasets larger than local RAM because it needs the entire dataset in memory.
This library has its constraints, but it can still do the job on machines with as little as 4 GB of RAM.
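To illustrate the general idea of computing TF-IDF without holding the whole dataset in memory, here is a minimal two-pass streaming sketch in pure Python. It is not this library's actual implementation or API: the function names (`tokenize`, `document_frequencies`, `tfidf_stream`) are hypothetical, the tokenizer is a naive whitespace split, and the weighting assumed is the plain `tf * log(N / df)` form with no smoothing or normalization.

```python
import math
from collections import Counter
from typing import Dict, Iterable, Iterator, List, Tuple


def tokenize(text: str) -> List[str]:
    """Naive tokenizer (assumption): lowercase, split on whitespace."""
    return text.lower().split()


def document_frequencies(docs: Iterable[str]) -> Tuple[Counter, int]:
    """Pass 1: count how many documents contain each term.

    Only the term -> count table is kept in memory, never the documents.
    """
    df: Counter = Counter()
    n_docs = 0
    for doc in docs:
        df.update(set(tokenize(doc)))  # each term counted once per document
        n_docs += 1
    return df, n_docs


def tfidf_stream(
    docs: Iterable[str], df: Counter, n_docs: int
) -> Iterator[Dict[str, float]]:
    """Pass 2: yield one sparse TF-IDF vector (a dict) per document."""
    for doc in docs:
        counts = Counter(tokenize(doc))
        total = sum(counts.values())
        yield {
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }


# Hypothetical usage: in practice `docs` would be re-read lazily from disk
# for each pass (for example, one document per line of a file).
docs = ["the cat sat", "the dog barked"]
df, n_docs = document_frequencies(docs)
vectors = list(tfidf_stream(docs, df, n_docs))
```

The trade-off in this sketch is that the corpus is read twice and the vocabulary table still has to fit in RAM, so memory grows with the number of distinct terms rather than with the number of documents.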