Reviving Papers with Code
Posted by nielz_r 2 days ago
Comments
Comment by jeffreysmith 2 hours ago
It was super sad to see FB/M abandon the original mission of what PwC was building towards and let the original community resource rot. During the good times, we always talked about how PwC related to HF. So, I think there is a sort of poetry to PwC winding up as part of HF, where they probably always belonged. No company is perfect, but HF has been a better-than-average steward of open source and community resources.
For the younger folks on this thread, you probably have no real feel for just how frustratingly inefficient AI/ML research used to be before people like Robert and Ross of PwC came along to start to bring structure, sanity, and reproducibility to the information needed to do work of this kind. And of course, Clem, Julien, and Thomas of HF kicked off an even bigger effort to tame the previously scattered workflow of open AI research into some sort of sane stack.
It's clear that, in 2026, what PwC could be is something much more evolved than what we were able to do back in the day. LLMs + PwC is a huge design space. I hope nielz_r and friends at HF are able to make something truly useful for the community. AI research has gotten both way easier and much harder. For example, we have a Fable, but Anthro won't let us use it to forward our science. Community resources for research are still very much needed.
Best of luck Son of PwC. May you thrive.
Comment by peterfirefly 41 seconds ago
Comment by nielz_r 2 days ago
Niels here from the open-source team at Hugging Face. Like many others, I was a huge fan of paperswithcode, a website that made it easy to find the state-of-the-art (SOTA) across any domain of AI, from computer vision to language models to time-series forecasting. Sadly, that website is no longer maintained after its acquisition by Meta.
Hence, I've been working on reviving it. I obviously use AI agents to parse papers at scale and automatically generate leaderboards (for now I'm the one verifying results). So far, I've only parsed high-impact papers that I know are SOTA, like Qwen 3.5 and 3.6, RF-DETR for object detection, DINOv3, SOTA embedding models from the MTEB leaderboard, the Open ASR Leaderboard for automatic speech recognition models, etc.
For now, it includes the following:
> trending papers by default based on Github star velocity
> categorization by domain, e.g., [OCR](https://paperswithcode.co/tasks/ocr)
> methods, popular techniques used across AI papers, which PwC used to have as well, like [RLVR](https://paperswithcode.co/methods/rlvr) and
> eval results for high-impact papers, see e.g., Qwen 3.5 at the bottom
> leaderboards for each domain, e.g., MMTEB or COCO val 2017
> conferences, like [CVPR 2026](https://paperswithcode.co/conferences/cvpr-2026)
> support for citation counts (you can also see the most cited papers by domain!)
> automatically linked GitHub repos, project page URLs, and artifacts (+ multiple repos are supported on a paper page)
> support for external papers beyond Arxiv, see e.g., [DeepSeek v4](https://paperswithcode.co/paper/82956)
> Harness reports for coding agent benchmarks, e.g., Terminal Bench
> "Sign in with HF" and Storage Buckets are used to store thumbnails, paper PDFs, and overall data backups.
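To make the "star velocity" item above concrete, here is a minimal sketch of one way such a ranking could be computed. It is not the site's actual implementation: the `(date, total_stars)` snapshot format and the function names `star_velocity` and `rank_trending` are assumptions made up for illustration.

```python
from datetime import date


def star_velocity(snapshots):
    """Average stars gained per day between the earliest and latest snapshot.

    `snapshots` is a list of (date, total_stars) pairs. This input shape is
    a hypothetical choice for the sketch, not a known data format.
    """
    if len(snapshots) < 2:
        return 0.0
    ordered = sorted(snapshots)
    (first_day, first_stars), (last_day, last_stars) = ordered[0], ordered[-1]
    days = (last_day - first_day).days
    if days <= 0:
        return 0.0
    return (last_stars - first_stars) / days


def rank_trending(repos):
    """Return repo names sorted by star velocity, fastest first.

    `repos` maps a repo name to its list of snapshots.
    """
    return sorted(repos, key=lambda name: star_velocity(repos[name]), reverse=True)
```

A velocity-based ranking favors repos that are gaining stars quickly right now over repos that are merely large, which is presumably the point of a "trending" view.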
I'm curious about your feedback + feature requests!
Try it at https://paperswithcode.co
Comment by cyril_st_john 3 hours ago
Comment by vjsrinivas 4 hours ago
Comment by addandsubtract 1 hour ago
Wait, I thought it was acquired by Huggingface, because that's where the domain points to: https://huggingface.co/papers/trending
Anyway, as a huge fan of PWC, I'm glad to see it revived! One of the main annoyances of the old PWC was the searchability / discoverability of papers. I hope that now, you can create embeddings of the paper (summaries) to improve the search and make finding related papers easier.
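As a rough sketch of the embedding-based search idea, the snippet below ranks paper summaries by cosine similarity to a query. The `embed` function here is a toy bag-of-words stand-in, not a real embedding model, and all names (`embed`, `cosine`, `search`) are hypothetical. A real system would swap in learned embeddings and a vector index.

```python
import math
from collections import Counter


def embed(text):
    # Toy stand-in for a real embedding model: a sparse word-count vector.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as Counters.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, summaries, top_k=3):
    """Return up to `top_k` paper titles whose summaries best match `query`.

    `summaries` maps a paper title to its summary text. Papers with zero
    similarity to the query are dropped.
    """
    q = embed(query)
    scored = [(cosine(q, embed(text)), title) for title, text in summaries.items()]
    scored.sort(reverse=True)
    return [title for score, title in scored[:top_k] if score > 0]
```

The same similarity function could also power a "related papers" list by using one paper's summary as the query against all the others.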
Comment by somethingsome 4 hours ago
It would be lovely to parse which datasets/benchmarks were used in the comparisons and select papers by dataset!
In many fields, the datasets vary greatly depending on the subfield, and it's very difficult to find what other benchmarks could be used.
Comment by 2ap 5 hours ago
Comment by Ajoha 6 hours ago
Comment by barrenko 5 hours ago
Comment by caldarons 7 hours ago
One feature I would love is to get notified via email when new papers are added (or periodically, once a week/daily).
Comment by abidlabs 3 minutes ago
Comment by adithyaharish 5 hours ago
Comment by wanderlust123 7 hours ago
Comment by sairali123 6 hours ago
Comment by Sharlin 4 hours ago
Comment by jekude 3 hours ago
Comment by imadr 1 hour ago
Comment by quibono 4 hours ago
Comment by henrythewasp 4 hours ago
Comment by addandsubtract 1 hour ago
Comment by lalaland1125 2 hours ago
Comment by nicce 2 hours ago
Comment by steinvakt2 5 hours ago
Comment by jamoio 5 hours ago
Comment by kozzion 6 hours ago