Nvidia Nemotron 3 Family of Models
Posted by ewt-nv 22 hours ago
Comments
Comment by red2awn 15 hours ago
* Hybrid MoE: 2-3x faster than pure MoE transformers
* 1M context length
* Trained in NVFP4
* Open source! Pretraining, mid-training, SFT, and RL datasets released (SFT HF link is 404...)
* Open model training recipe (coming soon)
Really appreciate Nvidia being the most open lab, but they should make sure all the links and data are available on day 0.
Also interesting that the model is trained in NVFP4 but the inference weights are FP8.
Comment by wcallahan 13 hours ago
As someone else mentioned, the GPT-OSS models are also quite good. I haven’t found how to make them great yet, but I think they might age well, as the Llama 3 models did, and get better with time!
But for a defined task, I’ve found task compliance, understanding, and tool call success rates to be some of the highest on these Nvidia models.
For example, I have a continuous job that evaluates whether the data for a startup company on aVenture.vc conflates two similar but unrelated companies across news articles, research details, investment rounds, etc., which is a token-hungry ETL task! I recently retested this workflow on the top 15 or so models available today with <125B parameters, and the Nvidia models were among the best performing for this type of work, particularly at avoiding hallucination when given adequate grounding.
Also, re: cost: I run local inference on several machines continuously, in addition to routing through OpenRouter and the frontier providers. I was pleasantly surprised to find that, since I’m otherwise a paying OpenRouter customer, the free Nvidia variant there has quite generous limits, too.
Comment by max002 4 hours ago
Comment by pants2 19 hours ago
However, this looks like it has great potential for cost-effectiveness. As of today it's free to use over API on OpenRouter, so a bit unclear what it'll cost when it's not free, but free is free!
Comment by viraptor 18 hours ago
That's temporary. Cerebras speeds up everything, so if Nemotron is good quality, it's just a matter of time until they add it.
Comment by credit_guy 13 hours ago