Chris Manning (one of the top 3 NLP/Machine Learning researchers in the world) believes the DeepSeek $6M training cost, due to the optimizations discussed in their paper
Tranquil Eyes
'We're in this bizarre world where the best way to learn about LLMs... is to read papers by Chinese companies. I do not think this is a good state of the world.' - Dr Chris Manning. US labs keeping their architectures and algorithms secret is ultimately hurting AI development in the US.
Sonnet best model for minimizing hallucinations? "If you use Sonnet 3.5 as the model choice within Perplexity, it's very difficult to find a hallucination. I'm not saying it's impossible, but it dramatically reduced the rate of hallucinations."
Sources for conflict resolution for engineers course/seminar?
How do you think Midjourney's new personalization parameter works?
New Personalization (--p) Feature Release!
What's the most effective training for multi-GPU? DeepSpeed vs Unsloth multi-GPU training?
[D] ACL 2024, my first major conference. What should I know?
The Truth About LLMs
[D] Has the tech industry still not recovered, or am I that bad?
GPTFast: Accelerate your Hugging Face Transformers 6-7x. Native to Hugging Face and PyTorch.
Best practice for mass, multi-GPU inference?
How do you handle cases where you already have LoRA weights and want to re-apply them to the model?
Unsloth question: How do you handle cases where you already have LoRA weights and want to re-apply them to the model?
Unsloth, what's the catch? Seems too good to be true.
Is Mistral-Medium ever going to be open source, or is it closed for good?
How to identify escape character indices in a string in Python?
[D] How to fine-tune LLMs using DeepSpeed without OOM issues
[D] Is a virtual pass worth it for NeurIPS at this point?
[D] Alternatives to this sub?
[D] What industries/sectors do you think could still benefit from ML that don't already have much ML application?
[Article] Ayahuasca improves emotion dysregulation in a community sample and in individuals with borderline-like traits
[D] 2020 Residency Applicants Discussion Thread
Advice wanted, new to NLP and need to classify emails at work in Python