Curated Resource

A negative view on o1

My notes

Classic example of a negative view on LinkedIn: "Unlike many others, I don’t think any reinforcement learning or reward algorithm is at play... appears to be a generic Chain-of-Thought (CoT) process that breaks tasks into several steps... Subsequent steps ... generated based on context... subsequent interactions concatenated into the context... fed back into the model along with an updated CoT "script."... essentially the same GPT-4 model, fine-tuned for specific issues... giving the illusion of more sophisticated intelligence... There’s no actual thinking or reasoning happening. It’s still contextual inference, now wrapped in a CoT controller that follows a mostly scripted process with dynamic headings".

A lot of supportive replies followed, before this one: "there are a few misunderstandings here... Q-learning in reinforcement learning (RL) is used during training, not during inference... Once trained... there’s no active RL during inference...
CoT is not "smoke and mirrors"... [used]... to structure reasoning steps ... enhances the model’s ability to handle complex, multi-step tasks... [along with] In-Context Learning ... allow for more coherent, dynamic problem-solving ... These techniques add real value, not just superficial improvements."

Then the whole thing just degenerates into entrenched camps, Pro and Con. A real identity-based polarisation seems to be emerging. Possibly the only other valuable remark I saw before giving up was:

"strange argument... saying to Usain Bolt: “yeah sure you just ran the fastest 100 of any human ever, but you used the legs the same way as everyone else, so I’m not that impressed.”
The O1 is outperforming older models ... call that smokes and mirrors if you like, but it won’t make the real world impact any less."

followup post

The above notes were curated from the full post

