Abstract Reasoning Testing

JakeDontDraw on MSN

How well does ChatGPT know art history? A full test across artists and centuries

ChatGPT vs the exam. We ask tough questions about famous paintings, sculpture, architecture, and movements, then score the AI ...

ChatGPT 5 Surpasses Human Score on ARC AGI 2, Thanks to an Unhobbling Manager Layer

Learn how chain-of-thought and a guided meta-system boosted ChatGPT 5’s abstract thinking, so you can pick better tools for complex tasks.

A Psychologist Shares A Science-Inspired Quiz To Test If You’re A ‘Pattern-Seer’

There's a line of thought that equates intelligence with “pattern recognition.” How do you stack up on this unique cognitive ...

Open Heart, Opinion

When cardiology forgets to ask why

Contemporary cardiology faces a paradox: unprecedented technological capability coincides with declining scientific curiosity ...

A Psychologist Shares A Science-Inspired Quiz That Reveals Your Philosophical Orientation

Are you an objectivist or a nihilist? A mystic or a communitarian? Or something else entirely? Here's a fun way to find out.

BMJ Open

A protocol for a rapid realist policy review (RRPR) of the impact of social determinants on self-harm and suicidal thoughts and behaviours in England

Introduction Self-harm and suicidal thoughts and behaviours are a significant public health concern. While individual risk factors have been widely studied, the role of social determinants in shaping ...

GitHub

Support Abstract Reasoning in LLMs

AbstRaL: Teaching LLMs Abstract Reasoning via Reinforcement to Boost Robustness on GSM Benchmarks

New paper pushes back on Apple’s LLM ‘reasoning collapse’ study

AI flunks logic test: Multiple studies reveal illusion of reasoning