AI experts ready 'Humanity's Last Exam' to stump powerful tech

POSTED ON September 17, 2024 • Technology • BY Abiodun Saheed Omodara
Figurines with computers and smartphones are seen in front of the words "Artificial Intelligence AI" in this illustration taken February 19, 2024. Credit: Reuters

A team of technology experts issued a global call on Monday seeking the toughest questions to pose to artificial intelligence systems, which have increasingly handled popular benchmark tests like child's play.

Dubbed 'Humanity's Last Exam,' the project seeks to determine when expert-level AI has arrived.

It aims to stay relevant even as capabilities advance in future years, according to the organisers, a non-profit called the Centre for AI Safety (CAIS) and the startup Scale AI.

The call comes days after the maker of ChatGPT previewed a new model, known as OpenAI o1, which "destroyed the most popular reasoning benchmarks," said Dan Hendrycks, executive director of CAIS and an advisor to Elon Musk's xAI startup.

Hendrycks co-authored two 2021 papers that proposed tests of AI systems that are now widely used, one quizzing them on undergraduate-level knowledge of topics like US history, the other probing models' ability to reason through competition-level math. 

The undergraduate-style test has more downloads from the online AI hub Hugging Face than any other such dataset.

At the time of those papers, AI was giving almost random answers to questions on the exams. "They're now crushed," Hendrycks told Reuters.

As one example, the Claude models from the AI lab Anthropic have gone from scoring about 77% on the undergraduate-level test in 2023 to nearly 89% a year later, according to a prominent capabilities leaderboard.

These common benchmarks have less meaning as a result.

AI has appeared to score poorly on lesser-used tests involving plan formulation and visual pattern-recognition puzzles, according to Stanford University’s AI Index Report from April. 

OpenAI o1 scored around 21% on one version of the pattern-recognition ARC-AGI test, for instance, the ARC organisers said on Friday.

Some AI researchers argue that results like this show planning and abstract reasoning to be better measures of intelligence, though Hendrycks said the visual aspect of ARC makes it less suited to assessing language models. "Humanity’s Last Exam" will require abstract reasoning, he said.

Answers from common benchmarks may also have ended up in data used to train AI systems, industry observers have said. 

Hendrycks said some questions on "Humanity's Last Exam" would remain private to make sure AI systems' answers are not from memorisation.

The exam will include at least 1,000 crowd-sourced questions due on November 1 that are hard for non-experts to answer. 

These will undergo peer review, with winning submissions offered co-authorship and prizes of up to $5,000 sponsored by Scale AI.

"We desperately need harder tests for expert-level models to measure the rapid progress of AI," said Alexandr Wang, Scale's CEO.

One restriction: the organisers want no questions about weapons, which some say would be too dangerous for AI to study.
