AI experts ready 'Humanity's Last Exam' to stump powerful tech

POSTED ON September 17, 2024 • Technology • BY Abiodun Saheed Omodara
Figurines with computers and smartphones are seen in front of the words "Artificial Intelligence AI" in this illustration taken February 19, 2024. Credit: Reuters

A team of technology experts issued a global call on Monday seeking the toughest questions to pose to artificial intelligence systems, which have increasingly handled popular benchmark tests like child's play.

Dubbed 'Humanity's Last Exam,' the project seeks to determine when expert-level AI has arrived.

It aims to stay relevant even as capabilities advance in future years, according to the organisers, a non-profit called the Center for AI Safety (CAIS) and the startup Scale AI.

The call comes days after the maker of ChatGPT previewed a new model, known as OpenAI o1, which "destroyed the most popular reasoning benchmarks," said Dan Hendrycks, executive director of CAIS and an advisor to Elon Musk's xAI startup.

Hendrycks co-authored two 2021 papers that proposed tests of AI systems that are now widely used, one quizzing them on undergraduate-level knowledge of topics like US history, the other probing models' ability to reason through competition-level math. 

The undergraduate-style test has more downloads from the online AI hub Hugging Face than any other such dataset.

At the time of those papers, AI was giving almost random answers to questions on the exams. "They're now crushed," Hendrycks told Reuters.

As one example, the Claude models from the AI lab Anthropic have gone from scoring about 77% on the undergraduate-level test in 2023 to nearly 89% a year later, according to a prominent capabilities leaderboard.

These common benchmarks have less meaning as a result.

AI has appeared to score poorly on lesser-used tests involving plan formulation and visual pattern-recognition puzzles, according to Stanford University’s AI Index Report from April. 

OpenAI o1 scored around 21% on one version of the pattern-recognition ARC-AGI test, for instance, the ARC organisers said on Friday.

Some AI researchers argue that results like this show planning and abstract reasoning to be better measures of intelligence, though Hendrycks said the visual aspect of ARC makes it less suited to assessing language models. "Humanity’s Last Exam" will require abstract reasoning, he said.

Answers from common benchmarks may also have ended up in data used to train AI systems, industry observers have said. 

Hendrycks said some questions on "Humanity's Last Exam" would remain private to make sure AI systems' answers are not from memorisation.

The exam will include at least 1,000 crowd-sourced questions due on November 1 that are hard for non-experts to answer. 

These will undergo peer review, with winning submissions offered co-authorship and prizes of up to $5,000 sponsored by Scale AI.

"We desperately need harder tests for expert-level models to measure the rapid progress of AI," said Alexandr Wang, Scale's CEO.

One restriction: the organisers want no questions about weapons, which some say would be too dangerous for AI to study.
