
AI experts ready 'Humanity's Last Exam' to stump powerful tech

POSTED ON September 17, 2024 • Technology • BY Abiodun Saheed Omodara
Figurines with computers and smartphones are seen in front of the words "Artificial Intelligence AI" in this illustration taken February 19, 2024. Credit: Reuters

A team of technology experts issued a global call on Monday seeking the toughest questions to pose to artificial intelligence systems, which have increasingly handled popular benchmark tests like child's play.

Dubbed "Humanity's Last Exam," the project seeks to determine when expert-level AI has arrived.

It aims to stay relevant even as capabilities advance in future years, according to the organisers, a non-profit called the Center for AI Safety (CAIS) and the startup Scale AI.

The call comes days after the maker of ChatGPT previewed a new model, known as OpenAI o1, which "destroyed the most popular reasoning benchmarks," said Dan Hendrycks, executive director of CAIS and an advisor to Elon Musk's xAI startup.

Hendrycks co-authored two 2021 papers that proposed tests of AI systems that are now widely used, one quizzing them on undergraduate-level knowledge of topics like US history, the other probing models' ability to reason through competition-level math. 

The undergraduate-style test has more downloads from the online AI hub Hugging Face than any such dataset.

At the time of those papers, AI was giving almost random answers to questions on the exams. "They're now crushed," Hendrycks told Reuters.

As one example, the Claude models from the AI lab Anthropic have gone from scoring about 77% on the undergraduate-level test in 2023 to nearly 89% a year later, according to a prominent capabilities leaderboard.

These common benchmarks have less meaning as a result.

AI has appeared to score poorly on lesser-used tests involving plan formulation and visual pattern-recognition puzzles, according to Stanford University’s AI Index Report from April. 

OpenAI o1 scored around 21% on one version of the pattern-recognition ARC-AGI test, for instance, the ARC organisers said on Friday.

Some AI researchers argue that results like this show planning and abstract reasoning to be better measures of intelligence, though Hendrycks said the visual aspect of ARC makes it less suited to assessing language models. "Humanity’s Last Exam" will require abstract reasoning, he said.

Answers from common benchmarks may also have ended up in data used to train AI systems, industry observers have said. 

Hendrycks said some questions on "Humanity's Last Exam" would remain private to make sure AI systems' answers are not from memorisation.

The exam will include at least 1,000 crowd-sourced questions due on November 1 that are hard for non-experts to answer. 

These will undergo peer review, with winning submissions offered co-authorship and prizes of up to $5,000 sponsored by Scale AI.

"We desperately need harder tests for expert-level models to measure the rapid progress of AI," said Alexandr Wang, Scale's CEO.

One restriction: the organisers want no questions about weapons, which some say would be too dangerous for AI to study.
