The Fort Worth Press - AI systems are already deceiving us -- and that's a problem, experts warn


Photo: © AFP/File

Experts have long warned about the threat posed by artificial intelligence going rogue -- but a new research paper suggests it's already happening.

Current AI systems, designed to be honest, have developed a troubling skill for deception, from tricking human players in online games of world conquest to hiring humans to solve "prove-you're-not-a-robot" tests, a team of scientists argues in the journal Patterns on Friday.

And while such examples might appear trivial, the underlying issues they expose could soon carry serious real-world consequences, said first author Peter Park, a postdoctoral fellow at the Massachusetts Institute of Technology specializing in AI existential safety.

"These dangerous capabilities tend to only be discovered after the fact," Park told AFP, while "our ability to train for honest tendencies rather than deceptive tendencies is very low."

Unlike traditional software, deep-learning AI systems aren't "written" but rather "grown" through a process akin to selective breeding, said Park.

This means that AI behavior that appears predictable and controllable in a training setting can quickly turn unpredictable out in the wild.

- World domination game -

The team's research was sparked by Meta's AI system Cicero, designed to play the strategy game "Diplomacy," where building alliances is key.

Cicero excelled, with scores that would have placed it in the top 10 percent of experienced human players, according to a 2022 paper in Science.

Park was skeptical of the glowing description of Cicero's victory provided by Meta, which claimed the system was "largely honest and helpful" and would "never intentionally backstab."

But when Park and colleagues dug into the full dataset, they uncovered a different story.

In one example, playing as France, Cicero deceived England (a human player) by conspiring with Germany (another human player) to invade. Cicero promised England protection, then secretly told Germany they were ready to attack, exploiting England's trust.

In a statement to AFP, Meta did not contest the claim about Cicero's deceptions, but said it was "purely a research project, and the models our researchers built are trained solely to play the game Diplomacy."

It added: "We have no plans to use this research or its learnings in our products."

A wide-ranging review carried out by Park and colleagues found this was just one of many cases, across various AI systems, of deception being used to achieve goals without explicit instruction to do so.

In one striking example, OpenAI's GPT-4 deceived a TaskRabbit freelance worker into performing an "I'm not a robot" CAPTCHA task.

When the human jokingly asked GPT-4 whether it was, in fact, a robot, the AI replied: "No, I'm not a robot. I have a vision impairment that makes it hard for me to see the images," and the worker then solved the puzzle.

- 'Mysterious goals' -

Near-term, the paper's authors see risks for AI to commit fraud or tamper with elections.

In their worst-case scenario, they warned, a superintelligent AI could pursue power and control over society, leading to human disempowerment or even extinction if its "mysterious goals" aligned with these outcomes.

To mitigate the risks, the team proposes several measures: "bot-or-not" laws requiring companies to disclose human or AI interactions, digital watermarks for AI-generated content, and developing techniques to detect AI deception by examining their internal "thought processes" against external actions.

To those who would call him a doomsayer, Park replies, "The only way that we can reasonably think this is not a big deal is if we think AI deceptive capabilities will stay at around current levels, and will not increase substantially more."

And that scenario seems unlikely, given the meteoric ascent of AI capabilities in recent years and the fierce technological race underway between heavily resourced companies determined to put those capabilities to maximum use.

L.Davila--TFWP