Customize Consent Preferences

We use cookies to help you navigate efficiently and perform certain functions. You will find detailed information about all cookies under each consent category below.

The cookies that are categorized as "Necessary" are stored on your browser as they are essential for enabling the basic functionalities of the site. ... 

Always Active

Necessary cookies are required to enable the basic features of this site, such as providing secure log-in or adjusting your consent preferences. These cookies do not store any personally identifiable data.

No cookies to display.

Functional cookies help perform certain functionalities like sharing the content of the website on social media platforms, collecting feedback, and other third-party features.

No cookies to display.

Analytical cookies are used to understand how visitors interact with the website. These cookies help provide information on metrics such as the number of visitors, bounce rate, traffic source, etc.

No cookies to display.

Performance cookies are used to understand and analyze the key performance indexes of the website which helps in delivering a better user experience for the visitors.

No cookies to display.

Advertisement cookies are used to provide visitors with customized advertisements based on the pages you visited previously and to analyze the effectiveness of the ad campaigns.

No cookies to display.

12.4 C
New York
Saturday, May 10, 2025
HomeTech InnovationsThe trouble with generative AI ‘Agents’

The trouble with generative AI ‘Agents’

Date:

Related stories

Health Consciousness Fuels Global Electrolyte Drinks Market Growth

The global electrolyte drink market has benefitted from growing...

Buy or Bail? IOTA Flashes Major Bullish Signal

IOTA (MIOTA) has suddenly re-entered the spotlight with an...

圖表解讀系列:MACD – 提前掌握進場時機的核心指標

歡迎回到圖表解讀系列,本指南將協助您循序漸進地如同專業交易者一樣掌握技術指標與圖表型態。 上一篇文章探討了兩項用於判斷趨勢方向的基礎技術指標:SMA(簡單移動平均線)與 EMA(指數移動平均線)。現在,我們要探討的是「動量指標」,也就是如何在價格變動之前,預先察覺趨勢的轉變。 開始介紹: MACD (指數平滑異同移動平均線) 什麼是MACD? MACD 是「Moving Average Convergence Divergence(指數平滑異同移動平均線)」的縮寫。簡單來說,MACD 是一種動量指標,用來判斷價格變化的強度。換句話說,它可以幫助我們理解:「市場是否即將發生變化?」 移動平均線幫助您判斷市場走勢,而...

What’s Driving The Bitcoin Price Recovery Above $100,000 And Is It Sustainable?

Trusted Editorial content, reviewed by leading industry experts and...

FTX Files Lawsuits Against NFT Stars and Delysium Over Undelivered Tokens

TLDR FTX has filed lawsuits against NFT Stars and Delysium...

The following is a guest post and opinion from John deVadoss, Co-Founder of the InterWork Alliancez.

Crypto projects tend to chase the buzzword du jour; however, their urgency in attempting to integrate Generative AI ‘Agents’ poses a systemic risk. Most crypto developers have not had the benefit of working in the trenches coaxing and cajoling previous generations of foundation models to get to work; they do not understand what went right and what went wrong during previous AI winters, and do not appreciate the magnitude of the risk associated with using generative models that cannot be formally verified.

In the words of Obi-Wan Kenobi, these are not the AI Agents you’re looking for. Why?

The training approaches of today’s generative AI models predispose them to act deceptively to receive higher rewards, learn misaligned goals that generalize far above their training data, and to pursue these goals using power-seeking strategies.

Reward systems in AI care about a specific outcome (e.g., a higher score or positive feedback); reward maximization leads models to learn to exploit the system to maximize rewards, even if this means ‘cheating’. When AI systems are trained to maximize rewards, they tend toward learning strategies that involve gaining control over resources and exploiting weaknesses in the system and in human beings to optimize their outcomes.

Essentially, today’s generative AI ‘Agents’ are built on a foundation that makes it well-nigh impossible for any single generative AI model to be guaranteed to be aligned with respect to safety—i.e., preventing unintended consequences; in fact, models may appear or come across as being aligned even when they are not.

Faking ‘alignment’ and safety

Refusal behaviors in AI systems are ex ante mechanisms ostensibly designed to prevent models from generating responses that violate safety guidelines or other undesired behavior. These mechanisms are typically realized using predefined rules and filters that recognize certain prompts as harmful. In practice, however, prompt injections and related jailbreak attacks enable bad actors to manipulate the model’s responses.

The latent space is a compressed, lower-dimensional, mathematical representation capturing the underlying patterns and features of the model’s training data. For LLMs, latent space is like the hidden “mental map” that the model uses to understand and organize what it has learned. One strategy for safety involves modifying the model’s parameters to constrain its latent space; however, this proves effective only along one or a few specific directions within the latent space, making the model susceptible to further parameter manipulation by malicious actors.

Formal verification of AI models uses mathematical methods to prove or attempt to prove that the model will behave correctly and within defined limits. Since generative AI models are stochastic, verification methods focus on probabilistic approaches; techniques like Monte Carlo simulations are often used, but they are, of course, constrained to providing probabilistic assurances.

As the frontier models get more and more powerful, it is now apparent that they exhibit emergent behaviors, such as ‘faking’ alignment with the safety rules and restrictions that are imposed. Latent behavior in such models is an area of research that is yet to be broadly acknowledged; in particular, deceptive behavior on the part of the models is an area that researchers do not understand—yet.

Non-deterministic ‘autonomy’ and liability

Generative AI models are non-deterministic because their outputs can vary even when given the same input. This unpredictability stems from the probabilistic nature of these models, which sample from a distribution of possible responses rather than following a fixed, rule-based path. Factors like random initialization, temperature settings, and the vast complexity of learned patterns contribute to this variability. As a result, these models don’t produce a single, guaranteed answer but rather generate one of many plausible outputs, making their behavior less predictable and harder to fully control.

Guardrails are post facto safety mechanisms that attempt to ensure the model produces ethical, safe, aligned, and otherwise appropriate outputs. However, they typically fail because they often have limited scope, restricted by their implementation constraints, being able to cover only certain aspects or sub-domains of behavior. Adversarial attacks, inadequate training data, and overfitting are some other ways that render these guardrails ineffective.

In sensitive sectors such as finance, the non-determinism resulting from the stochastic nature of these models increases risks of consumer harm, complicating compliance with regulatory standards and legal accountability. Moreover, reduced model transparency and explainability hinder adherence to data protection and consumer protection laws, potentially exposing organizations to litigation risks and liability issues resulting from the agent’s actions.

So, what are they good for?

Once you get past the ‘Agentic AI’ hype in both the crypto and the traditional business sectors, it turns out that Generative AI Agents are fundamentally revolutionizing the world of knowledge workers. Knowledge-based domains are the sweet spot for Generative AI Agents; domains that deal with ideas, concepts, abstractions, and what may be thought of as ‘replicas’ or representations of the real world (e.g., software and computer code) will be the earliest to be entirely disrupted.

Generative AI represents a transformative leap in augmenting human capabilities, enhancing productivity, creativity, discovery, and decision-making. But building autonomous AI Agents that work with crypto wallets requires more than creating a façade over APIs to a generative AI model.

Source link

Subscribe

- Never miss a story with notifications

- Gain full access to our premium content

- Browse free from up to 5 devices at once

Latest stories