New AI attack technique EchoGram exploits guardrails in LLMs

This title was summarized by AI from the post below.

15,422 followers

🚨 AI guardrails aren’t as safe as they seem. HiddenLayer researchers have discovered a new technique, EchoGram, that can manipulate the very defenses meant to protect large language models like GPT-5, Claude, and Gemini from malicious input. By exploiting similarities in how most guardrails are trained, EchoGram can flip model verdicts, causing them to miss real threats or trigger waves of false positives that erode trust in AI safety systems. Our findings show that while AI defenses are advancing, shared training methods have created systemic vulnerabilities that attackers can exploit across platforms. EchoGram sheds light on the need for diverse, adaptive, and independently validated security layers to keep pace with rapidly evolving threats. Read the full breakdown of how EchoGram works and what it means for the future of AI security: 👉 https://lnkd.in/gBje-fxq #AIsecurity #Cybersecurity #LLM #MachineLearning #AdversarialAI #EchoGram

2 Comments

James Thornton

From Trust to Proof | Defense & Health Cybersecurity | CISSP, PE, PMP | 🇺🇸 Veteran

Thank you for the breakdown on how guardrails can work and EchoGram!

Jawad Dar

Creative Consultant

Cool!

See more comments

To view or add a comment, sign in

More Relevant Posts

Analytics Vidhya

207,731 followers
3w Edited
Report this post
⚠️ New Research Alert! Anthropic, in collaboration with the UK’s Artificial Intelligence Security Institute and the Alan Turing Institute, has uncovered a major AI security concern — their latest paper reveals that injecting as few as 250 malicious documents can create a backdoor vulnerability in a large language model, no matter its size or dataset. This finding reshapes what we know about data poisoning attacks and their potential to compromise AI systems. Curious to see what this means for AI safety and how experts are working to counter it? 👉 Read the full breakdown by Shaik Hamzah here: https://lnkd.in/gNfnp3XJ #AISafety #Anthropic #ArtificialIntelligence #MachineLearning #Cybersecurity
Like Comment
To view or add a comment, sign in
Noveo.AI

7,167 followers
3w
Report this post
In today's cybersecurity landscape, AI acts as both a spear and a shield. As Rotimi Akinyele explores in 'The Double Life of AI in Cybersecurity', this dual capability of AI creates a constant arms race between defenders and attackers. While AI enhances our defenses with unprecedented speed and precision, it simultaneously empowers attackers to craft more sophisticated scams. The key is leveraging AI responsibly alongside human insight to stay ahead. Embrace AI's dual nature and innovate resilient defenses to build a trustworthy future. #Cybersecurity #AI #Innovation
Like Comment
To view or add a comment, sign in
Licht-AI

113 followers
2w
Report this post
Just 250 malicious records can poison an entire Large Language Model (LLM)! A recent research paper has revealed that introducing as few as 250 malicious data samples during training can drastically alter an LLM’s behavior — leading it to spread misinformation, generate biased results, or even self-learn harmful patterns. This discovery highlights a critical reality: 👉 Data integrity is the backbone of safe AI. 👉 Even the most advanced models are only as secure as the data they learn from. At Litchai, we’re deeply focused on building trustworthy AI agents — systems that not only automate but also protect your operations from hidden data threats. 🧠 Our belief: “A smarter AI must also be a safer AI.” #Litchai #AI #LLMSecurity #DataPoisoning #ArtificialIntelligence #Cybersecurity #AIagents #ResponsibleAI
Like Comment
To view or add a comment, sign in
Utsav Pandya MBA

MIT | Ex-Founder| Tesla | Audi | Skoda| Jeep
1w
Report this post
🛡️ Prompt Injection: The Security Challenge Defining AI's Future OpenAI just published "Prompt Injection 1x1" - a comprehensive guide to one of AI's most critical vulnerabilities. This isn't theoretical - it's essential reading for anyone deploying AI in production. Why prompt injection matters: ⚠️ Can manipulate AI system behavior 🔐 Threatens data security and privacy 💼 Critical for regulated industries 🎯 Requires proactive defense strategies 📋 Best practices for safe deployment As AI becomes mission-critical infrastructure, understanding these attack vectors isn't optional. It's the foundation of responsible AI deployment. #OpenAI #AI #Security #CyberSecurity #RiskManagement
Like Comment
To view or add a comment, sign in
Pamela Gupta
3w Edited
Report this post
Join me and MITRE's Walker Lee Dimon on AI Cybersecurity - The landscape of AI risk is evolving at an accelerated rate, demanding a security framework built specifically for the unique attack surfaces of Machine Learning and Generative AI. Walker Lee Dimon is focused on advancing security for rapidly evolving AI system attacks. I posed a “Lightening Question - one AI security myth to retire, the most under-hyped attack vector ?” Walker’s response may surprise you. Listen to the entire podcast on platform of your choice https://lnkd.in/efiv5wi4 #aicybersecurity #MITREATLAS #AI #trustworthyai #aigovernance #aithoughtleadership

6 Comments
Like Comment
To view or add a comment, sign in
Daniel Chen
3w
Report this post
OpenAI just announced Aardvark, an "agentic security researcher" powered by GPT-5. It's an AI agent designed to autonomously analyze code repositories, find vulnerabilities, validate their exploitability, and even propose patches—all by reasoning through the code like a human security researcher. This seems like a major move into automated security. It raises a big question: what do you think this will do to the AppSec world? How will this impact established players like Snyk, Semgrep, and others? Will this be a direct competitor, a powerful tool for them to integrate, or a fundamental shift in how we find vulnerabilities? Check out the full announcement here: https://lnkd.in/dzRERQ44 Curious to hear your thoughts. #OpenAI #Aardvark #AppSec #Cybersecurity #DevSecOps #AI #ApplicationSecurity
Like Comment
To view or add a comment, sign in
Seth Robinson

Penn State Graduate with a B.S. in Cybersecurity Analytics and Operations
3w
Report this post
AI vs AI The world has entered its “AI vs. AI” era. Attackers are leveraging generative AI to craft hyper-personalized phishing attacks. Cybersecurity professionals are responding by deploying machine learning models that detect anomalies faster than any human. It is an arms race to see who can develop a more capable AI. #AICyberSecurity #MachineLearning #ThreatDetection #CyberDefense #InfosecTrends
Like Comment
To view or add a comment, sign in
Timi Faromika

Aspiring InfoSec Analyst | MSc Cybersecurity | Governance & Risk Focus | Ex-Meta
1mo
Report this post
What if a hacker could trick an AI model into misclassifying something as simple as a stop sign? That’s the essence of adversarial attacks, a topic I explored in my MSc dissertation on “Adversarial Attacks on ML Models Used in Production.” My MSc research explored how subtle input tweaks can cause ML systems to behave unpredictably and how we can defend against them. One big takeaway: security and AI development need to work hand-in-hand from day one. Securing AI models isn’t theoretical anymore it’s a practical challenge we all need to prepare for. How do you see organisations approaching AI security in the next few years? #Cybersecurity #AIsecurity #MachineLearning #AdversarialAI #MScResearch #InfoSec

2 Comments
Like Comment
To view or add a comment, sign in
ETCISO

19,820 followers
2w
Report this post
“HackedGPT” — a term now echoing across cybersecurity circles — isn’t just another AI exploit story. It exposes a core vulnerability in how large language models decide what to believe. As Tenable’s Moshe Bernstein notes, this finding reveals that LLMs can be manipulated into trusting false or malicious data inputs, blurring the boundary between reliable and fabricated information. For CISOs, this underscores an urgent truth: AI systems don’t just need guardrails — they need governance. In an era where machine learning drives decisions, the real question becomes — can your AI tell truth from manipulation? Read more: https://lnkd.in/dfAZvz2p #ETCISO #AI #CyberSecurity #LLM #HackedGPT #DataIntegrity #AITrust
Like Comment
To view or add a comment, sign in
BleepingComputer

63,924 followers
1w
Report this post
🤖 Anthropic says a Chinese threat group used its Claude Code model to autonomously conduct most phases of a cyber-espionage campaign, but security experts doubt the claim due to missing technical details and unanswered requests for IOCs. ⚠️ The report alleges the AI scanned targets, generated exploits, moved laterally, and extracted data with only limited human oversight, though researchers argue the capabilities described exceed what current AI systems can realistically achieve. ➡️ https://lnkd.in/eDfaBdCb #cybersecurity #AI #cyberespionage
Like Comment
To view or add a comment, sign in

15,422 followers

View Profile Connect

New AI attack technique EchoGram exploits guardrails in LLMs

More Relevant Posts

Explore content categories