Breaking the Black Box: An Introduction to Adversarial AI

Wed, Jun 17, 2026
Duration: 1 Hour
English
Foster Nethercott
Technical Presentation

It’s no secret that artificial intelligence now plays a critical role in most modern enterprises. It does everything from screening resumes to triaging security alerts to generating code, often without human oversight. We have, in essence, handed these models the keys to our kingdom, so consumed by the question of "can we" that we forgot to ask "should we."

Unfortunately, that question is quickly being answered for us by our adversaries. Every capability and advancement we’ve celebrated also hides a vulnerability we overlooked. The same model that screens a resume can be coerced into following an attacker’s hidden instructions. The same assistant that writes our code can be convinced to implement a backdoor. The worst part is that organizations have no idea how prevalent these attacks have become, or that they are exposing an attack surface that traditional penetration testing was never built to address. Prompt injection, jailbreaks, training-data poisoning, model extraction, and agentic exploitation are happening every day, and most security teams still lack any methodology to test for them.

This session, based on the new SEC536: AI Penetration Testing course, shifts focus from AI's capabilities to its role as a new attack vector, encouraging critical evaluation of its use. Live demonstrations will show how production-grade AI assistants can be manipulated to bypass safeguards, leak confidential data, and perform prohibited actions.

ブラックボックスを打ち破る：攻撃者AI（Adversarial AI）入門

概要：

人工知能（AI）が現代の企業において重要な役割を果たしていることは、もはや周知の事実です。AIは履歴書の選考からセキュリティアラートのトリアージ、さらにはコード生成に至るまで、しばしば人間の監督なしにさまざまな業務を実行しています。私たちは実質的に、これらのモデルに「王国の鍵」を渡してしまいました。そして、「それができるのか（Can we?）」という問いに夢中になるあまり、「それをすべきなのか（Should we?）」という問いを忘れてしまったのです。

残念ながら、その問いに対する答えは今や攻撃者たちによって示されつつあります。私たちが称賛してきたあらゆる機能や進歩の裏には、見過ごされてきた脆弱性が潜んでいます。履歴書を選別するモデルは、攻撃者が埋め込んだ隠れた指示に従うよう誘導される可能性があります。コードを生成するアシスタントは、巧妙に仕組まれたバックドアを実装するよう騙される可能性があります。

さらに深刻なのは、多くの組織がこうした攻撃がどれほど広く行われているかを認識しておらず、従来のペネトレーションテストでは想定されていなかった新たな攻撃対象領域（アタックサーフェス）を自ら公開していることです。プロンプトインジェクション、ジェイルブレイク、学習データ汚染（データポイズニング）、モデル抽出（Model Extraction）、エージェント悪用（Agentic Exploitation）といった攻撃は日常的に発生していますが、それらを評価・検証するための方法論を持つセキュリティチームはまだ多くありません。

本セッションは、新コース SEC536: AI Penetration Testing をベースに、AIの能力そのものではなく、AIが新たな攻撃ベクトルとしてどのようなリスクをもたらすのかに焦点を当てます。そして、AI活用の是非を批判的に評価する視点を提供します。

ライブデモでは、本番環境レベルのAIアシスタントがどのように操作され、安全対策を回避し、機密データを漏洩させ、禁止された行為を実行するよう誘導されるのかを実演します。これにより、組織が直面する現実の脅威と、それに対処するために必要な新たなセキュリティテスト手法について理解を深めていただきます。

日本語で視聴する

Meet Your Speaker

Foster Nethercott

Founder at Fortisec

A Marine Corps veteran and founder of Fortisec, Foster Nethercott is a SANS course author and cybersecurity professional specializing in offensive operations and AI TTPs, bringing real-world experience into SEC535.

SEC536: Adversarial AI - Penetration Testing AI Systems
