Amazon Nova Challenge

Building red-teaming LLM agents to detect vulnerable code generation in production tasks. Creating safety benchmarks for adversarial coding prompts, evaluating exploitability and model robustness.

Technologies: PyTorch, LLM Agents, AI Safety

Status: Ongoing (Feb 2026 - Present)

Key Contributions

Developed automated red-teaming agents that identify security vulnerabilities in LLM-generated code
Created comprehensive safety benchmarks for evaluating code generation under adversarial conditions
Analyzed exploitability patterns across different model architectures and prompt strategies

This work contributes to making AI-powered coding assistants more secure and reliable for production use.