Amazon Nova Challenge
Red-teaming LLM agents for code generation safety
Building red-teaming LLM agents to detect vulnerable code generation in production tasks. Creating safety benchmarks for adversarial coding prompts, evaluating exploitability and model robustness.
Technologies: PyTorch, LLM Agents, AI Safety
Status: Ongoing (Feb 2026 - Present)
Key Contributions
- Developed automated red-teaming agents that identify security vulnerabilities in LLM-generated code
- Created comprehensive safety benchmarks for evaluating code generation under adversarial conditions
- Analyzed exploitability patterns across different model architectures and prompt strategies
This work contributes to making AI-powered coding assistants more secure and reliable for production use.