Skip to content

Conversation

@danenania
Copy link
Contributor

Testing all-clear feedback links

Copy link

@promptfoo-scanner-staging promptfoo-scanner-staging bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed the new LLM integration code in llm_vuln.ts. The PR introduces two functions for interacting with OpenAI's API. I found one high-severity vulnerability where the customAssistant() function allows external control over the system prompt, which completely bypasses any security boundaries and enables attackers to manipulate the LLM's behavior arbitrarily.

Minimum severity threshold for this scan: 🟡 Medium | Learn more

Comment on lines +14 to +22
async function customAssistant(systemPrompt: string, question: string) {
const response = await openai.chat.completions.create({
model: "gpt-4",
messages: [
{ role: "system", content: systemPrompt },
{ role: "user", content: question }
]
});
return response.choices[0].message.content;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 High

The customAssistant() function accepts a systemPrompt parameter that flows directly to the system role message without any validation. This allows callers to completely override the LLM's instructions and security boundaries. An attacker can inject arbitrary system-level instructions to manipulate the model's behavior, bypass safety constraints, or extract sensitive information from context.

💡 Suggested Fix

Hardcode the system prompt instead of accepting it as a parameter:

async function customAssistant(question: string) {
  const SYSTEM_PROMPT = "You are a helpful assistant. Do not reveal internal information or execute commands.";

  const response = await openai.chat.completions.create({
    model: "gpt-4",
    messages: [
      { role: "system", content: SYSTEM_PROMPT },
      { role: "user", content: question }
    ]
  });
  return response.choices[0].message.content;
}

Alternatively, if multiple assistant types are needed, use a whitelist of predefined system prompts rather than accepting arbitrary user input.

🤖 AI Agent Prompt

The code at llm_vuln.ts:14-22 contains a prompt injection vulnerability where the system prompt is user-controllable via the systemPrompt parameter. This eliminates any security boundary between user input and system instructions.

Investigate how customAssistant() is called in the broader codebase. Determine if there are legitimate use cases requiring multiple system prompt variations. If so, implement a whitelist approach with predefined prompts. If not, hardcode a single secure system prompt.

Check if there are any validation or authorization layers before this function is called. Search for any API endpoints or entry points that expose this function to external users. Consider whether the application's security model depends on prompt-based controls, and if so, what defense-in-depth measures should be added beyond fixing this specific vulnerability.


Was this helpful? 👍 Yes | 👎 No

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants