Configuring AI/LLM Features in reNgine Cloud

Last updated April 7, 2026

What AI Features Does reNgine Offer?

reNgine Cloud includes optional AI/LLM-powered capabilities that enhance your reconnaissance workflow:

  • Vulnerability Analysis Summaries — Automatically generates detailed technical descriptions, business impact assessments, remediation steps, and reference links for discovered vulnerabilities.
  • Attack Surface Insights — Analyzes your recon data (subdomains, open ports, technologies, HTTP responses) and suggests prioritized attack vectors mapped to the MITRE ATT&CK framework.
  • Enhanced Report Generation — Adds AI-driven context and analysis to scan reports, making them more actionable for both technical and non-technical stakeholders.

These features use either the OpenAI API (cloud-hosted) or Ollama (local/self-hosted) as the LLM backend.


Option A: OpenAI API (Cloud)

Best for teams that want fast results with minimal infrastructure setup.

Setup

  1. Navigate to API Vault — In reNgine, go to Scan Engine Settings > API Vault.
  2. Add your OpenAI API key — Enter your key in the OpenAI API Key field and save. You can generate a key at platform.openai.com/api-keys.
  3. Select a model — reNgine supports multiple OpenAI models. Choose one from the settings panel.

Recommended Models

Model        | Best For                                | Context Window
GPT-4o       | Best quality analysis, complex targets  | 128k tokens
GPT-4o-mini  | Cost savings with good quality          | 128k tokens
GPT-4 Turbo  | High-quality analysis, large scans      | 128k tokens

Expected API Costs

Costs depend on scan size and how many vulnerabilities trigger LLM analysis. Ballpark estimates per scan:

  • Small scan (single target, <50 findings): $0.05 to $0.50
  • Medium scan (multiple subdomains, 50-200 findings): $0.50 to $3.00
  • Large scan (broad recon, 200+ findings): $3.00 to $15.00

GPT-4o-mini cuts costs by roughly 80% compared to GPT-4o. Reports are cached in the database, so re-viewing a previously analyzed vulnerability incurs no additional cost.
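
As a rough worked example of this arithmetic, the sketch below applies the "roughly 80%" reduction quoted above to a GPT-4o cost range to estimate the GPT-4o-mini equivalent. The function name mini_estimate and its default percentage are illustrative assumptions; real costs depend on current OpenAI pricing and on how many tokens each scan consumes.

```shell
# Hypothetical helper: scale a GPT-4o cost range (USD) by an assumed
# percentage reduction. 80 is this article's rough figure, not a quoted price.
mini_estimate() {
    lo=$1; hi=$2; cut=${3:-80}
    LC_ALL=C awk -v lo="$lo" -v hi="$hi" -v cut="$cut" \
        'BEGIN { printf "%.2f %.2f\n", lo * (100 - cut) / 100, hi * (100 - cut) / 100 }'
}

# Medium scan range from the list above: $0.50 to $3.00 on GPT-4o
mini_estimate 0.50 3.00
```

For the medium-scan range this prints 0.10 0.60, which is roughly $0.10 to $0.60 per scan under that assumption. Pass a third argument to try a different reduction percentage.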


Option B: Ollama (Local/Self-Hosted)

Best for teams that require data to stay on-premises or want to eliminate ongoing API costs.

Why Local?

  • No data leaves your VM — all LLM inference runs locally.
  • No per-token API costs — after initial setup, usage is free.
  • Full control — choose your model, tune performance, and run offline.

Installing Ollama

If Ollama is not pre-installed on your reNgine Cloud VM:

curl -fsSL https://ollama.com/install.sh | sh
sudo systemctl enable ollama
sudo systemctl start ollama

Verify it is running:

curl http://localhost:11434/api/tags
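
The /api/tags endpoint returns a JSON list of the models Ollama has downloaded. A minimal sketch for extracting just the model names with standard tools follows. The helper name list_model_names is hypothetical, the sample response is abbreviated and illustrative, and jq -r '.models[].name' is a sturdier choice if jq is installed.

```shell
# Hypothetical helper: read an /api/tags JSON response on stdin and print
# each model name. Pattern-based, so it is a convenience, not a JSON parser.
list_model_names() {
    grep -o '"name"[[:space:]]*:[[:space:]]*"[^"]*"' | cut -d'"' -f4
}

# Abbreviated, illustrative response body (real responses carry more fields)
sample='{"models":[{"name":"llama3:8b","size":1},{"name":"mistral:7b","size":2}]}'
printf '%s\n' "$sample" | list_model_names

# Against a live instance (not run here):
#   curl -s http://localhost:11434/api/tags | list_model_names
```

An empty result from a live instance means Ollama is running but no models have been pulled yet.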

Configuring reNgine for Ollama

reNgine connects to Ollama at http://ollama:11434 by default (the Docker service name). If Ollama runs on the host machine instead of in Docker, set the OLLAMA_INSTANCE environment variable in your reNgine configuration:

OLLAMA_INSTANCE=http://host.docker.internal:11434

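On Linux hosts, host.docker.internal resolves only when the reNgine container maps it to the host gateway (for example, an extra_hosts entry of host.docker.internal:host-gateway in Docker Compose). Ollama also listens on 127.0.0.1 by default, so set OLLAMA_HOST (for example, 0.0.0.0:11434) on the host service if the container cannot connect.

In the reNgine UI, navigate to the Ollama settings panel to select and download models directly from the interface.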
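
The default-plus-override behavior described in this section can be sketched in shell. This mirrors the documented behavior rather than reNgine's actual code, and the helper names ollama_url and check_ollama are hypothetical.

```shell
# Hypothetical helper: use OLLAMA_INSTANCE when set and non-empty, otherwise
# fall back to the Docker service-name default described above.
ollama_url() {
    printf '%s\n' "${OLLAMA_INSTANCE:-http://ollama:11434}"
}

# Hypothetical helper: succeed only if the resolved endpoint answers
# /api/tags within 5 seconds. Makes a network call; not invoked here.
check_ollama() {
    curl -fsS --max-time 5 "$(ollama_url)/api/tags" >/dev/null
}

echo "Ollama endpoint: $(ollama_url)"
```

Running check_ollama && echo reachable from inside the reNgine container is a quick way to confirm that the URL reNgine would use is actually answering.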

GPU Instance Types for Good Performance

A GPU dramatically improves local inference speed. Recommended instance types:

Provider | Instance Type | GPU          | VRAM
AWS      | g4dn.xlarge   | NVIDIA T4    | 16 GB
AWS      | g5.xlarge     | NVIDIA A10G  | 24 GB
Azure    | NC4as_T4_v3   | NVIDIA T4    | 16 GB
Azure    | NC6s_v3       | NVIDIA V100  | 16 GB

CPU-only: Ollama works without a GPU, but expect significantly slower inference (minutes per analysis instead of seconds). This is suitable for small targets or infrequent scans.

Recommended Models by Available RAM

RAM     | Recommended Models                        | Notes
8 GB    | llama3:8b, mistral:7b                     | Good baseline performance
16 GB   | llama3:8b (larger context), codellama:13b | Better for detailed vulnerability analysis
32 GB+  | llama3:70b (quantized), mixtral:8x7b      | Best local quality, approaches cloud model output. The default llama3:70b download is about 40 GB, so it needs well over 32 GB.

Download a model from the reNgine UI or via CLI:

ollama pull llama3:8b
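
The RAM table above can also be expressed as a small lookup. This is a sketch that assumes the tiers exactly as listed in this article. The function name suggest_models is hypothetical, and the RAM figure is passed in by hand (in whole GB) rather than detected.

```shell
# Hypothetical helper: map RAM in whole GB to the models this article lists
# for that tier. Tiers come from the table above, not from Ollama itself.
suggest_models() {
    ram_gb=$1
    if [ "$ram_gb" -ge 32 ]; then
        echo "llama3:70b mixtral:8x7b"
    elif [ "$ram_gb" -ge 16 ]; then
        echo "llama3:8b codellama:13b"
    elif [ "$ram_gb" -ge 8 ]; then
        echo "llama3:8b mistral:7b"
    else
        echo "below the 8 GB baseline in the table" >&2
        return 1
    fi
}

suggest_models 16
```

For a 16 GB VM this prints llama3:8b codellama:13b; either name can then be passed to ollama pull. Check a model's download size against free memory before pulling the larger ones.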

Choosing Between Cloud and Local

Consideration    | OpenAI (Cloud)                  | Ollama (Local)
Setup complexity | Minimal: just add an API key    | Moderate: install Ollama, download models
Data privacy     | Data sent to OpenAI servers     | All data stays on your VM
Ongoing cost     | Pay per token                   | Free after setup (GPU instance cost applies)
Output quality   | Best (GPT-4o)                   | Good to very good (depends on model and size)
Speed            | Fast (cloud infrastructure)     | Fast with GPU, slow on CPU-only
Offline capable  | No                              | Yes

Recommendation: Start with OpenAI using GPT-4o-mini to evaluate the features. If data residency or cost is a concern, switch to Ollama with a GPU-backed instance and llama3:8b or larger.


Troubleshooting

“API key invalid”
Check for leading or trailing whitespace introduced when pasting, and confirm the key has not been revoked or expired. If the error persists, regenerate your key at platform.openai.com/api-keys.
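
The whitespace and validity checks for an OpenAI key can be run from a shell before pasting the key into the API Vault. A minimal sketch: trim_key and openai_key_status are hypothetical helper names, and the status check assumes the standard OpenAI endpoint GET https://api.openai.com/v1/models, which answers HTTP 200 for an accepted key and 401 for a rejected one.

```shell
# Hypothetical helper: strip leading and trailing whitespace from a pasted key.
trim_key() {
    printf '%s\n' "$1" | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//'
}

# Hypothetical helper: print the HTTP status OpenAI returns for the key
# (200 = accepted, 401 = rejected). Makes a network call; not invoked here.
openai_key_status() {
    curl -s -o /dev/null -w '%{http_code}\n' --max-time 10 \
        -H "Authorization: Bearer $(trim_key "$1")" \
        https://api.openai.com/v1/models
}

# Placeholder value, not a real key
trim_key '  sk-example-placeholder  '
```

The last line prints sk-example-placeholder with the surrounding spaces removed. Calling openai_key_status with a real key shows whether OpenAI itself accepts it, which separates a bad key from a reNgine configuration problem.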

Ollama not responding
Check that the service is running: sudo systemctl status ollama (system install) or docker ps | grep ollama (Docker). Then confirm the endpoint is reachable from inside the reNgine container: curl http://ollama:11434/api/tags (substitute your OLLAMA_INSTANCE URL if you set one).

Out of memory with local model
Use a smaller model (e.g., llama3:8b instead of llama3:70b) or increase your VM's RAM.

AI features not appearing in the UI
These features require reNgine 2.0 or later. Check your version in the reNgine dashboard and update if needed.


Next Steps

Explore more configuration guides and tutorials at hailbytes.com/tutorials.

Still need help? Open a ticket at support.hailbytes.com.