HailBytes Cloud Support Hub

What AI Features Does reNgine Offer?

reNgine Cloud includes optional AI/LLM-powered capabilities that enhance your reconnaissance workflow:

Vulnerability Analysis Summaries — Automatically generates detailed technical descriptions, business impact assessments, remediation steps, and reference links for discovered vulnerabilities.
Attack Surface Insights — Analyzes your recon data (subdomains, open ports, technologies, HTTP responses) and suggests prioritized attack vectors mapped to the MITRE ATT&CK framework.
Enhanced Report Generation — Adds AI-driven context and analysis to scan reports, making them more actionable for both technical and non-technical stakeholders.

These features use either the OpenAI API (cloud-hosted) or Ollama (local/self-hosted) as the LLM backend.

Option A: OpenAI API (Cloud)

Best for teams that want fast results with minimal infrastructure setup.

Setup

Navigate to API Vault — In reNgine, go to Scan Engine Settings > API Vault.
Add your OpenAI API key — Enter your key in the OpenAI API Key field and save. You can generate a key at platform.openai.com/api-keys.
Select a model — reNgine supports multiple OpenAI models. Choose one from the settings panel.

Recommended Models

Model	Best For	Context Window
GPT-4o	Best quality analysis, complex targets	128k tokens
GPT-4o-mini	Cost savings with good quality	128k tokens
GPT-4 Turbo	High-quality analysis, large scans	128k tokens

Expected API Costs

Costs depend on scan size and how many vulnerabilities trigger LLM analysis. Ballpark estimates per scan:

Small scan (single target, <50 findings): $0.05 to $0.50
Medium scan (multiple subdomains, 50-200 findings): $0.50 to $3.00
Large scan (broad recon, 200+ findings): $3.00 to $15.00

GPT-4o-mini cuts costs by roughly 80% compared to GPT-4o. Reports are cached in the database, so re-viewing a previously analyzed vulnerability incurs no additional cost.
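
As a rough worked example of the 80% figure above, the sketch below scales a per-scan estimate down to what GPT-4o-mini might cost. It assumes the ballpark ranges were measured with GPT-4o, which this article does not state, and `estimate_mini_cost` is a hypothetical helper name, not part of reNgine.

```shell
# Hypothetical helper (not part of reNgine): scale a GPT-4o cost estimate
# to GPT-4o-mini using the article's "roughly 80% cheaper" figure.
# Assumption: the input estimate was produced with GPT-4o.
estimate_mini_cost() {
  # LC_ALL=C keeps the decimal separator a "." regardless of system locale.
  LC_ALL=C awk -v cost="$1" 'BEGIN { printf "%.2f\n", cost * 0.20 }'
}

estimate_mini_cost 3.00    # upper bound of a medium scan
estimate_mini_cost 15.00   # upper bound of a large scan
```

With these inputs the sketch prints 0.60 and 3.00, so a large scan on GPT-4o-mini would land near the cost of a medium scan on GPT-4o under the stated assumption.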

Option B: Ollama (Local/Self-Hosted)

Best for teams that need scan data to stay inside their own environment or want to eliminate ongoing API costs.

Why Local?

No data leaves your VM — all LLM inference runs locally.
No per-token API costs — after initial setup, usage is free.
Full control — choose your model, tune performance, and run offline.

Installing Ollama

If Ollama is not pre-installed on your reNgine Cloud VM:

curl -fsSL https://ollama.com/install.sh | sh
sudo systemctl enable ollama
sudo systemctl start ollama

Verify it is running:

curl http://localhost:11434/api/tags
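
The check above can be wrapped in a small function that reports success or failure instead of printing raw JSON. This is a sketch only: `check_ollama` is a hypothetical helper name, and the five-second timeout is an arbitrary choice.

```shell
# Hypothetical helper: report whether an Ollama endpoint answers /api/tags.
# Usage: check_ollama [base_url]   (defaults to the local install)
check_ollama() {
  url="${1:-http://localhost:11434}"
  # -f: treat HTTP error statuses as failures
  # -sS: hide the progress meter
  # --max-time 5: give up after five seconds (arbitrary value)
  if curl -fsS --max-time 5 "$url/api/tags" >/dev/null 2>&1; then
    echo "ollama reachable at $url"
  else
    echo "ollama NOT reachable at $url"
    return 1
  fi
}
```

Calling `check_ollama` with no argument tests the default local endpoint; pass a different base URL to test another host.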

Configuring reNgine for Ollama

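reNgine connects to Ollama at http://ollama:11434 by default (the Docker service name). If Ollama runs on the host machine instead of in Docker, the container needs a hostname that reaches the host; host.docker.internal works for this, although on Linux it typically resolves only when the container has a host-gateway mapping (for example, Docker's --add-host=host.docker.internal:host-gateway). Set the OLLAMA_INSTANCE environment variable in your reNgine configuration accordingly: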

OLLAMA_INSTANCE=http://host.docker.internal:11434
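
To illustrate the fallback described above, the sketch below resolves the endpoint the same way: it uses `OLLAMA_INSTANCE` when that variable is set and the Docker service name otherwise. It only mimics the behaviour described in this article; `resolve_ollama_url` is a hypothetical name, and reNgine's actual lookup code may differ.

```shell
# Hypothetical helper: print the Ollama base URL that would be used.
resolve_ollama_url() {
  # ${VAR:-default} expands to the default when VAR is unset or empty.
  echo "${OLLAMA_INSTANCE:-http://ollama:11434}"
}

resolve_ollama_url

# Example probe of whichever endpoint is configured (commented out so the
# snippet has no network side effects):
# curl -fsS "$(resolve_ollama_url)/api/tags"
```

Running the snippet inside the reNgine container is a quick way to see which endpoint the current environment points at before testing connectivity.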

In the reNgine UI, navigate to the Ollama settings panel to select and download models directly from the interface.

GPU Instance Types for Good Performance

A GPU dramatically improves local inference speed. Recommended instance types:

Provider	Instance Type	GPU	VRAM
AWS	g4dn.xlarge	NVIDIA T4	16 GB
AWS	g5.xlarge	NVIDIA A10G	24 GB
Azure	NC4as_T4_v3	NVIDIA T4	16 GB
Azure	NC6s_v3	NVIDIA V100	16 GB

CPU-only: Ollama works without a GPU, but expect significantly slower inference (minutes per analysis instead of seconds). This setup is suitable for small targets or infrequent scans.

Recommended Models by Available RAM

RAM	Recommended Models	Notes
8 GB	`llama3:8b`, `mistral:7b`	Good baseline performance
16 GB	`llama3:8b` (larger context), `codellama:13b`	Better for detailed vulnerability analysis
32 GB+	`mixtral:8x7b`, `llama3:70b` (quantized)	Best local quality, approaches cloud model output; the default `llama3:70b` download is about 40 GB, so it needs more than 32 GB of memory

Download a model from the reNgine UI or via CLI:

ollama pull llama3:8b
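
The RAM table above can be expressed as a simple lookup. The sketch below is illustrative only: `suggest_model` is a hypothetical helper, it picks one model per tier from the table, and the thresholds are the table's nominal RAM sizes rather than measured requirements.

```shell
# Hypothetical helper: map nominal RAM (whole GB) to one model per table tier.
suggest_model() {
  ram_gb="$1"
  if [ "$ram_gb" -ge 32 ]; then
    echo "mixtral:8x7b"
  elif [ "$ram_gb" -ge 16 ]; then
    echo "codellama:13b"
  elif [ "$ram_gb" -ge 8 ]; then
    echo "llama3:8b"
  else
    # Below the smallest tier in the table: report on stderr and fail.
    echo "no recommended model for ${ram_gb} GB" >&2
    return 1
  fi
}

suggest_model 16
```

The final line prints codellama:13b. The result can be passed straight to the CLI, for example `ollama pull "$(suggest_model 16)"`. Pass the VM's nominal RAM size, since the operating system usually reports slightly less than that figure.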

Choosing Between Cloud and Local

Consideration	OpenAI (Cloud)	Ollama (Local)
Setup complexity	Minimal — just add an API key	Moderate — install Ollama, download models
Data privacy	Data sent to OpenAI servers	All data stays on your VM
Ongoing cost	Pay per token	Free after setup (GPU instance cost applies)
Output quality	Best (GPT-4o)	Good to very good (depends on model and size)
Speed	Fast (cloud infrastructure)	Fast with GPU, slow on CPU-only
Offline capable	No	Yes

Recommendation: Start with OpenAI using GPT-4o-mini to evaluate the features. If data residency or cost is a concern, switch to Ollama with a GPU-backed instance and llama3:8b or larger.

Troubleshooting

“API key invalid”: Regenerate your key at platform.openai.com/api-keys. Check for leading or trailing whitespace when pasting. Ensure the key has not been revoked or expired.
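
Stray whitespace is easy to miss by eye. As a sketch, the hypothetical `trim_key` helper below strips all whitespace from a pasted key, and the commented-out request shows one way to test the result against OpenAI's model-listing endpoint (an HTTP 401 response indicates the key was rejected). The `sk-example` value is a placeholder, not a real key.

```shell
# Hypothetical helper: remove every whitespace character from a pasted key.
trim_key() {
  printf '%s' "$1" | tr -d '[:space:]'
}

# Placeholder key with leading spaces and a trailing newline.
key="$(trim_key '  sk-example
')"
echo "$key"

# Optional live check (needs network access and a real key):
# curl -s -o /dev/null -w '%{http_code}\n' \
#   -H "Authorization: Bearer $key" https://api.openai.com/v1/models
```

The snippet prints sk-example, showing that the surrounding spaces and the newline were removed before the key is used.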

Ollama not responding: Check whether the service is running with sudo systemctl status ollama (system install) or docker ps | grep ollama (Docker). Then confirm the endpoint is reachable from the reNgine container: curl http://ollama:11434/api/tags.

Out of memory with local model: Use a smaller model (e.g., llama3:8b instead of llama3:70b) or increase your VM RAM.

AI features not appearing in the UI: These features require reNgine 2.0 or later. Check your version in the reNgine dashboard and update if needed.

Next Steps

Explore more configuration guides and tutorials at hailbytes.com/tutorials.

Still need help? Open a ticket at support.hailbytes.com.

Configuring AI/LLM Features in reNgine Cloud