Build NinjaScript with AI, Then Backtest in NT8

AI agents often fail at NinjaScript because the model has no access to your actual NT8 install. It guesses at signatures, hallucinates overloads, and produces code that does not compile. CrossTrade MCP gives the agent typed access to the NinjaScript symbol surface, an in-memory compile, and the real Strategy Analyzer engine. The result is a compile loop that converges instead of cycling through new errors.

What vibe coding gets right

Vibe coding is shorthand for natural-language software development: describe what you want, let an AI agent write the code, iterate on feedback. For NinjaScript, vibe coding gets several things right:

The model is fluent in C# patterns and produces a credible first draft fast.
Translating a trader's intent into structured code is exactly what the model is good at.
Iteration is much faster than hand-writing NinjaScript from scratch.
Repetitive boilerplate (parameter declarations, OnBarUpdate skeletons, indicator wiring) is solved in seconds.

That value is real. The question is whether you stop there.

What vibe coding gets wrong in trading

For office software, the worst case of a vibe-coded mistake is a broken UI you fix tomorrow. For trading, the worst case is a fill at the wrong tick on a funded account. Trading raises the bar:

A draft that compiles is not a draft that behaves the way the spec described.
A backtest in some generic engine is not a backtest in your NT8 Strategy Analyzer.
A great-looking metric block can hide a strategy that takes 8 trades and earns most of its NetProfit on two outliers.
A wide parameter sweep finds parameter sets that fit noise, and the trader trusts them more than they should.
The strategy code does not enforce firm rules. The firm's risk system does.
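
The outlier point in the list above can be checked with simple arithmetic. The following is a minimal sketch, not part of CrossTrade MCP: the function name and the trade list are invented for illustration, and it assumes you already have per-trade profit and loss figures from a backtest's trade list.

```python
def outlier_share(trade_pnls, top_n=2):
    """Fraction of net profit contributed by the top_n most profitable trades.

    Returns None when net profit is not positive, because the share is
    not meaningful in that case.
    """
    net = sum(trade_pnls)
    if net <= 0:
        return None
    top = sorted(trade_pnls, reverse=True)[:top_n]
    return sum(top) / net


# Hypothetical 8-trade list (dollars per trade), invented for illustration.
pnls = [450.0, 380.0, -60.0, 25.0, -40.0, 30.0, -55.0, 20.0]
share = outlier_share(pnls)
print(f"trades={len(pnls)} net={sum(pnls):.2f} top2_share={share:.2f}")
```

A share above 1.0 means the remaining trades lost money in aggregate, which is the situation a headline NetProfit figure can hide.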

Vibe coding without verification is "we shipped." For trading, "we shipped" is the wrong endpoint.

Why NT8 compile feedback matters

Generic AI code review cannot tell you whether your strategy compiles against the running NT8 AppDomain. The model's training data may be old. The overload it picked may have been removed. The indicator name may not exist in your install. CompileNinjaScript(in_memory: true) is the only way to know.

The compile diagnostic is also the single most useful signal in the loop. It tells the agent exactly which line to fix and exactly which identifier to look up. Without it, the agent guesses at fixes and iterates without converging.

Why Strategy Analyzer feedback matters

NT8 Strategy Analyzer is the engine that will run your strategy live. It is also the engine RunStrategyBacktest drives through MCP. Single backtests are bit-identical to the UI for the documented reference parameters. That parity is the property that makes the agent's backtest result trustworthy.

A backtest in a different engine is not a backtest in NT8. The agent that uses a different engine and then deploys to NT8 is making the deploy decision on the wrong numbers.

Why Sim101 comes before live or funded accounts

Even with a green compile and a passing backtest, the strategy is not proven. Live conditions differ from historical replay. Fast markets fill differently than slow markets. Your account state and broker behavior matter. Funded firms have rules the code does not enforce.

Sim101 is the cheap rehearsal. Run the strategy on Sim101 for at least several sessions. Watch the trade list. Make sure the entries and exits behave the way the prompt described. Only after that, consider any other account.

For funded accounts, also verify the firm's rules permit the strategy. See AI for Funded Futures Traders.

The problem with AI-generated NinjaScript

A generic model will write C# that looks plausible. It will reference indicators that do not exist in your install. It will choose method overloads that were removed two versions ago. It will write EmaCross instead of CrossAbove(EMA(9), EMA(21), 1). You compile, you get errors, you paste the errors back into the chat, the model swears it understands, and it makes new mistakes.

The compile loop must be inside the conversation. The model must be able to call:

GetNinjaScriptHelp to read the docs for a symbol.
SearchNinjaScriptSymbols to find names by fuzzy match.
LookupNinjaScriptSymbol to resolve a single name to its overloads.
CompileNinjaScript(in_memory: true) to compile against the actual NT8 AppDomain.

Without those tools, the agent guesses. With them, the agent converges.

The CrossTrade compile loop

GetNinjaScriptHelp / Search / Lookup
       │
       ▼
Author NinjaScript source
       │
       ▼
CompileNinjaScript(in_memory: true)
       │
       ├──────  on failure: Lookup the unknown symbol  ──┐
       │                                                 │
       │                                                 ▼
       │                                            repair source
       │                                                 │
       └──────  recompile  ◄─────────────────────────────┘
       │
       ▼
on success: WriteNinjaScriptFile (with user confirmation)
       │
       ▼
RunStrategyBacktest

Compile errors that used to take 20 minutes of manual back-and-forth now take three or four iterations inside the chat. The agent reads the diagnostic, identifies the symbol, calls LookupNinjaScriptSymbol, and rewrites the offending line.
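
The control flow of the loop can be sketched in a few lines. This is an illustration under stated assumptions, not the CrossTrade implementation: compile_fn, lookup_fn, and repair_fn are hypothetical stand-ins for the agent's tool calls and its own rewriting step, and the shape of their return values is invented. The fakes below only simulate the EmaCross example mentioned earlier.

```python
def compile_until_green(source, compile_fn, lookup_fn, repair_fn, max_iterations=5):
    """Repeat compile -> lookup -> repair until the compile is clean.

    compile_fn(source) -> list of unknown symbol names (empty means success)
    lookup_fn(symbol)  -> information the repair step needs for that symbol
    repair_fn(source, symbol, info) -> repaired source
    All three are hypothetical stand-ins, not real tool signatures.
    """
    for iteration in range(1, max_iterations + 1):
        unknown = compile_fn(source)
        if not unknown:
            return source, iteration
        for symbol in unknown:
            source = repair_fn(source, symbol, lookup_fn(symbol))
    raise RuntimeError(f"compile did not converge in {max_iterations} iterations")


# Fakes that simulate one bad identifier being replaced by a valid call.
def fake_compile(source):
    return ["EmaCross"] if "EmaCross" in source else []


def fake_lookup(symbol):
    return {"EmaCross": "CrossAbove(EMA(9), EMA(21), 1)"}[symbol]


def fake_repair(source, symbol, replacement):
    return source.replace(symbol, replacement)


fixed, passes = compile_until_green(
    "if (EmaCross) EnterLong();", fake_compile, fake_lookup, fake_repair
)
print(passes, fixed)
```

The iteration budget matters: a loop that cannot converge should stop and report, not keep rewriting.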

Backtesting through Strategy Analyzer

RunStrategyBacktest drives the actual NT8 Strategy Analyzer engine. Single backtests are bit-identical to running the same configuration in NT8's UI. The verification reference is SampleMACrossOver on MES 06-26, 5-minute bars, April 1 to April 30, 2026, Fast=10, Slow=25, Sim101, no commission, 0 slippage. Both paths produce the same NetProfit, ProfitFactor, Sharpe, and trade count.

This is the property the agent needs. If the agent's backtest result and your NT8 UI's backtest result diverge, the agent's plan is invalid. With parity, the agent's gate decisions are as good as the ones you would make staring at the same numbers.
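
One way to confirm parity on your own install is to compare the four metrics named above field by field. The sketch below assumes you have copied both result sets into dictionaries by hand; the key names and the numbers are placeholders invented for illustration, not output from either path.

```python
def diverging_metrics(agent_result, ui_result,
                      keys=("NetProfit", "ProfitFactor", "Sharpe", "TradeCount")):
    """Return the metric names whose values differ between the two results.

    Uses exact equality on purpose: the claim being checked is that the
    two paths are identical, not merely close.
    """
    return [k for k in keys if agent_result.get(k) != ui_result.get(k)]


# Placeholder values, invented for illustration only.
agent = {"NetProfit": 100.0, "ProfitFactor": 1.5, "Sharpe": 0.8, "TradeCount": 12}
ui_same = dict(agent)
ui_diff = dict(agent, NetProfit=95.0)

print(diverging_metrics(agent, ui_same))
print(diverging_metrics(agent, ui_diff))
```

An empty list means the two paths agree on every compared metric; any name in the list is a reason to stop and investigate before trusting the agent's gates.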

Parameter sweeps

RunStrategyBacktest accepts an optimization.parameters_sweep block:

{
  "strategy": "MyEmaCross",
  "instrument": "MES 06-26",
  "from": "2026-04-01",
  "to": "2026-04-30",
  "bars_period": { "type": "Minute", "value": 5 },
  "optimization": {
    "parameters_sweep": [
      { "name": "FastPeriod", "min": 5, "max": 15, "step": 1 },
      { "name": "SlowPeriod", "min": 20, "max": 30, "step": 1 }
    ],
    "fitness": "NetProfit"
  }
}

The result includes a ranked grid with the best parameters in a top-level best block. The agent then re-runs a full single backtest on the winner so the user can see every trade.
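
If you generate the request programmatically, a small builder keeps the ranges valid before they reach the tool. This is a sketch that mirrors only the field names shown in the JSON example above; the helper name and its validation rules are assumptions, and the 5-minute bars period is hard-coded to match that example.

```python
import json


def build_sweep_request(strategy, instrument, start, end, sweeps, fitness="NetProfit"):
    """Build a request dict shaped like the JSON example in this section.

    sweeps: list of (name, min, max, step) tuples.
    """
    for name, lo, hi, step in sweeps:
        if step <= 0 or hi < lo:
            raise ValueError(f"invalid range for {name}: {lo}..{hi} step {step}")
    return {
        "strategy": strategy,
        "instrument": instrument,
        "from": start,
        "to": end,
        "bars_period": {"type": "Minute", "value": 5},
        "optimization": {
            "parameters_sweep": [
                {"name": name, "min": lo, "max": hi, "step": step}
                for name, lo, hi, step in sweeps
            ],
            "fitness": fitness,
        },
    }


request = build_sweep_request(
    "MyEmaCross", "MES 06-26", "2026-04-01", "2026-04-30",
    [("FastPeriod", 5, 15, 1), ("SlowPeriod", 20, 30, 1)],
)
print(json.dumps(request, indent=2))
```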

Example prompt

On MES 06-26, write a NinjaScript strategy called MyEmaCross. Long when 9-EMA crosses above 21-EMA, exit on cross below. ATR-based trailing stop with multiplier 2.0. Maximum 4 contracts. Before writing, call GetNinjaScriptHelp on EMA, ATR, CrossAbove, and SetTrailStop. Compile in memory. If compile fails, look up the unknown symbols and fix. Only WriteNinjaScriptFile after I confirm. Then backtest April 1 to April 30, 2026, 5-minute bars, Sim101, no commission, 0 slippage. Show the metrics. After the metrics, sweep FastPeriod 5..15 step 1 and SlowPeriod 20..30 step 1 with fitness NetProfit. Tell me the top three parameter sets. Then re-run a full single backtest on the winner.

This prompt produces dozens of tool calls and several minutes of work. The agent narrates each step.

Overfitting warning

A great backtest is a hypothesis, not a result. A great sweep is a danger sign, not a green light. Two reasons:

The wider the sweep, the easier it is to find a parameter set that fits noise.
The Strategy Analyzer engine treats the entire backtest range as in-sample by default. You need an explicit out-of-sample period to test whether the winner is real.

Practical defenses inside the prompt:

Restrict sweep width to ranges the strategy was designed for. Avoid "try anything from 1 to 100."
Run the winner on a separate out-of-sample window before deploying.
Require a minimum trade count for a backtest to count.
Set a minimum profit factor.
Demand max drawdown below an explicit dollar cap.

The agent should refuse to deploy if the winner fails any gate.
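
The out-of-sample defense needs a window that does not overlap the one the sweep was fitted on. A minimal sketch, assuming the out-of-sample window is the N calendar days immediately before the in-sample start; the helper name is hypothetical, and other choices (a later window, a walk-forward split) are equally valid.

```python
from datetime import date, timedelta


def prior_window(in_sample_start, days=30):
    """Return (start, end) of the `days`-day window ending the day before in_sample_start."""
    if days <= 0:
        raise ValueError("days must be positive")
    end = in_sample_start - timedelta(days=1)
    start = in_sample_start - timedelta(days=days)
    return start, end


# In-sample window from the example prompt starts on April 1, 2026.
oos_start, oos_end = prior_window(date(2026, 4, 1))
print(oos_start.isoformat(), oos_end.isoformat())
```

Both dates are inclusive, so the default covers exactly 30 calendar days and ends the day before the in-sample window begins.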

Deployment gate

A typical deploy gate:

Only DeployStrategy if all are true: profit factor above 1.25 in the original window, profit factor above 1.10 in an out-of-sample window from the prior 30 days, at least 60 trades total, max drawdown below $500. If any gate fails, show me which one and stop. After deploy, call GetDeployedStrategyState and confirm is_trading: true.

If you let the agent deploy on the first thing that backtests above zero, you are not really using the deploy gate. Write it strictly.
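
The gate in the prompt above is a conjunction of four threshold checks, and writing it out makes the pass and fail behavior unambiguous. This is a sketch: the dictionary keys are hypothetical names rather than the tool's response schema, the candidate values are invented, and drawdown is assumed to be a positive dollar magnitude.

```python
def failed_gates(metrics):
    """Return the descriptions of the gates that fail; an empty list means all pass.

    Expected keys (hypothetical names): pf_in_sample, pf_out_of_sample,
    total_trades, max_drawdown_usd.
    """
    checks = [
        ("profit factor above 1.25 in the original window", metrics["pf_in_sample"] > 1.25),
        ("profit factor above 1.10 out of sample", metrics["pf_out_of_sample"] > 1.10),
        ("at least 60 trades total", metrics["total_trades"] >= 60),
        ("max drawdown below $500", metrics["max_drawdown_usd"] < 500.0),
    ]
    return [name for name, passed in checks if not passed]


# Invented candidate: strong in sample, weak out of sample.
candidate = {
    "pf_in_sample": 1.40,
    "pf_out_of_sample": 1.05,
    "total_trades": 72,
    "max_drawdown_usd": 310.0,
}
failures = failed_gates(candidate)
print("DEPLOY" if not failures else f"STOP: {failures}")
```

Returning the list of failed gates, not just a boolean, matches the prompt's requirement to show which gate failed before stopping.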

FAQ

Can the agent write a complete strategy in one shot?

For simple strategies, sometimes yes. For real strategies, no. Plan on iteration. The strength of MCP is that iteration is fast.

Does compile in memory affect my install?

No. in_memory: true does not touch disk. The compile happens in the running NT8 AppDomain in a sandboxed snippet. Only WriteNinjaScriptFile writes.

Why does my backtest use Sim101?

Sim101 is the default NinjaTrader simulation account. Backtests do not require a live account; they require historical data and a configured account name. Sim101 is the standard choice.

Why does my sweep take so long?

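Sweep size is the Cartesian product of the ranges. Two ranges of 11 values each (for example 5..15 and 20..30 at step 1) produce 11 × 11 = 121 iterations, and each iteration runs the full backtest. Narrow the ranges or increase the step.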
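
To estimate the cost before launching a sweep, count the combinations first. A short sketch, assuming integer bounds and steps with both endpoints included; the helper names are hypothetical.

```python
from math import prod


def values_in_range(lo, hi, step):
    """Number of values in lo..hi inclusive at the given step (integers assumed)."""
    if step <= 0 or hi < lo:
        raise ValueError(f"invalid range: {lo}..{hi} step {step}")
    return (hi - lo) // step + 1


def sweep_size(ranges):
    """Total backtests for a grid sweep: the product of per-parameter value counts."""
    return prod(values_in_range(lo, hi, step) for lo, hi, step in ranges)


full = sweep_size([(5, 15, 1), (20, 30, 1)])    # ranges from the example prompt
coarse = sweep_size([(5, 15, 2), (20, 30, 2)])  # same ranges at step 2
print(full, coarse)
```

Doubling the step on both parameters cuts the grid from 121 backtests to 36.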

Can the agent deploy without my confirmation?

If your prompt does not require confirmation, yes. Always require confirmation.
