How to evaluate an AI consultant in 2026 (without getting sold a slide deck)
Why this guide exists
The AI consultancy market in Australia has the same buyer-protection problem the broader software services market had in 2008. No licensing body. Loud marketing. The difference between a builder who ships and a deck-shop is invisible from the outside until the project is six weeks deep, the invoices are stacking, and the only working artefact is a Notion page titled "AI Strategy Roadmap v3".
The market is not adversarial. Most of the people pitching are well-intentioned. The problem is structural — a buyer who has never commissioned an AI build cannot tell the difference between a vendor who has shipped twenty pipelines and one who has read twenty Substack posts about prompt engineering. Both are confident. Both have decks. Only one will hand you a working system on day 90.
This piece is a checklist. Seven questions, a handful of red flags, the Australian-specific concerns most overseas vendors miss, and a four-axis comparison framework. The pillar version lives in the 2026 starter guide; what follows is the long-form version with the failure modes named.
Seven questions to ask any AI consultant
1. Where does the data live?
The question is not "is it secure". Every vendor will say yes. The question is the physical location of the model API call, the source documents, and the audit logs.
A good answer is specific. The pipeline runs in your AWS Sydney or Azure Australia East tenancy. The Anthropic Claude calls route through AWS Bedrock's Sydney endpoint, not the US endpoint. Source documents stay in your SharePoint or Google Workspace, never copied into a vendor-controlled storage bucket. A bad answer is vague — "it's all in the cloud", "the data is encrypted". Encryption is necessary and not sufficient.
The failure mode this screens out: the consultant builds against the OpenAI US endpoint, your documents end up in a vendor-owned vector database in Virginia, and the first time anyone notices is when an OAIC notification is being drafted. The Privacy Act amendments care where the personal information lived, not whether the US endpoint was faster.
2. What are your support hours?
AEDT and AEST are not optional for an Australian small business. The pipeline misses a sync at 9am Melbourne on a Wednesday. Who picks up the phone? If the answer is "we'll respond within 24 hours" and the vendor's working day starts at 9am London, the buyer just bought an 11-hour outage window into a system the team relies on Monday morning.
A good answer names hours, names a fallback contact, and gives a written response-time commitment. A bad answer is the offshore-support tier hidden inside a glossy proposal — a "24/7 support" line that resolves to a ticket queue triaged from another timezone with English-language SLA caveats nobody reads. That is not support. That is a queue.
3. Can you quote in AUD with explicit GST treatment?
International vendors quote in USD by default. A 5,000 USD project lands as 7,800 AUD plus GST in 2026 once the conversion is made. Currency exposure across the lifecycle is the buyer's risk, not the vendor's.
A good answer is a clean AUD quote with GST as a separate line, the conversion rate locked for the engagement, and any USD-denominated upstream tooling called out with conversion methodology. A bad answer is a USD quote with "and we'll handle the conversion" appended — the buyer is now carrying foreign-exchange risk on a 90-day project.
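The quoting arithmetic is simple enough to sketch. The 1.56 AUD/USD rate below is illustrative only, standing in for whatever rate gets locked at engagement start; the point is that GST appears as its own line rather than being folded into a converted lump sum.

```python
def aud_quote(usd_amount: float, locked_rate: float = 1.56,
              gst_rate: float = 0.10) -> dict:
    """Convert a USD quote to AUD with GST shown as a separate line.

    locked_rate is illustrative -- in practice it is the FX rate
    fixed in writing for the life of the engagement.
    """
    aud_ex_gst = round(usd_amount * locked_rate, 2)
    gst = round(aud_ex_gst * gst_rate, 2)
    return {
        "aud_ex_gst": aud_ex_gst,               # the quoted price
        "gst": gst,                              # GST as its own line item
        "aud_inc_gst": round(aud_ex_gst + gst, 2),
    }

quote = aud_quote(5_000)
# 5,000 USD at a locked 1.56 -> 7,800 AUD ex GST, 780 GST, 8,580 inc GST
```

A quote shaped like this makes the buyer's total exposure legible in one glance; a bare USD figure with "we'll handle the conversion" does not.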
The ABN check belongs here. An ABN is free, public, and a 30-second lookup on the ABR. A vendor without one cannot issue a valid tax invoice, which means the buyer cannot claim GST input credits. If the answer is "we operate through a US LLC and bill via Stripe", the vendor is asking the buyer to absorb the legal and tax frame the vendor decided to skip.
4. How do you handle Privacy Act and OAIC compliance?
The 2024 Privacy Act amendments raised the bar on how AI systems handle personal information. Consent has to be specific and informed. Purpose has to be named at collection. Access requests have to be honoured inside 30 days. The OAIC notification regime applies to data breaches an AI pipeline can absolutely cause — a model call that logs a redacted-but-not-redacted-enough document into a third-party telemetry surface, for instance.
A good answer references the amendments specifically, walks through how the system handles consent and purpose, and explains the redaction step before any model call. A bad answer is the wave-of-the-hand: "we're SOC 2 compliant". SOC 2 is a US framework and does not substitute for an Australian Privacy Act review. A consultant who confuses GDPR posture for Privacy Act posture is one regulator notice away from a difficult conversation.
5. Will you run a paid audit before quoting the build?
Free strategy calls are sales calls. The quote at the end is the price the vendor needs the project to be, calibrated against the buyer's body language during the meeting, not against the actual workflow.
A paid audit with a defined deliverable changes the dynamic. A workflow inventory. A feasibility ranking. A written recommendation. If the recommendation is "do not do this with AI", that is what gets delivered — there is no sales-margin incentive to push a project that should not exist. A consultant who charges 2,000 to 5,000 AUD for an audit and delivers a 90-day plan is operating in a different mode from one whose "free audit" is a slide deck dressed as discovery. The AI strategy consulting service page walks through what a paid audit covers.
6. Can you share a contract and walk through the IP and data clauses?
The contract is the document that survives the relationship. A vendor who cannot produce one until the deposit lands has not written one yet.
A good answer names the deliverable concretely, the timeline with milestones, the payment schedule, the IP ownership of code and configuration, the data-handling clauses with named regions and retention periods, and the change-of-scope process. A bad answer is the generic services-agreement template lifted from a 2019 web-design firm with "AI" pasted in. The clauses that matter — data residency, model-output ownership, retraining rights, termination handling — are absent because the template predates the questions.
The IP clause deserves its own attention. Who owns the prompt templates? Who owns any fine-tuned model weights? If the vendor uses your data to improve their internal tooling, is that called out and consented to? A consultant who hesitates here has not thought through the answer for the next ten clients either.
7. Can you show three shipped projects with measurable outcomes?
A portfolio of anonymised case studies with measurable outcomes is the right signal. Not aspirational. Not pilots "showing strong early indicators". Shipped projects with named timeframes, integration surfaces, and verified outcomes attached.
A good answer names the vertical, the integration stack, the timeframe, the measured outcome, and the verification path. The healthcare voice agent case study is one example of the shape — a 12-clinic GP network in NSW recovered 38% of missed bookings in 90 days using a Twilio-Vapi-Claude agent, the Practice Manager verified the numbers, the case study is anonymised but the outcome is real. The buyer should be able to read three studies of that shape from any vendor before signing.
A bad answer is the logo wall with no outcomes attached. Eight logos on a slide titled "Trusted by leaders" tell the buyer nothing. If a vendor is early and does not have three shipped projects yet, the right move is to say so and offer a discounted first project with explicit pilot terms — not to dress up workshop engagements as shipped builds. Honesty about the portfolio is itself a signal.
Red flags worth walking away from
Patterns worth treating as walk-away signals once any one of them appears:
The free strategy call dressed as an audit. A 60-minute Zoom that opens with "tell us about your business" and closes on a slide titled "Your AI Roadmap" is a sales call. The slide was made before the call.
USD-only pricing with no AUD conversion path. The buyer is being asked to carry foreign-exchange risk across a 90-day project. The vendor either does not understand Australian procurement or does not want to.
Vague answers about data residency. If the vendor cannot name the AWS region, the model API endpoint, and the storage location of the audit logs in the first 30 seconds, the vendor has not built an Australian-residency pipeline before.
A portfolio that is mostly logos. Eight logos on a slide with no outcomes attached is a marketing artefact, not a portfolio. The buyer should read three case studies with named timeframes and verified outcomes before the second meeting.
A contract that looks generic. The clauses that matter for an AI build — data residency, model-output ownership, retraining rights, OAIC notification handling — are specific enough that a generic template cannot cover them.
"AI partner" badges from vendors who issue them to anyone. Most platform-partner programmes have a low bar. A "Certified AI Partner" badge from a vendor who certifies any company that completes a 4-hour course is not a credential.
Australian-specific concerns most overseas vendors miss
The questions above are general due diligence. The following concerns are specific to Australian small businesses — the gaps where overseas vendors most frequently underestimate the local context.
Data residency in named regions. AWS Sydney (ap-southeast-2), AWS Melbourne (ap-southeast-4), Azure Australia East, and Google Cloud Sydney are the regions that matter. The Anthropic Claude API call routes through AWS Bedrock's Sydney endpoint when configured correctly. The OpenAI API does not currently offer an Australian endpoint — a constraint a buyer should know about up front.
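One way to make the residency requirement enforceable rather than aspirational is a check inside the pipeline itself that fails before any model call leaves an Australian region. A minimal sketch, with the config keys and component names invented for illustration:

```python
# Australian regions named in the text; extend as providers add regions.
AU_REGIONS = {
    "ap-southeast-2",        # AWS Sydney
    "ap-southeast-4",        # AWS Melbourne
    "australiaeast",         # Azure Australia East
    "australia-southeast1",  # Google Cloud Sydney
}

def assert_au_residency(config: dict) -> None:
    """Raise before any model call if any component's region sits
    outside the Australian allow-list."""
    for component, region in config.items():
        if region not in AU_REGIONS:
            raise ValueError(
                f"{component} is configured for {region}, "
                f"outside the Australian allow-list"
            )

# Hypothetical pipeline config -- every surface pinned to an AU region.
assert_au_residency({
    "bedrock_endpoint": "ap-southeast-2",
    "vector_store": "ap-southeast-2",
    "audit_logs": "ap-southeast-4",
})  # passes; a us-east-1 entry would raise before any data moved
```

A vendor who has built residency-constrained pipelines before will have something of this shape already; a vendor who has not will treat the question as a deployment detail to sort out later.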
AUD pricing with GST treatment. Every quote in AUD. Every conversion called out. Every USD-denominated upstream tool (Klaviyo, Gorgias, Zendesk, Stripe) named with conversion methodology so the buyer knows the actual exposure across the contract. The document processing and business process automation service pages illustrate the AUD-with-GST quoting shape.
AEDT and AEST support hours. The Australian working day overlaps with zero of London's and only about three morning hours with US Pacific's afternoon. A vendor's "24/7 support" that resolves to a US-time ticket queue is not 24/7 from the Australian buyer's seat.
Privacy Act 2024 amendments and the OAIC notification regime. Australian privacy law is its own animal. GDPR posture and SOC 2 are useful starting points and not substitutes. A vendor who can walk through the Australian Privacy Principles in the context of a model pipeline is one who has been paying attention.
Australian Consumer Law and unfair-contract-terms regime. Small-business contracts here sit under the ACCC's unfair-contract-terms regime. A vendor whose contract template was written for a US enterprise market may include clauses — one-sided indemnities, broad termination rights, retraining rights on customer data — that are unenforceable here. If no localisation has happened, the buyer is the one carrying the legal risk.
The shorthand: an Australian operator quotes in AUD, supports in AEDT, hosts in Sydney or Melbourne, has read the Privacy Act amendments, has an ABN, and has a contract read by an Australian lawyer. That pile is the difference between a local builder and an overseas vendor with a localised landing page.
A four-axis comparison framework
Two or three consultants pass the seven-question checklist. The buyer now has to compare them. A simple matrix gives the comparison a shape.
Technical depth. Can the consultant explain the orchestration layer, the model selection, the validation logic, and the integration surface in a way a technically literate buyer can follow? A vendor who answers technical questions with marketing copy will not ship the build.
Australian-operator depth. Does the consultant pass the five concerns above? Each item is binary. The score is how many the vendor passes without prompting.
Shipped portfolio. Three case studies with named verticals, integration stacks, timeframes, and verified outcomes. Pilot work that never reached production is not a shipped project.
Contract quality. A contract that names the deliverable concretely, specifies data residency and retention, addresses IP and retraining rights, and walks through change-of-scope.
A buyer who scores three shortlisted vendors across the four axes — a 1-to-5 rating per axis with written reasoning — usually finds the decision makes itself. Where the scores tie, the decider is technical depth on the specific workflow the buyer cares about most. The vendor who has shipped the closest pattern is the one most likely to ship this one.
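The scoring exercise is simple enough to run on paper, but as a sketch it looks like this. The axis names and 1-to-5 scale come from the framework above; the vendor scores are invented for illustration:

```python
AXES = ["technical_depth", "australian_operator",
        "shipped_portfolio", "contract_quality"]

def rank_vendors(scores: dict) -> list:
    """Rank vendors by total across the four axes;
    break ties on technical depth, per the framework."""
    return sorted(
        scores,
        key=lambda v: (sum(scores[v][a] for a in AXES),
                       scores[v]["technical_depth"]),
        reverse=True,
    )

# Illustrative scores for three shortlisted vendors.
vendors = {
    "vendor_a": {"technical_depth": 4, "australian_operator": 5,
                 "shipped_portfolio": 3, "contract_quality": 4},
    "vendor_b": {"technical_depth": 5, "australian_operator": 3,
                 "shipped_portfolio": 4, "contract_quality": 4},
    "vendor_c": {"technical_depth": 3, "australian_operator": 4,
                 "shipped_portfolio": 3, "contract_quality": 3},
}
rank_vendors(vendors)
# vendor_a and vendor_b tie at 16; vendor_b wins on technical depth
```

The written reasoning behind each 1-to-5 score matters more than the arithmetic: it is the artefact the buyer rereads when the shortlisted vendor's first milestone slips.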
Next step
The honest answer to "how do I pick" is: ask the seven questions, watch for the red flags, score across the four axes, and trust the work the buyer can verify. If a paid audit feels like the right next step, the audit page walks through what we run. If a different vendor passes the checklist better, that is the right outcome — the goal here is the buyer leaving equipped, not the buyer landing in our calendar.
The seven questions are the beginning of telling builders and deck-shops apart. The buyer who asks them gets a different conversation back.
If you want to see what fits your team, book a 45-minute scoping call — no slides, just a walk through your current workflows — and if an audit makes sense from there, we quote it as the paid engagement this piece describes.