Inference Control for Financial Advisory Output

Nick Clark

Regulatory Framework

The regulatory perimeter around AI-assisted advice is defined by overlapping obligations from at least four distinct legal regimes, and an architecture that intends to serve regulated advisory output must satisfy all of them simultaneously rather than treating any one as primary. In the United States, SEC Regulation Best Interest, codified at 17 CFR 240.15l-1, imposes on broker-dealers a best-interest standard that explicitly requires recommendations to reflect the retail customer's investment profile, including age, financial situation, tax status, investment objectives, experience, time horizon, liquidity needs, and risk tolerance. The Care Obligation requires that the broker-dealer exercise reasonable diligence, care, and skill, which is a process obligation as well as an outcome obligation. FINRA Rule 2111 layers on a suitability standard that requires reasonable-basis, customer-specific, and quantitative suitability determinations for every recommendation, with quantitative suitability covering the cumulative effect of a series of recommendations rather than each in isolation.

The Investment Advisers Act of 1940 imposes a fiduciary duty on registered investment advisers, and Rule 204-2 specifies the books and records that must be preserved. Among these are records of the basis for any recommendation, communications relating to recommendations, and any document that creates or reflects the advisory relationship. Form ADV requires disclosure of conflicts of interest, fee structures, and disciplinary history. When a generative system participates in producing a recommendation, the basis for that recommendation, the constraints that governed it, and the disclosures that accompanied it all become recordable events under 204-2 that must be retrievable on demand.

In Europe, MiFID II Articles 24 and 25 impose an analogous suitability obligation on investment firms providing advice or portfolio management, requiring firms to obtain information about the client's knowledge and experience, financial situation, and investment objectives, and to ensure that recommendations are suitable on that basis. Recital 87 and the supplementing Commission Delegated Regulation (EU) 2017/565 require firms to maintain records demonstrating that suitability was assessed, with the suitability report itself being a regulatory deliverable. The FCA's Conduct of Business Sourcebook applies the same logic in the United Kingdom through COBS 9 and 9A. The EU AI Act, in force since August 2024 with high-risk obligations phasing in through 2026 and 2027, classifies systems used for credit scoring and creditworthiness assessment of natural persons as high-risk under Annex III, triggering requirements for risk management, data governance, technical documentation, record-keeping, transparency, human oversight, accuracy, robustness, and cybersecurity. Advisory systems that recommend leverage, margin, or credit products can fall within this classification where the recommendation rests on an assessment of the client's creditworthiness, and the documentation and oversight requirements impose architectural constraints that retrofit poorly onto generate-then-filter pipelines.

Architectural Requirement

Read together, these regimes converge on a small set of architectural requirements that any compliant advisory system must satisfy. First, the system must evaluate suitability against the full and current client profile at the moment a recommendation is formed, not against a simplified or cached version of the profile. Second, the system must enforce licensing and registration boundaries so that recommendations falling outside the advisor's authorized product scope cannot be produced regardless of how the user phrases the request. Third, the system must apply quantitative suitability across a sequence of recommendations, accounting for the cumulative portfolio impact rather than evaluating each recommendation in isolation. Fourth, the system must attach mandatory disclosures to every recommendation in a manner that cannot be bypassed by prompt manipulation or model error. Fifth, the system must preserve a record of the basis for each recommendation, the constraints that were evaluated, and the disclosures that were attached, in a form retrievable on regulatory demand for the retention periods specified by 204-2 and MiFID II.

None of these requirements is satisfied by a language model that produces fluent text. They are requirements about the process by which an output is constructed, not about the content of the output itself. An architecture that satisfies them must place a governance evaluation at the point where each candidate semantic transition is considered, must carry persistent state through the advisory session, and must produce a lineage record of the evaluations themselves. This is the architectural shape of inference control.

Why Procedural Compliance Fails

The dominant industry pattern is procedural compliance: a generative model produces a draft recommendation, a downstream classifier or rule engine evaluates the draft against compliance constraints, and the draft is either delivered, modified, or regenerated. This pattern fails on three independent grounds, each of which is sufficient to reject it for regulated advisory use.

First, the unsuitable recommendation existed. Under Rule 204-2 and MiFID II record-keeping logic, every artifact produced in connection with an advisory relationship is potentially a record. A model that internally generated an unsuitable recommendation has produced an event that must be reconciled, even if the recommendation never reached the client. Compliance teams discovering a generation log full of suppressed unsuitable outputs face a recordkeeping and supervisory question that has no good answer.

Second, rule-based filters cannot capture contextual suitability. A ten percent allocation to emerging-market high-yield bonds may be within a rule-based concentration limit and yet manifestly unsuitable for a client six months from retirement whose stated objective is capital preservation. FINRA's customer-specific suitability standard requires evaluating the recommendation against the full client context, which is a semantic determination that rule engines approximate but cannot reduce to enumerated thresholds. The combinatorial space of client profile, existing holdings, market context, and product characteristics is too large for any tractable rule set to cover, and every gap in the rule set is a compliance exposure.

Third, post-generation filtering cannot satisfy the quantitative suitability standard of FINRA Rule 2111.05(c) because that standard is about cumulative effect across a series of recommendations. A filter operating on a single output has no view into the trajectory. By the time the filter sees the third recommendation in a sequence that has incrementally drifted the portfolio away from the client's stated risk tolerance, each individual recommendation may pass its individual filter while the aggregate clearly violates the quantitative suitability standard. The drift is a property of the session, not the message, and procedural compliance attached to messages cannot detect it.
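The drift can be made concrete with a small worked example. The sketch below is illustrative only: the `Recommendation` type, the two thresholds, and the starting allocation are hypothetical values chosen to show the effect, not figures drawn from any rule.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Recommendation:
    description: str
    equity_shift_pct: float  # change in equity allocation, in percentage points


# Hypothetical thresholds, chosen only to make the drift visible.
MAX_SHIFT_PER_MESSAGE = 5.0  # per-message rule applied by the filter
MAX_EQUITY_PCT = 40.0        # ceiling implied by the client's stated risk tolerance


def message_filter(rec: Recommendation) -> bool:
    """Per-message check: sees one recommendation and nothing else."""
    return abs(rec.equity_shift_pct) <= MAX_SHIFT_PER_MESSAGE


def first_cumulative_breach(start_equity_pct: float,
                            recs: List[Recommendation]) -> Optional[int]:
    """Session-level check: index of the first recommendation whose
    cumulative effect exceeds the ceiling, or None if none does."""
    level = start_equity_pct
    for index, rec in enumerate(recs):
        level += rec.equity_shift_pct
        if level > MAX_EQUITY_PCT:
            return index
    return None


session = [
    Recommendation("add broad equity fund", 4.0),
    Recommendation("add sector equity fund", 4.0),
    Recommendation("add emerging-market equity fund", 4.0),
]

per_message_results = [message_filter(rec) for rec in session]
breach_index = first_cumulative_breach(30.0, session)
```

Starting from a 30 percent equity allocation, every entry in `per_message_results` is `True`, yet `breach_index` is 2: the third recommendation takes the allocation to 42 percent, past the 40 percent ceiling. Only the session-level function sees the breach.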

What AQ Inference-Control Provides

The Adaptive Query inference-control primitive places a semantic admissibility gate at the point where each candidate transition in the generation process is considered. The agent's persistent state carries the client's risk profile, the investment horizon, the existing portfolio composition, the advisor's licensing and registration scope, the jurisdictional regulatory frame, and a running record of recommendations made in the session. Every candidate transition is evaluated against this composite state before it is permitted to commit to the output stream.

Concretely, a transition that would name a product outside the advisor's Series 6 or Series 7 scope is inadmissible. A transition that would push portfolio concentration beyond the client's stated risk tolerance is inadmissible. A transition that would describe expected returns without the disclosures required by Form ADV Part 2A is inadmissible. A transition that would constitute a guarantee of future performance under FINRA 2210(d)(1)(B) is inadmissible. The inference engine does not generate these transitions and then suppress them. The engine steers the generation trajectory around them in real time, producing output that is compliant by construction rather than compliant by filtration.
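A minimal sketch of such a gate follows. It assumes a hypothetical `Transition` record whose fields have already been populated by upstream analysis, and it reduces steering to choosing the first admissible candidate from a list; none of the names below come from the Adaptive Query specification, and the four predicates are simplified stand-ins for the real determinations.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Optional, Sequence


@dataclass(frozen=True)
class SessionState:
    licensed_products: FrozenSet[str]  # product types inside the advisor's scope
    max_position_pct: float            # ceiling derived from stated risk tolerance


@dataclass(frozen=True)
class Transition:
    text: str
    product_type: str
    resulting_position_pct: float
    states_expected_return: bool
    carries_disclosure: bool
    guarantees_performance: bool


# A constraint returns None when satisfied, otherwise a reason string.
Constraint = Callable[[SessionState, Transition], Optional[str]]


def within_license(state: SessionState, t: Transition) -> Optional[str]:
    if t.product_type not in state.licensed_products:
        return "product outside licensing scope"
    return None


def within_concentration(state: SessionState, t: Transition) -> Optional[str]:
    if t.resulting_position_pct > state.max_position_pct:
        return "concentration beyond stated risk tolerance"
    return None


def disclosure_attached(state: SessionState, t: Transition) -> Optional[str]:
    if t.states_expected_return and not t.carries_disclosure:
        return "expected return stated without required disclosure"
    return None


def no_guarantee(state: SessionState, t: Transition) -> Optional[str]:
    if t.guarantees_performance:
        return "guarantee of future performance"
    return None


CONSTRAINTS: Sequence[Constraint] = (
    within_license, within_concentration, disclosure_attached, no_guarantee,
)


def violations(state: SessionState, t: Transition) -> List[str]:
    """Evaluate every constraint; an empty list means the transition is admissible."""
    reasons = []
    for constraint in CONSTRAINTS:
        reason = constraint(state, t)
        if reason is not None:
            reasons.append(reason)
    return reasons


def select_admissible(state: SessionState,
                      candidates: Sequence[Transition]) -> Optional[Transition]:
    """Commit the first candidate with no violations; rejected ones are never emitted."""
    for t in candidates:
        if not violations(state, t):
            return t
    return None
```

Returning the full list of reasons rather than a single boolean is a deliberate choice in this sketch: the reasons are what a lineage record would later need to preserve.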

The persistent state accumulates through the advisory session. As recommendations commit, their projected portfolio impact updates the state, and subsequent transitions are evaluated against the updated state. This is the architectural answer to quantitative suitability under FINRA Rule 2111.05(c): the cumulative effect of recommendations is tracked as a first-class state variable, and transitions that would push the cumulative effect outside the client's profile are gated at the point of generation.
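One way to picture the running state is as an object that projects each candidate's impact before committing it. The sketch below is an assumption-laden illustration: `PortfolioState`, its per-asset-class `limits`, and the percentage-point `shifts` are hypothetical, and a production system would derive them from the client profile and positions of record.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PortfolioState:
    weights: Dict[str, float]  # asset class -> percent of portfolio
    limits: Dict[str, float]   # asset class -> maximum percent for this client
    committed: List[str] = field(default_factory=list)

    def projected(self, shifts: Dict[str, float]) -> Dict[str, float]:
        """Weights as they would stand if the candidate were committed."""
        result = dict(self.weights)
        for asset_class, delta in shifts.items():
            result[asset_class] = result.get(asset_class, 0.0) + delta
        return result

    def admissible(self, shifts: Dict[str, float]) -> bool:
        """True when the projected weights stay within every limit."""
        after = self.projected(shifts)
        return all(after.get(name, 0.0) <= cap for name, cap in self.limits.items())

    def commit(self, label: str, shifts: Dict[str, float]) -> bool:
        """Gate first, then update; a rejected candidate leaves the state untouched."""
        if not self.admissible(shifts):
            return False
        self.weights = self.projected(shifts)
        self.committed.append(label)
        return True


state = PortfolioState(
    weights={"equity": 30.0, "bonds": 60.0, "cash": 10.0},
    limits={"equity": 40.0},
)
first = state.commit("broad equity fund", {"equity": 4.0, "cash": -4.0})
second = state.commit("sector equity fund", {"equity": 4.0, "bonds": -4.0})
third = state.commit("emerging-market fund", {"equity": 4.0, "bonds": -4.0})
```

Here `first` and `second` are `True` and `third` is `False`: the third candidate is evaluated against the 38 percent equity weight left by the first two, not against the original 30 percent.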

Lineage recording captures, for each committed transition, which constraints were evaluated, which were satisfied, which were tightened by accumulated session state, and which references in the regulatory frame were dispositive. The lineage is the recordkeeping artifact that 204-2 and MiFID II Article 25(6) require. It is produced as a byproduct of the governance evaluation rather than reconstructed after the fact, and it is retrievable as a contemporaneous record of the basis for the recommendation.
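The shape of such a record might look like the following sketch. The field names mirror the four items listed above; the hash chaining that makes later alteration detectable is an added assumption of this example, not something the regulations or the source architecture prescribe.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Dict, List, Sequence


def _digest(body: Dict) -> str:
    """Stable SHA-256 over the entry body (keys sorted for determinism)."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


class Lineage:
    """Append-only record of gate evaluations, written as each transition commits."""

    def __init__(self) -> None:
        self.entries: List[Dict] = []

    def append(self, transition: str, evaluated: Sequence[str],
               satisfied: Sequence[str], tightened: Sequence[str],
               references: Sequence[str]) -> Dict:
        body = {
            "sequence": len(self.entries),
            "recorded_at": datetime.now(timezone.utc).isoformat(),
            "transition": transition,
            "constraints_evaluated": list(evaluated),
            "constraints_satisfied": list(satisfied),
            "tightened_by_session_state": list(tightened),
            "dispositive_references": list(references),
            # Link to the previous entry so that edits break the chain.
            "prev_hash": self.entries[-1]["entry_hash"] if self.entries else "",
        }
        entry = dict(body, entry_hash=_digest(body))
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash and check each link to its predecessor."""
        previous = ""
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "entry_hash"}
            if entry["prev_hash"] != previous or _digest(body) != entry["entry_hash"]:
                return False
            previous = entry["entry_hash"]
        return True
```

Because each entry is written at the moment of evaluation and linked to the one before it, `verify()` gives a reviewer a cheap check that the retrieved record is the contemporaneous one.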

Compliance Mapping

Each obligation in the regulatory frame maps to a specific element of the inference-control architecture. The Care Obligation of Reg BI 15l-1(a)(2)(ii) is satisfied by the requirement that every transition be evaluated against the customer's investment profile carried in persistent state. The Conflict of Interest Obligation is satisfied by encoding firm-level and advisor-level conflict disclosures in the regulatory frame and gating any transition that would recommend a conflicted product without the corresponding disclosure. The Compliance Obligation is satisfied by the lineage record itself, which evidences the policies and procedures reasonably designed to achieve compliance.

Reasonable-basis suitability under FINRA Rule 2111.05(a) is enforced by the constraint that products outside the advisor's diligence-completed scope are inadmissible regardless of client profile. Customer-specific suitability under Rule 2111.05(b) is enforced by the per-transition evaluation against the client profile. Quantitative suitability under Rule 2111.05(c) is enforced by the running portfolio-impact state. Investment Advisers Act Section 206 fiduciary duty is supported by the requirement that every transition be evaluated under the duty of care and duty of loyalty constraints encoded in the regulatory frame. Rule 204-2(a)(7) and (a)(16) record retention is satisfied by the lineage artifact and its retention configuration. Form ADV Part 2A disclosure obligations are enforced by gating transitions that describe fees, conflicts, or risks without the corresponding disclosure tokens.

For European deployments, MiFID II Article 25(2) suitability assessment maps to the persistent client profile and the per-transition evaluation. Article 25(6) suitability statement obligations map to the lineage record. The EU AI Act Article 9 risk management system, Article 10 data governance, Article 12 record-keeping, Article 13 transparency, Article 14 human oversight, and Article 15 accuracy and robustness obligations each have specific architectural correspondents in inference control: the regulatory frame is the risk management artifact, the persistent state is the data governance boundary, the lineage is the record-keeping output, the transition rationale is the transparency layer, the inadmissibility-driven escalation is the human oversight integration, and the deterministic admissibility evaluation is the accuracy and robustness foundation.

Adoption Pathway

Adoption proceeds in three stages. In the first stage, a wealth management platform identifies the regulatory frame applicable to its advisory operations, encodes the client profile schema required by Reg BI and FINRA 2111 into the persistent state object, and instruments its existing generation pipeline to surface candidate transitions to the inference-control gate. The gate is initially run in shadow mode, recording admissibility decisions without yet enforcing them. The shadow mode lineage is reviewed against historical advisory output to validate that the gate's evaluations align with the firm's compliance judgments.
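Shadow mode amounts to running the same evaluation with enforcement switched off. The sketch below is a hypothetical illustration: `run_gate`, `agreement_rate`, and the idea of comparing logged decisions to reviewer labels keyed by candidate text are assumptions made for the example, not a described interface.

```python
from typing import Callable, Dict, List, Optional


def run_gate(candidate: str,
             is_admissible: Callable[[str], bool],
             enforce: bool,
             log: List[Dict]) -> Optional[str]:
    """Evaluate and log every candidate; block it only when enforcing."""
    admissible = is_admissible(candidate)
    log.append({"candidate": candidate, "admissible": admissible, "enforced": enforce})
    if enforce and not admissible:
        return None
    return candidate


def agreement_rate(log: List[Dict], reviewer_labels: Dict[str, bool]) -> float:
    """Share of logged decisions matching the reviewers' judgment.
    Only candidates that reviewers actually labeled are counted."""
    scored = [entry for entry in log if entry["candidate"] in reviewer_labels]
    if not scored:
        raise ValueError("no logged candidates have reviewer labels")
    matches = sum(
        1 for entry in scored
        if entry["admissible"] == reviewer_labels[entry["candidate"]]
    )
    return matches / len(scored)
```

In shadow mode (`enforce=False`) an inadmissible candidate is still returned to the pipeline, but the decision is logged; the agreement rate against compliance reviewers' labels is one possible signal for deciding when to switch enforcement on.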

In the second stage, the gate is moved into enforcement on a defined product and client scope, typically beginning with self-directed advisory tools where the regulatory exposure is most acute and the human-advisor fallback is built into the product. The portfolio-impact state is connected to the firm's positions-of-record system so that quantitative suitability is evaluated against actual rather than synthetic holdings. Disclosure templates from Form ADV Part 2A and any applicable product prospectuses are encoded as required-token constraints on terminal transitions.
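A required-token constraint can be sketched as a check that refuses to close a response while a triggered disclosure is absent. The token strings and the `topics` argument below are hypothetical: the example assumes upstream analysis has already identified which topics the draft discusses, and real disclosure text would come from the firm's own templates.

```python
from typing import Dict, Iterable, List

# Hypothetical marker tokens standing in for full disclosure templates.
REQUIRED_TOKENS: Dict[str, str] = {
    "fees": "[DISCLOSURE:FEES]",
    "conflicts": "[DISCLOSURE:CONFLICTS]",
    "risks": "[DISCLOSURE:RISKS]",
}


def missing_disclosures(draft: str, topics: Iterable[str]) -> List[str]:
    """Tokens required by the discussed topics that the draft does not yet contain."""
    missing = []
    for topic in sorted(set(topics)):
        token = REQUIRED_TOKENS[topic]
        if token not in draft:
            missing.append(token)
    return missing


def may_terminate(draft: str, topics: Iterable[str]) -> bool:
    """The terminal transition is admissible only when nothing is missing."""
    return not missing_disclosures(draft, topics)


draft = "The fund charges 0.4% annually. [DISCLOSURE:FEES]"
outstanding = missing_disclosures(draft, ["fees", "risks"])
```

For this draft, `outstanding` is `["[DISCLOSURE:RISKS]"]`, so the response cannot end until the risk disclosure has been emitted.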

In the third stage, the lineage retention configuration is tuned to satisfy the firm's 204-2 retention schedule, which for most categories is five years in an easily accessible place, with the first two years in an appropriate office of the adviser. The lineage is integrated with the firm's existing supervisory review workflows so that flagged transitions are routed to human supervisors with the full evaluation context, and the supervisory determinations are recorded back into the lineage. At this stage the firm has an inference-controlled advisory architecture that is compliant by construction, supervisable by design, and auditable by retrieval rather than by reconstruction.
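A retention configuration of this kind reduces to simple date arithmetic. The sketch below assumes the retention period runs from the end of the fiscal year in which the last entry was made, and that the fiscal year does not end on February 29; the function names, the two-tier split, and the default year counts are illustrative parameters that a firm would replace with its own schedule.

```python
from datetime import date
from typing import Dict


def fiscal_year_end(last_entry: date, fy_end_month: int = 12,
                    fy_end_day: int = 31) -> date:
    """End of the fiscal year that contains the last entry date."""
    candidate = date(last_entry.year, fy_end_month, fy_end_day)
    if last_entry <= candidate:
        return candidate
    return date(last_entry.year + 1, fy_end_month, fy_end_day)


def retention_deadlines(last_entry: date, total_years: int = 5,
                        near_tier_years: int = 2, fy_end_month: int = 12,
                        fy_end_day: int = 31) -> Dict[str, date]:
    """Dates until which a lineage record stays in the near tier and in retention."""
    end = fiscal_year_end(last_entry, fy_end_month, fy_end_day)
    return {
        "near_tier_until": date(end.year + near_tier_years, end.month, end.day),
        "retain_until": date(end.year + total_years, end.month, end.day),
    }


deadlines = retention_deadlines(date(2024, 3, 15))
```

Under these assumptions, a record last written on 15 March 2024 by a calendar-year firm stays in the near tier until 31 December 2026 and is retained until 31 December 2029.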