Semantic Discovery for Competitive Intelligence

Nick Clark

Regulatory framework

The Defend Trade Secrets Act of 2016 created a federal civil cause of action for trade secret misappropriation and operates alongside the Economic Espionage Act, which criminalizes the theft or unauthorized acquisition of trade secrets. Both statutes turn on whether information was obtained through proper means. Public observation, lawful purchase, reverse engineering of a lawfully obtained product, and inference from publicly available materials are proper. Acquisition through breach of duty, deception, or unauthorized access is improper. The EU Trade Secrets Directive, Directive (EU) 2016/943, establishes a parallel regime across member states with a comparable distinction between proper and improper means. A competitive intelligence program that cannot demonstrate, for any given derived insight, that every source contributing to the conclusion was obtained through proper means is exposed to misappropriation claims regardless of the program's actual conduct.

The Computer Fraud and Abuse Act criminalizes accessing a protected computer without authorization or in excess of authorized access, and the Electronic Communications Privacy Act restricts the interception of electronic communications. Together these statutes draw the line between scraping a public website under its terms of service and circumventing access controls to obtain non-public material. The boundary is enforced both criminally and through civil claims, and the Supreme Court's narrowing of the CFAA in Van Buren v. United States has not eliminated the exposure for intelligence programs that operate without disciplined source provenance.

GDPR Article 5 imposes lawfulness, purpose limitation, and data minimization on the processing of personal data, and Article 22 restricts decisions based solely on automated processing where they produce legal or similarly significant effects. Competitive intelligence programs that ingest job postings, biographical information from professional networks, or content authored by identifiable individuals are processing personal data within the meaning of the regulation and inherit the full obligation set, including the documentation that demonstrates compliance. Section 5 of the FTC Act treats unfair or deceptive acts or practices as actionable, and federal enforcement has applied that authority to intelligence-gathering practices that involve pretexting or material deception. The SCIP Code of Ethics is not a statute, but it is the de facto industry standard against which a fact-finder will measure the reasonableness of an intelligence program's conduct. NIST SP 800-53 control SC-7, governing boundary protection, defines the security expectation under which an intelligence platform that ingests external sources must operate when deployed inside a regulated enterprise.

Architectural requirement

The convergent legal expectation across this landscape is that an enterprise competitive intelligence program must be able to demonstrate, for any given strategic conclusion, the full provenance of the signals that produced it: which sources, accessed under which terms, processed under which lawful basis, and weighted in what manner against the reliability and legal status of each source. This is not a documentation overlay on top of an analytic process. It is an architectural requirement on the analytic process itself. The system that performs cross-source synthesis must be the same system that records the provenance, because reconstructing provenance after the fact, from a synthesized conclusion back to its constituent sources, is what a misappropriation defendant cannot reliably do under litigation pressure.

Such a system must persist competitive context across time so that a longitudinal analysis remains coherent under analyst turnover and source change. It must differentiate signal weight by source legal status: a public regulatory filing is weighted differently from a vendor-supplied dataset, which in turn is weighted differently from an aspirational job posting. It must enforce boundary conditions that prevent ingestion of sources whose acquisition would itself be improper. And it must produce, on demand, an audit trail that satisfies both the firm's internal compliance review and the discovery posture in litigation.

Why procedural compliance fails

The standard procedural response to these obligations is a competitive-intelligence policy document, an intake form, an annual training module, and a SharePoint repository of analyst reports. The policy declares the firm's adherence to the SCIP code, prohibits improper sourcing, and requires analysts to attest to compliance. The intake form asks the analyst to identify sources. The training module reviews trade secret law. The repository stores deliverables. None of these controls operates at the layer where competitive intelligence work actually occurs, which is the layer of cross-source synthesis performed by humans assisted by aggregation tooling that itself does not record provenance in a litigation-defensible form.

Dashboard-based aggregation platforms compound the problem. The analyst sees patent alerts, hiring alerts, and earnings call alerts on a single screen, but the cross-source connections that produce strategic insight are made in the analyst's head and committed to a deliverable document. The connection between the conclusion and its sources is not recorded; only the conclusion and a reading list are. Under DTSA litigation, a defendant whose deliverable cannot be decomposed into sourced components is in the position of arguing that its conclusions emerged from a process the defendant cannot describe. That argument loses.

AI-assisted summarization does not solve the cross-source synthesis problem. A summary of patent filings and a summary of job postings are still siloed analyses. The strategic connection that links a battery-technology patent cluster to a supply-chain hiring pattern to an earnings-call vertical-integration statement requires traversal across source types under a persistent competitive question, with each step in the traversal recorded with its source, source type, legal status, and weight. Procedural compliance has no mechanism to produce that record because the work the record would describe is happening in the analyst's cognition, outside any system the policy can govern.

What the AQ primitive provides

Semantic discovery, in the Adaptive Query architecture, treats the competitive analysis as a persistent discovery object that traverses across source types under explicit governance. The discovery object carries the competitive context (the competitors being tracked, the strategic questions under investigation, and the patterns accumulated over time) together with the lawful-basis declaration under which the analysis is being conducted. Traversal is the operation by which the discovery object follows semantic connections across source types, linking a patent filing to a related job posting to a related earnings-call statement. Every step of every traversal is recorded with its source, source type, access basis, and the weight applied to the resulting signal.
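The per-step record described above can be made concrete with a minimal sketch. It is an illustration under stated assumptions, not the Adaptive Query implementation: the class names, field names, and example values (DiscoveryObject, TraversalStep, "legitimate_interest", and so on) are hypothetical and chosen only to mirror the four recorded attributes named in the text.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class TraversalStep:
    """One recorded hop of a traversal. Field names are illustrative."""
    source_id: str
    source_type: str    # e.g. "patent_filing", "job_posting", "earnings_call"
    access_basis: str   # e.g. "public_record", "licensed_dataset"
    weight: float       # weight applied to the resulting signal
    recorded_at: str    # UTC timestamp in ISO 8601 form


@dataclass
class DiscoveryObject:
    """Persistent competitive context plus its traversal log (hypothetical shape)."""
    competitors: list
    strategic_question: str
    lawful_basis: str
    steps: list = field(default_factory=list)

    def record_step(self, source_id, source_type, access_basis, weight):
        """Append one traversal step to the log and return it."""
        step = TraversalStep(
            source_id=source_id,
            source_type=source_type,
            access_basis=access_basis,
            weight=weight,
            recorded_at=datetime.now(timezone.utc).isoformat(),
        )
        self.steps.append(step)
        return step

    def provenance(self):
        """Return (source, source type, access basis, weight) for every step."""
        return [
            (s.source_id, s.source_type, s.access_basis, s.weight)
            for s in self.steps
        ]


# Usage: one traversal linking a patent filing to a job posting.
analysis = DiscoveryObject(
    competitors=["ExampleCo"],
    strategic_question="Is ExampleCo vertically integrating battery supply?",
    lawful_basis="legitimate_interest",
)
analysis.record_step("patent-001", "patent_filing", "public_record", 1.0)
analysis.record_step("job-017", "job_posting", "public_posting", 0.3)
```

The design point is that the step record is created by the same call that performs the traversal, so the provenance exists as a by-product of the analysis rather than being reconstructed from the conclusion afterwards.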

Trust-scoped resolution differentiates source reliability and source legal status as separate dimensions. A patent filing is a public record with verified content, weighted as authoritative. An earnings call statement is regulated by securities law and weighted as forward-looking but constrained. A job posting is public but aspirational and weighted accordingly. A vendor-supplied dataset is weighted by its license terms and provenance declaration. A scraped source whose acquisition would violate CFAA, ECPA, or a site's enforceable terms is not weighted at all: it is excluded at the boundary, and the exclusion itself is recorded so that the audit trail demonstrates not only what was used but what was rejected.
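One way to picture the two dimensions and the boundary exclusion is the sketch below. It is an assumption-laden illustration, not the product's resolution logic: the LegalStatus categories, the numeric base weights, the multiplicative combination of status and reliability, and the IngestionGate name are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class LegalStatus(Enum):
    """Hypothetical legal-status categories drawn from the examples in the text."""
    PUBLIC_RECORD = "public_record"                # e.g. patent filing
    REGULATED_DISCLOSURE = "regulated_disclosure"  # e.g. earnings call statement
    PUBLIC_ASPIRATIONAL = "public_aspirational"    # e.g. job posting
    LICENSED = "licensed"                          # e.g. vendor-supplied dataset
    IMPROPER = "improper"                          # acquisition would be unlawful


# Illustrative base weights only; real values would be set by policy.
BASE_WEIGHT = {
    LegalStatus.PUBLIC_RECORD: 1.0,
    LegalStatus.REGULATED_DISCLOSURE: 0.8,
    LegalStatus.LICENSED: 0.6,
    LegalStatus.PUBLIC_ASPIRATIONAL: 0.4,
}


@dataclass
class IngestionGate:
    """Admits or rejects sources at the boundary and records both outcomes."""
    accepted: list = field(default_factory=list)
    rejected: list = field(default_factory=list)

    def admit(self, source_id, status, reliability, reason=""):
        """Return the signal weight, or None if the source is excluded."""
        if not 0.0 <= reliability <= 1.0:
            raise ValueError("reliability must be between 0 and 1")
        if status is LegalStatus.IMPROPER:
            # Excluded sources are never weighted, but the rejection is logged.
            self.rejected.append({"source_id": source_id, "reason": reason})
            return None
        # Legal status and reliability stay separate inputs to the weight.
        weight = BASE_WEIGHT[status] * reliability
        self.accepted.append({
            "source_id": source_id,
            "status": status.value,
            "reliability": reliability,
            "weight": weight,
        })
        return weight


gate = IngestionGate()
gate.admit("patent-001", LegalStatus.PUBLIC_RECORD, reliability=1.0)
gate.admit("scrape-009", LegalStatus.IMPROPER, reliability=0.9,
           reason="acquisition would circumvent access controls")
```

Keeping the rejected list alongside the accepted list is what lets the audit trail show what was refused as well as what was used.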

The persistent discovery object enables longitudinal competitive tracking under analyst turnover. A competitor analysis that runs continuously over months accumulates strategic context. New analysts contribute findings into the same governed object rather than starting fresh, and each new signal is evaluated against the accumulated pattern with its provenance and weight. Collaborative traversal allows a patent analyst, a financial analyst, and a regulatory analyst to contribute to a shared discovery object with role-scoped access, producing the cross-disciplinary synthesis that competitive intelligence requires while preserving the per-source, per-step audit trail that GDPR Article 5, the SCIP code, and DTSA defensibility require. The discovery object is the artifact that, under a litigation hold or a regulator's information request, can be produced as the evidentiary record of how the firm's competitive conclusions were formed.
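Role-scoped contribution into a shared object might look like the following sketch. The role names, the mapping of roles to source types, and the SharedDiscoveryObject class are hypothetical; the text does not specify how scopes are defined, so this only illustrates the idea that each finding is checked against the contributor's role and stored with attribution.

```python
from dataclasses import dataclass, field

# Hypothetical role scopes: which source types each role may contribute.
ROLE_SCOPES = {
    "patent_analyst": {"patent_filing"},
    "financial_analyst": {"earnings_call", "regulatory_filing"},
    "regulatory_analyst": {"regulatory_filing"},
}


@dataclass
class SharedDiscoveryObject:
    """A shared analysis that several analysts contribute to (illustrative)."""
    question: str
    findings: list = field(default_factory=list)

    def contribute(self, analyst, role, source_type, source_id, note):
        """Record a finding if the role's scope covers the source type."""
        if source_type not in ROLE_SCOPES.get(role, set()):
            raise PermissionError(
                f"role {role!r} may not contribute {source_type!r} findings"
            )
        entry = {
            "analyst": analyst,
            "role": role,
            "source_type": source_type,
            "source_id": source_id,
            "note": note,
        }
        self.findings.append(entry)
        return entry

    def handover_view(self):
        """Everything a newly assigned analyst inherits, with attribution."""
        return list(self.findings)


shared = SharedDiscoveryObject("Is ExampleCo vertically integrating?")
shared.contribute("analyst_a", "patent_analyst", "patent_filing",
                  "patent-001", "battery-cell cluster")
shared.contribute("analyst_b", "financial_analyst", "earnings_call",
                  "call-2024Q2", "vertical-integration statement")
```

Because findings accumulate in the object rather than in any one analyst's files, a successor starts from handover_view() instead of from a blank page.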

Compliance mapping

Against the Defend Trade Secrets Act and the Economic Espionage Act, semantic discovery's per-source provenance record demonstrates that every signal contributing to a strategic conclusion was obtained through proper means and that improper sources were excluded at the boundary. Against the EU Trade Secrets Directive, the same record satisfies the parallel proper-means standard. Against CFAA and ECPA, the boundary-enforcement layer prevents ingestion of sources whose acquisition would constitute unauthorized access or interception, and records the exclusion. Against GDPR Articles 5 and 22, the lawful-basis declaration carried by the discovery object and the per-step processing record satisfy the documentation requirements, and the analyst-in-the-loop architecture keeps consequential decisions out of the Article 22 sole-automated-decision category. Against Section 5 of the FTC Act, the documented exclusion of pretexted or deceptive sources demonstrates that the program does not engage in unfair or deceptive practices. Against the SCIP Code of Ethics, the auditable trail provides the evidence of code adherence that the code itself contemplates but does not technically implement. Against NIST SP 800-53 SC-7, the boundary-protection posture is realized as an enforced ingestion gate rather than a policy declaration.

Adoption pathway

A corporate strategy organization adopting semantic discovery establishes a governed competitive-intelligence substrate in which each tracked competitor and each strategic question is a persistent discovery object. Analysts work inside the object rather than around it, contributing findings whose provenance and weight are captured at the point of contribution. Compliance and legal define the lawful-basis declarations and the boundary policies that govern source ingestion, and the legal team retains the ability to apply litigation-hold and regulator-response postures directly against the discovery objects rather than against ad hoc analyst deliverables scattered across SharePoint sites and personal drives.
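Applying a litigation hold directly against a discovery object, rather than against scattered deliverables, can be sketched as follows. This is a simplified illustration: the GovernedRecord and HoldViolation names, the matter identifier, and the JSON export shape are assumptions, and a real deployment would sit on durable, access-controlled storage.

```python
import json
from dataclasses import dataclass, field


class HoldViolation(Exception):
    """Raised when a purge is attempted on a record under litigation hold."""


@dataclass
class GovernedRecord:
    """A discovery object's stored record with a hold flag (hypothetical shape)."""
    object_id: str
    steps: list = field(default_factory=list)
    hold_matter: str = ""   # empty string means no hold is in force

    def apply_hold(self, matter_id):
        """Place the record under hold for the given legal matter."""
        self.hold_matter = matter_id

    def purge(self):
        """Delete recorded steps, unless a hold forbids it."""
        if self.hold_matter:
            raise HoldViolation(
                f"{self.object_id} is held for matter {self.hold_matter}"
            )
        self.steps.clear()

    def export(self):
        """Serialize the record for production to counsel or a regulator."""
        return json.dumps(
            {
                "object_id": self.object_id,
                "hold_matter": self.hold_matter,
                "steps": self.steps,
            },
            sort_keys=True,
        )


record = GovernedRecord("competitor-exampleco")
record.steps.append({"source_id": "patent-001", "access_basis": "public_record"})
record.apply_hold("matter-0001")
produced = record.export()
```

In this sketch the hold blocks deletion but not further contribution, so the analysis can continue while the existing record is preserved.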

Initial deployment typically begins with a single high-stakes analysis, such as a market-entry assessment, a technology-scouting program, or an acquisition-target evaluation, where the litigation and regulatory exposure of an ungoverned process is clearest and the auditable trail produces immediate defensibility benefits. The second-stage rollout extends the substrate across the firm's portfolio of competitive analyses, replacing fragmented monitoring streams with a coherent governed traversal layer. The end-state configuration is an enterprise-wide competitive-intelligence platform in which legal, compliance, strategy, R&D, and corporate development all operate against a shared governed substrate, the proper-means standard of DTSA and the EU Trade Secrets Directive is enforced architecturally rather than asserted procedurally, and the firm can demonstrate to a court, a regulator, or an auditor that its competitive intelligence program is conducted under the governance that the surrounding legal landscape expects.