Semantic Discovery for Patent Landscape Analysis

by Nick Clark | Published March 27, 2026

Patent landscape analysis is the evidentiary foundation of patentability opinions, freedom-to-operate clearances, validity challenges, and licensing strategy. Yet the prevailing tooling, Boolean keyword search layered over CPC and IPC classification filters, confines discovery to the vocabulary a drafter happened to use and the classification code an examiner happened to assign. Neither the Manual of Patent Examining Procedure nor 35 U.S.C. §§ 102 and 103 limits prior art to a classification bucket; the relevant universe is every publicly accessible publication. Semantic discovery, anchored in a persistent governed traversal object, follows technical concepts across CPC boundaries, across vocabularies, and across jurisdictions, producing landscape evidence built to be defended in a Patent Trial and Appeal Board inter partes review or offered under the Federal Rules of Evidence.


Regulatory Framework

Patent landscape analysis is not an unregulated research activity. It is the empirical predicate for legal opinions that carry duties under the Manual of Patent Examining Procedure (MPEP) and substantive Title 35 obligations. MPEP § 904 directs examiners to conduct a thorough search of the prior art, and applicants, who submit information disclosure statements under 37 C.F.R. §§ 1.97 and 1.98, owe a duty of candor under 37 C.F.R. § 1.56 that extends to material references discovered during their own diligence. Under 35 U.S.C. § 102, anticipating prior art includes any patent, printed publication, public use, or sale anywhere in the world before the effective filing date, a scope that is jurisdictionally and linguistically unbounded. Under 35 U.S.C. § 103, obviousness analysis under KSR v. Teleflex demands consideration of analogous art across technical fields when a person of ordinary skill would have looked there for a solution.

The institutional infrastructure surrounding these statutes is correspondingly broad. The USPTO operates Patent End-to-End (PE2E) for examiners and Patent Public Search (PPUBS), the successor to the legacy PatFT and AppFT databases, for the public. The European Patent Office maintains Espacenet and administers the EPC framework that governs grant and opposition proceedings; Supplementary Protection Certificates arise separately under EU Regulation 469/2009. The Japan Patent Office corpus is exposed through Japio and J-PlatPat. The World Intellectual Property Organization administers the Patent Cooperation Treaty and the PATENTSCOPE database, which exposes filings in dozens of national languages. Each system imposes its own classification scheme (IPC at the international level, CPC shared by the USPTO and EPO, USPC legacy codes, FI and F-term in Japan) and its own examiner conventions for vocabulary and abstract drafting.

When landscape evidence enters litigation or post-grant review, the Federal Rules of Evidence govern. A landscape report offered through expert testimony under FRE 702 must rest on a methodology that is reliable and reproducible. PTAB inter partes review under 35 U.S.C. § 311 is limited to grounds based on patents and printed publications, and institution under § 314(a) requires the petitioner to show a reasonable likelihood of prevailing, so the prior art relied on must be discoverable, citable, and defensible. A search methodology that cannot explain why a particular classification was queried, why a particular synonym was expanded, or why a particular jurisdiction was excluded cannot survive cross-examination.

Architectural Requirement

The regulatory framework imposes architectural constraints that classification-bounded search cannot satisfy. A defensible landscape system must traverse semantic neighborhoods, not classification cells. It must persist the analyst's evolving understanding across sessions, jurisdictions, and document types so that a freedom-to-operate analysis built over six weeks does not lose state when the analyst returns from a weekend. It must record the lineage of every retrieved reference — which concept anchored the query, which traversal step surfaced the document, which trust scope determined inclusion — so that the audit trail required by the duty of candor and by FRE 901 authentication can be produced on demand.

The system must also model trust scope explicitly. A granted U.S. patent and an unexamined Chinese utility model both qualify as prior art under § 102, but they carry different evidentiary weight in a § 103 obviousness argument and different operational weight in a freedom-to-operate clearance. A defensible architecture lets the analyst scope traversal to granted-only, jurisdiction-specific, or family-aware corpora and records that scoping decision in the lineage. Anything less collapses the legal distinctions that the statutes preserve.
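
One way to picture explicit trust scoping is as a filter that logs its own decisions. The sketch below is illustrative only: the `Reference` and `TrustScope` types, the `apply_scope` function, and the publication identifiers are hypothetical names invented for this example, not part of any described product API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    pub_number: str    # placeholder identifiers, not real publications
    jurisdiction: str  # e.g. "US", "EP", "CN"
    status: str        # "granted" or "published"
    family_id: str     # shared by members of one patent family


@dataclass(frozen=True)
class TrustScope:
    jurisdictions: frozenset
    granted_only: bool = False
    family_aware: bool = True


def apply_scope(refs, scope, lineage):
    """Return in-scope refs and log every inclusion decision to lineage."""
    kept, seen_families = [], set()
    for ref in refs:
        if ref.jurisdiction not in scope.jurisdictions:
            reason = "jurisdiction out of scope"
        elif scope.granted_only and ref.status != "granted":
            reason = "not granted"
        elif scope.family_aware and ref.family_id in seen_families:
            reason = "duplicate family member"
        else:
            reason = "in scope"
            kept.append(ref)
            seen_families.add(ref.family_id)
        lineage.append({"ref": ref.pub_number,
                        "included": reason == "in scope",
                        "reason": reason})
    return kept


refs = [
    Reference("US-EXAMPLE-1", "US", "granted", "fam-A"),
    Reference("CN-EXAMPLE-2", "CN", "published", "fam-B"),  # unexamined utility model
    Reference("EP-EXAMPLE-3", "EP", "granted", "fam-A"),    # same family as the US patent
]
fto_scope = TrustScope(frozenset({"US", "EP", "CN"}), granted_only=True)
lineage = []
kept = apply_scope(refs, fto_scope, lineage)
print([r.pub_number for r in kept])
```

The exclusions are as much a part of the record as the inclusions: the lineage shows that the utility model was seen and set aside under the granted-only scope rather than never retrieved.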

Why Procedural Compliance Fails

The dominant procedural response to landscape governance is checklist-driven: search teams document the CPC classes queried, the keyword expansions applied, the date ranges covered, and the databases consulted. The checklist is then attached to the opinion file as evidence of diligence. This is procedural compliance, and under the realities of cross-domain invention it systematically fails.

Consider a machine-learning method for optimizing radio resource allocation. CPC classification will place it in G06N or H04W depending on the examiner. A checklist that queries only G06N will miss the H04W filings entirely. Keyword expansion around "neural network" and "deep learning" will not surface the 1990s adaptive filtering literature in H03H that solves a mathematically equivalent problem under different vocabulary. A checklist that confines itself to USPTO Patent Public Search will miss the JPO filings under F-term theme 5K067 that describe the same approach in Japanese. Each gap is procedurally invisible, because the checklist was completed, and legally fatal, because the prior art exists, was reasonably accessible, and was missed.
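
The gap can be made concrete with a toy comparison. In the sketch below the corpus, the three-dimensional "concept vectors", and the 0.95 similarity threshold are all invented for illustration; a real system would use learned embeddings, and nothing here reflects actual patent data.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy corpus: (document id, CPC subclass, hand-made concept vector).
CORPUS = [
    ("doc-ml-alloc", "G06N", (0.9, 0.8, 0.1)),
    ("doc-radio-sched", "H04W", (0.8, 0.9, 0.2)),
    ("doc-adaptive-filter", "H03H", (0.7, 0.7, 0.3)),
    ("doc-unrelated", "G06N", (0.1, 0.0, 0.9)),
]
QUERY = (0.85, 0.85, 0.15)  # assumed vector for the anchor concept


def classification_search(corpus, cpc):
    """Checklist-style search: everything in one CPC subclass."""
    return [doc_id for doc_id, code, _ in corpus if code == cpc]


def semantic_search(corpus, query, min_similarity):
    """Concept-style search: everything close to the query, any class."""
    return [doc_id for doc_id, _, vec in corpus
            if cosine(query, vec) >= min_similarity]


by_class = classification_search(CORPUS, "G06N")
by_meaning = semantic_search(CORPUS, QUERY, 0.95)
missed = sorted(set(by_meaning) - set(by_class))
print(missed)
```

The classification query returns an irrelevant G06N document and omits the two conceptually close documents filed under H04W and H03H, which is the pattern the checklist cannot see.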

Synonym dictionaries and embedding-based query expansion are partial responses, but they expand vocabulary within a domain rather than crossing conceptual boundaries between domains. They cannot encode the analogous-art reasoning that KSR demands. They cannot persist the analyst's growing understanding of which adjacent fields a person of ordinary skill would have consulted. They cannot record why a particular semantic neighborhood was traversed in support of an obviousness argument and why another was excluded. A PTAB panel asked to evaluate the methodology cannot reconstruct the reasoning, because the reasoning was never an artifact of the system; it lived in the analyst's head and evaporated when the search ended.

The deeper failure is that procedural compliance is a record of activity rather than a record of coverage. The duty of candor under 37 C.F.R. § 1.56 attaches to material references known to the individuals involved in prosecution, and opinions built on a search are only as sound as what a reasonably diligent search would have found. A checklist proves the analyst did things; it does not prove the analyst covered the semantic surface that the statute and the case law require.

What the AQ Primitive Provides

Adaptive Query's semantic discovery primitive replaces the checklist with a persistent governed traversal object. The object carries the technical concept under investigation as a structured semantic anchor rather than a keyword string. It carries the accumulated set of retrieved references and the relationships among them. It carries the trust scope — granted versus published, jurisdiction set, document type filters, family-aware deduplication — as explicit governance state. And it carries the lineage of every traversal step, including the semantic distance between the anchor concept and each retrieved document and the rationale by which the traversal stepped from one neighborhood to the next.
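
As a rough picture of what such an object might hold, the sketch below models the four kinds of state named above. The class names, field names, and the `record` method are assumptions made for illustration; the source does not specify a schema.

```python
from dataclasses import dataclass, field


@dataclass
class LineageStep:
    step: int
    from_neighborhood: str
    to_neighborhood: str
    retrieved: str            # identifier of the retrieved document
    semantic_distance: float  # distance from the anchor concept
    rationale: str            # why the traversal took this step


@dataclass
class DiscoveryObject:
    anchor: dict                                     # structured concept, not a keyword string
    trust_scope: dict                                # explicit governance state
    references: dict = field(default_factory=dict)   # doc id -> metadata
    relations: list = field(default_factory=list)    # (doc_a, relation, doc_b)
    lineage: list = field(default_factory=list)      # ordered LineageStep entries

    def record(self, from_nb, to_nb, ref_id, distance, rationale):
        """Append a lineage step and register the reference if it is new."""
        step = LineageStep(len(self.lineage) + 1, from_nb, to_nb,
                           ref_id, distance, rationale)
        self.lineage.append(step)
        self.references.setdefault(ref_id, {"first_seen_step": step.step})
        return step


obj = DiscoveryObject(
    anchor={"concept": "learned radio resource allocation",
            "facets": ["optimization", "scheduling"]},
    trust_scope={"status": "granted", "jurisdictions": ["US", "EP", "JP"]},
)
obj.record("G06N", "H03H", "REF-EXAMPLE-1", 0.31,
           "adaptive filtering solves an equivalent optimization problem")
obj.record("H03H", "G05B", "REF-EXAMPLE-2", 0.44,
           "control-theoretic formulation of the same objective")
print(len(obj.lineage), sorted(obj.references))
```

Keeping the rationale as a field on each step, rather than in an analyst's notes, is what lets the reasoning outlive the search session.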

Traversal proceeds through semantic neighborhoods in a unified embedding space across USPTO, EPO, JPO, and WIPO corpora simultaneously. A traversal that begins with a machine-learning resource allocation concept can step into adaptive filtering in H03H, into control-theoretic optimization in G05B, and into operations research filings classified outside G06 entirely, when the conceptual distance warrants the step. Each step is governed: the discovery object's persistent state ensures that traversal does not drift beyond the analyst's defined investigative scope, and trust scoping ensures that a freedom-to-operate analysis does not silently incorporate unexamined utility models when granted patents were the operational requirement.
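
A governed step can be sketched as a best-first expansion with a distance budget. The neighborhood graph, its edge distances, and the 0.6 budget below are hypothetical values chosen for illustration, and the cumulative-distance rule is one plausible governance policy rather than the documented one.

```python
import heapq

# Hypothetical neighborhood graph; edge weights are assumed conceptual distances.
GRAPH = {
    "G06N ml-resource-allocation": [("H04W radio-scheduling", 0.15),
                                    ("H03H adaptive-filtering", 0.30)],
    "H04W radio-scheduling": [("G05B control-optimization", 0.25)],
    "H03H adaptive-filtering": [("G05B control-optimization", 0.20)],
    "G05B control-optimization": [("A01B soil-working", 0.90)],  # deliberately unrelated
}


def governed_traversal(graph, start, max_distance):
    """Visit neighborhoods nearest-first; refuse steps beyond the budget."""
    best = {start: 0.0}
    frontier = [(0.0, start, "")]
    visited, lineage, refused = set(), [], []
    while frontier:
        dist, node, parent = heapq.heappop(frontier)
        if node in visited:
            continue
        visited.add(node)
        lineage.append({"from": parent or None, "to": node,
                        "distance": round(dist, 2)})
        for nxt, weight in graph.get(node, []):
            total = dist + weight
            if total > max_distance:
                # Governance: the step is out of scope, and the refusal is recorded.
                refused.append({"from": node, "to": nxt,
                                "distance": round(total, 2)})
            elif total < best.get(nxt, float("inf")):
                best[nxt] = total
                heapq.heappush(frontier, (total, nxt, node))
    return lineage, refused


lineage, refused = governed_traversal(GRAPH, "G06N ml-resource-allocation", 0.6)
for step in lineage:
    print(step)
print("refused:", refused)
```

The refused list matters as much as the visited list: it is the record of why a neighborhood was excluded, which is the question a reviewing panel is likely to ask.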

Cross-lingual semantic alignment lets the same anchor concept retrieve relevant filings whether they are drafted in English, Japanese, German, or Chinese, addressing the § 102 worldwide-publication scope that English-only keyword search structurally cannot reach. The discovery object's lineage records the cross-lingual retrievals explicitly so that translation provenance can be audited and FRE 901 authentication can be supported.

Because the discovery object persists, multi-week landscape analyses accumulate state rather than re-searching established terrain. An analyst returning to a freedom-to-operate clearance after a deposition resumes from the accumulated traversal frontier rather than reconstructing prior queries from memory. When opposing counsel demands the search methodology in discovery, the lineage produces a deterministic, reproducible record of every step.
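
Persistence with a verifiable record can be approximated with canonical serialization plus a digest. The sketch below assumes the traversal state is JSON-serializable; the `snapshot` and `resume` helpers and the state layout are hypothetical, shown only to illustrate how a resumed session could prove it starts from the recorded state.

```python
import hashlib
import json


def snapshot(state):
    """Serialize state canonically and return (payload, sha256 digest)."""
    payload = json.dumps(state, sort_keys=True, separators=(",", ":"),
                         ensure_ascii=False)
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return payload, digest


def resume(payload, expected_digest):
    """Reload state only if the payload still matches its recorded digest."""
    actual = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    if actual != expected_digest:
        raise ValueError("snapshot does not match recorded digest")
    return json.loads(payload)


state = {
    "anchor": "learned radio resource allocation",
    "frontier": ["G05B control-optimization"],
    "lineage": [{"step": 1, "to": "H03H adaptive-filtering", "distance": 0.3}],
}
payload, digest = snapshot(state)
restored = resume(payload, digest)
print(restored == state)
```

Sorting keys and fixing separators makes the serialized form stable, so the same state always yields the same digest and any later alteration of the stored record is detectable.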

Compliance Mapping

The semantic discovery primitive maps directly to the regulatory surface. The duty of candor under 37 C.F.R. § 1.56 is supported by the lineage record, which demonstrates that traversal extended to the semantic neighborhoods a reasonably diligent practitioner would have consulted. MPEP § 904 thoroughness is supported by cross-classification and cross-jurisdiction traversal, with explicit evidence of which CPC, IPC, and national classifications were reached through semantic adjacency rather than guessed in advance. The 35 U.S.C. § 102 worldwide-publication standard is supported by cross-lingual retrieval against EPO, JPO, and WIPO PATENTSCOPE corpora with translation lineage preserved. The 35 U.S.C. § 103 analogous-art analysis under KSR is supported by recorded semantic-distance evidence showing why adjacent fields were considered.

For PTAB IPR petitions under 35 U.S.C. § 311, the lineage supports the particularity that 37 C.F.R. § 42.104(b) requires when a petition identifies each ground of unpatentability and the evidence relied on. For Federal Rules of Evidence admissibility, the deterministic traversal record supports FRE 702 reliability and FRE 901 authentication. For European validity proceedings under the EPC and for Supplementary Protection Certificate analyses, the explicit jurisdiction scoping and family-aware deduplication preserve the legal distinctions among national rights.

Adoption Pathway

An IP department adopts semantic discovery in three phases. The first phase is parallel operation: existing Patent Public Search and Espacenet workflows continue, and the semantic discovery primitive runs alongside, producing supplementary lineage reports for a subset of high-stakes matters such as freedom-to-operate clearances for product launches and validity opinions supporting licensing negotiations. Comparing the semantic-discovery yield against the classification-bounded baseline calibrates analyst trust and surfaces the cross-classification gaps that the legacy workflow was missing.
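
The yield comparison in the first phase amounts to a set difference grouped by classification. The sketch below uses invented reference identifiers and a hypothetical `yield_gap` helper; it assumes each semantic hit carries the CPC subclass under which it was found.

```python
from collections import Counter


def yield_gap(baseline_ids, semantic_hits):
    """Compare a baseline result set with semantic hits (doc id -> CPC subclass)."""
    missed = {doc_id: cpc for doc_id, cpc in semantic_hits.items()
              if doc_id not in baseline_ids}
    return {
        "overlap": len(semantic_hits) - len(missed),
        "missed_by_baseline": sorted(missed),
        "missed_by_class": dict(Counter(missed.values())),
        "only_in_baseline": sorted(set(baseline_ids) - set(semantic_hits)),
    }


baseline = {"REF-1", "REF-2"}                       # classification-bounded results
semantic = {"REF-1": "G06N", "REF-3": "H04W",
            "REF-4": "H03H", "REF-5": "H04W"}       # semantic-discovery results
report = yield_gap(baseline, semantic)
print(report)
```

Reporting what only the baseline found, alongside what it missed, keeps the calibration honest in both directions during parallel operation.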

The second phase is integration: the persistent discovery object becomes the canonical artifact of a landscape analysis, with classification queries and keyword expansions executed inside the object's governance scope rather than as standalone activities. Opinion letters cite the lineage record. Information disclosure statements are populated from the discovery object's retrieved-reference set. Litigation holds attach to the object rather than to a folder of search printouts.

The third phase is portfolio governance: discovery objects accumulate across the firm or department, building a semantic map of the technology areas under active prosecution and licensing. New matters inherit relevant traversal state from prior matters, reducing redundant search effort and ensuring that institutional knowledge about adjacent fields persists across analyst turnover. The resulting landscape capability is auditable, reproducible, and defensible: a platform on which patentability opinions, freedom-to-operate clearances, and IPR petitions rest on evidence rather than on procedural ritual.

Throughout adoption, the discovery primitive integrates with existing tooling rather than replacing it. Patent Public Search, Espacenet, J-PlatPat, PATENTSCOPE, and commercial platforms such as Derwent and PatBase remain accessible inside the governed traversal, with their retrievals captured in lineage. FRE 901 authentication of foreign-language references is supported by recorded translation provenance, and FRE 1006 summary admissibility is supported by the deterministic landscape report the discovery object can render on demand. For Supplementary Protection Certificate analyses under EU Regulation 469/2009, the jurisdiction-aware traversal preserves the national-right granularity that SPC determinations require. The cumulative effect is a search posture that scales with semantic complexity rather than collapsing under it, and that produces evidence aligned with the legal standards under which patentability and validity are ultimately tested.

Invented by Nick Clark. Founding investors: Anonymous, Devin Wilkie.