Constellation Signature: Geometry-Invariant Matching Across Crop, Scale, and Occlusion

Nick Clark

Constellation Signature: Geometry-Invariant Matching Across Crop, Scale, and Occlusion

by Nick Clark | Published March 27, 2026 | PDF

A single anchor point on a piece of content is a brittle identity: it can be removed, occluded, or contradicted by an attacker who controls a sufficient region of the medium. The constellation signature replaces single-point anchoring with a joint signature over a geometrically distributed set of saliency hotspots, each carrying a micro-constellation descriptor of its local neighborhood. The joint signature binds the hotspots into a single identity whose verification requires recovering a configurable threshold of constituent anchors. This construction yields content identity that survives cropping, scale change, and partial occlusion, and that supports threshold revocation in which the compromise or removal of a bounded subset of anchors does not invalidate the constellation as a whole.

Mechanism

The mechanism is constructed in three stages: hotspot detection, descriptor extraction, and constellation binding. Hotspot detection identifies a set of geometrically distinguished points within the content using a saliency operator that responds to structural variance rather than perceptual salience alone. The operator is selected to produce repeatable results under affine transformations, modest noise, and bounded compression artifacts. The output is a set of hotspot locations together with characteristic scales and orientations, ordered by descending saliency response.

Descriptor extraction computes, for each hotspot, a micro-constellation descriptor summarizing the local neighborhood at the hotspot's characteristic scale. The descriptor is invariant or quasi-invariant to rotation and scale within bounded ranges and is robust to bounded photometric distortion. The disclosed embodiment uses a binary descriptor with sufficient variance to render accidental collision negligible across realistic content populations, but other descriptor families (gradient histograms, learned embeddings, spectral fingerprints) are admissible alternatives without changing the architecture.

Constellation binding produces the joint signature by combining the individual descriptors through a threshold-tolerant cryptographic scheme. The scheme is constructed so that any sufficiently large subset of the original anchors, when reassembled with their descriptors and approximate geometric relationships, reconstructs a verification token that matches the published signature. Subsets smaller than the threshold do not reconstruct, and arbitrary anchors not drawn from the original set cannot be substituted without detection. The scheme accommodates bounded geometric distortion of the inter-anchor relationships, recognizing that crops and rescales perturb but do not destroy the constellation's relative geometry.

Verification proceeds by detecting hotspots in the candidate content using the same saliency operator, extracting descriptors, and attempting to match the candidate descriptors to the published constellation. If a sufficient subset matches within the configured geometric and descriptor tolerances, the candidate is recognized as bearing the constellation signature. The verification outcome, the matched subset, the geometric transformation reconstructed during matching, and the residual error are recorded in an append-only verification lineage that supports retrospective audit.

Threshold revocation is supported directly by the binding scheme. An authority empowered to revoke individual anchors publishes a revocation entry naming specific hotspots within a constellation. Verification consults the revocation list and excludes revoked anchors from the matching set. Provided the remaining unrevoked anchors meet the threshold, the constellation continues to verify; if revocations reduce the available anchor count below the threshold, the constellation is considered revoked in its entirety. This permits graceful response to localized compromise without requiring reissuance of the entire signature.

Operating Parameters

The constellation size, denoted N, governs both robustness and verification cost. Larger constellations tolerate more occlusion and revocation but increase signature size and detection workload. Typical embodiments use N in the range of sixteen to two hundred fifty-six anchors per content item, scaled to content complexity and expected adversarial pressure. The verification threshold, denoted k, is selected such that k of N anchors suffice to reconstruct the joint signature. Practical deployments use k between one-third and two-thirds of N depending on the expected occlusion fraction and the desired revocation headroom.

Saliency operator parameters control hotspot density, scale range, and detection sensitivity. Hotspot density is constrained to a target value per unit content extent, with non-maximum suppression preventing clustering that would reduce geometric diversity. The scale range is configured to span the anticipated rescaling envelope; broader ranges increase invariance but reduce repeatability. Detection sensitivity is tuned against a reference corpus to balance false-positive hotspots against missed true hotspots, with the operating point selected to maximize verification recall at the cost of bounded extraction work.

Descriptor parameters include dimensionality, quantization, and matching tolerance. The dimensionality is chosen to render accidental collision probabilistically negligible while keeping descriptor storage bounded; typical embodiments use 128 to 512 bits per descriptor. Quantization converts continuous descriptor values to discrete codes amenable to fast matching and exact equality comparison after reconstruction. Matching tolerance defines the descriptor-space distance under which two descriptors are treated as identical; the tolerance is set against measured intra-class variation under the bounded distortions that the system is designed to survive.

Geometric verification tolerances govern how much the inter-anchor relationships may deviate from the published constellation while still verifying. Tolerances are expressed as residual thresholds on a fitted similarity, affine, or projective transformation, depending on the content modality. Tighter tolerances reduce false positives at the cost of robustness to geometric distortion; looser tolerances expand robustness at the cost of admitting near-collisions. Operators select tolerance profiles per deployment, with the selected profile recorded in the verification lineage so that audit can reproduce the verification decision exactly.

Alternative Embodiments

Hotspot detection may be embodied through classical structural-variance operators, learned saliency networks producing repeatable keypoints, or hybrid operators combining handcrafted geometric criteria with learned re-ranking. Classical operators offer reproducibility and analytic tractability. Learned operators offer adaptability to content-specific structures at the cost of training-data dependence and reproducibility constraints. Hybrid operators are preferred where the deployment requires both robustness across diverse content and analytic auditability of the detection procedure.

Descriptor families admit broad variation. Binary descriptors derived from local intensity comparisons offer compactness and fast matching. Gradient-histogram descriptors offer continuous-valued discrimination at higher dimensionality. Learned embedding descriptors offer task-tuned discrimination but introduce model dependence. Spectral fingerprints offer invariance to a broader class of distortions for specific content modalities. The constellation binding scheme is agnostic to descriptor family provided the descriptors meet the variance and tolerance requirements imposed by the binding procedure.

Constellation binding may be embodied through threshold secret sharing, threshold signature schemes, accumulator-based set commitments tolerant to bounded membership perturbation, or tree-structured commitments admitting subset proofs. Threshold secret sharing offers strong information-theoretic guarantees. Threshold signatures offer publicly verifiable joint signatures usable by parties without access to the original content. Accumulator commitments offer compact proofs at the cost of cryptographic assumptions. Tree-structured commitments offer logarithmic proof sizes and admit incremental anchor revocation without rebinding the entire constellation.

Threshold revocation may be embodied through publicly published revocation lists indexed by constellation identifier and anchor index, through revocation accumulators admitting compact non-membership proofs, or through epoch-based reissuance in which the constellation is periodically rebound with the current set of unrevoked anchors. Public lists are simplest. Revocation accumulators minimize verifier-side storage. Epoch reissuance maintains compact constellations at the cost of periodic active management by the issuing authority.

Composition With Adjacent Mechanisms

The constellation signature composes with content lineage tracking, provenance attestation, and structural similarity search to produce a coherent content anchoring layer. Lineage tracking records each verification event together with the matched anchor subset, supporting retrospective analysis of how content propagated through derivative works. Provenance attestation binds the constellation signature to issuer identity through a separate signature over the constellation digest, providing an audit trail from content back to authoring authority. Structural similarity search uses the same hotspot and descriptor extraction primitives to locate content related to a query without requiring exact constellation match.

The mechanism also composes with watermarking and metadata-based identification as complementary rather than competing primitives. Watermarks survive transformations that disrupt structural saliency; constellations survive transformations that strip embedded metadata. Used together, they provide overlapping identity coverage with different failure modes. The disclosed system's append-only verification lineage is agnostic to which primitive produced a given identification, supporting unified audit across heterogeneous identity sources.

Composition with derivative-content tracking extends the constellation primitive into multi-generational provenance graphs. When a derivative is published, the issuer can compute a new constellation over the derivative and bind it to the parent constellation through a cryptographic linkage, producing a directed acyclic provenance graph in which each node carries its own threshold-tolerant identity and each edge documents the derivation relationship. Verification of a leaf node optionally walks the graph back toward authoritative roots, supporting end-to-end provenance attestation across content remixing, format migration, and editorial revision.

The mechanism additionally composes with consent and licensing layers. A constellation signature may be bound to a license descriptor specifying the terms under which the content may be used. Verification produces not only an identity assertion but a license retrieval, allowing downstream consumers to resolve licensing terms from the structural identity itself rather than from separately-maintained registries. Threshold revocation interacts naturally with license revocation: when the licensing authority withdraws permission, the corresponding anchors may be revoked, gracefully degrading the constellation's verification reach without requiring removal of the underlying content.

Distinction From Prior Art

Conventional content identification approaches based on perceptual hashing produce a single fixed-size digest of the content. Although robust to bounded distortion, single-digest schemes do not support partial verification: either the digest matches within tolerance or it does not. The constellation approach replaces single-digest matching with threshold subset matching, supporting verification of cropped, occluded, or partially modified content that single-digest schemes would reject. This is a structural rather than incremental advance.

Local feature matching approaches in computer vision, including SIFT, SURF, ORB, and learned local descriptors, support partial matching but typically do not bind the matched feature set into a cryptographically verifiable joint identity. They function as similarity primitives rather than as identity primitives. The disclosed mechanism differs by combining geometrically distributed hotspots with a threshold-tolerant cryptographic binding, producing a verifiable identity from the matching procedure rather than merely a similarity score.

Threshold cryptography literature offers the binding primitives but does not specify their application to geometrically distributed content anchors with bounded geometric distortion tolerance, nor the integration with structural-variance saliency detection and threshold revocation that the disclosed system provides. The combination of geometric saliency, descriptor extraction, threshold-tolerant binding, geometric verification tolerance, and threshold revocation as a single coordinated content identity primitive is novel and is not anticipated by the surveyed prior art.

Disclosure Scope

The disclosure encompasses hotspot detection by structural-variance saliency, micro-constellation descriptor extraction, threshold-tolerant cryptographic binding of descriptor sets into a joint signature, geometry-tolerant verification through threshold subset matching, threshold revocation of individual anchors, append-only verification lineage, and the alternative embodiments of detector, descriptor, binding, and revocation described above. The disclosure further encompasses the composition of constellation signatures with lineage tracking, provenance attestation, structural similarity search, and complementary identity primitives such as watermarking as integral architectural properties of the content anchoring layer.

The scope expressly excludes embodiments in which content identity is established by a single anchor whose removal invalidates identity, in which verification consults uncommitted or externally mutable state, or in which revocation requires reissuance of the entire signature for any single anchor compromise. These exclusions are structural. They are what render the disclosed mechanism a content identity primitive supporting graceful degradation under adversarial pressure rather than a brittle single-point identifier vulnerable to localized attack.