Nov 2025 · Pebblous Data Communication Team


1. Patent Overview: Securing Rights

1.1. Introduction

This section analyzes the strategic importance of US Patent 12,481,720 B2, secured by Pebblous. The patent is the first line of legal defense for the company's core technology assets and the legal foundation for building a technological moat that competitors cannot easily replicate.

1.2. Patent Basic Information

Item | Details
--- | ---
Patent Title | COMPUTING DEVICE THAT PERFORMS A METHOD FOR DIAGNOSING PROPERTIES OF DATA AND A SYSTEM COMPRISING THE COMPUTING DEVICE
Patent Number | US 12,481,720 B2
Issue Date | November 25, 2025
Application Number | 18/511,617
Filing Date | November 16, 2023
Inventors | Joo Haeng Lee, Jeong Won Lee
Assignee | PEBBLOUS INC., Daejeon (KR)
Patent Term | Until November 19, 2042 (including 82-day term extension)

1.3. Scope of Granted Rights

The Director of the United States Patent and Trademark Office has granted Pebblous the right to exclude others from making, using, offering for sale, or selling the invention within the United States.

Business Implications

  • Market Exclusivity: Without Pebblous's authorization, competitors cannot commercially implement the "method for diagnosing and improving data properties" claimed in the patent, or offer products or services that incorporate it.
  • Competitive Advantage: By protecting the technological foundation of DataClinic, Pebblous's core solution, this right keeps competitors from entering the market with products that practice the claimed method.
  • Value Creation Foundation: The right gives Pebblous a legal basis for monetizing its intellectual property through business models such as licensing agreements and technology partnerships.

In conclusion, US Patent 12,481,720 B2 is more than a technical document: it is a long-term legal shield protecting Pebblous's business activities until November 19, 2042.

2. Core Patent Technology Analysis: Diagnosis and Improvement Through 'Data Maps'

2.1. Introduction

This section analyzes the operating principles of the core technology protected by the patent. The technology presents an original methodology for visually diagnosing and systematically improving the otherwise invisible characteristics of data, addressing the biggest bottleneck in AI development: data quality.

2.2. Step 1: Diagnosis - Data Imaging

The first step of the patented technology is transforming AI training data into a diagnosable form through 'Data Imaging'. AI training data is converted into individual 'Data Points' and placed in a high-dimensional virtual space called the 'Embedding Space'.

The essence of this process is mapping the abstract 'semantic similarity' of data (e.g., images of similar bird species) to geometric 'proximity' within the embedding space: items that mean similar things end up close together. As a result, the embedding space serves as a kind of 'Data Map' that allows the overall structure and relationships of the data to be grasped at a glance.
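The mapping from semantic similarity to proximity can be illustrated with a minimal Python sketch. The three-dimensional vectors and bird labels below are hypothetical stand-ins; real embeddings would come from a trained encoder, which this report does not specify, and the function names are illustrative rather than anything defined by the patent.

```python
import numpy as np

def cosine_similarity_matrix(points: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of `points`."""
    unit = points / np.linalg.norm(points, axis=1, keepdims=True)
    return unit @ unit.T

def nearest_neighbor(points: np.ndarray, index: int) -> int:
    """Index of the point closest (Euclidean) to points[index], excluding itself."""
    dists = np.linalg.norm(points - points[index], axis=1)
    dists[index] = np.inf  # a point is not its own neighbor
    return int(np.argmin(dists))

# Hypothetical embeddings: two images of the same species and one of another.
labels = ["sparrow A", "sparrow B", "eagle"]
embeddings = np.array([
    [0.90, 0.10, 0.00],  # sparrow A
    [0.85, 0.15, 0.05],  # sparrow B
    [0.10, 0.20, 0.95],  # eagle
])

similarity = cosine_similarity_matrix(embeddings)
print("closest to sparrow A:", labels[nearest_neighbor(embeddings, 0)])
print("similarity A-B vs A-eagle:", similarity[0, 1], similarity[0, 2])
```

In this toy map the two sparrow images sit next to each other while the eagle lies far away, which is the property that lets clusters and gaps be read off the map.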

2.3. Step 2: Improvement - Distribution Adjustment and Synthetic Data Generation

The second step resolves the distribution (property) problems identified in the 'Data Map' by improving the data distribution: the patented technology adjusts the positions of data points or adds new ones, producing a 'Modified First Data Point Set'.

When low-density gaps (empty areas where data is insufficient) are found in the diagnosed 'Data Map,' the patented technology generates and adds 'new data points that did not exist in the original dataset' at those locations. This is the core principle of the 'Synthetic Data' generation technology that reinforces the weak areas of a dataset, and it falls within the exclusive rights protected by this patent.
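The gap-filling idea can be sketched as follows. This is only an illustration under simplifying assumptions, not the patented procedure: the map is two-dimensional, a "gap" is taken to be an empty cell of a regular grid over the data's bounding box, and the new point is placed at the cell center. The names `find_gap_centers` and `fill_gaps` are hypothetical, and turning a new embedding position into an actual sample (for example with a generative model) is outside the sketch.

```python
import numpy as np

def find_gap_centers(points: np.ndarray, bins: int = 2) -> np.ndarray:
    """Centers of empty grid cells inside the bounding box of 2-D `points`."""
    counts, x_edges, y_edges = np.histogram2d(points[:, 0], points[:, 1], bins=bins)
    x_centers = (x_edges[:-1] + x_edges[1:]) / 2
    y_centers = (y_edges[:-1] + y_edges[1:]) / 2
    empty_cells = np.argwhere(counts == 0)  # (i, j) indices of cells with no data
    centers = [[x_centers[i], y_centers[j]] for i, j in empty_cells]
    return np.array(centers).reshape(-1, 2)

def fill_gaps(points: np.ndarray, bins: int = 2) -> np.ndarray:
    """Original points plus one new point per empty cell (the modified set)."""
    return np.vstack([points, find_gap_centers(points, bins)])

# Toy data map: three corners of the unit square are populated, one is empty.
original = np.array([
    [0.0, 0.0], [0.1, 0.2],
    [1.0, 0.0], [0.9, 0.1],
    [0.0, 1.0], [0.2, 0.9],
])
gaps = find_gap_centers(original)
modified = fill_gaps(original)
print("gap centers:", gaps)
print("points before/after:", len(original), len(modified))
```

A grid is the simplest way to make "empty area" concrete; in a high-dimensional embedding space a nearest-neighbor density estimate would be a more realistic choice.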

2.4. Step 3: Reporting - Visualization and Diagnostic Reports

Finally, this step clearly communicates the diagnosis and improvement results to users. The patent specifies methods for visualizing the original data map as an 'Image of Data (IOD)' and the improved data map as a 'Modified Image of Data (MIOD)'.
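One way to picture the IOD and MIOD pair is to project both the original and the modified point sets onto the same two axes so they can be drawn side by side. The sketch below uses a plain PCA projection as a stand-in; the report does not state which projection the patent uses, and the random data and function names are assumptions for illustration only.

```python
import numpy as np

def fit_projection(points: np.ndarray):
    """Fit a 2-D PCA projection: returns the mean and the top two components."""
    mean = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - mean, full_matrices=False)
    return mean, vt[:2]

def project(points: np.ndarray, mean: np.ndarray, components: np.ndarray) -> np.ndarray:
    """Map high-dimensional points to 2-D coordinates using a fitted projection."""
    return (points - mean) @ components.T

rng = np.random.default_rng(0)
original = rng.normal(size=(50, 8))       # hypothetical 8-D embeddings
added = rng.normal(size=(5, 8))           # hypothetical newly generated points
modified = np.vstack([original, added])

# Fit on the original set only, so both images share one coordinate system.
mean, components = fit_projection(original)
iod_xy = project(original, mean, components)    # coordinates for the IOD
miod_xy = project(modified, mean, components)   # coordinates for the MIOD
print(iod_xy.shape, miod_xy.shape)
```

Fitting the projection once and reusing it keeps the unchanged points at identical positions in both images, so any visible difference comes from the modification itself.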

2.5. Conclusion and Transition

In summary, this patented technology legally protects an integrated methodology that allows data problems to be visually identified (diagnosis), quantitatively analyzed (density/distribution measurement), and scientifically improved (distribution adjustment and synthetic data generation). This is an innovation that elevates data quality management from subjective experience to the domain of objective engineering.

3. Integration Analysis with Pebblous Products and Vision

3.1. Introduction

This section connects the patent technology analyzed above to Pebblous's core products and long-term vision, showing how it functions as their concrete technology engine and thereby demonstrating the patent's practical business value. The patent is not an isolated technology; it is the core DNA that runs through all of Pebblous's solutions.

3.2. Patent Technology and Core Solution Mapping

Pebblous Element | Patent Correspondence | Detailed Description
--- | --- | ---
DataClinic | Core Engine & Methodology | The patent title itself specifies a "method for diagnosing properties of data," protecting the core technology that DataClinic uses to comprehensively diagnose and calibrate the quality of AI training data.
PebbloScope | Data Visualization & Communication | The 'Data Imaging (IOD/MIOD)' technology defined by the patent is the direct technological foundation of PebbloScope. PebbloScope's ability to transform high-dimensional data into 3D space for visual exploration of distributional characteristics is a commercialization of the patent's visualization and reporting methods.
Data Diet | Duplicate & Similar Data Removal | This directly corresponds to the patent's 'distribution property adjustment' and 'out-of-boundary data point deletion' functionality. By covering the identification and removal of over-dense clusters, the patent provides the technical and legal basis for Data Diet.
Data Bulk-up | Precision-Targeted Synthetic Data Generation | This is the commercial implementation of the patent's core claim: the mechanism for 'adding new data points that did not exist in the original.' This gives Pebblous exclusive rights to the claimed mechanism for precision-targeted synthetic data generation.
AADS (Agentic AI Data Scientist) | Core Tool for Autonomous Operation Systems | AADS aims for a fully autonomous data operations framework. This patented technology serves as the core 'tool' or 'function' that the AADS autonomous agent uses to independently diagnose and improve data quality issues (density, distribution imbalances, etc.).
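The Data Diet row above describes thinning over-dense clusters. A minimal sketch of that idea is a greedy filter that keeps a point only if it is not too close to any point already kept. This is an illustrative stand-in, not the product's algorithm; the function name `thin_dense_clusters` and the threshold `min_dist` are hypothetical.

```python
import numpy as np

def thin_dense_clusters(points: np.ndarray, min_dist: float) -> list[int]:
    """Greedy thinning: return indices of points that are at least
    `min_dist` away from every previously kept point."""
    kept: list[int] = []
    for i, point in enumerate(points):
        if all(np.linalg.norm(point - points[j]) >= min_dist for j in kept):
            kept.append(i)
    return kept

# Toy embeddings: two tight clusters of near-duplicates and one isolated point.
embeddings = np.array([
    [0.00, 0.00], [0.01, 0.00], [0.00, 0.02],  # near-duplicates
    [1.00, 1.00], [1.01, 1.00],                # near-duplicates
    [3.00, 3.00],                              # isolated sample
])
kept_indices = thin_dense_clusters(embeddings, min_dist=0.1)
print("kept:", kept_indices, "removed:", len(embeddings) - len(kept_indices))
```

Each tight cluster collapses to a single representative while the isolated sample survives, which is the storage and compute saving the table attributes to Data Diet.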

3.3. Conclusion and Transition

As demonstrated, the US 12,481,720 B2 patent is a core asset that not only secures the technical justification of Pebblous's current product portfolio but also encompasses the foundation of its future vision, AADS. It serves as a powerful shield protecting Pebblous's entire technology roadmap.

4. Technical Mapping with International Standard (ISO/IEC 5259-2)

4.1. Introduction

This section analyzes another strategic value of Pebblous's patented technology. ISO/IEC 5259-2, the international standard for AI data quality, provides a theoretical framework defining 'what' data quality characteristics AI systems should possess.

However, the standard does not give concrete technical answers for 'how' to quantitatively measure those abstract quality characteristics, especially for unstructured data. Pebblous's patented technology serves as a patent-protected technical implementation that supplies this 'missing link', turning the standard's theory into real-world engineering.

4.2. Detailed Mapping with ISO/IEC 5259-2 Additional Quality Characteristics

ISO Quality Characteristic | Role of Pebblous Patent Technology | Detailed Description
--- | --- | ---
Similarity (Sim-ML-1/3) | Density measurement in embedding space & intrinsic dimension calculation | Sim-ML-1 (Sample Similarity) measures duplicate/similar samples within a dataset. The patented technology objectively proves similarity violations by measuring density in embedding space after data imaging and visually identifying over-dense clusters.
Representativeness (Rep-ML-1) | Manifold shape analysis & bulk-up | Rep-ML-1 (Representativeness) evaluates how well a dataset reflects the real environment. The patented technology proves representativeness deficiency by analyzing manifold shapes to identify low-density areas (gaps) where data is insufficient.
Diversity (Div-ML-1/2/3) | Manifold size & distance-density measurement | Div-ML (Diversity) evaluates how many different scenarios the data covers. The patented technology provides the basis for quantitatively determining how much of the feature space the data covers through manifold size and distance-density measurements.
Efficiency (Eff-ML-3) | Density measurement-based Data Diet | Eff-ML-3 (Storage Waste Risk) measures data processing efficiency. The Data Diet prescription for duplicate/similar data (over-dense clusters) identified by the patent's density measurement technology reduces unnecessary storage and computing resource waste, improving GPU efficiency.
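To make the density and coverage measurements in the table concrete, the sketch below computes two simple quantities on a toy data map. Both are illustrative proxies chosen for this example and are assumptions, not the measures defined in ISO/IEC 5259-2 or the formulas in the patent: the mean distance to the k nearest neighbors (small values suggest over-dense, similar samples; large values suggest sparse regions) and the share of grid cells that contain data (a rough coverage indicator).

```python
import numpy as np

def knn_mean_distance(points: np.ndarray, k: int = 2) -> np.ndarray:
    """Mean Euclidean distance from each point to its k nearest neighbors."""
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    np.fill_diagonal(dists, np.inf)          # ignore self-distances
    nearest = np.sort(dists, axis=1)[:, :k]  # k smallest distances per point
    return nearest.mean(axis=1)

def coverage_ratio(points: np.ndarray, bins: int = 2) -> float:
    """Fraction of grid cells in the bounding box that hold at least one point."""
    counts, _ = np.histogramdd(points, bins=bins)
    return np.count_nonzero(counts) / counts.size

# Toy data map: a tight cluster of three samples plus one distant sample.
data_map = np.array([[0.0, 0.0], [0.01, 0.0], [0.0, 0.01], [1.0, 1.0]])
local_spacing = knn_mean_distance(data_map, k=2)
coverage = coverage_ratio(data_map, bins=2)
print("k-NN spacing per point:", local_spacing)
print("coverage ratio:", coverage)
```

The clustered points have very small spacing, the distant point has large spacing, and only half of the grid cells are occupied, which mirrors the similarity, representativeness, and diversity readings described in the table.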

4.3. Conclusion and Transition

In conclusion, Pebblous's patented technology transforms the abstract requirements of ISO standards into measurable, operational, and improvable engineering solutions. This is a key contribution that elevates data quality management from the realm of subjective judgment to the realm of objective science.

5. Comprehensive Value Analysis: Business Competitiveness Strategy

5.1. Introduction

As the conclusion of this report, this section comprehensively evaluates how the patent's technical content, product integration, and standards compliance analyzed earlier combine to build Pebblous's sustainable business competitiveness. The patent goes beyond simple technology protection to form a multi-layered, defensible value structure that solidifies Pebblous's market position.

5.2. Technological Value

Strong IP Protection

The exclusive rights guaranteed until 2042 form a powerful technological moat that fundamentally blocks competitor entry in the core technology domain of data quality diagnosis and improvement.

Integrated Data Quality Management

The patent provides a technological foundation for the entire end-to-end data quality management cycle, encompassing Diagnosis, Enhancement, and Generation.

Physical AI Market Leadership

The patent provides the foundation for processing various modalities, including images and text, conferring a critical technological advantage for capturing the high-value Physical AI market.

5.3. Business Value

Exclusive Competitive Advantage in Regulated Markets

Pebblous's patented technology provides an exclusive method for quantitatively measuring and improving abstract quality characteristics such as representativeness and similarity required by ISO/IEC 5259-2. The diagnostic reports generated by DataClinic help fulfill the technical prerequisites for the 'auditable evidence (audit trail)' required by the EU AI Act and ISO/IEC 42001.

Measurable Customer ROI

The patented technology underpins a clear financial return on investment (ROI) for customers. Quantifiable GPU and cloud cost savings through 'Data Diet' and visible AI model performance improvements through 'Data Bulk-up' make it easy for customers to demonstrate the economic viability of adopting Pebblous solutions.

6. Conclusion: The Cornerstone of an AI-Ready Data Platform

US Patent 12,481,720 B2 does more than protect a single technology: it is the cornerstone that supports all of Pebblous's business strategies, spanning data quality diagnosis, improvement, visualization, and autonomous operation.

This patent visualizes invisible data problems through the innovative method of 'Data Maps,' transforms abstract quality requirements of international standards into measurable engineering, and ultimately provides customers with auditable trust and measurable ROI.

Based on this powerful intellectual property, we are confident that Pebblous will build a technological moat that competitors cannot breach and lead the future data market as an 'AI-Ready Data Platform Provider.'

Frequently Asked Questions (FAQ)

Q1. What does Pebblous's US Patent 12,481,720 B2 protect?

This patent protects a method for diagnosing and improving the quality of AI training data. Specifically, it grants exclusive rights to an integrated methodology that maps data to a high-dimensional embedding space for visualization (data imaging), measures density and distribution to diagnose problems, and improves datasets through distribution adjustment and synthetic data generation. These rights are valid until November 19, 2042.

Q2. Which Pebblous products are connected to this patent?

This patented technology is the engine underpinning all of Pebblous's core product lines: DataClinic (data quality diagnosis and improvement), PebbloScope (data visualization), Data Diet (duplicate data removal), Data Bulk-up (precision-targeted synthetic data generation), and AADS (Agentic AI Data Scientist).

Q3. What is the relationship between Pebblous's patented technology and the ISO/IEC 5259-2 standard?

ISO/IEC 5259-2 defines AI data quality characteristics such as similarity, representativeness, diversity, and efficiency, but does not provide specific methods for 'how' to measure them. Pebblous's patented technology provides a practical, patent-protected implementation method for quantitatively measuring and improving the abstract quality characteristics required by this standard, using techniques such as embedding space density measurement and manifold shape analysis. This is a key contribution that transforms the standard's theory into actual engineering.

Q4. What does patent protection until 2042 mean for business?

The exclusive rights until 2042 mean that for approximately 17 years, competitors cannot provide data quality diagnosis and improvement solutions using the same methods. This provides Pebblous with a powerful legal foundation to build a long-term technological moat in the data quality management market, stably lead the market, and monetize its intellectual property through various business models including licensing and partnerships.
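The "approximately 17 years" figure can be checked with simple date arithmetic, using the issue date and term end date stated in this report. The sketch measures from the issue date, and the variable names are illustrative.

```python
from datetime import date

issue_date = date(2025, 11, 25)   # issue date stated in the report
expiry_date = date(2042, 11, 19)  # end of patent term stated in the report

remaining_days = (expiry_date - issue_date).days
remaining_years = remaining_days / 365.25  # average year length incl. leap years

print(f"{remaining_days} days, about {remaining_years:.1f} years")
```

Measured this way, the protected period is just short of 17 full years.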

Q5. How does Pebblous's patented technology differ from other synthetic data generation technologies?

General synthetic data generation (e.g., GANs, Diffusion Models) produces large volumes of data 'similar' to existing data without targeting specific regions of the distribution. In contrast, Pebblous's patented technology first visualizes the dataset's embedding space to accurately diagnose 'where' data is insufficient, then precisely targets and generates the necessary data points in those low-density areas. This is a precision medicine-like approach that provides an integrated 'diagnosis-improvement' cycle, offering fundamental differentiation in efficiency and effectiveness compared to untargeted generation.

Q6. Why is this patent important in the Physical AI market?

Physical AI (robots, autonomous driving, medical devices, etc.) operates in the real physical world, where data quality errors can lead directly to safety accidents. Regulations such as the EU AI Act require audit trail capabilities for data quality in such high-risk AI systems. Pebblous's patented technology provides a patent-protected technical solution for quantitatively measuring and proving the quality of multimodal unstructured data, including images and sensor data, according to the ISO/IEC 5259 standards, securing a decisive competitive advantage in the Physical AI market where regulatory compliance is mandatory.

