Tamr Secures US Patent for Enterprise-Scale Data Unification
Tamr has announced that it has been issued a patent by the United States Patent and Trademark Office covering the principles underlying its enterprise-scale data unification platform. The patent, titled “Method and System for Large Scale Data Curation,” describes a comprehensive approach to integrating a large number of data sources by normalizing, cleaning, integrating, and deduplicating them using machine learning techniques supplemented by human expertise.
Tamr’s patent describes several features and advantages implemented in the company’s software, including: the techniques used to obtain training data for the machine learning algorithms; a unified methodology for linking attributes and database records in a holistic fashion; multiple methods for pruning the large space of candidate matches so that the system scales to high data volumes; and novel ways to generate highly relevant questions for experts across all stages of the data curation lifecycle.
Other characteristics of the platform covered by the patent include scalability through automation, data cleaning, a non-programmer orientation, and incremental data integration and data curation.
Tamr’s co-founder and CTO, Mike Stonebraker, commented on the announcement: “Our goal was to build an end-to-end system for enterprise-scale data curation that leveraged modern machine learning techniques to radically reduce the time and cost of producing clean, unified data sets. Tamr’s growth has proven the commercial value of the many innovations in our software, and this patent now confirms the uniqueness of our invention.”