{"id":5522,"date":"2023-06-16T14:32:18","date_gmt":"2023-06-16T18:32:18","guid":{"rendered":"https:\/\/solutionsreview.com\/data-management\/?p=5522"},"modified":"2023-08-11T10:59:12","modified_gmt":"2023-08-11T14:59:12","slug":"the-smaller-data-era-how-fine-tuning-and-data-quality-are-defining-the-ai-arms-race","status":"publish","type":"post","link":"https:\/\/solutionsreview.com\/data-management\/the-smaller-data-era-how-fine-tuning-and-data-quality-are-defining-the-ai-arms-race\/","title":{"rendered":"The Small(er) Data Era: How Fine-Tuning and Data Quality are Defining the AI Arms Race"},"content":{"rendered":"<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-5535\" src=\"https:\/\/solutionsreview.com\/data-management\/files\/2023\/06\/Fine-Tuning-Data-Quality-AI.jpg\" alt=\"The Small(er) Data Era: How Fine-Tuning and Data Quality are Defining the AI Arms Race\" width=\"800\" height=\"400\" srcset=\"https:\/\/solutionsreview.com\/data-management\/files\/2023\/06\/Fine-Tuning-Data-Quality-AI.jpg 800w, https:\/\/solutionsreview.com\/data-management\/files\/2023\/06\/Fine-Tuning-Data-Quality-AI-300x150.jpg 300w, https:\/\/solutionsreview.com\/data-management\/files\/2023\/06\/Fine-Tuning-Data-Quality-AI-768x384.jpg 768w, https:\/\/solutionsreview.com\/data-management\/files\/2023\/06\/Fine-Tuning-Data-Quality-AI-600x300.jpg 600w, https:\/\/solutionsreview.com\/data-management\/files\/2023\/06\/Fine-Tuning-Data-Quality-AI-162x81.jpg 162w, https:\/\/solutionsreview.com\/data-management\/files\/2023\/06\/Fine-Tuning-Data-Quality-AI-360x180.jpg 360w\" sizes=\"(max-width: 800px) 100vw, 800px\" \/><\/p>\n<p style=\"text-align: justify;\"><i><strong>Solutions Review\u2019s Premium Content Series is a collection of contributed articles written by industry experts in enterprise software categories. 
In this feature, <a href=\"https:\/\/www.tolacapital.com\/\" target=\"_blank\" rel=\"noopener\">Tola Capital<\/a> Vice President Jake Nibley, Partner Akshay Bhushan, and Founder Sinan Ozdemir offer commentary on how fine-tuning and data quality are defining the AI arms race.<\/strong><\/i><\/p>\n<p dir=\"ltr\" style=\"text-align: justify;\"><span style=\"color: black;\">We\u2019re only a few months into 2023, and the artificial intelligence foundation model arms race is heating up. OpenAI introduced GPT-4 in March with new <a href=\"https:\/\/streaklinks.com\/BhQNwb_09Jvwp-b3-QIccKKb\/https%3A%2F%2Fwww.nytimes.com%2F2023%2F03%2F14%2Ftechnology%2Fopenai-new-gpt4.html\" target=\"_blank\" rel=\"noopener noreferrer\" data-auth=\"NotApplicable\" data-linkindex=\"0\">photo-to-text<\/a> capabilities that have the tech world buzzing, and Google (finally) soft-launched its LaMDA-powered Bard chatbot. The industry is changing so fast that tech leaders are now <a href=\"https:\/\/streaklinks.com\/BhQNwb_FhSXOaM8i3gChRk5Z\/https%3A%2F%2Fwww.bloomberg.com%2Fnews%2Farticles%2F2023-03-29%2Fai-leaders-urge-labs-to-stop-training-the-most-advanced-models%3Fsref%3Dk4UszphO%23xj4y7vzkg\" target=\"_blank\" rel=\"noopener noreferrer\" data-auth=\"NotApplicable\" data-linkindex=\"1\">calling for a six-month pause<\/a> on training AI models more powerful than GPT-4 out of safety concerns.<\/span><\/p>\n<p dir=\"ltr\" style=\"text-align: justify;\"><span style=\"color: black;\">With the media spotlight shining on these proprietary models trained on massive amounts of \u2018big\u2019 data, we\u2019re ignoring the equally valuable open-source foundation models with smaller parameter counts that are capable of delivering OpenAI-quality results for specific use cases. The biggest foundation models won\u2019t exclusively define this AI era. Just as much value can be derived from smaller, fine-tuned, open-source AI models. 
It\u2019s up to founders and practitioners to balance the benefits and risks and find the right mix of models for their business.<\/span><\/p>\n<div class=\"widget\"><div class=\"aside-card\">\t\t\t<div class=\"textwidget\"><a class=\"speedbump\" href=\"https:\/\/solutionsreview.com\/data-management\/data-management-data-warehouse-buyers-guide\/\" title=\"Download link to Data Management Buyers Guide\" target=\"_blank\" rel=\"noopener\"><img loading=\"lazy\" decoding=\"async\" class=\"alignleft size-full wp-image-1682\" src=\"https:\/\/solutionsreview.com\/data-management\/files\/2019\/01\/data-management-speedbump-cta.jpg\" alt=\"Download Link to Data Management Buyers Guide\" width=\"800\" height=\"225\" \/><\/a><\/div>\n\t\t<\/div><\/div>\n<h2><strong>Fine-Tuning Data Quality<\/strong><\/h2>\n<h3 dir=\"ltr\"><strong><span style=\"color: black;\">Model Use Case &gt; Size<\/span><\/strong><\/h3>\n<p dir=\"ltr\" style=\"text-align: justify;\"><span style=\"color: black;\">Speaking with developers, we repeatedly see open-source models outperform larger proprietary ones on discrete tasks. It&#8217;s not uncommon for us to hear from entrepreneurs that a well-tuned BERT or BLOOM model trained on their specific data outperforms the latest, largest model from OpenAI. The most famous public example is DeepMind\u2019s demonstration that a model with a lower parameter count (Chinchilla, at 70B parameters) <a href=\"https:\/\/streaklinks.com\/BhQNwb77u1Y4ZLs-sAfaW9-8\/https%3A%2F%2Fwww.deepmind.com%2Fpublications%2Fan-empirical-analysis-of-compute-optimal-large-language-model-training\" target=\"_blank\" rel=\"noopener noreferrer\" data-auth=\"NotApplicable\" data-linkindex=\"2\">could outperform<\/a> models more than double its size, such as Gopher at 280B parameters and GPT-3 at 175B parameters, on similar tasks. 
They found that the current large language models are far too large for their compute budget and are not being trained on enough data (or the right data) relative to their size.<\/span><\/p>\n<p dir=\"ltr\" style=\"text-align: justify;\">Let&#8217;s say we work on the engineering team at a SaaS company and want to create a bot that routes text-based customer chats to the right team based on the semantics of a customer question or complaint. Engineers could use a large model with billions of parameters, like GPT-3, to tag or route conversations to the right teams. Or, as we are increasingly seeing, they can use a much narrower open-source model trained exclusively on written transcripts from customer support calls or chats. This model would likely be a better fit for our use case, at a lower cost, because it was trained on contextually relevant data.<\/p>\n<p dir=\"ltr\" style=\"text-align: justify;\">Even if one model has incredible new state-of-the-art features (for now, it\u2019s OpenAI&#8217;s GPT-4), open-source alternatives are only six to nine months behind, offering similar functionality or even outperforming GPT-4 on specific tasks at a lower cost. Even in the last few weeks, we\u2019ve seen this happen with <a href=\"https:\/\/streaklinks.com\/BhQNwb7Gxphrh4ogPQbeTNdu\/https%3A%2F%2Fwww.marktechpost.com%2F2023%2F03%2F28%2Fdatabricks-open-sources-dolly-a-chatgpt-like-generative-ai-model-that-is-easier-and-faster-to-train%2F\" target=\"_blank\" rel=\"noopener noreferrer\" data-auth=\"NotApplicable\" data-linkindex=\"3\">Databricks\u2019 Dolly<\/a> LLM. Another example is <a href=\"https:\/\/streaklinks.com\/BhQNwb7LS0gC5oFp7QNl51hw\/https%3A%2F%2Fcrfm.stanford.edu%2F2023%2F03%2F13%2Falpaca.html\" target=\"_blank\" rel=\"noopener noreferrer\" data-auth=\"NotApplicable\" data-linkindex=\"4\">Alpaca<\/a>, a language model created by Stanford Ph.D. students at the Center for Research on Foundation Models (CRFM) in March 2023. 
Alpaca is a <a href=\"https:\/\/streaklinks.com\/BhQNwb7l5TqpV-hGXAQbRytB\/https%3A%2F%2Fai.facebook.com%2Fblog%2Flarge-language-model-llama-meta-ai%2F\" target=\"_blank\" rel=\"noopener noreferrer\" data-auth=\"NotApplicable\" data-linkindex=\"5\">LLaMA<\/a> 7B model fine-tuned with supervised learning on 52K instruction-following demonstrations generated in the style of <a href=\"https:\/\/streaklinks.com\/BhQNwb_eXpMNodpRvgZPafBB\/https%3A%2F%2Farxiv.org%2Fabs%2F2212.10560\" target=\"_blank\" rel=\"noopener noreferrer\" data-auth=\"NotApplicable\" data-linkindex=\"6\">self-instruct<\/a> using text-davinci-003.<\/p>\n<p dir=\"ltr\" style=\"text-align: justify;\">The data generation process cost the team less than $500 using the OpenAI API. In their initial run, fine-tuning the model took three hours on eight 80GB A100 GPUs, which costs less than $100 on most cloud providers, and produced performance similar to text-davinci-003. For less than $1,000 in total, the team created a language model that won one more comparison than text-davinci-003 in a blind pairwise evaluation (90 vs. 89). This shows us that it&#8217;s more than possible for open-source models to catch up quickly; it\u2019s inevitable.<\/p>\n<h3 dir=\"ltr\"><strong><span style=\"color: black;\">Less is More: Small(er) Data Compared to Big Data<\/span><\/strong><\/h3>\n<p dir=\"ltr\" style=\"text-align: justify;\"><span style=\"color: black;\">DeepMind\u2019s research and Stanford\u2019s Alpaca model remind us that what\u2019s really constraining the reasoning and output of these models is access to high-quality, curated training data. Founders and CTOs will likely balance cost, privacy, security, and performance alongside the usual headline improvements in accuracy and reasoning. 
There are benefits and risks to both kinds of models, and it\u2019s up to enterprises to weigh the costs and benefits for themselves.<\/span><\/p>\n<p dir=\"ltr\" style=\"text-align: justify;\"><span style=\"color: black;\">To understand the tradeoffs practitioners often evaluate, we summarized the pros and cons of open-source and proprietary models in the table below.<\/span><\/p>\n<div id=\"attachment_5523\" style=\"width: 685px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-5523\" class=\"wp-image-5523 size-full\" src=\"https:\/\/solutionsreview.com\/data-management\/files\/2023\/06\/oie_nEGRCXwk8Ma4.jpg\" alt=\"Open-Source and Proprietary Models\" width=\"675\" height=\"350\" srcset=\"https:\/\/solutionsreview.com\/data-management\/files\/2023\/06\/oie_nEGRCXwk8Ma4.jpg 675w, https:\/\/solutionsreview.com\/data-management\/files\/2023\/06\/oie_nEGRCXwk8Ma4-300x156.jpg 300w, https:\/\/solutionsreview.com\/data-management\/files\/2023\/06\/oie_nEGRCXwk8Ma4-579x300.jpg 579w, https:\/\/solutionsreview.com\/data-management\/files\/2023\/06\/oie_nEGRCXwk8Ma4-156x81.jpg 156w, https:\/\/solutionsreview.com\/data-management\/files\/2023\/06\/oie_nEGRCXwk8Ma4-347x180.jpg 347w\" sizes=\"(max-width: 675px) 100vw, 675px\" \/><p id=\"caption-attachment-5523\" class=\"wp-caption-text\">Open-Source and Proprietary Models<\/p><\/div>\n<p dir=\"ltr\" style=\"text-align: justify;\"><span style=\"color: black;\">Open-source models have many benefits. Because these models are self-hosted, you own the model, the data, and the entire ecosystem around it. 
They are also much less expensive to run because smaller models need fewer parameters, and therefore less compute, to get the desired results.<\/span><\/p>\n<p dir=\"ltr\" style=\"text-align: justify;\"><span style=\"color: black;\">At the same time, the risk of open-source foundation models is the amount of high-quality labeled data needed to fine-tune them, and getting the desired results may take time. Large proprietary models, like OpenAI\u2019s GPT family, can give you good-enough output with contextualized prompting, and the time to value is much shorter than with open-source models. Yet integrating these models, and paying for the computing power needed to run them, can be expensive. As these models get more complex, latency and costs increase.<\/span><\/p>\n<p dir=\"ltr\" style=\"text-align: justify;\"><span style=\"color: black;\">Finally (and potentially most importantly), a risk of proprietary models is that you\u2019re at the mercy of the foundation model provider\u2019s terms of use, which could lead to data privacy concerns.<\/span><\/p>\n<h3 dir=\"ltr\"><strong><span style=\"color: black;\">Foundation Model Innovation is Not Binary<\/span><\/strong><\/h3>\n<p dir=\"ltr\" style=\"text-align: justify;\"><span style=\"color: black;\">We\u2019re heading into a world where everyone has access to these models, and enterprises and individuals alike will get tons of value from them. It\u2019s up to enterprises to decide not only how they\u2019ll create the next industry-disrupting technology using proprietary or open-source models but also how they\u2019ll tweak them in a way that gives them better results for their specific use case. 
This kind of innovation isn\u2019t binary: open-source and proprietary models can work together.<\/span><\/p>\n<p dir=\"ltr\" style=\"text-align: justify;\"><span style=\"color: black;\">For example, <a href=\"https:\/\/streaklinks.com\/BhQNwb7FRUICgTYM7wNxtHWr\/https%3A%2F%2Fwww.tryklarity.com%2F\" target=\"_blank\" rel=\"noopener noreferrer\" data-auth=\"NotApplicable\" data-linkindex=\"7\">Klarity<\/a> uses a combination of optical character recognition (OCR) tools, BERT, and other models to turn contracts into machine-readable corpora. They then use GPT-4 to extract and interpret contract content through zero-shot and k-shot learning. The space is evolving at a blistering pace, and we\u2019re excited to see what developers, entrepreneurs, and machine learning engineers flock to in the coming months.<\/span><\/p>\n<p dir=\"ltr\" style=\"text-align: justify;\"><span style=\"color: black;\">Despite whatever media hype cycle exists about large proprietary models, they aren\u2019t always the panacea that we assume them to be. For many entrepreneurs building the next great company, the technical reality is often a mix of models of different sizes. 
These two realities can coexist \u2014\u00a0it\u2019s just a matter of how companies use them.<\/span><\/p>\n<p dir=\"ltr\"><div class=\"widget\"><div class=\"aside-card\">\t\t\t<div class=\"textwidget\"><p><a class=\"speedbump\" href=\"https:\/\/solutionsreview.com\/data-management\/data-management-vendor-map-a-guide-to-the-best-data-management-tools\/\" target=\"_blank\" rel=\"noopener\"><img loading=\"lazy\" decoding=\"async\" class=\"alignleft size-full wp-image-1682\" src=\"https:\/\/solutionsreview.com\/data-management\/files\/2019\/01\/data-management-vendor-map-sb-cta.jpg\" alt=\"Download Link to Data Management Vendor Map\" width=\"800\" height=\"225\" \/><\/a><\/p>\n<\/div>\n\t\t<\/div><\/div><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Solutions Review\u2019s Premium Content Series is a collection of contributed articles written by industry experts in enterprise software categories. In this feature, Tola Capital Vice President Jake Nibley, Partner Akshay Bhushan, and Founder Sinan Ozdemir offer a commentary on how fine-tuning and data quality are defining the AI arms race. 
We\u2019re only a few months [&hellip;]<\/p>\n","protected":false},"author":714,"featured_media":5535,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"footnotes":""},"categories":[3],"tags":[1376,1374,1375,1373],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v23.5 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>The Small(er) Data Era: How Fine-Tuning and Data Quality are Defining the AI Arms Race<\/title>\n<meta name=\"description\" content=\"Tola Capital&#039;s Jake Nibley, Akshay Bhushan, and Sinan Ozdemir offer a commentary on how fine-tuning and data quality are defining AI.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/solutionsreview.com\/data-management\/the-smaller-data-era-how-fine-tuning-and-data-quality-are-defining-the-ai-arms-race\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"The Small(er) Data Era: How Fine-Tuning and Data Quality are Defining the AI Arms Race\" \/>\n<meta property=\"og:description\" content=\"Tola Capital&#039;s Jake Nibley, Akshay Bhushan, and Sinan Ozdemir offer a commentary on how fine-tuning and data quality are defining AI.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/solutionsreview.com\/data-management\/the-smaller-data-era-how-fine-tuning-and-data-quality-are-defining-the-ai-arms-race\/\" \/>\n<meta property=\"og:site_name\" content=\"Best Data Management Software, Vendors and Data Science Platforms\" \/>\n<meta property=\"article:published_time\" content=\"2023-06-16T18:32:18+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2023-08-11T14:59:12+00:00\" \/>\n<meta property=\"og:image\" 
content=\"https:\/\/solutionsreview.com\/data-management\/files\/2023\/06\/Fine-Tuning-Data-Quality-AI.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"800\" \/>\n\t<meta property=\"og:image:height\" content=\"400\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Jake Nibley\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Jake Nibley\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"6 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/solutionsreview.com\/data-management\/the-smaller-data-era-how-fine-tuning-and-data-quality-are-defining-the-ai-arms-race\/\",\"url\":\"https:\/\/solutionsreview.com\/data-management\/the-smaller-data-era-how-fine-tuning-and-data-quality-are-defining-the-ai-arms-race\/\",\"name\":\"The Small(er) Data Era: How Fine-Tuning and Data Quality are Defining the AI Arms Race\",\"isPartOf\":{\"@id\":\"https:\/\/solutionsreview.com\/data-management\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/solutionsreview.com\/data-management\/the-smaller-data-era-how-fine-tuning-and-data-quality-are-defining-the-ai-arms-race\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/solutionsreview.com\/data-management\/the-smaller-data-era-how-fine-tuning-and-data-quality-are-defining-the-ai-arms-race\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/solutionsreview.com\/data-management\/files\/2023\/06\/Fine-Tuning-Data-Quality-AI.jpg\",\"datePublished\":\"2023-06-16T18:32:18+00:00\",\"dateModified\":\"2023-08-11T14:59:12+00:00\",\"author\":{\"@id\":\"https:\/\/solutionsreview.com\/data-management\/#\/schema\/person\/8c4f59582567fcc512b84cc07038d6a6\"},\"description\":\"Tola Capital's Jake 
Nibley, Akshay Bhushan, and Sinan Ozdemir offer a commentary on how fine-tuning and data quality are defining AI.\",\"breadcrumb\":{\"@id\":\"https:\/\/solutionsreview.com\/data-management\/the-smaller-data-era-how-fine-tuning-and-data-quality-are-defining-the-ai-arms-race\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/solutionsreview.com\/data-management\/the-smaller-data-era-how-fine-tuning-and-data-quality-are-defining-the-ai-arms-race\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/solutionsreview.com\/data-management\/the-smaller-data-era-how-fine-tuning-and-data-quality-are-defining-the-ai-arms-race\/#primaryimage\",\"url\":\"https:\/\/solutionsreview.com\/data-management\/files\/2023\/06\/Fine-Tuning-Data-Quality-AI.jpg\",\"contentUrl\":\"https:\/\/solutionsreview.com\/data-management\/files\/2023\/06\/Fine-Tuning-Data-Quality-AI.jpg\",\"width\":800,\"height\":400,\"caption\":\"The Small(er) Data Era: How Fine-Tuning and Data Quality are Defining the AI Arms Race\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/solutionsreview.com\/data-management\/the-smaller-data-era-how-fine-tuning-and-data-quality-are-defining-the-ai-arms-race\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/solutionsreview.com\/data-management\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"The Small(er) Data Era: How Fine-Tuning and Data Quality are Defining the AI Arms Race\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/solutionsreview.com\/data-management\/#website\",\"url\":\"https:\/\/solutionsreview.com\/data-management\/\",\"name\":\"Best Data Management Software, Vendors and Data Science Platforms\",\"description\":\"Enterprise Information 
Management\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/solutionsreview.com\/data-management\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/solutionsreview.com\/data-management\/#\/schema\/person\/8c4f59582567fcc512b84cc07038d6a6\",\"name\":\"Jake Nibley\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/solutionsreview.com\/data-management\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/47b768c323554e72c0d0533230ee5365?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/47b768c323554e72c0d0533230ee5365?s=96&d=mm&r=g\",\"caption\":\"Jake Nibley\"},\"description\":\"Jake Nibley has a background in business planning, applying financial and metrics-based views to strategic business and product decisions. As an investor he applies this experience to do deep financial forensics to understand the underlying health and drivers of businesses and to identify strategic growth opportunities.\",\"url\":\"https:\/\/solutionsreview.com\/data-management\/author\/jnibley\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"The Small(er) Data Era: How Fine-Tuning and Data Quality are Defining the AI Arms Race","description":"Tola Capital's Jake Nibley, Akshay Bhushan, and Sinan Ozdemir offer a commentary on how fine-tuning and data quality are defining AI.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/solutionsreview.com\/data-management\/the-smaller-data-era-how-fine-tuning-and-data-quality-are-defining-the-ai-arms-race\/","og_locale":"en_US","og_type":"article","og_title":"The Small(er) Data Era: How Fine-Tuning and Data Quality are Defining the AI Arms Race","og_description":"Tola Capital's Jake Nibley, Akshay Bhushan, and Sinan Ozdemir offer a commentary on how fine-tuning and data quality are defining AI.","og_url":"https:\/\/solutionsreview.com\/data-management\/the-smaller-data-era-how-fine-tuning-and-data-quality-are-defining-the-ai-arms-race\/","og_site_name":"Best Data Management Software, Vendors and Data Science Platforms","article_published_time":"2023-06-16T18:32:18+00:00","article_modified_time":"2023-08-11T14:59:12+00:00","og_image":[{"width":800,"height":400,"url":"https:\/\/solutionsreview.com\/data-management\/files\/2023\/06\/Fine-Tuning-Data-Quality-AI.jpg","type":"image\/jpeg"}],"author":"Jake Nibley","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Jake Nibley","Est. 
reading time":"6 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/solutionsreview.com\/data-management\/the-smaller-data-era-how-fine-tuning-and-data-quality-are-defining-the-ai-arms-race\/","url":"https:\/\/solutionsreview.com\/data-management\/the-smaller-data-era-how-fine-tuning-and-data-quality-are-defining-the-ai-arms-race\/","name":"The Small(er) Data Era: How Fine-Tuning and Data Quality are Defining the AI Arms Race","isPartOf":{"@id":"https:\/\/solutionsreview.com\/data-management\/#website"},"primaryImageOfPage":{"@id":"https:\/\/solutionsreview.com\/data-management\/the-smaller-data-era-how-fine-tuning-and-data-quality-are-defining-the-ai-arms-race\/#primaryimage"},"image":{"@id":"https:\/\/solutionsreview.com\/data-management\/the-smaller-data-era-how-fine-tuning-and-data-quality-are-defining-the-ai-arms-race\/#primaryimage"},"thumbnailUrl":"https:\/\/solutionsreview.com\/data-management\/files\/2023\/06\/Fine-Tuning-Data-Quality-AI.jpg","datePublished":"2023-06-16T18:32:18+00:00","dateModified":"2023-08-11T14:59:12+00:00","author":{"@id":"https:\/\/solutionsreview.com\/data-management\/#\/schema\/person\/8c4f59582567fcc512b84cc07038d6a6"},"description":"Tola Capital's Jake Nibley, Akshay Bhushan, and Sinan Ozdemir offer a commentary on how fine-tuning and data quality are defining 
AI.","breadcrumb":{"@id":"https:\/\/solutionsreview.com\/data-management\/the-smaller-data-era-how-fine-tuning-and-data-quality-are-defining-the-ai-arms-race\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/solutionsreview.com\/data-management\/the-smaller-data-era-how-fine-tuning-and-data-quality-are-defining-the-ai-arms-race\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/solutionsreview.com\/data-management\/the-smaller-data-era-how-fine-tuning-and-data-quality-are-defining-the-ai-arms-race\/#primaryimage","url":"https:\/\/solutionsreview.com\/data-management\/files\/2023\/06\/Fine-Tuning-Data-Quality-AI.jpg","contentUrl":"https:\/\/solutionsreview.com\/data-management\/files\/2023\/06\/Fine-Tuning-Data-Quality-AI.jpg","width":800,"height":400,"caption":"The Small(er) Data Era: How Fine-Tuning and Data Quality are Defining the AI Arms Race"},{"@type":"BreadcrumbList","@id":"https:\/\/solutionsreview.com\/data-management\/the-smaller-data-era-how-fine-tuning-and-data-quality-are-defining-the-ai-arms-race\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/solutionsreview.com\/data-management\/"},{"@type":"ListItem","position":2,"name":"The Small(er) Data Era: How Fine-Tuning and Data Quality are Defining the AI Arms Race"}]},{"@type":"WebSite","@id":"https:\/\/solutionsreview.com\/data-management\/#website","url":"https:\/\/solutionsreview.com\/data-management\/","name":"Best Data Management Software, Vendors and Data Science Platforms","description":"Enterprise Information 
Management","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/solutionsreview.com\/data-management\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/solutionsreview.com\/data-management\/#\/schema\/person\/8c4f59582567fcc512b84cc07038d6a6","name":"Jake Nibley","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/solutionsreview.com\/data-management\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/47b768c323554e72c0d0533230ee5365?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/47b768c323554e72c0d0533230ee5365?s=96&d=mm&r=g","caption":"Jake Nibley"},"description":"Jake Nibley has a background in business planning, applying financial and metrics-based views to strategic business and product decisions. As an investor he applies this experience to do deep financial forensics to understand the underlying health and drivers of businesses and to identify strategic growth 
opportunities.","url":"https:\/\/solutionsreview.com\/data-management\/author\/jnibley\/"}]}},"_links":{"self":[{"href":"https:\/\/solutionsreview.com\/data-management\/wp-json\/wp\/v2\/posts\/5522"}],"collection":[{"href":"https:\/\/solutionsreview.com\/data-management\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/solutionsreview.com\/data-management\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/solutionsreview.com\/data-management\/wp-json\/wp\/v2\/users\/714"}],"replies":[{"embeddable":true,"href":"https:\/\/solutionsreview.com\/data-management\/wp-json\/wp\/v2\/comments?post=5522"}],"version-history":[{"count":0,"href":"https:\/\/solutionsreview.com\/data-management\/wp-json\/wp\/v2\/posts\/5522\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/solutionsreview.com\/data-management\/wp-json\/wp\/v2\/media\/5535"}],"wp:attachment":[{"href":"https:\/\/solutionsreview.com\/data-management\/wp-json\/wp\/v2\/media?parent=5522"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/solutionsreview.com\/data-management\/wp-json\/wp\/v2\/categories?post=5522"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/solutionsreview.com\/data-management\/wp-json\/wp\/v2\/tags?post=5522"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}