Building Enterprise-Grade AI: Lessons Learned to Get from Ideation to Launch

Urmila Kukreja, Director of Product Management at Smartsheet, shares some lessons on developing enterprise-grade AI. This article originally appeared in Insight Jam, an enterprise IT community that enables human conversation on AI.

Every company today feels the pressure to leverage AI to boost productivity, streamline processes, and deliver impactful business results. And with good reason: Thomson Reuters’ 2024 Future of Professionals Report found that AI could save professionals 12 hours per week by 2029, freeing that time for driving innovation. The same report found that 77 percent of professionals believe AI will have a high or transformational impact on their work over the next five years.

Companies can explore several avenues when developing an AI approach: taking advantage of off-the-shelf AI solutions, adapting open-source generative AI models to get more customized results, or building their own models. Developing solutions using existing models is the preferred approach for many companies because, when done effectively, it can have a significant return on investment. This is particularly true for enterprises looking to scale generative AI across their organization, develop more proprietary capabilities, and meet higher security or compliance needs.

While ideas for new custom AI features may be plentiful, the road to launching them presents unique challenges. Below, I’ll share best practices for taking an idea all the way to an enterprise-grade AI tool that customers rely on daily, covering every step from ideation to development to testing and launch, along with the lessons my team learned along the way.

Ideation: Solving Old Problems, New Ways

Sometimes, figuring out where to start can be a challenge in itself. Hosting an internal hackathon is a great way to kickstart the creative process and generate unexpected ideas for applying AI to achieve a viable proof of concept. If your company already hosts an annual or bi-annual hackathon, you can encourage your team to explore AI ideas during an upcoming event.

Below are a few ways you can optimize AI ideation with a hackathon:

  1. Create guidelines: Before the hackathon, share clear guidelines with employees about which generative AI models they can use and what data they can share with those models, so everyone experiments securely and responsibly.
  2. Fail fast, learn faster: Encourage rapid prototyping and iterative testing. Use initial failures as learning opportunities to refine and improve AI solutions.
  3. Focus on real-world applications: Align hackathon projects with actual business needs and use cases as you think about the problems AI can solve. Think about what problems your tech tools solve today and how AI can enhance them. For example, if you have a content management tool, consider how AI can improve tagging and searching content. If you have a data analysis tool, consider how AI can help users formulate queries and get insights more easily.

Development: Tackling Key Obstacles

The road from the initial ‘aha’ moment to a production-ready, enterprise-grade AI solution can present numerous speedbumps. I have found that early successes were made possible by how easy large language models (LLMs) are to build on, getting us 80 percent of the way there in 20 percent of the time. Then the hard work began.

When working with AI models, common speedbumps your team might face include:

  1. Hallucinations: Creativity is key when using generative AI technology, but ensuring the content your model generates is accurate is paramount. We devoted considerable effort to limiting hallucinations through careful prompt engineering and by testing generated responses to verify their reliability against the underlying data (see the grounding sketch after this list).
  2. Latency: LLMs are slower than the other web interactions users are accustomed to. We spent substantial time reducing both actual and perceived latency without sacrificing accuracy; streaming partial responses, sketched below, is one simple lever for the latter. No one wants a wrong answer quickly.
  3. Blank canvas: Embedding a ChatGPT-style interface is fairly simple, but users often don’t know what AI can do, so crafting an informative UX is essential. We headed off confusion about where to start by offering suggested prompts, providing conversational experiences, and managing user expectations around latency.
  4. Beyond conversational experiences: For better or worse, AI customer service bots have gotten a bad rap, and most users just want to speak to a human as quickly as possible. But generative AI is so much more than conversational UX. We found that dedicated AI-powered entry points, which users can then refine through multi-turn conversation, make it far easier to get started. For example, a specific button to generate a status report, which users could then tweak to set the report’s timeframe or the definition of risk, did much better than a general-purpose ask-me-anything conversational interface.
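The first speedbump, hallucinations, is often tackled by grounding the model in retrieved data and instructing it to refuse when that data is insufficient. Here is a minimal sketch of that pattern using the OpenAI Python SDK; the model name, prompt wording, and refusal fallback are illustrative assumptions, not a description of any production setup:

```python
# Minimal sketch of prompt-level grounding. The model choice, prompt wording,
# and "I don't know" fallback are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

GROUNDED_SYSTEM_PROMPT = (
    "Answer ONLY using the context below. If the context does not contain "
    "the answer, reply exactly: I don't know.\n\nContext:\n{context}"
)

def grounded_answer(question: str, context: str) -> str:
    """Ask the model a question, constrained to the supplied context."""
    response = client.chat.completions.create(
        model="gpt-4o",   # hypothetical model choice
        temperature=0,    # low temperature curbs creative-but-wrong output
        messages=[
            {"role": "system", "content": GROUNDED_SYSTEM_PROMPT.format(context=context)},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content
```

The refusal instruction matters as much as the context itself: it gives the model an approved way out, which you can also detect and handle gracefully in the UI.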
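For the second speedbump, latency, one widely used lever is streaming: total generation time is unchanged, but users see the answer forming immediately instead of staring at a spinner. A sketch under the same assumptions as above:

```python
# Sketch of streaming tokens as they arrive, which cuts *perceived* latency.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def stream_answer(question: str) -> None:
    stream = client.chat.completions.create(
        model="gpt-4o",  # hypothetical model choice, as above
        messages=[{"role": "user", "content": question}],
        stream=True,     # yield chunks instead of waiting for the full completion
    )
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:  # some chunks carry no text (e.g., the final stop chunk)
            print(delta, end="", flush=True)
```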

Testing AI: A Different Beast

Testing AI solutions requires a distinctly different approach compared to traditional software. Unlike deterministic code, AI outputs are probabilistic, meaning the same prompt can result in different answers. Traditional regression testing to ensure repeatable results no longer works. Instead, a continuous process of evaluating the accuracy and appropriateness of AI output across varying contexts is essential.

When testing AI solutions, I recommend the following:

  1. Continuous evaluation: Create a dataset of questions with verified “correct” answers, and score the AI’s responses against it to track accuracy over time (a minimal eval loop is sketched after this list). For example, we built an AI solution to help respond to customers’ Help and Support questions, and we evaluated it against the top 100 questions customers bring to our support team, paired with the correct answer for each.
  2. Organize testing events: Employees are a great resource for initial testing and feedback. Hold company-wide AI events, whether a testing “power hour” or a raffle with prizes, to encourage participation across levels and teams and gather realistic test data.
  3. Embrace failure: Learn from initial failures to refine and improve AI solutions. For example, our first beta of an AI-assisted template search tool didn’t provide significant value, so we returned to the drawing board to develop a more robust solution builder that truly co-builds solutions with users.
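To make the continuous-evaluation idea in step 1 concrete, here is a minimal sketch of a golden-dataset eval loop. The dataset shape and the string-similarity scoring are simplified assumptions; production evals typically use an LLM-as-judge or task-specific metrics, but the shape of the loop is the same:

```python
# Minimal sketch of a golden-dataset evaluation loop. The dataset shape and
# string-overlap scoring are simplified assumptions; real evals often use an
# LLM-as-judge or task-specific metrics instead.
from difflib import SequenceMatcher

# Hypothetical golden dataset: top support questions with verified answers.
GOLDEN_SET = [
    {"question": "How do I reset my password?",
     "answer": "Go to Account > Security and click 'Reset password'."},
    # ... remaining verified question/answer pairs
]

def score(candidate: str, reference: str) -> float:
    """Crude similarity in [0, 1]; stands in for a real eval metric."""
    return SequenceMatcher(None, candidate.lower(), reference.lower()).ratio()

def run_eval(answer_fn, threshold: float = 0.7) -> float:
    """Run every golden question through the model and report the pass rate."""
    passes = 0
    for item in GOLDEN_SET:
        candidate = answer_fn(item["question"])
        if score(candidate, item["answer"]) >= threshold:
            passes += 1
    return passes / len(GOLDEN_SET)

# Re-run on every prompt or model change to catch regressions, e.g.:
# pass_rate = run_eval(lambda q: grounded_answer(q, context=load_docs(q)))
# (load_docs is a hypothetical retrieval helper.)
```

Because outputs are probabilistic, treat the pass rate as a trend to monitor across prompt and model changes, not a test that must pass bit-for-bit.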

Launching Enterprise-Grade AI Tools: Increasing Transparency and Accountability 

A key concern when introducing AI tools is customer privacy and security. Customers often ask if their prompts are used to train AI models. Transparency is crucial here—customers need to understand how you use public models and protect their data to feel comfortable using your AI tools.

To achieve this, publicly document exactly how your AI systems work. Encourage your team to get as specific as possible and explain how everything works in detail. It’s also important that the tools you’re building show their work so your customers understand the recommendations and insights AI provides. As AI becomes more integrated into our everyday work, explaining its reasoning will be important for maintaining customer trust.

Building customized AI solutions is challenging, but it can have huge benefits for your organization when done effectively. By fostering a culture of innovation, focusing on practical, useful applications, and maintaining rigorous testing standards that build trust, your team—and, most importantly, your customers—can leverage enterprise-grade AI to drive significant business value.

