AiThority Interview with Erin LeDell, Chief Scientist at Distributional AI
Erin LeDell, Chief Scientist at Distributional AI chats about AI product development lifecycles, how AI is reshaping the world of business while touching upon the potential dangers of AI in this Q&A:
———
Hi Erin, tell us about yourself and your role at Distributional.
As Chief Scientist at Distributional, my focus includes a mix of scientific leadership and product strategy. I have deep experience designing and developing AI software, which allows me to collaborate closely across the go-to-market, engineering, product, research, and customer success teams to ensure that the product we’re building is a vital part of the modern AI tech stack. I’m also using my expertise in AI evaluation and statistical analysis to further build the depth and breadth of the product. It’s been a longtime goal of mine to make AI more reliable and trustworthy, so I am looking forward to delivering on that in this new role. I was actually in the process of starting my own company focused on quantifying the behavior of machine learning algorithms when I first connected with Distributional’s CEO Scott Clark and was so impressed by the vision and product differentiation that I decided to join forces with him and come on as Chief Scientist instead.
In my prior decade of AI evaluation and benchmarking roles, I co-created the industry benchmark for AutoML systems—AMLB – which continues to be used by large commercial AI labs such as Amazon and Microsoft to evaluate their AutoML systems. The platform has been a big driver in the increase in performance and reliability of AutoML systems for the past five years. Additionally, I spent eight years as the Chief Machine Learning Scientist at the enterprise AI software company, H2O.ai, where I had the opportunity to work on a variety of projects spanning the entire AI stack. For example, I led scientific and product efforts on the open source enterprise ML platform, H2O, which has been adopted by numerous Fortune 500 companies for its critical applications in high-impact production environments.
Why would you say cohesive AI testing is important in today’s AI product development lifecycle? What are some of the misses in current testing cycles that Distributional helps fill the gap in?
As someone who has built many benchmarks, I actually feel rigorous testing that provides confidence in AI models is much more valuable for an organization overall. In seeing the many ways that unpredictable or unsafe models can cause real-life harm to people, and financial loss for the companies deploying them, I’ve been a longtime advocate of increasing models’ fairness and safety.
As AI becomes more integrated throughout enterprise workflows and is leveraged as a strategic advantage to drive long-term business value, I am extremely motivated to deliver the correct tooling to minimize potential sources of harm. As generative AI models merge into already complex AI software systems, further increasing their complexity and unpredictability, robust AI testing has never been more important than it is right now.
In terms of how Distributional’s testing platform fills these crucial gaps, the product removes the operational burden on enterprises to build and maintain their own solutions or cobble together incomplete solutions with other tools. Generative AI is particularly unreliable since it is inherently non-deterministic, and is also more likely to be non-stationary with many shifting components that are outside of the control of developers. As AI leaders are increasingly under pressure to ship generative AI, Distributional helps automate AI testing with intelligent suggestions on augmenting application data, suggesting tests, and enabling a feedback loop that adaptively calibrates these tests for each AI application being tested. By proactively addressing these testing problems with Distributional, AI teams can deploy with more confidence and proactively catch issues with AI applications before they cause significant damage in production.
What about the current state of global AI innovations most piques your interest?
The current state of global AI innovations is evolving rapidly in a myriad of directions, however, what I personally find to be the most compelling is the applications of generative AI to scientific and medical research. GenAI’s ability to analyze vast, unstructured datasets, identify patterns, and generate hypotheses is revolutionizing the pace and scope of discovery in these fields. In scientific research, GenAI is enabling breakthroughs in materials science, drug discovery, and climate modeling by simulating complex systems and predicting outcomes with remarkable accuracy. In the medical field, it is transforming areas such as personalized medicine, early diagnosis, and treatment planning, allowing for tailored solutions that significantly improve patient outcomes. What makes this particularly compelling is not only the speed of these advancements but also the potential to solve challenges previously thought insurmountable, ultimately improving lives on a global scale.
Also Read:Â AiThority Interview with Louis Landry, CTO of Teradata
Can you talk about the most interesting AI software out there that you feel is set to reshape the future of AI?
A new software project I am very excited about is DocETL, and a new method that I think will reshape the capabilities of LLMs is a technique called Memory Tuning.
DocETL is a system designed to facilitate complex document processing tasks by leveraging Large Language Models (LLMs). Developed by researchers at the University of California, Berkeley, it provides a low-code, declarative interface that allows users to define and execute data processing pipelines on unstructured datasets. The low-code nature of DocETL democratizes AI by making advanced document processing capabilities accessible to users without extensive programming expertise. This accessibility can accelerate AI adoption across various sectors, fostering innovation and efficiency.
Memory Tuning is an innovative technique that enhances the factual accuracy of Large Language Models (LLMs) by embedding precise information directly into the model, significantly reducing hallucinations—instances where models generate incorrect or nonsensical information. Developed by Lamini Inc, this method involves fine-tuning millions of expert adapters, such as Low-Rank Adapters (LoRAs), on top of existing open-source LLMs like Llama 3 or Mistral 3. Unlike traditional fine-tuning methods that may compromise a model’s generalization capabilities, Memory Tuning maintains the LLM’s versatility while embedding specific facts. By addressing the longstanding challenge of balancing factual accuracy with generalization, Memory Tuning paves the way for more reliable and efficient AI systems, thereby reshaping the landscape of AI applications in the future.
What in your view should AI developers keep in mind when building new products for the market?
Successful AI product development requires a comprehensive approach that balances multiple critical factors. Technical robustness and thorough safety testing form the foundation, but equally important is ensuring the product solves real-world problems and integrates smoothly into existing workflows. Developers must carefully consider scalability and resource efficiency from the outset, while building in appropriate levels of transparency and explainability to build user trust. Success also depends on clear documentation and support systems that help users understand both capabilities and limitations, coupled with robust monitoring and feedback mechanisms that enable continuous improvement after deployment.
Also read:Â Building the AI Platform of Tomorrow: The Role of Inference Applications in 2025 and Beyond
A few thoughts on the dangers of AI and ethics that should be pursued in the future?
In my opinion, the most pressing concern in AI development is the critical need for comprehensive testing and risk assessment, particularly as AI systems are increasingly deployed in high-stakes domains like healthcare, transportation, finance and our information ecosystem. Without rigorous testing frameworks and proper safeguards, these systems can amplify societal biases, make consequential errors, and AI is already being used to pollute our information ecosystem with misinformation. The rapid advancement of language models and autonomous systems demands robust evaluation protocols that assess not just technical performance, but also potential societal impacts and unintended consequences. To mitigate these risks, we need safety standards, and proactive governance frameworks that evolve alongside technological capabilities.
Comprehensive testing provides a critical foundation for high-stakes AI systems by combining rigorous technical validation, adversarial testing, and real-world verification to catch potential failures before deployment, while ongoing monitoring ensures continued safety and reliability. This multi-layered approach helps identify issues across diverse scenarios, different populations, and varying conditions, ultimately creating more robust and trustworthy AI systems for critical applications.
[To share your insights with us as part of editorial or sponsored content, please write to psen@itechseries.com]
Dr. LeDell is the Chief Scientist at Distributional, Inc, the modern enterprise platform for AI testing and evaluation. Since 2013, she is also the CEO of DataScientific, Inc., a boutique AI Advisory and Consulting firm specializing in the development and implementation of cutting-edge AI solutions.
Previously, she was the Chief Machine Learning Scientist at H2O.ai, a leading AI company known for producing H2O, an open-source, distributed machine learning platform, along with Driverless AI, h2oGPT, LLMStudio, and a range of other Enterprise AI systems. At H2O.ai, she created the H2O AutoML algorithm, the first enterprise open source AutoML tool. She also spearheaded efforts in explainable/interpretable AI, algorithmic fairness and AI benchmarking and measurement. She is co-creator of the OpenML AutoML Benchmark, the industry standard benchmarking platform for AutoML algorithms.
Before joining H2O.ai, she was the Principal Data Scientist at two AI startups (both acquired), founded DataScientific, Inc. and was a software engineer at a large consulting firm. She received her Ph.D. from UC Berkeley where her research focused on machine learning and computational statistics. She also holds a B.S. and M.A. in Mathematics.
Distributional AI is the modern platform for enterprise AI testing and evaluation to make AI safe, reliable, and secure.
Comments are closed.