H.R. Tizhoosh

KIMIA Lab, Mayo Clinic, Rochester, MN, USA

 

Nowadays, many people engage in artificial intelligence (AI) model building. While this is a promising indicator of the democratization of artificial intelligence, it can also lead to significant inefficiencies if those involved lack the requisite skills. Building AI models, particularly very deep and complex ones like foundation models, is a resource-intensive process that demands expertise across multiple disciplines. Without proper knowledge and preparation, efforts in AI model design and training can result in wasted time, energy, and computational resources. This article explores the skills and qualifications necessary for successfully designing, training, and validating large AI models, especially very deep networks such as foundation models, which drive innovation in sensitive fields such as healthcare and finance.

 

Building accurate, reliable, and unbiased AI models requires a combination of technical, analytical, and practical skills. Here’s a closer look at the key areas of expertise:

 

Strong Foundational Knowledge - To design and train AI models effectively, individuals need a solid understanding of the mathematical and theoretical foundations that underpin machine learning:

 

  • Mathematics: Expertise in linear algebra, calculus, probability, and statistics is essential for understanding model operations and optimization.
  • Machine Learning Fundamentals: Knowledge of supervised, unsupervised, and reinforcement learning underlies most modern AI work.
  • Deep Learning Concepts: Proficiency in neural network architectures, backpropagation, and training techniques is critical for modern AI tasks.

 

Proficiency in Programming and Tools - Programming forms the backbone of AI model building:

 

  • Languages and Frameworks: Python is a must-have skill, along with deep learning libraries like TensorFlow, PyTorch, or Keras.
  • Software Engineering Practices: Writing clean, modular, and efficient code is essential for scalable and maintainable models.
  • Version Control: Familiarity with Git or similar tools is important for collaborative development.

 

Data Management Expertise - AI models are only as good as the data they are trained on:

 

  • Data Preprocessing: Skills in cleaning, augmenting, and balancing datasets are vital.
  • Handling Large Datasets: Experience with distributed computing tools like Apache Spark helps keep data processing efficient at scale.
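
As a small illustration of the balancing skill mentioned above, the following sketch computes inverse-frequency class weights, one common way to compensate for an imbalanced dataset during training. The function name and the label set are hypothetical and chosen only for illustration; real projects would usually rely on the equivalent utility of their framework.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class so that rarer classes count more in the loss."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    # Common "balanced" heuristic: n_samples / (n_classes * class_count).
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical, heavily imbalanced label set: 90 benign cases, 10 malignant cases.
labels = ["benign"] * 90 + ["malignant"] * 10
weights = inverse_frequency_weights(labels)

for cls, w in sorted(weights.items()):
    print(f"{cls}: {w:.3f}")
```

With this heuristic, the rare class receives a proportionally larger weight, so errors on it are not drowned out by the majority class.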

 

Model Training and Optimization - Building a high-performing model requires careful design and optimization:

 

  • Architecture Design: Selecting or creating appropriate architectures tailored to the problem.
  • Training Techniques: Implementing methods like batch normalization, dropout, and adaptive learning rates.
  • Hardware Utilization: Leveraging Graphics Processing Units (GPUs) and Tensor Processing Units (TPUs) for faster training.
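
To make "adaptive learning rates" more concrete, here is a minimal sketch of the Adam update rule applied to a one-dimensional toy problem. The objective, starting point, and hyperparameter values are assumptions chosen for illustration; in practice one would use the optimizer shipped with a deep learning library rather than write it by hand.

```python
import math


def adam_minimize(grad_fn, x0, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8, steps=2000):
    """Minimize a scalar function with Adam, given a function returning its gradient."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g      # running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g  # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)         # bias correction for the early steps
        v_hat = v / (1 - beta2 ** t)
        # The step is scaled by the gradient history, hence "adaptive".
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Toy objective f(x) = (x - 3)^2, whose gradient is 2 * (x - 3); the minimum is at x = 3.
x_min = adam_minimize(lambda x: 2.0 * (x - 3.0), x0=0.0)
print(f"estimated minimizer: {x_min:.4f}")
```

Knowing what each term does is what allows a practitioner to diagnose a stalled or diverging training run instead of restarting it blindly.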

 

Validation and Bias Mitigation - To ensure models are reliable and fair:

 

  • Evaluation Metrics: Understanding task-specific metrics like accuracy, precision, recall, and F1-score.
  • Bias Analysis: Identifying and addressing potential biases in data or model behavior.
  • Error Analysis: Debugging and resolving issues that affect performance.
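
The metrics listed above can be sketched in a few lines. The example below uses made-up labels and is intended only to show why accuracy alone can mislead on imbalanced data; the function name is hypothetical.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up ground truth: 2 positive cases out of 10.
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]

# A "model" that always predicts negative still reaches 80% accuracy.
always_negative = binary_metrics(y_true, [0] * 10)

# A model that finds one of the two positives and raises one false alarm.
some_signal = binary_metrics(y_true, [1, 0, 0, 1, 0, 0, 0, 0, 0, 0])

print(always_negative)
print(some_signal)
```

Both sets of predictions have the same accuracy, yet the first never detects a positive case. This is the kind of gap that task-specific metrics are meant to expose.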

 

Deployment and Monitoring - The work doesn’t end with a trained model:

 

  • Deployment: Packaging models for real-world use with tools like Docker or TensorFlow Serving.
  • Monitoring: Continuously tracking model performance to detect drift or degradation over time.
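
One simple way to watch for drift is to compare the distribution of an input feature in production against the distribution seen during training. The sketch below uses the population stability index (PSI) for that purpose; the data, the bin count, and the 0.25 alert threshold (a commonly quoted rule of thumb) are assumptions for illustration, not a recommendation from this article.

```python
import math


def population_stability_index(reference, current, n_bins=10):
    """Compare two samples of one feature; larger values indicate a larger shift."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if the reference is constant

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp values outside the reference range
            counts[idx] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    ref = bin_fractions(reference)
    cur = bin_fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Synthetic feature values: training-time data versus two production snapshots.
training = [i / 100 for i in range(100)]
same_as_training = list(training)
shifted = [0.5 + i / 200 for i in range(100)]

psi_stable = population_stability_index(training, same_as_training)
psi_shifted = population_stability_index(training, shifted)

ALERT_THRESHOLD = 0.25  # assumed rule-of-thumb value
print(f"stable: {psi_stable:.3f}, shifted: {psi_shifted:.3f}")
print("drift alert" if psi_shifted > ALERT_THRESHOLD else "no alert")
```

A check like this only flags that the inputs have changed; deciding whether the model must be retrained or revalidated still requires the domain and validation expertise described in this article.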

 

Domain Expertise - Effective AI model builders often have specialized knowledge in the application area. For instance:

 

  • In healthcare, understanding medical imaging or diagnostics is crucial.
  • In finance, expertise in risk modeling or fraud detection adds value.

 

The High Cost of Inefficiency - Training AI models—especially deep networks—is computationally expensive, consuming vast amounts of energy and time. The recent rise in AI democratization has led to many individuals and organizations attempting to build models without sufficient expertise. While enthusiasm for AI is commendable, the inefficiency of unskilled efforts has significant consequences:

 

  • Wasted Resources: Computational power, storage, and human effort are squandered.
  • Inferior Models: Poorly designed models may underperform or fail to meet the intended goals.
  • Ethical Risks: Models that are biased or unreliable can have harmful consequences, especially in sensitive domains.

 

And most importantly: previous practical experience, specifically 3–5 years of successful experience with the design, training, validation, and publication of ordinary (not very large) deep networks, is critical.

 

  • Experience with training Multilayer Perceptrons (MLPs): Familiarity with the fundamentals of shallow networks is essential.
  • Experience with training Convolutional Neural Networks (CNNs): A solid understanding of weight sharing and filtering concepts is necessary.
  • Experience with training autoencoders: Proficiency in handling bottlenecks and reconstruction nuances is important.
  • Experience with training Transformers: Expertise in designing transformer architectures by hand is key.

 

While it is exciting to see widespread interest in AI model building, it is critical to recognize that success in this field requires a robust skill set. Designing, training, and validating very deep networks involves a blend of mathematical knowledge, programming expertise, data handling proficiency, and domain-specific insight.

 

One of the most striking examples of the damage caused by individuals operating outside their area of expertise is the 2008 global financial crisis, specifically the role of poorly understood and mishandled financial instruments like subprime mortgages and collateralized debt obligations (CDOs). Could we witness a similar catastrophe in medicine?

 

Without the skills outlined above, attempts to create large and complex AI models to serve as foundation models often result in inefficiencies, subpar outcomes, and ethical risks. It is utterly irresponsible to waste resources on the training of such models without the necessary expertise. Those seeking to engage in this field must first equip themselves with the required skills, or collaborate with those who already possess them, to ensure their efforts are impactful and meaningful.

 

Imagine a poorly designed and trained foundation model somehow making its way to a patient’s bedside or the operating room. It’s a horrifying picture of what could go wrong before the issues are detected.

 

First published on LinkedIn, December 23, 2024

 

 

Disclaimer: In seeking to foster discourse on a wide array of ideas, the Digital Pathology Association believes that it is important to share a range of prominent industry viewpoints. This article does not necessarily express the viewpoints of the DPA; however, we view this as a valuable point with which to facilitate discussion.