Machine Learning Guide

MLG 027 Hyperparameters 1



Full notes and resources at  ocdevel.com/mlg/27 


Hyperparameters are crucial elements in the configuration of machine learning models. Unlike parameters, which are learned by the model during training, hyperparameters are set by humans before the learning process begins. They are the knobs and dials that humans can control to influence the training and performance of machine learning models.

Definition and Importance

Hyperparameters differ from parameters like theta in linear and logistic regression, which are learned weights. They are choices made by humans, such as the type of model, the number of neurons in a layer, or the model architecture. These choices can significantly affect the model's performance, so they deserve deliberate, informed tuning.
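To make the distinction concrete, here is a minimal sketch of linear regression trained by gradient descent. The learning rate and epoch count are hyperparameters fixed up front, while theta0 and theta1 are parameters learned from the data. The function name, the values, and the toy data are illustrative assumptions, not taken from the episode.

```python
# Hyperparameters: chosen by a human before training starts (illustrative values).
LEARNING_RATE = 0.05
EPOCHS = 2000


def train_linear_regression(xs, ys, learning_rate=LEARNING_RATE, epochs=EPOCHS):
    """Fit y = theta0 + theta1 * x by batch gradient descent on mean squared error."""
    theta0, theta1 = 0.0, 0.0  # Parameters: learned from the data during training.
    n = len(xs)
    for _ in range(epochs):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = 2.0 / n * sum(errors)
        grad1 = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        theta0 -= learning_rate * grad0
        theta1 -= learning_rate * grad1
    return theta0, theta1


# Toy data generated from y = 1 + 2x, so training should recover theta close to (1, 2).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
theta0, theta1 = train_linear_regression(xs, ys)
```

Changing LEARNING_RATE or EPOCHS changes how (and whether) training converges, but the model never adjusts those values itself.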

Types of Hyperparameters

Model Selection:

Choosing what model to use is itself a hyperparameter. For example, deciding between linear regression, logistic regression, naive Bayes, or neural networks.

Architecture of Neural Networks:

  • Number of Layers and Neurons: Deciding the width (number of neurons) and depth (number of layers).
  • Types of Layers: Whether to use LSTMs, convolutional layers, or dense layers.

Activation Functions:

Activation functions apply a non-linear transformation to a layer's linear output. Popular choices include ReLU, tanh, and sigmoid, with ReLU being the usual default for hidden layers.
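The three activations named above can be sketched as plain scalar functions. This is only an illustration of their shapes; real frameworks apply them element-wise to whole tensors.

```python
import math


def relu(z):
    """Pass positive values through unchanged; clamp negatives to zero."""
    return max(0.0, z)


def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Squash any real number into (-1, 1), centered on zero."""
    return math.tanh(z)


# Apply each activation to the same linear outputs to compare their ranges.
linear_outputs = [-2.0, 0.0, 3.0]
relu_out = [relu(z) for z in linear_outputs]
sigmoid_out = [sigmoid(z) for z in linear_outputs]
tanh_out = [tanh(z) for z in linear_outputs]
```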

Regularization and Optimization:

Regularization and optimization settings shape how the model learns. Whether to use L1/L2 regularization or dropout, and which optimizer to use (e.g., Adam, Adagrad), are all hyperparameters.
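As a rough sketch of where these hyperparameters enter the math: the L1/L2 strength scales a penalty added to the loss, and the dropout rate sets how often an activation is zeroed during training. The function names and the inverted-dropout variant shown here are assumptions chosen for illustration; optimizers are not covered in this sketch.

```python
import random


def l1_penalty(weights, l1_lambda):
    """L1 term added to the loss: l1_lambda * sum of absolute weights."""
    return l1_lambda * sum(abs(w) for w in weights)


def l2_penalty(weights, l2_lambda):
    """L2 term added to the loss: l2_lambda * sum of squared weights."""
    return l2_lambda * sum(w * w for w in weights)


def dropout(activations, rate, rng, training=True):
    """Inverted dropout, assuming 0 <= rate < 1.

    During training each activation is zeroed with probability `rate` and the
    survivors are scaled by 1 / (1 - rate). Outside training it is a no-op.
    """
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


weights = [1.0, -2.0]
activations = [0.5, 1.5, 2.5, 3.5]
dropped = dropout(activations, rate=0.5, rng=random.Random(0))
```

Here l1_lambda, l2_lambda, and rate are the knobs a human (or a search procedure) would set.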

Optimization Techniques

Techniques like grid search, random search, and Bayesian optimization systematically explore combinations of hyperparameters to find the best configuration for a given task. These methods can be computationally expensive, since every candidate configuration means training and evaluating a model, but they are often what it takes to get the most out of a model.
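A minimal sketch of grid search and random search over two hyperparameters follows. The validation_score function is a hypothetical stand-in for "train a model with this configuration and return its validation score"; its formula, the search ranges, and the hyperparameter names are assumptions made up for this example. Bayesian optimization is not shown.

```python
import itertools
import math
import random


def validation_score(learning_rate, hidden_units):
    """Hypothetical stand-in for training and scoring a model.

    Higher is better; by construction it peaks at learning_rate=0.01, hidden_units=64.
    """
    return -((math.log10(learning_rate) + 2.0) ** 2 + ((hidden_units - 64) / 64.0) ** 2)


def grid_search(grid):
    """Score every combination of the listed values and keep the best one."""
    names = list(grid)
    best_config, best_score = None, -math.inf
    for values in itertools.product(*(grid[name] for name in names)):
        config = dict(zip(names, values))
        score = validation_score(**config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def random_search(n_trials, seed=0):
    """Score randomly sampled configurations and keep the best one."""
    rng = random.Random(seed)
    best_config, best_score = None, -math.inf
    for _ in range(n_trials):
        config = {
            "learning_rate": 10 ** rng.uniform(-4, 0),  # log-uniform sample
            "hidden_units": rng.randint(8, 256),
        }
        score = validation_score(**config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


grid = {"learning_rate": [0.001, 0.01, 0.1], "hidden_units": [16, 64, 256]}
grid_best, grid_score = grid_search(grid)
random_best, random_score = random_search(n_trials=20)
```

Grid search costs one training run per combination (9 here), which grows quickly as hyperparameters are added; random search lets you fix the budget (20 trials here) regardless of how many hyperparameters are searched.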

Challenges and Future Directions

The field strives towards simplifying the choice of hyperparameters, ideally automating them to become parameters of the model itself. Efforts like Google's AutoML aim to handle hyperparameter tuning automatically.

Understanding and optimizing hyperparameters is a cornerstone of machine learning, directly impacting the effectiveness and efficiency of a model. Work continues on folding these choices into model training itself, reducing the dependency on human intervention and trial-and-error experimentation.

Decision Tree
  • Model selection
    • Unsupervised? K-means Clustering => DL
    • Linear? Linear regression, logistic regression
    • Simple? Naive Bayes, Decision Tree (Random Forest, Gradient Boosting)
    • Little data? Boosting
    • Lots of data, complex situation? Deep learning
  • Network
    • Layer arch
      • Vision? CNN
      • Time? LSTM
      • Other? MLP
      • Trading LSTM => CNN decision
    • Layer size design (funnel, etc)
      • Face pics
      • From BTC episode
      • Don't know? Layers=1, Neurons=mean(inputs, outputs)
  • Activations / nonlinearity
    • Output
      • Sigmoid = predicts a probability, usually used at the output layer
      • Softmax = multi-class
      • Nothing = regression
    • ReLU family (Leaky ReLU, ELU, SELU, ...) = helps with vanishing gradient (gradient is constant for positive inputs), performance, usually better
    • Tanh = classification between two classes, useful when zero-mean outputs matter
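
Two of the rules of thumb in the decision tree above can be written down directly: the "don't know" starting point for layer size, and the choice of output activation by task. This is a sketch under assumptions: the function and dictionary names are hypothetical, and integer division is just one way to read "mean of inputs and outputs".

```python
import math


def starting_hidden_neurons(n_inputs, n_outputs):
    """Starting guess for one hidden layer: mean of input and output sizes (at least 1)."""
    return max(1, (n_inputs + n_outputs) // 2)


# Output-layer activation by task, following the list above.
OUTPUT_ACTIVATION = {
    "binary": "sigmoid",      # predict a probability
    "multiclass": "softmax",  # one probability per class
    "regression": None,       # no activation: raw linear output
}


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 (shifted for numerical stability)."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


hidden = starting_hidden_neurons(n_inputs=10, n_outputs=2)
class_probs = softmax([1.0, 2.0, 3.0])
```

Treat the result as a starting point to tune from, not a final answer.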