Machine Learning Interview Questions 2025: Series 4
41. What Is the Learning Rate in Machine Learning?
The learning rate controls how much a model adjusts in response to the estimated error during training. Think of it like taking baby steps toward accuracy—too small, and training takes forever; too large, and the model may overshoot the best solution.
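To make this concrete, here is a minimal sketch of a single gradient-descent update in Python; the loss, data, and step size are all illustrative:

```python
# One gradient-descent step: w_new = w - learning_rate * gradient
w, learning_rate = 0.0, 0.1        # illustrative starting weight and step size
x, y = 2.0, 4.0                    # one training example (true relation: y = 2x)
gradient = 2 * (w * x - y) * x     # derivative of the squared error (w*x - y)**2
w = w - learning_rate * gradient   # small step against the gradient
print(w)                           # 1.6: the weight moves toward 2.0
```

A larger learning rate would jump past 2.0; a tiny one would crawl toward it over many steps.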
42. ⚙️ What Are Epochs, Batches, and Iterations?
An epoch is one full pass through the entire training dataset. A batch is a subset of the data processed at once, and an iteration is one update of the model weights, performed once per batch. One epoch therefore contains (dataset size ÷ batch size) iterations. Batching helps manage memory and can speed up training.
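A quick arithmetic sketch of how the three terms relate, with made-up numbers:

```python
dataset_size, batch_size, epochs = 1000, 100, 5    # illustrative values

iterations_per_epoch = dataset_size // batch_size  # 10 weight updates per epoch
total_iterations = epochs * iterations_per_epoch   # 50 updates over all epochs
print(iterations_per_epoch, total_iterations)      # 10 50
```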
43. What is the Difference Between Training, Validation, and Test Sets?
The training set is used to fit the model. The validation set is used to tune hyperparameters and check performance during training. The test set evaluates how well the model generalizes to new, unseen data.
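One common way to get all three sets, sketched with scikit-learn's train_test_split (the 60/20/20 ratio is just an example):

```python
import numpy as np
from sklearn.model_selection import train_test_split

X, y = np.arange(100).reshape(50, 2), np.arange(50)  # toy data: 50 samples

# Carve off the test set first, then split the remainder into train/validation.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=42)
print(len(X_train), len(X_val), len(X_test))  # 30 10 10
```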
44. What is Grid Search?
Grid Search is a technique to find the best combination of hyperparameters by trying every possible option in a specified range. Though time-consuming, it's effective for fine-tuning model performance.
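A minimal sketch with scikit-learn's GridSearchCV; the model and grid values are arbitrary examples:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=200, random_state=42)

# 2 x 2 = 4 hyperparameter combinations, each scored with 5-fold cross-validation.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, 5]}
search = GridSearchCV(RandomForestClassifier(random_state=42), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_)
```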
45. What Are Mean Absolute Error (MAE) and Mean Squared Error (MSE)?
MAE measures average absolute differences between predicted and actual values. MSE squares the errors, giving higher penalty to large mistakes. MSE is sensitive to outliers, while MAE gives equal weight to all errors.
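The difference is easy to see on a toy example with one large miss (the values are made up):

```python
from sklearn.metrics import mean_absolute_error, mean_squared_error

y_true = [3.0, 5.0, 2.0]
y_pred = [2.5, 5.0, 4.0]   # errors of 0.5, 0.0, and 2.0

print(mean_absolute_error(y_true, y_pred))  # 0.833...: plain average of |errors|
print(mean_squared_error(y_true, y_pred))   # 1.416...: the big miss dominates
```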
46. What Is the R-squared Score?
The R² score measures how much of the variance in the target variable is captured by the model. A value of 1 means perfect prediction; 0 means the model performs no better than always predicting the mean (and negative values are possible when it does worse). It’s a handy metric for regression models.
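Both boundary cases in one toy snippet (the numbers are illustrative):

```python
from sklearn.metrics import r2_score

y_true = [1.0, 2.0, 3.0, 4.0]
print(r2_score(y_true, [1.0, 2.0, 3.0, 4.0]))  # 1.0: perfect predictions
print(r2_score(y_true, [2.5, 2.5, 2.5, 2.5]))  # 0.0: same as predicting the mean
```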
47. What Is the Curse of Dimensionality?
As the number of features (dimensions) increases, data becomes sparse and models struggle to generalize. This “curse” can be tackled using dimensionality reduction techniques like PCA or feature selection.
48. ⚡ What Are Hyperparameters vs. Parameters?
Parameters are learned from training (e.g., weights in a neural network). Hyperparameters are set before training (like learning rate, number of trees). Tuning hyperparameters is essential for optimal model performance.
49. What is Bagging?
Bagging, or Bootstrap Aggregating, builds multiple models on bootstrap samples (random subsets drawn with replacement) and averages or votes over their outputs. It reduces variance and helps high-variance models like Decision Trees perform better. Random Forest is a classic example of bagging.
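A minimal bagging sketch with scikit-learn; the ensemble size is arbitrary:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, random_state=42)

# 10 trees, each trained on its own bootstrap sample; predictions combined by voting.
bagger = BaggingClassifier(DecisionTreeClassifier(), n_estimators=10, random_state=42)
bagger.fit(X, y)
print(bagger.predict(X[:3]))
```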
50. What is Boosting?
Boosting combines weak learners sequentially, where each new model corrects the errors of the previous ones. It’s powerful for accuracy but can overfit if not carefully tuned. Examples: AdaBoost, XGBoost, LightGBM.
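A short AdaBoost sketch in the same spirit (toy data, arbitrary settings):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

X, y = make_classification(n_samples=200, random_state=42)

# Each new weak learner upweights the examples the previous ones misclassified.
booster = AdaBoostClassifier(n_estimators=50, random_state=42)
booster.fit(X, y)
print(booster.score(X, y))
```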
51. What is a Perceptron?
A perceptron is the simplest type of neural network model. It takes inputs, multiplies them by weights, adds a bias, and passes the result through an activation function. It’s the building block of deep neural networks.
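A perceptron is small enough to write out by hand; the weights below are hand-picked so it behaves like a logical AND gate:

```python
import numpy as np

def perceptron(inputs, weights, bias):
    # Weighted sum plus bias, passed through a step activation.
    z = np.dot(inputs, weights) + bias
    return 1 if z > 0 else 0

print(perceptron(np.array([1, 1]), np.array([0.5, 0.5]), -0.7))  # 1: both inputs on
print(perceptron(np.array([1, 0]), np.array([0.5, 0.5]), -0.7))  # 0: one input off
```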
52. What Is One-Hot Encoding?
One-Hot Encoding converts categorical variables into binary format. Each category becomes a new column with 0 or 1 values. It’s essential for machine learning algorithms that can’t handle text directly.
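With pandas, one call does it (the column and categories are made up):

```python
import pandas as pd

df = pd.DataFrame({"color": ["red", "green", "blue", "red"]})

# Each category becomes its own indicator column: color_blue, color_green, color_red.
print(pd.get_dummies(df, columns=["color"]))
```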
53. What Is PCA (Principal Component Analysis)?
PCA reduces high-dimensional data to fewer dimensions while preserving most of the variance. It transforms the data into a new coordinate system, making it easier to visualize and process.
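A short sketch using scikit-learn on the classic Iris data (the choice of 2 components is arbitrary):

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data                   # 150 samples, 4 features
pca = PCA(n_components=2)              # project down to 2 dimensions
X_2d = pca.fit_transform(X)
print(X_2d.shape)                      # (150, 2)
print(pca.explained_variance_ratio_)   # share of variance each component keeps
```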
54. What Are Word Embeddings?
Word embeddings turn words into numerical vectors that reflect semantic meaning. Models like Word2Vec or GloVe allow similar words to have similar vector representations, boosting NLP model performance.
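A tiny sketch assuming the gensim library is installed; a two-sentence corpus like this is far too small to learn meaningful vectors, so treat it purely as an API illustration:

```python
from gensim.models import Word2Vec

sentences = [["king", "rules", "the", "kingdom"],
             ["queen", "rules", "the", "kingdom"]]
model = Word2Vec(sentences, vector_size=50, min_count=1, seed=42)

print(model.wv["king"].shape)                # (50,): each word is a 50-d vector
print(model.wv.similarity("king", "queen"))  # cosine similarity between the two
```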
55. What is Anomaly Detection?
Anomaly detection finds data points that deviate significantly from the norm. It’s useful in fraud detection, network security, or system health monitoring. Algorithms include Isolation Forest and One-Class SVM.
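An Isolation Forest sketch on synthetic data with one planted outlier:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
X = np.concatenate([rng.normal(0, 1, (100, 2)),  # normal points around the origin
                    [[8.0, 8.0]]])               # one obvious anomaly

clf = IsolationForest(random_state=42).fit(X)
print(clf.predict([[0.0, 0.0], [8.0, 8.0]]))     # [ 1 -1]: -1 flags the anomaly
```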
56. What Is the AUC-ROC Curve?
The ROC curve plots the true positive rate against the false positive rate at different classification thresholds. AUC (Area Under the Curve) summarizes this graph as a single number. A model with AUC near 1 is great; near 0.5 means it's no better than random guessing.
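A four-sample toy computation with scikit-learn (the scores are invented):

```python
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1]
y_scores = [0.1, 0.4, 0.35, 0.8]   # predicted probabilities for the positive class

print(roc_auc_score(y_true, y_scores))  # 0.75: better than random guessing (0.5)
```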
57. ⚖️ What Is SMOTE in Machine Learning?
SMOTE (Synthetic Minority Oversampling Technique) generates new samples for the minority class by interpolating between existing examples. It helps in dealing with imbalanced datasets, especially in binary classification.
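A sketch assuming the imbalanced-learn package (imblearn) is installed; the 90/10 imbalance is made up:

```python
from collections import Counter
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=200, weights=[0.9, 0.1], random_state=42)
print(Counter(y))                           # heavily imbalanced classes

X_res, y_res = SMOTE(random_state=42).fit_resample(X, y)
print(Counter(y_res))                       # both classes now the same size
```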
58. What Is K-Fold Cross-Validation?
K-Fold Cross-Validation splits data into k parts. The model trains on k−1 parts and tests on the remaining part. This process repeats k times, and the final score is averaged. It gives a more reliable estimate of model performance.
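With scikit-learn, cross_val_score runs the whole loop in one call (the model and k=5 are example choices):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=200, random_state=42)

# Train on 4 folds, score on the 5th, repeat 5 times, then average.
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print(scores.mean(), scores.std())
```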
59. What Is the Vanishing Gradient Problem?
In deep networks, the gradients used for updating weights can become vanishingly small as they are propagated backward through many layers, slowing or stopping learning. This is known as the vanishing gradient problem, often mitigated with ReLU activations or batch normalization.
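A back-of-the-envelope demonstration: backpropagation multiplies one activation derivative per layer, and the sigmoid's derivative never exceeds 0.25, so the product collapses quickly (the layer count is illustrative):

```python
import numpy as np

def sigmoid_derivative(z):
    s = 1 / (1 + np.exp(-z))
    return s * (1 - s)   # at most 0.25, and far smaller away from z = 0

grad = 1.0
for _ in range(20):                   # 20 sigmoid layers
    grad *= sigmoid_derivative(0.0)   # 0.25 even at the sigmoid's steepest point
print(grad)                           # 0.25**20, roughly 9e-13
```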
60. ⚙️ What Is a Pipeline in Machine Learning?
A pipeline is a workflow that chains multiple processing steps like data cleaning, feature scaling, and model training. It ensures consistency and is especially helpful when testing or deploying ML models.
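A minimal scikit-learn Pipeline chaining scaling and a classifier (the steps and names are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=200, random_state=42)

# The same preprocessing runs automatically at both fit and predict time.
pipe = Pipeline([("scale", StandardScaler()), ("model", LogisticRegression())])
pipe.fit(X, y)
print(pipe.predict(X[:3]))
```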