ARUNNACHALAM R S

50 Machine Learning MCQs with Answers

Q1. Which of the following is an example of supervised learning?

a) Clustering customers based on purchases
b) Predicting house prices using historical data
c) Finding anomalies in a dataset
d) Dimensionality reduction

Answer: b) Predicting house prices using historical data
👉 Supervised learning uses labeled data to train models.

Q2. Which algorithm is used for classification problems?

a) K-means
b) Linear Regression
c) Logistic Regression
d) PCA

Answer: c) Logistic Regression
👉 Logistic regression models the probability of a categorical outcome (for example, yes/no).

Q3. What does overfitting in ML mean?

a) Model performs well on test data but poorly on training data
b) Model performs well on training data but poorly on test data
c) Model performs equally on both training and test data
d) Model is too simple to capture complexity

Answer: b) Model performs well on training data but poorly on test data
👉 Overfitting means the model memorizes the training data instead of generalizing to unseen data.

Q4. Which evaluation metric is best for imbalanced classification problems?

a) Accuracy
b) Precision & Recall
c) R² Score
d) Mean Squared Error

Answer: b) Precision & Recall
👉 Accuracy is misleading on imbalanced datasets because always predicting the majority class already scores high; Precision & Recall focus on the minority class.
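
As a rough illustration of why accuracy misleads on imbalanced data, the sketch below uses made-up confusion counts (990 negatives, 10 positives) for a hypothetical model that always predicts the negative class. The counts and the `safe_div` helper are assumptions for illustration only.

```python
# Hypothetical confusion counts: 990 negatives, 10 positives,
# and a model that predicts "negative" for every sample.
tp, fp, tn, fn = 0, 0, 990, 10


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


accuracy = safe_div(tp + tn, tp + fp + tn + fn)
precision = safe_div(tp, tp + fp)  # no positive predictions were made
recall = safe_div(tp, tp + fn)     # none of the 10 positives were found

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
# accuracy=0.99 precision=0.00 recall=0.00
```

The model looks excellent by accuracy yet never finds a single positive case, which recall exposes immediately.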

Q5. Which of the following is a dimensionality reduction technique?

a) K-means
b) PCA
c) Decision Trees
d) Naive Bayes

Answer: b) PCA
👉 Principal Component Analysis reduces the number of features.

Q6. Which ML technique is used in spam email detection?

a) Clustering
b) Classification
c) Regression
d) Reinforcement Learning

Answer: b) Classification
👉 Spam detection is a binary classification problem.

Q7. In the k-NN algorithm, what does ‘k’ represent?

a) Number of features
b) Number of clusters
c) Number of nearest neighbors considered
d) Number of iterations

Answer: c) Number of nearest neighbors considered
👉 k-NN predicts based on the majority class of k nearest neighbors.
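
The majority-vote idea can be sketched in a few lines of plain Python. The function name `knn_predict` and the toy data are hypothetical, and Euclidean distance is assumed as the metric.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort training points by Euclidean distance to the query.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    # Take the labels of the k closest points and return the most common one.
    labels = [label for _, label in by_distance[:k]]
    return Counter(labels).most_common(1)[0][0]


# Hypothetical toy data: (features, label)
train = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
    ((5.0, 5.0), "B"), ((5.2, 4.9), "B"),
]

print(knn_predict(train, (1.1, 1.0), k=3))  # A
```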

Q8. Which ML approach learns by interacting with an environment and receiving rewards?

a) Supervised Learning
b) Unsupervised Learning
c) Reinforcement Learning
d) Semi-supervised Learning

Answer: c) Reinforcement Learning
👉 Reinforcement learning is based on rewards and penalties.

Q9. Gradient Descent is used for?

a) Feature selection
b) Optimization
c) Regularization
d) Clustering

Answer: b) Optimization
👉 Gradient descent minimizes a loss function by repeatedly updating the weights in the direction of the negative gradient.
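
A minimal sketch of the update rule, assuming a one-dimensional loss `(w - 3)^2` chosen purely for illustration (its minimum is at `w = 3`). The function name, learning rate, and step count are assumptions.

```python
def gradient_descent(grad, w0, lr=0.1, steps=100):
    """Repeatedly step against the gradient, starting from w0."""
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w


# Loss: (w - 3)^2, so the gradient is 2 * (w - 3).
w_min = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
print(round(w_min, 4))  # 3.0
```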

Q10. Which activation function is commonly used in hidden layers of neural networks?

a) Sigmoid
b) ReLU
c) Softmax
d) Linear

Answer: b) ReLU
👉 ReLU is widely used because it mitigates the vanishing gradient problem.

Q11. What type of ML algorithm is K-means clustering?

a) Supervised
b) Unsupervised
c) Reinforcement
d) Semi-supervised

Answer: b) Unsupervised
👉 K-means is an unsupervised clustering algorithm.

Q12. Which of the following is an ensemble learning technique?

a) Decision Tree
b) Random Forest
c) Logistic Regression
d) PCA

Answer: b) Random Forest
👉 Random Forest combines multiple decision trees (bagging).

Q13. Which regularization technique adds L1 penalty to regression?

a) Ridge Regression
b) Lasso Regression
c) Elastic Net
d) Logistic Regression

Answer: b) Lasso Regression
👉 Lasso uses an L1 penalty, which can shrink some coefficients exactly to zero.
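
One way to see how an L1 penalty produces exact zeros is the soft-thresholding operator, which appears in coordinate-descent solvers for Lasso. The sketch below is a standalone illustration, not a full Lasso implementation; the name `soft_threshold` and the sample values are assumptions.

```python
def soft_threshold(w, lam):
    """Shrink w toward zero by lam; values within [-lam, lam] become exactly 0."""
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0


coefficients = [3.0, 0.5, -2.0]
print([soft_threshold(w, 1.0) for w in coefficients])  # [2.0, 0.0, -1.0]
```

The small coefficient is set to exactly zero, while the larger ones are only shrunk.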

Q14. Which ML concept is used to avoid overfitting in neural networks?

a) Dropout
b) Gradient Descent
c) Backpropagation
d) Normalization

Answer: a) Dropout
👉 Dropout randomly disables neurons during training to prevent overfitting.
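
The sketch below shows one common variant, often called inverted dropout, on a plain list of activations. The function name, the drop rate, and the seed are assumptions for illustration; deep learning frameworks implement this internally.

```python
import random


def dropout(activations, rate, rng):
    """Inverted dropout: zero each activation with probability `rate`
    and scale the survivors by 1 / (1 - rate). Training-time only."""
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)  # fixed seed so the run is repeatable
out = dropout([1.0] * 10, rate=0.5, rng=rng)
print(out)  # each entry is either 0.0 (dropped) or 2.0 (kept and rescaled)
```

Scaling the kept activations keeps their expected value unchanged, so no rescaling is needed at inference time, when dropout is switched off.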

Q15. ROC curve is used for?

a) Clustering evaluation
b) Classification model performance
c) Regression error measurement
d) Feature selection

Answer: b) Classification model performance
👉 The ROC curve plots the true positive rate against the false positive rate across classification thresholds.

Q16. Which ML algorithm is best for text sentiment analysis?

a) K-means
b) Naive Bayes
c) PCA
d) KNN

Answer: b) Naive Bayes
👉 Naive Bayes is effective for text classification problems.

Q17. What is the purpose of a confusion matrix?

a) To visualize clustering results
b) To evaluate classification performance
c) To check feature correlations
d) To optimize hyperparameters

Answer: b) To evaluate classification performance
👉 It shows TP, FP, TN, FN counts.
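
A small sketch of how those four counts can be tallied for a binary problem. The function name `confusion_counts` and the label lists are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN for a binary classification task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn}


y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(confusion_counts(y_true, y_pred))
# {'TP': 3, 'FP': 1, 'TN': 3, 'FN': 1}
```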

Q18. Which ML concept is related to bias-variance tradeoff?

a) Overfitting and underfitting
b) Feature scaling
c) Normalization
d) Hyperparameter tuning

Answer: a) Overfitting and underfitting
👉 The bias-variance tradeoff balances underfitting (high bias, model too simple) against overfitting (high variance, model too complex).

Q19. What does TF-IDF stand for in text mining?

a) Term Frequency - Inverse Document Frequency
b) Total Frequency - Inverse Data Factor
c) Text Factor - Input Document Feature
d) Term Feature - Important Document Frequency

Answer: a) Term Frequency - Inverse Document Frequency
👉 TF-IDF gives higher weight to words that are frequent in a document but rare across the corpus.
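
The sketch below computes one simple TF-IDF variant (relative term frequency times the natural log of inverse document frequency). Libraries use slightly different smoothing and normalization, so treat the formula, the function name `tf_idf`, and the toy corpus as assumptions.

```python
import math


def tf_idf(term, doc, corpus):
    """TF-IDF of `term` in `doc`, where doc and corpus entries are token lists."""
    tf = doc.count(term) / len(doc)
    df = sum(1 for d in corpus if term in d)  # documents containing the term
    idf = math.log(len(corpus) / df) if df else 0.0
    return tf * idf


corpus = [
    "the cat sat on the mat".split(),
    "the dog chased the cat".split(),
    "the bird sang".split(),
]

for word in ("the", "cat", "mat"):
    print(word, round(tf_idf(word, corpus[0], corpus), 4))
```

In this corpus "the" appears in every document, so its score is zero, while "mat" appears in only one document and scores highest.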

Q20. Which of the following is NOT a supervised ML algorithm?

a) Decision Tree
b) SVM
c) K-means
d) Linear Regression

Answer: c) K-means
👉 K-means is unsupervised; the others are supervised.

Q21. In reinforcement learning, the agent learns by?

a) Labelled data
b) Clustering
c) Rewards & Penalties
d) Regression models

Answer: c) Rewards & Penalties
👉 RL is based on trial-and-error interactions with the environment.

Q22. Which gradient descent variant updates weights after each training example?

a) Batch Gradient Descent
b) Mini-Batch Gradient Descent
c) Stochastic Gradient Descent
d) Regularized Gradient Descent

Answer: c) Stochastic Gradient Descent
👉 SGD updates weights after every training sample.
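
A minimal sketch of per-sample updates, fitting a line to noise-free toy data generated from `y = 2x + 1`. The function name `sgd_linear`, the learning rate, the epoch count, and the data are all assumptions for illustration.

```python
import random


def sgd_linear(data, lr=0.02, epochs=500, seed=0):
    """Fit y = w * x + b with squared error, updating after EVERY sample."""
    rng = random.Random(seed)
    samples = list(data)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(samples)  # visit the samples in a new random order
        for x, y in samples:
            err = (w * x + b) - y
            w -= lr * 2 * err * x  # gradient of err^2 with respect to w
            b -= lr * 2 * err      # gradient of err^2 with respect to b
    return w, b


data = [(x, 2 * x + 1) for x in range(5)]
w, b = sgd_linear(data)
print(round(w, 3), round(b, 3))  # 2.0 1.0
```

Batch gradient descent would instead average the gradient over all five samples before making a single update.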

Q23. Which algorithm is used for dimensionality reduction in images?

a) PCA
b) KNN
c) Naive Bayes
d) Decision Trees

Answer: a) PCA
👉 PCA reduces image feature dimensions effectively.

Q24. What is the main purpose of cross-validation?

a) Feature selection
b) Hyperparameter tuning
c) Model evaluation
d) Data cleaning

Answer: c) Model evaluation
👉 Cross-validation estimates how well the model generalizes to unseen data.
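
The splitting step of k-fold cross-validation can be sketched as follows. The generator name `k_fold_indices` is hypothetical, and the sketch assumes contiguous, unshuffled folds; in practice the data is usually shuffled first.

```python
def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n) if i < start or i >= stop]
        yield train, test
        start = stop


for train, test in k_fold_indices(5, 2):
    print("train:", train, "test:", test)
# train: [3, 4] test: [0, 1, 2]
# train: [0, 1, 2] test: [3, 4]
```

Each sample lands in the test set exactly once, and the model's scores on the k test folds are averaged.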

Q25. Which of the following is a kernel-based method?

a) Logistic Regression
b) Decision Tree
c) SVM
d) Random Forest

Answer: c) SVM
👉 Support Vector Machines use kernels for complex boundaries.

Q26. Which technique is used to convert categorical data into numeric?

a) Normalization
b) Standardization
c) One-hot encoding
d) PCA

Answer: c) One-hot encoding
👉 One-hot encoding represents categories as binary vectors.
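
A short sketch of the idea in plain Python. The function name `one_hot` and the color values are hypothetical, and the column order is assumed to be alphabetical.

```python
def one_hot(values):
    """Return (categories, rows), with one binary column per category."""
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    rows = [
        [1 if index[v] == i else 0 for i in range(len(categories))]
        for v in values
    ]
    return categories, rows


colors = ["red", "green", "blue", "green"]
categories, encoded = one_hot(colors)
print(categories)  # ['blue', 'green', 'red']
print(encoded)     # [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
```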

Q27. Which of the following is a clustering algorithm?

a) Decision Tree
b) Random Forest
c) K-means
d) Logistic Regression

Answer: c) K-means
👉 K-means groups data into clusters.

Q28. Which metric is commonly used in regression problems?

a) Precision
b) Recall
c) Mean Squared Error (MSE)
d) Accuracy

Answer: c) Mean Squared Error (MSE)
👉 MSE is a standard metric for regression models.
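
For reference, MSE is the average of the squared differences between true and predicted values. The sample numbers below are made up.

```python
def mse(y_true, y_pred):
    """Mean of squared differences between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Errors are 1, 0 and -2, so MSE = (1 + 0 + 4) / 3
print(round(mse([3.0, 5.0, 2.0], [2.0, 5.0, 4.0]), 4))  # 1.6667
```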

Q29. Bagging and Boosting are types of?

a) Dimensionality reduction
b) Ensemble learning
c) Feature selection
d) Neural network architectures

Answer: b) Ensemble learning
👉 They combine multiple models to improve performance.

Q30. Which of the following is NOT a feature scaling method?

a) Min-Max Normalization
b) Z-score Standardization
c) One-hot encoding
d) Robust Scaler

Answer: c) One-hot encoding
👉 One-hot encoding is categorical encoding, not scaling.

Q31. What is backpropagation used for in neural networks?

a) Feature scaling
b) Weight optimization
c) Model evaluation
d) Data cleaning

Answer: b) Weight optimization
👉 Backpropagation computes the gradients of the loss with respect to the weights, which an optimizer such as gradient descent uses to update them.

Q32. Which of the following activation functions outputs values between 0 and 1?

a) ReLU
b) Sigmoid
c) Tanh
d) Softmax

Answer: b) Sigmoid
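👉 Sigmoid maps any real input into the open interval (0, 1). Softmax outputs also lie between 0 and 1, but they are normalized across a vector so that they sum to 1.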
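
A small sketch of the sigmoid function. The two-branch form is an assumption made here to avoid overflow in `math.exp` for large negative inputs; mathematically both branches equal `1 / (1 + e^(-x))`.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


print(sigmoid(0.0))  # 0.5
print(sigmoid(-5.0) < sigmoid(0.0) < sigmoid(5.0))  # True
```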

Q33. Which algorithm is best for market basket analysis?

a) Apriori
b) K-means
c) SVM
d) Random Forest

Answer: a) Apriori
👉 Apriori is used for association rule mining.
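
Association rule mining rests on two measures, support and confidence. The sketch below computes only those measures on a made-up set of baskets; it is not the Apriori candidate-generation algorithm itself, and the function names and data are assumptions.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated P(consequent | antecedent) from the transactions."""
    both = support(antecedent | consequent, transactions)
    return both / support(antecedent, transactions)


# Hypothetical shopping baskets
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

print(support({"bread"}, baskets))                        # 0.75
print(round(confidence({"bread"}, {"milk"}, baskets), 4))  # 0.6667
```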

Q34. What does an ROC curve show?

a) Sensitivity vs Specificity
b) True Positive Rate vs False Positive Rate
c) Accuracy vs Precision
d) Loss vs Accuracy

Answer: b) True Positive Rate vs False Positive Rate
👉 The ROC curve plots TPR against FPR at different thresholds. Sensitivity is the TPR, but the x-axis is 1 - specificity, not specificity.
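
The points of an ROC curve can be sketched by sweeping a threshold over predicted scores. The function name `roc_points`, the labels, the scores, and the thresholds are all made up for illustration.

```python
def roc_points(y_true, scores, thresholds):
    """Return (FPR, TPR) pairs, one per threshold; labels are 0 or 1."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = []
    for thr in thresholds:
        # A sample is predicted positive when its score reaches the threshold.
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / neg, tp / pos))
    return points


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_points(y_true, scores, [0.9, 0.5, 0.3, 0.0]))
# [(0.0, 0.0), (0.0, 0.5), (0.5, 1.0), (1.0, 1.0)]
```

Lowering the threshold moves the point from (0, 0) toward (1, 1).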

Q35. Which ML technique is most suitable for time series forecasting?

a) Regression
b) ARIMA
c) PCA
d) Clustering

Answer: b) ARIMA
👉 ARIMA is widely used for time series forecasting.

Q36. What is the vanishing gradient problem in deep learning?

a) Gradient becomes too large
b) Gradient becomes too small to update weights
c) Gradient never changes
d) Gradient oscillates

Answer: b) Gradient becomes too small to update weights
👉 It slows or stalls learning in the early layers of deep networks.

Q37. Which technique helps prevent overfitting in decision trees?

a) Bagging
b) Pruning
c) Dropout
d) Gradient Descent

Answer: b) Pruning
👉 Pruning removes branches to reduce complexity.

Q38. Which of the following is a type of anomaly detection algorithm?

a) Isolation Forest
b) Logistic Regression
c) PCA
d) KNN

Answer: a) Isolation Forest
👉 Isolation Forest is used for anomaly detection.

Q39. What is the purpose of feature scaling?

a) To reduce overfitting
b) To normalize features to a common scale
c) To encode categorical data
d) To improve interpretability

Answer: b) To normalize features to a common scale
👉 Scaling ensures features contribute equally to training.
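
Two of the scaling methods named in this quiz, min-max normalization and z-score standardization, can be sketched with the standard library. The function names and the income values are hypothetical, and both functions assume the values are not all identical.

```python
import statistics


def min_max(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


incomes = [20_000, 40_000, 60_000, 100_000]
scaled = min_max(incomes)
standardized = z_score(incomes)
print(scaled)  # [0.0, 0.25, 0.5, 1.0]
print([round(z, 3) for z in standardized])
```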

Q40. Which of the following is an unsupervised ML problem?

a) Predicting loan default
b) Fraud detection
c) Customer segmentation
d) Predicting stock prices

Answer: c) Customer segmentation
👉 Segmentation is clustering (unsupervised learning).

Q41. Which algorithm is most commonly used for recommendation systems?

a) Naive Bayes
b) Collaborative Filtering
c) PCA
d) Random Forest

Answer: b) Collaborative Filtering
👉 Collaborative filtering is a core technique behind recommendations at companies such as Netflix and Amazon.

Q42. Which optimizer is commonly used in deep learning?

a) Adam
b) K-means
c) Naive Bayes
d) SVM

Answer: a) Adam
👉 Adam is widely used due to efficiency and adaptability.

Q43. What is the purpose of word embeddings in NLP?

a) To reduce stopwords
b) To convert words into numeric vectors
c) To improve grammar checking
d) To tokenize text

Answer: b) To convert words into numeric vectors
👉 Word embeddings (Word2Vec, GloVe) map words into dense vectors.

Q44. Which ML algorithm is sensitive to feature scaling?

a) Decision Trees
b) Random Forest
c) KNN
d) Naive Bayes

Answer: c) KNN
👉 KNN depends on distance metrics; scaling is essential.

Q45. What is the curse of dimensionality?

a) Too few features
b) Too many features causing sparsity
c) Too many models
d) Too much noise

Answer: b) Too many features causing sparsity
👉 In high dimensions, data becomes sparse and distances become less informative, which can hurt model performance.

Q46. Which of the following is NOT a type of machine learning?

a) Supervised
b) Unsupervised
c) Reinforcement
d) Approximation

Answer: d) Approximation
👉 Approximation is not an ML category.

Q47. Which ML concept helps select the most relevant features?

a) PCA
b) Feature Engineering
c) Feature Selection
d) Regularization

Answer: c) Feature Selection
👉 Feature selection removes irrelevant or redundant features.

Q48. What is the main difference between Bagging and Boosting?

a) Bagging builds models sequentially, Boosting in parallel
b) Bagging builds models in parallel, Boosting sequentially
c) Both are parallel
d) Both are sequential

Answer: b) Bagging builds models in parallel, Boosting sequentially
👉 Bagging trains models independently (in parallel); Boosting trains them sequentially, with each model focusing on the errors of the previous ones.

Q49. Which of the following is an evaluation metric for clustering?

a) Adjusted Rand Index
b) Precision
c) Recall
d) Accuracy

Answer: a) Adjusted Rand Index
👉 ARI measures how well a clustering agrees with ground-truth labels, corrected for chance.

Q50. Which of the following frameworks is widely used in ML?

a) TensorFlow
b) Django
c) Angular
d) Flask

Answer: a) TensorFlow
👉 TensorFlow is a popular ML framework.
