ARUNNACHALAM R S

Posted on Sep 1

50 Machine Learning MCQs with Answers

#machinelearning #ai

Q1. Which of the following is an example of supervised learning?

a) Clustering customers based on purchases
b) Predicting house prices using historical data
c) Finding anomalies in a dataset
d) Dimensionality reduction

Answer: b) Predicting house prices using historical data
👉 Supervised learning uses labeled data to train models.

Q2. Which algorithm is used for classification problems?

a) K-means
b) Linear Regression
c) Logistic Regression
d) PCA

Answer: c) Logistic Regression
👉 Logistic regression predicts categorical outcomes (yes/no).

Q3. What does overfitting in ML mean?

a) Model performs well on test data but poorly on training data
b) Model performs well on training data but poorly on test data
c) Model performs equally on both training and test data
d) Model is too simple to capture complexity

Answer: b) Model performs well on training data but poorly on test data
👉 Overfitting means the model memorizes the training data, including its noise, instead of learning patterns that generalize.

Q4. Which evaluation metric is best for imbalanced classification problems?

a) Accuracy
b) Precision & Recall
c) R² Score
d) Mean Squared Error

Answer: b) Precision & Recall
👉 Accuracy is misleading in imbalanced datasets; Precision & Recall are better.
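
To see why accuracy misleads here, consider a minimal sketch with made-up labels: 95 negatives, 5 positives, and a hypothetical model that always predicts the negative class. The function name and the data are illustrative assumptions, not part of the original quiz.

```python
def accuracy_precision_recall(y_true, y_pred):
    """Compute accuracy, precision and recall for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when nothing is predicted/labelled positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Imbalanced toy data: 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5
# A "lazy" model that always predicts the majority class.
y_pred = [0] * 100

accuracy, precision, recall = accuracy_precision_recall(y_true, y_pred)
print(accuracy, precision, recall)  # 0.95 0.0 0.0
```

The model looks 95% accurate yet never finds a single positive case, which is exactly what precision and recall expose.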

Q5. Which of the following is a dimensionality reduction technique?

a) K-means
b) PCA
c) Decision Trees
d) Naive Bayes

Answer: b) PCA
👉 Principal Component Analysis reduces the number of features.

Q6. Which ML technique is used in spam email detection?

a) Clustering
b) Classification
c) Regression
d) Reinforcement Learning

Answer: b) Classification
👉 Spam detection is a binary classification problem.

Q7. In the k-NN algorithm, what does ‘k’ represent?

a) Number of features
b) Number of clusters
c) Number of nearest neighbors considered
d) Number of iterations

Answer: c) Number of nearest neighbors considered
👉 k-NN predicts based on the majority class of k nearest neighbors.
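
The majority vote can be sketched in a few lines of plain Python. This is a minimal illustration that assumes Euclidean distance and a tiny hand-made dataset; `knn_predict` is a hypothetical helper name.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to query."""
    # Pair each distance with its label, then sort so the nearest come first.
    by_distance = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9)]
labels = ["A", "A", "A", "B", "B"]

prediction = knn_predict(points, labels, query=(1.1, 1.0), k=3)
print(prediction)  # A
```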

Q8. Which ML approach learns by interacting with an environment and receiving rewards?

a) Supervised Learning
b) Unsupervised Learning
c) Reinforcement Learning
d) Semi-supervised Learning

Answer: c) Reinforcement Learning
👉 Reinforcement learning is based on rewards and penalties.

Q9. What is gradient descent used for?

a) Feature selection
b) Optimization
c) Regularization
d) Clustering

Answer: b) Optimization
👉 Gradient descent minimizes loss functions by updating weights.
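
As a minimal sketch of the update rule, the loop below minimizes the one-dimensional loss (w - 3)^2, whose gradient is 2(w - 3). The loss, learning rate, and step count are assumptions chosen only for illustration.

```python
def gradient_descent(grad, w0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from w0."""
    w = w0
    for _ in range(steps):
        w -= learning_rate * grad(w)
    return w


# Loss: (w - 3)^2, so the gradient is 2 * (w - 3) and the minimum sits at w = 3.
w_min = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
print(round(w_min, 4))  # 3.0
```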

Q10. Which activation function is commonly used in hidden layers of neural networks?

a) Sigmoid
b) ReLU
c) Softmax
d) Linear

Answer: b) ReLU
👉 ReLU is widely used as it reduces vanishing gradient issues.

Q11. What type of ML algorithm is K-means clustering?

a) Supervised
b) Unsupervised
c) Reinforcement
d) Semi-supervised

Answer: b) Unsupervised
👉 K-means is an unsupervised clustering algorithm.

Q12. Which of the following is an ensemble learning technique?

a) Decision Tree
b) Random Forest
c) Logistic Regression
d) PCA

Answer: b) Random Forest
👉 Random Forest combines multiple decision trees (bagging).

Q13. Which regularization technique adds L1 penalty to regression?

a) Ridge Regression
b) Lasso Regression
c) Elastic Net
d) Logistic Regression

Answer: b) Lasso Regression
👉 Lasso adds an L1 penalty, which can shrink some coefficients exactly to zero.
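
One way to see the "exactly zero" effect: under the simplifying assumption of orthonormal features (and up to how the penalty is scaled), the L1 penalty corresponds to soft-thresholding each least-squares coefficient, while an L2 penalty only scales it down. The helper names and coefficient values below are illustrative assumptions.

```python
def soft_threshold(value, penalty):
    """Shrinkage step associated with an L1 penalty: small values become exactly 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def ridge_shrink(value, penalty):
    """Proportional shrinkage associated with an L2 penalty, shown for contrast."""
    return value / (1.0 + penalty)


coefficients = [2.5, -0.3, 0.1, -1.7]
lasso_like = [soft_threshold(c, 0.5) for c in coefficients]
ridge_like = [ridge_shrink(c, 0.5) for c in coefficients]

print(lasso_like)  # the two small coefficients are zeroed out
print(ridge_like)  # every coefficient shrinks, none reaches zero
```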

Q14. Which ML concept is used to avoid overfitting in neural networks?

a) Dropout
b) Gradient Descent
c) Backpropagation
d) Normalization

Answer: a) Dropout
👉 Dropout randomly disables neurons during training to prevent overfitting.
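
A rough sketch of the idea, assuming the common "inverted dropout" variant in which surviving activations are rescaled during training so nothing changes at inference time. The function signature is hypothetical, and `drop_prob` is assumed to be below 1.

```python
import random


def dropout(activations, drop_prob, rng, training=True):
    """Zero each activation with probability drop_prob; rescale the survivors."""
    if not training or drop_prob == 0.0:
        # At inference time the layer passes values through unchanged.
        return list(activations)
    keep_prob = 1.0 - drop_prob
    return [
        a / keep_prob if rng.random() < keep_prob else 0.0
        for a in activations
    ]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
hidden = [1.0] * 10

dropped = dropout(hidden, drop_prob=0.5, rng=rng)
unchanged = dropout(hidden, drop_prob=0.5, rng=rng, training=False)
print(dropped)
```

With `drop_prob=0.5`, each output is either 0.0 (dropped) or 2.0 (kept and rescaled).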

Q15. What is the ROC curve used for?

a) Clustering evaluation
b) Classification model performance
c) Regression error measurement
d) Feature selection

Answer: b) Classification model performance
👉 The ROC curve plots the true positive rate against the false positive rate across classification thresholds.

Q16. Which of these algorithms is a common choice for text sentiment analysis?

a) K-means
b) Naive Bayes
c) PCA
d) KNN

Answer: b) Naive Bayes
👉 Naive Bayes is effective for text classification problems.

Q17. What is the purpose of a confusion matrix?

a) To visualize clustering results
b) To evaluate classification performance
c) To check feature correlations
d) To optimize hyperparameters

Answer: b) To evaluate classification performance
👉 It shows TP, FP, TN, FN counts.

Q18. Which ML concept is related to bias-variance tradeoff?

a) Overfitting and underfitting
b) Feature scaling
c) Normalization
d) Hyperparameter tuning

Answer: a) Overfitting and underfitting
👉 High bias leads to underfitting and high variance leads to overfitting; the tradeoff is about choosing a model complexity that balances the two.

Q19. What does TF-IDF stand for in text mining?

a) Term Frequency - Inverse Document Frequency
b) Total Frequency - Inverse Data Factor
c) Text Factor - Input Document Feature
d) Term Feature - Important Document Frequency

Answer: a) Term Frequency - Inverse Document Frequency
👉 TF-IDF highlights important words in documents.
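
The following is a minimal sketch using the plain, unsmoothed formula tf × log(N / df) and whitespace tokenization. Libraries such as scikit-learn apply smoothing and normalization by default, so their numbers will differ. The documents are made up for illustration.

```python
import math


def tf_idf(documents):
    """Return one {term: score} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    doc_freq = {}
    for tokens in tokenized:
        for term in set(tokens):
            doc_freq[term] = doc_freq.get(term, 0) + 1

    scores = []
    for tokens in tokenized:
        doc_scores = {}
        for term in set(tokens):
            tf = tokens.count(term) / len(tokens)
            idf = math.log(n_docs / doc_freq[term])
            doc_scores[term] = tf * idf
        scores.append(doc_scores)
    return scores


docs = ["the cat sat", "the dog barked", "the cat purred"]
scores = tf_idf(docs)
print(scores[0])
```

Here "the" appears in every document and scores 0, while "sat" (unique to the first document) scores higher than "cat" (shared by two).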

Q20. Which of the following is NOT a supervised ML algorithm?

a) Decision Tree
b) SVM
c) K-means
d) Linear Regression

Answer: c) K-means
👉 K-means is unsupervised; the others are supervised.

Q21. In reinforcement learning, the agent learns by?

a) Labelled data
b) Clustering
c) Rewards & Penalties
d) Regression models

Answer: c) Rewards & Penalties
👉 RL is based on trial and error interactions with the environment.

Q22. Which gradient descent variant updates weights after each training example?

a) Batch Gradient Descent
b) Mini-Batch Gradient Descent
c) Stochastic Gradient Descent
d) Regularized Gradient Descent

Answer: c) Stochastic Gradient Descent
👉 SGD updates weights after every training sample.
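
A minimal sketch of the per-sample update, assuming a one-feature linear model with squared error and noise-free toy data generated from y = 2x + 1. The learning rate, epoch count, and seed are illustrative choices.

```python
import random


def sgd_fit(xs, ys, learning_rate=0.02, epochs=500, seed=0):
    """Fit y = w * x + b, updating w and b after every single training example."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the samples in a new random order each epoch
        for i in indices:
            error = (w * xs[i] + b) - ys[i]
            # Gradient of the squared error for this one sample only.
            w -= learning_rate * 2 * error * xs[i]
            b -= learning_rate * 2 * error
    return w, b


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1

w, b = sgd_fit(xs, ys)
print(round(w, 3), round(b, 3))  # 2.0 1.0
```

Batch gradient descent would instead average the gradient over all five samples before making one update.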

Q23. Which algorithm is used for dimensionality reduction in images?

a) PCA
b) KNN
c) Naive Bayes
d) Decision Trees

Answer: a) PCA
👉 PCA reduces image feature dimensions effectively.

Q24. What is the main purpose of cross-validation?

a) Feature selection
b) Hyperparameter tuning
c) Model evaluation
d) Data cleaning

Answer: c) Model evaluation
👉 Cross-validation estimates how well a model generalizes by averaging its score over several train/validation splits.
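
The splitting step can be sketched as follows. This minimal version assumes contiguous folds without shuffling; `k_fold_indices` is a hypothetical helper, not a library function.

```python
def k_fold_indices(n_samples, k):
    """Return k (train_indices, validation_indices) pairs covering all samples."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    indices = list(range(n_samples))
    folds, start = [], 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, validation))
        start += size
    return folds


folds = k_fold_indices(n_samples=10, k=3)
for train, validation in folds:
    print(validation)
# [0, 1, 2, 3]
# [4, 5, 6]
# [7, 8, 9]
```

Each sample lands in exactly one validation fold, so every data point is used for evaluation once.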

Q25. Which of the following is a kernel-based method?

a) Logistic Regression
b) Decision Tree
c) SVM
d) Random Forest

Answer: c) SVM
👉 Support Vector Machines use kernels for complex boundaries.

Q26. Which technique is used to convert categorical data into numeric?

a) Normalization
b) Standardization
c) One-hot encoding
d) PCA

Answer: c) One-hot encoding
👉 One-hot encoding represents categories as binary vectors.
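
As a small illustration, the sketch below builds the binary vectors by hand. It assumes the column order is the sorted list of distinct categories; the color values are made up.

```python
def one_hot_encode(values):
    """Return (categories, rows) where each row has a single 1 for its category."""
    categories = sorted(set(values))
    position = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[position[value]] = 1
        rows.append(row)
    return categories, rows


categories, encoded = one_hot_encode(["red", "green", "blue", "green"])
print(categories)  # ['blue', 'green', 'red']
print(encoded)     # [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
```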

Q27. Which of the following is a clustering algorithm?

a) Decision Tree
b) Random Forest
c) K-means
d) Logistic Regression

Answer: c) K-means
👉 K-means groups data into clusters.
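
The assign-then-update loop behind K-means can be sketched on one-dimensional data. This is a simplified illustration with hand-picked starting centroids and a fixed iteration count, both of which are assumptions; real implementations choose initial centroids more carefully and stop on convergence.

```python
def k_means_1d(values, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and re-averaging."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: each value joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


values = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centroids, clusters = k_means_1d(values, centroids=[0.0, 5.0])
print(centroids)  # [1.5, 10.0]
print(clusters)   # [[1.0, 1.5, 2.0], [9.0, 10.0, 11.0]]
```

No labels are involved at any point, which is what makes the method unsupervised.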

Q28. Which metric is commonly used in regression problems?

a) Precision
b) Recall
c) Mean Squared Error (MSE)
d) Accuracy

Answer: c) Mean Squared Error (MSE)
👉 MSE is a standard metric for regression models.
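
For reference, MSE is the average of the squared differences between true and predicted values. The sketch below uses made-up numbers.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Errors are 1, 0 and -2, so the squared errors are 1, 0 and 4.
mse = mean_squared_error([3.0, 5.0, 2.0], [2.0, 5.0, 4.0])
print(mse)  # 5/3, roughly 1.667
```

Squaring means the single error of 2 contributes four times as much as the error of 1, so MSE penalizes large misses heavily.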

Q29. Bagging and Boosting are types of what?

a) Dimensionality reduction
b) Ensemble learning
c) Feature selection
d) Neural network architectures

Answer: b) Ensemble learning
👉 They combine multiple models to improve performance.

Q30. Which of the following is NOT a feature scaling method?

a) Min-Max Normalization
b) Z-score Standardization
c) One-hot encoding
d) Robust Scaler

Answer: c) One-hot encoding
👉 One-hot encoding is categorical encoding, not scaling.

Q31. What is backpropagation used for in neural networks?

a) Feature scaling
b) Weight optimization
c) Model evaluation
d) Data cleaning

Answer: b) Weight optimization
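👉 Backpropagation computes the gradient of the loss with respect to each weight; an optimizer such as gradient descent then uses those gradients to update the weights.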

Q32. Which of the following activation functions outputs values between 0 and 1?

a) ReLU
b) Sigmoid
c) Tanh
d) Softmax

Answer: b) Sigmoid
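👉 Sigmoid maps any real input into the open interval (0, 1). Softmax outputs also fall in that range, but they are computed jointly over a vector and sum to 1.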
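
A minimal sketch of the sigmoid function, 1 / (1 + e^(-x)), written in a form that avoids overflow for large negative inputs:

```python
import math


def sigmoid(x):
    """Logistic sigmoid, split into two branches for numerical stability."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, exp(x) is small, so this form cannot overflow.
    z = math.exp(x)
    return z / (1.0 + z)


print(sigmoid(0.0))  # 0.5
print(sigmoid(-5.0) < sigmoid(5.0))  # True
```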

Q33. Which algorithm is best for market basket analysis?

a) Apriori
b) K-means
c) SVM
d) Random Forest

Answer: a) Apriori
👉 Apriori is used for association rule mining.

Q34. What does an ROC curve show?

a) Sensitivity vs Specificity
b) True Positive Rate vs False Positive Rate
c) Accuracy vs Precision
d) Loss vs Accuracy

Answer: b) True Positive Rate vs False Positive Rate
👉 ROC curve helps evaluate classification performance.
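
The points on an ROC curve come from sweeping the decision threshold over the model's scores. The sketch below is a simplified illustration with made-up labels and scores; it assumes both classes are present.

```python
def roc_points(y_true, scores):
    """Return (false_positive_rate, true_positive_rate) pairs, one per threshold."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        points.append((fp / negatives, tp / positives))
    return points


y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(y_true, scores)
for fpr, tpr in points:
    print(round(fpr, 2), round(tpr, 2))
```

Lowering the threshold moves the curve from (0, 0) toward (1, 1); a better classifier climbs toward a true positive rate of 1 while the false positive rate is still low.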

Q35. Which ML technique is most suitable for time series forecasting?

a) Regression
b) ARIMA
c) PCA
d) Clustering

Answer: b) ARIMA
👉 ARIMA is widely used for time series forecasting.

Q36. What is the vanishing gradient problem in deep learning?

a) Gradient becomes too large
b) Gradient becomes too small to update weights
c) Gradient never changes
d) Gradient oscillates

Answer: b) Gradient becomes too small to update weights
👉 It slows learning in deep networks.
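
A back-of-the-envelope sketch shows where the shrinkage comes from. It makes a strong simplifying assumption: weights are ignored and every layer sees the same pre-activation, so the backpropagated gradient is just a product of activation derivatives. The sigmoid derivative is at most 0.25, whereas the ReLU derivative is 1 for positive inputs.

```python
import math


def sigmoid_grad(x):
    """Derivative of the sigmoid; its maximum value is 0.25 at x = 0."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


def chained_gradient(local_grad, pre_activation, depth):
    """Multiply the same local derivative across `depth` layers."""
    product = 1.0
    for _ in range(depth):
        product *= local_grad(pre_activation)
    return product


sigmoid_chain = chained_gradient(sigmoid_grad, 0.0, depth=10)
relu_chain = chained_gradient(relu_grad, 1.0, depth=10)
print(sigmoid_chain)  # 0.25 ** 10, below one millionth
print(relu_chain)     # 1.0
```

Even in the sigmoid's best case, ten layers shrink the gradient by a factor of about a million, which helps explain why ReLU is a popular choice for hidden layers.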

Q37. Which technique helps prevent overfitting in decision trees?

a) Bagging
b) Pruning
c) Dropout
d) Gradient Descent

Answer: b) Pruning
👉 Pruning removes branches to reduce complexity.

Q38. Which of the following is a type of anomaly detection algorithm?

a) Isolation Forest
b) Logistic Regression
c) PCA
d) KNN

Answer: a) Isolation Forest
👉 Isolation Forest is designed specifically for anomaly detection: it isolates points with random splits, and anomalies tend to need fewer splits.

Q39. What is the purpose of feature scaling?

a) To reduce overfitting
b) To normalize features to a common scale
c) To encode categorical data
d) To improve interpretability

Answer: b) To normalize features to a common scale
👉 Scaling ensures features contribute equally to training.
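
Two common scaling methods, min-max normalization and z-score standardization, can be sketched with the standard library. The income figures are made up, and both helpers assume the values are not all identical.

```python
import statistics


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lowest, highest = min(values), max(values)
    return [(v - lowest) / (highest - lowest) for v in values]


def z_score_scale(values):
    """Center values on 0 and divide by the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


incomes = [20000.0, 40000.0, 60000.0, 100000.0]

scaled = min_max_scale(incomes)
standardized = z_score_scale(incomes)
print(scaled)  # [0.0, 0.25, 0.5, 1.0]
print(standardized)
```

After either transform, a feature measured in tens of thousands sits on a scale comparable to one measured in single digits.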

Q40. Which of the following is an unsupervised ML problem?

a) Predicting loan default
b) Fraud detection
c) Customer segmentation
d) Predicting stock prices

Answer: c) Customer segmentation
👉 Segmentation is clustering (unsupervised learning).

Q41. Which algorithm is most commonly used for recommendation systems?

a) Naive Bayes
b) Collaborative Filtering
c) PCA
d) Random Forest

Answer: b) Collaborative Filtering
👉 Collaborative filtering recommends items based on the behavior of similar users, an approach associated with services such as Netflix and Amazon.

Q42. Which optimizer is commonly used in deep learning?

a) Adam
b) K-means
c) Naive Bayes
d) SVM

Answer: a) Adam
👉 Adam is widely used due to efficiency and adaptability.

Q43. What is the purpose of word embeddings in NLP?

a) To reduce stopwords
b) To convert words into numeric vectors
c) To improve grammar checking
d) To tokenize text

Answer: b) To convert words into numeric vectors
👉 Word embeddings (Word2Vec, GloVe) map words into dense vectors.

Q44. Which ML algorithm is sensitive to feature scaling?

a) Decision Trees
b) Random Forest
c) KNN
d) Naive Bayes

Answer: c) KNN
👉 KNN depends on distance metrics, so unscaled features with large ranges dominate the distance.

Q45. What is the curse of dimensionality?

a) Too few features
b) Too many features causing sparsity
c) Too many models
d) Too much noise

Answer: b) Too many features causing sparsity
👉 As the number of dimensions grows, data becomes sparse and distances less informative, which tends to hurt model performance.

Q46. Which of the following is NOT a type of machine learning?

a) Supervised
b) Unsupervised
c) Reinforcement
d) Approximation

Answer: d) Approximation
👉 Approximation is not an ML category.

Q47. Which ML concept helps select the most relevant features?

a) PCA
b) Feature Engineering
c) Feature Selection
d) Regularization

Answer: c) Feature Selection
👉 Feature selection reduces irrelevant features.

Q48. What is the main difference between Bagging and Boosting?

a) Bagging builds models sequentially, Boosting in parallel
b) Bagging builds models in parallel, Boosting sequentially
c) Both are parallel
d) Both are sequential

Answer: b) Bagging builds models in parallel, Boosting sequentially
👉 Bagging = parallel, Boosting = sequential with weights.

Q49. Which of the following is an evaluation metric for clustering?

a) Adjusted Rand Index
b) Precision
c) Recall
d) Accuracy

Answer: a) Adjusted Rand Index
👉 ARI measures the agreement between cluster assignments and ground-truth labels, corrected for chance.

Q50. Which of the following frameworks is widely used in ML?

a) TensorFlow
b) Django
c) Angular
d) Flask

Answer: a) TensorFlow
👉 TensorFlow is a popular ML framework.