50 Machine Learning MCQs with Answers
Q1. Which of the following is an example of supervised learning?
a) Clustering customers based on purchases
b) Predicting house prices using historical data
c) Finding anomalies in a dataset
d) Dimensionality reduction
Answer: b) Predicting house prices using historical data
👉 Supervised learning uses labeled data to train models.
Q2. Which algorithm is used for classification problems?
a) K-means
b) Linear Regression
c) Logistic Regression
d) PCA
Answer: c) Logistic Regression
👉 Despite its name, logistic regression is a classifier: it predicts the probability of a categorical outcome (for example, yes/no).
Q3. What does overfitting in ML mean?
a) Model performs well on test data but poorly on training data
b) Model performs well on training data but poorly on test data
c) Model performs equally on both training and test data
d) Model is too simple to capture complexity
Answer: b) Model performs well on training data but poorly on test data
👉 Overfitting means the model memorizes the training data instead of generalizing to unseen data.
Q4. Which evaluation metric is best for imbalanced classification problems?
a) Accuracy
b) Precision & Recall
c) R² Score
d) Mean Squared Error
Answer: b) Precision & Recall
👉 Accuracy can be misleading on imbalanced datasets, because always predicting the majority class already scores high; precision and recall focus on the class of interest.
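To make the point concrete, here is a minimal sketch using a made-up dataset (5 positives, 95 negatives) and a hypothetical classifier that always predicts the negative class. The helper names and the convention of returning 0.0 when a ratio is undefined are assumptions for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall(y_true, y_pred):
    """Precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Assumed convention: report 0.0 when the denominator is zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Illustrative data: 5 positives, 95 negatives, classifier predicts all negative.
y_true = [1] * 5 + [0] * 95
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)                        # 0.95
precision, recall = precision_recall(y_true, y_pred)  # (0.0, 0.0)
```

The classifier reaches 95% accuracy while finding none of the positive cases, which the recall of 0.0 exposes.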
Q5. Which of the following is a dimensionality reduction technique?
a) K-means
b) PCA
c) Decision Trees
d) Naive Bayes
Answer: b) PCA
👉 Principal Component Analysis reduces the number of features.
Q6. Which ML technique is used in spam email detection?
a) Clustering
b) Classification
c) Regression
d) Reinforcement Learning
Answer: b) Classification
👉 Spam detection is a binary classification problem.
Q7. In the k-NN algorithm, "k" represents what?
a) Number of features
b) Number of clusters
c) Number of nearest neighbors considered
d) Number of iterations
Answer: c) Number of nearest neighbors considered
👉 k-NN predicts based on the majority class of the k nearest neighbors.
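The majority-vote idea can be sketched in a few lines. This is an illustrative brute-force version, assuming Euclidean distance and a tiny made-up dataset; `knn_predict` is a hypothetical helper, not a library function.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to query."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Illustrative data: three points near the origin ("a"), two far away ("b").
points = [(0, 0), (0, 1), (1, 0), (5, 5), (6, 5)]
labels = ["a", "a", "a", "b", "b"]

near_origin = knn_predict(points, labels, (0.5, 0.5), k=3)  # "a"
near_far = knn_predict(points, labels, (5.5, 5.0), k=3)     # "b"
```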
Q8. Which ML approach learns by interacting with an environment and receiving rewards?
a) Supervised Learning
b) Unsupervised Learning
c) Reinforcement Learning
d) Semi-supervised Learning
Answer: c) Reinforcement Learning
👉 Reinforcement learning is based on rewards and penalties.
Q9. What is gradient descent used for?
a) Feature selection
b) Optimization
c) Regularization
d) Clustering
Answer: b) Optimization
👉 Gradient descent minimizes a loss function by repeatedly updating the weights in the direction of the negative gradient.
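As a small sketch of the update rule, the snippet below minimizes the toy function f(w) = (w - 3)^2, whose gradient is 2(w - 3). The function, learning rate, and step count are assumptions chosen for illustration.

```python
def gradient_descent(grad, w0, lr=0.1, steps=100):
    """Repeatedly step against the gradient, starting from w0."""
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)  # move in the direction that lowers the loss
    return w


# Toy loss f(w) = (w - 3)**2 has its minimum at w = 3.
w_min = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
```

With these settings `w_min` ends up very close to 3; a learning rate that is too large (here, above 1.0) would make the iterates diverge instead.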
Q10. Which activation function is commonly used in hidden layers of neural networks?
a) Sigmoid
b) ReLU
c) Softmax
d) Linear
Answer: b) ReLU
👉 ReLU is widely used as it reduces vanishing gradient issues.
Q11. What type of ML algorithm is K-means clustering?
a) Supervised
b) Unsupervised
c) Reinforcement
d) Semi-supervised
Answer: b) Unsupervised
👉 K-means is an unsupervised clustering algorithm.
Q12. Which of the following is an ensemble learning technique?
a) Decision Tree
b) Random Forest
c) Logistic Regression
d) PCA
Answer: b) Random Forest
👉 Random Forest combines multiple decision trees (bagging).
Q13. Which regularization technique adds L1 penalty to regression?
a) Ridge Regression
b) Lasso Regression
c) Elastic Net
d) Logistic Regression
Answer: b) Lasso Regression
👉 Lasso uses an L1 penalty, which can shrink some coefficients exactly to zero.
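Why L1 produces exact zeros can be seen in the simplest one-coefficient case. The sketch below assumes the objectives 0.5 * (w - z)^2 + lam * |w| for L1 and 0.5 * (w - z)^2 + 0.5 * lam * w^2 for L2, where z is the unpenalized estimate; the function names are hypothetical.

```python
def soft_threshold(z, lam):
    """Minimizer of 0.5 * (w - z)**2 + lam * abs(w) (the L1 case)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0  # small coefficients are set exactly to zero


def ridge_shrink(z, lam):
    """Minimizer of 0.5 * (w - z)**2 + 0.5 * lam * w**2 (the L2 case)."""
    return z / (1.0 + lam)  # shrinks toward zero but never reaches it for z != 0


small_l1 = soft_threshold(0.3, 0.5)  # 0.0
large_l1 = soft_threshold(2.0, 0.5)  # 1.5
small_l2 = ridge_shrink(0.3, 0.5)    # shrunk, but still nonzero
```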
Q14. Which ML concept is used to avoid overfitting in neural networks?
a) Dropout
b) Gradient Descent
c) Backpropagation
d) Normalization
Answer: a) Dropout
👉 Dropout randomly disables neurons during training to prevent overfitting.
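A rough sketch of the idea, using the common "inverted dropout" variant in which surviving activations are rescaled by 1 / (1 - p) so their expected value is unchanged. The function name and parameters are assumptions for illustration, not a framework API.

```python
import random


def dropout(activations, p=0.5, training=True, rng=random):
    """Zero each activation with probability p during training (inverted dropout)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        return list(activations)  # dropout is disabled at inference time
    keep = 1.0 - p
    # Kept activations are scaled up so the expected output matches the input.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


# Seeded generator so the example is repeatable.
dropped = dropout([1.0] * 10, p=0.5, rng=random.Random(0))
unchanged = dropout([1.0, 2.0], p=0.5, training=False)
```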
Q15. What is the ROC curve used for?
a) Clustering evaluation
b) Classification model performance
c) Regression error measurement
d) Feature selection
Answer: b) Classification model performance
👉 The ROC curve plots the true positive rate against the false positive rate across classification thresholds.
Q16. Which ML algorithm is best for text sentiment analysis?
a) K-means
b) Naive Bayes
c) PCA
d) KNN
Answer: b) Naive Bayes
👉 Naive Bayes is effective for text classification problems.
Q17. What is the purpose of a confusion matrix?
a) To visualize clustering results
b) To evaluate classification performance
c) To check feature correlations
d) To optimize hyperparameters
Answer: b) To evaluate classification performance
👉 It shows the counts of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN).
Q18. Which ML concept is related to bias-variance tradeoff?
a) Overfitting and underfitting
b) Feature scaling
c) Normalization
d) Hyperparameter tuning
Answer: a) Overfitting and underfitting
👉 High bias leads to underfitting and high variance leads to overfitting; the tradeoff is about choosing a model complexity that balances the two.
Q19. What does TF-IDF stand for in text mining?
a) Term Frequency - Inverse Document Frequency
b) Total Frequency - Inverse Data Factor
c) Text Factor - Input Document Feature
d) Term Feature - Important Document Frequency
Answer: a) Term Frequency - Inverse Document Frequency
👉 TF-IDF highlights words that are frequent in a document but rare across the rest of the collection.
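Here is a minimal sketch of one common TF-IDF variant, assuming tf = term count / document length and idf = log(N / document frequency). Libraries often use smoothed variants, so exact numbers will differ; the tiny corpus is made up.

```python
import math


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    doc_freq = {}
    for doc in docs:
        for term in set(doc):
            doc_freq[term] = doc_freq.get(term, 0) + 1
    weights = []
    for doc in docs:
        doc_weights = {}
        for term in set(doc):
            tf = doc.count(term) / len(doc)
            idf = math.log(n_docs / doc_freq[term])
            doc_weights[term] = tf * idf
        weights.append(doc_weights)
    return weights


docs = [["the", "cat", "sat"], ["the", "dog", "barked"]]
weights = tf_idf(docs)
```

In this corpus "the" appears in every document, so its weight is 0, while "cat" and "dog" get positive weights.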
Q20. Which of the following is NOT a supervised ML algorithm?
a) Decision Tree
b) SVM
c) K-means
d) Linear Regression
Answer: c) K-means
👉 K-means is unsupervised; the others are supervised.
Q21. In reinforcement learning, what does the agent learn from?
a) Labelled data
b) Clustering
c) Rewards & Penalties
d) Regression models
Answer: c) Rewards & Penalties
👉 RL is based on trial-and-error interactions with the environment.
Q22. Which gradient descent variant updates weights after each training example?
a) Batch Gradient Descent
b) Mini-Batch Gradient Descent
c) Stochastic Gradient Descent
d) Regularized Gradient Descent
Answer: c) Stochastic Gradient Descent
👉 SGD updates weights after every training sample.
Q23. Which algorithm is used for dimensionality reduction in images?
a) PCA
b) KNN
c) Naive Bayes
d) Decision Trees
Answer: a) PCA
👉 PCA reduces image feature dimensions effectively.
Q24. What is the main purpose of cross-validation?
a) Feature selection
b) Hyperparameter tuning
c) Model evaluation
d) Data cleaning
Answer: c) Model evaluation
👉 Cross-validation estimates how well the model generalizes to unseen data.
Q25. Which of the following is a kernel-based method?
a) Logistic Regression
b) Decision Tree
c) SVM
d) Random Forest
Answer: c) SVM
👉 Support Vector Machines use kernels for complex boundaries.
Q26. Which technique is used to convert categorical data into numeric?
a) Normalization
b) Standardization
c) One-hot encoding
d) PCA
Answer: c) One-hot encoding
👉 One-hot encoding represents categories as binary vectors.
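A small sketch of the encoding, assuming categories are ordered alphabetically to fix the column order; `one_hot_encode` is a hypothetical helper written for this example.

```python
def one_hot_encode(values):
    """Map each categorical value to a binary vector with a single 1."""
    categories = sorted(set(values))  # assumed column order: alphabetical
    index = {category: i for i, category in enumerate(categories)}
    encoded = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        encoded.append(row)
    return categories, encoded


categories, encoded = one_hot_encode(["red", "green", "red", "blue"])
# categories -> ["blue", "green", "red"]
# encoded    -> [[0, 0, 1], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
```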
Q27. Which of the following is a clustering algorithm?
a) Decision Tree
b) Random Forest
c) K-means
d) Logistic Regression
Answer: c) K-means
👉 K-means groups data into clusters.
Q28. Which metric is commonly used in regression problems?
a) Precision
b) Recall
c) Mean Squared Error (MSE)
d) Accuracy
Answer: c) Mean Squared Error (MSE)
👉 MSE is a standard metric for regression models.
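For reference, MSE is the average of the squared differences between true and predicted values. A minimal sketch with made-up numbers:

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between true and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Errors are 1 and -2, so MSE = (1 + 4) / 2.
mse = mean_squared_error([3.0, 5.0], [2.0, 7.0])  # 2.5
```

Because errors are squared, a single large miss affects MSE more than several small ones.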
Q29. Bagging and boosting are types of what?
a) Dimensionality reduction
b) Ensemble learning
c) Feature selection
d) Neural network architectures
Answer: b) Ensemble learning
👉 They combine multiple models to improve performance.
Q30. Which of the following is NOT a feature scaling method?
a) Min-Max Normalization
b) Z-score Standardization
c) One-hot encoding
d) Robust Scaler
Answer: c) One-hot encoding
👉 One-hot encoding is a categorical encoding method, not a scaling method.
Q31. What is backpropagation used for in neural networks?
a) Feature scaling
b) Weight optimization
c) Model evaluation
d) Data cleaning
Answer: b) Weight optimization
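👉 Backpropagation computes the gradient of the loss with respect to each weight; an optimizer such as gradient descent then uses those gradients to update the weights.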
Q32. Which of the following activation functions outputs values between 0 and 1?
a) ReLU
b) Sigmoid
c) Tanh
d) Softmax
Answer: b) Sigmoid
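👉 Sigmoid maps any real input into the range (0, 1). Softmax outputs also lie between 0 and 1, but they are normalized across a vector so that they sum to 1.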
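A short sketch of the sigmoid function, 1 / (1 + e^(-x)). The two-branch form below is an implementation choice (an assumption for this example) that avoids overflow in `math.exp` for large negative inputs.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # safe: x < 0, so exp(x) is below 1
    return z / (1.0 + z)


midpoint = sigmoid(0.0)   # 0.5
low = sigmoid(-5.0)       # close to 0
high = sigmoid(5.0)       # close to 1
```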
Q33. Which algorithm is best for market basket analysis?
a) Apriori
b) K-means
c) SVM
d) Random Forest
Answer: a) Apriori
👉 Apriori is used for association rule mining.
Q34. What does an ROC curve show?
a) Sensitivity vs Specificity
b) True Positive Rate vs False Positive Rate
c) Accuracy vs Precision
d) Loss vs Accuracy
Answer: b) True Positive Rate vs False Positive Rate
👉 ROC curve helps evaluate classification performance.
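The curve can be traced by sweeping a decision threshold over the model's scores and recording (FPR, TPR) at each step. The sketch below assumes binary labels (0/1), at least one example of each class, and made-up scores; `roc_points` is a hypothetical helper.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, one per distinct score threshold."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        # Predict positive when the score is at or above the threshold.
        tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
        points.append((fp / negatives, tp / positives))
    return points


points = roc_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
# -> [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
```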
Q35. Which ML technique is most suitable for time series forecasting?
a) Regression
b) ARIMA
c) PCA
d) Clustering
Answer: b) ARIMA
👉 ARIMA is widely used for time series forecasting.
Q36. What is the vanishing gradient problem in deep learning?
a) Gradient becomes too large
b) Gradient becomes too small to update weights
c) Gradient never changes
d) Gradient oscillates
Answer: b) Gradient becomes too small to update weights
👉 It slows or stalls learning in deep networks, especially in the layers closest to the input.
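A back-of-the-envelope sketch shows why this happens with sigmoid activations: the sigmoid derivative is at most 0.25, and backpropagation multiplies one such factor per layer. The sketch deliberately ignores the weight matrices (an assumption equivalent to weights of magnitude 1), so it illustrates the trend rather than any real network.

```python
import math


def sigmoid_derivative(x):
    """Derivative of the logistic sigmoid; its maximum is 0.25 at x = 0."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)


def gradient_scale(num_layers, pre_activation=0.0):
    """Product of per-layer sigmoid derivatives, ignoring weights (simplification)."""
    return sigmoid_derivative(pre_activation) ** num_layers


best_case_one_layer = gradient_scale(1)    # 0.25
best_case_ten_layers = gradient_scale(10)  # 0.25 ** 10, below 1e-6
```

Even in the best case for sigmoid, ten layers scale the gradient by less than one millionth.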
Q37. Which technique helps prevent overfitting in decision trees?
a) Bagging
b) Pruning
c) Dropout
d) Gradient Descent
Answer: b) Pruning
👉 Pruning removes branches to reduce complexity.
Q38. Which of the following is a type of anomaly detection algorithm?
a) Isolation Forest
b) Logistic Regression
c) PCA
d) KNN
Answer: a) Isolation Forest
👉 Isolation Forest is used for anomaly detection.
Q39. What is the purpose of feature scaling?
a) To reduce overfitting
b) To normalize features to a common scale
c) To encode categorical data
d) To improve interpretability
Answer: b) To normalize features to a common scale
👉 Scaling keeps features with large numeric ranges from dominating training, so features contribute on a comparable scale.
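Two common scaling methods, sketched with the standard library only. Mapping constant features to 0.0 is an assumed convention for this example, and the sample values are made up.

```python
import statistics


def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # assumed convention for constant features
    return [(v - lo) / (hi - lo) for v in values]


def z_score_scale(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]  # assumed convention for constant features
    return [(v - mean) / std for v in values]


scaled = min_max_scale([10, 20, 30])        # [0.0, 0.5, 1.0]
standardized = z_score_scale([10, 20, 30])  # mean 0, standard deviation 1
```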
Q40. Which of the following is an unsupervised ML problem?
a) Predicting loan default
b) Fraud detection
c) Customer segmentation
d) Predicting stock prices
Answer: c) Customer segmentation
👉 Segmentation is clustering (unsupervised learning).
Q41. Which algorithm is most commonly used for recommendation systems?
a) Naive Bayes
b) Collaborative Filtering
c) PCA
d) Random Forest
Answer: b) Collaborative Filtering
👉 Collaborative filtering powers Netflix and Amazon recommendations.
Q42. Which optimizer is commonly used in deep learning?
a) Adam
b) K-means
c) Naive Bayes
d) SVM
Answer: a) Adam
👉 Adam is widely used due to efficiency and adaptability.
Q43. What is the purpose of word embeddings in NLP?
a) To reduce stopwords
b) To convert words into numeric vectors
c) To improve grammar checking
d) To tokenize text
Answer: b) To convert words into numeric vectors
👉 Word embeddings (Word2Vec, GloVe) map words into dense vectors.
Q44. Which ML algorithm is sensitive to feature scaling?
a) Decision Trees
b) Random Forest
c) KNN
d) Naive Bayes
Answer: c) KNN
👉 KNN depends on distance metrics, so features on larger scales would dominate the distance unless the data is scaled.
Q45. What is the curse of dimensionality?
a) Too few features
b) Too many features causing sparsity
c) Too many models
d) Too much noise
Answer: b) Too many features causing sparsity
👉 As the number of dimensions grows, data becomes sparse and distances become less informative, which can hurt model performance.
Q46. Which of the following is NOT a type of machine learning?
a) Supervised
b) Unsupervised
c) Reinforcement
d) Approximation
Answer: d) Approximation
👉 Approximation is not an ML category.
Q47. Which ML concept helps select the most relevant features?
a) PCA
b) Feature Engineering
c) Feature Selection
d) Regularization
Answer: c) Feature Selection
👉 Feature selection keeps the most relevant features and removes irrelevant ones.
Q48. What is the main difference between Bagging and Boosting?
a) Bagging builds models sequentially, Boosting in parallel
b) Bagging builds models in parallel, Boosting sequentially
c) Both are parallel
d) Both are sequential
Answer: b) Bagging builds models in parallel, Boosting sequentially
👉 Bagging trains models independently (in parallel); boosting trains them sequentially, with each model focusing on the errors of the previous ones.
Q49. Which of the following is an evaluation metric for clustering?
a) Adjusted Rand Index
b) Precision
c) Recall
d) Accuracy
Answer: a) Adjusted Rand Index
👉 ARI measures how well a clustering agrees with reference (ground-truth) labels.
Q50. Which of the following frameworks is widely used in ML?
a) TensorFlow
b) Django
c) Angular
d) Flask
Answer: a) TensorFlow
👉 TensorFlow is a popular ML framework.