AWS CERTIFICATION
MLS-C01 Machine Learning Specialty Practice Exam
Exam Number: 1211 | Last updated April 24, 2026 | 700+ questions across 4 vendor-aligned objectives
The AWS Certified Machine Learning — Specialty (MLS-C01) targets data scientists and ML practitioners who design, build, train, tune, and deploy machine learning solutions on AWS. Candidates typically have two or more years of hands-on experience developing and running ML or deep-learning workloads, plus working knowledge of Python, common ML frameworks, and statistical fundamentals. The exam emphasizes algorithm selection, model evaluation, and production-grade deployment patterns.
Modeling and Machine Learning Implementation and Operations together account for more than half of the exam. Modeling (36%) is the most heavily weighted domain and covers algorithm selection across regression, classification, clustering, recommendation, and deep learning; hyperparameter tuning with Amazon SageMaker Automatic Model Tuning; and evaluation metrics such as precision, recall, F1, AUC-ROC, and RMSE. Machine Learning Implementation and Operations (20%) covers Amazon SageMaker inference options (real-time endpoints, batch transform, and asynchronous inference), AWS Lambda for inference glue, Amazon API Gateway, and Amazon EventBridge for retraining triggers.
The remaining domains cover the data and analysis lifecycle. Data Engineering (20%) covers Amazon S3 data lakes, AWS Glue ETL, Amazon Kinesis Data Streams and Amazon Data Firehose for streaming ingestion, and AWS Lake Formation for fine-grained access. Exploratory Data Analysis (24%) covers feature engineering, handling missing values and class imbalance, and visualization with Amazon SageMaker Studio notebooks and Amazon QuickSight.
Every answer links to the source. Each explanation below includes a hyperlink to the exact AWS documentation page the question was derived from. PowerKram is the only practice platform with source-verified explanations.
192 practice exam users
90.1% satisfied users
86.1% passed the exam
4/5 quality rating
Test your AWS Machine Learning Specialty knowledge
10 of 700+ questions
Question #1 - Data Engineering
A team needs to ingest 200 GB/hour of clickstream data into S3 in Parquet format, partitioned by event date, with minimal code.
Which service handles this most simply?
A) AWS Snowball Edge
B) EC2 fleet running custom Python scripts
C) Amazon Data Firehose (formerly Amazon Kinesis Data Firehose) with Parquet conversion and dynamic partitioning
D) Manual S3 multipart uploads from on-prem
Correct answer: C – Explanation:
Firehose natively converts JSON to Parquet via Glue schemas and supports dynamic partitioning by event attributes — fully managed at hundreds of GB/hour. Custom EC2 scripts are high-ops; Snowball is offline; manual multipart is not streaming. Source: [Firehose Parquet conversion](https://docs.aws.amazon.com/firehose/latest/dev/record-format-conversion.html)
Question #2 - Exploratory Data Analysis
An analyst sees a regression target with strong right-skew and a long tail of high values.
Which transformation typically improves a linear model on this target?
A) Min-max scaling only
B) One-hot encoding the target
C) Log transformation of the target
D) Removing all rows above the median
Correct answer: C – Explanation:
A log transform compresses the long right tail toward normality, improving linear-model assumptions. Min-max only rescales without changing skew; one-hot is for categoricals; truncating data at the median throws away signal and biases the model. Source: [SageMaker feature engineering](https://docs.aws.amazon.com/sagemaker/latest/dg/data-prep-process-features.html)
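The effect of the log transform can be checked numerically. The sketch below uses a made-up target (the values are an assumption, not data from the question) and compares skewness before and after applying `log1p`:

```python
import math
import statistics


def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return sum(((v - mean) / std) ** 3 for v in values) / len(values)


# Hypothetical right-skewed target: mostly small values, a few very large ones.
target = [12, 15, 18, 20, 22, 25, 30, 35, 40, 60, 90, 250, 900]

# log1p(x) = log(1 + x) stays defined when the target contains zeros.
log_target = [math.log1p(v) for v in target]

raw_skew = skewness(target)
log_skew = skewness(log_target)
print(f"skew before: {raw_skew:.2f}, after log1p: {log_skew:.2f}")
```

A model trained on the transformed target predicts in log space, so predictions need `math.expm1` applied before they are compared with the original units.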
Question #3 - Modeling
A binary classifier on imbalanced fraud data shows 99% accuracy but catches almost no fraud.
Which evaluation metric should the team optimize instead?
A) Accuracy
B) R-squared
C) Mean squared error
D) F1 score or AUPRC (precision-recall AUC)
Correct answer: D – Explanation:
On imbalanced classification, F1 and AUPRC reflect minority-class performance much better than accuracy. MSE and R-squared are regression metrics. Source: [SageMaker model evaluation](https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning-considerations.html)
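A small worked example makes the gap between accuracy and F1 concrete. The labels below are hypothetical (1,000 transactions, 10 of them fraud) and the model is assumed to catch only one fraud case:

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(y_true, y_pred):
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical data: 10 fraud cases (label 1) followed by 990 legitimate ones.
y_true = [1] * 10 + [0] * 990
# A model that flags only the first fraud case and nothing else.
y_pred = [1] + [0] * 999

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
print(f"accuracy={acc:.3f}, F1={f1:.3f}")
```

Accuracy is dominated by the 990 true negatives, while F1 depends only on the fraud class, which is why it exposes the weak recall.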
Question #4 - Modeling
An engineer wants the SageMaker built-in algorithm best suited for high-dimensional time-series forecasting across many related items (e.g., per-store, per-SKU sales).
Which algorithm is the right choice?
A) Linear Learner
B) Random Cut Forest
C) K-Means
D) DeepAR
Correct answer: D – Explanation:
DeepAR trains a single global RNN across many related time series and outperforms per-series classical methods when many series are available. Linear Learner is for regression/classification; K-Means clusters; RCF is for anomaly detection. Source: [SageMaker DeepAR](https://docs.aws.amazon.com/sagemaker/latest/dg/deepar.html)
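To illustrate what "many related time series" looks like in practice, the sketch below writes per-store, per-SKU histories as JSON Lines records with `start`, `target`, and `cat` fields, the shape DeepAR accepts for training data. The store names, SKU names, sales figures, and the `to_json_lines` helper are all hypothetical:

```python
import json

# Hypothetical daily unit sales keyed by (store_id, sku_id).
sales = {
    ("store_1", "sku_a"): [12, 15, 9, 20],
    ("store_1", "sku_b"): [3, 4, 6, 5],
    ("store_2", "sku_a"): [30, 28, 35, 40],
}


def to_json_lines(series_by_key, start):
    """Encode each series as one JSON object per line.

    Categorical features are integer indices, so each store and SKU
    is mapped to its position in a sorted list of distinct values.
    """
    stores = sorted({store for store, _ in series_by_key})
    skus = sorted({sku for _, sku in series_by_key})
    lines = []
    for (store, sku), target in sorted(series_by_key.items()):
        record = {
            "start": start,
            "target": target,
            "cat": [stores.index(store), skus.index(sku)],
        }
        lines.append(json.dumps(record))
    return lines


lines = to_json_lines(sales, start="2024-01-01 00:00:00")
for line in lines:
    print(line)
```

Because every series goes into the same file, one training job can learn patterns shared across items, which is the point of a single global model.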
Question #5 - Machine Learning Implementation and Operations
A model in production must respond in <100 ms to single requests and traffic is steady.
Which deployment option fits?
A) SageMaker Real-time Inference endpoint
B) SageMaker Batch Transform
C) SageMaker Asynchronous Inference
D) Save the model to S3 and load it ad-hoc
Correct answer: A – Explanation:
Real-time endpoints provide low-latency single-request inference for steady traffic. Batch Transform is offline; Async Inference suits long-running requests; loading from S3 each call is too slow. Source: [SageMaker hosting options](https://docs.aws.amazon.com/sagemaker/latest/dg/deploy-model.html)
Question #6 - Data Engineering
A team must label 100,000 images for object detection with a managed labeling workflow and optional human-in-the-loop verification.
Which service is purpose-built?
A) Amazon Translate
B) Amazon Comprehend
C) Amazon Polly
D) Amazon SageMaker Ground Truth
Correct answer: D – Explanation:
Ground Truth provides managed labeling workflows (private, vendor, or Mechanical Turk) with active learning to reduce labeling cost. The other services are pre-built ML APIs unrelated to labeling. Source: [SageMaker Ground Truth](https://docs.aws.amazon.com/sagemaker/latest/dg/sms.html)
Question #7 - Modeling
Hyperparameter tuning is needed across 8 dimensions with non-trivial training time per trial.
Which strategy is most efficient?
A) Bayesian optimization (SageMaker Automatic Model Tuning)
B) Grid search across all combinations
C) Single manual trial
D) Random guess and stop after one
Correct answer: A – Explanation:
Bayesian optimization uses the results of earlier trials to concentrate on promising regions of the search space, so it typically needs far fewer training jobs than grid search, whose trial count grows exponentially with the number of hyperparameters. Manual or single-trial approaches are unlikely to find good configurations. Source: [SageMaker Automatic Model Tuning](https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning.html)
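The cost of an exhaustive grid is easy to estimate. The arithmetic below assumes five candidate values per hyperparameter and 20 minutes per training job; both figures are illustrative and not taken from the question:

```python
import math


def grid_trials(values_per_dimension):
    """Number of training jobs needed to try every combination in a grid."""
    return math.prod(values_per_dimension)


# Assumption: 8 hyperparameters with 5 candidate values each.
values_per_dimension = [5] * 8
total_trials = grid_trials(values_per_dimension)

# Assumption: each training job takes 20 minutes.
MINUTES_PER_TRIAL = 20
total_hours = total_trials * MINUTES_PER_TRIAL / 60

print(f"grid search: {total_trials} trials, about {total_hours:,.0f} compute hours")
```

Even a coarse grid over eight dimensions runs to hundreds of thousands of jobs, so a strategy that picks each new trial based on earlier results is the practical choice when every trial is expensive.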
Question #8 - Machine Learning Implementation and Operations
A deployed model’s predictions are slowly degrading over weeks.
Which AWS service detects feature and prediction drift over time?
A) Amazon CloudFront
B) Amazon SageMaker Model Monitor
C) Amazon Macie
D) AWS Trusted Advisor
Correct answer: B – Explanation:
Model Monitor schedules baseline-vs-production checks for data quality, drift, model quality, and bias. The other services are unrelated. Source: [SageMaker Model Monitor](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor.html)
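The idea of a baseline-vs-production check can be shown with a deliberately simple statistic. The sketch below is not how Model Monitor computes drift; it is a hand-rolled mean-shift test on a single numeric feature, and the data and threshold are assumptions:

```python
import statistics


def mean_shift_in_stds(baseline, production):
    """Distance between the production and baseline means,
    measured in baseline standard deviations."""
    base_mean = statistics.fmean(baseline)
    base_std = statistics.pstdev(baseline)
    if base_std == 0:
        raise ValueError("baseline has zero variance")
    return abs(statistics.fmean(production) - base_mean) / base_std


# Hypothetical values of one feature captured at training time and in production.
baseline = [48, 50, 52, 49, 51, 50, 47, 53]
stable_week = [49, 51, 50, 52, 48, 50, 51, 49]
drifted_week = [58, 61, 60, 59, 62, 57, 60, 63]

THRESHOLD = 2.0  # hypothetical alert threshold, in baseline standard deviations

for name, window in [("stable", stable_week), ("drifted", drifted_week)]:
    shift = mean_shift_in_stds(baseline, window)
    status = "ALERT" if shift > THRESHOLD else "ok"
    print(f"{name}: shift={shift:.2f} std -> {status}")
```

A managed monitoring schedule applies the same pattern across many features and statistics: compute a baseline once, then compare each window of captured production data against it.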
Question #9 - Modeling
An engineer wants to use a pre-trained image classifier and adapt it to a new dataset of medical images.
Which technique is the natural fit?
A) Train from scratch on the small dataset
B) Transfer learning by fine-tuning the pre-trained backbone
C) Use only k-nearest neighbors on raw pixels
D) Train a linear regression on image bytes
Correct answer: B – Explanation:
Transfer learning leverages features learned on a large source dataset; fine-tuning on the target medical set typically achieves much higher accuracy than training from scratch on limited data. KNN/linear on raw pixels ignores image structure. Source: [SageMaker built-in image classification](https://docs.aws.amazon.com/sagemaker/latest/dg/image-classification.html)
Question #10 - Machine Learning Implementation and Operations
A workload spikes for ~30 minutes a day and is idle the rest of the time. Cost is the priority.
Which inference option is most cost-effective?
A) A 24/7 ml.p4d real-time endpoint
B) SageMaker Serverless Inference
C) Multi-Region active/active provisioned endpoints
D) Always-on Multi-Model Endpoint at peak size
Correct answer: B – Explanation:
Serverless Inference scales to zero between bursts and charges only for the compute used to process requests and the data processed, which suits short, infrequent spikes. The other options keep costly capacity running while idle. Source: [SageMaker Serverless Inference](https://docs.aws.amazon.com/sagemaker/latest/dg/serverless-endpoints.html)
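A rough utilization calculation shows why paying only for busy time matters here. The hourly rate below is a placeholder, not an AWS price, and the sketch assumes for simplicity that both options cost the same per busy hour:

```python
MINUTES_PER_DAY = 24 * 60

# From the scenario: roughly 30 busy minutes per day.
busy_minutes_per_day = 30
utilization = busy_minutes_per_day / MINUTES_PER_DAY

# Hypothetical figures, chosen only to make the arithmetic easy to follow.
HOURLY_RATE = 1.00  # placeholder cost per compute hour
DAYS_PER_MONTH = 30

always_on_cost = HOURLY_RATE * 24 * DAYS_PER_MONTH
busy_only_cost = HOURLY_RATE * (busy_minutes_per_day / 60) * DAYS_PER_MONTH

print(f"utilization: {utilization:.1%}")
print(f"always-on: {always_on_cost:.2f}, busy-time only: {busy_only_cost:.2f}")
```

At about 2% utilization, an always-on endpoint spends almost its entire bill on idle capacity. Real per-unit prices differ between provisioned and serverless options, so a production comparison needs the current pricing for both.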
Get 700+ more questions with source-linked explanations
Every answer traces to the exact AWS documentation page — so you learn from the source, not just memorize answers.
Exam mode & learn mode · Score by objective · Updated April 24, 2026
What the AWS Machine Learning Specialty exam measures
- Data Engineering (20%) — Build training data pipelines on Amazon S3, AWS Glue, Amazon Kinesis Data Streams, and Amazon Data Firehose; govern access with AWS Lake Formation.
- Exploratory Data Analysis (24%) — Engineer features, handle missing values and class imbalance, and visualize distributions with Amazon SageMaker Studio and Amazon QuickSight.
- Modeling (36%) — Select algorithms across regression, classification, clustering, recommendation, and deep learning; tune hyperparameters with Amazon SageMaker Automatic Model Tuning; evaluate with precision, recall, F1, AUC-ROC, and RMSE.
- Machine Learning Implementation and Operations (20%) — Deploy models with Amazon SageMaker real-time endpoints, batch transform, and asynchronous inference; orchestrate retraining with AWS Lambda and Amazon EventBridge; monitor models in production.
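The domain weights listed above can be turned into a simple study plan. The sketch below checks that the weights sum to 100% and splits a study budget in proportion to them; the proportional split and the `allocate_hours` helper are assumptions for illustration, not part of any official guidance:

```python
# Domain weights as listed above, in percent.
DOMAIN_WEIGHTS = {
    "Data Engineering": 20,
    "Exploratory Data Analysis": 24,
    "Modeling": 36,
    "Machine Learning Implementation and Operations": 20,
}

total_weight = sum(DOMAIN_WEIGHTS.values())

# Highest-weighted domain first; ties keep their listed order.
study_order = sorted(DOMAIN_WEIGHTS, key=DOMAIN_WEIGHTS.get, reverse=True)


def allocate_hours(total_hours):
    """Split a study budget across domains in proportion to exam weight."""
    return {
        domain: total_hours * weight / total_weight
        for domain, weight in DOMAIN_WEIGHTS.items()
    }


plan = allocate_hours(50)
for domain in study_order:
    print(f"{domain}: {plan[domain]:.1f} h")
```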
How to prepare for this exam
- Review the official AWS exam guide and confirm the latest domain weights and content scope before scheduling.
- Complete the matching learning plan on AWS Skill Builder, including the digital courses and exam prep modules.
- Build hands-on muscle memory in an AWS Free Tier account by deploying the services that appear in the Modeling domain.
- Apply your skills to a real-world project — workplace assignments, volunteer work, or open-source contributions where AWS services solve a concrete problem.
- Master one objective at a time, beginning with the highest-weighted domain so the score impact of each study session is maximized.
- Run PowerKram in Learn mode to read the explanations and follow every sourced documentation link until you can predict the right answer before reading the choices.
- Switch to PowerKram Exam mode across all objectives once your accuracy in Learn mode passes 85%, simulating the timed exam experience.
Career paths and salary outlook
The Machine Learning Specialty credential maps to high-demand data science roles:
- Senior Machine Learning Engineer — $155,000 to $245,000. Levels.fyi: ML Engineer Compensation
- Data Scientist (Cloud-focused) — $130,000 to $200,000. BLS: Data Scientists Outlook
- Applied Scientist — $165,000 to $260,000. Glassdoor: Applied Scientist Salaries
Official resources
Pair Amazon SageMaker hands-on time with the AWS-authored ML guides for the cleanest preparation:
