AWS CERTIFICATION

ML Engineer Associate Practice Exam

Exam Number: 1209 | Last updated April 24, 2026 | 700+ questions across 4 vendor-aligned objectives

The AWS Certified Machine Learning Engineer — Associate (MLA-C01) targets practitioners who operationalize machine learning workloads on AWS, including training pipelines, model deployment, monitoring, and governance. Candidates typically have one or more years of experience using Amazon SageMaker and adjacent AWS services, plus working knowledge of Python and a major ML framework. The exam emphasizes pragmatic engineering choices rather than theoretical algorithm design.

ML Model Development and ML Solution Monitoring, Maintenance, and Security together account for half of the exam. ML Model Development (26%) covers Amazon SageMaker training jobs, hyperparameter tuning with SageMaker Automatic Model Tuning, AWS Trainium and AWS Inferentia, Amazon SageMaker JumpStart, and built-in algorithms. ML Solution Monitoring, Maintenance, and Security (24%) covers Amazon SageMaker Model Monitor, AWS Identity and Access Management for ML resources, Amazon SageMaker Clarify for bias drift, and Amazon CloudWatch for operational metrics.

The remaining domains complete the MLOps lifecycle. Data Preparation for Machine Learning (28%) carries the highest individual weight and covers Amazon SageMaker Data Wrangler, AWS Glue, Amazon S3, Amazon SageMaker Feature Store, and feature engineering patterns. Deployment and Orchestration of ML Workflows (22%) covers Amazon SageMaker Pipelines, Amazon SageMaker MLflow, AWS Step Functions, model endpoints (real-time, asynchronous, batch transform, serverless), and Amazon SageMaker Inference Recommender.

Know the four Amazon SageMaker inference patterns cold (real-time, asynchronous, serverless, batch transform) and the cost and latency trade-offs of each. Memorize when to use Amazon SageMaker Feature Store online versus offline and the consistency guarantees of each. The exam mixes in a few generative AI questions; review Amazon Bedrock customization (continued pre-training, fine-tuning, RAG) so you can distinguish them under scenario pressure.

Every answer links to the source. Each explanation below includes a hyperlink to the exact AWS documentation page the question was derived from. PowerKram is the only practice platform with source-verified explanations.

234 practice exam users | 95.3% satisfied users | 88.2% passed the exam | 4.5/5 quality rating

Test your aws-ml-engineer-associate knowledge

10 of 700+ questions

Question #1 - Data Preparation for ML

A team needs to engineer features for a customer churn model and reuse them across training and real-time inference without skew.

Which AWS service is purpose-built for this?

A) S3 with manual CSVs
B) Amazon DynamoDB only
C) Amazon SageMaker Feature Store
D) Amazon Comprehend

Correct answer: C

Explanation: Feature Store provides an online store (low-latency lookups) and an offline store (training data) from the same feature definitions, which guards against training/serving skew. DynamoDB or S3 can store data but lack the dual-store design and lineage features; Comprehend is an NLP service. Source: [SageMaker Feature Store](https://docs.aws.amazon.com/sagemaker/latest/dg/feature-store.html)

Question #2 - ML Model Development

A model trains for hours on a single instance. The team wants to train across multiple GPUs and instances with minimal code change.

Which SageMaker feature helps most?

A) Use ml.t3.medium
B) Increase EBS volume size only
C) SageMaker distributed training libraries (data and model parallel)
D) Disable checkpointing

Correct answer: C

Explanation: SageMaker’s distributed training libraries can approach near-linear scaling across GPUs and instances with small code changes. A larger EBS volume doesn’t speed up compute; ml.t3.medium is a small CPU-only instance; disabling checkpoints risks losing progress. Source: [SageMaker distributed training](https://docs.aws.amazon.com/sagemaker/latest/dg/distributed-training.html)

Question #3 - Deployment and Orchestration of ML Workflows

An engineer wants A/B testing between two model versions in production with shifting traffic and metrics.

Which feature supports this directly?

A) SageMaker Endpoint production variants with traffic-splitting
B) Two unrelated endpoints with manual DNS swap
C) Lambda only
D) S3 versioning

Correct answer: A

Explanation: Production variants on a SageMaker endpoint serve multiple models behind one endpoint with weighted traffic and per-variant metrics. Manual DNS swaps don’t allow gradual splitting; Lambda alone lacks the model server; S3 versioning is for objects. Source: [SageMaker production variants](https://docs.aws.amazon.com/sagemaker/latest/dg/model-ab-testing.html)
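
As a rough illustration of weighted traffic splitting, the sketch below builds two variant definitions in the shape of the `ProductionVariants` field of an endpoint configuration and computes the share of requests each variant would receive (its weight divided by the sum of all weights). The variant names, model names, instance type, and weights are hypothetical, and no AWS call is made.

```python
# Hypothetical variant definitions; names, instance types, and weights
# are made up for illustration and nothing is sent to AWS.
production_variants = [
    {
        "VariantName": "model-a",
        "ModelName": "churn-model-v1",
        "InstanceType": "ml.m5.large",
        "InitialInstanceCount": 1,
        "InitialVariantWeight": 9.0,
    },
    {
        "VariantName": "model-b",
        "ModelName": "churn-model-v2",
        "InstanceType": "ml.m5.large",
        "InitialInstanceCount": 1,
        "InitialVariantWeight": 1.0,
    },
]


def traffic_shares(variants):
    """Return each variant's share of requests: weight / sum of weights."""
    total = sum(v["InitialVariantWeight"] for v in variants)
    return {v["VariantName"]: v["InitialVariantWeight"] / total for v in variants}


shares = traffic_shares(production_variants)
print(shares)
```

With weights of 9 and 1, roughly 90% of requests go to the first variant and 10% to the second; shifting traffic gradually amounts to updating the weights.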

Question #4 - ML Model Monitoring

A deployed model’s accuracy is silently degrading because the input distribution has shifted.

Which service detects this?

A) AWS Config rules
B) Amazon CloudFront
C) AWS Trusted Advisor
D) SageMaker Model Monitor (data quality and drift)

Correct answer: D

Explanation: Model Monitor compares production traffic to a baseline to detect data quality issues and drift, with scheduled monitoring jobs. The other services don’t watch model inputs/outputs. Source: [SageMaker Model Monitor](https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor.html)
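
To make "compare production traffic to a baseline" concrete, here is a minimal sketch that scores drift with the Population Stability Index (PSI). PSI is a swapped-in, commonly used drift statistic chosen for illustration; it is not presented as the statistic Model Monitor computes. The sample values and bin edges are hypothetical.

```python
import math


def psi(baseline, production, edges):
    """Population Stability Index of `production` against `baseline`
    over bins defined by the ascending cut points in `edges`."""

    def fractions(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            # Bin index = number of cut points the value has reached.
            idx = sum(1 for e in edges if v >= e)
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    b = fractions(baseline)
    p = fractions(production)
    return sum((pi - bi) * math.log(pi / bi) for bi, pi in zip(b, p))


# Hypothetical feature values (e.g., customer age) and bin edges.
baseline = [20, 25, 30, 35, 40, 45, 50, 55, 60, 65]
shifted = [v + 30 for v in baseline]
edges = [30, 40, 50, 60]

psi_same = psi(baseline, list(baseline), edges)
psi_shifted = psi(baseline, shifted, edges)
print(psi_same, psi_shifted)
```

Identical distributions score zero, and the shifted sample scores well above it. A scheduled job would compute a statistic like this per feature and alert when it crosses a chosen threshold.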

Question #5 - ML Model Development

A model is overfitting: training accuracy is 99% but validation is 70%.

Which actions are appropriate? (Choose two.)

A) Add regularization (dropout, L2) and/or get more diverse training data
B) Train for many more epochs on the same data
C) Use early stopping based on validation loss
D) Remove the validation set

Correct answers: A, C

Explanation: Regularization and more data reduce variance; early stopping prevents continued overfitting once validation loss rises. More epochs typically worsen overfitting; removing validation hides the problem. Source: [Training best practices](https://docs.aws.amazon.com/sagemaker/latest/dg/training.html)
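
Early stopping on validation loss can be sketched in a few lines. The example below is a framework-agnostic illustration, assuming a hypothetical list of per-epoch validation losses and a `patience` parameter (the number of non-improving epochs tolerated before stopping); real frameworks offer their own callbacks for this.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the epoch with the best validation loss,
    stopping the scan after `patience` consecutive non-improving epochs."""
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # validation loss has stopped improving
    return best_epoch


# Hypothetical validation losses: improving, then rising as the model overfits.
val_losses = [0.90, 0.70, 0.60, 0.58, 0.61, 0.65, 0.72]
best = early_stopping_epoch(val_losses, patience=2)
print(best)  # 3
```

Training would stop after epoch 5 and keep the weights from epoch 3, the point where validation loss was lowest.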

Question #6 - Deployment and Orchestration of ML Workflows

An ML pipeline must orchestrate data preprocessing, training, evaluation, conditional registration, and deployment with full lineage.

Which AWS service is purpose-built?

A) AWS CodePipeline only
B) Amazon SageMaker Pipelines
C) Cron on EC2
D) Amazon SQS

Correct answer: B

Explanation: SageMaker Pipelines provides ML-aware DAGs with steps, conditions, and Model Registry integration including lineage. CodePipeline is general CI/CD without ML primitives; cron lacks tracking; SQS is messaging. Source: [SageMaker Pipelines](https://docs.aws.amazon.com/sagemaker/latest/dg/pipelines.html)

Question #7 - ML Model Monitoring

Stakeholders ask the team to explain individual model predictions and report feature importance.

Which tool provides this?

A) AWS X-Ray
B) AWS Lambda traces
C) SageMaker Clarify (explainability with SHAP)
D) S3 access logs

Correct answer: C

Explanation: Clarify computes SHAP-based feature attributions for individual predictions and global importance. The others don’t measure feature importance. Source: [SageMaker Clarify explainability](https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-model-explainability.html)
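
For intuition about what a SHAP-style attribution is, the sketch below computes exact Shapley values for a tiny model by enumerating every feature coalition. This is a teaching illustration only: the toy model, instance, and baseline are hypothetical, exact enumeration is only feasible for a handful of features, and it is not a description of how Clarify implements its approximation.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values: features outside a coalition take baseline values."""
    n = len(instance)

    def value(coalition):
        x = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(x)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            for subset in combinations(others, size):
                # Weight of a coalition of this size in the Shapley formula.
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                s = set(subset)
                phi += weight * (value(s | {i}) - value(s))
        phis.append(phi)
    return phis


def toy_model(x):
    # Hypothetical model: two linear terms plus one interaction term.
    return 2.0 * x[0] + 1.0 * x[1] + x[0] * x[2]


instance = [1.0, 3.0, 2.0]
baseline = [0.0, 0.0, 0.0]
phis = shapley_values(toy_model, instance, baseline)
print(phis)
```

The attributions sum to the difference between the prediction for the instance and the prediction for the baseline, and the interaction term is split evenly between the two features involved.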

Question #8 - Data Preparation for ML

A dataset has class imbalance: 1% positive, 99% negative.

Which approach helps the model learn the minority class?

A) Train less
B) Drop the minority class
C) Use accuracy as the only metric
D) Oversample the minority class (e.g., SMOTE) or use class weights

Correct answer: D

Explanation: Oversampling/SMOTE and class weights rebalance learning toward the minority class; pair with metrics like F1 or AUPRC. Dropping the minority class defeats the goal; accuracy is misleading on imbalanced data; less training won’t help. Source: [Handling imbalanced data](https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning-considerations.html)
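
A short sketch of both points, using the question's 1%/99% split on a hypothetical 1,000-row label column: class weights computed with the common "balanced" heuristic (n_samples / (n_classes × class_count)), and a comparison of accuracy against F1 for a model that always predicts the majority class.

```python
def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    classes = sorted(set(labels))
    n = len(labels)
    return {c: n / (len(classes) * labels.count(c)) for c in classes}


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1 for the positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical labels: 10 positives (1%) and 990 negatives (99%).
y_true = [1] * 10 + [0] * 990
weights = balanced_class_weights(y_true)
print(weights[1])  # 50.0

# A model that never predicts the minority class.
always_negative = [0] * 1000
acc, f1 = accuracy_and_f1(y_true, always_negative)
print(acc, f1)  # 0.99 0.0
```

The always-negative model scores 99% accuracy while finding no positives at all, which is why F1 or AUPRC is the better yardstick here. The minority class weight of 50 makes each positive example count as much in the loss as roughly 99 negatives.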

Question #9 - Deployment and Orchestration of ML Workflows

An inference workload has bursty, unpredictable traffic; cost matters more than steady-state latency.

Which SageMaker option fits best?

A) SageMaker Real-time Inference on a 24/7 ml.p4d
B) SageMaker Serverless Inference
C) SageMaker Batch Transform only
D) Multi-Region active/active provisioned endpoints

Correct answer: B

Explanation: Serverless Inference scales to zero between bursts and bills only for the compute time used to process requests, which suits intermittent traffic as long as occasional cold starts are acceptable. A 24/7 GPU is wasteful; Batch Transform is offline; multi-Region active/active is overkill. Source: [SageMaker Serverless Inference](https://docs.aws.amazon.com/sagemaker/latest/dg/serverless-endpoints.html)

Question #10 - ML Model Development

A team must tune hyperparameters efficiently across a large search space.

Which SageMaker capability does this?

A) Automatic Model Tuning (Bayesian/Hyperband search)
B) Random sampling in a notebook only
C) Manual single trials
D) Hard-code values from a textbook

Correct answer: A

Explanation: Automatic Model Tuning runs parallel jobs with strategies like Bayesian and Hyperband to find good hyperparameters efficiently. Manual approaches don’t scale; textbook values rarely match a specific dataset. Source: [SageMaker Automatic Model Tuning](https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning.html)
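
Hyperband builds on successive halving: give many configurations a small budget, keep the best fraction, and give the survivors more budget. The sketch below shows a single successive-halving bracket on a hypothetical objective; it is an illustration of the idea, not SageMaker's implementation, and the function names, `eta`, and the candidate learning rates are assumptions.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Keep the best 1/eta of configs each round, multiplying budget by eta."""
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        # Lower loss is better; rank survivors at the current budget.
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        keep = max(1, len(survivors) // eta)
        survivors = ranked[:keep]
        budget *= eta
    return survivors[0]


def toy_loss(learning_rate, budget):
    # Hypothetical objective: best near 0.1, and improving with more budget.
    return abs(learning_rate - 0.1) + 1.0 / budget


candidates = [0.001, 0.01, 0.05, 0.1, 0.3, 0.5, 1.0, 2.0]
best = successive_halving(candidates, toy_loss)
print(best)  # 0.1
```

Eight candidates are cut to four, then two, then one, so most of the budget is spent on configurations that already look promising.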

Get 700+ more questions with source-linked explanations

Every answer traces to the exact AWS documentation page — so you learn from the source, not just memorize answers.

Exam mode & learn mode · Score by objective · Updated April 24, 2026


What the aws-ml-engineer-associate exam measures

Data Preparation for Machine Learning (28%) — Ingest and transform training data with Amazon SageMaker Data Wrangler and AWS Glue; manage features with Amazon SageMaker Feature Store; engineer features for tabular and unstructured data.
ML Model Development (26%) — Train models with Amazon SageMaker built-in algorithms and custom containers, tune with SageMaker Automatic Model Tuning, and accelerate with AWS Trainium and AWS Inferentia.
Deployment and Orchestration of ML Workflows (22%) — Orchestrate with Amazon SageMaker Pipelines and AWS Step Functions; deploy real-time, asynchronous, serverless, and batch endpoints; right-size with Amazon SageMaker Inference Recommender.
ML Solution Monitoring, Maintenance, and Security (24%) — Detect drift with Amazon SageMaker Model Monitor, audit fairness with Amazon SageMaker Clarify, secure resources with AWS Identity and Access Management, and observe with Amazon CloudWatch.

How to prepare for this exam

Review the official AWS exam guide and confirm the latest domain weights and content scope before scheduling.
Complete the matching learning plan on AWS Skill Builder, including the digital courses and exam prep modules.
Build hands-on muscle memory in an AWS Free Tier account by deploying the services that appear in the Data Preparation for Machine Learning domain.
Apply your skills to a real-world project — workplace assignments, volunteer work, or open-source contributions where AWS services solve a concrete problem.
Master one objective at a time, beginning with the highest-weighted domain so the score impact of each study session is maximized.
Run PowerKram in Learn mode to read the explanations and follow every sourced documentation link until you can predict the right answer before reading the choices.
Switch to PowerKram Exam mode across all objectives once your accuracy in Learn mode passes 85%, simulating the timed exam experience.

Career paths and salary outlook

ML Engineer Associate maps directly to one of the highest-paying technical career tracks in cloud:

Machine Learning Engineer — $145,000 to $230,000. Levels.fyi: ML Engineer Compensation
MLOps Engineer — $135,000 to $210,000. Glassdoor: MLOps Engineer Salaries
Applied Scientist (Cloud-focused) — $160,000 to $260,000. BLS: Data Scientists Outlook

Official resources

Live in Amazon SageMaker Studio while preparing — most questions are easier when you have hands-on muscle memory:
