MICROSOFT CERTIFICATION
DP-700 Fabric Data Engineer Associate Practice Exam
Exam Number: 3118 | Last updated 16-Apr-26 | 789+ questions across 4 vendor-aligned objectives
The DP-700 Fabric Data Engineer Associate certification validates the skills of data engineers who design, implement, and manage data solutions using Microsoft Fabric. This exam measures your ability to work with Microsoft Fabric components such as Data Factory, Lakehouse, Data Warehouse, Spark, Eventstream, and OneLake, demonstrating both the conceptual understanding and the practical implementation skills required in today's enterprise environments.
The most heavily weighted exam domains are Implement and Manage a Lakehouse (35–40%), Implement and Manage a Data Warehouse (20–25%), and Ingest and Transform Data (20–25%). These areas collectively represent the majority of exam content and require focused preparation across their respective subtopics.
Additional domains tested include Monitor and Optimize a Data Solution (10–15%). Together, these areas round out the full exam blueprint and ensure candidates possess well-rounded expertise across the certification scope.
Every answer links to the source. Each explanation below includes a hyperlink to the exact Microsoft documentation page the question was derived from. PowerKram is the only practice platform with source-verified explanations. Learn about our methodology →
252 practice exam users
96% satisfied users
88.8% passed the exam
4/5 quality rating
Test your DP-700 Fabric Data Engineer Associate knowledge
10 of 789+ questions
Question #1 - Implement and Manage a Lakehouse
A data engineering team sets up a new Fabric Lakehouse to store IoT sensor readings. Analysts will query by device_id and event_date. The team must choose an appropriate table format.
Which table format and partitioning strategy should the team use?
A) CSV files partitioned by device_id
B) Delta tables partitioned by event_date
C) Parquet files with no partitioning
D) JSON files organized into folders by hour
Correct answer: B – Explanation:
Delta tables support ACID transactions, time travel, and efficient predicate pushdown. Partitioning by event_date aligns with time-range queries while avoiding too many partitions. CSV lacks indexing. Unpartitioned Parquet requires full scans. JSON is inefficient for columnar analytics. Source: Check Source
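To make the pruning benefit concrete, here is a minimal Python sketch of why partitioning by event_date helps. Delta lays partitions out as folders named event_date=YYYY-MM-DD, so a date filter lets the engine skip whole folders. The file paths are illustrative, and real pruning happens inside the Spark/Delta engine, not in user code:

```python
# Illustrative only: Hive-style partition folders for a Delta table.
files = [
    "Tables/readings/event_date=2026-04-14/part-000.parquet",
    "Tables/readings/event_date=2026-04-15/part-000.parquet",
    "Tables/readings/event_date=2026-04-15/part-001.parquet",
    "Tables/readings/event_date=2026-04-16/part-000.parquet",
]

def prune(files, wanted_date):
    """Keep only files whose partition folder matches the date filter."""
    key = f"event_date={wanted_date}/"
    return [f for f in files if key in f]

# A query filtered to one day touches 2 of 4 files; the rest are skipped.
print(prune(files, "2026-04-15"))
```

Partitioning by device_id instead would create one folder per device, which for IoT fleets quickly produces far too many small partitions.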
Question #2 - Implement and Manage a Lakehouse
A data engineering team sets up a Lakehouse for IoT sensor readings. Analysts query by device_id and event_date.
Which table format and partitioning strategy should the team use?
A) Parquet files stored without any partitioning scheme in a single flat directory structure
B) JSON files organized into hourly folders based on the ingestion timestamp of each record
C) CSV files partitioned into folders organized by the device_id field for each sensor
D) Delta tables partitioned by event_date enabling time-range predicate pushdown on queries
Correct answer: D – Explanation:
Delta tables support ACID transactions, time travel, and efficient predicate pushdown. Partitioning by event_date aligns with time-range analytical queries while maintaining manageable partition counts. CSV files lack columnar optimization and transaction support. Unpartitioned Parquet requires full directory scans for time-filtered queries. JSON files are row-oriented and inefficient for columnar analytical workloads at IoT scale. Source: Check Source
Question #3 - Implement and Manage a Lakehouse
Over months, a Delta table accumulates thousands of small files from frequent micro-batch ingestion, slowing read queries.
Which maintenance operation should the data engineer run?
A) Run OPTIMIZE to compact small files into larger ones and VACUUM to remove stale versions
B) Delete the entire table and recreate it from the original source data with fresh ingestion
C) Convert the table from Delta format to standard Parquet to eliminate transaction log overhead
D) Increase the Spark compute cluster size to compensate for the small-file read performance
Correct answer: A – Explanation:
OPTIMIZE compacts small files into larger, more efficient ones, and VACUUM removes files no longer referenced by the Delta transaction log. Deleting and recreating loses time-travel history and requires full re-ingestion. Converting to Parquet loses Delta features like ACID transactions, time travel, and schema evolution. Larger compute clusters do not fix the fundamental I/O overhead of reading thousands of small files. Source: Check Source
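Conceptually, OPTIMIZE bin-packs many small files into fewer large ones so reads open far fewer file handles. The sketch below shows the idea with a toy greedy packer; the 4 MB inputs and 128 MB target are made-up numbers, not Fabric or Delta defaults:

```python
def compact(file_sizes_mb, target_mb=128):
    """Greedily group small files into bins no larger than target_mb."""
    groups, current, current_size = [], [], 0
    for size in file_sizes_mb:
        if current and current_size + size > target_mb:
            groups.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        groups.append(current)
    return groups

small_files = [4] * 64            # 64 four-MB files from micro-batches
compacted = compact(small_files)
print(len(small_files), "->", len(compacted))   # 64 -> 2
```

After compaction, VACUUM can then delete the now-unreferenced small files once they fall outside the retention window.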
Question #4 - Implement and Manage a Lakehouse
A retailer stores three years of order history. A developer accidentally updates 100,000 rows with incorrect values.
How should the data engineer revert the accidental update?
A) Use Delta time travel to RESTORE the table to the version immediately before the update
B) Delete the entire table and reimport all three years of order data from the source systems
C) Restore the table from a nightly backup snapshot taken twelve hours before the incident
D) Manually correct each of the 100,000 affected rows using individual UPDATE statements
Correct answer: A – Explanation:
Delta time travel enables restoring a table to any prior version number instantly, reverting the accidental update without losing changes made in the hours since the last backup. A twelve-hour-old backup loses all legitimate data changes since that snapshot. Manually correcting 100,000 rows is extremely time-consuming and error-prone. Full reimport wastes hours of processing and may lose incremental data not in the original source. Source: Check Source
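The version mechanics can be sketched in plain Python: every write appends a new table version, and a restore re-points the table at an earlier snapshot without re-ingesting anything. The class below is a toy model of that behavior, not Delta's implementation:

```python
class VersionedTable:
    """Toy model of a Delta table's version history."""

    def __init__(self, rows):
        self.versions = [list(rows)]        # version 0 = initial load

    def write(self, rows):
        self.versions.append(list(rows))    # every write = a new version

    def current(self):
        return self.versions[-1]

    def restore(self, version):
        # RESTORE appends the old snapshot as a *new* version, so the
        # full history (including the bad write) remains auditable.
        self.versions.append(list(self.versions[version]))

orders = VersionedTable([("o1", 100), ("o2", 200)])
orders.write([("o1", -1), ("o2", -1)])   # accidental bad update = version 1
orders.restore(0)                         # revert to the pre-update snapshot
print(orders.current())                   # [('o1', 100), ('o2', 200)]
```

In Delta SQL the equivalent is RESTORE TABLE orders TO VERSION AS OF n, where n is the version number reported by DESCRIBE HISTORY.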
Question #5 - Implement and Manage a Lakehouse
A healthcare Lakehouse must enforce that only the clinical team can read patient tables while finance accesses only billing tables.
Which security mechanism should be configured?
A) Post a written policy asking users to query only the tables relevant to their department
B) Store each team’s tables in separate Lakehouses governed by independent workspace role assignments
C) Configure OneLake shortcut permissions granting cross-Lakehouse read access to both teams
D) Encrypt patient tables with a clinical-team-only key and share the decryption key via email
Correct answer: B – Explanation:
Separate Lakehouses with workspace roles provide clear access boundaries where workspace membership controls who can read, write, or manage each Lakehouse’s tables. OneLake shortcuts can grant access but do not inherently enforce per-team isolation without workspace-level boundaries. Encryption key sharing via email is not a Fabric access control mechanism and creates key management risks. Written policies are unenforceable without technical controls restricting actual data access. Source: Check Source
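The isolation model reduces to a simple membership check: each Lakehouse lives in its own workspace, and workspace role assignment decides who can reach its tables. The sketch below uses made-up workspace and user names and is not a Fabric API, just the access boundary expressed as code:

```python
# Illustrative mapping: workspace -> members with read access.
workspace_members = {
    "clinical-ws": {"alice", "bob"},   # holds the patient Lakehouse
    "finance-ws": {"carol"},           # holds the billing Lakehouse
}

def can_read(user, workspace):
    """A user reads a Lakehouse only via membership in its workspace."""
    return user in workspace_members.get(workspace, set())

print(can_read("alice", "clinical-ws"))   # True: clinical team member
print(can_read("carol", "clinical-ws"))   # False: no cross-team access
```

Because the boundary is the workspace itself, no per-table policy has to be maintained for the baseline separation.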
Question #6 - Implement and Manage a Data Warehouse
A financial reporting team builds a Fabric Data Warehouse tracking customer addresses over time with full change history.
Which slowly changing dimension approach should the data engineer implement?
A) Store only the latest address and discard all historical versions to simplify the schema
B) Add effective_date, end_date, and is_current columns to preserve full address version history
C) Keep a separate timestamp-only log table with no link back to the customer dimension record
D) Overwrite the customer address row each time a change is detected losing all prior values
Correct answer: B – Explanation:
SCD Type 2 using effective/end dates and a current flag preserves complete address history enabling point-in-time reporting at any date. Overwriting (Type 1) loses historical values preventing any historical analysis. Discarding history eliminates the ability to report on past-period customer geography. A disconnected log table without dimension integration breaks the star schema reporting pattern. Source: Check Source
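A minimal sketch of the Type 2 change itself: close the current row with an end_date, then insert the new address as the live row. The dictionary-based dimension and the sample dates are illustrative; the column names mirror the answer (effective_date, end_date, is_current):

```python
from datetime import date

def apply_address_change(dim_rows, customer_id, new_address, change_date):
    """SCD Type 2: expire the current row, append a new current row."""
    for row in dim_rows:
        if row["customer_id"] == customer_id and row["is_current"]:
            row["end_date"] = change_date
            row["is_current"] = False
    dim_rows.append({
        "customer_id": customer_id,
        "address": new_address,
        "effective_date": change_date,
        "end_date": None,            # open-ended: this is the live version
        "is_current": True,
    })

dim = [{"customer_id": 1, "address": "12 Old Rd",
        "effective_date": date(2023, 1, 1), "end_date": None,
        "is_current": True}]
apply_address_change(dim, 1, "34 New Ave", date(2026, 4, 16))
print([r["address"] for r in dim if r["is_current"]])   # ['34 New Ave']
```

Point-in-time reporting then becomes a filter: the row where effective_date <= report_date and (end_date is null or end_date > report_date).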
Question #7 - Implement and Manage a Data Warehouse
A Fabric Data Warehouse query joining a 2-billion-row fact table with a 50,000-row dimension runs slowly.
Which optimization should the engineer apply first?
A) Permanently increase the warehouse capacity units to the maximum available tier level
B) Add column statistics on the join columns and review the generated query execution plan
C) Convert the entire Data Warehouse to a Lakehouse to change the underlying query engine
D) Export all data to Excel and perform the join analysis in a local pivot table workbook
Correct answer: B – Explanation:
Statistics help the query engine choose efficient join strategies, and the execution plan reveals whether unnecessary scans or suboptimal join types are occurring. Permanent capacity increases add cost without addressing the root query design issue. Excel cannot handle 2 billion rows and has a row limit of approximately 1 million. Converting platforms changes the engine but does not inherently fix query optimization problems. Source: Check Source
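A toy cost model shows why the statistics matter: only when the optimizer knows the dimension side is small can it choose to broadcast it to every worker instead of shuffling the 2-billion-row fact table. The threshold and strategy names below are illustrative, not the Fabric optimizer's actual logic:

```python
def pick_join_strategy(fact_rows, dim_rows, broadcast_limit=1_000_000):
    """Toy optimizer: broadcast the small side when statistics say it's small."""
    # Without statistics, dim_rows is unknown and the engine must fall
    # back to a conservative (and slower) shuffle-based plan.
    if dim_rows <= broadcast_limit:
        return "broadcast hash join"
    return "shuffle hash join"

print(pick_join_strategy(2_000_000_000, 50_000))   # broadcast hash join
```

Reviewing the real execution plan confirms which strategy was chosen and whether the fact table is being scanned more than necessary.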
Question #8 - Ingest and Transform Data
A media company receives hourly CSV files via SFTP from an advertising partner. Files need schema validation and error handling before landing in the Lakehouse.
Which Fabric component should orchestrate this ingestion?
A) Manual file upload performed hourly through the Lakehouse browser-based file upload interface
B) Power BI scheduled refresh pulling CSV data directly into a semantic model import table
C) A Data Pipeline with Copy Activity using an SFTP connector followed by a validation Notebook
D) Fabric Eventstream configured to capture real-time events from the SFTP server connection
Correct answer: C – Explanation:
Data Pipelines orchestrate Copy Activities from SFTP with scheduling capabilities, and a downstream Notebook validates schema and handles errors programmatically. Manual hourly upload requires constant human attention and is not scalable. Power BI refresh imports into semantic models, not into Lakehouse Delta tables. Eventstream handles real-time streaming sources, not scheduled batch file transfers from SFTP. Source: Check Source
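The validation Notebook step can be as simple as checking the landed file's header against the expected schema before loading, and routing bad files to quarantine. The column names and sample rows below are made up for illustration:

```python
import csv
import io

# Hypothetical schema the advertising partner agreed to deliver.
EXPECTED = ["campaign_id", "impressions", "clicks", "spend"]

def validate(csv_text):
    """Return True if the CSV header matches the expected schema."""
    header = next(csv.reader(io.StringIO(csv_text)))
    return header == EXPECTED

good = "campaign_id,impressions,clicks,spend\nc1,1000,37,12.50\n"
bad = "campaign,views,clicks\nc1,1000,37\n"
print(validate(good), validate(bad))   # True False
```

In the pipeline, the Notebook's outcome can drive a conditional activity: load valid files into the Lakehouse table, move invalid ones to a quarantine folder and raise an alert.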
Question #9 - Ingest and Transform Data
A bank needs to stream real-time transaction events from Azure Event Hubs into a Fabric Lakehouse for fraud detection within seconds.
Which Fabric component should be used for real-time ingestion?
A) A scheduled Dataflow Gen2 configured to run at one-minute intervals polling the Event Hub
B) A Fabric Notebook running in a continuous polling loop checking Event Hubs for new messages
C) Power Automate with a premium Event Hubs connector triggering a flow for each transaction
D) Fabric Eventstream connected to Event Hubs with a Lakehouse configured as the destination
Correct answer: D – Explanation:
Eventstream natively connects to Event Hubs and continuously streams events into a Lakehouse destination with low latency and managed, reliable delivery. Scheduled Dataflow at one-minute intervals introduces unacceptable delay for fraud detection. Polling Notebooks waste compute resources and are less efficient than native streaming integration. Power Automate is not designed for high-throughput, low-latency event stream processing at banking transaction volumes. Source: Check Source
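Downstream of the ingestion, fraud rules can run continuously over the arriving events. The generator below sketches that pattern with made-up transactions and a made-up amount threshold; in practice the rule would run against the Lakehouse destination or an Eventstream transformation:

```python
def flag_suspicious(events, amount_threshold=10_000):
    """Yield transaction IDs whose amount meets the flagging threshold."""
    for event in events:                      # events arrive as a stream
        if event["amount"] >= amount_threshold:
            yield event["txn_id"]

stream = [
    {"txn_id": "t1", "amount": 42.00},
    {"txn_id": "t2", "amount": 25_000.00},   # over threshold -> flagged
    {"txn_id": "t3", "amount": 9.99},
]
print(list(flag_suspicious(stream)))          # ['t2']
```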
Question #10 - Monitor and Optimize a Data Solution
Nightly Spark jobs in Fabric take progressively longer each week. The team needs to identify the bottleneck.
Which monitoring approach should the team use?
A) Review the Spark application monitoring UI for stage-level metrics, shuffle sizes, and skew
B) Ask all workspace users to reduce the number and frequency of their scheduled queries
C) Check the Azure subscription billing dashboard for unexpected cost increases per resource
D) Restart the entire Fabric capacity and observe whether performance improves afterward
Correct answer: A – Explanation:
The Spark monitoring UI shows stage durations, shuffle read/write volumes, and task distribution, revealing bottlenecks like data skew, inefficient joins, or excessive shuffling. Billing dashboards show cost trends but not pipeline-level performance diagnostics. Restarting capacity does not fix pipeline design issues causing progressive degradation. Reducing user queries masks the underlying problem without identifying the root cause. Source: Check Source
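One concrete skew signal in those stage metrics: if the slowest task in a stage runs far longer than the median task, one partition is carrying most of the data. The task durations below are made-up stand-ins for what the Spark monitoring UI reports per stage:

```python
import statistics

def skew_ratio(task_durations_s):
    """Ratio of the slowest task to the median task in a stage."""
    return max(task_durations_s) / statistics.median(task_durations_s)

balanced = [10, 11, 9, 10, 12]
skewed = [10, 11, 9, 10, 240]     # one straggler doing most of the work

print(round(skew_ratio(balanced), 1))   # 1.2
print(round(skew_ratio(skewed), 1))     # 24.0
```

A ratio near 1 means work is evenly spread; a large ratio points at a hot key or partition, which salting or repartitioning can then address.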
Get 789+ more questions with source-linked explanations
Every answer traces to the exact Microsoft documentation page — so you learn from the source, not just memorize answers.
Exam mode & learn mode · Score by objective · Updated 16-Apr-26
Learn more...
What the DP-700 Fabric Data Engineer Associate exam measures
- Implement and Manage a Lakehouse (35–40%) — Delta table design and partitioning, table maintenance with OPTIMIZE and VACUUM, time travel and recovery, and securing Lakehouse data through workspace roles.
- Implement and Manage a Data Warehouse (20–25%) — dimensional modeling including slowly changing dimensions, and query performance tuning with statistics and execution plans.
- Ingest and Transform Data (20–25%) — batch ingestion with Data Pipelines and Copy Activities, real-time ingestion with Eventstream, and schema validation and error handling.
- Monitor and Optimize a Data Solution (10–15%) — diagnosing job performance with the Spark monitoring UI, identifying skew and shuffle bottlenecks, and resolving progressive slowdowns.
How to prepare for this exam
- Review the official exam guide to understand every objective and domain weight before you begin studying
- Complete the relevant Microsoft Learn learning path to build a structured foundation across all exam topics
- Get hands-on practice in an Azure free-tier sandbox or trial environment to reinforce what you have studied with real configurations
- Apply your knowledge through real-world project experience — whether at work, in volunteer roles, or contributing to open-source initiatives
- Master one objective at a time, starting with the highest-weighted domain to maximize your score potential early
- Use PowerKram learn mode to study by individual objective and review detailed explanations for every question
- Switch to PowerKram exam mode to simulate the real test experience with randomized questions and timed conditions
Career paths and salary outlook
Earning this certification can open doors to several in-demand roles:
- Fabric Data Engineer: $115,000–$155,000 per year (based on Glassdoor and ZipRecruiter data)
- Cloud Data Engineer: $110,000–$150,000 per year (based on Glassdoor and ZipRecruiter data)
- Data Platform Architect: $125,000–$165,000 per year (based on Glassdoor and ZipRecruiter data)
Official resources
Microsoft provides comprehensive free training to prepare for the DP-700 Fabric Data Engineer Associate exam. Start with the official Microsoft Learn learning path for structured, self-paced modules covering every exam domain. Review the exam study guide for the complete skills outline and recent updates.
