IBM CERTIFICATION

C9007300 IBM Certified watsonx Data Lakehouse Engineer v1 – Associate Practice Exam

Exam Number: 4340 | Last updated April 17, 2026 | 342+ questions across 5 vendor-aligned objectives

Lakehouse engineers who build, load, and query watsonx.data deployments are the audience for the C9007300 credential. This associate-level exam validates your ability to design lakehouse storage and compute topologies, ingest and transform data, register tables in open formats like Apache Iceberg, and query across engines such as Presto and Spark. Candidates should be comfortable with SQL, object storage, and modern data-lake concepts like time travel and schema evolution.

At 26% of the exam, Architecture and Storage Design covers object storage, bucket organization, table formats, and engine placement. At 22%, Ingestion and Transformation covers batch and streaming ingestion, dbt-style transformations, and the movement of data through bronze/silver/gold layers. A further 20% targets Table Formats, covering Apache Iceberg in depth: partitioning, time travel, schema evolution, and compaction.

Rounding out the blueprint, Query Engines accounts for 18% and spans Presto and Spark configuration, query optimization, and cost-based planning. Governance and Security represents 14% and spans catalog registration, column- and row-level security, and lineage. Lakehouse questions often test whether a workload should run in Presto or Spark; decide based on latency, complexity, and concurrency, not personal familiarity with either engine.

Iceberg internals are tested more deeply than candidates expect: know the manifest-list structure, how snapshot expiration works, and the trade-offs of copy-on-write versus merge-on-read. Query-engine selection appears in multiple scenarios, so know when Spark is the better fit than Presto, and vice versa, in typical lakehouse patterns.

Every answer links to the source. Each explanation below includes a hyperlink to the exact IBM documentation page the question was derived from. PowerKram is the only practice platform with source-verified explanations.

752 practice exam users · 94% satisfied users · 91% passed the exam · 4.7/5 quality rating

Test your C9007300 watsonx lakehouse v1 knowledge

10 of 342+ questions

Question #1 - Architecture and Storage Design

A lakehouse engineer at Branwell Media is designing object-storage layout for a multi-team watsonx.data deployment.

Which watsonx.data storage-design approach organizes buckets for the multi-team deployment?

A) Organize buckets by data domain with consistent prefixes per layer (bronze/silver/gold), attach buckets to catalogs, and give each team least-privilege access
B) Dump every dataset into one bucket with no prefixes
C) Use a different storage vendor per team
D) Skip bucket design and let teams create buckets at random

Correct answer: A
Explanation: Domain-based buckets with layered prefixes and least-privilege access are the watsonx.data storage reference design. A flat single-bucket dump, per-team storage vendors, and ad hoc bucket creation all fail it.
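
As a rough illustration of the layout in option A, the sketch below models one bucket per data domain with a fixed prefix per layer and separate reader and writer teams. The class, bucket name, and team names are hypothetical; this is plain Python, not a watsonx.data API.

```python
from dataclasses import dataclass

LAYERS = ("bronze", "silver", "gold")


@dataclass(frozen=True)
class DomainBucket:
    """One bucket per data domain (all names here are made up for illustration)."""
    domain: str
    bucket: str
    readers: frozenset  # teams with read-only access
    writers: frozenset  # teams with read and write access

    def prefix(self, layer: str, table: str) -> str:
        # Every table lives under <bucket>/<layer>/<table>/ so layouts stay consistent.
        if layer not in LAYERS:
            raise ValueError(f"unknown layer: {layer}")
        return f"s3://{self.bucket}/{layer}/{table}/"

    def can_read(self, team: str) -> bool:
        return team in self.readers or team in self.writers

    def can_write(self, team: str) -> bool:
        return team in self.writers


sales = DomainBucket(
    domain="sales",
    bucket="branwell-sales",
    readers=frozenset({"analytics"}),
    writers=frozenset({"sales-eng"}),
)
print(sales.prefix("silver", "orders"))
```

Keeping the layer in the path and the team lists on the bucket makes it easy to check both conventions in one place.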

Question #2 - Architecture and Storage Design

A watsonx.data engineer at Redway Analytics must decide which engine to attach to a bucket for mixed interactive and batch queries.

Which engine-attachment approach fits mixed interactive and batch queries?

A) Use only Presto for everything including heavy ETL
B) Attach both Presto (for interactive, low-latency SQL) and Spark (for heavy batch and ETL) to the bucket, choosing per workload
C) Use only Spark for everything including interactive dashboards
D) Avoid engines and query objects directly

Correct answer: B
Explanation: Choosing the engine per workload, Presto for interactive SQL and Spark for ETL, is the watsonx.data engine-placement reference. All-Presto, all-Spark, and direct object access each ignore that trade-off.
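
The per-workload choice can be pictured as a small routing rule. The sketch below is only an illustration: the `Workload` fields and the 60-second cutoff are assumptions made for the example, not values from IBM documentation.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    latency_target_s: float  # how quickly users expect a result
    is_etl: bool             # heavy batch transformation or load job


def choose_engine(workload: Workload) -> str:
    """Return 'spark' for ETL or long-running batch work, 'presto' otherwise."""
    # The 60 s threshold is an arbitrary illustrative cutoff.
    if workload.is_etl or workload.latency_target_s > 60:
        return "spark"
    return "presto"


print(choose_engine(Workload("ad hoc dashboard query", 1.0, False)))
print(choose_engine(Workload("nightly load", 3600.0, True)))
```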

Question #3 - Architecture and Storage Design

A lakehouse at Hardwick Financial must separate compute from storage so engines can scale independently.

Which architectural principle lets compute scale independently of storage?

A) Keep data in object storage in open table formats (Iceberg) and attach query engines on demand, scaling compute without moving data
B) Copy data into the engine’s local disk
C) Tie compute and storage together on a single VM
D) Use a closed proprietary format that locks the compute choice

Correct answer: A
Explanation: Separating compute from storage with open table formats is the lakehouse reference architecture. Local copies, single-VM bundling, and proprietary lock-in all prevent compute from scaling independently.

Question #4 - Ingestion and Transformation

A batch ingestion at Pemberfield Energy lands CSVs into bronze, needs cleaning to silver, and aggregation to gold.

Which watsonx.data transformation pattern moves CSVs through bronze, silver, and gold layers?

A) Ingest raw CSVs to bronze, apply cleaning and validation transformations to silver, then aggregate to gold, using Spark (or dbt-style transformations) and Iceberg tables at each layer
B) Write everything to a single table and call it done
C) Skip validation and aggregate raw CSVs
D) Maintain only gold and hope raw data is never needed again

Correct answer: A
Explanation: Layered bronze/silver/gold processing with Iceberg tables is the watsonx.data transformation reference. The single-table, no-validation, and gold-only options each skip a necessary layer.
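
To make the three layers concrete, here is a minimal in-memory sketch using only the Python standard library. The CSV contents and column names are invented for the example; in a real deployment each layer would be an Iceberg table written by Spark or a comparable engine.

```python
import csv
import io
from collections import defaultdict

# Hypothetical raw input; two rows are deliberately invalid.
RAW_CSV = """site,reading_kwh
north,120.5
north,
south,80
south,abc
north,30
"""


def to_bronze(text: str) -> list:
    # Bronze keeps every raw row as-is, including bad ones.
    return list(csv.DictReader(io.StringIO(text)))


def to_silver(bronze_rows: list) -> list:
    # Silver keeps only rows whose reading parses as a number.
    silver = []
    for row in bronze_rows:
        try:
            kwh = float(row["reading_kwh"])
        except (TypeError, ValueError):
            continue
        silver.append({"site": row["site"], "reading_kwh": kwh})
    return silver


def to_gold(silver_rows: list) -> dict:
    # Gold aggregates the cleaned readings per site.
    totals = defaultdict(float)
    for row in silver_rows:
        totals[row["site"]] += row["reading_kwh"]
    return dict(totals)


bronze = to_bronze(RAW_CSV)
silver = to_silver(bronze)
gold = to_gold(silver)
print(len(bronze), len(silver), gold)
```

Because bronze retains the invalid rows, the cleaning rules in `to_silver` can be changed later and replayed without re-ingesting the source files.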

Question #5 - Ingestion and Transformation

A streaming use case at Haldane Retail must land events into the lakehouse with near-real-time availability.

Which watsonx.data ingestion pattern lands streaming events near real time into the lakehouse?

A) Batch-only ingestion at midnight
B) Stream events with a streaming framework (e.g., Spark Structured Streaming or Kafka Connect) into Iceberg tables, using commit cadences that balance latency and small-file overhead
C) Stream events into one giant append-only CSV
D) Skip streaming and ask users to refresh manually

Correct answer: B
Explanation: Streaming into Iceberg tables with a tuned commit cadence is the streaming-ingestion reference. A midnight batch, a CSV append, and manual refresh all fail the near-real-time requirement.
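
The latency versus small-file trade-off comes down to how often the writer commits. The sketch below is a simplified model, not a real streaming connector: `CommitBuffer` and its thresholds are hypothetical, and each commit stands in for one data file written to the table.

```python
class CommitBuffer:
    """Buffers events and commits when a size or age threshold is reached."""

    def __init__(self, max_events: int, max_age_s: float):
        self.max_events = max_events
        self.max_age_s = max_age_s
        self._pending = []
        self._opened_at = None
        self.commits = []  # each entry stands in for one committed data file

    def add(self, event, now_s: float) -> None:
        if not self._pending:
            self._opened_at = now_s  # the age clock starts at the first buffered event
        self._pending.append(event)
        # Thresholds are checked only when an event arrives (a simplification).
        too_big = len(self._pending) >= self.max_events
        too_old = now_s - self._opened_at >= self.max_age_s
        if too_big or too_old:
            self.flush()

    def flush(self) -> None:
        if self._pending:
            self.commits.append(list(self._pending))
            self._pending.clear()


buffer = CommitBuffer(max_events=3, max_age_s=10)
for t, event in [(0, "a"), (1, "b"), (2, "c"), (3, "d"), (14, "e")]:
    buffer.add(event, now_s=t)
print([len(c) for c in buffer.commits])
```

Lowering either threshold makes data visible sooner but produces more, smaller files; raising them does the opposite.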

Question #6 - Table Formats

A data engineer at Greshley Insurance needs to query a table as it was two days ago.

Which Iceberg capability serves the as-of-two-days-ago query directly?

A) Reconstruct the state by subtracting recent changes manually
B) Restore a backup into a separate table
C) Use Iceberg time travel to query a snapshot or timestamp from two days ago directly
D) Skip the request because time travel is not possible in lakehouses

Correct answer: C
Explanation: Iceberg time travel queries a past snapshot or timestamp directly. Restoring a backup and reconstructing state by hand are slower, error-prone workarounds, and the claim that lakehouses cannot time travel is false.
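
Conceptually, a timestamp-based time-travel read resolves to the most recent snapshot committed at or before the requested time. The sketch below models that lookup and builds a query string in the Spark SQL `TIMESTAMP AS OF` form described in the Iceberg documentation; other engines, including Presto, use different syntax. The snapshot IDs and table name are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def snapshot_as_of(snapshots: list, as_of: datetime) -> int:
    """snapshots: list of (committed_at, snapshot_id), sorted by committed_at."""
    times = [committed_at for committed_at, _ in snapshots]
    index = bisect_right(times, as_of)
    if index == 0:
        raise LookupError("no snapshot at or before the requested time")
    return snapshots[index - 1][1]


def time_travel_sql(table: str, as_of: datetime) -> str:
    # Spark SQL form; adjust for the engine actually in use.
    return f"SELECT * FROM {table} TIMESTAMP AS OF '{as_of:%Y-%m-%d %H:%M:%S}'"


now = datetime(2026, 4, 17, 12, 0, 0)
history = [
    (now - timedelta(days=3), 101),
    (now - timedelta(days=2, hours=1), 102),
    (now - timedelta(days=1), 103),
]
two_days_ago = now - timedelta(days=2)
print(snapshot_as_of(history, two_days_ago))
print(time_travel_sql("catalog.db.claims", two_days_ago))
```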

Question #7 - Table Formats

A table at Finmore Financial gains a new column that should not break existing queries.

Which Iceberg capability adds the new column without breaking existing queries?

A) Block all schema changes permanently
B) Rewrite the entire table to add the column
C) Create a new table and deprecate the old one
D) Use schema evolution to add the column (nullable or with a default) so existing queries continue to work and new queries can use the new column

Correct answer: D
Explanation: Iceberg schema evolution is the reference for non-breaking column additions. Rewriting the table, duplicating it, and blocking schema changes are all unnecessary.
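
Adding a column is non-breaking because Iceberg tracks columns by field ID, so data files written before the change simply have no value for the new field and read back as null. The sketch below models that behaviour in plain Python; the column names and IDs are hypothetical, and the DDL in the comment uses the Spark SQL form from the Iceberg documentation.

```python
# Spark SQL equivalent (table and column names are illustrative):
#   ALTER TABLE catalog.db.accounts ADD COLUMNS (risk_tier string)

def read_row(file_row: dict, schema: list) -> dict:
    """file_row maps field_id -> value; schema is a list of (field_id, name)."""
    # Fields missing from an older data file come back as None (null).
    return {name: file_row.get(field_id) for field_id, name in schema}


schema_v1 = [(1, "account_id"), (2, "balance")]
schema_v2 = schema_v1 + [(3, "risk_tier")]  # new nullable column

old_file_row = {1: "A-1", 2: 10.0}              # written before the change
new_file_row = {1: "A-2", 2: 25.0, 3: "low"}    # written after the change

print(read_row(old_file_row, schema_v2))  # old data, new schema
print(read_row(new_file_row, schema_v1))  # new data, query using old columns
```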

Question #8 - Table Formats

An Iceberg table at Turvey Retail has accumulated many small files from frequent streaming commits.

Which Iceberg maintenance capability addresses the small-file buildup?

A) Turn off streaming to avoid small files
B) Delete small files at random
C) Ignore the small-file problem and accept slow queries
D) Schedule Iceberg compaction (rewrite_data_files) to combine small files into larger ones, improving query performance without changing data

Correct answer: D
Explanation: Iceberg compaction is the reference maintenance operation for small files. Deleting files at random, ignoring the problem, and turning off streaming all fail maintenance practice.
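
Compaction groups small files and rewrites each group as a larger file without changing the rows. The sketch below is a simplified greedy grouping written for illustration; it is not Iceberg's actual planner. The `CALL` statement in the comment is the Spark procedure form from the Iceberg documentation, with a hypothetical catalog and table name.

```python
# Spark SQL equivalent (catalog and table names are illustrative):
#   CALL my_catalog.system.rewrite_data_files(table => 'db.events')

def plan_compaction(file_sizes_mb: list, target_mb: int) -> list:
    """Greedily group files so each group stays at or under target_mb."""
    groups, current, current_size = [], [], 0
    for size in sorted(file_sizes_mb, reverse=True):
        if current and current_size + size > target_mb:
            groups.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        groups.append(current)
    return groups


small_files = [4, 8, 2, 120, 6]
plan = plan_compaction(small_files, target_mb=128)
print(plan)
```

Five input files become two output groups here, and the total size is unchanged, which is the sense in which compaction improves query performance without changing data.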

Question #9 - Query Engines

A query choice at Harvingham Ltd pits Presto against Spark for an interactive dashboard with sub-second response.

Which engine fits the sub-second interactive dashboard?

A) Spark, which is tuned for batch and ETL, not sub-second interactive SQL
B) Presto, whose low-latency distributed SQL engine is tuned for interactive queries
C) Neither — dashboards cannot use lakehouses
D) Both simultaneously for the same query

Correct answer: B
Explanation: Presto for interactive queries is the watsonx.data engine reference, while Spark is batch-oriented. Dashboards can query a lakehouse, and a single query runs on one engine, not two at once.

Question #10 - Governance and Security

A sensitive table at Marshford Bank must restrict certain columns (SSN, email) so only the compliance group can read them.

Which watsonx.data capability restricts the sensitive columns to the compliance group?

A) Remove the columns entirely and lose the data
B) Store sensitive columns in a separate file and email them when requested
C) Configure column-level security on the table so only the compliance group can read the restricted columns, while other users see allowed columns only
D) Grant everyone access and add a disclaimer

Correct answer: C
Explanation: Column-level security on the table is the watsonx.data governance reference. Removing the columns loses data, and email workflows and disclaimers do not enforce access control.
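
The effect of a column-level policy can be modelled as a filter on the columns a user is allowed to see. The sketch below is a plain-Python model under assumed names; it does not use the watsonx.data policy API, and the group and column names are taken from the question or invented.

```python
RESTRICTED_COLUMNS = {"ssn", "email"}
PRIVILEGED_GROUP = "compliance"


def visible_columns(columns: list, user_groups: set) -> list:
    """Return the columns a user may read under the illustrative policy."""
    if PRIVILEGED_GROUP in user_groups:
        return list(columns)
    # Everyone else sees only the unrestricted columns.
    return [c for c in columns if c not in RESTRICTED_COLUMNS]


table_columns = ["customer_id", "ssn", "email", "balance"]
print(visible_columns(table_columns, {"compliance"}))
print(visible_columns(table_columns, {"analytics"}))
```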

Get 342+ more questions with source-linked explanations

Every answer traces to the exact IBM documentation page — so you learn from the source, not just memorize answers.

Exam mode & learn mode · Score by objective · Updated April 17, 2026


What the C9007300 watsonx lakehouse v1 exam measures

Design and provision object storage, bucket organization, table formats, and engine placement to deliver a lakehouse topology that scales with data growth without breaking the cost story
Ingest and transform batch and streaming data with dbt-style pipelines across bronze/silver/gold layers, moving raw data through progressively refined stages so analysts and models get trustworthy inputs
Model and evolve Apache Iceberg tables using partitioning, time travel, schema evolution, and compaction to keep large datasets queryable, rewindable, and storage-efficient over months and years
Configure and tune Presto and Spark, applying query optimization and cost-based planning to meet latency and concurrency targets across interactive and batch workloads
Govern lakehouse assets through catalog registration, column- and row-level security, and data lineage to expose them responsibly while satisfying governance requirements

How to prepare for this exam

Review the official exam guide to understand every objective and domain weight before you begin studying
Work through the relevant IBM Training learning path for IBM Certified watsonx Data Lakehouse Engineer v1 – Associate (C9007300) to cover vendor-authored material end-to-end
Get hands-on inside IBM TechZone or a comparable sandbox so you can practice the console tasks, CLI commands, and APIs the exam expects
Tackle a real-world project at your workplace, a volunteer role, or an open-source repository where the technology under test is actually in use
Drill one exam objective at a time, starting with the highest-weighted domain and only moving on once you can teach it to someone else
Study by objective in PowerKram learn mode, where every explanation links back to authoritative IBM documentation
Switch to PowerKram exam mode to rehearse under timed conditions and confirm you consistently score above the pass mark

Career paths and salary outlook

Data engineers with lakehouse skills sit in the highest-paid tier of modern data-platform roles:

Data Lakehouse Engineer — $120,000–$165,000 per year, building and operating modern lakehouse platforms (Glassdoor salary data)
Senior Data Engineer — $130,000–$175,000 per year, leading data-platform work across teams (Indeed salary data)
Analytics Platform Architect — $140,000–$185,000 per year, designing enterprise analytics platforms end-to-end (Glassdoor salary data)

Official resources

Work through the official IBM Training learning path for this certification, which bundles videos, labs, and skill tasks aligned to every objective. The official exam page lists the full objective breakdown, prerequisite knowledge, and scheduling details.

