This guide covers how the API works, how to scale it properly, and the architectural patterns that keep large data operations stable in production.
What Is the Shopify Bulk Operations API?
The Shopify Bulk Operations API is a subset of the Shopify GraphQL Admin API designed for large-scale data retrieval and mutation. Instead of paginating through thousands of API calls, you submit a single GraphQL operation and Shopify runs it in the background.
When the job completes, Shopify generates a JSONL file you download via a signed URL. Each line in the file is one resource object.
There are two operation types:
| Operation Type | What It Does | Common Use Case |
|---|---|---|
| bulkOperationRunQuery | Exports data asynchronously | Orders export, product catalog dump, customer list |
| bulkOperationRunMutation | Applies mutations to a large dataset | Price updates, tag additions, metafield writes |
Both operations follow the same lifecycle: create, poll, download.
The Lifecycle of a Bulk Operation
Understanding the full lifecycle prevents the most common integration bugs. Here is what happens step by step.
Step 1: Submit the Operation
You send a bulkOperationRunQuery mutation via the GraphQL Admin API. Shopify queues the job and returns a bulk operation ID with status CREATED.
    mutation {
      bulkOperationRunQuery(
        query: """
        {
          products {
            edges {
              node {
                id
                title
                variants {
                  edges {
                    node {
                      id
                      price
                      sku
                    }
                  }
                }
              }
            }
          }
        }
        """
      ) {
        bulkOperation {
          id
          status
        }
        userErrors {
          field
          message
        }
      }
    }
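As a rough illustration of the submission step, the sketch below builds the HTTP request for a trimmed-down version of that mutation and checks the response for userErrors. The helper names (`build_request`, `extract_operation`), the shop domain, and the API version string are placeholders chosen for this example, not values from the original integration. The request is constructed but not sent.

```python
import json
import urllib.request

# Same shape as the mutation above, trimmed to two fields for brevity.
BULK_PRODUCTS_MUTATION = '''
mutation {
  bulkOperationRunQuery(
    query: """
    { products { edges { node { id title } } } }
    """
  ) {
    bulkOperation { id status }
    userErrors { field message }
  }
}
'''


def build_request(shop_domain, access_token, api_version, graphql):
    """Build (but do not send) the POST request for the GraphQL Admin API."""
    url = f"https://{shop_domain}/admin/api/{api_version}/graphql.json"
    body = json.dumps({"query": graphql}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "X-Shopify-Access-Token": access_token,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def extract_operation(response_json):
    """Return the queued bulk operation, or raise if Shopify reported userErrors."""
    payload = response_json["data"]["bulkOperationRunQuery"]
    if payload["userErrors"]:
        messages = "; ".join(e["message"] for e in payload["userErrors"])
        raise ValueError(f"bulk operation rejected: {messages}")
    return payload["bulkOperation"]
```

To send the request, pass it to `urllib.request.urlopen` and decode the JSON body before calling `extract_operation`. Checking userErrors first matters because a rejected submission still returns HTTP 200.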
Step 2: Poll for Completion
You poll the currentBulkOperation query at intervals. Status transitions from CREATED to RUNNING to COMPLETED (or FAILED).
Poll every 3 to 10 seconds for small jobs. For large datasets, poll every 30 to 60 seconds to avoid unnecessary API calls.
    query {
      currentBulkOperation {
        id
        status
        errorCode
        createdAt
        completedAt
        objectCount
        fileSize
        url
        partialDataUrl
      }
    }
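A polling loop along these lines is one way to apply the interval guidance. This is a minimal sketch: `fetch_operation` is a hypothetical callable that runs the currentBulkOperation query and returns the operation as a dict, and the interval and timeout defaults are illustrative rather than recommended values. The loop starts with a short interval and doubles it up to a ceiling, so small jobs finish quickly while long jobs generate few calls.

```python
import time

# Statuses from the lifecycle that mean "keep waiting".
IN_PROGRESS = {"CREATED", "RUNNING", "CANCELING"}


def poll_until_done(fetch_operation, initial_interval=5.0, max_interval=60.0,
                    timeout=3600.0, sleep=time.sleep, clock=time.monotonic):
    """Poll until the operation leaves an in-progress status, backing off as we go."""
    deadline = clock() + timeout
    interval = initial_interval
    while True:
        operation = fetch_operation()
        if operation["status"] not in IN_PROGRESS:
            return operation
        if clock() >= deadline:
            raise TimeoutError(f"bulk operation still {operation['status']} after {timeout}s")
        sleep(interval)
        # Double the wait each round, capped at max_interval.
        interval = min(interval * 2, max_interval)
```

The `sleep` and `clock` parameters are injectable so the loop can be tested without real delays.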
Step 3: Download and Parse the JSONL
When status is COMPLETED, the url field returns a signed download link (valid for 7 days). Each line is a flat JSON object. Parent-child relationships are encoded via a __parentId field on child nodes.
| Status | Meaning | Action Required |
|---|---|---|
| CREATED | Job queued, not yet running | Continue polling |
| RUNNING | Job actively processing | Continue polling |
| COMPLETED | Job finished successfully | Download JSONL file |
| FAILED | Job encountered an error | Check errorCode, retry |
| CANCELED | Manually or auto-canceled | Resubmit if needed |
| CANCELING | Cancel in progress | Wait, then resubmit |
Why Bulk Operations Beat Standard Pagination at Scale
Standard GraphQL pagination on Shopify works by cursor-walking through data with after arguments. For 10,000 records, that is manageable. For 1 million records at a maximum of 250 records per page, you are making at least 4,000 sequential API calls and drawing down your GraphQL query cost budget with each one.
The Bulk Operations API bypasses per-request rate limits almost entirely. Shopify runs the operation server-side. Your integration waits, then downloads a single file. This fundamentally changes how you architect large data operations on Shopify.
Here is how the two approaches compare directly:
| Factor | Standard Pagination | Bulk Operations API |
|---|---|---|
| Rate Limit Exposure | High (hundreds of requests) | Very low (1 submission + polling) |
| Max Records per Run | Limited by cost/throttle | Millions (no hard cap) |
| Processing Model | Synchronous | Asynchronous |
| Error Recovery | Per-request retries | Job-level retry |
| Data Format | JSON in response body | JSONL file download |
| Suitable Dataset Size | Up to ~50K records | 50K to millions |
Shopify Bulk GraphQL: Key Constraints to Know
The Bulk Operations API is powerful, but it has rules. Violating them produces silent failures or unexpected cancellations.
One Operation at a Time
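Shopify limits how many bulk operations can run concurrently. The long-standing limit is one bulk query and one bulk mutation at a time per app per store, so check the limit for the API version you target. If you submit another operation of the same type while one is still running, the submission is rejected immediately. You must cancel the current operation or wait for it to complete.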
In a multi-tenant app serving thousands of stores, build per-store operation locks into your queue infrastructure. The patterns described in queue infrastructure for Shopify apps apply directly here.
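A per-store lock can be sketched as follows. This version is in-memory and only works inside a single process. A production queue would keep the same state in a shared store such as a database row or a Redis key. The class name, the `scope_key` (for example a shop domain), and the `owner` token (for example a job ID from your queue) are all hypothetical.

```python
import threading


class OperationLocks:
    """Tracks which scope (e.g. a shop domain) currently has a bulk operation in flight."""

    def __init__(self):
        self._active = {}
        self._mutex = threading.Lock()

    def try_acquire(self, scope_key, owner):
        """Claim the scope for `owner`; return False if something already holds it."""
        with self._mutex:
            if scope_key in self._active:
                return False
            self._active[scope_key] = owner
            return True

    def release(self, scope_key, owner):
        """Release the scope, but only if `owner` is the current holder."""
        with self._mutex:
            if self._active.get(scope_key) == owner:
                del self._active[scope_key]
                return True
            return False
```

Acquire the lock before submitting and release it when the operation reaches a terminal status. Releasing only for the matching owner prevents a stale worker from freeing a lock that a newer job now holds.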
No Nested Mutations
Bulk mutations require a specific structure. You provide the mutation string and a JSONL input file (uploaded via the stagedUploadsCreate mutation). Each line in the input file maps to one mutation call. Nested mutations within a single bulk operation are not supported.
File Size and Memory Considerations
Output JSONL files can reach several gigabytes for large stores. Stream the file rather than loading it fully into memory. Use line-by-line parsing with buffered reads.
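One way to stream the file is a generator that reads fixed-size chunks and yields one parsed record per line, so memory use stays bounded by the chunk size plus the longest line. The function name `iter_jsonl` and the chunk size are assumptions for illustration. It works with any binary stream that has a `read` method, including the response object from `urllib.request.urlopen`.

```python
import io
import json


def iter_jsonl(stream, chunk_size=64 * 1024):
    """Yield one parsed JSON object per line from a binary stream."""
    buffer = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buffer += chunk
        # Everything before the last newline is complete; keep the tail for the next round.
        *lines, buffer = buffer.split(b"\n")
        for line in lines:
            if line.strip():
                yield json.loads(line)
    # A final line without a trailing newline is still a record.
    if buffer.strip():
        yield json.loads(buffer)


# Example with an in-memory stream standing in for the HTTP response body.
sample = io.BytesIO(b'{"id": "gid://shopify/Product/1"}\n{"id": "gid://shopify/Product/2"}\n')
records = list(iter_jsonl(sample, chunk_size=8))
```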
Partial Data on Failure
When a job fails or gets canceled, Shopify may still return a partialDataUrl. This lets you process whatever completed before the failure, and retry only the remaining records. Always check for this field in your polling logic.
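In polling logic, that check might look like the sketch below. The function name and the `(url, is_partial)` return shape are assumptions. The input is the operation dict as returned by the status query.

```python
def pick_result_url(operation):
    """Return (url, is_partial) for whatever output the operation produced."""
    if operation.get("status") == "COMPLETED" and operation.get("url"):
        return operation["url"], False
    if operation.get("partialDataUrl"):
        # Failed or canceled jobs may still expose the records finished so far.
        return operation["partialDataUrl"], True
    return None, False
```

Carrying the `is_partial` flag through the pipeline lets downstream code record which records were covered, so the retry can target only what is missing.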
Bulk Mutations: Updating Data at Scale
Bulk mutations are where the real power lies for operations like price updates, tag management, and metafield writes at scale.
The flow works like this:
1. Stage an upload. Use stagedUploadsCreate to get a signed upload URL and the form parameters required to upload your JSONL input file.
2. Upload your input file. Each line is a JSON object containing the variables for one mutation call.
3. Submit the bulk mutation. Reference the staged upload path (the key parameter returned by stagedUploadsCreate) in your bulkOperationRunMutation call.
Example input JSONL line for a price update:
{"input": {"id": "gid://shopify/ProductVariant/123456789", "price": "29.99"}}
Shopify applies each line as a separate mutation call, processes them in order, and returns results in the JSONL output. Each result line includes a __lineNumber field so you can map successes and failures back to your input.
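The sketch below shows one way to generate the input file and map failures back to it. The shape of a result line (a `data` object keyed by mutation name, containing a `userErrors` list) and the zero-based `__lineNumber` are assumptions based on typical bulk mutation output, and the mutation name in the sample is only illustrative. Verify both against a real result file before relying on them.

```python
import json


def build_input_jsonl(price_updates):
    """Turn (variant_id, price) pairs into JSONL, one mutation's variables per line."""
    lines = [json.dumps({"input": {"id": variant_id, "price": price}})
             for variant_id, price in price_updates]
    return "\n".join(lines) + "\n"


def failed_inputs(input_lines, result_lines):
    """Return (line_number, input_line, errors) for each result that has userErrors."""
    failures = []
    for raw in result_lines:
        result = json.loads(raw)
        line_number = result["__lineNumber"]  # assumed zero-based index into the input
        errors = [error
                  for payload in (result.get("data") or {}).values()
                  for error in (payload or {}).get("userErrors", [])]
        if errors:
            failures.append((line_number, input_lines[line_number], errors))
    return failures


updates = [("gid://shopify/ProductVariant/1", "29.99"),
           ("gid://shopify/ProductVariant/2", "-5.00")]
input_lines = build_input_jsonl(updates).splitlines()
# Hypothetical result lines, deliberately out of input order.
result_lines = [
    '{"data": {"productVariantUpdate": {"userErrors": [{"message": "Price is invalid"}]}}, "__lineNumber": 1}',
    '{"data": {"productVariantUpdate": {"userErrors": []}}, "__lineNumber": 0}',
]
failures = failed_inputs(input_lines, result_lines)
```

The failed input lines can be written straight into a new JSONL file for a targeted retry.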
This is far more efficient than running individual mutations through async event processing pipelines for large batches.
Architectural Patterns for Bulk Query Scaling
Running bulk operations in isolation is straightforward. Running them reliably at scale across a production system requires deliberate architecture decisions.
Pattern 1: Webhook-Triggered Completion
Instead of polling, subscribe to the BULK_OPERATIONS_FINISH webhook topic. Shopify notifies your endpoint when a job finishes. The payload carries the operation ID and final status, which you then use to query the operation for its download URL.
This eliminates polling overhead entirely. Your system sleeps until Shopify calls you. Combine this with Shopify webhook infrastructure and a durable message queue for reliability.
Make your webhook handler idempotent. Shopify can fire the same completion event more than once. The guidance in idempotency strategies for Shopify systems covers exactly how to handle this safely.
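A minimal idempotent handler can deduplicate on the operation ID. In this sketch the payload field name `admin_graphql_api_id` is an assumption about the webhook body, the in-memory set stands in for a durable store with a unique constraint, and `process` is a hypothetical callable that does the real work.

```python
class CompletionHandler:
    """Processes each bulk operation completion at most once per operation ID."""

    def __init__(self, process):
        self._process = process
        self._seen = set()  # stand-in for a durable store (e.g. a unique DB column)

    def handle(self, payload):
        """Return True if the event was processed, False if it was a duplicate."""
        operation_id = payload["admin_graphql_api_id"]  # assumed payload field name
        if operation_id in self._seen:
            return False
        self._process(payload)
        # Mark as seen only after success so a failed attempt can be redelivered.
        self._seen.add(operation_id)
        return True
```

Marking the ID after processing trades a small risk of double processing for the guarantee that a crash mid-handler does not silently drop the event, so `process` itself should also tolerate a repeat.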
Pattern 2: Job Orchestration Layer
For apps managing bulk operations across hundreds or thousands of merchant stores, build a job orchestration layer that:
- Queues bulk operation submissions per store
- Tracks the current active operation ID per store in a database
- Handles completions via webhooks
- Retries failed jobs with exponential backoff
- Logs partial data URLs before discarding failed results
The principles behind scalable Shopify integration patterns give you a solid foundation for this layer.
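For the retry step, exponential backoff with jitter can be sketched as a single function. The base delay, cap, and jitter range here are illustrative defaults, not tuned values.

```python
import random


def backoff_delay(attempt, base=30.0, cap=1800.0, rng=random.random):
    """Seconds to wait before retry number `attempt` (0-based), with jitter."""
    exponential = min(cap, base * (2 ** attempt))
    # Scale into [0.5, 1.0) of the exponential delay so stores do not retry in lockstep.
    return exponential * (0.5 + rng() / 2)
```

With the defaults, attempt 0 waits between 15 and 30 seconds, and delays stop growing once they reach the 30-minute cap.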
Pattern 3: Chunked Mutation Strategy
Even though bulk mutations can handle millions of records, splitting very large jobs into chunks of 50,000 to 100,000 records per operation improves resilience. Smaller jobs complete faster, are easier to retry, and produce smaller JSONL files that are cheaper to process.
Track which chunk was last successfully completed in a persistent state store. If a job fails, you resume from the last good checkpoint rather than reprocessing everything.
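The checkpointing idea can be sketched as below. The `submit` callable is hypothetical: it is assumed to run one chunk as a bulk mutation, block until the operation finishes, and raise on failure. A JSON file stands in for the persistent state store.

```python
import json
from pathlib import Path


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_chunks(records, chunk_size, submit, checkpoint_path):
    """Submit chunks in order, skipping any already recorded in the checkpoint."""
    path = Path(checkpoint_path)
    next_chunk = json.loads(path.read_text())["next_chunk"] if path.exists() else 0
    for index, chunk in enumerate(chunked(records, chunk_size)):
        if index < next_chunk:
            continue  # completed in an earlier run
        submit(chunk)  # raises on failure, leaving the checkpoint untouched
        path.write_text(json.dumps({"next_chunk": index + 1}))
```

This only resumes correctly if the record list and chunk size are identical between runs, so persist the input alongside the checkpoint.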
Pattern 4: JSONL Processing Pipeline
Your output processing needs to handle the flat, __parentId-linked structure of bulk output JSONL. Build a streaming parser that:
- Reads the file line by line without loading it into memory
- Reconstructs parent-child relationships using __parentId lookups
- Writes processed records into your database or downstream system
- Tracks failed lines separately for targeted retries
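A streaming parser for one level of nesting might look like the sketch below. It assumes each child line appears after its parent and before the next top-level record, which keeps memory to one parent at a time. Confirm that ordering for your queries, and fall back to an ID-keyed lookup if it does not hold. The `children` key and the function name are choices made for this example.

```python
import json


def group_by_parent(lines):
    """Yield each top-level record with its child records attached under 'children'."""
    current = None
    for raw in lines:
        if not raw.strip():
            continue
        record = json.loads(raw)
        parent_id = record.get("__parentId")
        if parent_id is None:
            # A new top-level record means the previous one is complete.
            if current is not None:
                yield current
            current = {**record, "children": []}
        elif current is not None and parent_id == current["id"]:
            current["children"].append(record)
        else:
            raise ValueError(f"child line references unexpected parent {parent_id}")
    if current is not None:
        yield current


sample_lines = [
    '{"id": "gid://shopify/Product/1", "title": "Shirt"}',
    '{"id": "gid://shopify/ProductVariant/11", "sku": "S-RED", "__parentId": "gid://shopify/Product/1"}',
    '{"id": "gid://shopify/ProductVariant/12", "sku": "S-BLUE", "__parentId": "gid://shopify/Product/1"}',
    '{"id": "gid://shopify/Product/2", "title": "Hat"}',
]
products = list(group_by_parent(sample_lines))
```

Because it accepts any iterable of lines, the function can consume an open file or a streamed download directly.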
For teams using ERP integration architectures with Shopify, the bulk operation JSONL output often becomes the primary data feed for inventory and order sync pipelines.
Error Handling and Retry Logic
Bulk operations fail for several reasons. Build your error handling around the errorCode field returned when a job reaches FAILED status.
| Error Code | Cause | Recommended Action |
|---|---|---|
| ACCESS_DENIED | Missing required API scope | Review and update OAuth scopes |
| INTERNAL_SERVER_ERROR | Shopify-side failure | Retry with exponential backoff |
| TIMEOUT | Query too complex or dataset too large | Simplify query, reduce scope, chunk inputs |
| TOO_MANY_FILE_STORAGE_REQUESTS | Too many uploads in flight | Throttle staged upload submissions |
For jobs that time out, simplify the GraphQL query structure. Remove unnecessary nested fields. Fetch only the fields your downstream system actually uses. A leaner query resolves most timeout issues without needing to reduce the record scope.
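The table above can be encoded as a small lookup so that the orchestration code decides consistently. This is a sketch: the `retry` flags are one reasonable reading of the recommended actions, and the fallback for unknown codes is an assumption.

```python
# Error codes and actions taken from the table above; retry flags are a judgment call.
RETRY_PLAN = {
    "ACCESS_DENIED": {"retry": False, "action": "review and update OAuth scopes"},
    "INTERNAL_SERVER_ERROR": {"retry": True, "action": "retry with exponential backoff"},
    "TIMEOUT": {"retry": True, "action": "simplify the query or chunk inputs, then retry"},
    "TOO_MANY_FILE_STORAGE_REQUESTS": {"retry": True, "action": "throttle staged uploads, then retry"},
}


def plan_for(error_code):
    """Look up the handling plan for an errorCode, defaulting to manual review."""
    return RETRY_PLAN.get(error_code, {"retry": False, "action": "escalate for manual review"})
```

Defaulting unknown codes to no automatic retry avoids looping on an error the system does not understand.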
The retry and observability patterns from webhook monitoring and observability translate well to bulk operation jobs. Log every status transition, capture completion times, and alert on jobs that stay in RUNNING state beyond expected thresholds.
Performance Benchmarks: What to Expect
Processing time varies based on store size, query complexity, and Shopify platform load. These are realistic reference points based on common production workloads:
| Dataset Size | Typical Completion Time | Approximate File Size |
|---|---|---|
| 10,000 products | 30 to 90 seconds | 5 to 15 MB |
| 100,000 orders | 3 to 8 minutes | 100 to 300 MB |
| 500,000 customers | 15 to 40 minutes | 500 MB to 2 GB |
| 1M+ line items | 30 to 90 minutes | 2 to 10 GB |
These are estimates, not guarantees. Build your systems to handle the upper range of each window. Never assume a fixed completion time in production logic.
Bulk Operations and App Performance Architecture
Bulk operations change the performance profile of your app. Your server is no longer hammering the API with thousands of requests. Instead, it performs a few API calls and then processes a large file. The bottleneck shifts from network throughput to local compute and storage I/O.
Plan your infrastructure accordingly:
- Use worker processes separate from your web layer for JSONL processing
- Write processed results to a fast intermediate store (Redis, PostgreSQL) before your final destination
- Monitor worker memory consumption when processing multi-gigabyte files
- Use streaming HTTP clients that do not buffer the full response body
Teams scaling Shopify apps to high request volumes will find that bulk operations dramatically reduce API call counts. This directly improves the stability profile covered in scaling Shopify apps to millions of requests.
If your app also handles real-time events alongside batch operations, separate the two processing pipelines entirely. Use event-driven architecture patterns for real-time updates and reserve the bulk operation pipeline for scheduled or triggered large-batch jobs.
How KolachiTech Builds Bulk Operation Systems
At KolachiTech, we architect Shopify integrations that handle enterprise-scale data without breaking under load. Our approach combines the Shopify Bulk Operations API with webhook-driven completion handling, idempotent processing pipelines, and structured retry logic.
We have built bulk operation systems for catalog migrations from platforms like Magento and WooCommerce, large-scale ERP sync pipelines, and automated product update workflows touching millions of variants. Every system we build prioritizes reliability over raw speed.
If your Shopify store or app needs a production-grade bulk operations architecture, our team handles the full design and implementation.
Frequently Asked Questions
What is the Shopify Bulk Operations API used for?
It is used to export or mutate large datasets asynchronously without hitting standard GraphQL rate limits. Common uses include full product catalog exports, bulk price updates, and large-scale metafield writes.
Can I run multiple bulk operations at the same time on one store?
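Not for the same operation type. Shopify has long limited each app to one bulk query and one bulk mutation at a time per store, so check the limit for your API version. You must cancel the current operation or wait for it to finish before submitting another of the same type.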
How do I know when a bulk operation is complete?
You can either poll the currentBulkOperation query or subscribe to the BULK_OPERATIONS_FINISH webhook topic. The webhook approach is more efficient for production systems.
What format does the bulk operation output use?
Output is a JSONL file where each line is one resource object. Parent-child relationships are encoded via a __parentId field on child records.
What happens if a bulk operation fails mid-run?
Shopify returns an errorCode in the failed status and may provide a partialDataUrl containing whatever data completed before the failure. You can process this partial file and retry only the remaining records.
Is there a record limit for bulk operations?
There is no hard record cap. Shopify processes as many records as your query matches. Jobs matching millions of records are supported, though they take longer and produce larger output files.
How do bulk mutations work differently from bulk queries?
Bulk mutations require you to upload a JSONL input file where each line contains variables for one mutation call. Shopify applies each line as a separate mutation and returns results in a JSONL output file with line numbers for mapping success and failure.
Can the Bulk Operations API handle metafield updates?
Yes. Metafield mutations are one of the most common bulk mutation use cases. You upload a JSONL file with one metafield input object per line and Shopify processes all writes in the background.
