If you receive files as binary data — CSV exports, JSON payloads, HTML pages, XML feeds — the Extract From File node is how you turn them into structured items n8n can route, filter, and transform. This guide covers every supported format, the gotchas that trip people up, and three ready-to-use workflow patterns with free JSON.
What the Extract From File Node Does
The Extract From File node reads a binary file already present in your workflow (from an HTTP Request, email attachment, S3 download, FTP pull, etc.) and outputs its contents as structured n8n items.
It replaces the old Move Binary Data node's JSON-extraction path and the deprecated Spreadsheet File read path for non-Excel formats.
Supported Formats
| Format | Notes |
|---|---|
| CSV | Delimiter auto-detected or set manually; first row = headers by default |
| JSON | Root must be an array or object; nested objects become sub-keys |
| HTML | Extracts text content; combine with HTML Extract node for CSS selector targeting |
| XML | Converted to JSON-like structure; attribute prefix configurable |
| Text | Splits on newline by default; returns one item per line |
| iCal | Parses .ics calendar files into event objects |
| ODS | OpenDocument Spreadsheet (LibreOffice) |
Node Configuration
Input Binary Field: the property name on the incoming item that holds the binary file (default: data). If your HTTP Request node stores the file in attachment or file, change this to match.
File Type: select the format. n8n does not auto-detect it, and choosing the wrong type produces empty output or an error.
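Because the file type has to be chosen explicitly, it can help to decide it from the file name before the file reaches the node, for example in a Code node that feeds a Switch. The sketch below is one possible approach, not part of n8n: `guessFileType` and its extension table are hypothetical and only cover the formats listed in the table above.

```javascript
// Hypothetical lookup table: file extension -> File Type option.
// A Map avoids accidental hits on inherited object properties.
const FILE_TYPE_BY_EXTENSION = new Map([
  ['csv', 'CSV'],
  ['tsv', 'CSV'], // tab-delimited; the Delimiter option must be set to \t
  ['json', 'JSON'],
  ['html', 'HTML'],
  ['htm', 'HTML'],
  ['xml', 'XML'],
  ['txt', 'Text'],
  ['ics', 'iCal'],
  ['ods', 'ODS'],
]);

// Returns the File Type to select, or null when the extension is unknown.
function guessFileType(fileName) {
  const match = /\.([^.\/\\]+)$/.exec(String(fileName).toLowerCase());
  if (!match) return null; // no extension: decide manually
  return FILE_TYPE_BY_EXTENSION.get(match[1]) ?? null;
}

console.log(guessFileType('report.CSV')); // CSV
console.log(guessFileType('feed.xml'));   // XML
console.log(guessFileType('archive'));    // null
```

An extension is only a hint. A file named export.csv can still contain JSON, so treat a null or surprising result as a cue to inspect the file.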
CSV options:
- Delimiter: default comma; set to \t for TSV
- Header Row: toggle off if your CSV has no headers (columns become column0, column1, …)
- Skip Empty Lines: recommended on
- Include Empty Cells: toggle on to preserve blank cells as empty strings instead of omitting them
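To make the effect of these options concrete, here is a deliberately naive CSV reader that mimics them. It is an illustration only and not n8n's implementation: `parseCsv` and its option names are made up for this sketch, and it does not handle quoted fields.

```javascript
// Naive CSV reader that mirrors the four options described above.
// Assumption: no quoted fields, so a plain split on the delimiter is enough.
function parseCsv(text, options = {}) {
  const {
    delimiter = ',',
    headerRow = true,
    skipEmptyLines = true,
    includeEmptyCells = false,
  } = options;

  let lines = text.split(/\r?\n/);
  if (skipEmptyLines) lines = lines.filter((line) => line.trim() !== '');
  if (lines.length === 0) return [];

  const rows = lines.map((line) => line.split(delimiter));
  // Without a header row, columns are named column0, column1, ...
  const headers = headerRow
    ? rows.shift()
    : rows[0].map((_, index) => `column${index}`);

  return rows.map((cells) => {
    const item = {};
    headers.forEach((header, index) => {
      const value = cells[index] ?? '';
      // Blank cells are dropped unless includeEmptyCells is on.
      if (value !== '' || includeEmptyCells) item[header] = value;
    });
    return item;
  });
}

const tsvSample = 'id\tstatus\n1\tactive\n\n2\t\n';
console.log(parseCsv(tsvSample, { delimiter: '\t' }));
// [ { id: '1', status: 'active' }, { id: '2' } ]
console.log(parseCsv('1,active', { headerRow: false }));
// [ { column0: '1', column1: 'active' } ]
```

In the first call the second row has no status key, because its cell is blank and includeEmptyCells is off.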
JSON options:
- Root Property: if the JSON is {"records": [...]}, set this to records to unwrap the array
XML options:
- Attribute Prefix: default @; XML attributes appear as @id, @class, etc.
Common Gotchas
1. Binary field name mismatch
The most frequent error: Error: No binary data found for property "data". Open the previous node's output, find the actual binary property name in its binary data, and update Input Binary Field to match.
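If the property name is not obvious, a few lines of code can list it for every item. The sketch below assumes n8n's usual item shape, with `json` and `binary` keys. `listBinaryProperties` is a hypothetical helper and the sample items are invented.

```javascript
// Lists the binary property names present on each item.
// Items without binary data get an empty list.
function listBinaryProperties(items) {
  return items.map((item, index) => ({
    index,
    properties: Object.keys(item.binary ?? {}),
  }));
}

// Invented sample data standing in for the previous node's output.
const sampleItems = [
  {
    json: { subject: 'Weekly report' },
    binary: { attachment_0: { fileName: 'report.csv', mimeType: 'text/csv', data: '' } },
  },
  { json: { subject: 'No attachment' } },
];

console.log(JSON.stringify(listBinaryProperties(sampleItems)));
// [{"index":0,"properties":["attachment_0"]},{"index":1,"properties":[]}]
```

Inside a Code node, the same function could be called with `$input.all()` as its argument. Whatever name it reports is the value to put in Input Binary Field.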
2. CSV with BOM
Files exported from Excel often start with a UTF-8 BOM (\uFEFF). This corrupts the first header name. Strip it with a Code node upstream:
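const raw = $input.item.binary.data.data; // base64 (Code node, "Run Once for Each Item" mode)
const decoded = Buffer.from(raw, 'base64').toString('utf-8').replace(/^\uFEFF/, '');
// Re-encode and pass forward under the same binary property name
const data = Buffer.from(decoded, 'utf-8').toString('base64');
return { json: $input.item.json, binary: { data: { ...$input.item.binary.data, data } } };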
3. JSON root is an object, not an array
If your JSON is {"user": {...}} rather than [{...}], set Root Property to user — otherwise you get one item with the whole object as a nested value.
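The difference is easier to see in code. This is a sketch of the unwrapping that Root Property is described as doing, not n8n's own logic, and `unwrapRoot` is a hypothetical name.

```javascript
// Returns the list of items that would result from a parsed JSON document.
// With a root property, that key is unwrapped first.
function unwrapRoot(parsed, rootProperty) {
  const root = rootProperty ? parsed?.[rootProperty] : parsed;
  if (root === undefined) {
    throw new Error(`Root property "${rootProperty}" not found`);
  }
  // An array yields one item per element; anything else yields a single item.
  return Array.isArray(root) ? root : [root];
}

const userPayload = JSON.parse('{"user": {"id": 7, "name": "Ada"}}');
console.log(unwrapRoot(userPayload));         // [ { user: { id: 7, name: 'Ada' } } ]
console.log(unwrapRoot(userPayload, 'user')); // [ { id: 7, name: 'Ada' } ]

const listPayload = JSON.parse('{"records": [{"id": 1}, {"id": 2}]}');
console.log(unwrapRoot(listPayload, 'records').length); // 2
```

Without the root property, the single item carries the whole object under a user key, so downstream expressions would need $json.user.id rather than $json.id.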
4. Large files and memory
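Extract From File loads the entire file into memory. For files larger than about 50 MB on memory-constrained instances, split the file before it reaches n8n or pre-process it server-side. Split in Batches helps only after extraction, because it batches items rather than streaming the file.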
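One way to act on this is to check the file size before extraction and route oversized files elsewhere. The sketch below estimates the decoded size from a base64 string. It assumes the binary data is held inline as base64, and the 50 MB default simply mirrors the figure above. Both function names are hypothetical.

```javascript
// Estimates the decoded byte length of a base64 string without decoding it.
// Every 4 base64 characters encode 3 bytes; '=' padding marks missing bytes.
function base64ByteLength(base64) {
  const trimmed = base64.replace(/\s+/g, '');
  const padding = trimmed.endsWith('==') ? 2 : trimmed.endsWith('=') ? 1 : 0;
  return Math.floor((trimmed.length * 3) / 4) - padding;
}

// True when the file exceeds the limit (default: 50 MB, taken from the text).
function isTooLarge(base64, limitBytes = 50 * 1024 * 1024) {
  return base64ByteLength(base64) > limitBytes;
}

const encoded = Buffer.from('hello').toString('base64'); // 'aGVsbG8='
console.log(base64ByteLength(encoded)); // 5
console.log(isTooLarge(encoded));       // false
```

The check itself is cheap because it only looks at string length, but the file is already in memory by the time it runs. It protects the extraction step, not the download.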
5. HTML extraction is text-only
The HTML format strips tags and returns raw text. If you need to target specific elements, run the HTML through the HTML Extract node with CSS selectors instead.
Three Workflow Patterns
Pattern 1: CSV Email Attachment Parser
Trigger: Email → Gmail / IMAP node receives email with CSV attachment
Extract From File: Input Binary Field = attachment, File Type = CSV
Filter: Remove rows where status = "cancelled"
Google Sheets: Append remaining rows to a tracking sheet
This replaces a manual download-open-copy-paste cycle. One workflow processes every inbound CSV report automatically.
{
"name": "CSV Email Parser",
"nodes": [
{"type": "n8n-nodes-base.gmail", "name": "Watch Inbox", "parameters": {"operation": "getAll", "filters": {"q": "has:attachment filename:csv"}}},
{"type": "n8n-nodes-base.extractFromFile", "name": "Parse CSV", "parameters": {"operation": "csv", "binaryPropertyName": "attachment"}},
{"type": "n8n-nodes-base.filter", "name": "Active Only", "parameters": {"conditions": {"string": [{"value1": "={{$json.status}}", "operation": "notEqual", "value2": "cancelled"}]}}},
{"type": "n8n-nodes-base.googleSheets", "name": "Append Rows", "parameters": {"operation": "append", "sheetId": "YOUR_SHEET_ID"}}
]
}
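The JSON above lists the nodes but has no connections object, which n8n workflow exports use to wire one node's output to the next node's input. The sketch below builds that object, assuming a strictly linear flow in the order the nodes are listed. `chainConnections` is a hypothetical helper, and a real export also carries fields such as position and typeVersion that are not shown here.

```javascript
// Builds a linear "connections" object: each node's main output
// feeds the main input of the node that follows it.
function chainConnections(nodes) {
  const connections = {};
  for (let i = 0; i < nodes.length - 1; i += 1) {
    connections[nodes[i].name] = {
      main: [[{ node: nodes[i + 1].name, type: 'main', index: 0 }]],
    };
  }
  return connections;
}

// Node names taken from the workflow above.
const patternNodes = ['Watch Inbox', 'Parse CSV', 'Active Only', 'Append Rows']
  .map((name) => ({ name }));

console.log(JSON.stringify(chainConnections(patternNodes), null, 2));
```

The last node gets no entry because nothing follows it. Branching workflows need more than one target per output, which this sketch does not cover.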
Pattern 2: JSON API Export Processor
Trigger: Schedule (daily)
HTTP Request: Download JSON export from internal API (returns {"records": [...]})
Extract From File: File Type = JSON, Root Property = records
Set: Map fields to your schema
HTTP Request: POST each record to downstream service
Useful when a vendor only offers file exports instead of a live API.
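The field-mapping step can also be written as a small function, for instance in a Code node, when the mapping is long. This is a sketch under invented assumptions: the field names in `FIELD_MAP` are made up, and `mapRecord` is a hypothetical helper.

```javascript
// Invented mapping: vendor field name -> field name in your schema.
const FIELD_MAP = {
  customer_name: 'name',
  customer_email: 'email',
  created: 'createdAt',
};

// Copies only the mapped fields, renaming them along the way.
// Fields missing from the record are skipped rather than set to undefined.
function mapRecord(record, fieldMap = FIELD_MAP) {
  const mapped = {};
  for (const [source, target] of Object.entries(fieldMap)) {
    if (source in record) mapped[target] = record[source];
  }
  return mapped;
}

const vendorRecords = [
  { customer_name: 'Ada', customer_email: 'ada@example.com', internal_flag: true },
  { customer_name: 'Linus', created: '2024-01-01' },
];

console.log(vendorRecords.map((record) => mapRecord(record)));
// [ { name: 'Ada', email: 'ada@example.com' },
//   { name: 'Linus', createdAt: '2024-01-01' } ]
```

Dropping unmapped fields keeps vendor-specific extras such as internal_flag out of the downstream POST.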
Pattern 3: XML Feed Normalizer
Trigger: Schedule (hourly)
HTTP Request: Fetch RSS/Atom or vendor XML feed
Extract From File: File Type = XML
Set: Flatten @ prefixed attributes — ={{$json['@id']}} → id
Filter: Only items published in the last hour
Slack / Email: Alert on new entries
This is a lightweight alternative to the RSS Feed Read node when you need full XML access (namespaces, attributes, custom tags).
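The flatten and filter steps above can be sketched as two small functions. The assumptions here are that attributes arrive with the @ prefix described earlier and that each entry has a `published` field holding an ISO 8601 timestamp. That field name and both function names are hypothetical.

```javascript
// Renames prefixed attribute keys ('@id' -> 'id').
// If an attribute and a child element share a name, the element wins.
function flattenAttributes(entry, prefix = '@') {
  const flat = {};
  for (const [key, value] of Object.entries(entry)) {
    const isAttribute = key.startsWith(prefix);
    const name = isAttribute ? key.slice(prefix.length) : key;
    if (isAttribute && name in entry) continue;
    flat[name] = value;
  }
  return flat;
}

// Keeps entries published within the last windowMs milliseconds.
// Entries with unparseable or future dates are dropped.
function publishedWithin(entries, windowMs, now = Date.now()) {
  return entries.filter((entry) => {
    const published = Date.parse(entry.published);
    return !Number.isNaN(published) && published <= now && now - published <= windowMs;
  });
}

const referenceTime = Date.parse('2024-01-01T12:00:00Z');
const feedEntries = [
  { '@id': 'a1', title: 'Fresh', published: '2024-01-01T11:30:00Z' },
  { '@id': 'a2', title: 'Stale', published: '2024-01-01T09:00:00Z' },
];

const recentEntries = publishedWithin(
  feedEntries.map((entry) => flattenAttributes(entry)),
  60 * 60 * 1000,
  referenceTime,
);
console.log(recentEntries);
// [ { id: 'a1', title: 'Fresh', published: '2024-01-01T11:30:00Z' } ]
```

Passing the reference time explicitly keeps the filter deterministic in tests. In a scheduled workflow the default of Date.now() would be the natural choice.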
Free Workflow JSON
Download a ready-to-import workflow covering all three patterns above:
👉 n8n Workflow Starter Pack — pirateprentice.gumroad.com/l/sxcoe
Import via n8n → Settings → Import Workflow. Credentials not included — wire your own after import.