Building on the Tradevynt API: Integrating Extraction into Your TMS or Broker Portal

A technical walkthrough of the /v1/extract endpoint — request shape, response schema, webhook delivery, and how to handle low-confidence field flags in your own UI.

Building with the customs declaration API

Most freight forwarders and customs brokers who use Tradevynt start with our web interface — upload a document, review the extraction, approve and route. That workflow handles the majority of cases well. But if you're processing 200+ entries monthly and your team already has a TMS, a broker portal, or an internal ops tool that your entry staff live in every day, adding a separate browser tab to the workflow creates friction and introduces the risk of data re-entry errors between systems.

The API integration path exists for exactly this case: get Tradevynt's extraction running inside your existing tools, so the entry data appears pre-populated in the fields where your team already works, without a UI context switch. This post covers the technical mechanics of that integration for developers setting it up — request shape, response structure, the webhook delivery model, and specifically how to handle the confidence flags that Tradevynt returns on fields where extraction uncertainty warrants human review.

The /v1/extract Endpoint: What Goes In

A document extraction request is an HTTP POST to /v1/extract with a multipart form body. The request takes three components: the document file itself (PDF, JPEG, PNG, or TIFF; max 25MB), a document_type hint (one of commercial_invoice, bill_of_lading, packing_list, arrival_notice, air_waybill, or auto), and your API key in the Authorization header as a Bearer token.

The document_type hint is optional but meaningful. Passing auto tells the extraction pipeline to classify the document type before extracting — which adds a small amount of processing latency (typically 200–400ms) compared to passing the type explicitly. If your integration point is in a workflow where the document type is already known from context — for example, your ops staff designate "bill of lading" before uploading — pass the type explicitly to skip the classification step.

An optional extraction_profile parameter lets you specify which field groups to extract. The default profile extracts all standard fields. If you only need shipper/consignee data and declared value for a quick entry pre-check, use header_fields_only to reduce response size and latency. Field group options are documented in the API reference; full, header_fields_only, and line_items_only cover the most common cases.
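As a rough illustration of the request shape described above, here is a minimal Python sketch that builds (but does not send) the multipart POST using only the standard library. The base URL and the "file" form field name are assumptions made for the example; check the API reference for the actual values. The document_type and extraction_profile names come from this post.

```python
import uuid
import urllib.request

# Hypothetical base URL; this post only names the /v1/extract path.
BASE_URL = "https://api.tradevynt.example"

DOCUMENT_TYPES = {
    "commercial_invoice", "bill_of_lading", "packing_list",
    "arrival_notice", "air_waybill", "auto",
}
MAX_BYTES = 25 * 1024 * 1024  # 25MB upload limit


def build_extract_request(file_bytes, filename, api_key,
                          document_type="auto", extraction_profile=None):
    """Build (but do not send) a multipart POST to /v1/extract."""
    if document_type not in DOCUMENT_TYPES:
        raise ValueError(f"unknown document_type: {document_type}")
    if len(file_bytes) > MAX_BYTES:
        raise ValueError("file exceeds the 25MB limit")

    boundary = uuid.uuid4().hex
    text_parts = {"document_type": document_type}
    if extraction_profile is not None:
        text_parts["extraction_profile"] = extraction_profile

    chunks = []
    for name, value in text_parts.items():
        part = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        )
        chunks.append(part.encode())

    # "file" as the form field name is an assumption for this sketch.
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    )
    chunks.append(file_header.encode())
    chunks.append(file_bytes)
    chunks.append(f"\r\n--{boundary}--\r\n".encode())

    return urllib.request.Request(
        BASE_URL + "/v1/extract",
        data=b"".join(chunks),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

To send the request, pass the returned object to urllib.request.urlopen with a timeout and parse the body with json.load. Keeping request construction separate from sending makes the shape easy to unit test without touching the network.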

Response Schema: What Comes Back

The response is a JSON object. The top-level structure has three sections: document (metadata about the submission), fields (the extracted data), and confidence (per-field confidence scores and flags).

The fields object is keyed by field name. For a commercial invoice, the standard fields include shipper_name, shipper_address, consignee_name, consignee_address, notify_party, invoice_number, invoice_date, incoterms, currency, total_value, country_of_origin, and line_items (an array of objects, each with description, quantity, unit, unit_price, total_price, hs_code, and country_of_origin if specified per line).

The confidence object mirrors the fields structure, with each field key mapped to a confidence object containing score (0.0–1.0), flag (one of high, medium, low, or review), and an optional reason string for review-flagged fields. The reason string is human-readable and is intended to be surfaced in your UI — it explains why confidence is low, such as "handwritten amendment detected over printed field" or "value inconsistent with declared quantity range."
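To make the mirrored layout concrete, the sketch below pairs each extracted value with its confidence entry. The sample payload is hypothetical: the values are invented, the contents of document are left empty because they are not described here, and the representation of total_value as a string is an assumption. Only top-level scalar fields are handled; line_items would need its own walk.

```python
import json

# Hypothetical sample response. Values are invented for illustration; the
# structure follows the document / fields / confidence layout described above.
SAMPLE = json.loads("""
{
  "document": {},
  "fields": {
    "invoice_number": "INV-1042",
    "currency": "USD",
    "total_value": "18450.00"
  },
  "confidence": {
    "invoice_number": {"score": 0.98, "flag": "high"},
    "currency": {"score": 0.91, "flag": "high"},
    "total_value": {
      "score": 0.42,
      "flag": "review",
      "reason": "handwritten amendment detected over printed field"
    }
  }
}
""")


def pair_fields(response):
    """Return {name: (value, score, flag, reason)} for top-level scalar fields."""
    paired = {}
    for name, value in response["fields"].items():
        if isinstance(value, (list, dict)):
            continue  # nested fields such as line_items are skipped here
        conf = response["confidence"].get(name, {})
        paired[name] = (
            value,
            conf.get("score"),
            conf.get("flag"),
            conf.get("reason"),  # None unless the field carries a reason
        )
    return paired
```

Calling pair_fields(SAMPLE) gives one tuple per field, which is usually the shape a form renderer wants: the value to pre-populate plus the flag and reason to display next to it.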

Handling Low-Confidence Fields in Your UI

This is the integration decision that has the most impact on the quality of your operators' review experience. The extraction pipeline returns a confidence score and flag for every field; what your integration does with those flags determines whether the human review step is efficient or frustrating.

The simplest approach is to render all extracted fields as pre-populated form inputs, with fields flagged review or low highlighted in your UI (yellow border, warning icon, whatever your design system uses) and automatically focused or scrolled to at the start of the review step. This is better than hiding low-confidence fields from the operator or treating all extracted values as equally reliable — but it still requires the operator to actively process every flagged field.

A more ergonomic pattern, which we've seen work well in integrations where the operator is reviewing 15–30 entries per day, is to use a diff-style review step: show only the flagged fields in a focused review panel, one at a time, with the raw document region highlighted alongside the extracted value. The operator either confirms the extracted value or corrects it; the correction feeds back to the entry record. Non-flagged fields are auto-populated without requiring explicit confirmation. Because the operator handles only the flagged fields, each entry takes less time to review than it would if every field had to be checked.
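One way to sketch the diff-style split in Python is shown below. The function names are hypothetical, and two choices are ours rather than anything the API prescribes: a field with no confidence entry is treated as needing review, and the queue is ordered with the lowest score first.

```python
NEEDS_REVIEW = {"low", "review"}


def split_for_review(fields, confidence):
    """Partition top-level fields into auto-populated values and a review queue."""
    auto, queue = {}, []
    for name, value in fields.items():
        conf = confidence.get(name, {})
        # Assumption: a field without a confidence entry fails safe into review.
        flag = conf.get("flag", "review")
        if flag in NEEDS_REVIEW:
            queue.append({
                "field": name,
                "extracted": value,
                "score": conf.get("score"),
                "reason": conf.get("reason"),
            })
        else:
            auto[name] = value
    # Show the least certain value first; missing scores sort to the front.
    queue.sort(key=lambda item: item["score"] if item["score"] is not None else 0.0)
    return auto, queue


def apply_review(auto, decisions):
    """Merge operator decisions (field -> confirmed or corrected value) into the record."""
    record = dict(auto)
    record.update(decisions)
    return record
```

The review panel iterates over queue, collects a confirmed or corrected value for each item, and passes the result to apply_review. Fields in auto never appear in the panel.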

For the hs_code field specifically, we recommend a different treatment regardless of confidence score: always surface the extracted HS code with the matched HTS description text alongside it, so the operator can confirm that the description matches what the goods actually are. An HS code can be extracted with high confidence — it's structurally valid, it appeared clearly in the document — while still being wrong because the goods description on the invoice is too vague to support a correct classification. High OCR confidence on the extracted value is not the same as high classification confidence, and the field presentation should make that distinction visible.
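A small sketch of that presentation rule follows. It assumes your integration has its own source of HTS description text, represented here by a hypothetical in-memory lookup with a single paraphrased entry for illustration; the row layout is also our invention. The point is that every line item produces a confirmation row, whatever its confidence flag.

```python
# Hypothetical lookup; a real integration would use its own tariff data source.
# The description below is a paraphrase included only for illustration.
HTS_DESCRIPTIONS = {
    "6109.10": "T-shirts, singlets and similar garments, knitted or crocheted, of cotton",
}


def hs_review_rows(line_items, hts_descriptions):
    """Build one confirmation row per line item, regardless of confidence."""
    rows = []
    for index, item in enumerate(line_items):
        code = item.get("hs_code")
        rows.append({
            "line": index + 1,
            "invoice_description": item.get("description"),
            "hs_code": code,
            "hts_description": hts_descriptions.get(code, "no description on file"),
            # Always true: extraction confidence is not classification confidence.
            "requires_confirmation": True,
        })
    return rows
```

Rendering invoice_description and hts_description side by side lets the operator spot a structurally valid code that does not match the goods.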

Webhook Delivery for Async Workflows

In a synchronous integration, you submit a document and wait for the response in the same request cycle. This POST-and-wait approach works for most document types. Typical end-to-end processing time is 2–8 seconds depending on document complexity, page count, and whether classification is running. For a single-page commercial invoice, you'll usually be under 3 seconds.

For higher-volume integrations where you're submitting documents in batches, or for workflows where document submission happens asynchronously from review (for example, documents arriving via email are ingested automatically and prepared for morning review), the webhook delivery model is more appropriate. Instead of waiting for the response, you POST to /v1/extract/async, receive a job ID immediately, and configure a webhook endpoint to receive the completed extraction when processing finishes.

Webhook payloads have the same schema as synchronous responses, with an additional job_id field for correlation. Your webhook endpoint should return a 2xx status (typically HTTP 200) within 5 seconds; if it times out or returns a non-2xx status, Tradevynt will retry with exponential backoff for up to 5 attempts. Implement idempotent webhook handling using the job ID as the key: retries carry the same job ID, and you don't want duplicate entries created in your system if a webhook is delivered more than once.
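Here is a minimal sketch of idempotent handling, assuming a SQLite table keyed on job_id. The table name, function names, and return shape are hypothetical; the technique is simply "insert if absent, and only act on the first insert."

```python
import json
import sqlite3


def open_store(path=":memory:"):
    """Open a store that records which job IDs have already been processed."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS processed_jobs ("
        "job_id TEXT PRIMARY KEY, payload TEXT NOT NULL)"
    )
    return db


def handle_webhook(db, body):
    """Process one webhook delivery; return (http_status, created)."""
    payload = json.loads(body)
    job_id = payload["job_id"]
    with db:  # commits on success, rolls back on error
        cursor = db.execute(
            "INSERT OR IGNORE INTO processed_jobs (job_id, payload) VALUES (?, ?)",
            (job_id, json.dumps(payload)),
        )
    # rowcount is 1 for the first delivery and 0 for any repeat of the same job.
    created = cursor.rowcount == 1
    return 200, created
```

Only a delivery with created set to True should go on to create an entry record; repeats are acknowledged and otherwise ignored. To stay inside the 5-second window, it is usually safer to store the payload and acknowledge first, then do any slower downstream work in a background job.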

Multi-Document Submission and Cross-Document Validation

A single customs entry often draws from multiple source documents: the commercial invoice provides value and classification data, the bill of lading provides routing and piece count, the packing list provides per-carton detail. Submitting these as separate extraction requests and merging the results in your code works, but it misses an opportunity: Tradevynt can cross-validate fields across documents when multiple documents are submitted as a set.

The /v1/extract/set endpoint accepts multiple documents in a single request, each labeled with its document type. The response includes the per-document extractions plus a cross_validation section that flags discrepancies: a piece count on the bill of lading that doesn't match the packing list total, a declared value on the commercial invoice that doesn't match the total on the packing list line items, a country of origin that differs between the invoice and the BOL. These are exactly the discrepancies that cause CBP to hold entries, and catching them before filing is the point.

Cross-document validation requires that all documents in the set relate to the same shipment. The system uses internal signals (shipper/consignee match, container number, invoice references) to verify the documents are related before running cross-validation. If the documents don't appear to relate to the same shipment, the response flags this in the cross_validation.set_coherence field rather than running potentially misleading field comparisons.
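The sketch below shows one way an integration might turn that section into operator-facing messages. Everything about the payload shape beyond the cross_validation and set_coherence names is an assumption made for illustration: the "coherent" status value, the discrepancies list, and its keys are hypothetical, so consult the API reference for the real schema before relying on any of it.

```python
def summarize_cross_validation(response):
    """Turn a (hypothetically shaped) cross_validation section into messages."""
    cv = response.get("cross_validation", {})

    # Assumption: set_coherence is a status string and "coherent" means the
    # documents were judged to belong to the same shipment.
    if cv.get("set_coherence") != "coherent":
        return ["Documents may not belong to the same shipment; field comparisons skipped."]

    messages = []
    # Assumption: discrepancies is a list of dicts with the keys used below.
    for d in cv.get("discrepancies", []):
        messages.append(
            f'{d["field"]}: '
            f'{d["left_document"]}={d["left_value"]} vs '
            f'{d["right_document"]}={d["right_value"]}'
        )
    return messages
```

Checking coherence before reading discrepancies mirrors the behavior described above: when the set does not look like one shipment, the integration should surface that fact instead of presenting field comparisons.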

Rate Limits and Scaling Considerations

The API is rate-limited at the account level. Default limits are adequate for operations processing up to several hundred entries per day in a typical distribution pattern. For operations running batch ingestion — for example, overnight processing of the day's document inbox before the morning review queue — sustained submission rates may approach or exceed default limits. Contact us to discuss higher rate-limit tiers if your integration pattern requires them; the limit is adjustable for accounts with predictable volume profiles.

One design note for integrations that are processing email-attached documents: don't submit every email attachment as an extraction job. Pre-filter for file types and minimum file size to avoid submitting email signatures, tracking pixel images, or tiny attachment fragments that will return trivial or error responses. The extraction pipeline handles non-document files gracefully (returning an appropriate error status rather than failing silently), but filtering obviously non-document content before submission reduces unnecessary API calls and makes your webhook error handling simpler.
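A pre-filter along these lines can be very small. In the sketch below, the extension list and the 25MB ceiling follow the accepted file types and size limit stated earlier in this post; the 10 KB floor is an arbitrary assumption that you should tune against your own inbox.

```python
import os

ALLOWED_EXTENSIONS = {".pdf", ".jpg", ".jpeg", ".png", ".tif", ".tiff"}
MAX_BYTES = 25 * 1024 * 1024  # upload limit for /v1/extract
MIN_BYTES = 10 * 1024         # assumed floor to skip signatures and tracking pixels


def should_submit(filename, size_bytes):
    """Return True if an email attachment looks worth sending for extraction."""
    extension = os.path.splitext(filename)[1].lower()
    return extension in ALLOWED_EXTENSIONS and MIN_BYTES <= size_bytes <= MAX_BYTES
```

For example, should_submit("invoice.PDF", 180_000) passes, while a 43-byte pixel.png or a .docx attachment is dropped before it ever becomes an API call.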

The full API reference is available at docs.tradevynt.com/api with interactive examples and schema definitions for every request and response type. For integration support or questions about field behavior on specific document types, reach us at [email protected] — we read every technical question and typically respond within one business day.
