Document Processing Income

This guide covers adding document processing for income and employment verification.

Overview

Truv's document processing solution helps you collect user documents. These files are then processed to extract structured data and analysed for fraud. The two points below cover common reasons for using document processing:

  • Backup plan for when a payroll connection attempt fails or user data can't be processed
  • Standalone solution to collect user documents and parse them into a structured data schema

Configuring your document processing

The points below cover user information to collect when integrating your document processing solution.

  • Document type required, such as paystubs, W2, and 1099
  • Number of documents required for each type
  • Applicant first_name and last_name

📘

Note

For 1099 tax documents, Truv supports parsing the form formats from any year after 2021. This includes the following 1099 forms:

  • 1099-DIV, 1099-G, 1099-INT, 1099-MISC, 1099-NEC, 1099-R

In addition, you can add a specific message for your users displayed in Truv Bridge. Contact Truv for additional configuration settings.

Bridge Token specifications

When creating a Bridge Token for the user, document upload requires additional values in the request. Set product_type to income and data_sources to ["docs"] in the request body, as in the cURL sample below.

curl --request POST \
     --url https://prod.truv.com/v1/users/{user_id}/tokens/ \
     --header 'X-Access-Client-Id: {{client_id}}' \
     --header 'X-Access-Secret: {{access_key}}' \
     --header 'accept: application/json' \
     --header 'content-type: application/json' \
     --data '
{
     "product_type": "income",
     "data_sources": ["docs"],
     "tracking_info": "any data for tracking current transaction"
}
'
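
The same request can be sketched in Python using only the standard library. This is a minimal sketch that mirrors the cURL sample above. The function name build_bridge_token_request and the placeholder credential values are illustrative assumptions, not part of any Truv SDK. The function builds the request without sending it.

```python
import json
import urllib.request

# Base URL taken from the cURL sample above.
TRUV_API_BASE = "https://prod.truv.com/v1"


def build_bridge_token_request(user_id, client_id, access_key, tracking_info=None):
    """Build (but do not send) a Bridge Token request for document upload."""
    body = {
        "product_type": "income",
        "data_sources": ["docs"],
    }
    # tracking_info is included only when the caller supplies a value.
    if tracking_info is not None:
        body["tracking_info"] = tracking_info

    return urllib.request.Request(
        url=f"{TRUV_API_BASE}/users/{user_id}/tokens/",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "X-Access-Client-Id": client_id,
            "X-Access-Secret": access_key,
            "accept": "application/json",
            "content-type": "application/json",
        },
        method="POST",
    )


# Placeholder values; substitute your own user ID and credentials.
request = build_bridge_token_request("user-id", "client-id", "access-key", "user123456")
print(request.get_method(), request.full_url)
```

To send the request, pass the returned object to urllib.request.urlopen and decode the JSON response body.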

Example response

The JSON data below contains a sample report payload when including document uploads.

{
  "id": "24d7e80942ce4ad58a93f70ce4115f5c",
  "status": "new",
  "finished_at": "2021-04-06T11:30:00Z",
  "completed_at": "2021-04-06 11:30:00+00:00",
  "access_token": "48427a36d43c4d5aa6324bc06c692456",
  "tracking_info": "user123456",
  "refresh_status": null,
  "employments": [
    {
      "income": null,
      "income_unit": null,
      "pay_frequency": null,
      "statements": [
        {
          "id": "24d7e80942ce4ad58a93f70ce4115f5c",
          "check_number": null,
          "pay_date": "2018-05-15",
          "net_pay": "11500.32",
          "net_pay_ytd": "31980.64",
          "gross_pay": "13900.11",
          "gross_pay_ytd": "49200.00",
          "bonus": "100.00",
          "commission": "12000.00",
          "hours": "40.00",
          "basis_of_pay": "S",
          "period_start": "2018-05-01",
          "period_end": "2018-05-15",
          "regular": "1695.11",
          "regular_ytd": "23000.00",
          "other_pay_ytd": "700.00",
          "bonus_ytd": "1000.00",
          "commission_ytd": "24000.00",
          "overtime": "45.00",
          "overtime_ytd": "500.00",
          "other_pay": "60.00",
          "earnings": [
            {
              "name": "Regular",
              "amount": "1935.77",
              "category": "regular",
              "rate": null,
              "units": null
            },
            {
              "name": "Overtime",
              "amount": "60.58",
              "category": "overtime",
              "rate": "30.29",
              "units": "2"
            }
          ],
          "earnings_ytd": [
            {
              "name": "Regular",
              "amount": "1935.77",
              "category": "regular",
              "rate": null,
              "units": null
            },
            {
              "name": "Overtime",
              "amount": "60.58",
              "category": "overtime",
              "rate": "30.29",
              "units": "2"
            }
          ],
          "deductions": [
            {
              "amount": "127.01",
              "category": "socialsec",
              "name": "Social Security Tax"
            },
            {
              "amount": "46.23",
              "category": "state",
              "name": "VA State Income Tax"
            },
            {
              "amount": "29.7",
              "category": "medicare",
              "name": "Medicare Tax"
            }
          ],
          "deductions_ytd": [
            {
              "amount": "127.01",
              "category": "socialsec",
              "name": "Social Security Tax"
            },
            {
              "amount": "46.23",
              "category": "state",
              "name": "VA State Income Tax"
            },
            {
              "amount": "29.7",
              "category": "medicare",
              "name": "Medicare Tax"
            }
          ],
          "md5sum": "03639d6a6624f69a54a88ea90bd25e9d",
          "file": "https://citadelid-resources.s3-us-west-2.amazonaws.com/paystub_sample.pdf",
          "derived_fields": [
            "basis_of_pay"
          ],
          "missing_data_fields": [
            "earnings_ytd"
          ]
        }
      ],
      "annual_income_summary": [
        {
          "id": "24d7e80942ce4ad58a93f70ce4115f5c",
          "year": 2018,
          "regular": "23000.00",
          "bonus": "1000.00",
          "commission": "24000.00",
          "overtime": "500.00",
          "other_pay": "700.00",
          "net_pay": "31980.64",
          "gross_pay": "49200.00"
        }
      ],
      "bank_accounts": [
      ],
      "w2s": [
        {
          "file": "https://citadelid-resources.s3-us-west-2.amazonaws.com/W2_sample.pdf",
          "md5sum": "f65e30c39124ad707ac4b3aeaee923a7",
          "year": 2020,
          "wages": "900.50",
          "federal_tax": "75.01",
          "social_security_wages": "900.50",
          "social_security_tax": "56.30",
          "medicare_wages": "900.50",
          "medicare_tax": "13.15"
        }
      ],
      "id": "24d7e80942ce4ad58a93f70ce4115f5c",
      "is_active": false,
      "job_title": null,
      "job_type": null,
      "start_date": "2018-01-01",
      "original_hire_date": null,
      "end_date": "2022-06-16",
      "external_last_updated": "2022-06-16",
      "dates_from_statements": true,
      "derived_fields": [
        "is_active"
      ],
      "missing_data_fields": [
      ],
      "manager_name": "Jenny McDouglas",
      "profile": {
        "first_name": "John",
        "last_name": "Doe",
        "middle_initials": "K",
        "email": null,
        "ssn": "6789",
        "date_of_birth": null,
        "home_address": {
          "street": "1 Morgan Ave",
          "city": "Los Angeles",
          "state": "CA",
          "zip": "90210",
          "country": "US"
        }
      },
      "company": {
        "name": "Facebook Demo",
        "address": {
          "street": "1 Morgan Ave",
          "city": "Los Angeles",
          "state": "CA",
          "zip": "90210",
          "country": "US"
        },
        "phone": null
      }
    }
  ],
  "pdf_report": "https://citadelid-resources.s3-us-west-2.amazonaws.com/report.pdf",
  "provider": "doc_upload",
  "data_source": "docs"
}
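
As an illustration of reading this payload, the sketch below walks employments and their statements and collects a few fields per pay statement. The field names come from the sample above; the function name summarize_statements and the trimmed sample_report are assumptions made for the example. Monetary values arrive as strings, so the sketch converts them with Decimal instead of float.

```python
from decimal import Decimal


def summarize_statements(report):
    """Return one summary dict per pay statement found in a report payload."""
    summaries = []
    for employment in report.get("employments", []):
        for statement in employment.get("statements", []):
            gross = statement.get("gross_pay")
            net = statement.get("net_pay")
            summaries.append({
                "pay_date": statement.get("pay_date"),
                # Amounts are strings (or null); Decimal avoids float rounding.
                "gross_pay": Decimal(gross) if gross is not None else None,
                "net_pay": Decimal(net) if net is not None else None,
                "missing_data_fields": statement.get("missing_data_fields", []),
            })
    return summaries


# Trimmed-down version of the sample payload above.
sample_report = {
    "data_source": "docs",
    "employments": [
        {
            "statements": [
                {
                    "pay_date": "2018-05-15",
                    "net_pay": "11500.32",
                    "gross_pay": "13900.11",
                    "missing_data_fields": ["earnings_ytd"],
                }
            ]
        }
    ],
}

for summary in summarize_statements(sample_report):
    print(summary)
```

Checking missing_data_fields on each statement is a simple way to spot values the parser could not extract before relying on them.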

Testing document upload solutions

When implementing VOIE in your workflow for document uploads, use the Testing guide to try different scenarios within the sandbox. The sections below and the Document processing testing page provide PDFs to upload and test various scenarios.

Document processing testing

When testing your document processing integration, use the PDF downloads in the list below, which cover different scenarios for sandbox use.

Upload the documents to test different sandbox responses. Pay statements and tax documents return data when successful. Unsuccessful uploads respond with status updates.

Suspicious document detection

Documents from uploads may contain fraudulent, inconsistent, or unacceptable data. When you encounter these issues, mark the specific instances for review. Additional analysis and attention help prevent malicious activity.

View the table below for various scenarios and PDF examples.

| Scenario | Description | Downloads |
|---|---|---|
| Tampered documents | Information is falsified or manipulated | Tampered 1, Tampered 2, Tampered 3 |
| Different Social Security Numbers | Personal information is inconsistent | SSN 1, SSN 2, SSN 3 |
| Different company names | Company information is inconsistent | Company 1, Company 2, Company 3 |
| Different applicant names | Personal information is inconsistent | Applicant 1, Applicant 2, Applicant 3 |
| Documents without data, or with invalid data | Information is missing or unable to be parsed | No data 1, No data 2, No data 3 |
📘

Note

Test scenarios use the file name to return results. Testing ignores the file contents when in the sandbox.


Suspicious Flag (is_suspicious)

Overview

When processing documents, our system may flag them as suspicious while still completing the task. The is_suspicious flag indicates potential document manipulation or irregularities that warrant additional review, but don't necessarily indicate definitive fraud.

When Documents Are Marked as Suspicious

We mark is_suspicious = true when we complete the task but detect potential issues in the following categories:

PDF Metadata Indicators

Documents are flagged as suspicious when PDF metadata indicates potential manipulation. Our system checks against a comprehensive list of PDF software commonly used for document editing, including:

  • FoxitPDF
  • Adobe Scan
  • CamScanner
  • Other document editing applications

Paystub Date Inconsistencies

For paystubs specifically, we flag documents when pay dates don't align with expected timelines:

  • Too Early: Pay date occurs before the pay period start date
  • Too Late: Pay date is 30 or more days after the pay period end date
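
The two date rules above can be expressed as a short check. This is a sketch of the rules as stated in this guide, not Truv's actual implementation; the function name pay_date_issue and its return labels are assumptions for illustration. The inputs are ISO-formatted date strings such as the pay_date, period_start, and period_end values in the report payload.

```python
from datetime import date, timedelta

# Threshold stated above: 30 or more days after the period end is too late.
LATE_THRESHOLD = timedelta(days=30)


def pay_date_issue(pay_date, period_start, period_end):
    """Return "too_early", "too_late", or None for ISO-formatted date strings."""
    paid = date.fromisoformat(pay_date)
    start = date.fromisoformat(period_start)
    end = date.fromisoformat(period_end)

    if paid < start:
        return "too_early"
    if paid - end >= LATE_THRESHOLD:
        return "too_late"
    return None


# A pay date on the last day of the pay period raises no issue.
print(pay_date_issue("2018-05-15", "2018-05-01", "2018-05-15"))
# A pay date before the period start is reported as too early.
print(pay_date_issue("2018-04-30", "2018-05-01", "2018-05-15"))
```

Running a check like this on your side before upload can help explain why a paystub came back flagged.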

Browser-Generated PDFs

Documents created through browser PDF generation are flagged as suspicious because they indicate the original content may have been modified before conversion to PDF.

Most common scenario: Using browser "Print → Save as PDF" functionality on non-PDF files (images, documents, HTML pages). This creates a new PDF from potentially altered source material.

Not flagged: Direct downloads of existing PDF files from browsers.

Rare cases: Using "Print → Save as PDF" on PDFs opened in browsers may occasionally trigger this flag, though testing shows this is uncommon across most browser/OS combinations.

When Documents Are Failed (Not Just Flagged)

In cases where we have high confidence of fraudulent activity, we fail the task entirely rather than just marking it suspicious:

  • Known fraud templates: Our database returns very high confidence that the document matches a known fraudulent template
  • Obvious manipulation tools: PDF metadata indicates use of tools clearly intended for document manipulation, such as:
    • iLovePDF
    • Canva
    • Similar document editing platforms

Important Notes

  • Documents flagged as is_suspicious = true are still processed and results are returned
  • The suspicious flag serves as an indicator for additional review, not definitive fraud detection
  • Failed documents (due to high fraud confidence) do not return results and require resubmission with valid documentation
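
One way to act on these three outcomes is sketched below. This guide does not show where is_suspicious appears in the response, so the input here is a hypothetical, simplified record: the failed key, the dictionary shape, and the returned action names are all assumptions made for illustration.

```python
def next_step(result):
    """Map a simplified document result to a follow-up action.

    `result` is a hypothetical dict with boolean "failed" and
    "is_suspicious" keys; adapt the lookups to the real payload.
    """
    if result.get("failed"):
        # Failed documents return no results and need resubmission.
        return "request_resubmission"
    if result.get("is_suspicious"):
        # Results are still returned, but warrant additional review.
        return "manual_review"
    return "accept"


print(next_step({"failed": False, "is_suspicious": True}))
```

Checking for failure first reflects that a failed document has no results to review, so the suspicious flag only matters for documents that were processed.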

What’s Next

Use the guide below to begin implementing Truv VOIE into your workflow.