aws-textract-client
1.1.1 • Public • Published

AWS Textract Client

A helper class for performing server-side AWS Textract actions.

Install

npm i aws-textract-client

Features

  • Upload documents to an S3 bucket
  • Create and delete SNS topics
  • Create and delete SQS queues and subscribe them to an SNS topic (required for asynchronous multi-page processing)
  • Run various AWS Textract actions
    • Analyze document
    • Detect document text
    • Analyze invoices
  • Process and simplify results

Example

Create SNS Topic and SQS Queue

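Topic and queue names used with Textract should start with AmazonTextract. The AWS managed AmazonTextractServiceRole policy only allows publishing to SNS topics whose names carry that prefix.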

import { AWSTextractClient } from "aws-textract-client";

// config contains AWS credentials settings and is used for all client instances
const textractClient = new AWSTextractClient(config);

// create topic and queue once
const topicArn = await textractClient.createTopic("AmazonTextractMyTopic");

const queueUrl = await textractClient.createQueue(
  "AmazonTextractMyQueue",
  topicArn
);
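The naming rule above is easy to forget, so a small guard can help. The following is a minimal sketch; `withTextractPrefix` is a hypothetical helper and not part of aws-textract-client.

```typescript
// Hypothetical helper, not part of aws-textract-client.
const TEXTRACT_PREFIX = "AmazonTextract";

// Returns the name unchanged if it already carries the prefix,
// otherwise prepends the prefix.
function withTextractPrefix(name: string): string {
  const trimmed = name.trim();
  if (trimmed.length === 0) {
    throw new Error("Resource name must not be empty");
  }
  return trimmed.startsWith(TEXTRACT_PREFIX)
    ? trimmed
    : `${TEXTRACT_PREFIX}${trimmed}`;
}
```

The result could then be passed on, for example `textractClient.createTopic(withTextractPrefix("MyTopic"))`.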

Process Invoice

import { createReadStream } from "fs";
import { AWSTextractClient } from "aws-textract-client";

// config contains AWS credentials settings and is used for all client instances
const textractClient = new AWSTextractClient(config);

// set role and topic of notification channel
textractClient.setNotificationChannel(roleArn, topicArn);

// upload document to S3 bucket
await textractClient.uploadDocument(
  S3_BUCKET,
  file,
  createReadStream(pathToYourFile)
);

// process document and get results
const results = await textractClient.processDocument(
  AWSTextractClient.TYPE_EXPENSE,
  S3_BUCKET,
  file,
  queueUrl
);
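For multi-page documents, Textract reports job completion through the SNS topic, and the subscribed SQS queue receives that notification. How this package reads the queue internally is not documented here. As a rough sketch of what such a message looks like, the hypothetical helper below unwraps an SQS message body, assuming the standard SNS envelope and Textract's completion payload (`JobId`, `Status`, `API`).

```typescript
// Completion payload that Textract publishes to the SNS topic.
interface TextractNotification {
  JobId: string;
  Status: string; // e.g. "SUCCEEDED" or "FAILED"
  API: string;
  DocumentLocation?: { S3Bucket: string; S3ObjectName: string };
}

// Hypothetical helper, not part of aws-textract-client.
function parseTextractNotification(sqsBody: string): TextractNotification {
  const envelope = JSON.parse(sqsBody) as { Message?: string };
  // Without raw message delivery, SNS wraps the payload in a "Message" string.
  const payload =
    typeof envelope.Message === "string"
      ? JSON.parse(envelope.Message)
      : envelope;
  if (typeof payload.JobId !== "string" || typeof payload.Status !== "string") {
    throw new Error("Not a Textract job notification");
  }
  return payload as TextractNotification;
}
```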

Results

Each result carries up to three confidence values: usually one for the detected label, one for the detected value, and additionally one for the mapping to a pre-trained invoice field.
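A common use of these confidence values is to discard uncertain fields. The sketch below assumes each summary field follows the shape of Textract's `ExpenseField` (`Type`, `LabelDetection`, `ValueDetection`, each with a `Confidence` from 0 to 100). Whether the simplified results of this package keep exactly that shape is an assumption, and both helpers are hypothetical.

```typescript
// Assumed shape, modeled on Textract's ExpenseField; not confirmed for this package.
interface Detection {
  Text?: string;
  Confidence?: number; // 0 to 100
}

interface ExpenseField {
  Type?: Detection;           // pre-trained field mapping
  LabelDetection?: Detection; // label as printed on the document
  ValueDetection?: Detection; // value as printed on the document
}

// Hypothetical helper: collects whichever confidence values are present.
function confidencesOf(field: ExpenseField): number[] {
  return [field.Type, field.LabelDetection, field.ValueDetection]
    .map((d) => d?.Confidence)
    .filter((c): c is number => typeof c === "number");
}

// True when at least one score exists and none falls below the threshold.
function isReliable(field: ExpenseField, minConfidence: number): boolean {
  const scores = confidencesOf(field);
  return scores.length > 0 && scores.every((c) => c >= minConfidence);
}
```

Requiring every present score to pass is the conservative choice. Checking only the value confidence would be a reasonable alternative when labels are often missing.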

The results also include the position of each piece of text on the document in the form of bounding boxes.
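Textract itself expresses box coordinates as ratios of the page width and height, so they usually need scaling before they can be drawn onto a rendered page. A minimal sketch, assuming the results keep Textract's `BoundingBox` shape (`Left`, `Top`, `Width`, `Height`); `toPixelBox` is a hypothetical helper.

```typescript
// Textract-style box: all values are ratios between 0 and 1.
interface BoundingBox {
  Left: number;
  Top: number;
  Width: number;
  Height: number;
}

interface PixelBox {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Hypothetical helper: scales a ratio-based box to pixel coordinates.
function toPixelBox(
  box: BoundingBox,
  pageWidth: number,
  pageHeight: number
): PixelBox {
  return {
    x: Math.round(box.Left * pageWidth),
    y: Math.round(box.Top * pageHeight),
    width: Math.round(box.Width * pageWidth),
    height: Math.round(box.Height * pageHeight),
  };
}
```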

Example response:

{
  "$metadata": {
    "httpStatusCode": 200,
    "requestId": "d15315f4-239b-40d8-87f2-8e2731a31e4c",
    "attempts": 1,
    "totalRetryDelay": 0
  },
  "AnalyzeExpenseModelVersion": "1.0",
  "DocumentMetadata": {
    "Pages": 1
  },
  "ExpenseDocuments": [
    {
      "Blocks": [...],
      "ExpenseIndex": 1,
      "LineItemGroups": [...],
      "SummaryFields": [...]
    }
  ],
  "JobStatus": "SUCCEEDED"
}

The JobStatus field is only present for multi-page (asynchronous) processing.
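Because the field is optional, code that consumes the results should not treat a missing JobStatus as a failure. A small sketch of such a check follows; the `ExpenseResult` type and `hasSucceeded` helper are hypothetical, and the status values are those Textract uses for asynchronous jobs.

```typescript
type JobStatus = "IN_PROGRESS" | "SUCCEEDED" | "FAILED" | "PARTIAL_SUCCESS";

// Hypothetical, reduced view of the response shown above.
interface ExpenseResult {
  JobStatus?: JobStatus;
  ExpenseDocuments?: unknown[];
}

// Hypothetical helper: single-page responses carry no JobStatus,
// so an absent field counts as success here.
function hasSucceeded(result: ExpenseResult): boolean {
  return result.JobStatus === undefined || result.JobStatus === "SUCCEEDED";
}
```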

Package details

  • Weekly downloads: 35
  • Version: 1.1.1
  • License: ISC
  • Unpacked size: 52.8 kB
  • Total files: 12
  • Collaborators: markus.michalsky