Get PDF Text Layer
PDF - Get Text Layer
This operation extracts the text layer from the provided PDF document.
Endpoint
POST /api/v1/Pdf/GetPdfTextLayer
Request Parameters
| Name | Type | Required | Default | Description |
|---|---|---|---|---|
| Filename | string | Yes | - | Filename of the source PDF file |
| FileContent | string | Yes | - | Base64-encoded PDF content |
| StartPage | integer | No | 1 | Page number from which to begin text extraction |
| EndPage | integer | No | Last page | Page number on which to end text extraction |
| Pages | string | No | - | Comma separated list of pages or page ranges (e.g., “1,3,5-7”) |
| TextEncodingType | string | No | “UTF8” | Encoding type used for text extraction. Options: “UTF8”, “Latin1”, “BigEndianUnicode”, “UTF16”, “ASCII” |
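
The parameters above can be assembled into a request roughly as follows. This is a minimal sketch, not a definitive client: the base URL `https://api.example.com` is a placeholder, the JSON request body is an assumption, and authentication is left out because this page does not describe it. The helper names `build_payload` and `build_request` are hypothetical.

```python
import base64
import json
import urllib.request

# Hypothetical base URL; replace with the host of your own API instance.
BASE_URL = "https://api.example.com"
ENDPOINT = "/api/v1/Pdf/GetPdfTextLayer"


def build_payload(filename, pdf_bytes, start_page=None, end_page=None,
                  pages=None, text_encoding_type=None):
    """Build the request body; optional parameters are omitted when unset."""
    payload = {
        "Filename": filename,
        # FileContent must be the Base64-encoded PDF content.
        "FileContent": base64.b64encode(pdf_bytes).decode("ascii"),
    }
    optional = {
        "StartPage": start_page,
        "EndPage": end_page,
        "Pages": pages,
        "TextEncodingType": text_encoding_type,
    }
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


def build_request(payload, base_url=BASE_URL):
    """Prepare (but do not send) a POST request with a JSON body (assumed format)."""
    return urllib.request.Request(
        base_url + ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = build_payload("report.pdf", b"%PDF-1.7 sample bytes", pages="1,3,5-7")
request = build_request(payload)
```

Passing `request` to `urllib.request.urlopen` would send it. Any authentication header your deployment requires has to be added first.
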
Response
| Name | Type | Description |
|---|---|---|
| textLayer | string | The text layer extracted from the PDF document |
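
Reading the response could look like the sketch below. It assumes the response body is a JSON object with `textLayer` as a top-level field. The sample body is made up for illustration, and `extract_text_layer` is a hypothetical helper name.

```python
import json


def extract_text_layer(response_body):
    """Return the textLayer field from a JSON response body."""
    data = json.loads(response_body)
    if "textLayer" not in data:
        raise KeyError("response does not contain a 'textLayer' field")
    return data["textLayer"]


# Made-up response body for illustration only.
sample_body = '{"textLayer": "Invoice 2024\\nTotal: 100.00"}'
text = extract_text_layer(sample_body)
lines = text.splitlines()
```
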
Implementation Details
The operation extracts text content from the specified pages of a PDF document. It processes the document and identifies the text layer, which contains machine-readable text.
Usage Notes
- If both page ranges and specific pages are provided, they will be combined
- The operation handles PDF files with embedded text layers
- For scanned documents without a text layer, OCR processing is required (not part of this operation)
- The encoding type parameter allows handling different character encodings in the PDF
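
To preview which pages a request might cover, the `Pages` format can be expanded client-side. This is an illustrative sketch only. The service's actual combination logic is not documented here, so treating "combined" as a set union of the `StartPage`..`EndPage` range and the `Pages` list is an assumption. `parse_pages` and `combined_pages` are hypothetical helpers.

```python
def parse_pages(spec):
    """Expand a string such as "1,3,5-7" into a sorted list of page numbers."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = (int(n) for n in part.split("-", 1))
            if first > last:
                raise ValueError(f"invalid range: {part}")
            pages.update(range(first, last + 1))
        else:
            pages.add(int(part))
    return sorted(pages)


def combined_pages(start_page, end_page, spec):
    """Union of a StartPage..EndPage range and a Pages list (assumed semantics)."""
    return sorted(set(range(start_page, end_page + 1)) | set(parse_pages(spec)))


selected = parse_pages("1,3,5-7")
merged = combined_pages(1, 2, "5-7")
```
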
Credit Cost
Cost: 1 credit(s) per 5 pages
Note: Cost depends on the number of pages in the document
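
A rough cost estimate can be derived from the rate above. The sketch assumes that a partial block of 5 pages is rounded up to a full credit. This page does not state the rounding rule, so check actual charges against your account. `estimate_credits` is a hypothetical helper.

```python
import math

PAGES_PER_CREDIT = 5  # from the rate above: 1 credit per 5 pages


def estimate_credits(page_count):
    """Estimate credits for a document, rounding partial blocks up (assumption)."""
    if page_count < 0:
        raise ValueError("page_count must not be negative")
    return math.ceil(page_count / PAGES_PER_CREDIT)


credits_for_12_pages = estimate_credits(12)
```
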