Get PDF Text Layer

PDF - Get Text Layer

This operation extracts the text layer from the PDF document provided.

Endpoint

POST /api/v1/Pdf/GetPdfTextLayer

Request Parameters

Name	Type	Required	Default	Description
Filename	string	Yes	-	Filename of the source PDF file
FileContent	string	Yes	-	Base64-encoded PDF content
StartPage	integer	No	1	Page number from which to begin text extraction
EndPage	integer	No	Last page	Page number on which to end text extraction
Pages	string	No	-	Comma-separated list of pages or page ranges (e.g., “1,3,5-7”)
TextEncodingType	string	No	“UTF8”	Encoding type used for text extraction. Options: “UTF8”, “Latin1”, “BigEndianUnicode”, “UTF16”, “ASCII”
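
The parameters above can be assembled into a request as in the following minimal sketch. It assumes the body is sent as JSON, which this page does not state. The host api.example.com and the helper names build_payload and build_request are hypothetical, and any authentication header the service requires is omitted. The request is built but not sent.

```python
import base64
import json
import urllib.request

# Hypothetical host; only the path below comes from this page.
BASE_URL = "https://api.example.com"
ENDPOINT = "/api/v1/Pdf/GetPdfTextLayer"


def build_payload(filename, pdf_bytes, start_page=None, end_page=None,
                  pages=None, text_encoding_type=None):
    """Build the request body from the documented parameters."""
    payload = {
        "Filename": filename,
        # FileContent must be Base64 text, not raw bytes.
        "FileContent": base64.b64encode(pdf_bytes).decode("ascii"),
    }
    # Optional parameters are left out so the service applies its defaults.
    optional = {
        "StartPage": start_page,
        "EndPage": end_page,
        "Pages": pages,
        "TextEncodingType": text_encoding_type,
    }
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


def build_request(payload):
    """Wrap the payload in a POST request (assumed JSON body; not sent here)."""
    return urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = build_payload("report.pdf", b"%PDF-1.7 ...", pages="1,3,5-7")
request = build_request(payload)
```

Passing the request to urllib.request.urlopen would send it once the real host and credentials are filled in.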

Response

Name	Type	Description
textLayer	string	The text layer extracted from the PDF document
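
Reading the result might look like the sketch below. It assumes the response body is a JSON object with textLayer as a top-level field, which this page does not confirm. The helper name extract_text_layer and the sample body are illustrative only.

```python
import json


def extract_text_layer(response_body):
    """Return the textLayer field from a JSON response body (assumed shape)."""
    data = json.loads(response_body)
    if "textLayer" not in data:
        raise KeyError("response has no 'textLayer' field")
    return data["textLayer"]


# Made-up sample body for illustration, not real service output.
sample = '{"textLayer": "Hello from page 1"}'
text = extract_text_layer(sample)
```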

Implementation Details

The operation reads the text layer, the machine-readable text embedded in the PDF, from the specified pages of the document and returns it as a string.

Usage Notes

If both a page range (StartPage, EndPage) and specific pages (Pages) are provided, they are combined.
The operation handles PDF files with embedded text layers.
Scanned documents without a text layer require OCR processing, which is not part of this operation.
The TextEncodingType parameter allows handling different character encodings in the PDF.
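
To preview which pages a request would select, the Pages syntax can be expanded client-side. This sketch assumes that "combined" means the union of the StartPage to EndPage range and the Pages list. The page does not spell out the exact rule, so treat combined_pages as one plausible reading. Both function names are hypothetical.

```python
def expand_pages(spec):
    """Expand a Pages string such as "1,3,5-7" into a sorted list of page numbers."""
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # A range like "5-7" includes both endpoints.
            first, last = (int(p) for p in part.split("-", 1))
            pages.update(range(first, last + 1))
        else:
            pages.add(int(part))
    return sorted(pages)


def combined_pages(start_page, end_page, spec):
    """Union of the StartPage..EndPage range and the Pages list (assumed semantics)."""
    selected = set(range(start_page, end_page + 1))
    selected.update(expand_pages(spec))
    return sorted(selected)


print(expand_pages("1,3,5-7"))        # [1, 3, 5, 6, 7]
print(combined_pages(2, 3, "1,5-7"))  # [1, 2, 3, 5, 6, 7]
```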

Credit Cost

Cost: 1 credit(s) per 5 pages

Note: Cost depends on the number of pages in the document
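
A rough cost estimate follows from the stated rate. This sketch assumes that a partial block of 5 pages is billed as a full credit (rounding up), which the page does not confirm. The function name estimate_credits is hypothetical.

```python
import math

PAGES_PER_CREDIT = 5  # from the stated rate of 1 credit per 5 pages


def estimate_credits(page_count):
    """Estimate credits, assuming a partial block of 5 pages costs a full credit."""
    if page_count <= 0:
        return 0
    return math.ceil(page_count / PAGES_PER_CREDIT)


for n in (1, 5, 6, 23):
    print(n, estimate_credits(n))
```

Under the rounding-up assumption, a 23-page document would cost 5 credits.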