PDF Functions
get_pdf_fonts
get_pdf_fonts(ibocr)
Gets the PDF fonts associated with the provided input.

NOTE: The flavour of the function that takes INPUT_IBOCR will be deprecated after September 30th 2019. Please use INPUT_IBOCR_RECORD instead.

Args:
    ibocr (Union[IBOCRRecordDict, IBOCRRecord]): Either of:
        - A dictionary with info about one ibocr record
        - The IBOCRRecord itself

Returns:
    The PDF fonts used across the entire document.

Examples:
    get_pdf_fonts(INPUT_IBOCR) -> [{'name': 'TimesNewRoman', 'type': 'Type1', 'encoding': 'PDFEncoding'}]
    get_pdf_fonts(INPUT_IBOCR_RECORD) -> [{'name': 'TimesNewRoman', 'type': 'Type1', 'encoding': 'PDFEncoding'}]
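The documented return value is a list of dictionaries with 'name', 'type' and 'encoding' keys, so it can be post-processed with ordinary Python. The following is a minimal sketch, assuming the result has exactly the shape shown in the examples. The helper group_font_names_by_type and the sample_fonts list are hypothetical and not part of the function library; sample_fonts stands in for a real get_pdf_fonts(INPUT_IBOCR_RECORD) call so that the snippet runs on its own.

```python
from collections import defaultdict
from typing import Dict, List


def group_font_names_by_type(fonts: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Group font names by font type (hypothetical helper, not a library function)."""
    grouped = defaultdict(list)
    for font in fonts:
        # Fall back to placeholders if a key is missing from an entry.
        grouped[font.get("type", "unknown")].append(font.get("name", ""))
    return {font_type: sorted(names) for font_type, names in grouped.items()}


# Stand-in for get_pdf_fonts(INPUT_IBOCR_RECORD), copied from the documented example.
sample_fonts = [
    {"name": "TimesNewRoman", "type": "Type1", "encoding": "PDFEncoding"},
]

fonts_by_type = group_font_names_by_type(sample_fonts)
print(fonts_by_type)
```

In a real formula or refiner, sample_fonts would presumably be replaced by the value returned from get_pdf_fonts.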
get_pdf_metadata
get_pdf_metadata(ibocr, field_name)
Gets the PDF metadata associated with the provided input.

NOTE: The flavour of the function that takes INPUT_IBOCR will be deprecated after September 30th 2019. Please use INPUT_IBOCR_RECORD instead.

Args:
    ibocr (Union[IBOCRRecordDict, IBOCRRecord]): Either of:
        - A dictionary with info about one ibocr record
        - The IBOCRRecord itself
    field_name (string): PDF metadata field name to retrieve. Valid field names are:
        title, author, subject, keywords_str, creator, producer, creation_timestamp, modification_timestamp, trapped_str.
        Timestamps are provided in seconds since epoch.
        See PDDocumentInformation for information about what each field indicates.

Returns:
    The PDF metadata value for the specified field.

Examples:
    get_pdf_metadata(INPUT_IBOCR, 'title') -> "title of the PDF"
    get_pdf_metadata(INPUT_IBOCR_RECORD, 'title') -> "title of the PDF"
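Because field_name must be one of a fixed set of names and the two timestamp fields come back as seconds since epoch, it can help to validate the name up front and convert timestamps to datetime objects. The following is a rough sketch under stated assumptions: check_field_name and epoch_seconds_to_utc are hypothetical helpers, not part of the function library, and the timestamp is assumed to arrive as a number or a numeric string (the documentation does not state the exact return type). The sample value 86400 is made up for illustration.

```python
from datetime import datetime, timezone
from typing import Union

# Field names listed in the get_pdf_metadata documentation.
VALID_PDF_METADATA_FIELDS = frozenset({
    "title", "author", "subject", "keywords_str", "creator", "producer",
    "creation_timestamp", "modification_timestamp", "trapped_str",
})


def check_field_name(field_name: str) -> str:
    """Return field_name unchanged, or raise ValueError if it is not documented."""
    if field_name not in VALID_PDF_METADATA_FIELDS:
        raise ValueError("Unknown PDF metadata field: %r" % field_name)
    return field_name


def epoch_seconds_to_utc(value: Union[int, float, str]) -> datetime:
    """Convert seconds since epoch to a timezone-aware UTC datetime.

    Assumes the value is numeric or a numeric string (an assumption, see above).
    """
    return datetime.fromtimestamp(float(value), tz=timezone.utc)


# Made-up stand-in for get_pdf_metadata(INPUT_IBOCR_RECORD, 'creation_timestamp').
sample_creation_timestamp = 86400

field = check_field_name("creation_timestamp")
created_at = epoch_seconds_to_utc(sample_creation_timestamp)
print(field, created_at.isoformat())
```

Validating the name locally only mirrors the documented list; how get_pdf_metadata itself behaves for an unknown field name is not specified here.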