User Guide > FAQs > PDF Direct Translation

PDF Direct Translation

Translation for PDF returned plain text file

If you want to receive translated PDF instead of plain text file, you have to choose "PDF Direct" when you select files to translate. You'll get a prompt asking your choice - plain text or PDF Direct. When PDF Direct is selected, a tag should appear next to the filename as shown below

If you do not see the PDF Direct tag next to your filename, please cancel current process and restart.

Texts in PDF are unreadable, full of garbage characters.

The text in a pdf document contains an encoding which maps internal character codes in the pdf file to unicode character codes. We use this encoding to extract the text from the pdf document for translation. Your original pdf may contain garbage characters in the encoding. In order to fix this problem, your pdf needs to be created in different way. Or, fixing encoding directly in the pdf is possible but need extra human labor. Please submit a request if you want our technical support team to take a close look at your pdf files.

Error returned for PDF with scanned image

When you upload PDF generated with scan image, OCR will be processed to recognize text out of scan image. Some OCR results are excellent. However, OCRed texts are often not complete and not properly segmented for MT translation. We recommend not to use scan image PDF for MT translation.

PDF file is not readable

There seems to be compatibility issue with your PDF. PDF files are different by the software that generated the PDF. When this message shows, please open the PDF using Acrobat X or higher and save as new PDF. Try again with this newly saved PDF. If same problem continues, submit a request to our support team.

Translated texts mixed with source

If your PDF was processed by OCR like HP Smart Scan, OCR text may be hidden behind the original scanned images. In this case, translated texts are inserted over the original scanned image like below example.

In order to fix this problem, ensure that your OCR software doesn't keep the original image but just converts the PDF into an editable document. This is what PDF Direct will do. Or use special software like InFix to remove the original image from the PDF.