Problem converting PDF to Word

life24

Senior member
Mar 25, 2014
283
0
76
Hello,
When I convert an English-language PDF file to Word, all the text in the document turns into square shapes (only the images survive).
What should I do? Thanks :(
 

life24

Senior member
Mar 25, 2014
283
0
76
q7ml_convert.jpg


After converting the PDF to Word:


8hz6_converrt2.jpg
 

C1

Platinum Member
Feb 21, 2008
2,352
100
106
Check that the PDF is not protected in some way (e.g., check the security/permission settings).

If the PDF has no security set, then try an online converter.

Another possibility is just to copy (select all?) and paste the PDF material into Word.
 

Mike64

Platinum Member
Apr 22, 2011
2,108
101
91
Hello,
When I convert an English-language PDF file to Word, all the text in the document turns into square shapes (only the images survive).
What should I do? Thanks :(
Does the PDF contain actual "text", or just an image of the original document? If it's just an image, you'd have to use a converter capable of OCR ("optical character recognition"), not just a "format conversion"-type program. If you don't have a standalone program capable of OCR, there are some basic web-based utilities out there...
 

life24

Senior member
Mar 25, 2014
283
0
76
Check that the PDF is not protected in some way (e.g., check the security/permission settings).

If the PDF has no security set, then try an online converter.

Another possibility is just to copy (select all?) and paste the PDF material into Word.

Thanks for your reply,
I tried both desktop and online converters, without success.
Could you explain this part in more detail? What exactly should I do?
"Check that the PDF is not protected in some way (e.g., check the security/permission settings)."

Thanks
 

life24

Senior member
Mar 25, 2014
283
0
76
Does the PDF contain actual "text", or just an image of the original document? If it's just an image, you'd have to use a converter capable of OCR ("optical character recognition"), not just a "format conversion"-type program. If you don't have a standalone program capable of OCR, there are some basic web-based utilities out there...

Both: images and text.
The images convert without any problem, but all the text comes out garbled. :(
 

Mike64

Platinum Member
Apr 22, 2011
2,108
101
91
The images convert without any problem, but all the text comes out garbled.
That's very strange...

Just to clarify, what I meant is, are you sure the text is actually composed of individual characters (ASCII, Unicode, whatever), or could the written material be just a scanned image (in jpg, png, etc. format) of the original document? Just because it looks like "text" when displayed or printed doesn't mean it's actually stored in the PDF file in a "character-based" format.
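A rough way to check this without any special tools is to look inside the file for the operators that paint text. The sketch below is a heuristic, not a real PDF parser: it assumes the page content streams are either uncompressed or Flate-compressed, and it will not work on an encrypted file. The function name `count_text_operators` is made up for this example.

```python
import re
import zlib

# Everything between "stream" and "endstream" is one PDF stream body.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)

# Tj and TJ are the PDF operators that show text. Their operand ends with
# ")" (literal string), ">" (hex string) or "]" (array of strings).
TEXT_OP_RE = re.compile(rb"[)\]>]\s*(?:Tj|TJ)")


def count_text_operators(pdf_bytes: bytes) -> int:
    """Count text-showing operators found in the streams of a PDF file."""
    total = 0
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # Most content streams use the FlateDecode filter (zlib).
            data = zlib.decompressobj().decompress(raw)
        except zlib.error:
            # Not zlib data: scan the bytes as they are.
            data = raw
        total += len(TEXT_OP_RE.findall(data))
    return total
```

Reading the file with `open("document.pdf", "rb").read()` (the filename is a placeholder) and passing the bytes in gives a count. A count of zero on a file that visibly shows text suggests scanned images; a large count suggests real text objects, in which case the problem lies elsewhere (fonts or encoding). Binary image data can occasionally produce a false match, so treat small counts with caution.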
 

Mike64

Platinum Member
Apr 22, 2011
2,108
101
91
Another possibility is just to copy (select all?) and paste the PDF material into Word.
Could you explain this part in more detail? What exactly should I do?
If you're using Adobe's PDF reader, check the document's security settings by selecting "Properties" under the "File" menu at the top of the window, then click on the "Security" tab. Specifically, check to see if "Content Copying" is "allowed." Other PDF readers should have the same function, though it might be under a different menu.

Assuming the security settings allow it, to copy & paste from the PDF, just open it up with your PDF reader then use your mouse to highlight the text you want to copy, use CTRL-C or the "copy" function in the "edit" menu, and then "paste" that text into an open Word document.
 

life24

Senior member
Mar 25, 2014
283
0
76
If you're using Adobe's PDF reader, check the document's security settings by selecting "Properties" under the "File" menu at the top of the window, then click on the "Security" tab. Specifically, check to see if "Content Copying" is "allowed." Other PDF readers should have the same function, though it might be under a different menu.

Assuming the security settings allow it, to copy & paste from the PDF, just open it up with your PDF reader then use your mouse to highlight the text you want to copy, use CTRL-C or the "copy" function in the "edit" menu, and then "paste" that text into an open Word document.

Thanks, but it had no effect.
hr2u_perm.jpg
 

life24

Senior member
Mar 25, 2014
283
0
76
That's very strange...

Just to clarify, what I meant is, are you sure the text is actually composed of individual characters (ASCII, Unicode, whatever), or could the written material be just a scanned image (in jpg, png, etc. format) of the original document? Just because it looks like "text" when displayed or printed doesn't mean it's actually stored in the PDF file in a "character-based" format.

Yes, I'm sure it is text.
 

life24

Senior member
Mar 25, 2014
283
0
76
I have now tested more than 20 converter programs.
Only one website could convert it to Word, but that site isn't free: it converts only the first two pages of each document at no charge.
What's going wrong?
 

mikeymikec

Lifer
May 19, 2011
19,121
12,404
136
Adobe Reader has an option to save the PDF as text only. If you use this feature, does the resulting text file contain the text you wanted?
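If the saved text file is long, one way to judge it without reading every page is to measure how much of it consists of placeholder or unmapped characters. This is only a sketch: the function name `unreadable_ratio` and the choice of "suspect" characters are assumptions made for this example, not anything Adobe Reader defines.

```python
import unicodedata

# Characters that typically stand in for glyphs that could not be mapped:
# the replacement character and the hollow square/rectangle "tofu" boxes.
SUSPECT_CHARS = {"\ufffd", "\u25a1", "\u25af"}

# Unicode categories that should not appear in ordinary prose:
# Co = private use, Cn = unassigned, Cc = control characters.
SUSPECT_CATEGORIES = {"Co", "Cn", "Cc"}


def unreadable_ratio(text: str) -> float:
    """Return the fraction of non-whitespace characters that look unreadable."""
    visible = [ch for ch in text if not ch.isspace()]
    if not visible:
        return 0.0
    bad = sum(
        1
        for ch in visible
        if ch in SUSPECT_CHARS or unicodedata.category(ch) in SUSPECT_CATEGORIES
    )
    return bad / len(visible)
```

A ratio near 0 means the export produced ordinary characters; a ratio near 1 means the text layer itself has no usable character mapping, so any converter that relies on it would likely produce the same squares.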
 

sm625

Diamond Member
May 6, 2011
8,172
137
106
Only one website could convert it to Word, but that site isn't free: it converts only the first two pages of each document at no charge.
What's going wrong?

How many pages are we talking about here? If you go through your document and select two pages at a time, you can do a "print to pdf" for each pair of pages and use that site to get a series of two page Word documents.
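To see how many passes that would take, here is a small sketch that lists the page ranges to print for a document of a given length. The function name `page_chunks` is made up for this example, and it only computes the ranges; the printing and uploading still have to be done by hand.

```python
from typing import List, Tuple


def page_chunks(total_pages: int, chunk_size: int = 2) -> List[Tuple[int, int]]:
    """Return 1-based (first, last) page ranges covering the whole document."""
    if total_pages < 0 or chunk_size < 1:
        raise ValueError("total_pages must be >= 0 and chunk_size >= 1")
    return [
        (first, min(first + chunk_size - 1, total_pages))
        for first in range(1, total_pages + 1, chunk_size)
    ]
```

For a 5-page document this gives the ranges 1-2, 3-4 and 5-5, so three separate "print to PDF" jobs.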
 

C1

Platinum Member
Feb 21, 2008
2,352
100
106
Based on the Permission Settings shown, it would seem that the document is protected from content copying and that is what is causing the inability to convert.

"Content Copying and Extraction: If this is disabled, selecting document contents and copying it to the clipboard for repurposing the contents is prohibited."

https://www.pdflib.com/knowledge-base/pdf-security/permissions/


Addendum:
"Page Extraction would encompass removing pages from a PDF for either dis-assembly of a document or taking any pages out to make as separate documents. When this security feature is enabled, the document and its pagination cannot be manipulated within the program."

http://www.experts-exchange.com/que...ions-content-copying-and-page-extraction.html

PS: The impression I get from the literature is that the two permission fields you show as "not allowed" are typically set when the PDF is encrypted. That would explain the messed-up text with the graphic being okay. Supposedly Adobe permission settings are not all that secure (i.e., they can be removed with cracking software). Good luck.
===================================
http://www.pdfconverter.com/resources/pdftips/howtounlockpdf/
 

Mike64

Platinum Member
Apr 22, 2011
2,108
101
91
Based on the Permission Settings shown, it would seem that the document is protected from content copying and that is what is causing the inability to convert.
The screen grab the OP posted shows that while "page extraction" is disallowed, "content copying" is in fact allowed.

ETA: I'm having trouble finding out exactly how the formal PDF specification differentiates "page extraction" from "content copying," but they are clearly different permissions even though the page you linked appears to treat them as one and the same. So allowing "content copying" must allow some sort of copying even if "page extraction" disallows pulling out entire pages in their original, full format-within-the-overall-file-format.

I'm guessing that "page extraction" allows copying of more of the internal control/format structure in the file, while "content copying" allows for literally only copying specific content elements without reference to their "place" in the PDF file's overall structure. If that's correct, the latter should allow copying individual text characters or images, assuming they are in fact stored in the file as separate elements.
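For reference, the PDF specification's standard security handler stores user permissions as bit flags in a signed integer (the /P entry of the encryption dictionary). Bit 5 covers copying or extracting text and graphics, and bit 11 covers assembling the document (inserting, rotating or deleting pages). The sketch below decodes such a value. How Adobe Reader's labels such as "Page Extraction" map onto these bits is not confirmed here, so treat the names in the table as this example's own.

```python
# Bit positions are 1-based, as the PDF specification numbers them.
PERMISSION_BITS = {
    "print": 3,
    "modify": 4,
    "copy_or_extract": 5,
    "annotate": 6,
    "fill_forms": 9,
    "extract_for_accessibility": 10,
    "assemble": 11,
    "print_high_quality": 12,
}


def decode_permissions(p_value: int) -> dict:
    """Map a /P permissions integer to a dict of allowed/disallowed flags."""
    return {
        name: bool(p_value & (1 << (bit - 1)))
        for name, bit in PERMISSION_BITS.items()
    }
```

A value of -4 has every permission bit set, while -3904 has all of the bits above cleared. Copying text and assembling pages are separate bits, which fits the observation that one can be allowed while the other is not.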
 

Mushkins

Golden Member
Feb 11, 2013
1,631
0
0
It's also possible that this is a font issue. PDF is a publishing format: you can embed additional fonts in it so that whoever views it sees that font without needing to download or purchase it and install it on their PC.

If you're converting to Word and it's trying to preserve font information for a font you don't have installed, you're gonna get pages of garbage symbols. If the file *is* definitely using text and not an image, and there's no copy protection on it, your best bet is really to copy/paste into a Word document as raw text (no formatting information or font/style) and then clean it up.

PDFs are honestly not a format that's designed to be converted to other document types, they're meant to be the final copy for distribution.