Odd HTML output

MonKENy · Sep 8, 2012

I'm not sure why, but when I try to add an apostrophe to a word, e.g. Ken's, it comes out as Ken?s on the website.

I'm using FileZilla. Not sure what's wrong.

Markbnj · Sep 8, 2012

Well, a question mark often shows up in place of an unprintable character. Are you sure you're adding an apostrophe? What character encoding is the page using?

MonKENy · Sep 9, 2012

How do I tell what encoding it's using? I'm new to this. Someone else built the site; I just update the text.

Markbnj · Sep 9, 2012

Not all pages specify the character encoding. Look for a tag something like this near the top:

Code:

<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />

It may not be there, and I think there are other ways to specify it as well.
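For anyone who would rather check with a script than by eye, here is a minimal Python sketch that looks for a charset declaration in the first few kilobytes of a page. The helper name `declared_charset` and the sample markup are made up for illustration, and it only inspects `<meta>` tags; one of the other ways to specify the encoding is the HTTP `Content-Type` header sent by the server, which this does not look at.

```python
import re

# Matches both <meta http-equiv="Content-Type" content="...; charset=X">
# and the shorter <meta charset="X"> form.
CHARSET_RE = re.compile(
    rb"""<meta[^>]*charset\s*=\s*["']?([A-Za-z0-9_\-]+)""",
    re.IGNORECASE,
)


def declared_charset(html_bytes):
    """Return the charset named in a <meta> tag (lowercased), or None."""
    # Declarations normally sit near the top, so only scan the first 4 KB.
    match = CHARSET_RE.search(html_bytes[:4096])
    return match.group(1).decode("ascii").lower() if match else None


# Illustrative sample only, not the poster's actual page.
sample = (
    b'<html><head><meta http-equiv="Content-Type" '
    b'content="text/html; charset=ISO-8859-1" /></head></html>'
)
print(declared_charset(sample))
```

To run it on a real file, read the file in binary mode and pass the bytes in, so nothing gets decoded before the check.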

MonKENy · Sep 9, 2012

Markbnj · Sep 9, 2012

As an experiment change that 'charset=utf-8' to 'charset=ISO-8859-1' and see what happens on the page.

Cerb · Sep 9, 2012

Are you changing OSes? Like using a Windows client with a *n*x server? If so, you may want to try forcing FileZilla to transfer all files as binary, then re-download them.

Are you using a proper text editor, one that is aware of and able to use different character encodings? If so, you should be able to open the current page in it, and select some common encoding that causes a strange representation for '.

I'll throw out a guess that the server is serving it as UTF-8, and the browser is expecting UTF-8, but the file was edited as 8859-1; or that the reverse is the case.

MonKENy · Sep 10, 2012

I just opened the page in Chrome, hit F12 to see all the source, and then pasted what was there for you.

rotflmao I have no idea what you guys are talking about. I have no experience here. My web designer gave me the username and password to be able to make basic changes so he doesn't charge us for little things. If one of you wants to grab it and check it out, I'll be more than happy to give you the info.

Markbnj · Sep 10, 2012

At this point I would suggest you take it up with the designer. It sounds like he needs to be your main resource on this.

MonKENy · Sep 10, 2012

He's damn near impossible to get a hold of. That's why I took over the basic editing. I have made a lot of changes; it's just that pesky ' that's biting me in the back of the neck.

Cerb · Sep 10, 2012

/me whips out a collapsible podium

MonKENy said:
rotflmao I have no Idea what you guys are talking about. I have no experience here.

In ye olden days, letters and such were encoded using as few bits as possible. Eventually the English world settled on ASCII, pretty much, in which basically everything could use 7 bits. Stuff got added, and ANSI was born, using all 8 bits in a byte.

Well, they didn't just have printed characters, but also control characters, decorations, and the like. So you could use a text file as a control file for a printer, for instance. Tab, line feed, carriage return, etc., really meant something.

Microsoft more or less standardized on ISO 8859-1, which then became Windows Code Page 1252 after a few additions.

Time went on, and non-English people started using computers in their native languages, with their own ways of representing their own characters. So the same byte value could represent a wealth of different graphical or control characters, depending on what computer read it.

Numbers and letters in text files pretty much stayed the same, for compatibility's sake. But, some characters beyond that might not come out the same on every platform, if it is expecting a different encoding.

Now, there have been multiple attempts to come up with a good encoding that can satisfy compatibility, yet also include the rest of the world's characters. Several have been D:, from a technical standpoint, including Unicode. It's kind of a neat story if you're geeky enough, but anyway, Ken Thompson figured out UTF-8's basic encoding over a meal, and implemented it in just a few days in Plan 9, saving the world from another bad character encoding standard. UTF-8 can encode most written languages with relative ease, and garbled data doesn't cause any major risks of ruining the rest of the file.

So, OK, what does that have to do with "Ken's"?
Well, UTF-8 is good, the world is migrating over to it entirely, including the web, and if you aren't using a good text editor (I've used Scite on Windows for so long I've forgotten what else is out there), compatibility will generally be maintained. Among the features of UTF-8 is that you can tell whether you're at the first byte of any character or not. But, to do that, 1-byte characters can only be 0-127, and every byte of a multibyte character has a value of 128 or higher. So a byte below 128, followed by a single byte above 127, followed by another byte below 128, indicates that the 1st byte is an ASCII character, the second byte is garbage, and the 3rd byte is an ASCII character.
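To make the byte-range rule concrete, here is a rough Python sketch that labels each byte by the range it falls in. The function name `classify_byte` is just for illustration, and the "lead" label is a simplification: a few values in that range never occur in valid UTF-8.

```python
def classify_byte(value):
    """Label one byte (0-255) by the UTF-8 range it falls in."""
    if value < 0x80:
        return "ascii"          # 0-127: a complete 1-byte character
    if value < 0xC0:
        return "continuation"   # 128-191: 2nd or later byte of a character
    return "lead"               # 192-255: first byte of a multibyte character


# "K", e-acute, "n" encoded as UTF-8: the accented letter takes two bytes.
valid = [classify_byte(b) for b in "K\u00e9n".encode("utf-8")]
print(valid)

# A lone high byte between two ASCII bytes: a continuation with no lead
# byte in front of it, which a UTF-8 decoder treats as garbage.
stray = [classify_byte(b) for b in b"n\x92s"]
print(stray)
```

A continuation byte that does not follow a lead byte is exactly the "second byte is garbage" case described above.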

Now, an apostrophe has the same value, 39, in ASCII, 8859-1, CP1252, and UTF-8. So, if you use a real text editor, the ' key should give you what you need--just do a binary transfer both ways to be safe, in case FileZilla is part of the problem--but in a word processor, that may not be the case, because Windows code page 1252 defines a closing single quote as the value 146.

So, if "Ken's" is saved in 8859-1 (really CP1252), with a closing quote instead of an apostrophe, then served up as UTF-8, you should see K, e, n, ?, s (? usually in a black diamond).
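That scenario can be reproduced in a few lines of Python. This is only a sketch of the suspected cause, assuming the file really contains a curly closing quote saved as Windows code page 1252; nothing here was run against the actual site.

```python
# "Ken's" typed with a curly closing quote (U+2019) instead of a plain '.
curly = "Ken\u2019s"

# Saved by an editor using Windows code page 1252: the quote becomes byte 146.
raw = curly.encode("cp1252")
print(list(raw))

# Read back as UTF-8: byte 146 is not valid on its own, so the decoder
# substitutes U+FFFD, which browsers draw as a ? in a black diamond.
shown = raw.decode("utf-8", errors="replace")
print(repr(shown))

# A plain apostrophe is byte 39 either way, so it survives the mismatch.
plain = "Ken's"
print(plain.encode("cp1252") == plain.encode("utf-8"))
```

If the plain apostrophe is what is in the file, the encoding mismatch alone cannot turn it into a question mark, which is why the transfer step is also worth checking.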

Markbnj · Sep 10, 2012

^^ The definitive answer.

MonKENy · Sep 10, 2012

I'm using Notepad for text editing, as I figured that would be the most universal editor. I liked the info though. I only understood 3/4 of it, but I love to learn, so thanks!

Charles Kozierok · Sep 10, 2012

Is this a Word document? OP suggests so. Word has a feature called "Smart Quotes" or similar that changes regular apostrophes to angled ones. Try disabling that.

MonKENy · Sep 10, 2012

No, it's Notepad. I never used Word to edit HTML.

Cerb · Sep 11, 2012

Notepad would certainly not be my first choice, and you would do well to get a text editor that either supports a wide range of encodings, or is grounded in UTF, but it should not be the source of the problem.

While actual features vary, one of the key marks of a good text editor is the ability to show characters that might be wrong on common systems. IOW, the reason not to use Notepad is so that you can see a garbage character instead of an apostrophe, if the apostrophe is encoded differently than for ASCII, 8859-1, or UTF-8.
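If switching editors isn't an option right away, a short script can do the same job of exposing suspicious characters. Below is a minimal Python sketch that reports every byte above 127 in a chunk of file content; `find_non_ascii` and the sample markup are hypothetical, not taken from the actual site.

```python
def find_non_ascii(data):
    """Yield (line_number, column, byte_value) for every byte above 127."""
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        for col, value in enumerate(line, start=1):
            if value > 127:
                yield line_no, col, value


# Illustrative content: line 1 uses a plain apostrophe (byte 39),
# line 2 uses a CP1252 curly quote (byte 146).
page = b"<p>Ken's shop</p>\n<p>Ken\x92s shop</p>\n"

hits = list(find_non_ascii(page))
for hit in hits:
    print(hit)
```

For a real page, read the file in binary mode (for example `open(path, "rb").read()`, with `path` pointing at your local copy) and pass the bytes in. An empty result means the file is pure ASCII, so the apostrophes in it are the plain kind.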

I still suspect the problem itself lies with either something the server is doing, or that FileZilla is doing. Again, make FileZilla do only binary transfers. If the apostrophe is set to the ASCII value, then serving it as a different encoding (assuming the browser strictly adheres to the content-type in the HTTP header) won't change anything, since it is valid and identical in any common encoding.

I could easily see FileZilla, or the server, though, trying to be "smart" with text transfers, and screwing the file up somewhere between your filesystem and the web host's file system, by doing a bad conversion. White space is usually what causes grief, but text transfers in FTP have caused problems when crossing system boundaries (usually Win<->*n*x) for many years.

MrScott81 · Sep 11, 2012

Agreed with the above.

Always choose binary for transfers, and for a free text editor, Notepad++ is not a bad choice.
