There are different kinds of spaces: &nbsp; ≠ C2A0




I just spent over two hours troubleshooting a seemingly simple HTML issue. When I copied and pasted a small section of HTML, the web browser displayed the pasted section differently from the original. The horizontal spacing between some elements was slightly different, which made the entire page look wrong. But how could this be? The two sections of HTML were identical - the new one was literally a copy of the old one.



This seemingly simple problem defied all my attempts to explain it. I came up with plenty of plausible theories: problems with my CSS classes, or with margins and padding. Improper HTML tags. Browser bugs. I tried three different browsers and got the same result in all of them.



Feeling confused, I looked again at the two HTML sections in the WordPress editor (text view) and confirmed that they were completely identical. Then I used Firefox's built-in web developer tools to inspect the rendered page elements and compared all their CSS properties. Identical, yet somehow rendered differently. I used the developer tools to check the exact HTML received from my web server, checked the two sections again, and made sure they were character-for-character identical. Firefox's page source tool also confirmed that the two sections were completely identical.



At this point, I was ready to blame cosmic rays or voodoo magic. I found that every time I copied any similar section of HTML, the pasted section rendered in the browser with the wrong spacing between elements. How could this be? Then I tried the W3C Validator, which found some other problems with my page, but nothing that could explain this behavior. Once again, it confirmed that the two sections of HTML were identical, despite rendering differently in the browser.



Clearly something didn't add up. I used curl to download the web page from my web server, opened the local copy, and saw the same behavior as before. But when I opened the saved .html file in a hex editor, I finally got an answer. The two HTML sections were not identical: one section used a different type of space than the other.
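The hex editor check can also be scripted. Below is a minimal sketch, assuming the page is saved as UTF-8; the function names `find_nbsp_offsets` and `hex_context` and the sample markup are made up for illustration and are not taken from the actual page.

```python
def find_nbsp_offsets(data: bytes) -> list[int]:
    """Return the byte offset of every UTF-8 encoded no-break space (C2 A0)."""
    needle = b"\xc2\xa0"
    offsets = []
    start = data.find(needle)
    while start != -1:
        offsets.append(start)
        start = data.find(needle, start + len(needle))
    return offsets


def hex_context(data: bytes, offset: int, width: int = 8) -> str:
    """Show the bytes around an offset as hex, roughly as a hex editor would."""
    chunk = data[max(0, offset - width): offset + width]
    return chunk.hex(" ")


# Hypothetical sample: two sections that look the same in a text editor.
original = "<b>Name:</b>\u00a0value".encode("utf-8")
pasted = "<b>Name:</b> value".encode("utf-8")

print(find_nbsp_offsets(original))   # [12]
print(find_nbsp_offsets(pasted))     # []
print(hex_context(original, 12, 4))  # 3c 2f 62 3e c2 a0 76 61
```

To run this against a real download, read the file in binary mode (for example `open("page.html", "rb").read()`, where the file name is a placeholder) so that no decoding step hides the raw bytes.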



What the hell.



I found that the original HTML section contained non-breaking spaces. But instead of being encoded as &nbsp; they were stored as the raw Unicode character U+00A0, which is the byte pair C2 A0 in UTF-8. I don’t know when or how that happened, but I blame WordPress. When this section was viewed in the WordPress HTML editor, the C2A0 spaces looked like normal spaces, and when the section was copied inside the editor, the non-breaking spaces were automatically converted to normal spaces, hex 20. So the copied version rendered differently, even though the HTML source looked the same.
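For reference, here is a small sketch of how the two kinds of space differ at the byte level, using only the Python standard library. It also shows that the &nbsp; entity decodes to the same character as the raw C2 A0 bytes, so the rendering difference comes from the conversion to a plain 20 space, not from which spelling of the non-breaking space is used.

```python
import html
import unicodedata

nbsp = "\u00a0"   # the no-break space character
space = " "       # the ordinary space character

print(unicodedata.name(nbsp))        # NO-BREAK SPACE
print(unicodedata.name(space))       # SPACE
print(nbsp.encode("utf-8").hex())    # c2a0
print(space.encode("utf-8").hex())   # 20

# The two characters are not equal, even though most editors draw them alike.
print(nbsp == space)                 # False

# The HTML entity and the raw character are the same code point once decoded.
print(html.unescape("&nbsp;") == nbsp)  # True
```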



It's like a replay of 0 ≠ O, only worse. I didn't even know that the non-breaking space has its own Unicode character - I thought &nbsp; was the only way to write one. I changed the HTML back to use &nbsp; and everything works fine now.
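I made the fix by hand, but the same change could be sketched in a few lines. This assumes the HTML has already been decoded to a Python string; `make_nbsp_visible` is a hypothetical helper name, and the sample markup is invented.

```python
def make_nbsp_visible(html_text: str) -> str:
    """Replace raw no-break spaces (U+00A0) with the visible &nbsp; entity."""
    return html_text.replace("\u00a0", "&nbsp;")


# Hypothetical sample containing one raw no-break space.
section = "<b>Name:</b>\u00a0value"

print(section.count("\u00a0"))     # 1
print(make_nbsp_visible(section))  # <b>Name:</b>&nbsp;value
```

Writing the entity out keeps the non-breaking space visible in any text editor, so a copy and paste that silently swaps it for a normal space is easier to notice.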



I'm surprised how many different tools failed to identify this subtle but important difference between the types of whitespace in my HTML source. The WordPress HTML editor was unable to show or handle the difference correctly. Firefox's web developer tools and page source tool failed too. So did the W3C validator. Curl plus a hex editor was the only way to definitively establish the exact content of the HTML source.


