If you're trying to import an RSS feed (or several) into your website and are running into funny characters or broken formatting, read on.
The Problem
When you print the content of your RSS feed using PHP or another programming language, some of the characters do not display correctly.
The Cause
The problem is most likely caused by a character encoding mismatch: the web browser decodes the feed text using a different character set from the one the feed was written in. Most RSS feeds I work with use the ISO-8859-1 (Western) character set. This can be a problem because many web pages explicitly declare UTF-8 as their character set.
When these two worlds collide, the web browser interprets the feed's bytes using the page's character set instead of the one intended. In some cases you will see the wrong character; in others you will see a question mark or some other placeholder the browser uses to indicate that it can't tell which character to print.
In some cases, the web browser will try to auto-detect the character set in use. However, if you declare a character set in the header of your web page, the browser will use that character set, and you may see the funny characters.
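To make the mismatch concrete, here is a minimal sketch of what happens to a single accented letter. The sample word and variable names are only illustrative, and the snippet assumes PHP's mbstring extension is available.

```php
<?php
// "café" encoded as ISO-8859-1: the é is the single byte 0xE9.
$latin1 = "caf\xE9";

// The same text encoded as UTF-8: the é becomes the two bytes 0xC3 0xA9.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

// A lone 0xE9 byte is not valid UTF-8, so a browser decoding the page as
// UTF-8 has nothing sensible to draw for it.
$latin1IsValidUtf8 = mb_check_encoding($latin1, 'UTF-8'); // false
$utf8IsValidUtf8   = mb_check_encoding($utf8, 'UTF-8');   // true

// The reverse mismatch: UTF-8 bytes decoded as ISO-8859-1 show up as two
// separate characters in place of the é.
$misread = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo bin2hex($latin1), "\n";
echo bin2hex($utf8), "\n";
echo $misread, "\n";
```

The first case is the one that typically produces a question mark or placeholder; the second is the one that produces a wrong but printable pair of characters.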
The Solution
To display the characters properly, you need to:
- Find out what character set your web page is using
- Find out what character set the XML RSS feed is using
- Convert the strings of the RSS feed to the appropriate character set
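The steps above can be sketched roughly as follows. The helper names feed_encoding and to_page_charset are hypothetical, the sample feed is made up, and the code assumes the mbstring extension is installed. It only looks at the XML declaration, so treat it as a starting point rather than a complete detector.

```php
<?php
/**
 * Read the encoding named in a feed's XML declaration.
 * Falls back to UTF-8, which XML assumes when no encoding is declared
 * (this sketch ignores the UTF-16 byte-order-mark case).
 */
function feed_encoding(string $xml): string
{
    if (preg_match('/^\s*<\?xml[^>]*\bencoding\s*=\s*["\']([^"\']+)["\']/i', $xml, $m)) {
        return strtoupper($m[1]);
    }
    return 'UTF-8';
}

/** Convert a feed string to the page's character set, if the two differ. */
function to_page_charset(string $text, string $feedCharset, string $pageCharset = 'UTF-8'): string
{
    if (strcasecmp($feedCharset, $pageCharset) === 0) {
        return $text;
    }
    return mb_convert_encoding($text, $pageCharset, $feedCharset);
}

// Hypothetical feed, stored here as raw ISO-8859-1 bytes.
$xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
     . "<rss><channel><title>Caf\xE9 news</title></channel></rss>";

$charset = feed_encoding($xml);

// Crude title extraction, for the demo only.
preg_match('/<title>(.*?)<\/title>/s', $xml, $match);
$title = to_page_charset($match[1], $charset);

echo $charset, "\n";
echo $title, "\n";
```

If you read the feed through a parsing library rather than as raw text, check which encoding the library hands back before converting; some parsers (SimpleXML, for example) already return UTF-8 strings, and converting those a second time would garble them.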
A quick fix to this problem could be to do the following:
In the header of the web page, add the following line just before the <title> to force the web browser to use the UTF-8 character set:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
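If the page is generated by PHP, the same declaration can also be sent as a real HTTP header. This matters because when the server already sends a Content-Type header naming a different charset, browsers give that header priority over the meta tag. A minimal sketch, with illustrative variable names:

```php
<?php
// Charset the page is served in; keep it identical in the header and the tag.
$pageCharset = 'utf-8';
$contentType = 'text/html; charset=' . $pageCharset;

// header() only works before any output has been sent.
if (!headers_sent()) {
    header('Content-Type: ' . $contentType);
}

// The matching meta tag, printed inside the page's head section.
echo '<meta http-equiv="Content-Type" content="' . $contentType . '" />', "\n";
```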
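Convert the character set to UTF-8 when printing your RSS lines. Using PHP, you would use the utf8_encode function, which assumes its input is ISO-8859-1 (note that it is deprecated as of PHP 8.2, where mb_convert_encoding is the usual replacement):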
<?php print utf8_encode( $rss->title ); ?>
This will convert the RSS title to UTF-8, thus matching the character set declared in your web page. If you don't know the character set of the web page and the RSS feed, or are just too lazy to find out, you can play with different encoding/decoding schemes until you find the one that prints properly.
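One way to do that experimenting is to run the same raw bytes through a few candidate source encodings and look at the results side by side. This is only a sketch: the sample bytes and the candidate list are assumptions, and it relies on the mbstring extension.

```php
<?php
// Raw bytes of a feed title whose encoding is unknown. Hypothetical sample:
// "Café" with the é stored as the single byte 0xE9.
$raw = "Caf\xE9";

// Candidate source encodings to try; adjust to the feeds you deal with.
$candidates = ['UTF-8', 'ISO-8859-1', 'Windows-1252'];

$results = [];
foreach ($candidates as $from) {
    // Skip candidates for which the bytes are not even valid.
    if (!mb_check_encoding($raw, $from)) {
        $results[$from] = null;
        echo $from, ": not valid for these bytes\n";
        continue;
    }
    // Convert to UTF-8 so every attempt can be viewed on a UTF-8 page.
    $results[$from] = mb_convert_encoding($raw, 'UTF-8', $from);
    echo $from, ': ', $results[$from], "\n";
}
```

Whichever candidate prints the text you expect is the one to use as the source encoding in your conversion.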
Different languages have different methods for converting; Perl, for example, relies on a module such as Encode. Look up the proper reference for the language you are using. Good luck.
