jens-schaller.de

typo3 - full scale news posts in XML-feeds

synopsis

What do you get in this article?

  • full length news posts exported into an XML feed
  • including valid XHTML formatting
  • working links to files, internal and external targets, etc.
  • support for prefix symbols, e.g. for the dh_linklayout extension

I recommend reading the whole article. If you can't wait, you can of course jump directly to the solution part.

introduction

As you may know, the typo3 news extension (tt_news) can export news posts into an XML-based news feed. You can choose between RSS 2.0, RSS 0.91, Atom 0.3 and RDF feeds.

A news post always consists of a summary (subheader in tt_news) and a full length version (bodytext in tt_news). If no summary is specified, the full length version is cropped to a configurable length.

When the news feed is rendered, all HTML markup will be removed. "That's ok, because we're talking about XML feeds" you'll probably say, and you're right, but only if you have a short, descriptive summary of the article.

If you'd like to have the full length article exported into the feed, you can simply set the crop length of the summary to a large number using

displayXML.subheader_stdWrap.crop = 10000 | ... | 1

for example. What you get now is the whole text of the news post without any markup (so it ends up on one single line without any linebreaks) and without any other content such as images.

You can also disable the stripping of the HTML markup using

displayXML.subheader_stdWrap.stripHtml = 0

Well, it's not as easy as it seems.

problems

So, what's this buzz all about? There are some major problems with this solution:

<br> instead of <br />

When exporting something into an XML feed, we want to make sure that the included HTML markup is rendered as valid XHTML. If you are using the WYSIWYG HTML editor rtehtmlarea like me and you enter a simple linebreak, you can configure htmlArea RTE to render a [br /]-tag (valid XHTML) instead of [br] (normal HTML). (I can't use angle brackets here, because then the tags would be rendered as real linebreaks ;)

When saving the HTML, only the normal [br] will be saved to the database. Later, when rendering the output of a page on the website, the [br] will be rendered as [br /].

The problem is that the XML rendering method, somewhere in the core of typo3, doesn't know anything about rendering the needed [br /]-tag.
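As a rough illustration of the kind of fix this calls for, here is a minimal sketch that rewrites plain [br]-tags into their XHTML form. The function name normalizeLineBreaks() is made up for this example and is not part of tt_news or the typo3 core. It uses preg_replace() because eregi_replace() is no longer available as of PHP 7.

```php
<?php
// Hypothetical helper, not part of tt_news or the typo3 core.
function normalizeLineBreaks(string $html): string
{
    // Matches <br>, <BR> and <br > (case-insensitive, optional whitespace).
    // An already closed <br /> does not match, so it is left untouched.
    return preg_replace('/<br\s*>/i', '<br />', $html);
}

echo normalizeLineBreaks("first line<br>second line<BR >third line<br />");
// prints: first line<br />second line<br />third line<br />
```

The pattern deliberately ignores [br]-tags that carry attributes. Handling those would need a slightly wider expression.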

wrapping all XHTML markup into a CDATA section

To really ensure that the XML feed is valid, I wanted to wrap every single news post in a CDATA section. This wasn't possible at first, because the news post is entity-encoded before the XML is rendered. Inside a CDATA section the encoded markup is taken literally, so a news aggregator shows the HTML source code of the post instead of the formatted text.
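The effect, and the way around it, can be sketched in a few lines. This is only an illustration under the assumption that the content arrives with its markup encoded as entities. The helper name wrapInCdata() is invented for this example and does not exist in tt_news.

```php
<?php
// Hypothetical helper: decodes entity-encoded markup and wraps it in CDATA.
function wrapInCdata(string $encodedHtml): string
{
    // &lt;p&gt; becomes <p> again, so an aggregator renders the tag
    // instead of displaying it as text.
    $html = html_entity_decode($encodedHtml, ENT_QUOTES, 'UTF-8');

    // A literal "]]>" inside the content would close the section early,
    // so it is split across two adjacent CDATA sections.
    $html = str_replace(']]>', ']]]]><![CDATA[>', $html);

    return '<![CDATA[' . $html . ']]>';
}

echo wrapInCdata('&lt;p&gt;Hello &lt;b&gt;world&lt;/b&gt;&lt;/p&gt;');
// prints: <![CDATA[<p>Hello <b>world</b></p>]]>
```

Without the decoding step, the same input would end up in the feed as &lt;p&gt;Hello ... and be displayed exactly like that.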

prefixing all internal links with the domain

This one was really nasty! When saving content that includes links to internal targets (files, other pages, etc.), the domain of the website is stripped from the links.

In the rendering phase of the XML feed those links won't be prefixed with the domain, resulting in links like "?id=123" instead of "http://myWebsite.com/?id=123". OK, you could treat every internal link as an external one and use the full URL of the target, but hey, who is that nuts? Also, when using extensions like dh_linklayout you don't even have the chance to fix this.
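To make the idea concrete, here is a minimal standalone sketch of such a prefixing step. It is not the typo3 API: prefixRelativeLinks() is a made-up function for illustration, and it only looks at href and src attributes.

```php
<?php
// Hypothetical stand-in for the prefixing step; this is not typo3 code.
function prefixRelativeLinks(string $html, string $baseUrl): string
{
    // Make sure the base ends with exactly one slash.
    $base = rtrim($baseUrl, '/') . '/';

    return preg_replace_callback(
        '/\b(href|src)=(["\'])(.*?)\2/i',
        function (array $m) use ($base): string {
            $url = $m[3];
            // Leave absolute URLs (scheme: or //host) and pure anchors alone.
            if (preg_match('~^(?:[a-z][a-z0-9+.\-]*:|//|#)~i', $url)) {
                return $m[0];
            }
            // Rebuild the attribute with the domain in front of the target.
            return $m[1] . '=' . $m[2] . $base . ltrim($url, '/') . $m[2];
        },
        $html
    );
}

echo prefixRelativeLinks(
    '<a href="?id=123">internal</a> <a href="http://example.org/">external</a>',
    'http://myWebsite.com'
);
// prints: <a href="http://myWebsite.com/?id=123">internal</a> <a href="http://example.org/">external</a>
```

A regular expression is good enough for a sketch like this, but it will miss unquoted attributes and other places where URLs can hide, such as inline styles.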

CSS formatting of news posts in the XML feed

Last but not least: If I am creating a news post, I'd like it to be seen the way I want it to. Meaning: I have a CSS styled website, so why can't I have CSS styled news posts in an XML feed?

Let me get something straight: I'm not a fan of formatting every single tag in an HTML markup using inline styles. I can almost hear some of you out there scream: "RSS is for content, not for styling!" and you are right. But I'd like to have basic styling of the news content, like font sizes, headings and removing borders around images/symbols with links. Is that too much to ask? Why not include a small CSS snippet?

solution

patching tt_news

The solution to all of our problems lies in the file class.tx_ttnews.php, which is located either in

typo3/ext/tt_news/pi/

if you installed tt_news globally, or in

typo3conf/ext/tt_news/pi/

if you installed it locally on your webserver.

What I did is the following:

  • replaced the summary used for generating the XML feed with the full length version ($row['short'] -> $row['bodytext'])
  • decoded this content using html_entity_decode()
  • prefixed all links with the domain, if necessary, using $htmlParse->prefixResourcePath()
  • replaced all [br]-tags with [br /] using eregi_replace(). (This will hopefully be fixed in the future. A bug report has already been sent.)

To apply these changes you can either

WARNING: The patch was created based on tt_news v2.2.24. It may not work with future releases, but I plan to post updates if necessary.

modifying the XML feed templates

If you liked what I wrote about basic CSS styling, you can use my templates for the different news feeds:

For information on how to use these templates, please consult the tt_news manual.

That's it!

future

Because the patched tt_news class renders the bodytext instead of the summary of a news post, setting the summary for a post no longer has any effect on the XML feed. It works for me at the moment, but it would be nice to have the possibility to choose between the summary and the full length post when defining XML feeds. I've often seen blogs that provide both a summary feed and a full length feed.

Maybe it will be integrated into tt_news sometime, or maybe I'll do it myself.