RSS Feeds of BBC News

Max Kleiner

Posted on February 1, 2021 by maxbox4

maXbox Starter 81 - RSS Feeds in Code - Max Kleiner

"Time goes, you say? Ah, no! alas, time stays, we go."
- Henry Austin Dobson


At its core, RSS refers to simple text files (XML/RDF) carrying more or less important, regularly updated information: news pieces, articles, weather info, opinion mining, that sort of thing.

In the following I want to explore this topic using the BBC News feeds. News feeds allow you to see when websites have added new content. You can get the latest headlines and video in one place, as soon as they are published, without having to visit the websites the feed comes from.

BBC News, our example, provides feeds for both its desktop website and its mobile site; the most popular feeds are listed here: https://www.bbc.co.uk/news/10628494.

First we define the URL to get the content from:

Const
RSS_NewsFeed = 'http://feeds.bbci.co.uk/news/world/rss.xml';

RSS is an XML-based document format for syndicating news and other timely, news-like information. It provides headlines, URLs to the source documents and brief descriptions in a format that is easy to understand and use.

RSS-based “News Readers” and “News Aggregators” display RSS headlines on workstation desktops. Users consume RSS content with apps called feed readers or aggregators (newer versions of web browsers offer built-in support for RSS feeds): a user subscribes to a feed by entering the link of the RSS feed into the feed reader. Software libraries also exist to read the RSS format and present RSS headlines on web pages and in other online applications, as in our script example with SimpleRSS. That way you can feed a memo or text component in your form.

An empty baseline document generated by SimpleRSS looks like this:

<?xml version="1.0" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns="http://purl.org/rss/1.0/">
  <!-- XML Generated by SimpleRSS http://simplerss.sourceforge.net at Sat, 15 Jan 2021 11:43:18 -->
  <channel xmlns="" rdf:about="">
    <title>Title Required</title>
    <link>Link Required</link>
    <description>Description Required</description>
    <items>
      <rdf:Seq />
    </items>
  </channel>
</rdf:RDF>


This structure is a convention and suitable for a record:

type
TRSSItem2 = record
FPubDate: TDate;
FLink: string;
FTitle: string;
FDescription: string;
end;
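
As a rough sketch of how feed items could be mapped onto such a record, the following Free Pascal program reads a small inline RSS 2.0 sample with the DOM and XMLRead units from Free Pascal's fcl-xml package. The sample XML, the TFeedEntry type and the ChildText helper are made up for illustration, the publication date is kept as raw text, and none of this is SimpleRSS's own code.

```pascal
program RssRecordDemo;
{$mode objfpc}{$H+}

uses
  SysUtils, Classes, DOM, XMLRead;

type
  // Hypothetical record, modelled loosely on the one in the article.
  TFeedEntry = record
    PubDateText: string;   // raw RFC 822 text, not yet converted
    Link: string;
    Title: string;
    Description: string;
  end;

const
  // Invented sample data, not a real BBC feed.
  SampleXml =
    '<?xml version="1.0"?>' +
    '<rss version="2.0"><channel>' +
    '<title>Sample channel</title>' +
    '<item><title>First headline</title>' +
    '<link>https://example.org/1</link>' +
    '<description>Short summary</description>' +
    '<pubDate>Mon, 01 Feb 2021 00:00:57 GMT</pubDate></item>' +
    '</channel></rss>';

// Returns the text of the first direct child named Tag, or '' if missing.
function ChildText(Parent: TDOMNode; const Tag: DOMString): string;
var
  Node: TDOMNode;
begin
  Node := Parent.FindNode(Tag);
  if Node <> nil then
    Result := string(Node.TextContent)
  else
    Result := '';
end;

var
  Doc: TXMLDocument;
  Source: TStringStream;
  Nodes: TDOMNodeList;
  Entry: TFeedEntry;
  I: Integer;
begin
  Source := TStringStream.Create(SampleXml);
  try
    ReadXMLFile(Doc, Source);
    try
      Nodes := Doc.GetElementsByTagName('item');
      try
        for I := 0 to Integer(Nodes.Count) - 1 do
        begin
          // Copy the four child elements of each <item> into the record.
          Entry.Title := ChildText(Nodes[I], 'title');
          Entry.Link := ChildText(Nodes[I], 'link');
          Entry.Description := ChildText(Nodes[I], 'description');
          Entry.PubDateText := ChildText(Nodes[I], 'pubDate');
          WriteLn(I, ': ', Entry.Title, ' | ', Entry.Link, ' | ',
            Entry.PubDateText);
        end;
      finally
        Nodes.Free;   // the node list is owned by the caller
      end;
    finally
      Doc.Free;
    end;
  finally
    Source.Free;
  end;
end.
```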

The FPubDate field gave me some trouble because I had to convert it to the TRFC822DateTime format. A date-time value that deviates from this format may be technically valid, but it is likely to cause interoperability issues.

The value must meet the date and time specification defined by RFC 822, with the exception that the year should be expressed as four digits.

The syntax looks like this (single-spaced and without comments):

<pubDate>Fri, 02 Oct 2020 08:00:00 EST</pubDate>
<pubDate>Fri, 02 Oct 2020 13:00:00 GMT</pubDate>
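
To make the format concrete, here is a minimal Free Pascal sketch that splits such a single-spaced pubDate string into its parts and builds a TDateTime from it. The ParseRfc822 function is a hypothetical helper written for this example, not the TRFC822DateTime class from SimpleRSS; it expects a four-digit year and a time with seconds, and it only returns the zone token as text without applying any offset.

```pascal
program Rfc822Demo;
{$mode objfpc}{$H+}

uses
  SysUtils, Classes;

const
  MonthNames: array[1..12] of string =
    ('Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
     'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec');

// Hypothetical helper: parses 'Mon, 01 Feb 2021 00:00:57 GMT'.
// The zone (e.g. 'GMT') is handed back as text and NOT applied.
function ParseRfc822(const S: string; out Zone: string): TDateTime;
var
  Rest: string;
  Parts: TStringList;
  Month, I: Integer;
begin
  Rest := Trim(S);
  // The day name is optional in RFC 822; drop it when present.
  I := Pos(',', Rest);
  if I > 0 then
    Rest := Trim(Copy(Rest, I + 1, Length(Rest)));

  Parts := TStringList.Create;
  try
    Parts.StrictDelimiter := True;
    Parts.Delimiter := ' ';
    Parts.DelimitedText := Rest;   // day, month, year, time, zone
    if Parts.Count <> 5 then
      raise EConvertError.CreateFmt('Unexpected date format: "%s"', [S]);

    // Translate the three-letter month name into its number.
    Month := 0;
    for I := 1 to 12 do
      if SameText(Parts[1], MonthNames[I]) then
        Month := I;
    if Month = 0 then
      raise EConvertError.CreateFmt('Unknown month: "%s"', [Parts[1]]);

    Zone := Parts[4];
    // The time part is expected as HH:MM:SS.
    Result := EncodeDate(StrToInt(Parts[2]), Month, StrToInt(Parts[0])) +
      EncodeTime(StrToInt(Copy(Parts[3], 1, 2)),
                 StrToInt(Copy(Parts[3], 4, 2)),
                 StrToInt(Copy(Parts[3], 7, 2)), 0);
  finally
    Parts.Free;
  end;
end;

var
  Stamp: TDateTime;
  ZoneText: string;
begin
  Stamp := ParseRfc822('Mon, 01 Feb 2021 00:00:57 GMT', ZoneText);
  WriteLn(FormatDateTime('yyyy-mm-dd hh:nn:ss', Stamp), ' ', ZoneText);
end.
```

A production parser would also have to handle numeric offsets such as +0100 and two-digit years, which this sketch deliberately leaves out.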


There is, of course, plenty of material on the web, but in my case I use a component whose objects I had to adapt. The good news is that the core code is short and simple:

//RSS Script Feed Snippet:
with TSimpleRSS.create(self) do begin
  XMLType:= xtRDFrss;
  IndyHTTP:= TIdHTTP.create(self);   //Indy is the HTTP provider
  LoadFromHTTP(RSS_NewsFeed);
  //LoadFromHTTP(Climatefeed);
  writeln('RSSVersion: '+Version);
  writeln('SimpleRSSVersion: '+SimpleRSSVersion);
  //print index, title and publication date of each item
  for it:= 0 to items.count-1 do
    writeln(itoa(it)+': '+Items[it].title+': '
                    +items[it].pubdate.getdatetime);
end;


Items are held together in the class TRSSItems, which inherits from TOwnedCollection. These collection item objects in turn contain their own published child property, which descends from TCollection and contains its own TCollectionItem descendants, so we get a nested TCollection/TCollectionItem scenario.
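
The nesting is easier to see in a small stand-alone sketch. The Free Pascal program below uses only TOwnedCollection and TCollectionItem from the Classes unit; the TFeedItem and TCategory classes and their properties are invented for illustration and are not the actual SimpleRSS declarations.

```pascal
program CollectionDemo;
{$mode objfpc}{$H+}

uses
  SysUtils, Classes;

type
  // Hypothetical inner item: one category label of a feed item.
  TCategory = class(TCollectionItem)
  private
    FTerm: string;
  published
    property Term: string read FTerm write FTerm;
  end;

  // Hypothetical outer item: owns a child collection of categories.
  TFeedItem = class(TCollectionItem)
  private
    FTitle: string;
    FCategories: TOwnedCollection;
  public
    constructor Create(ACollection: TCollection); override;
    destructor Destroy; override;
  published
    property Title: string read FTitle write FTitle;
    property Categories: TOwnedCollection read FCategories;
  end;

constructor TFeedItem.Create(ACollection: TCollection);
begin
  inherited Create(ACollection);
  // The item itself is the owner of its nested collection.
  FCategories := TOwnedCollection.Create(Self, TCategory);
end;

destructor TFeedItem.Destroy;
begin
  FCategories.Free;
  inherited Destroy;
end;

var
  Feed: TOwnedCollection;
  Item: TFeedItem;
  I, J: Integer;
begin
  Feed := TOwnedCollection.Create(nil, TFeedItem);
  try
    // Adding to the outer collection creates a TFeedItem instance.
    Item := TFeedItem(Feed.Add);
    Item.Title := 'Example headline';
    TCategory(Item.Categories.Add).Term := 'World';
    TCategory(Item.Categories.Add).Term := 'Europe';

    // Walk the outer collection, then each nested collection.
    for I := 0 to Feed.Count - 1 do
    begin
      Item := TFeedItem(Feed.Items[I]);
      WriteLn(I, ': ', Item.Title);
      for J := 0 to Item.Categories.Count - 1 do
        WriteLn('   category: ', TCategory(Item.Categories.Items[J]).Term);
    end;
  finally
    Feed.Free;   // frees all items and their nested collections
  end;
end.
```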

The HTTP provider is Indy by default, and we can load the feeds as a stream with the LoadFromHTTP() method. The feeds supplied here use the RSS 2.0 format, so the reported RSS version is 2.0. Each RSS item links to the HTML document it describes. Additional technical information is available from the website of the feed provider, in our case BBC News.

The XMLType is based on the RDF proposal. RDF is a building block of the Semantic Web. The Semantic Web enables devices to seek out knowledge distributed throughout the Web, mesh or mix it, and then take action based on it. Simply said: the Resource Description Framework (RDF) is the W3C standard for encoding knowledge.

After running the script you get output similar to this:

29: Life in a Day: Kevin Macdonald says film 'reinforces everyones similarities': Mon, 01 Feb 2021 00:00:57 GMT
link: https://www.bbc.co.uk/news/entertainment-arts-55861945
descript: Kevin Macdonalds documentary features personal videos from across the world - all shot on the same day.
rssitem.title BBC News - World
rssitem.link https://www.bbc.co.uk/news/

To build your own provider, for example with TLS 1.3, or to link in your own XML parser or DOM vendor, you can use the LoadFromStream() method. Here is the code block for a weather service with HTTPS and LoadFromStream():

Const Weatherfeed5Bern=
  'https://weather-broker-cdn.api.bbci.co.uk/en/forecast/rss/3day/2661552';

//fetches the feed over HTTPS into the given stream and returns it
function GetBlogStream8(const S_API, pData: string;
                        astrm: TStringStream): TStringStream;
begin
  HttpGET(S_API, astrm); //maps HTTPS from WinInet_HttpGet
  result:= astrm;
end;

strm:= TStringStream.create('');
strm:= GetBlogStream8(WeatherFeed5Bern,'', strm);

with TSimpleRSS.create(self) do begin
  XMLType:= xtRDFrss; // bbcnews: xtRDFrss;
  //possible types: xtRDFrss, xtRSSrss, xtAtomrss, xtiTunesrss
  //GenerateXML;
  LoadFromStream(strm);
  SaveToFile('C:\maXbox\Lazarus\rssbbctest.xml');
  writeln('RSSFeedVersion: '+Version);
  writeln('SimpleRSSVersion: '+SimpleRSSVersion);
  for it:= 0 to items.count-1 do
    writeln(itoa(it)+': '+Items[it].title+':'+items[it].pubdate.getdatetime);
  strm.Free;
end;


The items iterated by the RSS reader produce this output (3-day forecast):

RSSFeedVersion: 2.0
SimpleRSSVersion: ver 0.4 (BlueHippo) Release 1
0: Today: Light Snow, Min Temperature:-5°C (23°F) Max Temperature: 0°C (32°F): Sat, 15 Jan 2021 10:37:30 Z
1: Saturday:Light Cloud,Min Temperature:-3°C (27°F) Max Temperature:-1°C (30°F):Sat,15 Jan 2021 10:37:30 Z
2: Sunday:Sleet Showers,Min Temperature:-1°C (31°F) Max Temperature:3°C (38°F):Sat,15 Jan 2021 10:37:30 Z


By the way, weather data also have their own format, called NWS, which is not to be confused with RSS and cannot be read by RSS readers and aggregators. These files present more detailed information than the RSS feeds, in strings that are friendly for parsing. Both the RSS and XML feeds offer URLs to icon images.

Conclusion:
Really Simple Syndication (RSS) is a family of web formats used to publish frequently updated digital content. It is most commonly used for news articles, weather reports, traffic services and other content that changes quickly; RSS feeds may also include audio files (podcasts) or even video files (vodcasts). The SimpleRSS components provide methods for accessing, importing, exporting and working with RSS, RDF, Atom and iTunes feeds. The library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License.


Next time I will show an online translation service with sentiment analysis:
Trump's false election fraud claims face a dead end
Trumps falsche Wahlbetrugsansprüche stehen vor einer Sackgasse
Trump's valse verkiezingsfraudeclaims lopen dood
Les fausses allégations de fraude électorale de Trump font face à une impasse


Ref Script & Component:

http://www.softwareschule.ch/examples/bbcnews.txt

http://simplerss.sourceforge.net

script: 1017_XmlDocRssParser.pas

Doc:

https://maxbox4.wordpress.com

http://feedvalidator.org/docs/rss2.html

http://web.resource.org/rss/1.0/modules/content/

Max Kleiner

Max Kleiner's professional environment is in the areas of OOP, UML and coding - among other things as a trainer, developer and consultant.