Getting started
>>> import feedparser
>>> d = feedparser.parse("http://feedparser.org/docs/examples/atom10.xml")
>>> d['feed']['title']             # feed data is a dictionary
u'Sample Feed'
>>> d.feed.title                   # get values attr-style or dict-style
u'Sample Feed'
>>> d.channel.title                # use RSS or Atom terminology anywhere
u'Sample Feed'
>>> d.feed.link                    # resolves relative links
u'http://example.org/'
>>> d.feed.subtitle                # parses escaped HTML
u'For documentation <em>only</em>'
>>> d.channel.description          # RSS terminology works here too
u'For documentation <em>only</em>'
>>> len(d['entries'])              # entries are a list
1
>>> d['entries'][0]['title']       # each entry is a dictionary
u'First entry title'
>>> d.entries[0].title             # attr-style works here too
u'First entry title'
>>> d['items'][0].title            # RSS terminology works here too
u'First entry title'
>>> e = d.entries[0]
>>> e.link                         # easy access to alternate link
u'http://example.org/entry/3'
>>> e.links[1].rel                 # full access to all Atom links
u'related'
>>> e.links[0].href                # resolves relative links here too
u'http://example.org/entry/3'
>>> e.author_detail.name           # author data is a dictionary
u'Mark Pilgrim'
>>> e.updated_parsed               # parses all date formats
(2005, 11, 9, 11, 56, 34, 2, 313, 0)
>>> e.content[0].value             # sanitizes dangerous HTML
u'<div>Watch out for <em>nasty tricks</em></div>'
>>> d.version                      # reports feed type and version
u'atom10'
>>> d.encoding                     # auto-detects character encoding
u'utf-8'
>>> d.headers.get('Content-type')  # full access to all HTTP headers
u'application/xml'
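The session above reads the same values attr-style or dict-style, and under either RSS or Atom names. As a rough illustration of how a dictionary could behave that way, here is a minimal standard-library sketch. It is not feedparser's own implementation: the class name AliasDict and its alias table are hypothetical and cover only the names used in the session.

```python
class AliasDict(dict):
    """Dictionary with attribute-style access and key aliases (illustrative only)."""

    # Hypothetical alias table: RSS-style name -> Atom-style name.
    ALIASES = {"channel": "feed", "items": "entries", "description": "subtitle"}

    def __getitem__(self, key):
        # Translate an alias to its canonical key before the lookup.
        return super().__getitem__(self.ALIASES.get(key, key))

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails; fall back to keys.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


# Hand-built data standing in for a parsed feed (no network access needed).
entry = AliasDict(title="First entry title")
d = AliasDict(
    feed=AliasDict(title="Sample Feed", subtitle="For documentation <em>only</em>"),
    entries=[entry],
)

print(d["feed"]["title"])     # dict-style, Atom name
print(d.channel.title)        # attr-style, RSS name
print(d.channel.description)  # RSS alias for subtitle
print(d["items"][0].title)    # RSS alias for entries
```

In this sketch, d.items still resolves to the built-in dict method, because attribute fallback only runs when normal lookup fails. The "items" alias is therefore reachable only as d["items"], which matches the form used in the session above.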