
Python How To Add Exception?

@martineau I have updated my code. Is this what you meant? How do I handle KeyError instead of NameError? url = 'http://app2.nea.gov.sg/anti-pollution-radiation-protection/air-

Solution 1:

You need to add something to set data[bold_time]:

    if td.find('strong'):
        bold_time = cur_time
        data[bold_time] = ????? # whatever it should be
    cur_time += datetime.timedelta(hours=1)

This should avoid both the NameError and KeyError exceptions as long as a strong element is found. You still might want to code defensively and handle one or both of them gracefully. That is what exceptions are meant for: handling those exceptional cases that shouldn't happen.
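As a minimal sketch of that defensive handling, the snippet below catches KeyError when a time slot has no stored reading. The `data` dictionary, its sample value, and the `reading_for` helper are hypothetical stand-ins for illustration, not the asker's actual code.

```python
import datetime

# Hypothetical sample data: readings keyed by the hour they were taken.
data = {datetime.datetime(2013, 1, 1, 1): '45'}


def reading_for(when, default=None):
    """Return the reading stored for `when`, or `default` if none exists."""
    try:
        return data[when]
    except KeyError:
        # The key was never set, e.g. no strong element was found for that hour.
        return default


print(reading_for(datetime.datetime(2013, 1, 1, 1)))  # key present
print(reading_for(datetime.datetime(2013, 1, 1, 2)))  # key missing, falls back to default
```

A NameError, by contrast, is usually better prevented than caught: assigning the variable a starting value before the loop (for example `bold_time = None`) guarantees the name exists.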

Solution 2:

I read your previous post before it disappeared, and then this one. I think it's a pity to use BeautifulSoup for your goal: judging from the code I see, its use here is complicated, and regexes run roughly 10 times faster than BeautifulSoup.

Here's code that uses only re and furnishes the data you are interested in. I know some people will say that HTML text can't be parsed with regexes. I know, I know... but I don't parse the text; I directly find the chunks of text that are of interest. The source code of this site's web page is apparently very well structured, so there seems to be little risk of bugs. Moreover, tests and verification can be added to keep watch on the source code and to be informed immediately of any changes the webmaster makes to the page.

import re
from httplib import HTTPConnection

hypr = HTTPConnection(host='app2.nea.gov.sg',
                      timeout=300)
rekete = ('/anti-pollution-radiation-protection/'
          'air-pollution/psi/'
          'psi-readings-over-the-last-24-hours')

hypr.request('GET',rekete)
page = hypr.getresponse().read()


patime = ('PSI Readings.+?'
          'width="\d+%" align="center">\r\n'
          ' *<strong>Time</strong>\r\n'
          ' *</td>\r\n'
          '((?: *<td width="\d+%" align="center">'
          '<strong>\d+AM</strong>\r\n'
          ' *</td>\r\n)+.+?)'
          'width="\d+%" align="center">\r\n'
          ' *<strong>Time</strong>\r\n'
          ' *</td>\r\n'
          '((?: *<td width="\d+%" align="center">'
          '<strong>\d+PM</strong>\r\n'
          ' *</td>\r\n)+.+?)'
          'PM2.5 Concentration')
rgxtime = re.compile(patime,re.DOTALL)


patline = ('<td align="center">\r\n'
           ' *<strong>'  # next line = group 1
           '(North|South|East|West|Central|Overall Singapore)'
           '</strong>\r\n'
           ' *</td>\r\n'
           '((?: *<td align="center">\r\n'  # group 2 start
           ' *[.\d-]+\r\n'  #
           ' *</td>\r\n)*)'  # group 2 end
           ' *<td align="center">\r\n'
           ' *<strong style[^>]+>'
           '([.\d-]+)'  # group 3
           '</strong>\r\n'
           ' *</td>\r\n')
rgxline = re.compile(patline)

rgxnb = re.compile('<td align="center">\r\n'
                   ' *([.\d-]+)\r\n'
                   ' *</td>\r\n')


m = rgxtime.search(page)

a,b = m.span(1) # m.group(1) contains the AM data
d = dict((mat.group(1),
          rgxnb.findall(mat.group(2))+[mat.group(3)])
         for mat in rgxline.finditer(page[a:b]))

a,b = m.span(2) # m.group(2) contains the PM data
for mat in rgxline.finditer(page[a:b]):
    d[mat.group(1)].extend(rgxnb.findall(mat.group(2))+[mat.group(3)])


print 'last 3 values'
for k,v in d.iteritems():
    print '%s  :  %s' % (k,v[-3:])
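The verification mentioned above could look something like the following sketch, which fails loudly when a pattern stops matching instead of letting a later call on None raise an obscure AttributeError. The `search_or_fail` helper, the `sample` fragment, and the label argument are hypothetical names introduced for illustration; the fragment only imitates the layout of one table cell and is not real data from the site. The snippet is written to run on Python 3 as well as Python 2.

```python
import re


def search_or_fail(rgx, text, label):
    """Return the match for `rgx` in `text`, or raise if nothing matches."""
    m = rgx.search(text)
    if m is None:
        # Most likely the webmaster changed the page layout.
        raise ValueError('pattern %r no longer matches the page' % label)
    return m


# Hypothetical fragment imitating one table cell of the page.
sample = '<td align="center">\r\n   42\r\n   </td>\r\n'
rgx_cell = re.compile(r'<td align="center">\r\n *([.\d-]+)\r\n *</td>\r\n')

print(search_or_fail(rgx_cell, sample, 'cell').group(1))

try:
    search_or_fail(rgx_cell, '<div>redesigned page</div>', 'cell')
except ValueError as exc:
    print('layout check failed: %s' % exc)
```

In the script above, the same idea would apply to the result of rgxtime.search(page), which is None whenever the page no longer has the expected structure.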
