Hashtag Counter Python
Solution 1:
Fundamentally, your function doesn't work because this line
hash_index = post_string.find(char)
will always find the index of the first hashtag in the string. This could be fixed by providing a start index to str.find or, better, by not calling str.find at all and instead maintaining the index while iterating over the string (you can use enumerate for that). Better yet, don't use an index at all; you don't need one if you restructure your parser as a state machine.
That said, a Pythonic implementation would replace the whole function with a regular expression, which would make it drastically shorter, correct, more readable, and likely more efficient.
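To make the first two suggestions concrete, here is a minimal sketch. It assumes the original loop was trying to locate each '#' in a post; the function names hash_positions and hash_positions_enumerate are made up for illustration and are not from the question.

```python
def hash_positions(post_string):
    """Return the index of every '#' in post_string using str.find."""
    positions = []
    start = 0
    while True:
        # Search from `start` instead of from the beginning of the string,
        # so each call finds the next '#' rather than the first one again.
        index = post_string.find("#", start)
        if index == -1:
            break
        positions.append(index)
        start = index + 1
    return positions


def hash_positions_enumerate(post_string):
    """Same result, but the index is tracked while iterating."""
    return [i for i, char in enumerate(post_string) if char == "#"]


post = "spend my #weekend in #zurich"
print(post.find("#"))                  # 9 (always the first '#')
print(hash_positions(post))            # [9, 21]
print(hash_positions_enumerate(post))  # [9, 21]
```

Both helpers return the same positions; the enumerate version avoids rescanning the string and is usually the easier one to read.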
Solution 2:
This should work:
import string

alpha = string.ascii_letters + string.digits

def analyze(posts):
    hashtag_dict = {}
    for post in posts:
        for i in post.split():
            if i[0] == '#':
                current_hashtag = sanitize(i[1:])
                if len(current_hashtag) > 0:
                    if current_hashtag in hashtag_dict:
                        hashtag_dict[current_hashtag] += 1
                    else:
                        hashtag_dict[current_hashtag] = 1
    return hashtag_dict

def sanitize(s):
    # Keep the leading run of letters and digits; stop at the first other character.
    s2 = ''
    for i in s:
        if i in alpha:
            s2 += i
        else:
            break
    return s2

posts = [
    "hi #weekend",
    "good morning #zurich #limmat",
    "spend my #weekend in #zurich",
    "#zurich <3",
    "#lindehof4Ever(lol)"
]
print(analyze(posts))
Solution 3:
With your help, I managed to get 2.75 points out of 4. Thanks a lot! I didn't copy-paste any of your solutions into the correction tool; I used my own version, which I tried to improve with your suggestions. (I am sure that if I had posted any of your solutions, I would have gotten 4/4.)
According to them, the official solution would have been:
def analyze(posts):
    tags = {}
    for post in posts:
        curHashtag = None
        for c in post:
            is_allowed_char = c.isalnum()
            if curHashtag != None and not is_allowed_char:
                if len(curHashtag) > 0 and not curHashtag[0].isdigit():
                    if curHashtag in tags.keys():
                        tags[curHashtag] += 1
                    else:
                        tags[curHashtag] = 1
                curHashtag = None
            if c == "#":
                curHashtag = ""
                continue
            if c.isalnum() and curHashtag != None:
                curHashtag += c
        if curHashtag != None:
            if len(curHashtag) > 0 and not curHashtag[0].isdigit():
                if curHashtag in tags.keys():
                    tags[curHashtag] += 1
                else:
                    tags[curHashtag] = 1
    return tags
This is of course not an elegant solution, but a solution using exclusively what we have learned so far. Maybe this helps another beginner, who wants to use the tools they have to solve this exercise.
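For comparison, the rules that the official solution appears to apply (a tag is the run of letters and digits after '#', and it only counts if it is non-empty and does not start with a digit) can be sketched with a regular expression. This is an approximation that assumes ASCII-only tags, whereas str.isalnum also accepts non-ASCII letters; the names TAG and analyze_regex are made up for illustration.

```python
import re
from collections import Counter

# Assumption: ASCII-only tags. A '#' followed by a letter,
# then any number of letters or digits.
TAG = re.compile(r"#([A-Za-z][A-Za-z0-9]*)")


def analyze_regex(posts):
    """Count hashtags per the rules above and return a plain dict."""
    counts = Counter()
    for post in posts:
        counts.update(TAG.findall(post))
    return dict(counts)


posts = [
    "hi #weekend",
    "good morning #zurich #limmat",
    "spend my #weekend in #zurich",
    "#zurich <3",
    "#lindehof4Ever(lol)",
    "#4ever",  # starts with a digit, so it is not counted
]
print(analyze_regex(posts))
# {'weekend': 2, 'zurich': 3, 'limmat': 1, 'lindehof4Ever': 1}
```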
Solution 4:
Well, this task can be done with regular expressions; don't be afraid to use them ;) Here is a quick solution:
#!/usr/bin/python3.4
import re

PATTERN = re.compile(r'#(\w+)')
posts = [
    "hi #weekend",
    "good morning #zurich #limmat",
    "spend my #weekend in #zurich",
    "#zurich <3"]
container = {}
for post in posts:
    # findall returns only the captured group, i.e. the tag without '#'
    for element in PATTERN.findall(post):
        container[element] = container.get(element, 0) + 1
print(container)
Result:
{'zurich': 3, 'limmat': 1, 'weekend': 2}
EDIT
I would like to use Counter from collections here as well.
#!/usr/bin/python3.4
import re
from collections import Counter
PATTERN = re.compile(r'#(\w+)')
posts = [
    "hi #weekend",
    "good morning #zurich #limmat",
    "spend my #weekend in #zurich",
    "#zurich <3"]
words = [word for post in posts for word in PATTERN.findall(post)]
counted = Counter(words)
print(counted)
# Result: Counter({'zurich': 3, 'weekend': 2, 'limmat': 1})