Monday, February 20, 2012

Python from scratch- The journey continues



Three weeks ago I published my first post about how I started my journey into Python.
It was a huge hit, and it led to a second post about more learning and more problems on my way to becoming a Pythoner.
That was two weeks ago; I was so eager to read more, study more, solve problems and eventually write about it.
All true, but when you lack time it can't happen; to study a language seriously you need time and a clear mind, which I didn't have then because of my family duties.

Last night I finally found some quiet hours and continued reading Google's Class: Chapter #5- Sorting and Chapter #6- Dictionaries and Files.
The Sorting chapter is actually a follow-up to the previous chapter, and those of you who read my previous post know that I complained that the function sorted() was not explained at all; well, the explanation turned up in this chapter.

So I read the sorting page and learnt what a tuple is, and when I came to solve the exercises I found out that I had accidentally already solved them while studying the 'lists' chapter.
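For anyone who has not met sorted() and tuples yet, here is a minimal sketch of how the two work together. The sample data is made up for illustration, and the code assumes Python 3.

```python
# Hypothetical sample data: a list of (word, count) tuples.
pairs = [("the", 5), ("a", 7), ("python", 2)]

# sorted() returns a new list and leaves the original untouched.
# Tuples compare element by element, so the default order here
# is alphabetical by word.
by_word = sorted(pairs)

# A key function picks the value to sort on; reverse=True puts
# the largest count first.
by_count = sorted(pairs, key=lambda pair: pair[1], reverse=True)

print(by_word)
print(by_count)
```

The key function is the part worth remembering: it lets you sort the same tuples by whichever element you care about.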

From here I jumped to the next page- Dict and Files, learned it, memorized it, did all the tasks needed, and then my last task was to solve the 'wordcount' exercise (see below), and guys, this exercise was very hard to understand.
I read the question a few times until I got a sense of what was needed. It got to the point where I became a bit frustrated and even thought of leaving Google's class and moving to another one.
Somehow, after wandering around some Python sites and getting help, it hit me- the fact that I am searching and reading more sites is a strength I am developing thanks to Google's search engine, and it is a blessing to use Google's class.
I would like to thank the Python community out there that is willing to solve any Python issue or help- just go to Google Search and type- "Python <your question>"


As you may guess, I eventually managed to finish the exercise, and when I compared my code to the solution provided by Google I found that they were not that different, which is very inspiring.
So here is the question:
# Define print_words(filename) and print_top(filename) functions.
# You could write a helper utility function that reads a file
# and builds and returns a word/count dict for it.
# Then print_words() and print_top() can just call the utility function.
1. For the --count flag, implement a print_words(filename) function that counts how often each word appears in the text and prints:
word1 count1
word2 count2
...
Print the list in order sorted by word (python will sort punctuation to come before letters- that's fine).
Store all the words as lowercase, so 'The' and 'the' count as the same word.
2. For the --topcount flag implement a print_top(filename) which is similar to print_words() but which prints just the top 20 most common words sorted so the most common word is first, then the next most common and so on

Now, the question is a bit longer, but I shortened it for you.
What made me lose some valuable time was that nowhere does it say what the goal of this exercise is.
Only when I ran the script as is and got the following reply did I realize that I should type --count or --topcount on the command line, which runs the corresponding calculation over a text file of my own.



Understanding the task is the key; once you get it, everything becomes easier.
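The command-line part of the task can be sketched roughly as follows. This is only an illustration under assumptions: the two flag names come from the exercise text, while the helper name parse_args and the file name alice.txt are hypothetical, and the code assumes Python 3.

```python
def parse_args(argv):
    """Return (option, filename), or None if the arguments are not usable.

    Hypothetical helper for illustration; only the two flag names
    are taken from the exercise text.
    """
    # Expect exactly: script name, one flag, one file name.
    if len(argv) != 3:
        return None
    option, filename = argv[1], argv[2]
    if option not in ("--count", "--topcount"):
        return None
    return option, filename


# Simulated command lines, so the sketch runs without real arguments.
print(parse_args(["wordcount.py", "--count", "alice.txt"]))
print(parse_args(["wordcount.py"]))
```

In a real script you would pass sys.argv to such a helper and print a usage message when it returns None.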
From here it took me around one hour to solve it, and here is my solution (which is not far from Google's).
It will be my pleasure to see how you can handle it better.
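If you want a starting point before trying it yourself, here is a minimal sketch of one possible approach to the counting and sorting. It is an illustration only and not necessarily what Google's solution looks like. The function names count_words, words_sorted and top_words are hypothetical, the code works on a string rather than a file to stay self-contained, and it assumes Python 3.

```python
def count_words(text):
    """Build a word -> count dict from a string, lowercasing every word."""
    counts = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts


def words_sorted(counts):
    """(word, count) pairs in alphabetical order, as --count asks for."""
    return sorted(counts.items())


def top_words(counts, limit=20):
    """The `limit` most common (word, count) pairs, most common first."""
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


# Made-up sample text standing in for the contents of a file.
sample = "The cat and the hat and THE bat"
counts = count_words(sample)

for word, count in words_sorted(counts):
    print(word, count)

print(top_words(counts, limit=2))
```

To work on a real file you would read its contents into a string first and pass that to count_words(); print_words(filename) and print_top(filename) from the exercise could then both call the same helper.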



Click here for the next post- Python and RegEx