Contact me with any typos.

Individual Homework 2
File to submit to T-Square:  
    HW02.py

This is an INDIVIDUAL assignment!
Collaboration at a reasonable level will not result in substantially
similar code. Students may only collaborate with fellow students
currently taking CS 2316 and the lecturer.  Collaboration means 
talking through problems, assisting with debugging, explaining a
concept, etc.  You should not exchange code or write code for others.

For Help:
    - Piazza (but do not post code!)
    - Collaborate with others in the class, but do not exchange code!

Notes:
    - Don't forget to include the required comments and collaboration
    statement (as outlined in the syllabus)
    - Don't wait until the last minute.  Coding takes time and is
    usually filled with unexpected issues.

Quality of code:
    Be sure to give yourself enough time so that, once you have code
    that works, you can rethink and refine it to make it more elegant,
    more efficient, etc.  Rarely is the first design the best
    design.  

    Also be sure to write small amounts of code and test it
    before adding more.  You'll end up with fewer and less complicated
    issues to fix if you embrace step-wise refinement.

Good design:
    ABSTRACTION!  Be sure to use good abstraction within your code.  
    Only the most trivial of problems should be solved by writing one 
    function.  Both of these HW problems should result in several
    small functions and a main driver to accomplish the goal.


1. Copy from the course website the saving and loading functions that
   utilized the pickle module.  (Or write your own that use pickle.)
   Integrate calls to those functions into your foreign dictionary 
   translation program.  Call the main function translate.  It has
   one optional parameter - the filename to load that already has a
   pickled translation dictionary in it.  Use the default parameter
   value feature of Python to accomplish this.  Use the default 
   filename "FrenchDictionary.pickle".
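
   A minimal sketch of pickle-based saving and loading with a default
   parameter value is shown here.  The names save_dictionary and
   load_dictionary are hypothetical, not the functions from the course
   website; only the default filename comes from this assignment.

```python
import pickle


def save_dictionary(dictionary, filename="FrenchDictionary.pickle"):
    """Write the dictionary to filename in pickle format."""
    # Pickle files must be opened in binary mode.
    with open(filename, "wb") as f:
        pickle.dump(dictionary, f)


def load_dictionary(filename="FrenchDictionary.pickle"):
    """Read a pickled dictionary back from filename."""
    with open(filename, "rb") as f:
        return pickle.load(f)
```

   Calling load_dictionary() with no argument uses the default filename;
   calling load_dictionary("Other.pickle") overrides it.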

   The first thing your program should do is try to open the pickle file.
   If that fails, let the user know that the file does not exist and ask
   whether they want to create a new dictionary, provide a new filename,
   or quit.  If the new filename also fails, keep prompting with those
   same options.
  
   (If you know how to use a fancy file chooser, feel free to use it 
   within your prompting of the user.  Be sure the same logic applies, 
   in that the user could opt to quit or if the file still doesn't open
   for some reason, the prompting continues.)
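
   One possible shape for that prompting loop is sketched below.  The
   function name load_or_prompt, the one-letter menu answers, and the
   ask parameter (which stands in for input so the loop can be tested
   without typing) are all assumptions made for illustration.

```python
import pickle


def load_or_prompt(filename="FrenchDictionary.pickle", ask=input):
    """Return (dictionary, filename), or (None, None) if the user quits."""
    while True:
        try:
            with open(filename, "rb") as f:
                return pickle.load(f), filename
        except (OSError, pickle.UnpicklingError, EOFError):
            # OSError covers a missing or unreadable file; the other two
            # cover a file that exists but does not hold a valid pickle.
            print("Could not open", repr(filename))
            choice = ask("[n]ew dictionary, [f]ilename, or [q]uit? ")
            choice = choice.strip().lower()
            if choice == "n":
                return {}, filename
            if choice == "q":
                return None, None
            if choice == "f":
                filename = ask("Filename: ").strip()
            # Any other answer loops around and prompts again.
```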

   Ask the user for something to translate, and then use the dictionary
   to perform a rough translation of the words.  Display the complete
   translation to the user.  
   Keep repeating this until the user wants to quit.

   As we did in class, if a word is not in the dictionary, then prompt 
   the user for the word's translation.

      a) if the user does not know the translation, then let the English
         word slide through into the translated text, and do not add the
         word to the dictionary.  (We want only English to French 
         translations in the dictionary.)

      b) if the user does know a translation, then trust that the user is
         right and add that to the dictionary and use that word 
         translation to translate the English to French.  (Basically
         pretend this thing is crowd sourced and later there are ways
         to down vote a bad translation, so we'll just trust it, use it,
         add it, for now.)
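
   Cases a) and b) could be handled by one small helper along these
   lines.  The name lookup_or_ask, the prompt wording, and the choice of
   a blank answer to mean "I don't know" are assumptions, not
   requirements.

```python
def lookup_or_ask(word, dictionary, ask=input):
    """Return the translation of word, asking the user when it is unknown."""
    if word in dictionary:
        return dictionary[word]
    prompt = "Translation for '" + word + "' (blank if unknown): "
    answer = ask(prompt).strip().lower()
    if not answer:
        # Case a: let the English word slide through; dictionary unchanged.
        return word
    # Case b: trust the user, remember the translation, and use it.
    dictionary[word] = answer
    return answer
```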

   So what kind of input should we expect (and handle) from the user?  
   First off, use looping so that the user can perform numerous (0 
   to infinite) translations until they want to quit.  They should 
   be able to quit immediately without doing any translations.

   Make your code fancier one step at a time, adding the ability to handle
   each of these:
      a) a word as input
      b) a phrase as input
      c) a sentence with no punctuation as input
      d) a sentence with punctuation - including commas, periods, etc.
      e) numbers (like "42") should just slide through to the translation
      f) looping using numerous entries of any and all of the above

   No matter what the user types in, your translation can be all lowercase.
   The user can type in any case they want - uppercase, lowercase, mixed case,
   capitalized, etc.

   You can also assume there will always be a one word to one word mapping 
   - which is not really guaranteed to be true in real life.
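
   As a rough illustration of the word-for-word approach, the sketch
   below lowercases the input, peels punctuation off the ends of each
   word, and lets unknown words and numbers slide through.  The function
   names are hypothetical, and this version deliberately leaves out
   prompting the user for unknown words.

```python
import string


def split_punctuation(token):
    """Split a token into (leading punctuation, core, trailing punctuation)."""
    without_lead = token.lstrip(string.punctuation)
    lead = token[:len(token) - len(without_lead)]
    core = without_lead.rstrip(string.punctuation)
    trail = without_lead[len(core):]
    return lead, core, trail


def translate_text(text, dictionary):
    """Translate text one word at a time, keeping punctuation in place."""
    pieces = []
    for token in text.lower().split():
        lead, core, trail = split_punctuation(token)
        # dict.get returns the word itself when no translation is known,
        # so numbers such as "42" pass through unchanged.
        pieces.append(lead + dictionary.get(core, core) + trail)
    return " ".join(pieces)
```

   With the dictionary {"the": "le", "cat": "chat"}, the input
   "The cat, 42 dogs!" comes back as "le chat, 42 dogs!".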

   Once the user wants to quit, write out the pickled dictionary using
   the default name or whatever the user may have specified.  (This should
   work even if the dictionary is empty.)

   (In the beginning just as we did in class, you may want to just hard 
   code a starting dictionary within your function and remove that later.
   While the dictionary is small, I recommend printing it out constantly 
   so you can see every modification happening while you work on your 
   code.  print(dictionary) should suffice for this.)

   HINTS:
   pickle module
   string module
   str class
   >>> dir(str)
   >>> import string
   >>> dir(string)
   >>> dir(dict)

   CODING REMINDERS:
   1) Use good abstraction.  If well designed, your solution will consist 
      of many functions, not just one huge one.  This helps with testing 
      and development since each function will do just a little bit of the 
      logic.  Every function should be extremely short.
   2) You are required to use the pickle module.
   3) Tricky bit - the file (default or user provided) may not actually
      exist.  Use try/except to handle this.
   4) Supply the default filename in your code as the default value of
      the filename parameter.
   
   BONUS CHALLENGES:  (these are completely optional, but worth bonus)
   1) CHALLENGE LEVEL SILVER:
      Make your saving to the file safer by having your code realize when 
      the file already exists before saving.  Ask the user if they want 
      to replace it or instead provide a different filename.  (Of course 
      if that alternative filename already exists, repeat until they agree
      to replace it or a filename for a non-existing file is given or
      they decide to quit without saving.)  HINT: os module.
   2) CHALLENGE LEVEL GOLD:
      Have an option to display in alphabetic order the English/French word
      pairings.  (Alphabetized by the English word.)  This should display nicely
      as one English word and one French word per line.  (With no dictionary,
      list, tuple, etc. notation mixed in.)
   3) CHALLENGE LEVEL PLATINUM: (yes, higher than gold)
      Handling upper and lowercase is a mess.  
      Add in features so that words that are capitalized remain capitalized.
      (Capitalized means the first letter is uppercase.)  
      Words that are in all uppercase, remain in all uppercase.  
      Words that are in all lowercase remain all lowercase.
      HINTS: dir(str) & import string & dir(string) are your friends.
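
   For the platinum challenge, one possible helper is sketched below.
   The name match_case is hypothetical, and treating a single uppercase
   letter such as "I" as capitalized rather than all uppercase is a
   judgment call of this sketch, not a rule from the assignment.

```python
def match_case(original, translated):
    """Return translated with the same letter-case style as original."""
    if len(original) > 1 and original.isupper():
        return translated.upper()        # ALL UPPERCASE stays uppercase
    if original[:1].isupper():
        return translated.capitalize()   # Capitalized stays capitalized
    return translated.lower()            # everything else is lowercased
```

   The dictionary lookup itself would still use the lowercase form of the
   word; match_case is applied to the result afterwards.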


2. Using the Alice in Wonderland text file provided, you are to calculate
   the frequency of each word used in the book.

   Call your function wordFrequency.  
   For flexibility, it has two parameters, each with a default value:
   - the first is the book filename 
     (default "AlicesAdventuresInWonderland.txt")
   - the second is the name of the csv file in which to save the frequencies
     (default "AlicesAdventuresInWonderlandWordFrequency.csv")

   Use a dictionary as your data structure to accumulate the frequencies where
   the key:value pairs are word:count.

   After processing the frequencies, your function writes out the word and 
   frequency pairs sorted alphabetically by word to a proper csv file named
   using the csv filename.

   The format of the csv file:
   The csv file will have a header labeling the columns as WORD and COUNT.
   The format follows the standard csv format (like we used in class).
   (csv files are standard ASCII text files, so you can open one in a simple
   editor to see what's in there, as well as load the file into Excel to 
   be sure things look good.)
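
   Writing the sorted pairs could look something like the sketch below,
   which uses the standard csv module.  The function name
   write_frequencies is hypothetical; the WORD and COUNT header comes
   from the format described above.

```python
import csv


def write_frequencies(counts, csv_filename):
    """Write word:count pairs to csv_filename, sorted alphabetically by word."""
    # newline="" is what the csv module documentation recommends so that
    # no blank lines appear between rows on Windows.
    with open(csv_filename, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["WORD", "COUNT"])
        for word in sorted(counts):
            writer.writerow([word, counts[word]])
```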

   SPECIFICS:
   1) The book file is a normal book with upper and lowercase letters, lots of
      punctuation, a preamble, a postamble, etc.
   2) For the purposes of this assignment:
      a) make all words lowercase.  So even "Alice" will be recorded as "alice".
      b) lose all the punctuation and do not count it.
      c) if any numbers happen to appear in the text, go ahead and count 
         those as words.  Certainly II, III, IV, etc. appear at least in the 
         chapter titles and can/should be counted as words.  (And yes, all the
         words (like "chapter") in the chapter headings will count.)
   3) Go ahead and include counting of words, numbers, etc. found within the 
      entire file.  (This includes the preamble, postamble, etc.)
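
   A minimal sketch of the counting step under these rules follows.  The
   name count_words is hypothetical.  It assumes punctuation means the
   ASCII characters in string.punctuation, so any non-ASCII quotation
   marks in the file would need extra handling.

```python
import string


def count_words(text):
    """Return a dictionary mapping each lowercase word in text to its count."""
    # Turn every punctuation character into a space so that it separates
    # words instead of being counted.
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    counts = {}
    for word in text.lower().translate(table).split():
        counts[word] = counts.get(word, 0) + 1
    return counts
```

   Note that this version splits "Alice's" into "alice" and "s", which is
   exactly the behavior the platinum challenge asks you to improve.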

   BONUS CHALLENGES:
   1) CHALLENGE LEVEL SILVER:
      Change your function so it takes only one filename.  The second filename
      is automatically formed by dropping the extension of the first filename
      and appending "WordFrequency.csv" as the new ending.
   2) CHALLENGE LEVEL GOLD:
      The file actually has both a preamble and a huge postamble explaining 
      that the text comes from Project Gutenberg.  Have your code drop those
      chunks (smartly) so that those parts do not figure into your word
      frequencies.  By "smartly" I mean in the end your code should be smart 
      enough to trim off these parts when presented with other like-formatted
      Gutenberg files.  (Searching should be involved, and not just something
      like skip the first 10 lines.)  
      This feature does not alter the physical file - that will be the same as
      it ever was.
   3) CHALLENGE LEVEL PLATINUM:
      Figure out a way to deal with the single quote ' when it is used within
      a contraction.  Things such as "Alice's" and "I'll" will be broken into
      "alice", "s", "i", "ll" unless this is avoided somehow.  It'd be nice
      if "Alice's", "I'll", "shan't", etc. were dealt with as is, keeping the 
      single quote and without the single quote splitting them.  
      NOTE: Lots of other single quotes appear in the text though which should
      act just like normal punctuation.
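
   For the platinum challenge, one possible approach (an assumption, not
   the required method) is a regular expression that keeps an apostrophe
   only when it has letters or digits on both sides.  The names
   WORD_PATTERN and split_words are hypothetical.

```python
import re

# One or more letters/digits, optionally followed by groups of an
# apostrophe plus more letters/digits, e.g. "alice's" or "shan't".
WORD_PATTERN = re.compile(r"[a-z0-9]+(?:'[a-z0-9]+)*")


def split_words(text):
    """Return the lowercase words in text, keeping in-word apostrophes."""
    return WORD_PATTERN.findall(text.lower())
```

   This sketch handles only the plain ASCII apostrophe.  It also drops an
   apostrophe at the very start or end of a word (as in "'tis"), since
   those look the same as ordinary quotation marks.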