Lecture Notes
Introduction to Programming and Algorithm Design (COP-1000)

Case study: word play

This chapter provides some very interesting examples of using Python's string facilities to experiment with English words.


Reading word lists

  • Remember our definition of computer science: The study of what can be computed.
  • Can we "compute" the answer to these problems?
  • What percentage of the words in the English language do not contain the letter 'e'?
  • How many words in English have all letters appearing in alphabetic order?
  • How would you answer these questions without a computer? What would you need to have to try to answer these questions with a computer?
  • We would need two basic things:
  1. A way to represent the data (English words) in the computer
  2. An algorithm to manipulate the data to answer the question
  • A file of words is available here: words.txt (the first official crossword/Scrabble word list compiled by Grady Ward and contributed to the public domain as part of the Moby lexicon project)
  • The file consists of a list of words in alphabetic order, separated from each other by the characters \r\n (carriage return and newline)
  • Once you download the file, you can access it with this code:
     
       fin = open("words.txt")
     
    Note that the file must be in the same folder as the program you are running, or you will need to provide the drive/path to the file.
  • One way to input a word from the file after it has been opened is like this:
     
       nextWord = fin.readline()
     
  • nextWord will be a string; if you do not want it to include the trailing \r\n , use the string strip method:
     
       nextWord = fin.readline().strip()
     
  • To traverse the file (reading all the words), the simplest way is with a for loop:
     
       for nextWord in fin:
          word = nextWord.strip()
          do something with word ...
     
  • See example program longWords.py

Exercises

  • This section poses several problems, which we will solve below
  • Read the problems and think about how you might solve them before moving on to the solution; here they are:
  1. How many words do not contain an 'e'?
  2. Find all the words that do not use any of the letters in a specified string.
  3. Find all the words that use only the letters in a specified string.
  4. How many abecedarian words (words whose letters appear in alphabetical order) are there?

Search

  • All the problems posed above can be solved with the same technique: encapsulate a search in a Boolean function that returns True or False depending on the result of a search of the argument (the word) provided
  • What we will be searching in these functions is a string, so we will use a string traversal for the search
  • The general format of the function will be:
def foo(word):
   for letter in word:
      if letter meets some condition:
         return False
   return True
  • The idea is that we traverse a string looking for a letter that causes the word to fail the search (i.e., it does not meet the search criteria); if we find one, we can immediately return False; if we complete the traversal without returning, the word must meet the search criteria, and we return True
  • See the example program hasNoE.py
  • See the example program avoids.py
  • See the example program uses_only.py
  • See the example program is_abecedarian.py
 Updated: 12.13.2010