python-CS 230

CS 230 Program 4: Data Structures and Functions

Due on Tuesday, March 30 11:59 PM

Project Description: Word Cloud

A word cloud is an image composed of words, usually from a particular text or topic, in which the size of each word indicates its frequency or importance. In this project you will use a Python module to create word clouds and manage a data file containing their specifications. Your program will allow a user to save, edit, and delete word clouds from the data file and save or delete their image files from your computer’s storage.

When making a word cloud from an article or other text, you usually remove all stop words (common words such as a, an, the, is, are, of, etc.) from the word list before creating the word cloud. A list of stop words is provided in stopwords.txt in the project starter folder.

Folders

Download the wordcloud.zip folder, which contains starter and supporting files for this project. Study and run the wordcloud_example.py program for an example of how to generate and save a word cloud using the WordCloud module. (You will need to import this module into your Python program.)

Your wordcloud project has four folders:

    data, containing the text files to be used for making word clouds
    fonts, containing several fonts to use when designing a word cloud
    masks, containing several mask files, or shapes, to use when designing a word cloud
    images, containing word cloud images generated by your program

(Figure: example shape and image files.)

Several sample shapes, fonts, and texts are provided for your use. You can add more!

Python Libraries

The project makes use of these Python libraries. You will need to add them to your Python project if they are not there already. To do so, go to File… Settings / Project / Python Interpreter and click the + in the upper right corner. Locate the modules if they are not shown.
    wordcloud: contains methods for creating a word cloud with selected options
    os (operating system utilities): gets the path on your computer to locate the folders containing fonts, text files, masks, or word cloud images
    PIL (Python Imaging Library): used to interact with word cloud image files
    numpy (Numerical Python): needed to specify how to store the mask file in memory
    helper: contains functions for input validation. These functions were written especially for you to use in this project, so you don’t have to write them yourself.

To access a function in the helper.py file from your code, precede the function name with helper followed by a dot. For example, to call the default_input() function in the helper module, you would write

    fname = helper.default_input("Enter the filename:", "article.txt")

Word Cloud Options

When creating a word cloud, you can specify these options:

    filename          name of the file in the data directory containing the text from which to generate the word cloud
    max_words         maximum number of words to include in the word cloud
    font              name of the font to use in the word cloud
    mask              name of the mask or shape to use in the word cloud
    bg_color          background color for the word cloud
    text_color_map    color scheme for the text in the word cloud

The value of bg_color can be any web-friendly color name (so most common colors will probably work), and the value of text_color_map can be any color map scheme that Python supports. The color_map_select() function in helper.py selects a few to try, but feel free to use any of these.

Storing Cloud Options

The file clouds.txt, located in the same folder as your Python program, is a text file containing the specifications for each word cloud that you create. Each line holds the cloud name followed by its six options, separated by commas. A sample clouds.txt file containing three word clouds looks like this:

    cloud1,article.txt,20,junegull,clover,white,purples
    cloud2,article.txt,50,junegull,snowman,papayawhip,greens
    cloud3,onefish.txt,50,crayons,fish,navy,cool
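Each line in this format splits on commas into a cloud name followed by six option values. The following is a minimal sketch of that split, assuming the field order shown in the sample file. The name parse_cloud_line is hypothetical and used only for illustration; it is not a function from the starter code.

```python
def parse_cloud_line(line):
    """Split one clouds.txt line into (cloud_name, options).

    Hypothetical helper for illustration only. All values are kept
    as strings, exactly as they appear in the file.
    """
    fields = line.strip().split(",")   # drop the trailing newline, then split
    name = fields[0]                   # first value is the cloud name
    options = fields[1:]               # remaining values are the options
    return name, options


sample = "cloud3,onefish.txt,50,crayons,fish,navy,cool\n"
name, options = parse_cloud_line(sample)
print(name)     # cloud3
print(options)  # ['onefish.txt', '50', 'crayons', 'fish', 'navy', 'cool']
```

Note that max_words comes back as the string '50' here. Whether to convert it to an int when loading is a design choice this sketch leaves open.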
Your program will represent the word clouds as a dictionary of lists, where the key is the cloud name and the value is a list of that cloud’s options. For example, dict_clouds has the value:

    {
        'cloud1': ['onefish.txt', 50, 'bebas', 'square', 'blue', 'purples'],
        'cloud2': ['article.txt', 50, 'junegull', 'snowman', 'papayawhip', 'greens'],
        'cloud3': ['onefish.txt', 50, 'crayons', 'fish', 'navy', 'cool']
    }

Program Structure

This program is menu-driven. The main function loads previously saved clouds from the clouds.txt file into a dictionary called dict_clouds, and then runs a loop so the user can select one of the options below. Users enter the first letter of the option and the program calls the corresponding function. All user input for selecting the file name, fonts, masks, colors, and max_words makes use of helper.default_input() and the _select() functions in the helper.py file.

New Cloud

    Asks the user for the file name and cloud options.
    Calls create_cloud() to create the cloud, passing in the options through the parameter list. The function returns the cloud image.
    Shows the cloud image on the screen.
    Asks if the user wants to save this cloud. If yes:
        Asks the user for a cloud name (and validates that it is unique).
        Saves the cloud image to the images folder in a file named cloudname_image.png.
        Stores the cloud options in dict_clouds.
        Writes the updated dict_clouds dictionary to the clouds.txt text file.

List Clouds

    Lists each cloud name and its options from dict_clouds on the screen.

Open Cloud

    Asks the user for the name of the cloud to open.
    Validates that the name is correct (or shows the names of existing clouds).
    Obtains the cloud options from the dictionary.
    Calls create_cloud() to create the cloud.
    Converts the cloud to an image.
    Displays the image.

Delete Cloud

    Asks the user for the name of the cloud to delete.
    Validates that the name exists in dict_clouds.
    Removes the cloud info from dict_clouds.
    Writes the new dict_clouds to the clouds.txt file.
    Removes the cloudname_image.png file from the images folder.

Edit Cloud

    Asks the user for the name of the cloud to edit.
    Validates that the name is stored in dict_clouds.
    Obtains the options for this cloud from dict_clouds.
    Displays each option; the user presses ENTER to keep it, or enters a new value for the option.
    Stores the cloud options in dict_clouds.
    Calls create_cloud() to create the updated word cloud.
    Shows the image on the screen.
    Saves the updated cloud image to the images folder in a file named cloudname_image.png.
    Writes dict_clouds to the clouds.txt text file.

Word List

    Opens the specified text file (uses the default file if none is specified).
    Calls get_freq() to get a dictionary of words and their frequencies.
    Calls remove_stopwords() to remove the stop words from the dictionary.
    Prints the top ten words by frequency (sort the frequency dictionary in reverse order by value and show the first ten words and their frequencies).

Quit

    Displays a thank you message and exits the program.

This chart shows which functions are called when, and may help you understand how they all fit together.
    main()
        new_cloud()
            create_cloud()
            save_dict_to_file()
        list_clouds()
        open_cloud()
            create_cloud()
        delete_cloud()
        edit_cloud()
            create_cloud()
            save_dict_to_file()
        word_list()
            get_freq()
            remove_stopwords()
        quit()
        load_file_to_dict()

Additional Functions

create_cloud()

    Sets up variables containing the file names (with paths) for the font, text file, and mask.
    Calls get_freq() to get a dictionary of words and their frequencies.
    Calls remove_stopwords() to remove the stop words from the dictionary.
    Calls WordCloud() and wc.generate_from_frequencies() (from the WordCloud module you imported) to generate the word cloud.
    Returns wc as the value of the function.

get_freq()

    Opens the text file.
    Creates a dictionary dict_freq containing each unique word and the number of times it appears in the file.
    Uses helper.clean() to remove any punctuation found in each word.
    Returns dict_freq.

remove_stopwords()

    Opens stopwords.txt (located in the same folder as your program).
    For each word in the frequency dictionary, if the word is one of the stop words, removes it and its frequency from the dictionary.
    Returns the updated frequency dictionary (with stop words removed).

load_file_to_dict()

    Opens the clouds.txt file for reading.
    Reads each line: the first value is the cloud name, and the remaining values are the cloud options.
    Stores the cloud in dict_clouds.
    Prints a message notifying the user that the file was loaded.
    Returns dict_clouds as the value of this function.

save_dict_to_file()

    Opens clouds.txt for writing.
    For each cloud in dict_clouds:
        Gets the key (cloud name) and the cloud options from dict_clouds.
        Writes that info as a line in the clouds.txt file (make sure the line ends with \n).
    Prints a message notifying the user that the file was updated.

Helper Functions

These functions are provided in the helper.py file to assist with obtaining and validating input. Call them by prefixing each name with helper.
For example,

    s = helper.clean(s)

(Chart: create_cloud() calls get_freq(), remove_stopwords(), WordCloud(), and wc.generate_from_frequencies().)

default_input()
    Takes a prompt and a default value. If the user presses ENTER, it returns the default value.

font_select()
    Obtains the user’s font selection and validates it.

mask_select()
    Obtains the user’s mask selection and validates it.

bg_color_select()
    Obtains the user’s background color selection and validates it.

color_map_select()
    Obtains the user’s color map selection for the text and validates it.

clean()
    Takes a string and returns the string with any spaces, punctuation, or other special characters removed.

Useful Lines of Code

Run the wordcloud_example.py program provided for an example of how to create a word cloud given a dictionary of words and their frequencies. That’s all you need to know about word clouds for this project.

os.getcwd()

    DIR = os.getcwd()

Calls the os module’s getcwd() (get current working directory) function to get the path to the current working directory on your computer containing your Python program. On my computer, that gives the value C:Usersjxu CS230S21-CS230Codewordcloud

path.join

    cloud_file_path = path.join(DIR, "images", cloud_name+"_image.png")

Uses path.join to create the path to a file in a subdirectory, starting with a root directory, then specifying the images folder and the cloud file name. path.join joins the parts using the path separator of your device’s operating system; because DIR is an absolute path, the result is an absolute path too. On my computer, the code above results in the value C:UsersjxuCS230S21-CS230Codewordcloudimagescloud1_image.png

    wc.to_file(cloud_file_path)

Writes a word cloud image file to disk.

    os.remove(path.join(DIR, "images", cloud_name+"_image.png"))

Removes a file from your computer’s hard drive.

    image = wc.to_image()
    image.show()

Creates an image from the word cloud and displays it on the screen.

    mask_fname = np.array(Image.open(path.join(DIR, "masks", MASKS[mask])))

Opens a mask file in the masks directory to be passed as an option to WordCloud().
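The path-handling lines above (os.getcwd(), path.join, and the check you may want before os.remove) can be tried on their own with only the standard library. The following is a minimal sketch, assuming a cloud named cloud1. The function image_path_for is a hypothetical name used for illustration and is not part of the starter files.

```python
import os
from os import path


def image_path_for(base_dir, cloud_name):
    """Build the path to a cloud's image file inside the images folder.

    Hypothetical helper for illustration only.
    """
    return path.join(base_dir, "images", cloud_name + "_image.png")


DIR = os.getcwd()  # current working directory of the running program
cloud_file_path = image_path_for(DIR, "cloud1")
print(cloud_file_path)

# os.remove() raises FileNotFoundError when the file does not exist,
# so checking first is one way to avoid a crash when deleting an image.
if path.exists(cloud_file_path):
    print("An image file for cloud1 exists and could be removed.")
else:
    print("No image file for cloud1 yet.")
```

The printed path depends on your operating system and on where the program is run, so no fixed output is shown here.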
    wc = WordCloud(background_color=bg_color, max_words=max_words, font_path=font_fname,
                   mask=mask_fname, contour_width=5, contour_color='black')
    wc.generate_from_frequencies(dictFreq)

Creates a word cloud with the options specified, and with the given frequencies.

Requirements and Rubrics

Add code to the menu functions and the functions listed in red above to complete the program. You may keep or remove the comments provided with each function. Add print statements as would be helpful to you to track how your program runs.

You must follow the instructions strictly. I will run the grading program against your code before I manually check it. Check out the sample runs at the end of this sheet to get a better idea of the expected behavior.

This assignment will be worth 10% toward your course grade.

    #     Function             Points
    1     New Cloud            5
    2     List Clouds          2
    3     Open Cloud           5
    4     Delete Cloud         3
    5     Edit Cloud           5
    6     Word List            5
          Quit
    7     create_cloud         5
    8     save_dict_to_file    5
    9     load_file_to_dict    5
    10    get_freq             5
    11    remove_stopwords     5
          Total:               50

Hints

Look at the code in wordcloud_example.py to see a working example of how to create a word cloud from a dictionary containing words and frequencies. Run it, and make sure you can explain what each line of code does.

Look at the code in helper.py to see the functions provided and make sure you understand what they do.

Look at the wordcloud_starter.py file to find all of the functions that you need to complete.

Get the List Clouds option working first. This way you’ll be able to look at your clouds once you create them.

Get New Cloud to work next. This will require you to write the functions that it depends on (see the chart), so work on those one by one.

Then work on the rest of the functions as you wish. Test each function that you write.

Submission

You must name your Python script file wordcloud.py. If you name it differently or place any space in your file name, you will receive 0 for this assignment.
Submit wordcloud.py only. Do not submit helper.py or any of the image, mask, font, or text files.

Sample Run

Highlighted lines are for debugging only. You may include these or other optional output.

List Clouds

    [N] New Cloud      [D] Delete a Cloud
    [L] List Clouds    [E] Edit a Cloud
    [O] Open Cloud     [Q] Quit
    [W] Word List (Top Ten)
    Please enter your choice: L
    Cloud   File              Words  Font      Mask    BG Color  Text Color
    cloud1  onefish.txt       50     bebas     square  blue      purples
    cloud5  constitution.txt  80     prata     star    gray      blues
    cloud2  article.txt       20     highland  square  white     purples

New Cloud

    [N] New Cloud      [D] Delete a Cloud
    [L] List Clouds    [E] Edit a Cloud
    [O] Open Cloud     [Q] Quit
    [W] Word List (Top Ten)
    Please enter your choice: n
    Enter the filename:[Enter for article.txt ]: onefish.txt
    Available Fonts: highland alpaca bebas crayons junegull prata
    Select a font. [Enter for highland]: bebas
    Available masks: square fish snowman star clover
    Select a mask: [Enter for square]: star
    Enter a web color name for the background.[Enter for white]: orange
    Select a color scheme for the text: purples oranges greens reds blues bugn cool
    Enter a color map for the text. [Enter for purples]: reds
    How many words[Enter for 20]: 80
    Do you want to save this cloud y/n y
    Enter a name for this cloud: cloud6
    Cloud image saved in C:UsersmfrydenbergOneDrive – Bentley UniversityCS230S21-CS230Codewordcloudimagescloud6_image.png
    Saving Dict to File
    {'cloud1': ['onefish.txt', '50', 'bebas', 'square', 'blue', 'purples'], 'cloud5': ['constitution.txt', '80', 'prata', 'star', 'gray', 'blues'], 'cloud2': ['article.txt ', '20', 'highland', 'square', 'white', 'purples'], 'cloud6': ['onefish.txt', 80, 'bebas', 'star', 'orange', 'reds']}
    Saving cloud cloud1: cloud1,onefish.txt,50,bebas,square,blue,purples
    Saving cloud cloud5: cloud5,constitution.txt,80,prata,star,gray,blues
    Saving cloud cloud2: cloud2,article.txt ,20,highland,square,white,purples
    Saving cloud cloud6: cloud6,onefish.txt,80,bebas,star,orange,reds
    Cloud File Updated.
    Cloud database file updated.
    [N] New Cloud      [D] Delete a Cloud
    [L] List Clouds    [E] Edit a Cloud
    [O] Open Cloud     [Q] Quit
    [W] Word List (Top Ten)
    Please enter your choice: L
    Cloud   File              Words  Font      Mask    BG Color  Text Color
    cloud1  onefish.txt       50     bebas     square  blue      purples
    cloud5  constitution.txt  80     prata     star    gray      blues
    cloud2  article.txt       20     highland  square  white     purples
    cloud6  onefish.txt       80     bebas     star    orange    reds

Open Cloud

    [N] New Cloud      [D] Delete a Cloud
    [L] List Clouds    [E] Edit a Cloud
    [O] Open Cloud     [Q] Quit
    [W] Word List (Top Ten)
    Please enter your choice: o
    Enter the name of the cloud to open: cloud1
    In create cloud onefish.txt 50 bebas square blue purples
    Using Mask Fname: C:UsersmfrydenbergOneDrive – Bentley UniversityCS230S21-CS230Codewordcloudmaskssquare.png
    MASKS square.png

Word List

    [N] New Cloud      [D] Delete a Cloud
    [L] List Clouds    [E] Edit a Cloud
    [O] Open Cloud     [Q] Quit
    [W] Word List (Top Ten)
    Please enter your choice: w
    Enter the filename:[Enter for article.txt]: onefish.txt
    fish   12  1
    hop    12  2
    oh     10  3
    wish   8   4
    bed    7   5
    drink  7   6
    fun    6   7
    hello  6   8
    box    6   9
    likes  6   10

Delete Cloud

    [N] New Cloud      [D] Delete a Cloud
    [L] List Clouds    [E] Edit a Cloud
    [O] Open Cloud     [Q] Quit
    [W] Word List (Top Ten)
    Please enter your choice: d
    Enter cloud to delete: cloud5
    Saving Dict to File
    {'cloud1': ['onefish.txt', '50', 'bebas', 'square', 'blue', 'purples'], 'cloud2': ['article.txt', '20', 'highland', 'square', 'white', 'purples'], 'cloud6': ['onefish.txt', '80', 'bebas', 'star', 'orange', 'reds']}
    Saving cloud cloud1: cloud1,onefish.txt,50,bebas,square,blue,purples
    Saving cloud cloud2: cloud2,article.txt,20,highland,square,white,purples
    Saving cloud cloud6: cloud6,onefish.txt,80,bebas,star,orange,reds
    Cloud File Updated.
    Cloud cloud5 deleted.
    Cloud database file updated.
    [N] New Cloud      [D] Delete a Cloud
    [L] List Clouds    [E] Edit a Cloud
    [O] Open Cloud     [Q] Quit
    [W] Word List (Top Ten)
    Please enter your choice: L
    Cloud   File          Words  Font      Mask    BG Color  Text Color
    cloud1  onefish.txt   50     bebas     square  blue      purples
    cloud2  article.txt   20     highland  square  white     purples
    cloud6  onefish.txt   80     bebas     star    orange    reds

Edit Cloud

    [N] New Cloud      [D] Delete a Cloud
    [L] List Clouds    [E] Edit a Cloud
    [O] Open Cloud     [Q] Quit
    [W] Word List (Top Ten)
    Please enter your choice: e
    Enter the name of the cloud to edit: cloud6
    Filename: [Enter for onefish.txt]:
    Max Words: [Enter for 80]:
    Font : [Enter for bebas]: prata
    Mask : [Enter for star]: clover
    BG Color : [Enter for orange]:
    Color Map :[Enter for reds]:
    Cloud image saved in C:UsersmfrydenbergOneDrive – Bentley UniversityCS230S21-CS230Codewordcloudimagescloud6_image.png
    Saving Dict to File
    {'cloud1': ['onefish.txt', '50', 'bebas', 'square', 'blue', 'purples'], 'cloud2': ['article.txt', '20', 'highland', 'square', 'white', 'purples'], 'cloud6': ['onefish.txt', 80, 'prata', 'clover', 'orange', 'reds']}
    Saving cloud cloud1: cloud1,onefish.txt,50,bebas,square,blue,purples
    Saving cloud cloud2: cloud2,article.txt,20,highland,square,white,purples
    Saving cloud cloud6: cloud6,onefish.txt,80,prata,clover,orange,reds
    Cloud File Updated.
    Cloud database file updated.

Quit

    [N] New Cloud      [D] Delete a Cloud
    [L] List Clouds    [E] Edit a Cloud
    [O] Open Cloud     [Q] Quit
    [W] Word List (Top Ten)
    Please enter your choice: q
    Thank you for making PyClouds!
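The Word List option described in the Program Structure section counts words, drops stop words, and prints the most frequent words with their counts. The following is a minimal sketch of that counting-and-sorting idea, assuming whitespace-separated text. The names count_words, drop_stopwords, and top_words, the inline stop-word set, and the punctuation stripping are all illustrative stand-ins; they are not the starter code's get_freq(), remove_stopwords(), or helper.clean().

```python
import string


def count_words(text):
    """Return a dict mapping each lowercase word to how often it appears."""
    freq = {}
    for raw in text.split():
        # Strip leading/trailing punctuation (stand-in for helper.clean()).
        word = raw.strip(string.punctuation).lower()
        if word:
            freq[word] = freq.get(word, 0) + 1
    return freq


def drop_stopwords(freq, stopwords):
    """Return a copy of freq without any of the given stop words."""
    return {word: n for word, n in freq.items() if word not in stopwords}


def top_words(freq, n=10):
    """Return the n (word, count) pairs with the highest counts."""
    by_count = sorted(freq.items(), key=lambda item: item[1], reverse=True)
    return by_count[:n]


text = "One fish, two fish, red fish, blue fish. The fish is a fish."
freq = drop_stopwords(count_words(text), {"the", "is", "a"})
for rank, (word, count) in enumerate(top_words(freq, 3), start=1):
    print(word, count, rank)
```

Sorting the dictionary's items by their second element (the count) with reverse=True is what puts the most frequent words first; slicing then keeps only the first n.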