Exception handling and bug removal #211


Merged
merged 1 commit into from
Jul 14, 2024

Conversation

MYlab10 (Contributor) commented on Jul 14, 2024

Related Issue

[Use an efficient data structure to reduce memory usage and add exception handling code]

Description

[Removed a bug in the code that caused OSError and PermissionError, and added error handling for the case where the output directory already exists, so that no exception is raised. The added snippet:

import os
os.makedirs('data_scrapped', exist_ok=True)  # No error if the directory already exists
df.to_csv('data_scrapped/data_rotten_tomatoes.csv', index=False)
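The effect of exist_ok=True can be illustrated with a minimal sketch that uses only the standard library. The helper name save_rows, the sample rows, and the temporary directory are assumptions made for illustration; the PR itself writes a pandas DataFrame with df.to_csv.

```python
import csv
import os
import tempfile


def save_rows(rows, out_dir, filename):
    """Write rows to out_dir/filename, creating out_dir if needed (hypothetical helper)."""
    # exist_ok=True makes this a no-op when the directory is already there;
    # without it, a second call would raise FileExistsError (a subclass of OSError).
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, filename)
    with open(path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(rows)
    return path


# Illustrative data only; a temporary directory stands in for the project folder.
base_dir = tempfile.mkdtemp()
out_dir = os.path.join(base_dir, "data_scrapped")
rows = [["title", "review"], ["Example Movie", "Example review text"]]

first_path = save_rows(rows, out_dir, "data_rotten_tomatoes.csv")
second_path = save_rows(rows, out_dir, "data_rotten_tomatoes.csv")  # directory exists now
```

Note that exist_ok=True only covers the case where the directory already exists. A PermissionError raised because the location is not writable would still propagate.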

Also added checks for the cases where the movie title or the review text does not exist:

def getReviewText(review_url):
    '''Returns the user review text given the review soup.'''
    tag = review_url.find('p', attrs={'class': 'review-text'})  # First <p> with class "review-text"
    if tag:
        return tag.get_text(strip=True)  # strip=True removes surrounding whitespace
    return None  # Review text not found

def getMovieTitle(review_url):
    '''Returns the movie title from the review soup.'''
    tag = review_url.find('title')
    if tag:
        title_tag = list(tag.children)[0].get_text()
        movie_title = title_tag.split(' - Movie Reviews | Rotten Tomatoes')[0]
        return movie_title
    return None  # Title not found
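Because both functions can now return None, calling code has to handle that value. A minimal sketch of one way to do so is shown here; the helper build_rows and the sample pairs are hypothetical and not part of the PR.

```python
def build_rows(parsed_pages):
    """Keep only pages where both the title and the review text were found.

    parsed_pages is an iterable of (title, review_text) pairs, where either
    value may be None. This helper is hypothetical and not part of the PR.
    """
    rows = []
    skipped = 0
    for title, review_text in parsed_pages:
        if title is None or review_text is None:
            skipped += 1  # Missing data: skip instead of failing later on None
            continue
        rows.append({"title": title, "review": review_text})
    return rows, skipped


# Illustrative values standing in for getMovieTitle() / getReviewText() results.
parsed_pages = [
    ("Example Movie", "Great pacing."),
    (None, "Review without a title"),
    ("Another Movie", None),
]
rows, skipped = build_rows(parsed_pages)
```

Counting the skipped pages, rather than dropping them silently, makes it easier to notice when the page layout changes and the selectors stop matching.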

To use less memory, duplicates are now removed with set instead of dict.fromkeys():

# remove duplicate links
unique_movie_links = list(set(tag['href'] for tag in movie_tags))
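One trade-off of this change is ordering: a set does not guarantee iteration order, while dict.fromkeys() keeps the first-seen order. The sketch below compares the two on made-up links; the href values are assumptions, not data from the project.

```python
# Illustrative hrefs; in the PR these come from tag['href'] for each tag in movie_tags.
links = ["/m/alpha", "/m/beta", "/m/alpha", "/m/gamma", "/m/beta"]

# set: removes duplicates, but iteration order is not guaranteed.
unique_unordered = list(set(links))

# dict.fromkeys: removes duplicates and keeps first-seen order (Python 3.7+).
unique_ordered = list(dict.fromkeys(links))
```

If the scraper depends on visiting links in page order, the dict.fromkeys() form may still be the safer choice.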

To fix the exception ModuleNotFoundError: No module named 'textblob', added pip install textblob.]
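A missing dependency can also be reported with an explicit install hint before the script does any work. The following is only a sketch; require_module is a hypothetical helper, and the PR itself only adds the pip install step.

```python
import importlib
import importlib.util


def require_module(name, install_hint):
    """Import a top-level module by name, or raise with an install hint (hypothetical helper)."""
    # find_spec returns None when a top-level module cannot be found.
    if importlib.util.find_spec(name) is None:
        raise ModuleNotFoundError(
            f"No module named {name!r}. Install it with: {install_hint}"
        )
    return importlib.import_module(name)


# Usage in the scraper would look like this (left commented out so the sketch
# runs even where textblob is not installed):
# textblob = require_module("textblob", "pip install textblob")
json_module = require_module("json", "part of the standard library")
```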

Type of PR

  • [X] Bug fix
  • [X] Feature enhancement
  • Documentation update
  • Other (specify): _______________

Screenshots / videos (if applicable)

[image: screenshot attached to the pull request]

Checklist:

  • [X] I have performed a self-review of my code
  • [X] I have read and followed the Contribution Guidelines.
  • [X] I have tested the changes thoroughly before submitting this pull request.
  • [X] I have provided relevant issue numbers, screenshots, and videos after making the changes.
  • [X] I have commented my code, particularly in hard-to-understand areas.

Additional context:

[I would also like to add more documentation to the code snippets to help others understand the code better]

@sanjay-kv sanjay-kv merged commit 25efcb0 into recodehive:main Jul 14, 2024
1 check passed

2 participants