web scraping using jupyter notebook

johnharveyusa · July 2, 2023, 11:28pm

I’m trying to scrape a jail website using jupyter notebook and getting some errors. Can someone review this code and tell me how to fix the errors?

!pip install django requests beautifulsoup4

import csv
import requests
from bs4 import BeautifulSoup
from django.http import HttpResponse
from django.views import View

from django.conf import settings

Django app

from django.urls import path
from django.core.wsgi import get_wsgi_application
from django.http import HttpResponse

application = get_wsgi_application()

class InmateScraperView(View):
def get(self, request):
# Send a GET request to the website
url = ‘https://imljail.shelbycountytn.gov/IML’
response = requests.get(url)

    # Parse the HTML content using BeautifulSoup
    soup = BeautifulSoup(response.content, 'html.parser')

    # Find the inmate table
    table = soup.find('table', {'id': 'ctl00_ContentPlaceHolder1_grdIML'})

    # Extract the table data
    rows = table.find_all('tr')
    data = []
    for row in rows:
        cells = row.find_all('td')
        data.append([cell.text.strip() for cell in cells])

    # Create a CSV response
    response = HttpResponse(content_type='text/csv')
    response['Content-Disposition'] = 'attachment; filename="inmate_table.csv"'

    # Write the data to the CSV file
    writer = csv.writer(response)
    writer.writerows(data)

    return response

URL pattern

urlpatterns = [
path(‘scrape/’, InmateScraperView.as_view(), name=‘scrape’),
]

Run the Django app

from django.core.management import execute_from_command_line
execute_from_command_line([‘manage.py’, ‘runserver’])

KenWhitesell · July 2, 2023, 11:33pm

First, when posting code here, enclose the code between lines of three backtick - ` characters. This means you’ll have a line of ```, then your code, then another line of ```. This forces the forum software to keep your code properly formatted.

Second, instead of just saying that you are “getting some errors”, post the actual error with the complete traceback message. (And, enclose the traceback between lines of ```, just like you would any other code.)

Gulshan212 · October 3, 2023, 5:25pm

Well, I think it is because of a logical error in your code.
Can you try this code?

!pip install django requests beautifulsoup4

import requests
from bs4 import BeautifulSoup
from django.http import HttpResponse
from django.views import View

class InmateScraperView(View):
    def get(self, request):
        # Replace 'your_url_here' with the actual URL of the jail website
        url = 'your_url_here'
        response = requests.get(url)

        # Parse the HTML content using BeautifulSoup
        soup = BeautifulSoup(response.content, 'html.parser')

        # Find the inmate table
        table = soup.find('table', {'id': 'ctl00_ContentPlaceHolder1_grdIML'})

        # Extract the table data
        rows = table.find_all('tr')
        data = []
        for row in rows:
            cells = row.find_all('td')
            data.append([cell.text.strip() for cell in cells])

        # Create a CSV response
        response = HttpResponse(content_type='text/csv')
        response['Content-Disposition'] = 'attachment; filename="inmate_table.csv"'

        # Write the data to the CSV file
        writer = csv.writer(response)
        writer.writerows(data)

        return response
````Preformatted text`

Thanks

Topic		Replies	Views
cannot download csv correctly Getting Started	0	613	August 4, 2023
Web Scraping with Django and Scrapy via the Zyte API Using Django	1	218	August 4, 2024
I don't understand what is happening here Mystery Errors	1	155	February 22, 2024
Django dynamic website purely created with python. Mystery Errors	1	364	July 2, 2023
Django Paypal Simple Page Forms & APIs	11	165	August 9, 2024

web scraping using jupyter notebook

Django app

URL pattern

Run the Django app

Related topics