Building a Laravel Service with Python Web Scrapers

Websolutionstuff | Nov-14-2024 | Categories: Laravel, Python

In this tutorial, I’ll walk you through how to set up a Laravel service that uses Python web scrapers to gather data. This integration enables Laravel applications to access external data sources by harnessing Python's robust web scraping libraries like BeautifulSoup or Scrapy.

We’ll use Laravel's command line and a simple Python script to gather and process data, which we can then store or display within Laravel. Let’s dive into this powerful combination and see how it can simplify data gathering.


Step 1: Set Up the Laravel Command

First, we’ll create a Laravel command to run the Python script from within Laravel. This command will allow us to trigger the web scraping task whenever needed.

Open a terminal, navigate to your Laravel project directory, and run:

php artisan make:command ScrapeData

Open the generated command file located at app/Console/Commands/ScrapeData.php and update it to run the Python script.

<?php

namespace App\Console\Commands;

use Illuminate\Console\Command;
use Symfony\Component\Process\Process;
use Symfony\Component\Process\Exception\ProcessFailedException;

class ScrapeData extends Command
{
    protected $signature = 'scrape:data';
    protected $description = 'Run the Python web scraper';

    public function __construct()
    {
        parent::__construct();
    }

    public function handle()
    {
        // Run the Python script located in the scripts directory
        $process = new Process(['python3', base_path('scripts/scraper.py')]);
        $process->run();

        // Check if the process was successful
        if (!$process->isSuccessful()) {
            throw new ProcessFailedException($process);
        }

        // Output the result to the console
        $this->info($process->getOutput());
    }
}

In this code, we use the Symfony Process component (bundled with Laravel) to run a Python script located in the scripts directory. The scrape:data command triggers this script and prints any results or errors to the console.
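Optionally, you can schedule this command so the scraper runs automatically. A minimal sketch, assuming the default app/Console/Kernel.php scheduler (the daily() frequency is just an example):

// app/Console/Kernel.php

protected function schedule(Schedule $schedule)
{
    // Run the scraper once a day; adjust the frequency to suit your data source
    $schedule->command('scrape:data')->daily();
}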

 

Step 2: Create the Python Script for Web Scraping

 

Next, we’ll create the Python script responsible for web scraping. For this example, we’ll use BeautifulSoup to scrape data from a sample website.

In your Laravel project, create a new directory called scripts and add a file named scraper.py.

Install BeautifulSoup and Requests if they’re not already installed:

pip install beautifulsoup4 requests

Inside scraper.py, write the Python scraping logic:

import requests
from bs4 import BeautifulSoup

# URL to scrape data from
url = 'https://example.com'

response = requests.get(url)

if response.status_code == 200:
    soup = BeautifulSoup(response.text, 'html.parser')
    
    # Example: Extract all titles
    titles = [title.get_text() for title in soup.find_all('h2')]
    
    for title in titles:
        print(title)
else:
    print('Failed to retrieve data')

This script fetches all <h2> elements from a sample website and outputs their text content. You can modify this script to scrape other types of data as needed.
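Right now the URL is hard-coded in the script. If you later want Laravel to decide what gets scraped, you can pass the URL as a command-line argument and add a timeout on the Laravel side. A small sketch of the handle() change, assuming you also update scraper.py to read the URL from sys.argv[1] (that change isn't shown here):

// In handle(): pass the target URL to the script and guard against slow pages
$url = 'https://example.com';

$process = new Process(['python3', base_path('scripts/scraper.py'), $url]);
$process->setTimeout(120); // fail the command if scraping takes longer than 2 minutes
$process->run();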

 

Step 3: Test the Integration

Now that we have both the Laravel command and Python script in place, we can test the integration.

Run the following command in your terminal:

php artisan scrape:data

If everything is set up correctly, you should see the extracted data printed in the console.
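If the command fails instead, the Python traceback is written to stderr. While debugging, you can print it in the console rather than throwing; a minimal sketch using the same Symfony Process API:

// Inside handle(), after $process->run()
if (!$process->isSuccessful()) {
    // Show the Python error output (e.g. a traceback) and exit with a failure code
    $this->error($process->getErrorOutput());

    return Command::FAILURE;
}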

 

Step 4: Storing Scraped Data in Laravel's Database

We can store the scraped data in the database to make it available within Laravel.

First, update the Python script to emit the data as structured JSON. Replace the print loop at the end of scraper.py with the following, so the script's only output is JSON (the Laravel command will decode the entire output):

import json

# Output the titles as a single JSON object instead of printing them one per line
data = {"titles": titles}
print(json.dumps(data))

Next, update the ScrapeData command in Laravel to capture and store this JSON data in the database.

public function handle()
{
    $process = new Process(['python3', base_path('scripts/scraper.py')]);
    $process->run();

    if (!$process->isSuccessful()) {
        throw new ProcessFailedException($process);
    }

    // Decode the JSON emitted by scraper.py
    $output = json_decode($process->getOutput(), true);

    if (isset($output['titles'])) {
        // Save each scraped title as a ScrapedData record
        foreach ($output['titles'] as $title) {
            \App\Models\ScrapedData::create(['title' => $title]);
        }
    }

    $this->info('Data scraped and saved successfully.');
}

This code snippet reads the JSON output from the Python script, decodes it, and stores each title in a ScrapedData model in Laravel.
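This assumes a ScrapedData model and a matching database table already exist. You can generate both with php artisan make:model ScrapedData -m; a minimal sketch of what they might contain (the scraped_data table and title column are inferred from the code above):

// database/migrations/xxxx_create_scraped_data_table.php (inside the up() method)
Schema::create('scraped_data', function (Blueprint $table) {
    $table->id();
    $table->string('title');
    $table->timestamps();
});

// app/Models/ScrapedData.php
namespace App\Models;

use Illuminate\Database\Eloquent\Model;

class ScrapedData extends Model
{
    // Allow mass assignment of the title column used in create()
    protected $fillable = ['title'];
}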

 

Step 5: Displaying Scraped Data in a Blade View

Create a simple Blade view to display the scraped data.

resources/views/scraped_data.blade.php

@extends('layouts.app')

@section('content')
<h1>Scraped Titles</h1>
<ul>
    @foreach ($titles as $title)
        <li>{{ $title }}</li>
    @endforeach
</ul>
@endsection

Finally, add a route and a controller method to display this view.

web.php

use App\Http\Controllers\ScrapeController;

Route::get('/scraped-data', [ScrapeController::class, 'index']);

// ScrapeController.php
public function index()
{
    // Fetch only the title strings so the Blade view can echo them directly
    $titles = \App\Models\ScrapedData::pluck('title');

    return view('scraped_data', compact('titles'));
}
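If the controller doesn't exist yet, generate it first with Artisan:

php artisan make:controller ScrapeController

Visiting /scraped-data in the browser should now list the stored titles.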

 

