laravel php cli performance pdo

Get Inserted Id from DB::insert()

Sureshkumar

Developer · 2024-01-01

Laravel Company

I need to insert more than 100000 records from one of the SOAP response. I used Laravel eloquent and it took much time, after watching Laracasts https://laracasts.com/series/how-to-build-command-line-apps-in-php, I started to migrate the codes to CLI, and I got very good improvements. I used PDO as he explained in video. It took 30 seconds to insert 30000 records.

Bulk inserts represent one of the most common scalability challenges when migrating data from legacy SOAP services into modern Laravel applications. When dealing with 100,000-plus records, Eloquent's beautiful syntax can become a bottleneck because each insert is wrapped in its own transaction, ORM event hooks fire, and model hydration carries overhead. PDO, by contrast, lets you stream rows directly to the database server using a single connection. In practice, the difference is dramatic: Eloquent batch inserts might take several minutes for 30,000 rows, while raw PDO can complete the same workload in under half a minute. This post explores why that gap exists and provides concrete recipes for achieving high-performance imports without losing access to the identifiers you need for downstream processing.

Why Eloquent Slows Down at Scale

Eloquent provides an expressive ActiveRecord interface that is ideal for typical CRUD workloads, but it pays a tax for every model instance it constructs. Each insert triggers attribute casting, event dispatching, relationship inspection, and timestamp automation. When you call Model::insert($array) with thousands of rows, Laravel still builds a query for each chunk under the hood, and the hydration step allocates memory for every attribute array. For SOAP responses that already exist as structured XML, you are essentially decoding data twice: once to build your PHP arrays, and again when Eloquent reconstructs models.

PDO and Unbuffered Queries for Streaming Inserts

The PDO extension in PHP supports unbuffered queries via PDO::MYSQL_ATTR_USE_BUFFERED_QUERY => false. This tells the driver to stream rows to the server instead of buffering the entire result set in memory. Combined with a single prepared statement executed inside a transaction, throughput improves substantially. A typical pattern uses DB::connection()->getPdo() to access the raw PDO instance, prepare one INSERT statement, then loop through the dataset calling execute(). Because the PDO instance maintains the last insert ID at the connection level, you can capture $pdo->lastInsertId() after the final execute if your table uses an auto-incrementing primary key. Be aware, however, that lastInsertId() returns only the very last inserted value within that connection, so intermediate IDs remain undocumented unless you batch the work deliberately.

Laravel Commands for Import Workflows

Moving the logic to an Artisan command is more than an organizational preference. It gives you access to Laravel's dependency injection container, allows seamless use of queues for follow-up processing, and guarantees that memory is freed when the command exits. By wrapping the import in $this->info() and $this->comment() calls, you can also log progress in real time, which is invaluable when processing datasets that take several minutes. If you need to monitor performance over time, consider emitting custom events from the command so that other parts of your application can react to import completions or failures. Tools like Laravel Octane benchmark comparing Swoole, OpenSwoole, RoadRunner, FrankenPHP provide context for how much execution time matters at scale, and why minimizing framework overhead during batch operations pays dividends.

Retrieving Inserted IDs Efficiently

When you rely on raw PDO, capturing the inserted ID requires an explicit call to lastInsertId(). If your dataset requires tracking each generated ID for relational inserts, consider a hybrid approach: use Eloquent only for the parent table (where you need the ID), then use PDO for children tables. Alternatively, generate your own UUIDs before the insert so that identifiers are known ahead of time, eliminating the need to ask the database for them retroactively. For MySQL specifically, LAST_INSERT_ID() is connection-scoped and safe to call inside transactions, but it returns the first auto-increment value generated in the most recent statement, not the highest value. If you are inserting thousands of rows, you can calculate the range by knowing the starting value and the count.

Handling SOAP Responses Gracefully

SOAP payloads are notoriously verbose and namespace-heavy. Using PHP's SoapClient with the trace option enabled helps you inspect the raw XML and map nested elements to flat database columns. One common pattern is to transform the SOAP object into a simple array, then pass that array into a PDO execute loop. If your source data contains dates in non-standard formats, normalize them before insertion to avoid MySQL strict-mode errors. Casting numeric strings to integers or floats early also prevents silent type coercion that could corrupt analytics downstream.

Recommended Packages and Patterns

Although raw PDO wins on speed, some teams still prefer a middle ground. Packages like laravel-doctrine/migrations can manage schema changes, while league/csv helps parse delimited exports that often accompany SOAP archives. For chunking, Laravel's Collection::chunk() method keeps memory bounded by breaking the dataset into 500-row segments. Each chunk can then be wrapped in its own database transaction, giving you resilience without sacrificing throughput. If you find yourself repeating this pattern across multiple commands, extract a BatchInserter class that accepts a PDO instance, table name, column list, and an iterable data source. This makes your import logic testable and reusable across projects. For teams struggling with general database performance as queries grow in complexity, reviewing resources like MySQL deadlock on cache database can help surface hidden locking patterns that even fast inserts might trip over.

Conclusion

Retrieving the last inserted ID when using DB::insert() in Laravel is straightforward with PDO but requires understanding connection scoping. Combine raw PDO inside an Artisan command for the best blend of speed and framework integration, and always measure with real data rather than synthetic benchmarks. The thirty-second improvement you observed is real and repeatable: the key is embracing lower-level database access for the parts of your application that need raw throughput.

Related Posts

These related posts cover database performance, high-throughput PHP patterns, and common MySQL pitfalls that help deepen your understanding of insert-heavy workflows.

Techniques for Retrieving Last Insert IDs with PDO

When migrating from Eloquent to PDO for bulk inserts, one common concern is retrieving the auto-incremented ID of the last inserted row. With PDO, you can use lastInsertId() after executing an insert statement. However, for bulk inserts of multiple rows, lastInsertId() only returns the ID of the first row inserted in that statement. If you need all generated IDs, strategies include inserting in smaller batches and collecting IDs per batch, or using a staging table approach. For SOAP data imports, a common pattern is to insert records without capturing individual IDs immediately, then run follow-up queries to associate related data. Alternatively, if your database supports it (like PostgreSQL with RETURNING), you can retrieve all IDs in a single roundtrip. MySQL 8.0.19+ also supports RETURNING for DML statements, which can simplify this.

Batch Size and Memory Management

Processing 100,000 records from a SOAP response requires careful memory management. Even with PDO, loading the entire SOAP response into memory at once can cause issues. PHP's XML parser (SimpleXML or DOM) can handle large files via streams, and you should consider processing records in chunks of 1,000 to 5,000 at a time.

public function handle()
{
    $batchSize = 1000;
    $records = $this->parseSoapInBatches($batchSize);
    
    foreach ($records as $batch) {
        DB::transaction(function () use ($batch) {
            $this->insertBatch($batch);
        });
    }
    
    $this->info('Import completed successfully.');
}

This ensures each batch is wrapped in its own transaction, preventing a single failure from rolling back the entire 100,000-record import. It also keeps memory usage predictable and allows for progress tracking and resumable imports if the process is interrupted.

Parallel Processing with Multiple Processes

For extreme throughput, you can split the SOAP response into multiple files or segments and run several CLI processes in parallel. Laravel's task scheduling or system-level tools like GNU Parallel can coordinate this. Each process handles a subset of records, writing to the same database. Be mindful of database connection limits and consider using a queue system like Laravel Horizon to distribute the work across workers. See also Laravel Octane benchmark for performance insights when processing high-volume jobs.

Related Posts