Performance Testing of Python ORMs Based on the TPC-C Benchmark
Object-relational mappers (ORMs) are often used in Python programming when one needs to develop an application that works with databases. Examples of Python ORMs are SQLAlchemy, Peewee, PonyORM and the Django ORM. When choosing an ORM, performance plays a crucial role. But how are these toolsets compared? ORM performance benchmarks offer a measure of clarity but leave considerable room for improvement. I examine and extend an existing ORM benchmark to develop a stronger metric.

The existing Python ORM benchmark from the Tortoise ORM project (link to the repository) analyzes the speed of six ORMs for eleven types of SQL queries. In general, the Tortoise benchmark makes it possible to evaluate the speed of query execution for the various ORMs. However, there is a flaw in this approach to testing: most ORMs are selected for use in web applications. In such contexts, multiple users send all manner of queries to a database, often at the same time. Because none of the benchmark tools I evaluated could rate the performance of Python ORMs in a scenario like this, I decided to write my own, comparing PonyORM and SQLAlchemy.

As a basis, I took the TPC-C benchmark. Since 1988, TPC has been developing tests in the field of data processing. They have long been an industry standard and are used by almost all hardware vendors across a wide range of hardware and software configurations. The main feature of these tests is that they focus on testing under enormous load in conditions as close as possible to real ones. TPC-C simulates a warehouse network. It includes a combination of five simultaneously executed transactions of various types and complexity. I adapted the TPC-C testing method to compare the two ORMs.
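The idea of several virtual users hitting the database at the same time can be sketched with a small load harness. This is only an illustration under stated assumptions: it uses threads as virtual users (the article does not say how virtual users are implemented), and `fake_transaction` and `run_virtual_users` are hypothetical names standing in for real ORM transactions and the actual test driver.

```python
import threading
import time


def fake_transaction() -> None:
    """Hypothetical stand-in for a real ORM transaction (assumption)."""
    time.sleep(0.001)


def run_virtual_users(n_users: int, duration_s: float,
                      transaction=fake_transaction) -> int:
    """Run n_users threads for about duration_s seconds.

    Returns the total number of transactions completed by all users.
    """
    completed = 0
    lock = threading.Lock()
    deadline = time.monotonic() + duration_s

    def user_loop() -> None:
        nonlocal completed
        local_count = 0
        # Each virtual user keeps submitting transactions until the deadline.
        while time.monotonic() < deadline:
            transaction()
            local_count += 1
        # Merge the per-user count into the shared total under a lock.
        with lock:
            completed += local_count

    threads = [threading.Thread(target=user_loop) for _ in range(n_users)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return completed


if __name__ == "__main__":
    duration = 0.5
    total = run_virtual_users(n_users=4, duration_s=duration)
    print(f"{total} transactions completed, {total / duration:.0f} per second")
```

Counting completed transactions over a fixed time window gives a throughput figure that can be compared between ORMs, provided both run against identical data and the same number of virtual users.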
The purpose of the test is to evaluate the speed of transaction processing when several virtual users access the database at the same time.

Step one is to create and populate the database of a warehouse network. The database schema consists of eight relations: Warehouse, District, Order, OrderLine, Stock, Item, Customer, and History. The databases for Pony and SQLAlchemy are identical. Only primary and foreign keys are indexed. Pony created those indexes automatically; in SQLAlchemy I created them manually.

During the test, different types of transactions are sent to the database from several virtual users. Each transaction consists of several requests. In total, there are five types of transactions, each submitted for processing with a different probability of occurrence: new_order (45%), payment (43%), order_status (4%), delivery (4%), and stock_level (4%). These probabilities are the same as in the original TPC-C test.

However, the original TPC-C test is conducted on servers with 64 GB of RAM and requires a large number of processors and huge disk space. Because of technical limitations, and because I wanted to test the performance of ORMs rather than the resistance of hardware to enormous load, this test is somewhat simplified. The main differences from the TPC-C test are as follows. The test runs with fewer virtual users than the original test. My test has fewer table entries.
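The transaction mix described above can be sketched as a weighted random choice. This is a minimal illustration, not the article's actual test driver: the names `TRANSACTION_MIX` and `pick_transaction` are hypothetical, and only the transaction names and percentages come from the text.

```python
import random
from collections import Counter

# Transaction mix from the article; the percentages sum to 100.
TRANSACTION_MIX = {
    "new_order": 45,
    "payment": 43,
    "order_status": 4,
    "delivery": 4,
    "stock_level": 4,
}


def pick_transaction(rng: random.Random) -> str:
    """Return one transaction name, chosen with the weights above."""
    names = list(TRANSACTION_MIX)
    weights = list(TRANSACTION_MIX.values())
    return rng.choices(names, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed only to make the demo repeatable
    n_samples = 10_000
    counts = Counter(pick_transaction(rng) for _ in range(n_samples))
    for name, count in counts.most_common():
        print(f"{name:13s} {count / n_samples:.1%}")
```

Each virtual user would call something like `pick_transaction` before every request, so that over a long run the observed shares approach the 45/43/4/4/4 split.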
This article first appeared on hackernoon.com.