Benchmark: Multiplying lists of vectors - SoA vs AoS vs interlaced array

Multiplying lists of vectors - SoA vs AoS vs interlaced array (version: 0)

What's the fastest way to process a list of vectors? Do you go the OO route, with an array of Vector() objects? Do you create a new data structure, using Float32Arrays for x, y, and z? Do you create a single Float32Array and interlace values for x, y, and z within it? Let's find out!

Comparing performance of: AoS vs SoA - one loop vs SoA - one component per loop vs Interlaced Array - no loop unrolling vs Interlaced Array - with loop unrolling vs AoS (bracket notation)

Created: 8 years ago by: Guest

Script Preparation code:

var N = 1000000

var x = new Float32Array(N);
var y = new Float32Array(N);
var z = new Float32Array(N);

var interlaced = new Float32Array(3*N);

var vectors = [];

for (var i = 0, li=x.length; i < li; ++i) {
	x[i] = Math.random();
	y[i] = Math.random();
	z[i] = Math.random();
	vectors.push( {x:Math.random(), y:Math.random(), z:Math.random()} );
}

for (var i = 0, li=interlaced.length; i < li; ++i) {
	interlaced[i] = Math.random();
}

 
Tests:

AoS

var vector;
for (var i = 0, li=vectors.length; i < li; ++i) {
	vector = vectors[i];
	vector.x = 2 * vector.x;
	vector.y = 2 * vector.y;
	vector.z = 2 * vector.z;
}

 
SoA - one loop

for (var i = 0, li=x.length; i < li; ++i) {
	x[i] = 2 * x[i];
	y[i] = 2 * y[i];
	z[i] = 2 * z[i];
}

 
SoA - one component per loop

for (var i = 0, li=x.length; i < li; ++i) {
	x[i] = 2 * x[i];
}
for (var i = 0, li=y.length; i < li; ++i) {
	y[i] = 2 * y[i];
}
for (var i = 0, li=z.length; i < li; ++i) {
	z[i] = 2 * z[i];
}

 
Interlaced Array - no loop unrolling

for (var i = 0, li=interlaced.length; i < li; ++i) {
	interlaced[i] = 2*interlaced[i];
}

Interlaced Array - with loop unrolling

for (var i = 0, li=interlaced.length; i < li; i+=3) {
	interlaced[i] = 2*interlaced[i];
	interlaced[i+1] = 2*interlaced[i+1];
	interlaced[i+2] = 2*interlaced[i+2];
}

AoS (bracket notation)

var vector;
for (var i = 0, li=vectors.length; i < li; ++i) {
	vector = vectors[i];
	vector['x'] = 2 * vector['x'];
	vector['y'] = 2 * vector['y'];
	vector['z'] = 2 * vector['z'];
}

 
Latest run results:

Run details: (Test run date: 8 years ago)

User agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:49.0) Gecko/20100101 Firefox/49.0

Browser/OS: Firefox 49 on Windows 7

Test name	Executions per second
AoS	97.3 Ops/sec
SoA - one loop	32.3 Ops/sec
SoA - one component per loop	53.0 Ops/sec
Interlaced Array - no loop unrolling	52.5 Ops/sec
Interlaced Array - with loop unrolling	53.8 Ops/sec
AoS (bracket notation)	94.1 Ops/sec
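
To make the gaps easier to compare, the figures in the table above can be normalized against the fastest case. The sketch below hard-codes the six Ops/sec values from this run; the `results` array and the `relativeToFastest` helper are illustrative names and not part of the benchmark.

```javascript
// Ops/sec figures copied from the results table (Firefox 49, Windows 7).
const results = [
  { name: "AoS", opsPerSec: 97.3 },
  { name: "SoA - one loop", opsPerSec: 32.3 },
  { name: "SoA - one component per loop", opsPerSec: 53.0 },
  { name: "Interlaced Array - no loop unrolling", opsPerSec: 52.5 },
  { name: "Interlaced Array - with loop unrolling", opsPerSec: 53.8 },
  { name: "AoS (bracket notation)", opsPerSec: 94.1 },
];

// Hypothetical helper: sort fastest-first and express each case as a
// fraction of the fastest case's throughput.
function relativeToFastest(rows) {
  const sorted = [...rows].sort((a, b) => b.opsPerSec - a.opsPerSec);
  const best = sorted[0].opsPerSec;
  return sorted.map((r) => ({ name: r.name, relative: r.opsPerSec / best }));
}

for (const row of relativeToFastest(results)) {
  console.log(`${(row.relative * 100).toFixed(1)}%  ${row.name}`);
}
```

The ordering and ratios only describe this one recorded run; they are not a general ranking of the layouts.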

Autogenerated LLM Summary (model llama3.2:3b, generated 5 months ago):

Benchmark Definition

The benchmark measures the performance of different data layouts for scaling a list of vectors in JavaScript. The preparation code creates 1,000,000 random vectors in each layout, and every test case multiplies each component by 2:

Array of Structures (AoS): Each vector is an object with x, y, and z properties, stored in a regular array.
Structure of Arrays (SoA): The x, y, and z components are stored in three separate Float32Arrays.
Interlaced Array: A single Float32Array of length 3*N stores the values for all three components, interleaved.
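
The three layouts can hold the same data. As a minimal sketch (the `toSoA` and `toInterlaced` helpers are hypothetical and not part of the benchmark, which fills each structure with independent random values), the following converts a small array of objects into the other two forms:

```javascript
// Array of objects: one {x, y, z} object per vector.
const vectors = [
  { x: 1, y: 2, z: 3 },
  { x: 4, y: 5, z: 6 },
];

// Hypothetical helper: one Float32Array per component.
function toSoA(vecs) {
  const soa = {
    x: new Float32Array(vecs.length),
    y: new Float32Array(vecs.length),
    z: new Float32Array(vecs.length),
  };
  vecs.forEach((v, i) => {
    soa.x[i] = v.x;
    soa.y[i] = v.y;
    soa.z[i] = v.z;
  });
  return soa;
}

// Hypothetical helper: a single Float32Array laid out as
// x0, y0, z0, x1, y1, z1, ...
function toInterlaced(vecs) {
  const out = new Float32Array(3 * vecs.length);
  vecs.forEach((v, i) => {
    out[3 * i] = v.x;
    out[3 * i + 1] = v.y;
    out[3 * i + 2] = v.z;
  });
  return out;
}

const soa = toSoA(vectors);
const interlaced = toInterlaced(vectors);
console.log(Array.from(soa.y));      // values: 2, 5
console.log(Array.from(interlaced)); // values: 1, 2, 3, 4, 5, 6
```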

Options Compared

The benchmark compares six test cases:

AoS: Each object's x, y, and z properties are doubled using dot notation.
SoA - One Loop: All three component arrays are updated in a single loop.
SoA - One Component Per Loop: Each component array is updated in its own loop.
Interlaced Array - No Loop Unrolling: The loop visits one element of the single array per iteration.
Interlaced Array - With Loop Unrolling: The loop advances by 3 and updates the three components of one vector per iteration.
AoS (bracket notation): The same as AoS, but properties are accessed as vector['x'] instead of vector.x.
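
For a rough local comparison outside the benchmark site, variants like these can be wrapped in functions and timed. This is only a sketch under stated assumptions: `timeIt`, `scaleAoS`, and `scaleInterlaced` are hypothetical names, `N` is reduced to keep the run short, and a global `performance` object is assumed (browsers and recent Node.js versions). A handful of `performance.now()` measurements is much noisier than a dedicated benchmark harness, so the output is indicative at best.

```javascript
const N = 100000; // smaller than the benchmark's 1,000,000 to keep this quick

const vectors = [];
const interlaced = new Float32Array(3 * N);
for (let i = 0; i < N; ++i) {
  vectors.push({ x: Math.random(), y: Math.random(), z: Math.random() });
}
for (let i = 0; i < interlaced.length; ++i) {
  interlaced[i] = Math.random();
}

// Array-of-objects variant: double each property of each vector.
function scaleAoS() {
  for (let i = 0; i < vectors.length; ++i) {
    const v = vectors[i];
    v.x = 2 * v.x;
    v.y = 2 * v.y;
    v.z = 2 * v.z;
  }
}

// Interlaced variant without unrolling: double each array element.
function scaleInterlaced() {
  for (let i = 0; i < interlaced.length; ++i) {
    interlaced[i] = 2 * interlaced[i];
  }
}

// Hypothetical helper: average milliseconds per call over `runs` calls.
function timeIt(fn, runs = 20) {
  const start = performance.now();
  for (let r = 0; r < runs; ++r) fn();
  return (performance.now() - start) / runs;
}

console.log("AoS:", timeIt(scaleAoS).toFixed(3), "ms per pass");
console.log("Interlaced:", timeIt(scaleInterlaced).toFixed(3), "ms per pass");
```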

Pros and Cons

Array of Structures (AoS):
- Pros: Easier to implement, more flexible.
- Cons: Each vector is a separate object, so every update goes through property access on an object rather than indexing into one contiguous buffer.
Structure of Arrays (SoA):
- Pros: Each component lives in one contiguous typed array, with no per-vector object allocation.
- Cons: More complex implementation; the three arrays must be kept in sync, and Float32Array stores values at single precision.
Interlaced Array:
- Pros: A single contiguous allocation, with the components of each vector adjacent in memory.
- Cons: Accessing a given component requires index arithmetic (3*i + k), which makes the code harder to read.
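
The memory side of this comparison can be checked directly for the two typed-array layouts, since a `Float32Array` uses 4 bytes per element. The sketch below reuses the benchmark's `N`; the per-object cost of the array-of-objects layout is engine-dependent and is deliberately not estimated here.

```javascript
const N = 1000000;

// SoA: three typed arrays, one per component.
const x = new Float32Array(N);
const y = new Float32Array(N);
const z = new Float32Array(N);

// Interlaced: one typed array holding all components.
const interlaced = new Float32Array(3 * N);

const soaBytes = x.byteLength + y.byteLength + z.byteLength;
const interlacedBytes = interlaced.byteLength;

console.log(Float32Array.BYTES_PER_ELEMENT); // 4
console.log(soaBytes);                       // 12000000
console.log(interlacedBytes);                // 12000000
```

Both typed-array layouts occupy the same 12,000,000 bytes of element storage; they differ only in how the components are arranged.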

Latest Benchmark Result

The benchmark result shows the executions per second (Ops/sec) for each test case in a single run on Firefox 49, Windows 7. The top performer is AoS at 97.3 Ops/sec, followed closely by AoS (bracket notation) at 94.1 Ops/sec. SoA - one component per loop and both interlaced variants cluster between 52.5 and 53.8 Ops/sec, and SoA - one loop is slowest at 32.3 Ops/sec.

Conclusion

The benchmark highlights the importance of data structure choice and implementation details. In this run, the plain array of objects was roughly 1.8 times faster than the best typed-array variant, and loop unrolling made almost no difference for the interlaced array (53.8 vs. 52.5 Ops/sec). These figures come from one run on one browser version, so they may not carry over to other engines or newer releases. The best approach will depend on specific use cases and requirements.

LLMs can make mistakes. Check important info.
