Tag Archives: simulation

JBullet concurrent/parallel simulations, the ObjectPool problem

I’ve been using JBullet for some simulations recently and wanted to run several concurrently to make the most of my PC’s multiprocessing capabilities. It seems that the stock JBullet jar from jbullet.advel.cz contains code that cannot be used concurrently by more than one thread. You’ll see Exceptions like ArrayIndexOutOfBoundsException, IndexOutOfBoundsException and NullPointerException as your threads modify each other’s objects.

The problem is caused by the com.bulletphysics.util.ObjectPool class which is obviously intended for use by several threads but not properly synchronized. There are two ways to get around the problem – you can remove ObjectPool from the source code, replace all uses of its “get” method by direct (and not reusable) instantiation of the desired object and comment out or delete references to ObjectPool’s release method.

The alternative is to properly synchronize ObjectPool so that it is thread-safe. The code below works for me when it replaces the content of com/bulletphysics/util/ObjectPool.java, the jbullet jar is rebuilt and moved to the place your build tools find it (I replace the jbullet-20101010-1.jar in my local maven repository):

package com.bulletphysics.util;

import java.util.*;

/**
 * Object pool.
 *
 * @author sean
 */
public class ObjectPool<T> {

	private final Class<T> cls;
  private final Queue<T> list = new LinkedList<T>();
  private final static Map<Class, ObjectPool> MAP_CLASS_POOL = new HashMap<Class, ObjectPool>();

	private ObjectPool(Class<T> cls) { this.cls = cls; }

	/**
	 * Returns instance from pool, or create one if pool is empty.
	 *
	 * @return instance
	 */
	public synchronized T get() {
      if (!list.isEmpty())
        return list.remove();
    try { return cls.newInstance(); }
    catch (Exception e) { throw new IllegalStateException("Couldn't create a " + cls.getCanonicalName(), e); }
	}

	/**
	 * Release instance into pool.
	 *
	 * @param obj previously obtained instance from pool
	 */
	public synchronized void release(T obj) {
		list.add(obj);
	}

	/**
	 * Returns per-class object pool for given type, or create one if it doesn't exist.
	 *
	 * @param cls type
	 * @return object pool
	 */
	public synchronized static <T> ObjectPool<T> get(Class<T> cls)
	{
		ObjectPool<T> pool = (ObjectPool<T>)MAP_CLASS_POOL.get(cls);
		if (pool == null) {
			pool = new ObjectPool<T>(cls);
			MAP_CLASS_POOL.put(cls, pool);
		}

		return pool;
	}
	public static void cleanCurrentThread() { /* does nothing */ }
}

In my tests the synchronized ObjectPool executes about 5% faster than plain old object creation and garbage collection. Whatever benefit ObjectPool brings for the objects it reuses seems to be dwarfed by the amount of int[]s created (and not pooled) by JBullet.

Update 2014 August 17th

Spoke too soon – as soon as I extended my very trivial simulation I got similar Exceptions from the ArrayPool class. Applying similar edits to that class yields the following code and simulations that seem to work reasonably (see later update – synchronization in this code is slow) well:

package com.bulletphysics.util;

import java.lang.reflect.Array;
import java.util.*;

/**
 * Object pool for arrays.
 *
 * @author sean
 */
public class ArrayPool<T> {

  private final static HashMap<Class, ArrayPool> MAP_CLASS_QUEUE = new HashMap<Class, ArrayPool>();
	private final Class componentType;
	private final ObjectArrayList list = new ObjectArrayList();
	private final Comparator comparator;
	private final IntValue key = new IntValue();

	/**
	 * Creates object pool.
	 *
	 * @param componentType
	 */
	public ArrayPool(Class componentType) {
		this.componentType = componentType;

		if (componentType == float.class) {
			comparator = floatComparator;
		}
		else if (componentType == int.class) {
			comparator = intComparator;
		}
		else if (!componentType.isPrimitive()) {
			comparator = objectComparator;
		}
		else {
			throw new UnsupportedOperationException("unsupported type "+componentType);
		}
	}

	@SuppressWarnings("unchecked")
	private T create(int length) {
		return (T)Array.newInstance(componentType, length);
	}

	/**
	 * Returns array of exactly the same length as demanded, or create one if not
	 * present in the pool.
	 *
	 * @param length
	 * @return array
	 */
	@SuppressWarnings("unchecked")
	public synchronized T getFixed(int length) {
		key.value = length;
		int index = Collections.binarySearch(list, key, comparator);
		if (index < 0) {
			return create(length);
		}
		return (T)list.remove(index);
	}

	/**
	 * Returns array that has same or greater length, or create one if not present
	 * in the pool.
	 *
	 * @param length the minimum length required
	 * @return array
	 */
	@SuppressWarnings("unchecked")
	public synchronized T getAtLeast(int length) {
		key.value = length;
		int index = Collections.binarySearch(list, key, comparator);
		if (index < 0) {
			index = -index - 1;
			if (index < list.size()) {
				return (T)list.remove(index);
			}
			else {
				return create(length);
			}
		}
		return (T)list.remove(index);
	}

	/**
	 * Releases array into object pool.
	 *
	 * @param array previously obtained array from this pool
	 */
	@SuppressWarnings("unchecked")
	public synchronized void release(T array) {
		int index = Collections.binarySearch(list, array, comparator);
		if (index < 0) index = -index - 1;
		list.add(index, array);

		// remove references from object arrays:
		if (comparator == objectComparator) {
			Object[] objArray = (Object[])array;
			for (int i=0; i<objArray.length; i++) {
				objArray[i] = null;
			}
		}
	}

	////////////////////////////////////////////////////////////////////////////

	private static Comparator floatComparator = new Comparator() {
		public int compare(Object o1, Object o2) {
			int len1 = (o1 instanceof IntValue)? ((IntValue)o1).value : ((float[])o1).length;
			int len2 = (o2 instanceof IntValue)? ((IntValue)o2).value : ((float[])o2).length;
			return len1 > len2? 1 : len1 < len2 ? -1 : 0;
		}
	};

	private static Comparator intComparator = new Comparator() {
		public int compare(Object o1, Object o2) {
			int len1 = (o1 instanceof IntValue)? ((IntValue)o1).value : ((int[])o1).length;
			int len2 = (o2 instanceof IntValue)? ((IntValue)o2).value : ((int[])o2).length;
			return len1 > len2? 1 : len1 < len2 ? -1 : 0;
		}
	};

	private static Comparator objectComparator = new Comparator() {
		public int compare(Object o1, Object o2) {
			int len1 = (o1 instanceof IntValue)? ((IntValue)o1).value : ((Object[])o1).length;
			int len2 = (o2 instanceof IntValue)? ((IntValue)o2).value : ((Object[])o2).length;
			return len1 > len2? 1 : len1 < len2 ? -1 : 0;
		}
	};

	private static class IntValue
  {
		public int value;
	}

	/**
	 * Returns per-class array pool for given type, or create one if it doesn't exist.
	 *
	 * @param cls type
	 * @return object pool
	 */
	@SuppressWarnings("unchecked")
	public synchronized static <T> ArrayPool<T> get(Class cls)
  {
    ArrayPool<T> pool = MAP_CLASS_QUEUE.get(cls);
		if (pool == null)
    {
			pool = new ArrayPool<T>(cls);
			MAP_CLASS_QUEUE.put(cls, pool);
    }
		return pool;
	}

	public static void cleanCurrentThread() { /* does nothing */	}

}

Update 2014 September 26th

I’d got reasonable behaviour from my simulation and set about running it on my ‘low performance cluster’ – a collection of random PCs on my desk. It was obvious something was wrong when some old dual-core PCs ran my parallel simulations faster than my recent 8-core workstation! VisualVM made the problem obvious: my worker threads were blocked on the synchronization I’d added in ArrayPool. Using ThreadLocal and paring the class down to the 2 cases for which it’s used in JBullet (int[]s and float[]s) resulted in several times faster concurrent DynamicWorlds.

There’s an obvious kludge at “12 elements should be enough for anyone” – a guess at the maximum length of pooled arrays.

package com.bulletphysics.util;

import java.util.*;


/**
 * Object pool for arrays.
 * 
 * @author sean
 */
public abstract class ArrayPool<T>
{
	private final static ThreadLocal<ArrayPool<int[]>> INT_ARRAY_POOL = new ThreadLocal<ArrayPool<int[]>>()
	{
		@Override
		protected ArrayPool<int[]> initialValue()
		{
			return new ArrayPool<int[]>()
			{
				@Override
				protected int getLength(int[] array)
				{
					return array.length;
				}
				@Override
				protected int[] makeTypedArray(int length)
				{
					return new int[length];
				}
			};
		}
	};
	private final static ThreadLocal<ArrayPool<float[]>> FLOAT_ARRAY_POOL = new ThreadLocal<ArrayPool<float[]>>()
	{
		@Override
		protected ArrayPool<float[]> initialValue()
		{
			return new ArrayPool<float[]>()
			{
				@Override
				protected int getLength(float[] array)
				{
					return array.length;
				}
				@Override
				protected float[] makeTypedArray(int length)
				{
					return new float[length];
				}
			};
		}
	};
	private final Queue[] recyclingQueues = new Queue[12]; // 12 elements should be enough for anyone
	protected abstract T makeTypedArray(int length);
	protected abstract int getLength(T array);
	public ArrayPool()
	{
		for (int i = 0; i < recyclingQueues.length; i++)
			recyclingQueues[i] = new LinkedList<T>();
	}
	/**
	 * Returns array of exactly the same length as demanded, or create one if not
	 * present in the pool.
	 * 
	 * @param length
	 * @return array
	 */
	public T getFixed(int length)
	{
		Queue<T> recyclingQueue = recyclingQueues[length];
		T reusableArray;
		if (recyclingQueue.isEmpty())
			reusableArray = makeTypedArray(length);
		else
			reusableArray = recyclingQueue.remove();
		return reusableArray;
	}

	/**
	 * Releases array into object pool.
	 * 
	 * @param array previously obtained array from this pool
	 */
	public void release(T array)
	{
		Queue<T> recycledArrays = recyclingQueues[getLength(array)];
		recycledArrays.add(array);
	}
	
	/**
	 * Returns per-class array pool for given type, or create one if it doesn't exist.
	 * 
	 * @param cls type
	 * @return object pool
	 */
	public static <T> ArrayPool<T> get(Class cls)
  {
		if (cls == int.class)
			return (ArrayPool<T>)INT_ARRAY_POOL.get();
		else if (cls == float.class)
			return (ArrayPool<T>)FLOAT_ARRAY_POOL.get();
		else
			throw new IllegalArgumentException("No ArrayPool for " + cls.getCanonicalName());
	}

	public static void cleanCurrentThread() { /* does nothing */	}

}

As a side-note, another source of concurrent slow-down was careless use of java.lang.Math.random() to add bit-flipping noise to simulations. The API docs say

This method is properly synchronized to allow correct use by more than one thread. However, if many threads need to generate pseudorandom numbers at a great rate, it may reduce contention for each thread to have its own pseudorandom-number generator.

Using a non-shared, fast random number generator made a huge difference to the speed of my simulations.