Find self-references in (possibly nested) collections

March 16, 2018

I found a nice trick reading part of the ElasticSearch client for Java.

Say you are given an object (could be a map, a list, an array or anything) and you need to make sure the same reference does not show up in any of the children objects (or theirs).

Here is how this the ElasticSearch guys solved this problem:

static void ensureNoSelfReferences(Object value) {
    ensureNoSelfReferences(value, Collections.newSetFromMap(new IdentityHashMap<>()));
}

static void ensureNoSelfReferences(final Object value, final Set<Object> ancestors) {
    if (value != null) {

        Iterable<?> it;
        if (value instanceof Map) {
            it = ((Map) value).values();
        } else if ((value instanceof Iterable)) {
            it = (Iterable) value;
        } else if (value instanceof Object[]) {
            it = Arrays.asList((Object[]) value);
        } else {
            return;
        }

        if (!ancestors.add(value)) {
            throw new IllegalArgumentException("Object has already been built and is self-referencing itself");
        }
        for (Object o : it) {
            ensureNoSelfReferences(o, ancestors);
        }
        ancestors.remove(value);
    }
}

Very elegant and concise. They used two APIs of Java I wasn’t aware of:

  • java.util.IdentityHashMap is an implementation of Map that compares its keys using == instead of .equals().
  • java.util.Collections#newSetFromMap() which returns a set backed by the provided map.

In this particular case, calling .add() (which returns false if an element was already present) would use == instead of .equals().

Granted, you could have the same effect by using just IdentityHashMap, in which case you would need to use put() instead. newSetFromMap does precisely this under the hood. But you can’t deny using the set looks cleaner.

© 2017 | Powered by Hugo ♥ | Art by Clip Art ETC