Thursday, 24 September 2015

Brief Overview of Java Collection Framework

Thanks to an excellent Java Concept of the Day, this is a brief description of main interfaces and classes of Java Collection Framework. Hopefully it would be handy for future references.


What is Collection Framework In Java?


Collection Framework in java is a centralized and unified theme to store and manipulate the group of objects. Java Collection Framework provides some pre-defined classes and interfaces to handle the group of objects. Using collection framework, you can store the objects as a list, set, queue or map and perform operations like adding an object or removing an object or sorting the objects without much hard work.


The entire collection framework is divided into four interfaces:

  1. List  —> It handles sequential list of objects. ArrayList, Vector and LinkedList classes implement this interface.
  2. Queue  —> It handles special list of objects in which elements are removed only from the head. LinkedList and PriorityQueue classes implement this interface.
  3. Set  —> It handles list of objects which must contain unique element. This interface is implemented by HashSet and LinkedHashSet classes and extended by SortedSet interface which in turn, is implemented by TreeSet.
  4. Map  —> This is the one interface in Collection Framework which is not inherited from Collection interface. It handles group of objects as Key/Value pairs. It is implemented by HashMap and HashTable classes and extended by SortedMap interface which in turn is implemented by TreeMap.
  5. Three of above interfaces (List, Queue and Set) inherit from Collection interface. Although, Map is included in collection framework it does not inherit from Collection interface.


Collection Interface:


Collection interface is the root level interface in the collection framework. List, Queue and Set are all sub interfaces of Collection interface. JDK does not provide any direct implementations of this interface. But, JDK provides direct implementations of it’s sub interfaces.

Collection interface extends Iterable interface which is a member of java.lang package. Iterable interface has only one method called iterator(). It returns an Iterator object, using that object you can iterate over the elements of Collection.


List Interface:


List Interface represents an ordered or sequential collection of objects. This interface has some methods which can be used to store and manipulate the ordered collection of objects. The classes which implement the List interface are called as Lists. ArrayList, Vector and LinkedList are some examples of lists. You have the control over where to insert an element and from where to remove an element in the list.

Here are some properties of lists:
  1. Elements of the lists are ordered using Zero based index.
  2. You can access the elements of lists using an integer index.
  3. Elements can be inserted at a specific position using integer index. Any pre-existing elements at or beyond that position are shifted right.
  4. Elements can be removed from a specific position. The elements beyond that position are shifted left.
  5. A list may contain duplicate elements.
  6. A list may contain multiple null elements.

List interface extends Collection interface. So, all 15 methods of Collection interface are inherited to List interface. Along with these methods, another 9 methods are included in the List interface to support the properties of lists.


Queue Interface:


The Queue Interface extends Collection interface. It defines queue data structure which is normally First-In-First-Out. Queue is a data structure in which elements are added from one end and elements are deleted from another end. But, exception being the Priority Queue in which elements are removed from one end, but elements are added according to the order defined by the supplied comparator.

Queue is a data structure where elements are added from one end called tail of the queue and elements are removed from another end called head of the queue. Queue is also first-in-first-out type of data structure (except priority queue). That means an element which is inserted first will be the first element to be removed from the queue. You can’t add or get or set elements at an arbitrary position in the queues.

Properties of queues:

  1. Null elements are not allowed in the queue. If you try to insert null object into the queue, it throws NullPointerException.
  2. Queue can have duplicate elements.
  3. Unlike a normal list, queue is not random access. i.e you can’t set or insert or get elements at an arbitrary positions.
  4. In most of cases, elements are inserted at one end called tail of the queue and elements are removed or retrieved from another end called head of the queue.
  5. In the Queue Interface, there are two methods to obtain and remove the elements from the head of the queue. They are poll() and remove(). The difference between them is, poll() returns null if the queue is empty and remove() throws an exception if the queue is empty.
  6. There are two methods in the Queue interface to obtain the elements but don’t remove. They are peek() and element(). peek() returns null if the queue is empty and element() throws an exception if the queue is empty.



Deque Interface:


The Deque Interface is the short name for “Double Ended Queue“. As the name suggest, Deque is a linear collection of objects which supports insertion and removal of elements from both the ends. The Deque interface defines the methods needed to insert, retrieve and remove the elements from both ends.

The Deque interface is introduced in Java SE 6. It extends Queue interface.

The main advantage of Deque is that you can use it as both Queue (FIFO) as well as Stack (LIFO). The Deque interface has all those methods required for FIFO and LIFO operations. Some of those methods throw an exception if operation is not possible and some methods return a special value (null or false) if operation fails. 

How Deque – Double Ended Queue Works?

As already said, Deque is nothing but the double ended queue. That means, you can insert, retrieve and remove the elements from both the ends.

Deque As Queue:

As Deque interface extends Queue interface, it inherits all methods of Queue interface. So, you can use all those inherited methods to perform Queue operations. Along with them, methods defined in the Deque interface can also be used for Queue operations.

Deque As Stack:

Deque interface has two more methods – pop() and push(). These two methods make Deque to function as a stack (Last-In-First-Out). Along with these two methods, you can also use addFirst(), peekFirst() and removeFirst() for stack operations.

Properties of dequeues:

  1. Unlike Queue, Deque can have null elements. But, it is recommended not to insert null elements as many methods return null to indicate Deque is empty.
  2. Deque can have duplicate elements.
  3. You can’t set or get or insert the elements at an arbitrary position of Deque. i.e Random access is not possible with the Deque.
  4. You can use removeFirstOccurrenec(E e), removeLastOccurrence(E e) and remove(E e) methods to delete the elements from the Deque.



Set Interface:


The Set interface defines a set. The set is a linear collection of objects with no duplicates. Duplicate elements are not allowed in a set. The Set interface extends Collection interface. Set interface does not have it’s own methods. All it’s methods are inherited from Collection interface. The only change that has been made to Set interface is that add() method will return false if you try to insert an element which is already present in the set.

Properties of sets:
  1. Set contains only unique elements. It does not allow duplicates.
  2. Set can contain only one null element.
  3. Random access of elements is not possible.
  4. Order of elements in a set is implementation dependent. HashSet elements are ordered on hash code of elements. TreeSet elements are ordered according to supplied Comparator (If no Comparator is supplied, elements will be placed in ascending order) and LinkedHashSet maintains insertion order.
  5. Set interface contains only methods inherited from Collection interface. It does not have it’s own methods. But, applies restriction on methods so that duplicate elements are always avoided.
  6. One more good thing about Set interface is that the stronger contract between equals() and hashCode() methods. According to this contract, you can compare two set instances of different implementation types (HashSet, TreeSet and LinkedHashSet).
  7. Two set instances, irrespective of their implementation types, are said to be equal if they contain same elements.



SortedSet Interface:


The SortedSet interface extends Set interface. SortedSet is a set in which elements are placed according to supplied comparator. This Comparator is supplied while creating a SortedSet. If you don’t supply comparator, elements will be placed in ascending order.

Properties of sorted sets:
  1. SortedSet can not have null elements. If you try to insert null element, it gives NullPointerException at run time.
  2. As SortedSet is a set, duplicate elements are not allowed.
  3. SortedSet elements are sorted according to supplied Comparator. If you don’t mention any Comparator while creating a SortedSet, elements will be placed in ascending order.
  4. Inserted elements must be of Comparable type and they must be mutually Comparable.
  5. You can retrieve first element and last elements of the SortedSet. You can’t access SortedSet elements randomly. i.e Random access is denied.
  6. SortedSets returned by headSet(), tailSet() and subSet() methods are just views of the original set. So, changes in the returned set are reflected in the original set and vice versa.



NavigableSet Interface:


The NavigableSet is a SortedSet with navigation facilities. The NavigableSet interface provides many methods through them you can easily find closest matches of any given element. It has the methods to find out less than, less than or equal to, greater than and greater than or equal of any element in a SortedSet.

Properties of navigable sets:
  1. NavaigableSet can’t have null elements.
  2. NavigableSet doesn’t support duplicate elements.
  3. NavigableSet can be traversed and accessed in either ascending or descending order.
  4. Methods subSet(), headSet() and tailSet() differ from SortedSet interface in taking additional arguments describing whether upper bound and lower bound are inclusive or exclusive.



ArrayList Class:


ArrayList class, in simple terms, can be defined as re-sizable array. ArrayList is same like normal array but it can grow and shrink dynamically to hold any number of elements. ArrayList is a sequential collection of objects which increases or decreases in size as we add or delete the elements.

In ArrayList, elements are positioned according to Zero-based index. That means, elements are inserted from index 0. Default initial capacity of an ArrayList is 10. This capacity increases automatically as we add more elements to arraylist. You can also specify initial capacity of an ArrayList while creating it.

ArrayList class implements List interface and extends AbstractList. It also implements 3 marker interfaces – RandomAccess, Cloneable and Serializable.

Some properties of array lists:

  1. Size of the ArrayList is not fixed. It can increase and decrease dynamically as we add or delete the elements.
  2.  ArrayList can have any number of null elements.
  3. ArrayList can have duplicate elements.
  4. As ArrayList implements RandomAccess, you can get, set, insert and remove elements of the ArrayList from  any arbitrary position.
  5. When you insert an element in the middle of the ArrayList, the elements at the right side of that position are shifted one position right and when you delete an element, they will be shifted one position left. This feature of the ArrayList causes some performance issues as shifting of elements is time consuming if ArrayList has lots of elements.
  6. Elements are placed according to Zero-based index. That means, first element will be placed at index 0 and last element at index n-1, where ‘n’ is the size of the ArrayList.
  7. ArrayList is not synchronized. That means, multiple threads can use same ArrayList simultaneously.
  8. If you know the element, you can retrieve the position of that element.



Difference Between Iterator And ListIterator:


Iterator and ListIterator are two interfaces in Java collection framework which are used to traverse the collections. Although ListIterator extends Iterator, there are some differences in the way they traverse the collections.

1. Using Iterator, you can traverse List, Set and Queue type of objects. But using ListIterator, you can traverse only List objects. In Set and Queue types, there is no method to get the ListIterator object. But, In List types, there is a method called listIterator() which returns ListIterator object.
2. Using Iterator, we can traverse the elements only in forward direction. But, using ListIterator you can traverse the elements in both the directions – forward and backward. ListIterator has those methods to support the traversing of elements in both the directions.


Vector Class:


The Vector Class is also dynamically grow-able and shrink-able collection of objects like an ArrayList class. But, the main difference between ArrayList and Vector is that Vector class is synchronized. That means, only one thread can enter into vector object at any moment of time.

Vector class is preferred over ArrayList class when you are developing a multi threaded application. But, precautions need to be taken because vector may reduce the performance of your application as it is thread safety and only one thread is allowed to have object lock at any moment of time and remaining threads have to wait until a thread releases the object lock which is held by it. So, it is always recommended that if you don’t need thread safety environment, it is better to use ArrayList class than the Vector class.

Vector class has same features as ArrayList. Vector class also extends AbstractList class and implements List interface. It also implements 3 marker interfaces – RandomAccess, Cloneable and Serializable.

Some properties of vectors:

  1. The main feature of Vector class is that it is thread safety. All methods of Vector class are synchronized so that only one thread can execute them at any given time. This feature of Vector class is useful when you need thread safety code.
  2. Thread safety property of Vector class effects the performance of an application as it makes threads to wait for object lock.
  3. Capacity Increment: Capacity increment is an amount by which the capacity of the vector is automatically incremented whenever size of the vector exceeds it’s capacity. You can pass this capacity increment while creating a vector. If you don’t pass, capacity increment will be treated as zero and capacity of the vector will be doubled whenever size exceeds capacity.
  4. Unlike an ArrayList, you can set the size of the Vector manually. If the new size is greater than the current size, the new slots will be filled with null elements. If the new size is smaller than current size, then the extra elements will be discarded.
  5. You can traverse the vector using Enumeration object. Vector class has a method called elements() which returns an Enumeration object consisting of all elements of Vector.
  6. Vector class has separate methods to retrieve first and last element of vector object. You will not find these methods in ArrayList class. firstElement() retrieves first element and lastElement()method retrieves last element of the vector.



LinkedList Class:


In general terms, LinkedList class is a data structure where each element consist of three things. First one is the reference to previous element, second one is the actual value of the element and last one is the reference to next element.

The LinkedList class in Java is an implementation of doubly linked list which can be used both as a List as well as Queue. The LinkedList in java can have any type of elements including null and duplicates. Elements can be inserted and can be removed from both the ends and can be retrieved from any arbitrary position.

The LinkedList class extends AbstractSequentialList and implements List and Deque interfaces. It also implements 2 marker interfaces – Cloneable and Serializable.

Properties of linked lists:

  1. Elements in the LinkedList are called as Nodes. Where each node consist of three parts – Reference To Previous Element, Value Of The Element and Reference To Next Element. Below diagram shows how LinkedList looks like.
  2. Reference To Previous Element of first node and Reference To Next Element of last node are null as there will be no elements before the first node and after the last node.
  3. You can insert the elements at both the ends and also in the middle of the LinkedList. Below is the list of methods for insertion operations.
  4. You can remove the elements from the head, from the tail and also from the middle of the LinkedList.
  5. You can retrieve the elements form the head, from the middle and from the tail of the LinkedList.
  6. Insertion and removal operations in LinkedList are faster than the ArrayList. Because in LinkedList, there is no need to shift the elements after each insertion and removal. only references of next and previous elements need to be changed.
  7. Retrieval of the elements is very slow in LinkedList as compared to ArrayList. Becaues in LinkedList, you have to traverse from beginning or end (whichever is closer to the element) to reach the element.
  8. The LinkedList can be used as stack. It has the methods pop() and push() which make it to function as Stack.
  9. The LinkedList can also be used as ArrayList, Queue, SIngle linked list and doubly linked list.
  10. LinkedList can have multiple null elements.
  11. LinkedList can have duplicate elements.
  12. LinkedList class in Java is not of type Random Access. i.e the elements can not be accessed randomly. To access the given element, you have to traverse the LinkedList from beginning or end (whichever is closer to the element) to reach the given element.



PriorityQueue Class:


The PriorityQueue is a queue in which elements are ordered according to specified Comparator. You have to specify this Comparator while creating a PriorityQueue itsel. If no Comparator is specified, elements will be placed in their natural order. The PriorityQueue is a special type of queue because it is not a First-In-First-Out (FIFO) as in the normal queues. But, elements are placed according to supplied Comparator.

The PriorityQueue does not allow null elements. Elements in the PriorityQueue must be of Comparable type, If you insert the elements which are not Comparable, you will get ClassCastException at run time.

PriorityQueue class extends AbstractQueue class which in turn implements Queue interface. PriorityQueue also implements one marker interface – java.io.Serializable interface.

Properties of priority queues:

  1. Elements in the PriorityQueue are ordered according to supplied Comparator. If Comparator is not supplied, elements will be placed in their natural order.
  2. The PriorityQueue is unbounded. That means the capacity of the PriorityQueue increases automatically if the size exceeds capacity. But, how it grows is not specified.
  3. The PriorityQueue can have duplicate elements but can not have null elements.
  4. All elements of the PriorityQueue must be of Comparable type. Otherwise ClassCastException will be thrown at run time.
  5. The head element of the PriorityQueue is always the least element and tail element is always the largest element according to specified Comparator.
  6. The default initial capacity of PriorityQueue is 11.
  7. You can retrieve the Comparator used to order the elements of the PriorityQueue using comparator() method.
  8. PriorityQueue is not a thread safe.



ArrayDeque Class:


The ArrayDeque class in Java is introduced from JDK 1.6. It is an implementation of Deque Interface which allows insertion of elements at both the ends. It does not have any restrictions on capacity. It expands automatically as we add more elements. The ArrayDeque class extends AbstractCollection class and implements Deque interface. It also implements Cloneable and Serializable marker interfaces.

Properties of array dequeues:

  1. ArrayDeque is a resizable-array implementation of Deque interface like ArrayList class which is a resizable-array implementation of List interface. But, ArrayDeque is not a List.
  2. ArrayDeque does not have any capacity limit. It will grow automatically as we add elements.
  3. Default initial capacity of ArrayDeque is 16. It will increase at a power of 2 (24, 25, 26 and so on) when size exceeds capacity.
  4. ArrayDeque can be used as a stack (LIFO) as well as a queue (FIFO). ArrayDeque is faster than the Stack class when used as a stack and faster than the LinkedList class when used as a queue.
  5. Performance of ArrayDeque is sometimes considered as the best among the collection framework. It gives performance of O(1) for insertion, removal and retrieval operations. ArrayDeque class is recommended instead of Stack class (when you want stack data structure) and instead of LinkedList class (when you want queue data structure).
  6. You can’t perform indexed operations on ArrayDeque. ArrayDeque doesn’t have the methods to support those operations.
  7. ArrayDeque is not a thread safe.



HashSet Class:


The HashSet class in Java is an implementation of Set interface. HashSet is a collection of objects which contains only unique elements. Duplicates are not allowed in HashSet. HashSet gives constant time performance for insertion, removal and retrieval operations. It allows only one null element.

The HashSet internally uses HashMap to store the objects. The elements you insert in HashSet will be stored as keys of that HashMap object and their values will be a constant called PRESENT. This constant is defined as private static final Object PRESENT = new Object() in the source code of HashSet class.

HashSet class extends AbstractSet class and implements Set interface. It also implements Cloneable and Serializable marker interfaces.

Properties of hash sets:

  1. HashSet class uses HashMap internally to store the objects. The keys of that HashMap object will be the elements of HashSet and their values will be a constant.
  2. HashSet does not allow duplicate elements. If you try to insert a duplicate element, older element will be overwritten.
  3. HashSet can have maximum one null element.
  4. HashSet doesn’t maintain any order. The order of the elements will be largely unpredictable. And it also doesn’t guarantee that order will remain constant over time.
  5. HashSet offers constant time performance for insertion, removal and retrieval operations.
  6. HashSet class is not synchronized. If you want synchronized HashSet, use Collections.synchronizedSet() method.



LinkedHashSet Class:


The LinkedHashSet in java is an ordered version of HashSet which internally maintains one doubly linked list running through it’s elements. This doubly linked list is responsible for maintaining the insertion order of the elements. Unlike HashSet which maintains no order, LinkedHashSet maintains insertion order of elements. i.e elements are placed in the order they are inserted. LinkedHashSet is recommended over HashSet if you want a unique collection of objects in an insertion order.

The LinkedHashSet class extends HashSet class and implements Set interface. It also implements Cloneable and Serializable marker interfaces.

Properties of linked hash sets:

  1. LinkedHashSet internally uses LinkedHashMap to store it’s elements just like HashSet which internally uses HashMap to store it’s elements.
  2. LinkedHashSet maintains insertion order. This is the main difference between LinkedHashSet and HashSet.
  3. LinkedhashSet also gives constant time performance for insertion, removal and retrieval operations. The performance of LinkedHashSet is slightly less than the Hashset as it has to maintain doubly linked list internally to order it’s elements.
  4. Iterator returned by LinkedHashSet is fail-fast. i.e if the LinkedHashSet is modified at any time after the Iterator is created, it throws ConcurrentModificationException.
  5. LinkedHashSet doesn’t allow duplicate elements and allows only one null element.
  6. LinkedHashSet is not synchronized. To get the synchronized LinkedHashSet, use Collections.synchronizedSet() method.



TreeSet Class:


The TreeSet is another popular implementation of Set interface. We have seen other two implementations of Set interface – HashSet and LinkedHashSet. HashSet doesn’t maintain any order where as LinkedHashSet maintains insertion order. The main difference between these two implementations and Treeset is, elements in TreeSet are sorted according to supplied Comparator. You need to supply this Comparator while creating a TreeSet itself. If you don’t pass any Comparator while creating a TreeSet, elements will be placed in their natural ascending order.

The TreeSet class in java is a direct implementation of NavigableSet interface which in turn extends SortedSet interface (which in turn extends Set interface). Below is the hierarchy diagram of TreeSet class.

Properties of tree sets:

  1. The elements in TreeSet are sorted according to specified Comparator. If no Comparator is specified, elements will be placed according to their natural ascending order.
  2. Elements inserted in the TreeSet must be of Comparable type and elements must be mutually comparable. If the elements are not mutually comparable, you will get ClassCastException at run time.
  3. TreeSet does not allow even a single null element.
  4. TreeSet is not synchronized. To get a synchronized TreeSet, use Collections.synchronizedSortedSet() method.
  5. TreeSet gives performance of order log(n) for insertion, removal and retrieval operations.
  6. Iterator returned by TreeSet is of fail-fast nature. That means, If TreeSet is modified after the creation of Iterator object, you will get ConcurrentModificationException.
  7. TreeSet internally uses TreeMap to store it’s elements just like HashSet and LinkedHashSet which use HashMap and LinkedHashMap respectively to store their elements.



Map Interface:


The Map interface in java is one of the four top level interfaces of Java Collection Framework along with List, Set and Queue interfaces. But, unlike others, it doesn’t inherit from Collection interface. Instead it starts it’s own interface hierarchy for maintaining the key-value associations. Map is an object of key-value pairs where each key is associated with a value. This interface is the replacement for ‘Dictionary‘ class which is an abstract class introduced in JDK 1.0.

HashMap, LinkedHashMap and TreeMap are three popular implementations of Map interface. 

Properties of maps:

  1. Map interface is a part of Java Collection Framework, but it doesn’t inherit Collection Interface.
  2. Map interface stores the data as a key-value pairs where each key is associated with a value.
  3. A map can not have duplicate keys but can have duplicate values.
  4. Each key at most must be associated with one value.
  5. Each key-value pairs of the map are stored as Map.Entry objects. Map.Entry is an inner interface of Map interface.
  6. The common implementations of Map interface are HashMap, LinkedHashMap and TreeMap.
  7. Order of elements in map is implementation dependent. HashMap doesn’t maintain any order of elements. LinkedHashMap maintains insertion order of elements. Where as TreeMap places the elements according to supplied Comparator.
  8. The Map interface provides three methods, which allows map’s contents to be viewed as a set of keys (keySet() method), collection of values (values() method), or set of key-value mappings (entrySet() method).



What is the difference between Collection and Collections in java?


This is one of the most confusing java interview question asked many a times to java freshers. Most of time, this question has been asked to java freshers to check their basic knowledge about the Java Collection Framework. 

This question seems confusing because both “Collection” and “Collections” look similar. Both are part of java collection framework, but both serve different purpose. Collection is a top level interface of java collection framework where as Collections is an utility class. In this article, we will discuss the differences between Collection and Collections in java.

Collection Interface:

Collection is a root level interface of the Java Collection Framework. Most of the classes in Java Collection Framework inherit from this interface. List, Set and Queue are main sub interfaces of this interface. JDK doesn’t provide any direct implementations of this interface. But, JDK provides direct implementations of it’s sub interfaces. ArrayList, Vector, HashSet, LinkedHashSet, PriorityQueue are some indirect implementations of Collection interface. Map interface, which is also a part of java collection framework, doesn’t inherit from Collection interface. Collection interface is a member of java.util package.

Collections Class:

Collections is an utility class in java.util package. It consists of only static methods which are used to operate on objects of type Collection. For example, it has the method to find the maximum element in a collection, it has the method to sort the collection, it has the method to search for a particular element in a collection. Below is the list of some important methods of Collections class.

1 comment:

Online Encyclopedia of Statistical Science (Free)

Please, click on the chart below to go to the source: