CHAPTER 15
We have so far introduced Array type to the readers. Among all the container types, Array is the most integrated with the language, but is also the least flexible: it must store elements of a single type (or derived from the defined element type), and it has a fixed capacity.
Julian's standard library provides a few collection types which can be used an alternative to Array, or functions as a totally different data structure. This chapter will cover some of the most important types in the module of System.Collection
.
A list is a serial, self-scalable data structure that can grow its capacity on demand. For full API documentation, see here.
import System.Collection;
List l = new List();
l.add(3);
l.add("abc");
int size1 = l.size();
int x1 = l.get(0);
string s1 = l.get(1);
l.remove(0);
int size2 = l.size();
string s2 = l.get(0);
List implements IIndexable, so you can use indexer to access to its elements as if you were talking to an array. Be careful, though, that trying to access to a position the list is yet to grow to will result in runtime error.
import System.Collection;
List l = new List();
l.add(100);
l.add(200);
l.add(300);
var v = l[1]; // 200
v[2] = v[0] + 500;
v[0] += 100;
v[3] = 150; // throws System.ArrayOutOfRangeException
List implements IIterable, so you can use fast-for to iterate over the list.
// Continuing previous example ...
int sum = 0;
for(var i : l){
sum += i;
}
List also implements IComparable, allowing you to call its sort()
method to sort the elements stored within, but certain conditions apply to ensure the correctness. For more details, see here.
// Continuing previous example ...
l.sort(true); // sort in descending order
for(var i : l){
Console.print(i);
Console.print(' ');
}
// 300 200 100
A hash map to store data based on the calculated hash value, allowing element-targeting operation at constant cost. For full API documentation, see here.
import System.Collection;
Map m = new Map();
m.put("a", 3);
m.put("b", "xyz");
m.put("c", true);
int size1 = m.size();
var a = m.get("a");
var b = m.get("b");
var c = m.get("c");
// Overwrite
m.put("d", "delta");
m.put("d", "gamma");
var d = m.get("d");
m.put("e", "epsilon");
var e1 = m.remove("e");
var e2 = m.get("e");
var nex = e2 == null;
int size2 = m.size();
Unlike List
, Map
can only store unique elements. It determines uniqueness by the combination of hashCode()
and equals()
method defined on the value (if it's an object) to be stored. When storing an object as key, Map would first call hashCode()
to decide the internal slot for the storage, and if that is already taken (meaning another object has yielded same hashcode), it will invoke equals()
against the stored values at that slot. If none of the calls returns true, the new object is considered a unique key and placed at the slot along with others. If any call returns true, the key is considered to be already existing, and only the data will be replaced.
To illustrate this, consider the example below, where a customized class Key
is used as the key to Map. Since Key's hashCode()
only uses the integer field, we are effectively creating same Keys for ka, kb and kc, despite of them having different string fields.
import System.Collection;
class Key {
private int k1;
private string k2;
Key(int k1, string k2){
this.k1 = k1;
this.k2 = k2;
}
int hashCode(){
return k1;
}
bool equals(Object another){
return this.k1 == another.k1;
}
}
Key ka = new Key(1, "a");
Key kb = new Key(1, "b");
Map m = new Map();
m.put(ka, "obj1");
m.put(kb, "obj2");
int size1 = m.size(); // = 1. hashCode() same, equals() same.
var val = m[kb]; // = "obj2"
m.remove(new Key(1, "c"));
int size0 = m.size(); // = 0. hashCode() same, equals() same.
Map implements IIndexable, so you can use indexer to access to its elements as if you were talking to an associative array.
map["a"] = 15;
var v = map["a"]; // v = 15
Map also implements IIterable, so you can use fast-for to iterate over the map in a stream of Entry.
for (var entry : map) {
Console.println(entry.key + " = " + entry.value);
}
Last but not the least, Map supports map initializer, so that you can create a Map with some key-value pairs from the beginning:
import System.Collection;
Map m = new Map() {
"a" = 3, // The key is a string
box = "xyz", // The key is a string! Equivalent to "box".
'c' = true, // The key is a char
12 = getValue(), // The key is an integer
(fun(5)) = new Machine()); // They key is whatever the expression fun(5) returns
Console.println(m["box"]); // xyz
The most peculiar rule in the map initializer is that an identifier will be re-interpreted as a string literal. This is to help lower the burden of typing and render more neat-looking code. We will talk more about this interface when we introduce Dynamic type.
Array, List and Map are all collection types that implement IIterable, which is statically extended with a set of additional API that can be used in a fluent chain-call style. An example:
string[] src = new string[] { "target0", "not", "notag", "target1", "nontarget2", "ttttt" };
var rst = src
.skip(1)
.filter(e => e.length >= 6)
.append("target2")
.concat(new string[] { "nt1", "target3" })
.filter(e => e.startsWith("target"))
.map(e => e)
.take(2);
For people who have used .NET's LINQ or Java's Streaming API, what the above code does should be self-evident. But in case you haven't, here is the gist: for an IIterable object, you get free extension methods that allow you to either transform it to another IIterable, or converge in a single non-IIterable. Apparently, if you have transformed it into another IIterable, you can start over with another transformation call, and so on.
For example, filter() takes a function as a predicate. Once evaluated, it calls this function against each element in the original iterable, and puts any of those, unchanged, to the resultant iterable if the predicate evaluates to true. Other transformation methods may accept different Function types (map
, flatten
), take another IIterable (concat
, append
), or other kinds of arguments.
While in this example we used lambdas everywhere a function is required, you can also use function or method object. Using lambda is a rather common practice when programming with IIterable extension API, as it allows the most succinct code. Refer to Functional Programming on what you can pass as an Function object.
The single most important property of these transformation methods is that the resulting IIterable is only lazily evaluated when in need, otherwise it would be just a nominal iterable that refers to a source and memoizes a sequence of calls. At the end of the above example, none of those methods called in the wake of src
is actually applied to the constituent elements of the original array. You may think of them as for now only being pending for execution, which we refer to the evaluation of iterable.
To demand an evaluation, you may call one of the convergence, a.k.a. termination, methods. count() is one of the most used such methods.
var rst = src
.skip(1)
...
.take(2); // No evaluation. Only some type check.
var rst2 = rst; // Still no evaluation.
var rst3 = rst2.map(e => e + "_" + e); // Still no evaluation.
int cnt = rst3.count(); // Evaluation.
int cnt2 = rst3.count(); // Evaluation again.
The iterable yielded by the transformation methods are, in most cases, neither List nor Array. Once you started the transformation you will be only accessible to extension APIs for the rest of the call chain. Casting it to a concrete container type would likely fail, although there could be some optimized cases where it happens to succeed. If you do decide to re-materialize it into a collection type that you can recognize, you may call one of those to-
methods, such as toArray(). In doing so, you also force an evaluation of the iterable.
Evaluating an IIterable may have side-effect, but unless the side-effect interferes with the source or the evaluation context, you may evaluate it again and expect the same result.
As an p.s. note of this chapter - while not a collection type, String also implements IIterable, so it gets the extension methods too. However, since the element of String is char, it's often not what a user intends to iterate over. A few extension methods therefore have special treatment for the String type. Refer to the documentation when in doubt.