SlightlyLoony
Tera Contributor

Not long ago I came across a piece of code whose purpose was to eliminate the duplicates from a comma-separated list of email addresses. The approach taken was to put the email addresses in an array, sort the array, and then walk through the array eliminating adjacent entries that were the same. The function in question looked like this (with some test code):


var test = 'b@b.com,a@b.com,c@b.com,b@b.com,d@c.com';
test = dedupe(test);
gs.log('Test: ' + test);

function dedupe(emails) {
    var email_list = emails.split(',');
    email_list.sort(); // sorting places duplicates next to each other
    var last = '';
    for (var i = 0; i < email_list.length; i++) {
        if (last == email_list[i]) {
            email_list.splice(i, 1); // remove the duplicate
            i--; // stay on this index, since the array shifted left
        } else {
            last = email_list[i];
        }
    }
    return email_list.join(',');
}

While this works just fine, there's an easier way: use the notion of a set. A JavaScript object can act as one, because each property name can appear in it only once. Here's the same function, rewritten to use a set:

var test = 'b@b.com,a@b.com,c@b.com,b@b.com,d@c.com';
test = dedupe(test);
gs.log('Test: ' + test);

function dedupe(emails) {
    var email_list = emails.split(',');
    var set = {};
    for (var i = 0; i < email_list.length; i++)
        set[email_list[i]] = true;
    email_list = [];
    for (var email in set)
        email_list.push(email);
    return email_list.join(',');
}

This code is shorter and less tricky to understand. But how does it work? The magic bit is happening in these lines:

var set = {};
for (var i = 0; i < email_list.length; i++)
    set[email_list[i]] = true;

This snippet creates an object to be our set (the "set" variable), then walks through the array of email addresses, setting a property on the set variable for each one. The last line adds or replaces an entry in the set. The first time we encounter a particular email address, a property named after that address is added to the set. If we encounter the same address again, the property already exists, so all we do is replace its value. Since the only value we ever assign is "true", no values actually change. At the end of this code, our set object has one property for each unique email address in the input; the duplicates have been eliminated.
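To see the add-or-replace behavior in isolation, here is a minimal sketch. The sample addresses and the variable names addresses and uniqueCount are made up for illustration; the only assumption about the runtime is that Object.keys is available.

```javascript
// Sketch: assigning to an existing property name does not create a second property.
var set = {};
var addresses = ['b@b.com', 'a@b.com', 'b@b.com']; // hypothetical sample input

for (var i = 0; i < addresses.length; i++) {
    // Adds the property the first time an address is seen, overwrites it afterwards.
    set[addresses[i]] = true;
}

// Count the distinct property names that ended up in the set.
var uniqueCount = Object.keys(set).length; // 2, because 'b@b.com' was stored only once
```

Three addresses went in, but only two properties exist afterwards, which is all the de-duplication amounts to.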

The next snippet puts all those property names back into an array, which we can then conveniently join to get our de-duplicated list:

email_list = [];
for (var email in set)
    email_list.push(email);

The syntax for (x in y) iterates through the enumerable properties of y, putting the name of each property in x in turn. The snippet above starts with an empty array, then populates it with each property name (email address) in our set.
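One thing worth knowing about for (x in y) is that it also visits enumerable properties inherited through the prototype chain. That doesn't matter for a plain {} object in a clean environment, but a defensive version might look like the sketch below. The helper name setToArray is hypothetical, not something from the original code.

```javascript
// Sketch: collect a set object's own property names into an array.
function setToArray(set) { // hypothetical helper name
    var names = [];
    for (var name in set) {
        // Skip anything inherited through the prototype chain.
        if (Object.prototype.hasOwnProperty.call(set, name)) {
            names.push(name);
        }
    }
    return names;
}

var result = setToArray({ 'b@b.com': true, 'a@b.com': true });
// result contains 'b@b.com' and 'a@b.com'
```

Whether the extra check is worth it depends on whether anything in your environment adds enumerable properties to Object.prototype.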

Sets are an example of a collection, a general notion that is central to many programming problems and solutions. If you don't know them well, consider getting better acquainted...
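As one way of getting acquainted: newer versions of JavaScript (ES2015 and later) include a built-in Set collection. The sketch below assumes your script engine supports it, which may not be true of every server-side environment, and the function name dedupeWithSet is made up for illustration.

```javascript
// Sketch: the same de-duplication with a built-in Set (assumes ES2015 support).
function dedupeWithSet(emails) { // hypothetical name
    // A Set keeps only one copy of each value it is given.
    var unique = new Set(emails.split(','));
    // Sets iterate in insertion order, so first occurrences keep their positions.
    return Array.from(unique).join(',');
}

var deduped = dedupeWithSet('b@b.com,a@b.com,c@b.com,b@b.com,d@c.com');
// deduped is 'b@b.com,a@b.com,c@b.com,d@c.com'
```

Note that the result keeps the addresses in the order they first appeared rather than in sorted order.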
