Skip to main content

Accounts Merge


Accounts Merge: Given a list of accounts where each element accounts[i] is a list of strings, where the first element accounts[i][0] is a name, and the rest of the elements are emails representing emails of the account.

Now, we would like to merge these accounts. Two accounts definitely belong to the same person if there is some common email to both accounts. Note that even if two accounts have the same name, they may belong to different people as people could have the same name. A person can have any number of accounts initially, but all of their accounts definitely have the same name.

After merging the accounts, return the accounts in the following format: the first element of each account is the name, and the rest of the elements are emails in sorted order. The accounts themselves can be returned in any order.

Example 1:
Input: accounts = [["John","[email protected]","[email protected]"],
["John","[email protected]","[email protected]"],
["Mary","[email protected]"],["John","[email protected]"]]

Output: [["John","[email protected]","[email protected]","[email protected]"],
["Mary","[email protected]"],["John","[email protected]"]]

Explanation:
The first and second John's are the same person as they
have the common email "[email protected]".
The third John and Mary are different people as none of
their email addresses are used by other accounts.
We could return these lists in any order, for example the answer
[['Mary', '[email protected]'], ['John', '[email protected]'],
['John', '[email protected]', '[email protected]', '[email protected]']]
would still be accepted.

Example 2:

Constraints:
  • 1 <= accounts.length <= 1000
  • 2 <= accounts[i].length <= 10
  • 1 <= accounts[i][j].length <= 30
  • accounts[i][0] consists of English letters.
  • accounts[i][j] (for j > 0) is a valid email.

Try this Problem on your own or check similar problems:

  1. Redundant Connection
Solution:
class Solution {
public List<List<String>> accountsMerge(List<List<String>> accounts) {
int[] parent = new int[accounts.size()];
Arrays.fill(parent, -1);

TreeMap<String, Integer> map = new TreeMap<>();

for(int i = 0; i < accounts.size(); ++i){
for(int j = 1; j < accounts.get(i).size(); ++j){
if(map.containsKey(accounts.get(i).get(j))){
int idx = map.get(accounts.get(i).get(j));
if(idx != i && findParent(parent, i) != findParent(parent, idx)){
union(parent, i, idx);
}
}else{
map.put(accounts.get(i).get(j), i);
}
}
}

Map<Integer, List<String>> mapIndexToEmails = new HashMap<>();

map.entrySet().stream()
.forEach(entry -> {
int idx = findParent(parent, entry.getValue().intValue());
mapIndexToEmails.computeIfAbsent(idx, k -> new ArrayList<String>()).add(entry.getKey());
});

List<List<String>> result = new ArrayList<>();
mapIndexToEmails.entrySet().stream()
.forEach(entry -> {
List<String> emailList = new ArrayList<>();
emailList.add(accounts.get(entry.getKey()).get(0));
emailList.addAll(entry.getValue());
result.add(emailList);
});

return result;
}

private int findParent(int[] parent, int idx){
if(parent[idx] == -1) return idx;
return findParent(parent, parent[idx]);
}

private void union(int parent[], int x, int y)
{
int xset = findParent(parent, x);
int yset = findParent(parent, y);
parent[xset] = yset;
}
}

Time/Space Complexity:
  • Time Complexity: O(e*(l+a(n)+loge))
  • Space Complexity: O(e)

Where e is the total number of all email addresses in all accounts, l is the average length of email, n is the number of accounts and a(n) is inverse Ackermann function that defines time complexity of findParent/union operations.


Explanation:

The function first creates an array called parent and initializes it with the value -1 for every element. This array will be used as a data structure for union-find. We use union find since it's obvious that the records are connected if they share at least one email address, so we could join them into groups with union or disjoint sets operations which are straightforward to implement. It also creates a TreeMap called map, which will be used to map email addresses to the accounts that contain them. We chose TreeMap since it provides out of box sorted order of keys. While we iterate over the emails we check if the address is already present in the map and under which account, if the account that we currently have stored inside the map is different than the current one and if there a disjoint set (findParent(parent, i) != findParent(parent, idx)) we can merge the two disjoint sets by creating their union (merging accounts). If the email is not already in the map, it adds it to the map with the current account's index.

Afterwards, it creates a map mapIndexToEmails that maps the index of an account to its associated email addresses, merging any accounts that were found to have the same email. Then it constructs the result, which is a list of lists of strings, where each inner list contains the name of the account followed by all its associated email addresses.

The total space complexity is proportional to the number of all email addresses that we need to store. While for the time complexity, for each email address we have the following operations:

  1. We first must hash a string to put it in a map with the operation taking O(l) where l is the average length of an email address
  2. We have operations on treemap which take O(log e) where e is the total number of all email addresses in all accounts (also the maximum number of elements in the treemap)
  3. Union and find operations which are defined by inverse Ackermann function a(n) (strict boundary, for the interview is enough to say the complexity is approximately constant) where n is the number of accounts This leads to total time complexity of O(e*(l+a(n)+loge)).

One tricky part to focus on is findParent(parent, i) != findParent(parent, idx) which prevents creating cycles inside the parent array which is tracking the parents of all sets/accounts.