0

I want to deduplicate an array of arrays. An array counts as a duplicate when it matches an earlier array at a given subset of element indices. In this case, say, index [1] and index [3].

const unDeduplicated = [
  [ 11, 12, 13, 14, 15, ],
  [ 21, 22, 23, 24, 25, ],
  [ 31, 88, 33, 99, 35, ], // duplicate in indices: 1, 3 with row index 4
  [ 41, 42, 43, 44, 45, ],
  [ 51, 88, 53, 99, 55, ], // duplicate in indices: 1, 3 // delete this row from result
];

const deduplicated = getDeduplicated( unDeduplicated, [ 1, 3, ], );

console.log( deduplicated );
// expected result:
// [
//   [ 11, 12, 13, 14, 15, ],
//   [ 21, 22, 23, 24, 25, ],
//   [ 31, 88, 33, 99, 35, ],
//   [ 41, 42, 43, 44, 45, ],
//   // this row was omitted from result because it was duplicated at indices 1 and 3 with row index 2
// ]

What is a function getDeduplicated() that can give me such a result?

I have tried the function below, but it's just a start and it isn't close to giving me the desired result. It does give an idea of what I'm trying to do.

/**
 * Returns deduplicated array as a data grid ([][] -> 2D array)
 * @param { [][] } unDedupedDataGrid The original data grid to be deduplicated to include only unique rows as defined by the indices2compare.
 * @param { Number[] } indices2compare An array of indices to compare for each array element.
 * If every element at each index for a given row is duplicated elsewhere in the array,
 * then the array element is considered a duplicate
 * @returns { [][] }
 */
const getDeduplicated = ( unDedupedDataGrid, indices2compare, ) => {
  let deduped = [];
  unDedupedDataGrid.forEach( row => {
    const matchedArray = a.filter( row => row[1] === 88 && row[3] === 99 );
    const matchedArrayLength = matchedArray.length;
    if( matchedArrayLength ) return;
    deduped.push( row, );
  });
}

I've researched some lodash functions that might help, like _.filter and _.some, but so far I can't find a structure that produces the desired result.

2
  • Is the third row of the expected result meant to have a 99 at index 3? It seems that it turned into an 88. Commented Jul 12, 2020 at 4:43
  • @LeaftheLegend: You are correct. Good catch! Commented Jul 12, 2020 at 6:10

5 Answers

2

You can build a Set of the values seen in each column as you iterate over the rows. You only need Sets for the designated columns, e.g. 1 and 3 in your case. Then, for each row, check whether any of the designated columns holds a value that is already in the corresponding Set; if it does, discard that row.

(I'm on my phone, so I can't type actual code, but I think the code is pretty straightforward.)
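
The per-column Set idea described above could be sketched roughly as follows. This is only an illustration: the name dedupeByColumnSets and the sample data are assumptions, not code from the answer. Note that it discards a row when any one designated column repeats an earlier value, which is stricter than requiring a match at every index; for the question's sample data both rules give the same result.

```javascript
// Hypothetical helper illustrating the per-column Set approach.
function dedupeByColumnSets(rows, columns) {
  // One Set of already-seen values per designated column.
  const seen = new Map(columns.map((col) => [col, new Set()]));
  const result = [];
  for (const row of rows) {
    // Discard the row if ANY designated column repeats an earlier value.
    const isDuplicate = columns.some((col) => seen.get(col).has(row[col]));
    if (isDuplicate) continue;
    // Record this row's values only when the row is kept.
    columns.forEach((col) => seen.get(col).add(row[col]));
    result.push(row);
  }
  return result;
}

const sampleRows = [
  [11, 12, 13, 14, 15],
  [21, 22, 23, 24, 25],
  [31, 88, 33, 99, 35],
  [41, 42, 43, 44, 45],
  [51, 88, 53, 99, 55],
];

const keptRows = dedupeByColumnSets(sampleRows, [1, 3]);
console.log(keptRows.length); // 4 (the row starting with 51 is dropped)
```

If a row should only be dropped when all designated columns match the same earlier row, a combined key per row is the safer choice.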


1

It's probably not the most efficient algorithm, but I'd do something like this:

function getDeduplicated(unDeduplicated, idxs) {
  const result = [];
  const used = new Set();
  unDeduplicated.forEach(arr => {
    const vals = idxs.map(i => arr[i]).join();
    if (!used.has(vals)) {
      result.push(arr);
      used.add(vals);
    }
  });

  return result;
}
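
One caveat, which matters only for inputs beyond the question's numeric data: join() builds a comma-separated string, so values that themselves contain commas can produce the same key for different rows. A minimal sketch that uses JSON.stringify for the key instead is shown below; getDeduplicatedByKey and the sample data are hypothetical names chosen for illustration.

```javascript
// Hypothetical variant: same Set-of-keys idea, but with a JSON key
// so that ['a,b', 'c'] and ['a', 'b,c'] do not collide.
function getDeduplicatedByKey(rows, idxs) {
  const used = new Set();
  return rows.filter((row) => {
    const key = JSON.stringify(idxs.map((i) => row[i]));
    if (used.has(key)) return false; // a row with this key was already kept
    used.add(key);
    return true;
  });
}

const trickyRows = [
  ['a,b', 'c'],
  ['a', 'b,c'], // join() would give 'a,b,c' here too, but the JSON key differs
  ['a,b', 'c'], // true duplicate of the first row
];

const uniqueRows = getDeduplicatedByKey(trickyRows, [0, 1]);
console.log(uniqueRows.length); // 2
```

For plain numbers, as in the question, both key styles behave the same.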


0

I'm not sure I understand exactly what you want to do, but here is what I've done:

const list = [
  [ 11, 12, 13, 14, 15, ],
  [ 21, 22, 23, 24, 25, ],
  [ 21, 58, 49, 57, 28, ],
  [ 31, 88, 33, 88, 35, ],
  [ 41, 42, 43, 44, 45, ],
  [ 51, 88, 53, 88, 55, ],
  [ 41, 77, 16, 29, 37, ],
];

const el_list = []  // Auxiliary list of all unique numbers seen so far
const res_list = list.reduce(
    (_list, row) => {
        // console.log(_list)
        const this_rows_el = []  // Auxiliary list of this row's elements
        _list.push(row.reduce(
            (keep_row, el) => {
                // console.log(keep_row, this_rows_el, el)
                if(keep_row && el_list.indexOf(el)==-1 ){
                    el_list.push(el)
                    this_rows_el.push(el)
                    return true
                }else if(this_rows_el.indexOf(el)!=-1) return true  // Bypass repeated elements in this row
                else return false
            }, true) ? row : null)  // To get only duplicated rows (...) ? null : row )
        return _list
    }, []
)

console.log(res_list)


0

This is fairly concise. It uses nested filters. It will also work for any number of duplicates, keeping only the first one.

const init = [
  [ 11, 12, 13, 14, 15],
  [ 21, 22, 23, 24, 25],
  [ 31, 88, 33, 99, 35],
  [ 41, 42, 43, 44, 45],
  [ 51, 88, 53, 99, 55],
];

var deDuplicate = function (array, indices) {
  var res = array.filter(
    (elem, elemIdx) => !array.some(
      (el, elIdx) =>
        elIdx < elemIdx && // check that we don't discard the first dupe
        // check if the values at the requested indexes are the same;
        // made a bit nasty by the fact that you can't compare arrays with ===
        el.filter((_, i) => indices.includes(i))
          .every((l, index) => l === elem.filter((_, j) => indices.includes(j))[index])
    )
  );
  return res;
};
console.log(deDuplicate(init,[1,3]));


0

This is not the most efficient approach, since it compares every pair of rows, but it handles more than one group of duplicate rows.

const unDeduplicated = [ [ 11, 12, 13, 14, 15, ], [ 21, 22, 23, 24, 25, ], [ 31, 88, 33, 99, 35, ], [ 41, 33, 43, 44, 45, ], [ 51, 88, 53, 99, 55, ]]
const unDeduplicated1 = [
  [ 11, 12, 13, 14, 15, ],
  [ 21, 22, 23, 24, 25, ],// duplicate in indices: 1, 3 with row index 3
  [ 31, 88, 33, 99, 35, ], // duplicate in indices: 1, 3 with row index 4
  [ 21, 22, 43, 24, 45, ],// duplicate in indices: 1, 3 // delete this
  [ 51, 88, 53, 99, 55, ], // duplicate in indices: 1, 3 // delete this row from result
];
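function getDeduplicated(arr, arind) {
  for (let i = 0; i < arr.length; i++) {
    for (let j = 1 + i; j < arr.length; j++) {
      // Row j duplicates row i when the values at every index in arind match
      if (arind.every(idx => arr[j][idx] === arr[i][idx])) {
        arr.splice(j, 1) // remove the duplicate (this mutates arr)
        j-- // re-check the row that shifted into position j
      }
    }
  }
  return arr
}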
const deduplicated = getDeduplicated(unDeduplicated, [1, 3]);
const deduplicated2 = getDeduplicated(unDeduplicated1, [1, 3]);

console.log(deduplicated)
console.log("#####################")
console.log(deduplicated2)
