Overview
This tool is available in a room only when the Perform Near Duplicate Analysis room setting is turned On.
Near Duplicate (ND) Analysis runs on every uploaded document for which text can be created, except List Item Attachments.
After all ND workflows finish executing, the Documents grid shows the ND-related columns listed below, populated according to the method selected for similarity calculation. MiniHash is selected by default; Ngram is the other available method.
- MiniHash GroupId
- MiniHash Similarity %
- Ngram GroupId
- Ngram Similarity %
If a text file cannot be created for a document (for example, audio, video, unextracted archives, and some image files), these columns are empty.
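To make the two method names more concrete, here is a minimal Python sketch of how n-gram and MinHash similarity percentages are commonly computed. It assumes that the MiniHash method corresponds to the standard MinHash technique and that Ngram compares sets of word n-grams. The product's actual implementation is not documented here, so every function name and parameter below (`shingles`, `num_hashes`, the 3-word n-gram size) is hypothetical.

```python
import hashlib

def shingles(text, n=3):
    """Return the set of word n-grams in text (n=3 is an assumed default)."""
    words = text.lower().split()
    if not words:
        return set()
    if len(words) < n:
        return {tuple(words)}
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def ngram_similarity(a, b, n=3):
    """Exact Jaccard similarity of the two n-gram sets, as a percentage.

    Returns None when either document has no text, mirroring empty columns.
    """
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa or not sb:
        return None
    return 100.0 * len(sa & sb) / len(sa | sb)

def _hash(shingle, seed):
    """64-bit hash of one shingle; the seed selects one of many hash functions."""
    data = " ".join(shingle).encode("utf-8")
    salt = seed.to_bytes(8, "big")
    digest = hashlib.blake2b(data, digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "big")

def minhash_signature(text, num_hashes=128, n=3):
    """Keep the smallest hash value per hash function (hypothetical size: 128)."""
    items = shingles(text, n)
    if not items:
        return None
    return [min(_hash(s, seed) for s in items) for seed in range(num_hashes)]

def minhash_similarity(sig_a, sig_b):
    """Estimate Jaccard similarity as the share of matching signature slots."""
    if sig_a is None or sig_b is None:
        return None
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return 100.0 * matches / len(sig_a)

base = "the quick brown fox jumps over the lazy dog"
other = "the quick brown fox leaps over the lazy dog"
print(ngram_similarity(base, other))
print(minhash_similarity(minhash_signature(base), minhash_signature(other)))
```

In this sketch the n-gram value is exact, while the MinHash value is an estimate of the same quantity built from a fixed-size signature, which is why the two methods can report slightly different percentages for the same pair of documents.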
Note: If the ND setting for the room is turned On and saved after documents were uploaded, this initiates ND workflows for all unprocessed documents.
The same happens if the ND method is changed: ND workflows run again for all documents in the room, this time calculating similarity with the other method.
Similarity calculation uses the value configured in the ND Threshold room setting.
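As an illustration of how a threshold can turn similarity scores into group IDs, the following Python sketch assigns each document to the first group whose base document it matches at or above the threshold. This is only one plausible grouping strategy, not the product's documented behavior. The function names, the word-level Jaccard measure, and the default threshold of 80 are all assumptions.

```python
def word_jaccard(a, b):
    """Stand-in similarity: Jaccard overlap of word sets, as a percentage."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return 100.0 * len(sa & sb) / len(sa | sb)

def group_near_duplicates(docs, threshold=80.0, similarity=word_jaccard):
    """Map doc_id -> (group_id, similarity_percent).

    docs maps doc_id -> text, or None when no text could be created.
    Documents without text get (None, None), mirroring empty grid columns.
    """
    pivots = []   # (group_id, text of the group's base document)
    result = {}
    for doc_id, text in docs.items():
        if not text:
            result[doc_id] = (None, None)
            continue
        for group_id, pivot_text in pivots:
            score = similarity(pivot_text, text)
            if score >= threshold:
                result[doc_id] = (group_id, score)
                break
        else:
            # No existing group is similar enough: start a new one.
            group_id = len(pivots) + 1
            pivots.append((group_id, text))
            result[doc_id] = (group_id, 100.0)
    return result

docs = {
    "a": "alpha beta gamma delta epsilon",
    "b": "alpha beta gamma delta zeta",
    "c": "completely different words here",
    "d": None,  # e.g. an audio file with no extractable text
}
print(group_near_duplicates(docs, threshold=60.0))
print(group_near_duplicates(docs, threshold=80.0))
```

With the lower threshold, documents "a" and "b" land in the same group. With the higher one, "b" starts its own group, which shows why changing the threshold can change the grouping.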
Diff Viewer
When the Show Diff Viewer room setting is turned On, you can select Diff View from the viewer's dropdown to see the differences in content between the base document and any other file you select.
Right-click a document and select Set as Left Document for Comparison.
Then click a different document, and the viewer shows the matching content and the lines added or removed.
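The kind of comparison described above can be approximated with Python's standard `difflib` module. This is a rough sketch of a line-level diff between a left (base) document and a second document, not the viewer's actual algorithm. The function name `diff_lines` and the labels "match", "removed", and "added" are made up for illustration.

```python
import difflib

def diff_lines(left_text, right_text):
    """Classify lines as matching, removed (left only), or added (right only)."""
    left, right = left_text.splitlines(), right_text.splitlines()
    matcher = difflib.SequenceMatcher(a=left, b=right, autojunk=False)
    result = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            result += [("match", line) for line in left[i1:i2]]
        else:
            # "replace", "delete", and "insert" all reduce to removals
            # from the left document plus additions from the right one.
            result += [("removed", line) for line in left[i1:i2]]
            result += [("added", line) for line in right[j1:j2]]
    return result

left_doc = "Dear team,\nThe meeting is on Monday.\nRegards"
right_doc = "Dear team,\nThe meeting is on Tuesday.\nRegards"
for label, line in diff_lines(left_doc, right_doc):
    print(f"{label:8} {line}")
```

Here the document set as the left document plays the role of `left_text`, and the document clicked afterwards plays the role of `right_text`.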