Q does not currently have a built-in function to find duplicate cases, but a request has been made to the developers to add one.
In the meantime, a work-around is to do this using a combination of R and JavaScript, as follows:
1. On the Variables and Questions tab, right-click and select Insert Variable(s) > R Variable....
2. In the R CODE field, paste in the following code:
##### BEGIN CODE
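x <- cbind(UID, paste(VAR1, VAR2, VAR3))
# Column 1 of x is the unique ID; column 2 is the combined values to compare.
res <- cbind("uid" = x[, 1], "dupe_case_ids" = rep("", nrow(x)))
for (a in seq_len(nrow(x)))
{
    res[a, 2] <- ""
    # Collect the IDs of every other case whose combined values match case a.
    for (b in seq_len(nrow(x)))
    {
        if (x[a, 2] == x[b, 2] && a != b)
            res[a, 2] <- paste(res[a, 2], x[b, 1])
    }
    # Add a message only for cases with at least one match.
    if (res[a, 2] != "")
        res[a, 2] <- paste("ID", x[a, 1], "has duplicate responses in IDs:", res[a, 2])
}
# The result is the text column: blank for cases with no duplicates.
res[, 2]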
##### END CODE
3. Edit the first line of the code: replace UID with the variable Name of the unique ID variable in your data file, and replace VAR1, VAR2, and VAR3 with the Names of the variables you want to include in the check, e.g. your e-mail variable. You can add more variables by separating them with commas in the list. Be careful not to delete any parentheses accidentally.
4. Click the Play button when done. You should see a result in the pane on the left.
5. Assign a Question Name (at the bottom of the dialogue box).
6. Click Add R Variable.
This will add a text variable to your data. For each case that has duplicates, it stores a message listing the ID(s) (separated by spaces if more than one) of the other cases that match it exactly across the specified variables; for cases without duplicates it is blank. You can then create a filter based on this variable:
1. On the Variables and Questions tab, right-click and select Insert Variable(s) > JavaScript Formula > Numeric...
2. In the Expression field, paste in the following code:
R_CAYET != ""
3. Replace R_CAYET with the variable Name of the R text variable created earlier (it will be R_ followed by a random string).
4. When done, click OK.
5. In the Tags column, set this variable to a filter by clicking the F.
6. If you want to delete these cases, go to the Data tab and click the AZ filter button. You can then sort on two variables (two levels), e.g. the new variable descending and then e-mail address ascending.
7. The duplicate cases should then appear at the top of your file. Ensure that you've set your UID in Case IDs: at the top, and then select the rows of cases you want to remove. N.B. this filter flags both duplicates (or all of them, if there are more than two in a set), so if you want to retain one out of a set of duplicates, you'll need to select the rows manually, one by one. If you need to select a number of cases at once, you can do this by holding Shift and pressing Page Down.
8. Right-click the selected rows, and then click Delete Rows (click to the left of all the data to do this as a group; if you click within the data, it will only select one case).
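The N.B. in step 7 notes that the filter flags every member of a duplicate set. As a rough illustration of the matching logic behind the R variable, and of one way to leave the first case in each set unflagged, here is a standalone JavaScript sketch. It runs outside Q on made-up data; the function name findDuplicates, the ids and keys arrays, and the later flag are all hypothetical and are not part of Q.

```javascript
// Hypothetical helper, not part of Q. ids[i] is the unique ID of case i and
// keys[i] is its combined check value (for example, the e-mail address).
function findDuplicates(ids, keys) {
    // Group case positions by key so each set of matching cases is found
    // without comparing every case against every other case.
    var groups = {};
    for (var i = 0; i < keys.length; i++) {
        var k = "k:" + keys[i]; // prefix avoids clashes with built-in property names
        if (!groups[k]) groups[k] = [];
        groups[k].push(i);
    }
    var result = [];
    for (var a = 0; a < keys.length; a++) {
        var members = groups["k:" + keys[a]];
        var others = [];
        for (var m = 0; m < members.length; m++) {
            if (members[m] !== a) others.push(ids[members[m]]);
        }
        result.push({
            id: ids[a],
            // Same idea as the R variable: blank unless other cases match.
            message: others.length === 0 ? "" :
                "ID " + ids[a] + " has duplicate responses in IDs: " + others.join(" "),
            // True only for the second and later cases in a set, so
            // filtering on this flag keeps the first case of each set.
            later: members.length > 1 && members[0] !== a
        });
    }
    return result;
}

// Made-up sample data: cases 101, 103 and 104 share an e-mail address.
var rows = findDuplicates(
    [101, 102, 103, 104],
    ["a@x.com", "b@x.com", "a@x.com", "a@x.com"]
);
console.log(rows[0].message); // ID 101 has duplicate responses in IDs: 103 104
console.log(rows[2].later);   // true
```

In this sketch blank values match each other, so cases with nothing entered in the checked variables would be grouped as one duplicate set; exclude blanks first if that is not what you want.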
IF THE ABOVE DOESN'T WORK, IT MAY BE BECAUSE THERE IS AN ERROR WITH Q SENDING INFO ONLINE. IN THIS CASE, FOLLOW THE STEPS BELOW.
Thanks for your note below. The error you're experiencing occurs because the calculation is taking too long: there's a restriction of 230 seconds, which we're currently working on fixing so that these errors won't appear in future. As a temporary, and less flexible, solution, you can check these two variables (e-mail and phone) for duplicates using the JavaScript below.
1. On the Variables and Questions tab, right-click and select Insert Variable(s) > JavaScript Formula > Text...
2. In the Expression field, paste in the code below:
///// BEGIN CODE
// The two variables to check, referred to by their variable Names.
var email = q0047_0009;
var phone = q0047_0010;
var email_count = [];
var phone_count = [];
var output = [];
for (var a = 0; a < email.length; a++)
{
    // Count how many cases share this case's e-mail (including itself).
    var ecount = 0;
    for (var b = 0; b < email.length; b++)
    {
        if (email[a] == email[b])
            ecount = ecount + 1;
    }
    email_count[a] = ecount;
    // Count how many cases share this case's phone number (including itself).
    var pcount = 0;
    for (var c = 0; c < phone.length; c++)
    {
        if (phone[a] == phone[c])
            pcount = pcount + 1;
    }
    phone_count[a] = pcount;
    // Report only non-blank values that appear more than once.
    var edupe = "";
    var pdupe = "";
    if (email[a] != "" && email_count[a] > 1)
        edupe = "Email appears " + email_count[a] + " times. ";
    if (phone[a] != "" && phone_count[a] > 1)
        pdupe = "Phone number appears " + phone_count[a] + " times. ";
    // Blank when neither value is duplicated.
    output[a] = edupe + pdupe;
}
output;
///// END CODE
3. Below the Expression field, ensure you tick the box marked Access all data rows (advanced).
4. Click Check Code (you should get a green tick).
5. Click OK.
This will give you a new variable that stores, against each case, whether it's a duplicate based on e-mail, phone, or both. On the Data tab, you can sort by this variable descending to get an idea of which cases are duplicates. I'm counting 1,380 cases in this file.
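The nested loops in the JavaScript above compare every case with every other case, which can be slow on large files. As a sketch of an alternative, assuming the values behave as ordinary strings, the counts can be tallied in a single pass with a lookup object. The function names countValues and describeDuplicates and the sample arrays are hypothetical; this is standalone JavaScript and has not been tested inside Q.

```javascript
// Hypothetical helpers for illustration; not part of Q.
// Tally how many times each value appears, in a single pass.
function countValues(values) {
    var counts = {};
    for (var i = 0; i < values.length; i++) {
        var k = "k:" + values[i]; // prefix avoids clashes with built-in property names
        counts[k] = (counts[k] || 0) + 1;
    }
    return counts;
}

// Build one message per case, in the same wording as the code above.
function describeDuplicates(email, phone) {
    var emailCounts = countValues(email);
    var phoneCounts = countValues(phone);
    var output = [];
    for (var a = 0; a < email.length; a++) {
        var text = "";
        var e = emailCounts["k:" + email[a]];
        var p = phoneCounts["k:" + phone[a]];
        // Blank values are never reported as duplicates.
        if (email[a] != "" && e > 1) text += "Email appears " + e + " times. ";
        if (phone[a] != "" && p > 1) text += "Phone number appears " + p + " times. ";
        output[a] = text;
    }
    return output;
}

// Made-up sample data: one repeated e-mail and one repeated phone number.
var sample = describeDuplicates(
    ["a@x.com", "", "a@x.com", ""],
    ["555-0100", "555-0100", "555-0199", ""]
);
console.log(sample);
```

Because the lookup object converts every value to text, results could differ from the nested-loop version if the variables hold non-text values, so treat this as a starting point rather than a drop-in replacement.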