Credit Card Grep for EnCase

I was given a challenge by a colleague to come up with a good grep for credit cards.  He had come up with a fairly complicated grep statement that looked for them by the number ranges each company uses: one for American Express (37…), one for MasterCard (54…), etc.  While this can be useful, in my mind it overcomplicates things, and while it may be more exact, the search in question is going to take MUCH longer.  (Generally, the more complicated a grep expression is, the longer it takes to search with it.)

In EnCase GREP, the character that represents any single digit is “#”.  (This is specific to EnCase; most other grep and regex flavors use [0-9] or \d instead.)  The simplest grep for a (non-American Express) credit card number is:

####-####-####-####

We can loosen this to:

#{0,4}-#{0,4}-#{0,4}-#{0,4}

Which means we’re searching for zero to four digits in each of the four groups, with dashes between the groups.

But that’s too loose: because every group may be empty, even a bare “---” matches, which will lead to a lot of false positives.  We’re better off writing it this way:

#{1,4}-#{1,4}-#{1,4}-#{1,4}

since we want at least one digit in each section.

Even this would work:

#{3,4}-#{3,4}-#{3,4}-#{3,4}

as we’re unlikely to run into any credit card number that has fewer than three digits in a section.
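To see how the three variants differ, here is a minimal Python sketch.  It assumes EnCase’s “#” can be approximated by [0-9] in Python’s re module; the helper names and the sample strings are made up for illustration, and this is not how EnCase itself evaluates an expression.

```python
import re

def encase_to_python(pattern: str) -> str:
    """Approximate an EnCase GREP pattern in Python's re syntax.

    Assumption: only the '#' digit wildcard needs translating; the other
    operators used in this post ({x,y}, '.', '?', '-') are treated as
    behaving the same way in both syntaxes.
    """
    return pattern.replace("#", "[0-9]")

def matches(pattern: str, text: str) -> bool:
    # fullmatch keeps the comparison simple: the whole sample must fit.
    # A real search would look for the pattern anywhere in the data.
    return re.fullmatch(encase_to_python(pattern), text) is not None

VARIANTS = [
    "#{0,4}-#{0,4}-#{0,4}-#{0,4}",
    "#{1,4}-#{1,4}-#{1,4}-#{1,4}",
    "#{3,4}-#{3,4}-#{3,4}-#{3,4}",
]

# Made-up sample strings, not real card numbers.
SAMPLES = ["1234-5678-9012-3456", "123-456-789-012", "1-2-3-4", "---"]

for variant in VARIANTS:
    hits = [s for s in SAMPLES if matches(variant, s)]
    print(variant, "->", hits)
```

The first variant accepts all four samples, including the bare dashes; the second drops “---”; the third also drops “1-2-3-4”.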

If the separators might be something other than dashes, use the “.” character:

#{3,4}.#{3,4}.#{3,4}.#{3,4}

as that will match any single character.  (That may give you some false positives, but it’s going to be faster.)  You could also list specific separator characters in brackets if you wanted.
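The difference between “.” and a bracketed set can be sketched in Python as follows.  Again, [0-9] stands in for EnCase’s “#”, the bracket set [- ] (dash or space) is just one possible choice, and the names and sample strings are hypothetical.

```python
import re

DIGITS = "[0-9]{3,4}"  # stand-in for EnCase's #{3,4}

# "." accepts any single character between the groups.
ANY_SEP = re.compile(".".join([DIGITS] * 4))
# A bracketed set accepts only the listed separators (dash or space here).
SET_SEP = re.compile("[- ]".join([DIGITS] * 4))

def found(regex: re.Pattern, text: str) -> bool:
    """Return True when the pattern occurs anywhere in the text."""
    return regex.search(text) is not None

# Made-up sample strings, not real card numbers.
SAMPLES = [
    "1234-5678-9012-3456",   # dashes
    "1234 5678 9012 3456",   # spaces
    "1234x5678y9012z3456",   # arbitrary separators
    "1234567890123456",      # no separators at all
]

for sample in SAMPLES:
    print(f"{sample:22} any={found(ANY_SEP, sample)} set={found(SET_SEP, sample)}")
```

Because “.” also matches a digit, the “.” version hits on the unbroken run of sixteen digits as well as on the arbitrary separators, while the bracketed version accepts only the dash and space forms.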

What about American Express though?  They went and screwed things up by using only fifteen digits and grouping them differently from all the rest.

A basic grep for American Express would be:

####-######-#####

wherein we’re looking for the fifteen digits grouped four, six, and five.  Is it possible to fold this into our existing grep so that we catch them all?

The answer is yes:

#{3,4}-#{3,6}-#{3,5}-?#{0,4}

The trick is to remember that an American Express number has only three groups, so the fourth group of digits may not be there at all.  That means the last dash may or may not be present, so we use the “?” operator to indicate that it occurs zero or one times.  Likewise, the last group may have no digits, so we change its repeat operator {x,y} to {0,4}, meaning zero to four digits.  The middle groups are widened to {3,6} and {3,5} to fit American Express’s six- and five-digit groups.
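As a quick check of the combined expression, here is a small Python sketch.  It writes EnCase’s “#” as [0-9], which is an approximation for illustration only; the function name is hypothetical and the sample strings are made-up digits, not real card numbers.

```python
import re

# The combined expression from this post, with EnCase's "#" written as [0-9]
# (an approximation for illustration; EnCase evaluates its own syntax).
COMBINED = re.compile(r"[0-9]{3,4}-[0-9]{3,6}-[0-9]{3,5}-?[0-9]{0,4}")

def is_candidate(text: str) -> bool:
    """Return True when the whole string fits the combined expression."""
    return COMBINED.fullmatch(text) is not None

SAMPLES = [
    "1234-5678-9012-3456",  # four groups of four
    "1234-567890-12345",    # 4-6-5 grouping, American Express style
    "555-867-5309",         # phone-style number, shows how loose the pattern is
    "1234-5678",            # only two groups
]

for sample in SAMPLES:
    print(f"{sample:22} {is_candidate(sample)}")
```

Both card layouts fit, and so does the phone-style number, since three groups of three or more digits are enough to satisfy the expression.  That is the trade-off of one loose expression over several exact ones: expect to weed out some false positives afterwards.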
