[KLUG Programming] questions on picking data types
Adam Tauno Williams
awilliam at whitemice.org
Wed Jul 14 19:21:24 EDT 2004
> For SSN, three fields feels wrong to me. I imagine the only
> operation you'll be doing on this field, apart from reading and
> writing it, is searching on the whole thing. Will you ever need
> to retrieve all SSNs with "45" as the middle 2 digits?
> If not, then it probably makes more sense to just store the
> 9-digit SSN as an INT. That way when you need to retrieve a row
> when you know the whole SSN, it's one index lookup instead of
> three. A single INT takes fewer bytes than the three-field idea
> too, though that's a very, very small difference.
I'd just store it as a string. Since a SSN isn't really a number, even
though it is composed of all digits. Otherwise you will have problems
with SSN's that *START* with a zero (an INT field will obviously drop
it). Like a phone number isn't a number. And while you may store SSN,
how often do you ever actually look someone up using SSN?
> The only gotcha with doing all 9 digits in one INT is that if you
> need to parse it with hyphens, you'll have to do that in your
> input and/or output code. But that's where you should be doing
> that anyway. Far as I'm concerned, all that cruft, adding hyphens
> for the most common representation of SSN, or parens or hyphens or
> periods or whatever to the 14 most common ways to write a phone
> number, is all I/O handling. You shouldn't design your DB schema
> based on whether "(123) 456-7890" or "123.456.7890" is preferred
> this week or in this country, and the same goes for "123-45-6789".
Agree, for phone numbers, etc... it is best to just store the actual
data parts, and drop all the .-() crap.
Also don't forget that *USA* phone numbers are XXX-XXX-XXXX, foreign
phone numbers are not.
More information about the Programming
mailing list