Hi,
I'm working on a large database that contains data from a longitudinal study. For each person I have, among other things, a personal identity code and the personal identity codes of the parents. I have managed to identify siblings, creating a new variable that has the same "family number" for all siblings. I now wish to identify cousins as well as uncles/aunts. Any suggestions? Thank you!
Nomi
At 12:22 PM 1/31/2007, Nomi wrote:
>For each person I have, among other things, a personal identity code
>and personal identity codes of parents. I have managed to identify
>siblings, creating a new variable that has the same "family number"
>for all siblings.

For all siblings, and their parents? That's a nuclear family. Or is it not a "family number" but a "siblings" number, with the parents numbered into whatever sets of siblings THEY belong to? Of course, each person record needs an identifier for the parents, or you've no chance. What do you do about half-siblings?

Should you drop the 'family number' or 'sibling group number', as you've defined it, in favor of giving the person numbers for the parents? Then siblings would be easy to sort out: they have the same pair of parents.

>I now wish to identify cousins as well as uncles/aunts. Any
>suggestions?

It's always an interesting problem. Back in the 1970s, the people I knew who were working on it called it 'family reconstruction'; is that term still current?

It's a special and complicated case of the problem called 'transitive closure'. 'Transitive closure': implement the rule "if A is connected to B, and B is connected to C, then A is connected to C". Do that by exploring the connections to whatever depth is necessary, and adding connections as necessary; for example, in the above case, if "A is connected to C" isn't already in the file, add it.

It's complicated, because "is related to" isn't enough. (If I'm related to you and you're related to Joan, then I'm related to Joan; but if we follow that chain far enough, both of us are probably related to everybody.) You want the degree and kind of relationships, not just their existence.

Let's see. I think the 'primitive', or irreducible, relationships are 'father/child', 'mother/child', and 'husband/wife', and every other is definable by following those. (Gender-specific terms used advisedly.) 'Sibling', for example, is having the same father and mother.
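To make "definable by following those" concrete, here is a minimal Python sketch, not a worked-out solution. It assumes each person record carries a father id and a mother id; the `people` table, its ids, and the function names are all hypothetical. It follows only the parent/child links, so it finds blood aunts/uncles and first cousins, and it ignores half-siblings and relatives by marriage (those would need the 'husband/wife' relation as well).

```python
# Hypothetical sample data: person id -> (father id, mother id); None = unknown.
people = {
    "gf": (None, None),
    "gm": (None, None),
    "al": ("gf", "gm"),
    "bea": ("gf", "gm"),
    "al_wife": (None, None),
    "bea_husb": (None, None),
    "cy": ("al", "al_wife"),
    "di": ("bea", "bea_husb"),
    "ed": ("bea", "bea_husb"),
}


def known_parents(people, pid):
    """Return the set of recorded (non-missing) parent ids of pid."""
    return {p for p in people.get(pid, (None, None)) if p is not None}


def siblings(people, pid):
    """Full siblings: same recorded father AND same recorded mother."""
    father, mother = people[pid]
    if father is None or mother is None:
        return set()  # cannot establish full siblings without both parents
    return {q for q, parents in people.items()
            if q != pid and parents == (father, mother)}


def aunts_uncles(people, pid):
    """Blood aunts/uncles: full siblings of either recorded parent."""
    result = set()
    for parent in known_parents(people, pid):
        if parent in people:  # the parent may have no record of their own
            result |= siblings(people, parent)
    return result


def first_cousins(people, pid):
    """First cousins: anyone with a parent among pid's blood aunts/uncles."""
    relatives = aunts_uncles(people, pid)
    return {q for q in people
            if q != pid and known_parents(people, q) & relatives}


print(sorted(siblings(people, "di")))
print(sorted(aunts_uncles(people, "cy")))
print(sorted(first_cousins(people, "cy")))
```

Each function scans the whole table, which is fine for illustration; for a large database one would probably first index the records by parent pair. Because the kind of relationship is built into each function, this sidesteps the "everybody is related to everybody" problem of a plain transitive closure.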
(Further complications: Our society is considering same-gender spouses; if yours includes those, 'husband/wife' isn't the only marriage relationship. And if a woman is in 'husband/wife' with two men, is that widowhood, divorce, or polyandry? Do beginning and ending dates, and ending reasons, of marriages need to be included?)

I haven't worked on transitive closure in SPSS for quite a while. It could be a chance to work up Python skills. On the other hand, SPSS may not be the most suitable tool. I suggest:

. Look in the literature for 'family reconstruction.' I know people have done this before - as I say, I knew people at Brown University doing it, back in the 1970s. There must be computer methods; it's the first thing one would think of. This particular wheel must already have been invented.

. Your E-mail domain doesn't specify and you don't use a signature. But, if you're at a university with a Computer Science department, this might be about the level for a master's thesis, or maybe senior project, in CS, if you could persuade somebody there that it's interesting.

Fascinating problem. Good luck with the computational problem; even more, good success with your study.

Richard Ristow