SOUR

SOUR is a tag and a record

One reason for going into detail about SOUR is that it illustrates the difference between a tag and a record, and the use of xrefs (cross-references). You can be a Lifelines user for a long time without ever being clear on what the difference is, because you should be using REFN to tie records together and you should never edit xrefs directly. Editing xrefs by hand will usually break the Lifelines database.

You can also write some meaningful report programs without ever confronting xrefs, because many of the built-in Lifelines functions do the heavy lifting when it comes to relating individuals and families. For example, nextsib(indi) finds the next younger sibling of a person, which hides the work of finding the family of the person's parents and going through its children.
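The work that a function like nextsib hides can be sketched roughly as follows. This is Python rather than the Lifelines report language, and the dictionaries, field names, and the next_sibling helper are all hypothetical stand-ins, not Lifelines internals. It also assumes that a family lists its children from oldest to youngest.

```python
# Hypothetical in-memory stand-ins for INDI and FAM records, keyed by xref.
# "child_of" plays the role of the link from a person to the parents' family.
individuals = {
    "@I1@": {"name": "Ann", "child_of": "@F1@"},
    "@I2@": {"name": "Ben", "child_of": "@F1@"},
    "@I3@": {"name": "Cal", "child_of": "@F1@"},
    "@I4@": {"name": "Dee", "child_of": None},  # no known parents
}
# Assumption: children are listed oldest first.
families = {"@F1@": {"children": ["@I1@", "@I2@", "@I3@"]}}


def next_sibling(indi_xref):
    """Return the xref of the next younger sibling, or None if there is none."""
    fam_xref = individuals[indi_xref]["child_of"]
    if fam_xref is None:
        return None  # no parents' family on record, so no siblings to find
    children = families[fam_xref]["children"]
    position = children.index(indi_xref)
    if position + 1 < len(children):
        return children[position + 1]
    return None  # this person is the youngest child
```

Every step here goes through an xref, which is exactly the bookkeeping the built-in function spares the report writer.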

Textbook comparison

The difference between a SOUR record and a SOUR tag (often a SOURCE_CITATION) can be compared to the difference between a bibliographic entry (in the back of a textbook) and a footnote at the bottom of a page in the textbook. Sometimes, when a source is used in passing or only once, all the information about the source is contained in the footnote, and the analogy holds for SOUR in that case too.

The bibliographic entry contains much data about the publication that is being used as a source. The footnote does not repeat any more of that data than is necessary to identify which bibliographic entry is meant (in a textbook, this may be the principal author's surname or an abbreviation for the name of a well-known reference work). The footnote contains information about the place in the work that contains the particular item being sourced and may contain a brief quotation so there is no mistaking where the item came from. Some footnotes do not refer to any bibliographic entry, but contain all of the information provided about the source.

For (fictional) example:


(bibliography entry)

Evans, Taylor.  The Secret Life of Wildflowers. (New York: Harper & Row,
       1844.) 2 volumes.
       
(a footnote)

1. Evans. v. 1, p. 28: "Wildflowers are pretty."

(another footnote)

2. A neighbor lady said so.

This is not meant to be an illustration of any particular form of academic citation.

The thing that ties the footnote to the bibliography entry is, in this case, the author's surname. It is important that Evans be unique, so that it points unambiguously to the bibliography entry. If Evans is not unique, some citation systems add the publication date in parentheses, and so forth. You need the bibliography entry to find the book in the library, but when you have the book, you need the footnote to find the particular passage. If you are a trusting soul, you will just take the author's word that the quotation appears where he says it does. Even if you take his word for the citation, you still may quibble about whether it proves what he seems to think it does.

The GEDCOM model

The GEDCOM SOUR record is like the bibliography entry and the SOUR tag (SOURCE_CITATION) is like the footnote. One obvious difference should be that you cannot enter a SOUR record in any other kind of record. This point is worth noticing because the Lifelines user interface lets you add a SOUR record to the database when you are looking at an individual or family record (with the %s 'Add source' command). This may create the false impression that a SOUR record is created in the individual or family record.
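To make the distinction concrete, here is a minimal sketch in Python that reads a few GEDCOM lines and separates SOUR records (level 0, with their own xref) from SOUR tags used as citations inside other records. The sample data, the REFN value, and the helper names parse_line and classify_sour are invented for illustration. The sketch ignores the header use of SOUR.

```python
# Invented sample data: one SOUR record and one INDI record that cites sources.
GEDCOM = """\
0 @S1@ SOUR
1 REFN evans-wildflowers
1 AUTH Taylor Evans
1 TITL The Secret Life of Wildflowers
0 @I1@ INDI
1 NAME Pat /Example/
1 BIRT
2 DATE 1 JAN 1850
2 SOUR @S1@
3 PAGE v. 1, p. 28
1 OCCU Gardener
2 SOUR A neighbor lady said so.
"""


def parse_line(line):
    """Split one GEDCOM line into (level, xref, tag, value)."""
    tokens = line.split(" ")
    level = int(tokens[0])
    if tokens[1].startswith("@"):
        # "0 @S1@ SOUR": the record's own xref comes before the tag
        xref, tag, rest = tokens[1], tokens[2], tokens[3:]
    else:
        # "2 SOUR @S1@": the tag comes first, then its value
        xref, tag, rest = None, tokens[1], tokens[2:]
    return level, xref, tag, " ".join(rest)


def classify_sour(text):
    """Return (xrefs of SOUR records, list of (level, value) SOUR citations)."""
    records, citations = [], []
    for line in text.splitlines():
        level, xref, tag, value = parse_line(line)
        if tag != "SOUR":
            continue
        if level == 0:
            records.append(xref)              # the "bibliography entry"
        else:
            citations.append((level, value))  # the "footnote"
    return records, citations
```

In this sample the first citation points at the SOUR record by xref, while the second carries all of its information itself, like the footnote that refers to no bibliography entry.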

Tags that are valid in a SOUR record may be invalid (or have a different meaning) in a SOURCE_CITATION (SOUR tag). SOUR is also used somewhat differently in the header of a GEDCOM submission, but we are not going to worry about that. SOUR is also used within (below) certain other tags, where it can only be a cross-reference with no tags below it. We will worry about that, but later. For example, the GEDCOM standard defines a fixed set of first-level tags that are valid in a SOURCE_RECORD.

Some of these level 1 tags are the top of various sublevel structures, some of them can be cross-references to other kinds of records, many of them are optional, but there are no other level 1 tags in a GEDCOM SOURCE_RECORD.
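A rough way to check a SOURCE_RECORD against that rule is sketched below in Python. The tag set is an assumption: it is the list of level 1 tags as I read the SOURCE_RECORD structure in the GEDCOM 5.5 standard, so verify it against the version of the standard you follow. The sample record and the function name are invented.

```python
# Assumption: level 1 tags of SOURCE_RECORD as read from GEDCOM 5.5.
SOURCE_RECORD_LEVEL1 = {
    "DATA", "AUTH", "TITL", "ABBR", "PUBL", "TEXT",
    "REPO", "OBJE", "NOTE", "REFN", "RIN", "CHAN",
}


def nonstandard_level1_tags(record_lines):
    """Return the level 1 tags of a SOUR record that are not in the assumed set."""
    found = []
    for line in record_lines:
        tokens = line.split(" ", 2)
        # Only level 1 lines are checked; deeper levels have their own rules.
        if tokens[0] == "1" and tokens[1] not in SOURCE_RECORD_LEVEL1:
            found.append(tokens[1])
    return found


# Invented sample record with one level 1 tag that the standard does not list.
record = [
    "0 @S1@ SOUR",
    "1 REFN evans-wildflowers",
    "1 TITL The Secret Life of Wildflowers",
    "1 DATE 1844",
    "1 NOTE Two volumes",
]
```

With this sample, nonstandard_level1_tags(record) reports only the DATE tag.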

Notice that the order of the level 1 tags is not important. Because it is so important to making Lifelines useable, REFN should probably be at or near the top of every record of any kind.

Lifelines is more flexible than GEDCOM, so it will not slap your wrist if you use some other level 1 tags in a SOURCE_RECORD. But if you do add other level 1 tags, the information in them may not be transmitted in a GEDCOM derived from your database. You also may not use those tags in a consistent manner unless you have carefully thought out their meaning and structure and, if your memory is no better than average, made some kind of record of how you plan to use them. You may intend never to submit a GEDCOM to anyone else, so you are not concerned about whether your database makes sense to the Mormons, but the point of entering data at all is to be able to retrieve it with understanding at some time in the future. That is what report programs are about.

You can write Lifelines reports to make use of the information in your invented or creatively applied tags, but you can only do that if you use such tags consistently. For example, if you add a level 1 DATE tag to the SOURCE_RECORD structure, you need to think about what that means, so it is not the date on the title page of a book one time, the date you found the book in a library another time, and the date the research for the publication was done a third time. Lifelines lets you do the thing that is wrong by GEDCOM standards (add a level 1 DATE tag to a SOURCE_RECORD), but it is up to you to do that wrong thing consistently so that you can retrieve the data in a meaningful way with a report program.

You may notice that some of the reports that come bundled with Lifelines rely on tags that do not exist in the GEDCOM standard or on using tags that do exist in unconventional ways. What you may not notice about bundled reports is that most of them rely most of the time on records and tags being used in ways that are consistent with GEDCOM. When bundled reports fail, often the problem is that the database contains some unconventional tag or unconventional record structure that is fatal to the report. Yes, the perfect report would check for possibly fatal data errors and do something other than die when one is found, but you cannot expect every contributed report to be up to that standard, and you may have found a way to enter something screwy in your database that no report writer, however conscientious, could anticipate.

For the sake of comparison, the GEDCOM standard defines a different set of tags for the levels below a SOUR tag (SOURCE_CITATION).

Not all of these tags can be immediately below a SOUR tag (SOURCE_CITATION). For example, TEXT belongs immediately below a DATA tag in one form of SOURCE_CITATION, while in another form TEXT can be immediately below the SOUR tag, but the DATA tag is not allowed in that form of SOURCE_CITATION. Again, there is no place for a DATE tag immediately below a SOUR tag, so if you put one in, you have to know what you mean by it, and when you write a report program, you have to know how to interpret it.
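The two forms of SOURCE_CITATION can be sketched as two sets of allowed child tags, chosen by whether the SOUR value is a pointer to a SOUR record or a plain description. The sets below are an assumption based on my reading of GEDCOM 5.5 (later drafts allow a few more tags in the description form), and the function names are invented.

```python
# Assumption: tags allowed immediately below SOUR in each form, per GEDCOM 5.5.
POINTER_FORM_CHILDREN = {"PAGE", "EVEN", "DATA", "QUAY", "OBJE", "NOTE"}
DESCRIPTION_FORM_CHILDREN = {"CONC", "CONT", "TEXT", "NOTE"}


def is_pointer(value):
    """True if the SOUR value looks like an xref such as @S1@."""
    return len(value) > 2 and value.startswith("@") and value.endswith("@")


def unexpected_children(sour_value, child_tags):
    """Return the child tags that the assumed sets do not allow for this form."""
    if is_pointer(sour_value):
        allowed = POINTER_FORM_CHILDREN
    else:
        allowed = DESCRIPTION_FORM_CHILDREN
    return [tag for tag in child_tags if tag not in allowed]
```

For example, unexpected_children("@S1@", ["TEXT"]) flags TEXT, because in the pointer form TEXT belongs below DATA, while unexpected_children("A neighbor lady said so.", ["TEXT"]) flags nothing. A DATE tag is flagged in either form.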

The first thing to notice is that there is no REFN tag. That is because you cannot refer to a SOURCE_CITATION. One of the reasons you want to create SOURCE_RECORDS instead of relying entirely on SOURCE_CITATIONS is that the same source may be useful for several different facts which may occur in several different records.

For example, an obituary will often have a number of facts about the life and death of the deceased. It will not be the best evidence for many of the facts (you really want a birth certificate, a death certificate, and so forth), but it may be the best evidence you can get and may serve to support (or the opposite) other evidence. The deceased aside, however, the obituary will often name survivors (which is some evidence they were alive at the time the deceased died), may name the survivors' spouses, and so forth. It may list those who died before the deceased (or at least give you a count of them), from which you learn that certain people died before the deceased, which may be helpful if you have no information at all about the dates of their deaths. There are also some perhaps less obvious bits of evidence: evidence that the funeral home was in operation at this time, evidence that the minister was alive at this time, and so forth. So this one source could easily be pertinent in one way or another to one degree or another to dozens of records in your database. In some it may be essential. In others it may merely confirm other evidence.
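The payoff of a single SOURCE_RECORD cited from many places can be sketched as an index from each source xref to the records that cite it. This is a Python illustration, not a Lifelines report. The sample data and the function name citations_by_source are invented, and the sketch assumes every level 0 line carries an xref (no HEAD or TRLR lines).

```python
from collections import defaultdict

# Invented sample: one obituary cited from two individuals and a family.
GEDCOM = """\
0 @S1@ SOUR
1 TITL Obituary of Pat Example
0 @I1@ INDI
1 NAME Pat /Example/
1 DEAT
2 DATE 3 MAR 1920
2 SOUR @S1@
0 @I2@ INDI
1 NAME Lee /Example/
1 NOTE Named as a survivor
2 SOUR @S1@
0 @F1@ FAM
1 MARR
2 SOUR @S1@
"""


def citations_by_source(text):
    """Map each cited source xref to the xrefs of the records that cite it."""
    index = defaultdict(list)
    current = None
    for line in text.splitlines():
        tokens = line.split(" ", 2)
        if tokens[0] == "0":
            current = tokens[1]  # assumes "0 @XREF@ TAG" for every record
            continue
        is_pointer_citation = (
            tokens[1] == "SOUR" and len(tokens) == 3 and tokens[2].startswith("@")
        )
        if is_pointer_citation and current not in index[tokens[2]]:
            index[tokens[2]].append(current)
    return dict(index)
```

Here the one source @S1@ turns up under three different records without its title being typed more than once.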

Cross-references help to avoid entering repetitive information. In other database records they help to make clear what we mean. When you enter an individual by reference in a family record (something the Lifelines interface does for you) and the family in the individual's record (again, something Lifelines does for you), you link the records together in an unambiguous way, so there is no question of the database confusing individuals with similar names.
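The two-way linking described above can be sketched as a check that every family-to-person link has a matching link back. HUSB, WIFE, CHIL, FAMS, and FAMC are standard GEDCOM tags, while the sample data and the helper names are invented. This is Python for illustration only, and Lifelines normally maintains these links for you.

```python
# Invented sample: two people with the same name, kept apart by their xrefs.
# @I3@ is listed as a child of @F1@ but has no FAMC link back to it.
GEDCOM = """\
0 @I1@ INDI
1 NAME John /Smith/
1 FAMS @F1@
0 @I2@ INDI
1 NAME John /Smith/
1 FAMC @F1@
0 @I3@ INDI
1 NAME Mary /Smith/
0 @F1@ FAM
1 HUSB @I1@
1 CHIL @I2@
1 CHIL @I3@
"""


def parse_records(text):
    """Return {xref: [(tag, value), ...]} using only level 1 lines with values."""
    records = {}
    current = None
    for line in text.splitlines():
        tokens = line.split(" ", 2)
        if tokens[0] == "0":
            current = tokens[1]
            records[current] = []
        elif tokens[0] == "1" and len(tokens) == 3:
            records[current].append((tokens[1], tokens[2]))
    return records


def missing_back_links(records):
    """List (family, tag, person) links that the person does not link back to."""
    problems = []
    for xref, fields in records.items():
        for tag, value in fields:
            if tag in ("HUSB", "WIFE"):
                back = ("FAMS", xref)   # a spouse should point back with FAMS
            elif tag == "CHIL":
                back = ("FAMC", xref)   # a child should point back with FAMC
            else:
                continue
            if back not in records.get(value, []):
                problems.append((xref, tag, value))
    return problems
```

The two John Smiths never get confused because the family refers to @I1@ and @I2@, not to names. The only problem the check finds in this sample is the missing link back from @I3@.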

©Copyright 2009 by Lars Eighner. Original material may be copied for personal use, but may not be sold, made available contingent on the payment of any fee or access charge and may not be bundled in any product which is sold for a fee or media charge or which requires any payment for access. In short, you cannot charge money for material I have made freely available. Software and other products mentioned may be trademarks belonging to their respective owners.