How many sources/citations is too many?



I've been using GRAMPS, and am a little overwhelmed by the citation abilities. So I guess the question for seasoned veterans of genealogy is where do you put the citations, given the ability to cite every little detail? Or do you put it everywhere?

I've mostly been using the GRAMPS software, so the question may be tilted towards that. But, for example, a census can be cited in:

  1. A family object
  2. A census event
  3. Residence event for each individual (shared)
  4. Occupation event for each individual
  5. Each person's estimated birth event (year and place, if no other data is available)
  6. The names of each individual
  7. The birth relationship for each person (not always confident in the census for that, but sometimes better than nothing)
  8. Marriage event
  9. Other miscellaneous stuff (some list naturalization date, etc)

Similarly for birth/death certificates.

So how much is too much? Or is it common for people to cite just about everything like that? It's tedious, but I suppose it strengthens the relationships between all the data.

Benjamin Pritchard

Posted 2013-12-27T15:49:48.870

Reputation: 341



To me the whole question of how much to cite is an artifact of trying to use lineage-linked software as a tool for recording what we find in evidence. It took me years to figure out that these programs are designed to keep track of the material which we have already 'proven' or at least have concluded belongs to the same person. This may sound obvious, but I was trained to handle data very differently, so to me the whole data-entry process for a lineage-linked program is backwards from the way it should be.

In a source-based process, it works like this. You keep a research journal of what you collect, when you found it and where you found it. Let's say you have an oral history, because that's the kind of family history material which is most like the data I was trained to collect. You interview Aunt Jane. You record what she said, and then you go home and rewrite your notes and/or listen to your sound recording or video, and you make a transcript. Then you break that transcript down into all the bits of evidence that Aunt Jane told you, which some people call facts. But in reality, it's all stuff that you only know because "Aunt Jane says so", and to remind ourselves of this, some family historians prefer to call them "assertions".

From this perspective, the answer to "where do you put the citations" is automatically "everywhere". You've collected this one source, and you've started to extract the assertions in it. You haven't done any analysis of it, or proven anything about which person it belongs to. You might have an idea of where it belongs because that's why you collected it in the first place, but that idea isn't proof. It may make up part of your proof later, but you aren't there yet. So you have to create a citation -- because you have to record the source -- for all the individual assertions now, as you are extracting them -- otherwise, how do you keep track of their source while you are doing the analysis? (In the old days, we did this on paper, and we had boxes of index slips -- and we indexed everything. Every bit, each on its own slip, with each slip having a reference to where it came from, in case we discovered the source was crap and we had to throw it out. Each slip would contain an assertion in context, which I think is similar to what others call "persona" records. Slips that we thought belonged together got filed together, but nothing was attached in the same way evidence is attached to a person in a lineage-linked database. It was all much more fluid.)
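The slip-per-assertion workflow described above can be sketched in a few lines of Python. This is only an illustration, not how any particular genealogy program works: the class names `Source` and `Slip`, the helper functions, and all of the sample data are made up for the example. Each slip carries a reference back to its source, so everything from a discredited source can be pulled in one step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """One collected source, as logged in the research journal (hypothetical model)."""
    source_id: str
    description: str
    collected_on: str  # ISO date the source was collected


@dataclass(frozen=True)
class Slip:
    """One index slip: a single assertion plus a reference to its source."""
    source_id: str
    subject: str  # who the assertion seems to be about (not yet proven)
    claim: str    # what the source asserts


def extract_slips(source, claims):
    """Create one slip per (subject, claim) pair, each citing the source."""
    return [Slip(source.source_id, subject, claim) for subject, claim in claims]


def discard_source(slips, source_id):
    """Drop every slip that came from a source found to be unreliable."""
    return [slip for slip in slips if slip.source_id != source_id]


# Illustrative data only; none of it comes from a real record.
interview = Source("S1", "Interview with Aunt Jane", "2013-12-01")
letter = Source("S2", "Family letter", "2013-12-05")

slips = extract_slips(interview, [
    ("Grandfather", "born in Ohio"),
    ("Grandfather", "worked as a farmer"),
])
slips += extract_slips(letter, [("Grandfather", "moved to Iowa")])

# If the interview later turns out to be unreliable, pull all of its slips.
remaining = discard_source(slips, "S1")
```

Because the citation is created at extraction time, no slip can exist without a source, which is the point of citing "everywhere".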

This is why of all the lineage-linked software that I looked at, I chose Family Historian. It has auto-source citation. It was the only program that I found that would let me extract assertions from a source and post the data to the person I thought it belonged with, that would add the citations automatically as I went along. This is still not ideal, because it still encourages me to make up my mind about what person these assertions belong to in advance of actually doing the work to prove my conclusion, but it is better than working from the person-centric mindspace, where you add the data to the person first, and then try to figure out "where to put the citations" -- which results in what one user here called a "pedigree with source citations hanging off it". (See Mat's question What tools exist for collecting and managing evidence?)

I can't give you a good answer which is specific to Gramps, because when I looked at it, I was trying to figure out how I can make it work as a source-centric model of data handling, and I couldn't do it. So I haven't looked at it for very long. We have a lot of wonderful tools at our disposal, but IMHO they aren't always designed to do the things we need them to do, and if you are a newcomer and don't realize what the hidden assumptions are, it is easy to shoot yourself in the foot with them.

About your census example, Cole Valley Girl says:

If I have other sources for the same assertion (e.g. birth date), I might rank it as lower in precedence but I'd still record it as supporting the assertion. Same if it contradicts other evidence -- the discrepancy tells me something (even if I may not know what!).

I think it's important to record the data somewhere, but the question is, where? In a formal citation? In a local note? In your research journal?

One criterion for deciding which way to go might be the nature of the evidence itself. That is, for a death certificate, record the evidence and produce a citation for the stuff that the document is about -- the death. But for the evidence which is not contemporary to that event, like the birth, one might put those assertions in source notes instead, along with the analysis of whether or not the birth date and place are consistent with your other evidence. But whether you create a formal source citation or not, I think it's important to keep some kind of record of what was asserted where.

In the system I was trained in, we used facing pages in the research journal. Evidence was recorded on the right-hand page, and our commentary and analysis went on the left-hand page, so it was always evident what came from the source and what was our own analysis. All of the researcher's cross-references, observations, notes for follow-up, etc. went on the left; the informant's data went on the right. All pages in the journal were numbered (so you could tell if pages went missing).

When we analyzed our source material, we marked up our research slips to assign the data to the appropriate categories, and filed the slips accordingly. (The program I've seen so far in the genealogy / family history / microhistory realm that seems the friendliest to this process is The STEMMA Project by @ACProctor.) Every slip was coded so that if you discovered data that was in error, you could pull it back out. (In a group process where data was shared, both the informant's name and the researcher's name would be recorded, so it would always be clear whose work was whose.)

In our paper-based system, the permanent record of our research is the research journal, in which each source collected is dated. The boxes of paper slips extracted from the information in that journal make up our working index. The lineage-linked software, with its source citations, is the machine equivalent of the index. When I started out, I was baffled to find many programs for the lineage-linked part, but no good software to replicate the crucial core of the process, the research journal, which provides the equivalent of an audit trail in accounting.

I think it really helps to write out your thoughts and mock up the whole process on paper. If you can't describe the process on paper and have it make sense, it won't be any better once you transfer the whole process to the computer. Writing it by hand is a different tactile experience than typing; it engages your brain in a different way and can give a fresh perspective.

Jan Murphy

Posted 2013-12-27T15:49:48.870

Reputation: 22 994

This is a great answer, and I think reminds us that genealogy isn't about "Gedcom" but really is just historical research, and can be approached as other research is. – Sam Wilson – 2017-11-28T09:01:38.033

It’s like on Ancestry, where I have ended up with multiple sources / media for the same fact. Say for a birth, I have the FreeBMD version, the Civil Register version, and the local Wiltshire archive version: three different sources all pointing to the same reference page media. Do I keep all three citations etc. in Ancestry? – Andrew Truckle – 2020-12-29T20:39:35.023

@AndrewTruckle The principle is to cite what you use. One could choose to put information about all three in a research note instead, and only create a proper source when you have the certificate. – Jan Murphy – 2020-12-29T21:09:38.897

@JanMurphy So you would then delete all those sources etc. and just have your own and then "ignore" the hints that will show their face again. – Andrew Truckle – 2020-12-29T21:13:13.737

@AndrewTruckle This might be better as a discussion in chat, but I keep everything. But some might put the information from the three different indexes into the Notes. I park some hints into Undecided so as not to spoil the hint engine for other people who might be depending on it. I only Ignore when I think it's not the right person. – Jan Murphy – 2020-12-29T21:18:37.460

@AndrewTruckle Example: for records showing parents plus child, I only accept the hint for the child, and mark the parents as 'undecided'. I can still view the hints for all their children by looking at undecided hints but it doesn't clutter the parents' profiles. – Jan Murphy – 2020-12-29T21:20:13.520

Yes, like I don't accept the hints for baptisms on the parents' profiles. Just the child. – Andrew Truckle – 2020-12-29T21:27:03.147

Apologies if this is too discussion-like and not answer-y enough. – Jan Murphy – 2013-12-27T23:46:01.057

+1 Good answer. My only slight disagreement is that your assertions are part of your conclusions, and should be included with them, not with the source. – lkessler – 2013-12-27T23:59:50.027

In my answer "assertions" was supposed to refer to anything asserted in the document, i.e. what others call 'facts'. My observation that the calculated year of birth in a census or death certificate is consistent with other evidence belongs in a research note -- what I meant to say was, an assertion might be recorded, along with my observation about the assertion, in a note. In a person-based system this research note might be attached to the person; in a source-based system it might be attached to the source. – Jan Murphy – 2013-12-28T00:02:38.337

Thanks for the detailed answer. I chose GRAMPS because it runs on linux, and because the setup of the software (basically a frontend relational database) is a familiar setup to me. It's interesting to think of the source-vs-person centric idea, though. It may be somewhat possible in GRAMPS; I think there's a census add-on that does something like that. – Benjamin Pritchard – 2013-12-29T03:56:18.110


I store each source once only but link it to every assertion (supposed 'fact') for every individual that I'm using it to support. So, for a census, it would be cited for: names, ages, calculated birth years, residence at a particular date, occupation at a date, relationship to other individuals, etc. (I don't use family events.)

If I have other sources for the same assertion (e.g. birth date), I might rank it as lower in precedence but I'd still record it as supporting the assertion. Same if it contradicts other evidence -- the discrepancy tells me something (even if I may not know what!).

A class of exception to the above would be if I'd cited a finding aid as an interim source -- so for example, the England and Wales GRO indices might give me a quarter, year and registration district for a birth. If I subsequently obtain the more detailed certificate, I won't also cite the GRO index (unless as a note to the certificate if it would be useful to other researchers to help them locate the same certificate -- for example if it had been mis-indexed, or the index put the event in the subsequent quarter because of the delay between birth and registration).

So I end up with a source for almost every document etc. that I refer to and a link from that source to as many assertions as it supports. Most genealogy programs should cope with a high volume of sources and references, and this way allows you to understand your own reasoning behind (for example) your choice of the birth date you think most likely for an individual.
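As a rough sketch of the "one source, many links" arrangement described above, the following Python keeps each source in a single table and records a link for every assertion it supports, along with a precedence rank. The identifiers, the function names `cite`, `preferred` and `discrepancies`, and the sample values are all assumptions made up for the illustration; real genealogy programs model this in their own ways.

```python
from collections import defaultdict

# Each source is stored exactly once, keyed by a made-up identifier.
sources = {
    "census": "Census household entry",
    "index": "Birth index entry",
}

# (person, attribute) -> list of (source_id, asserted value, precedence).
# Precedence 1 means the preferred source for that assertion.
links = defaultdict(list)


def cite(person, attribute, source_id, value, precedence):
    """Link an already-stored source to one assertion about one person."""
    if source_id not in sources:
        raise KeyError(f"unknown source: {source_id}")
    links[(person, attribute)].append((source_id, value, precedence))


def preferred(person, attribute):
    """Return the value backed by the highest-precedence (lowest number) source."""
    return min(links[(person, attribute)], key=lambda link: link[2])[1]


def discrepancies(person, attribute):
    """Return the distinct asserted values if the sources disagree, else []."""
    values = {value for _, value, _ in links[(person, attribute)]}
    return sorted(values) if len(values) > 1 else []


# Illustrative data only: one census supports several assertions,
# and a second source gives a conflicting birth year.
cite("John", "birth year", "census", 1851, precedence=2)
cite("John", "occupation", "census", "farm labourer", precedence=1)
cite("John", "birth year", "index", 1850, precedence=1)
```

Keeping the lower-ranked census link, rather than deleting it, is what preserves the discrepancy for later analysis.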


Posted 2013-12-27T15:49:48.870


I would argue that microanalysis of source references at the event level as you propose is not as effective as macroanalysis at the person level. – lkessler – 2013-12-27T19:48:01.113

@lkessler I don't believe the two are mutually exclusive, but I do believe macroanalysis alone is ineffective and error prone. Also, I deliberately referred to 'assertion' not 'event/fact' because there is a difference between those two concepts. Macroanalysis alone doesn't allow you to differentiate levels of reliability for different pieces of information derived from the same source. – None – 2013-12-27T19:57:22.487

In all seriousness, I would really love to have a full discussion with you (elsewhere, not on G&FH SE) on how you manage to include all your source references at the assertion level in your genealogy software. – lkessler – 2013-12-27T23:33:51.440

One of the most frustrating things about Stack Exchange is that it has allowed me to make contact with many people I would love to have a discussion with, with whom I'm not allowed to have a discussion. – Jan Murphy – 2013-12-27T23:44:37.990

@Jan, discussion is allowed in chat -- we could set up a specific room (it wouldn't be private). – None – 2013-12-28T12:01:24.333


This is an interesting-enough issue that I've added a blog post with my views. Feel free to discuss there:

– lkessler – 2013-12-28T17:53:32.603


It is all about proof or lack of it.

Personally I record them all.

If you record every piece of information it will help others later to confirm your research and it will also enable you to identify any possible errors.

In my experience our ancestors were often 'mistaken' as to their age particularly on census records. I have some whose age varies by 20 years across 6 decades of census entries.
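To see how much such ages disagree, one can turn each census age into the range of birth years it implies and compare the ranges. The sketch below is only an illustration: the function name `birth_year_range` and the sample entries are invented, and it assumes the stated age is the age in completed years on census day.

```python
def birth_year_range(census_year, stated_age):
    """Return the (earliest, latest) birth year implied by an age on census day.

    Someone aged N on census day was born either N or N + 1 calendar years
    before the census year, depending on whether their birthday had passed.
    """
    return census_year - stated_age - 1, census_year - stated_age


# (census year, stated age) pairs; illustrative values, not from a real record.
entries = [(1851, 20), (1861, 28), (1871, 45)]

ranges = [birth_year_range(year, age) for year, age in entries]
earliest = min(low for low, _ in ranges)
latest = max(high for _, high in ranges)
spread = latest - earliest  # how far apart the implied birth years are
```

Citing each census against the estimated birth event is what makes this kind of comparison possible later.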

Recording all this information can help to build up a picture of your ancestors. In one census they may be a farm labourer and in the next have a totally different occupation. This can lead to interesting research to find out why they changed occupation and allows you to turn bland facts into the story of your ancestors' lives.


Posted 2013-12-27T15:49:48.870

Reputation: 3 480


For standard genealogy programs such as Gramps, my recommendation is to cite (i.e. link to) sources that contain multiple events and/or multiple relationships once for each person that is mentioned. Store it as a general source for the person. A census record is an example of a source that could be stored this way.

When you have a source pertaining to a specific event, such as a specific record for a birth, marriage, death, residence, occupation, etc., then store it as a source for that event for the person or family. If the record also mentions other people, also store it as a general source for those people.

The census is somewhat of a special case because some programs include a census event. If yours does, then you can store your census source reference with the census event for each person mentioned rather than as a general source for the person.

This way, no source reference appears more than once for any person or family, which minimizes repetition while maximizing usefulness. Repeating the same source reference everywhere makes it difficult to determine what is really pertinent.

If you want to record your sources to a more detailed level, e.g. for each assertion, you should probably get a program designed for evidence analysis, such as Evidentia, GenQuiry, Clooz or Lineascope.

The responses to this question have prompted me to add a more detailed blog post with my views.


Posted 2013-12-27T15:49:48.870

Reputation: 16 148


The beauty of object (in this case "citation") sharing is that it's so easy to do all of the above.

Once you have created the citation (and source, if necessary), drag it to the clipboard and from there drop it everywhere relevant.

For example, the same obituary citation can be attached to dozens of events, people and families.


Posted 2013-12-27T15:49:48.870

Reputation: 458