Closed Captures Models

The closed captures data type consists of 12 models. Each is built from the basic parameters p (the probability of initial capture), c (the probability of recapture given that the animal has been previously captured), and pi (the proportion of the population belonging to a particular mixture). There are 3 basic closed captures models: p and c only (i.e., no mixtures; the population consists of only a single type), p and pi only (i.e., no difference between recapture and initial capture probabilities), and p, c, and pi (i.e., the most complicated type, where mixtures of both p and c are allowed). The heterogeneity mixture models are from Pledger (2000).

For each of these 3 models, 2 versions exist. The first version is the full likelihood version, where the population size (N) is included in the likelihood. More technically, the quantity f0 = N - number captured [typically symbolized by M(t+1)] is what actually appears in the likelihood. Thus, f0 is the number of animals in the population that were never captured, or f0 = N - M(t+1). The second version is the Huggins (1989, 1991) version, where the population size is conditioned out of the likelihood. Instead, the population size is generated as a derived parameter. Thus, a total of 6 different parameterizations of the closed captures data type are available. Note that the likelihoods are not comparable between the Huggins version and the full-likelihood version. Hence, AICc cannot be compared between these models.
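As a rough illustration of how f0 enters the full likelihood, the sketch below evaluates the log-likelihood of the simplest case, Model M0 (p = c, constant over time), up to an additive constant. It is not MARK's code: the function name, argument names, and the example counts (60 animals caught, 90 total captures, 3 occasions) are hypothetical, and the term that depends only on the observed history counts is dropped.

```python
from math import lgamma, log

def m0_full_loglik(f0, p, m_t1, total_captures, occasions):
    """Model M0 log-likelihood, up to an additive constant.

    f0 is the number of animals never captured, so N = f0 + M(t+1).
    The factorials of the observed history counts are omitted because
    they do not depend on f0 or p.
    """
    n_pop = f0 + m_t1
    # log of N! / f0!, the part of the multinomial coefficient involving N
    coef = lgamma(n_pop + 1) - lgamma(f0 + 1)
    # each of the N animals contributes `occasions` trials with success probability p
    misses = occasions * n_pop - total_captures
    return coef + total_captures * log(p) + misses * log(1.0 - p)

# Hypothetical data: M(t+1) = 60 animals, 90 captures in total, 3 occasions
ll_no_missed = m0_full_loglik(f0=0, p=0.3, m_t1=60, total_captures=90, occasions=3)
ll_some_missed = m0_full_loglik(f0=40, p=0.3, m_t1=60, total_captures=90, occasions=3)
```

With these made-up numbers and p = 0.3, the log-likelihood is higher at f0 = 40 than at f0 = 0, i.e., the data are better explained when some animals are assumed to have gone uncaptured.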

The following table shows the models from Program CAPTURE (Otis et al. 1978, White et al. 1982) that can be built with the 3 basic data types. If the full-likelihood version is used, then the population size (N) parameter would also be included in the parameter space.

Parameters   Models
p, c         M0, Mb, Mt, Mtb
pi, p        M0, Mh
pi, p, c     all of the models of Otis et al. (1978), including Mtbh

Note, however, that even though the full heterogeneity data type is completely flexible, the model may be too complex to build easily. Hence, by using the Change Data Type menu choice, you can build the same model and obtain a comparable likelihood without facing a large number of PIMs and a complex design matrix.

To better understand how these models are parameterized, consider a simple 3-occasion experiment. Thus, there are 3 initial capture probabilities (p1, p2, and p3), and 2 recapture probabilities (c2 and c3). For the mixture distributions, only 2 mixtures are shown. The following table displays the cell probability for each of the 2^3 = 8 possible encounter histories.

Encounter
History  p, c                        pi, pa, pb                                 pi, pa, pb, ca, cb
111      p1 c2 c3                    pi pa^3 + (1 - pi) pb^3                    pi pa1 ca2 ca3 + (1 - pi) pb1 cb2 cb3
101      p1 (1 - c2) c3              pi pa^2 (1 - pa) + (1 - pi) pb^2 (1 - pb)  pi pa1 (1 - ca2) ca3 + (1 - pi) pb1 (1 - cb2) cb3
110      p1 c2 (1 - c3)              pi pa^2 (1 - pa) + (1 - pi) pb^2 (1 - pb)  pi pa1 ca2 (1 - ca3) + (1 - pi) pb1 cb2 (1 - cb3)
011      (1 - p1) p2 c3              pi pa^2 (1 - pa) + (1 - pi) pb^2 (1 - pb)  pi (1 - pa1) pa2 ca3 + (1 - pi) (1 - pb1) pb2 cb3
100      p1 (1 - c2) (1 - c3)        pi pa (1 - pa)^2 + (1 - pi) pb (1 - pb)^2  pi pa1 (1 - ca2) (1 - ca3) + (1 - pi) pb1 (1 - cb2) (1 - cb3)
010      (1 - p1) p2 (1 - c3)        pi pa (1 - pa)^2 + (1 - pi) pb (1 - pb)^2  pi (1 - pa1) pa2 (1 - ca3) + (1 - pi) (1 - pb1) pb2 (1 - cb3)
001      (1 - p1) (1 - p2) p3        pi pa (1 - pa)^2 + (1 - pi) pb (1 - pb)^2  pi (1 - pa1) (1 - pa2) pa3 + (1 - pi) (1 - pb1) (1 - pb2) pb3
000      (1 - p1) (1 - p2) (1 - p3)  pi (1 - pa)^3 + (1 - pi) (1 - pb)^3        pi (1 - pa1) (1 - pa2) (1 - pa3) + (1 - pi) (1 - pb1) (1 - pb2) (1 - pb3)
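The cell probabilities in the table above follow one rule: use p until the animal is first caught and c afterwards, then weight the two mixtures by pi and 1 - pi. The sketch below is one way to express that rule. The function names and all parameter values are hypothetical, chosen only for illustration.

```python
from itertools import product

def prob_pc(history, p, c):
    """Cell probability under the p, c model (no mixtures).

    p[j] applies up to and including the first capture; c[j] applies afterwards.
    c[0] is never used because no recapture is possible on occasion 1.
    """
    prob, caught_before = 1.0, False
    for j, seen in enumerate(history):
        rate = c[j] if caught_before else p[j]
        prob *= rate if seen else 1.0 - rate
        caught_before = caught_before or bool(seen)
    return prob

def prob_mixture(history, pi, p_a, c_a, p_b, c_b):
    """Two-mixture cell probability: pi * P(h | a) + (1 - pi) * P(h | b)."""
    return pi * prob_pc(history, p_a, c_a) + (1.0 - pi) * prob_pc(history, p_b, c_b)

# All 2^3 = 8 encounter histories, from (1, 1, 1) down to (0, 0, 0)
histories = list(product((1, 0), repeat=3))

# Hypothetical values for the p, c model
p = [0.30, 0.40, 0.50]
c = [None, 0.20, 0.25]
total_pc = sum(prob_pc(h, p, c) for h in histories)

# The pi, p only model is the mixture model with c set equal to p in each mixture
pa, pb = [0.6] * 3, [0.2] * 3
total_mix = sum(prob_mixture(h, 0.4, pa, pa, pb, pb) for h in histories)
```

Under either parameterization the 8 cell probabilities sum to 1, which is a quick check that a hand-built set of cell probabilities is internally consistent.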

In the Huggins data type, the 000 encounter history is not part of the likelihood, and the parameters are estimated from the multinomial created from the 7 remaining encounter histories.
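One way to picture the Huggins conditioning is to take the cell probabilities of the p, c model, drop the 000 history, and rescale the remaining 7 by the probability of being caught at least once. The sketch below does this with hypothetical parameter values. It illustrates the conditioning step only and is not MARK's implementation.

```python
# Hypothetical parameter values for a 3-occasion study
p1, p2, p3 = 0.30, 0.40, 0.50
c2, c3 = 0.20, 0.25

# Unconditional cell probabilities of the p, c model
full = {
    "111": p1 * c2 * c3,
    "101": p1 * (1 - c2) * c3,
    "110": p1 * c2 * (1 - c3),
    "011": (1 - p1) * p2 * c3,
    "100": p1 * (1 - c2) * (1 - c3),
    "010": (1 - p1) * p2 * (1 - c3),
    "001": (1 - p1) * (1 - p2) * p3,
    "000": (1 - p1) * (1 - p2) * (1 - p3),
}

# Probability of being captured at least once
p_star = 1.0 - full["000"]

# Conditional probabilities of the 7 observable histories
huggins = {h: pr / p_star for h, pr in full.items() if h != "000"}
```

The 7 conditional probabilities sum to 1, so they define a proper multinomial over the observed histories without reference to N.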

The most common mistake made with the closed captures models is failing to apply a constraint to the last time-specific initial capture probability parameter, pt. Intuitively, the population size under any of these models is estimated as the number of animals observed [M(t+1)] divided by the probability of being detected one or more times. This probability is denoted p*, and is estimated as 1 - (1 - p1) (1 - p2) ... (1 - pt). Note that none of the recapture probabilities (c2, c3, ..., ct) appear in this estimate of p*. However, the last p parameter is not estimable without a constraint. In Model M0 of Otis et al. (1978), all the p and c parameters are assumed to be the same. In Model Mb, all the p's are assumed to be the same, and all the c's are assumed to be the same. In Model Mh, the recapture probabilities are assumed to be the same as the initial capture probabilities.

The take-home message here is that the general model p1, p2, ..., pt, c2, c3, ..., ct results in a nonsensical estimator, where pt = 1 and N is estimated as M(t+1). With pt = 1, p* = 1, and Nhat = M(t+1)/1 = M(t+1). Thus, be sure to double-check any model where N is estimated as M(t+1) to see whether pt has been estimated as 1.
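The arithmetic behind this warning can be sketched in a few lines. The function names and the numbers (79 animals observed, the listed p values) are hypothetical. The point is only that p* becomes 1 as soon as any single p equals 1, so Nhat collapses to M(t+1).

```python
from math import prod

def p_star(p):
    """Probability of being detected at least once: 1 - (1 - p1)(1 - p2)...(1 - pt)."""
    return 1.0 - prod(1.0 - pj for pj in p)

def n_hat(m_t1, p):
    """Intuitive population estimate: animals observed divided by p*."""
    return m_t1 / p_star(p)

m_t1 = 79  # hypothetical number of distinct animals captured, M(t+1)

# Sensible estimates of p1, p2, p3: p* = 0.79, so Nhat is about 100
constrained = n_hat(m_t1, [0.30, 0.40, 0.50])

# Unconstrained general model, where the last p is estimated as 1:
# p* = 1 and Nhat equals M(t+1)
degenerate = n_hat(m_t1, [0.30, 0.40, 1.00])
```

An estimate such as `degenerate`, where Nhat equals the number of animals actually handled, is the symptom to look for in model output.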

The final variation on these 6 models adds a parameter, alpha, to the likelihood to account for mis-identification. These models were added to accommodate the mis-identification of animals that occurs in DNA analyses when the amount of DNA available is small or its quality is poor. The likelihoods were developed by Paul Lukacs (2005). As a result, there are a total of 12 closed captures models in MARK. Alpha is defined as the probability of a correct classification, so fixing alpha = 1 makes these 6 additional models equivalent to the first 6 models described above.

Mis-identification biases the estimates of population size high through 2 factors. First, the number of unique genotypes found [M(t+1) above] is biased high because some of the apparently unique genotypes are errors, i.e., the identified genotype was incorrect. Second, this increase in the number of animals supposedly encountered causes the estimated probability of detection to be smaller than it should be. Together, these two factors cause the estimate of N to be too high.

As described above, both the simple and complex heterogeneity models are available for the mis-identification closed captures models. However, incorporating both mis-identification and heterogeneity typically leads to inconclusive results, because mis-identification is somewhat confounded with heterogeneity. Intuitively, mis-identification is detected by too many animals appearing only once in the encounter histories. Thus, a large amount of individual heterogeneity may appear as mis-identification and, vice versa, mis-identification may appear as individual heterogeneity.

None of the mis-identification models include N in the likelihood, so that this parameter appears as f0 in the PIMs. Rather, the estimates of population size (N) are generated as derived parameters. To allow model averaging of population estimates from any of the 12 closed captures models, all produce estimates of N as derived parameters, even when N appears as a real parameter. As noted above, the Huggins models and the models with N (or more correctly f0) in the likelihood do not produce comparable likelihoods, so model averaging across these different data types does not make sense. However, model averaging the derived parameter N across models with and without mis-identification is reasonable.