Apparently a substantial portion of my recent exchanges with cfitz was caused by fairly simple misunderstandings due to lack of precision. Instead of deciphering who meant what, I'll try to summarize my view of KProbe in a simplified model that runs as follows.
Algorithm KProbe1 (simplified version of KProbe)
Step 1: Read Current Position into LBA_ini.
Step 2: Start accumulating from sector LBA_acc.
Step 3: Read Current Position into LBA_cur.
Step 4: If LBA_cur < LBA_ini + Delta_sam, go back to Step 3.
Step 5: Read C1C2.
Step 6: Set LBA_ini := LBA_cur.
Step 7: Go back to Step 2.
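For anyone who wants to play with the loop, here is a rough Python sketch of KProbe1 as I read the steps above. It is not KProbe's actual code. The two callbacks, the `samples` limit (the real loop runs until the end of the disc) and the toy drive that advances 10 sectors per position poll are all my own assumptions.

```python
from itertools import count


def kprobe1(read_position, read_c1c2, delta_sam=75, samples=3):
    """Run the simplified KProbe1 loop; return a list of (LBA_cur, C1C2) rows."""
    rows = []
    lba_ini = read_position()                    # Step 1
    for _ in range(samples):                     # Step 7 is this outer loop
        # Step 2: accumulation restarts inside the drive; LBA_acc stays hidden.
        while True:
            lba_cur = read_position()            # Step 3
            if lba_cur >= lba_ini + delta_sam:   # Step 4
                break
        rows.append((lba_cur, read_c1c2()))      # Step 5
        lba_ini = lba_cur                        # Step 6
    return rows


# Hypothetical drive: every position poll finds the head 10 sectors further on.
positions = count(start=100, step=10)
rows = kprobe1(lambda: next(positions), lambda: (0, 0))
print([lba for lba, _ in rows])  # [180, 260, 340]
```

With this toy drive every reported LBA_cur is 80 sectors after the previous one, i.e. at least Delta_sam=75 apart.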
My assumptions are as follows. At Step 1, LBA_ini is set to the number of the latest sector read by the drive. Step 2 starts accumulating C1C2 errors from sector LBA_acc onwards, where LBA_acc > LBA_ini needn't be known to KProbe1. At Step 3, LBA_cur >= LBA_ini. At Step 4, Delta_sam >= 75 is fixed. Letting LBA_fin >= LBA_cur stand for the latest sector processed when Step 5 is reached, Step 5 outputs C1C2 errors accumulated for sectors LBA_acc through min{LBA_acc+74,LBA_fin}.
Introduce an outer loop counter k by setting k:=1 at Step 1, k:=k+1 at Step 7. Let LBA_ini(k), LBA_acc(k), LBA_cur(k), LBA_fin(k) denote the current values of LBA_ini, LBA_acc, LBA_cur, LBA_fin at Step 5. Then Steps 6 and 4 yield the basic relation
LBA_cur(k+1) >= LBA_cur(k) + Delta_sam.
Hence the estimated sample size Delta_est(k):=LBA_cur(k)-LBA_cur(k-1) satisfies Delta_est(k) >= Delta_sam for k>1. It should be distinguished from the accumulated sample size Delta_acc:=min{75,LBA_fin-LBA_acc+1}, which (using LBA_cur(k-1)=LBA_ini(k)) may be expressed as
Delta_acc = min{ 75, Delta_est + Delta_r - Delta_a}
in terms of the accumulation delay Delta_a:=LBA_acc-LBA_ini-1 of Step 2 and the read delay Delta_r:=LBA_fin-LBA_cur of Step 5. If both delays are absent (or Delta_r=Delta_a), Delta_acc=min{75,Delta_est} as expected. However, Delta_acc<75 is possible, as shown below.
Example 1: LBA_ini=200, LBA_acc=203, LBA_cur=276, LBA_fin=276 for k=2; then Delta_est=76 and Delta_acc=74.
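The arithmetic of Example 1 can be checked with a few lines of Python. This is only a sketch of the definitions above; the function name `sample_sizes` and the `window` parameter are my own labels.

```python
def sample_sizes(lba_ini, lba_acc, lba_cur, lba_fin, window=75):
    """Return (Delta_est, Delta_acc) for one pass of the outer loop."""
    delta_est = lba_cur - lba_ini        # estimated sample size
    delta_a = lba_acc - lba_ini - 1      # accumulation delay
    delta_r = lba_fin - lba_cur          # read delay
    delta_acc = min(window, lba_fin - lba_acc + 1)
    # Both expressions for Delta_acc must agree.
    assert delta_acc == min(window, delta_est + delta_r - delta_a)
    return delta_est, delta_acc


print(sample_sizes(200, 203, 276, 276))  # (76, 74)
```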
Assume Step 5 outputs LBA_cur followed by the C1C2 counts into a csv file. Then, by inspecting the csv file, we can recover each Delta_est(k), but not Delta_acc(k). Thus we can't be sure whether the C1C2 counts were accumulated over 75 or fewer than 75 sectors!
Next, the basic property Delta_est(k) >= Delta_sam combined with Delta_sam >= 75 implies that the LBA values in the csv file should be separated by at least 75.
In contrast, both cfitz and MediumRare have mentioned KProbe's csv files with some samples separated by less than 75 sectors, and apparently such samples may be produced even when no reading errors occur. Does anybody have any idea why KProbe can produce such output?
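One way to look for such samples is to scan the LBA column for gaps below 75. The sketch below assumes the LBA is the first field of each row and that there is no header line; I don't know the exact layout of KProbe's csv files, so adjust accordingly. The data in `sample` is made up for illustration.

```python
import csv
import io


def short_gaps(csv_text, delta_sam=75):
    """Return (previous LBA, LBA, gap) for consecutive rows closer than delta_sam."""
    rows = csv.reader(io.StringIO(csv_text))
    lbas = [int(row[0]) for row in rows if row]
    return [(a, b, b - a) for a, b in zip(lbas, lbas[1:]) if b - a < delta_sam]


# Made-up rows: LBA, C1 count, C2 count.
sample = "100,3,0\n176,5,0\n240,2,0\n315,4,0\n"
print(short_gaps(sample))  # [(176, 240, 64)]
```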
Maybe KProbe employs a smaller Delta_sam? Here consider
Example 2 (Karr's "High Performance System"): LBA_ini(1)=100, LBA_cur(1)=174, LBA_ini(2)=174, LBA_cur(2)=248; such values could be produced by KProbe1 for Delta_sam=74 (not 75!).
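Under the KProbe1 model every gap between consecutive LBA values is at least Delta_sam, so the smallest observed gap is an upper bound on the Delta_sam that could have produced a given file. A tiny sketch (the function name is mine, and the bound only holds if the output really follows KProbe1):

```python
def max_consistent_delta_sam(lbas):
    """Largest Delta_sam for which KProbe1 could have reported these LBA values."""
    return min(b - a for a, b in zip(lbas, lbas[1:]))


print(max_consistent_delta_sam([100, 174, 248]))  # 74
```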
BTW, assuming momentarily that the outputs of KProbe and KProbe1 are similar, "under-sampling" may refer to two different cases:
(a) the estimated sample size Delta_est(k) is less than 75;
(b) the accumulated sample size Delta_acc(k) is less than 75.
My earlier concern was with case (b), which apparently can't be detected in csv files. This post originated from my inability to understand case (a) reported by cfitz and MediumRare.
Consider now another variant called KProbe2, assuming that the "Read C1C2" command of Step 5 additionally delivers the number LBA_fin of the latest processed sector.
KProbe2 is derived from KProbe1 by outputting LBA_fin (instead of LBA_cur) at Step 5, and setting LBA_ini := LBA_fin (instead of LBA_ini := LBA_cur) at Step 6. This boils down to resetting LBA_cur to LBA_fin at Step 5. Hence all the preceding relations hold for KProbe2 as well (with a null read delay Delta_r).
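Here is a comparable Python sketch of KProbe2. Again it is only my model, not KProbe's code: the `ToyDrive` class, its fixed 10-sector advance per command, and the `samples` limit are assumptions made up for illustration, and "Read C1C2" is assumed to return LBA_fin together with the counts.

```python
class ToyDrive:
    """Hypothetical drive whose head advances `step` sectors per command."""

    def __init__(self, start=100, step=10):
        self.lba = start
        self.step = step

    def read_position(self):
        self.lba += self.step
        return self.lba

    def read_c1c2(self):
        self.lba += self.step        # the command itself takes time
        return (0, 0), self.lba      # C1C2 counts and LBA_fin


def kprobe2(drive, delta_sam=75, samples=3):
    """KProbe1 with LBA_fin used in place of LBA_cur at Steps 5 and 6."""
    rows = []
    lba_ini = drive.read_position()                      # Step 1
    for _ in range(samples):
        while drive.read_position() < lba_ini + delta_sam:
            pass                                         # Steps 3 and 4
        counts, lba_fin = drive.read_c1c2()              # Step 5
        rows.append((lba_fin, counts))
        lba_ini = lba_fin                                # Step 6
    return rows


rows = kprobe2(ToyDrive())
print([lba for lba, _ in rows])  # [200, 290, 380]
```

The reported values are now LBA_fin rather than LBA_cur, and they are still at least Delta_sam apart.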
Of course, neither KProbe1 nor KProbe2 can model other important features of KProbe (e.g., handling of the beginning/end of the disc, or reading/slipping errors). Still, I hope this post will make it easier to formulate more specific questions on KProbe, which hopefully Karr will be able to answer.