BSBI TPP dataset - duplicated records

Posted: Sat Mar 31, 2018 10:21 am
by AndyAmphlett
Tom,

VCRs are encouraged not to waste time on looking for duplicate records, while trying not to create duplicates if possible. But there are a few datasets from BSBI (not from individual VCRs) that seem to be exacerbating the problem. A fairly recently added dataset is one that appears to have been created specifically for the Threatened Plant Project (TPP).

https://database.bsbi.org/search.php#re ... 9942ef695f

As far as I can see (I have only looked at a few examples, mostly in vcs 94 & 96), the records in this dataset are all duplicates, but sometimes with date range errors compared with the original record. For example, McCallum Webster records in vc96 that have a correct date range in the source record have, in the TPP version, an end date of 2008, although she died in 1985. Would it be safe to mark the whole dataset as duplicate? Would that have to be done centrally, or is it for VCRs to investigate their own vcs?
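
A rough sketch of how the date range error described above might be flagged in bulk, assuming the records can be exported with a recorder name and start/end years. The field names, the lookup table and the start years are hypothetical and are not the BSBI database schema; only the 2008 end date and the 1985 death year come from the example above.

```python
from dataclasses import dataclass


@dataclass
class Record:
    recorder: str
    start_year: int
    end_year: int


# Hypothetical lookup of recorder death years (assumption: maintained by hand).
DEATH_YEAR = {"McCallum Webster": 1985}


def suspect_end_dates(records, death_year=DEATH_YEAR):
    """Return records whose date range ends after the recorder's death year."""
    return [
        r for r in records
        if r.recorder in death_year and r.end_year > death_year[r.recorder]
    ]


records = [
    Record("McCallum Webster", 1970, 1985),  # illustrative source-style range
    Record("McCallum Webster", 1970, 2008),  # end date as in the TPP example
]
flagged = suspect_end_dates(records)
```

Records for recorders missing from the lookup are simply skipped, so this would only catch cases where a death year is known.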

Thanks,

Andy.

Re: BSBI TPP dataset - duplicated records

Posted: Sat Mar 31, 2018 11:17 am
by admin
Hi Andy,

This dataset is a known problem. At David Pearman's instigation I obtained the underlying data from BRC and have made extensive corrections to dates.

There will still be duplicates after the reload, but please wait until then before taking any action, because the refreshed data will definitely be better than the current situation. I hope to reload the set soon, ideally next week.