Skip to content
This repository has been archived by the owner on Jun 1, 2022. It is now read-only.

~5,000 locations have multiple cvs: concordances #708

Open
simonw opened this issue Jun 30, 2021 · 2 comments
Open

~5,000 locations have multiple cvs: concordances #708

simonw opened this issue Jun 30, 2021 · 2 comments
Labels
bad-data Data that looks bad and needs investigating, or tasks to clean up bad data

Comments

@simonw
Copy link
Collaborator

simonw commented Jun 30, 2021

This SQL query shows locations with multiple cvs: concordances:

select location.public_id, count(*), array_agg(authority || ':' || identifier)
from concordance_location join concordance_identifier on concordance_identifier.id = concordanceidentifier_id
join location on location_id = location.id
where authority = 'cvs' group by location.public_id
having count(*) > 1
order by count(*) desc

https://vial.calltheshots.us/dashboard/?sql=select+location.public_id%2C+count%28%2A%29%2C+array_agg%28authority+%7C%7C+%27%3A%27+%7C%7C+identifier%29%0D%0Afrom+concordance_location+join+concordance_identifier+on+concordance_identifier.id+%3D+concordanceidentifier_id%0D%0Ajoin+location+on+location_id+%3D+location.id%0D%0Awhere+authority+%3D+%27cvs%27+group+by+location.public_id%0D%0Ahaving+count%28%2A%29+%3E+1%0D%0Aorder+by+count%28%2A%29+desc%3AYNvYK31mkC1Kq8Tzr1xIQsunYgPUAqZJ2Axb-4VmNYg

There are a LOT - grouping by their duplicate count gives:

count	n
2	4193
3	358
4	259
5	189
6	64
7	9
8	2

Originally posted by @simonw in #707 (comment)

@simonw simonw added the bad-data Data that looks bad and needs investigating, or tasks to clean up bad data label Jun 30, 2021
@simonw
Copy link
Collaborator Author

simonw commented Jun 30, 2021

The top ten:

?column? count array_agg
https://vial.calltheshots.us/location/ltyht 8 "[""cvs:10351"", ""cvs:5804"", ""cvs:10984"", ""cvs:6818"", ""cvs:16011"", ""cvs:7279"", ""cvs:10909"", ""cvs:6971""]"
https://vial.calltheshots.us/location/lwccx 8 "[""cvs:11209"", ""cvs:7551"", ""cvs:7462"", ""cvs:4777"", ""cvs:7805"", ""cvs:10585"", ""cvs:10910"", ""cvs:10311""]"
https://vial.calltheshots.us/location/lwfmf 7 "[""cvs:8442"", ""cvs:17561"", ""cvs:7432"", ""cvs:07432"", ""cvs:10579"", ""cvs:7449"", ""cvs:7296""]"
https://vial.calltheshots.us/location/lfrfm 7 "[""cvs:6955"", ""cvs:7295"", ""cvs:10581"", ""cvs:17097"", ""cvs:7816"", ""cvs:07295"", ""cvs:7269""]"
https://vial.calltheshots.us/location/lfqgw 7 "[""cvs:5906"", ""cvs:10875"", ""cvs:2995"", ""cvs:6984"", ""cvs:7401"", ""cvs:05906"", ""cvs:7715""]"
https://vial.calltheshots.us/location/lfrrw 7 "[""cvs:17023"", ""cvs:16933"", ""cvs:7816"", ""cvs:7462"", ""cvs:7295"", ""cvs:10027"", ""cvs:07816""]"
https://vial.calltheshots.us/location/lfkxd 7 "[""cvs:2368"", ""cvs:2182"", ""cvs:17801"", ""cvs:10909"", ""cvs:02182"", ""cvs:2996"", ""cvs:7274""]"
https://vial.calltheshots.us/location/lfhtk 7 "[""cvs:7038"", ""cvs:00835"", ""cvs:7196"", ""cvs:16970"", ""cvs:8351"", ""cvs:6847"", ""cvs:10436""]"
https://vial.calltheshots.us/location/lfrct 7 "[""cvs:16958"", ""cvs:6319"", ""cvs:6903"", ""cvs:10821"", ""cvs:10665"", ""cvs:7156"", ""cvs:07156""]"
https://vial.calltheshots.us/location/lwkmq 7 "[""cvs:7647"", ""cvs:7507"", ""cvs:16693"", ""cvs:10056"", ""cvs:10540"", ""cvs:17098"", ""cvs:5976""]"

@simonw
Copy link
Collaborator Author

simonw commented Jun 30, 2021

Worst example: https://vial.calltheshots.us/location/ltyht - "CVS Pharmacy in DALLAS, TX - 6474 E. NORTHWEST HWY DALLAS, TX 75214" - has the following concordances:

    "concordances": [
        "vaccinespotter_org:270662726",
        "us_giscorps_vaccine_providers:f1a3d68b-0d54-4dd3-b154-fe3b4cf34641",
        "tx_arcgis:3078b524189848569f62985d71f4584b_0_983",
        "cvs:10909",
        "us_carbon_health:9fd288ea-623f-4f75-99c2-08c22c2d8b50",
        "cvs:10351",
        "cvs:10984",
        "cvs:16011",
        "cvs:5804",
        "cvs:6818",
        "cvs:6971",
        "cvs:7279",
        "placekey:22p-222@5qw-svs-2hq",
        "placekey:zzw-225@5qw-svs-2p9",
        "_tag_provider:cvs",
        "tx_arcgis:3078b524189848569f62985d71f4584b_0_1167",
        "vaccinefinder_org:0fbe377f-52a6-44af-a016-8e9e7925e9e1",
        "vaccinefinder_org:6895dd71-8a19-4dbd-aed8-dd13fc9efc68",
        "vaccinespotter_org:270575579",
        "_issue:702"
    ],

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bad-data Data that looks bad and needs investigating, or tasks to clean up bad data
Projects
None yet
Development

No branches or pull requests

1 participant