Hello,
Thanks for making a great tool.
When training your own model, is there a limit to the amount of missing data allowed in the reference?
I've found that including all the missing data gives wildly inaccurate results (e.g. assigning 100% ancestry to a completely incorrect ancestry group; I'm calling it incorrect based on the ancestry group from the ADMIXTURE analysis), but subsetting the reference so there is zero missing data also doesn't give great results (I can tell from dxy trees that the undesired ancestries are not being completely masked).
Just wondering if you have any insight into this, or into how missing data in the reference should impact results.
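To make the two extremes concrete, here is a minimal sketch of the kind of per-site missingness filter I mean. It is not the tool's own code: the matrix layout (rows are reference samples, columns are sites), the use of `None` for a missing call, and the function names are all assumptions for illustration.

```python
from typing import List, Optional

# Hypothetical encoding: 0/1/2 = alternate allele count, None = missing call.
Genotype = Optional[int]


def site_missing_rates(genotypes: List[List[Genotype]]) -> List[float]:
    """Return the fraction of samples with a missing call at each site."""
    n_samples = len(genotypes)
    n_sites = len(genotypes[0])
    return [
        sum(row[j] is None for row in genotypes) / n_samples
        for j in range(n_sites)
    ]


def keep_sites(genotypes: List[List[Genotype]], max_missing: float) -> List[int]:
    """Return indices of sites whose missing rate is at most max_missing."""
    rates = site_missing_rates(genotypes)
    return [j for j, rate in enumerate(rates) if rate <= max_missing]


# Toy reference panel: 4 samples x 4 sites (made-up values).
reference = [
    [0, 1, None, 2],
    [0, None, None, 1],
    [1, 1, None, 0],
    [0, 1, 2, None],
]

print(site_missing_rates(reference))  # [0.0, 0.25, 0.75, 0.25]
print(keep_sites(reference, 0.0))     # zero-missing subset: [0]
print(keep_sites(reference, 0.25))    # intermediate threshold: [0, 1, 3]
```

In this toy example the zero-missing subset keeps only one site, while an intermediate threshold keeps three. What I can't tell is where a sensible threshold would sit for training.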
Thanks!