Update schemas to latest format #803
Comments
Hi @valeriocos I started with askbot. In the process, I faced a few issues. I think there is a mistake in the askbot configurations. I think there is a typo with EDIT 1: https://ask.sagemath.org/questions/ doesn't seem to be the right endpoint, but https://ask.sagemath.org/ works fine. I checked manually because I was receiving a 404 error. EDIT 2: Here is a list of askbot sites. You can choose which would be fine for the example.
Hi @valeriocos
I changed it and was able to run the script, but it is taking an unusually long time. I will try to see what the issue could be and update you about it.
Sorry for the late reply @vchrombie , I thought I had answered this message
Please fix the mistake. WRT the askbot server, there is no specific site to target. You can try with https://askbot.org (in the past we were mining it, I have just tried with perceval* and it seems to work fine) [*] perceval askbot https://askbot.org --no-archive
Yes, sorry the URL should be the main one ( |
No problem. 🙂
Sure, I will do it by night.
Oh okay, I will try and get back to you.
Thanks for the reply @valeriocos.
Hi @valeriocos Just a quick update. I have executed micro-mordred for the askbot backend. It takes a long time, but that is fine. After some time, the index was created and I could inspect it using Kibiter. I tried this EDIT: I have opened the PR for the same. It seems that the fields are updated. I have pushed a commit regarding it, 2904067. I will complete the PR soon. 🙂
Hi @valeriocos When I was working on the askbot schema, I faced a small issue during the enrichment phase. Here is the log, askbot-log.
There was no trouble with the enrichment. I didn't understand what the problem could be, so I thought of asking here.
Hi @vchrombie, This kind of issue is generally related to a user who removed their account. In this case, the enricher is assuming that the username is always there. A possible solution is to use the get method as follows: Waiting for a patch to fix this bug :)
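A minimal sketch of the get-based approach, not the actual patch: the item layout and the key names (author, username) are assumptions made for illustration and may not match what askbot.py really handles.

```python
def get_author_username(item, default=None):
    """Return the author's username, or `default` when it is missing.

    `item` is a hypothetical raw item; the 'author' and 'username'
    keys are assumed names, not taken from askbot.py.
    """
    # Fall back to an empty dict when the author is absent or None
    author = item.get('author') or {}
    # dict.get returns `default` instead of raising KeyError
    return author.get('username', default)


# A user who removed the account may leave no username behind
deleted = {'author': {'id': 42}}
active = {'author': {'id': 7, 'username': 'alice'}}

print(get_author_username(deleted))
print(get_author_username(active))
```

With plain indexing (item['author']['username']) the first call would raise a KeyError; with get, the enriched field is simply left empty.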
Hi @valeriocos.
Thanks for the clarification.
By other parts, do you mean in elk.py or just askbot.py?
Can I work on this, if you don't have any problem?
You're welcome!
Just askbot.py
Sure, please start when you have time. Thanks!
The docker image is quite outdated and hasn't been updated in a long time. It might not include the latest changes to the enrichment. It would be great if you could try the docker-compose method. It is almost the same as the docker method, except that it uses the latest releases. It would be even better if you used the developer setup for GrimoireLab.
One reason could be the time. It looks like there are many sources, so it might take 10-15 minutes for the data to appear on the dashboards. Otherwise, it could be an issue with the outdated image or a typo in the configuration.
The fields might be deprecated now, so the schema should be updated as well.
For people who are interested in working on this issue: you can execute micro-mordred to collect and enrich the data of a particular data source. You can inspect the enriched documents using the dev tools or the discover of Kibiter. For each attribute found in the enriched index, the corresponding schema should contain the name of the attribute, the type, whether the field can be aggregated, and a description. You can use this script for automating the process and creating the schema file from the index.
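The idea of deriving a schema file from an index can be sketched roughly as follows. This is not the script mentioned above, only an illustration under stated assumptions: the input is the "properties" part of an index mapping already loaded as a dict, the CSV columns are named after the four attributes listed above, and text fields are assumed to be the only non-aggregatable ones. The function name is hypothetical.

```python
import csv
import io

# Assumption: only analyzed text fields cannot be aggregated
NON_AGGREGATABLE = {'text'}


def mapping_to_schema_csv(properties, descriptions=None):
    """Build schema CSV text from the 'properties' dict of an index mapping.

    `descriptions` optionally maps field names to description strings;
    fields without an entry get an empty description to fill in by hand.
    """
    descriptions = descriptions or {}
    out = io.StringIO()
    writer = csv.writer(out, lineterminator='\n')
    writer.writerow(['name', 'type', 'aggregatable', 'description'])
    for name in sorted(properties):
        # Fields without an explicit type are treated as objects (assumption)
        field_type = properties[name].get('type', 'object')
        aggregatable = str(field_type not in NON_AGGREGATABLE).lower()
        writer.writerow([name, field_type, aggregatable,
                         descriptions.get(name, '')])
    return out.getvalue()


# Hypothetical mapping fragment, for illustration only
properties = {
    'origin': {'type': 'keyword'},
    'title_analyzed': {'type': 'text'},
    'grimoire_creation_date': {'type': 'date'},
}
schema_csv = mapping_to_schema_csv(properties)
print(schema_csv)
```

The aggregatable column produced this way is only a first guess; it should be checked against what Kibiter reports for the index before the schema is committed.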
Closing this issue in favour of #1010
ELK keeps a description of the enriched data used to build the Kibiter dashboards. These descriptions are stored in the schema folder as CSV files. Over time, they have evolved, and the current format is a list of attributes, each with its name, its type, whether the field can be aggregated, and a description (e.g., https://github.com/chaoss/grimoirelab-elk/blob/master/schema/git.csv). Nevertheless, some schemas are still not aligned with the latest format. For instance, this is the case for:
The goal of this issue is to update the schemas to the latest format. In order to do so, given a data source (e.g., meetup, stackoverflow), micro-mordred[*] should be executed to collect and enrich the data. Then, the enriched documents should be inspected using the dev tools or the discover of Kibiter. For each attribute found in the enriched index, the corresponding schema should contain the name of the attribute, the type, whether the field can be aggregated and a description.
Note that some fields, like grimoire_creation_date, project, project_1, origin, etc., are shared across all enriched indexes and their descriptions can be taken from existing schemas.
[*] Details to execute micro-mordred for a given data source are available at: https://github.com/chaoss/grimoirelab-sirmordred#supported-data-sources
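Reusing the descriptions of shared fields could look roughly like the sketch below. It assumes that an existing schema is a CSV with a header row containing name and description columns; the function names, the some_new_field entry, and the description text are placeholders, not taken from the repository.

```python
import csv
import io


def load_descriptions(schema_csv_text):
    """Map field name -> description from schema CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(schema_csv_text))
    return {row['name']: row['description'] for row in reader}


def fill_shared_descriptions(rows, known):
    """Return copies of `rows` with empty descriptions filled from `known`."""
    filled = []
    for row in rows:
        row = dict(row)  # do not modify the caller's rows
        if not row.get('description'):
            row['description'] = known.get(row['name'], '')
        filled.append(row)
    return filled


# Placeholder content standing in for an existing schema file
existing = (
    "name,type,aggregatable,description\n"
    "origin,keyword,true,Placeholder description of origin\n"
)
# Rows of the schema being updated; some_new_field is hypothetical
new_rows = [
    {'name': 'origin', 'type': 'keyword',
     'aggregatable': 'true', 'description': ''},
    {'name': 'some_new_field', 'type': 'long',
     'aggregatable': 'true', 'description': ''},
]
filled = fill_shared_descriptions(new_rows, load_descriptions(existing))
```

Fields that are specific to the data source keep an empty description and still need to be documented by hand.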