GSoC Idea: Build an expert system to provide recommendations to users in a user interface #414
Comments
Hello! I'm Steven Kolawole, a Computer Science Junior from FUNAAB, Nigeria. GrimoireLab has piqued my interest ever since I joined the CHAOSS community back in February. Having followed the GrimoireLab tutorial, I'm fairly familiar with how it works. I have a strong background in Machine Learning and I've done two prior internships in Data Science and Machine Learning Engineering roles. It's really exciting to be here. I plan to learn a lot and, hopefully, have lots of fun while at it. Thank you for having me. 😄 |
Hi There! My name is Venkat and I'm a CSE Junior studying in Bangalore, India. I've been interested in GrimoireLab ever since I joined the community back in November 2020. I've got a good amount of experience with building and integrating/hosting ML models (especially NLP and recommender systems) having done 2 internships, one as a Data Science Intern and one as a Backend Developer (Funnily having to implement a recommender system for a recipe app 😆 ). I have also worked with JavaScript-based frameworks like React JS. I was just wondering whether this is on a first-come-first-serve basis? I'm super glad to see this project and hoping to contribute as much as I can in the near future! 😃 |
Hello! I am Darsh Mishra, a sophomore studying at BITS Pilani. I am an experienced Django developer, having written code for websites that serve payments, carry out registrations, and manage the functioning of a large-scale college fest. I also have experience developing ML models, especially recommendation systems, which I learned during an internship and which have been used effectively in an online trade fair. I have also worked with JS frameworks like ReactJS, so catching up with Vue JS would not be a problem for me. I would love to contribute to this project |
Hi! I'm comfortable with using Linux/bash terminal and commands, as well as github, and I could describe myself as a fast learner, ambitious and self-motivated student. I would like to participate in this project in order to learn more about how to build a recommender system, especially due to my interest in NLP, but also because I find this a challenging project that would bring me a great learning experience. As a newbie, I am very excited to get to know more about how to contribute to open source and become part of a community whom I could help and who could help me grow. Find me on linkedin :) |
Hi everyone, thanks for your interest in applying for this idea. You can start working on the microtasks to get a better idea of the project. Let us know if you have any doubts. 🙂
@Venkatavaradan-R, this is not on a first-come-first-serve basis. You have to submit a proposal as per the GSoC Guidelines and also attempt at least one microtask. |
For all students interested in this idea, please comment on this issue to get in touch with the mentors. This is the main communication channel. |
Hello @vchrombie I'm trying to understand the problem statement.
The goal of the project is to show "recommendations of profiles which might be the same" to the user through the UI. We have to create the recommendation system using machine learning / AI. Am I correct? |
No. No machine learning/AI. It's very complicated for a project of one month. It's more important to use the methods that we already know, to improve them and to find a way to visualize that data with the current UI. |
Hello @sduenas, I was trying the new SortingHat server and UI. Can you give me some data to test the GraphQL console? and |
Hi @rohanreddych, I understand you are asking about microtask-6. Let me know if I'm wrong.
You can use the data that was stored in the database after the microtask-5.
You can create an admin user using We will try to add some minimal documentation soon. |
Thanks, @vchrombie. I was able to enter data through UI and retrieve it through the graphql console.
@sduenas , @evamillan, @mafesan |
Hello, |
Hi, @SteveKola @Venkatavaradan-R @darmis007 @galexad @rohanreddych @WasimAkhtarKhan The main reason behind the microtasks is, these tasks will give a good minimum understanding of the Sorting Hat tool as well as the GrimoireLab platform as a whole. It will be really helpful for writing your proposal. If you haven't started working on the microtasks yet, I would suggest you start asap. You can create a github repository for storing the microtasks and you can open issues in that repo for asking doubts or reviewing the tasks. Thanks. |
Hello @vchrombie , I have gone through tutorials of GrimoireLab platform and SortingHat, setup the dev environment, and have started working on microtasks. It is very much interesting. |
Hello @vchrombie @sduenas, if there is anything I am doing wrong, please correct me. |
That's because you are running the grimoirelab container. Use docker-compose to run the Correct me if I am mistaken. 😄 |
You have to use the source code and docker method for setting up the dev environment.
The image looks good to me. It is the expected output. |
Ok I will set up with source code and docker method
Ok Nice. |
Hello @vchrombie , @rohanreddych
After executing
When I try to resolve Package requirement "grimoirelab-elk" is not satisfied by terminal and by Install Requirement |
Port 9200, so there is a problem with Elasticsearch; I don't think you have Elasticsearch running. Is there an Elasticsearch instance running on your system? To solve this, I recommend using docker-compose as I said in the previous comment.
I got similar errors related to requirements. Can you create a new virtualenv and do |
Looks like the elasticsearch is not running at the required port. Can you confirm if there are no errors in the logs of docker-compose? You should have elasticsearch, kibiter, and MariaDB/MySQL running in the respective ports.
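A quick way to confirm which services are reachable is a small socket check. This is only a sketch: port 9200 comes from the error discussed above, while 5601 (Kibiter) and 3306 (MariaDB/MySQL) are assumed defaults, and `port_is_open` and `check_services` are hypothetical helper names. Adjust the ports to match your docker-compose file.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Assumed default ports; only 9200 is mentioned in this thread.
SERVICES = {"elasticsearch": 9200, "kibiter": 5601, "mariadb": 3306}


def check_services(host: str = "localhost", services: dict = SERVICES) -> dict:
    """Map each service name to whether its port accepts connections."""
    return {name: port_is_open(host, port) for name, port in services.items()}
```

Calling `check_services()` after `docker-compose up` returns a dictionary such as `{"elasticsearch": ..., "kibiter": ..., "mariadb": ...}`; any `False` entry is the service to look at first in the docker-compose logs. An open port only shows that something is listening, not that the service is healthy.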
You have to set the Project Structure too, along with Project Interpreter. The Project Structure should have all the grimoirelab repositories.
@rohanreddych, this works, but it would be better to use the source code of elk instead of the pip package. The grimoirelab dependencies must be loaded using Project Structure, whereas the rest of the dependencies should be installed using Project Interpreter. For example, I'm providing an excerpt from the SirMordred requirements.txt file.
In this case
If you are installing everything using |
Hello @vchrombie
When I execute I tried to use docker-compose-without-searchguard.
Yes, I have set the Project Structure along with the Project Interpreter for all grimoirelab components as per Setting up PyCharm |
@WasimAkhtarKhan Elasticsearch has not started, that's why Kibana is showing that warning. Stop all the containers once, then run |
Thank you @rohanreddych |
Thanks, @rohanreddych for helping.
Cool, but I think you might need to change the es endpoints in the setup.cfg file if you are using without searchguard.
The |
Ok I'll. |
Hi @vchrombie |
Hi @WasimAkhtarKhan, can you confirm the full command which you are trying to execute? The configuration should be something like We can have a quick chat in case it is not solved. |
I see you have mentioned only |
Ok |
Hello @vchrombie, I think we need to improve the error message here. Should I start working on this? |
No problem @WasimAkhtarKhan, please let me know if that solved the issue.
Interesting. @rohanreddych, would you be interested to submit a PR for it? |
@vchrombie , facing this error. Did I forget to do any step? |
I am facing this error @vchrombie after running File "/home/wasim/.cache/pypoetry/virtualenvs/sortinghat-z07j1bw2-py3.8/lib/python3.8/site-packages/MySQLdb/connections.py", line 204, in init I also face this after changing the password in safe mode, flushing privileges, and then running When I execute |
Hi @rohanreddych, looks like there is some issue with sortinghat. The error comes from here, sortinghat_gelk.py#L74. Do you have MySQL/MariaDB installed on your machine? If yes, then there should be something wrong with the configurations. |
Yes, I have mysql installed. I do
How? IRC or matrix? |
Hi @WasimAkhtarKhan, it looks like a db connection issue. You have to update the |
We can connect at |
Ok @vchrombie |
Hello @vchrombie , I installed mariadb
I tried running Tried running again, which gave the following error:
Hi @rohanreddych, the logs look fine to me. You can check if the indexes are created using
You can execute the micro-mordred with the
This error was caused by using the wrong version of SQLAlchemy; please use >1.2 and <1.4. This error can be resolved using this https://github.com/chaoss/grimoirelab-sirmordred/blob/master/Getting-Started.md#empty-index- . |
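To check whether an installed SQLAlchemy falls inside the range suggested above, a small helper like the following could be used. It is a rough sketch: `parse_version` and `sqlalchemy_version_ok` are hypothetical names, and the parsing is deliberately simplistic (it keeps only the leading digits of each dot-separated part), so it is not a full version-specifier implementation.

```python
def parse_version(version: str) -> tuple:
    """Turn '1.3.24' into (1, 3, 24), keeping only leading digits per part."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def sqlalchemy_version_ok(version: str) -> bool:
    """True when the version is >1.2 and <1.4, the range given in this thread."""
    return (1, 2) < parse_version(version) < (1, 4)


# Possible usage (Python 3.8+, requires SQLAlchemy to be installed):
#   from importlib.metadata import version
#   print(sqlalchemy_version_ok(version("SQLAlchemy")))
```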
@sduenas , @evamillan, @mafesan |
The main purpose of the idea is to have useful recommendations for the system. The first problem we want to address is which recommendations we really need. Right now, we have those you mentioned but maybe they need to be reformulated. For example, the current recommendations for affiliations answer the question "at what organization do these people work?" but maybe it's more useful to answer "who works for this organization?". In the case of the individuals, the recommendations are based on two sets of individuals to try to answer the question "which individuals are the same from these two groups?". Maybe we can create a higher layer which only does that for those individuals that weren't found before, or for those that work for a certain company, or maybe for the new ones that are added to the database. Also, we can expand the way we find these recommendations. For example, now we do some basic matching: if the email address is the same, then the individuals are the same. The problem gets more complicated when a person has changed from one job to another, so their email addresses are different now.
The second problem is how to visualize all this information in the UI and how the user can accept those recommendations.
All of these should be part of your proposal. You have to keep in mind that the work will last only one month or so, so try to adapt your proposal for that. Don't try to include too many things. Try to be reasonable according to the effort and time restrictions. More doesn't necessarily mean better. |
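The basic "same email address means same individual" matching between two groups could be sketched as follows. This is an illustration only, not SortingHat's actual implementation: the `Individual` class, `normalize`, and `recommend_matches` are hypothetical names, and lowercasing emails before comparing is an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Individual:
    """Hypothetical, simplified stand-in for a unified profile."""
    uuid: str
    emails: set = field(default_factory=set)


def normalize(email: str) -> str:
    # Assumption: emails are compared case-insensitively.
    return email.strip().lower()


def recommend_matches(group_a, group_b):
    """Return sorted (uuid_a, uuid_b) pairs that share at least one email."""
    by_email = defaultdict(set)
    for ind in group_b:
        for email in ind.emails:
            by_email[normalize(email)].add(ind.uuid)

    pairs = set()
    for ind in group_a:
        for email in ind.emails:
            for other in by_email.get(normalize(email), ()):
                if other != ind.uuid:
                    pairs.add((ind.uuid, other))
    return sorted(pairs)
```

Indexing one group by email keeps the comparison close to linear in the number of addresses instead of comparing every pair of individuals. It also shows the limitation mentioned above: two profiles of a person who changed jobs share no address, so this rule alone never recommends them.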
Hi everyone, the student application period has started and the deadline is 13 April 2021, 18:00 UTC. GSoC Timeline Please continue working on the proposal and complete as many microtasks as possible. Please let us know if you need any help with doubts or reviewing the microtasks. |
Hello @sduenas, According to the GSOC page, the program would hold for two months. 10 weeks, to be exact. |
The program has changed a bit. I will post the brief timeline here, to avoid any confusion. Student Application Period: March 29, 2021 - April 13, 2021 |
Thanks to everyone who showed interest in applying for this idea and worked on making a proposal and the microtasks. It was great working with you. As the final steps, please make sure you submit your proposal on the Google Summer of Code website and you also need to open a PR adding your name and details to the GSoC-interest.md file in order to qualify as an interested candidate. Both have to be completed before the deadline mentioned on the GSoC website. Thanks once again! All the best. |
Thanks, everyone for participating in this idea! Unfortunately, no students have been selected for this idea despite the good applications from @SteveKola, @rohanreddych, and @WasimAkhtarKhan. We appreciate your time and effort in working on the project idea. It would be great if you would like to keep contributing to GrimoireLab. If you have any questions, comments, or concerns about the selection process, feel free to write here or send an email to [email protected]. Thanks! This issue is going to be closed on Friday. |
Most Welcome @vchrombie |
The new version of SortingHat provides a basic recommender system. It tells information about what identities could be the same, or what identities work for which companies. This information might not be useful for the end-user and it isn't available on the UI, though.
SortingHat is the tool that we use to manage identities data in GrimoireLab. As individuals in a project can have different identities - several usernames or email addresses - this tool allows creating unified profiles of them. Then, the platform will use this information to generate accurate results of the activity of these participants.
SortingHat started as a command-line tool but after some years, we saw its potential and we decided to create a new version, this time as a service. This new version provides a new GraphQL API to operate with the server and a UI web-based app, that replaces Hatstall, the old UI for SortingHat.
Although its development is at a late stage and it will soon be ready for the stable version of the platform, there are many good ideas that we would like to incorporate. Some of them were selected for GSoC 2021.
The aims of the project are as follows:
The aims will require generating code in Python for Django and the GraphQL API, and for the web app (generated with Vue.js and Vuetify).
Microtasks
For becoming familiar with GrimoireLab, you can start by reading some documentation. You can find useful information at:
Once you're familiar with GrimoireLab, you can have a look at the following microtasks.
Microtask 0:
Download PyCharm and get familiar with it (for instance, you can follow this tutorial).
Microtask 1:
Set up a dev environment to work on GrimoireLab. Have a look at chaoss/grimoirelab-sirmordred - Getting-Started.md.
Microtask 2:
Execute micro-mordred to collect, enrich and visualize data from Git repositories.
Microtask 3:
Based on the elasticsearch documents produced by micro-mordred and source code of chaoss/grimoirelab-elk, try to answer the following questions:
author_id?
author_org_name?
author_uuid?
author_domain?
uuid?
utc_commit?
origin?
Microtask 4:
Set up the developer environment of SortingHat (muggle branch).
NOTE: The sortinghat muggle branch is a WIP branch. As of now, it doesn't work with the core of the GrimoireLab platform but we hope to have it ready soon.
Microtask 5:
![](https://camo.githubusercontent.com/74e705dbd3c7e31eb37ca8114f8d7ff967e6b257887c92987c5955b1a37cafb4/68747470733a2f2f69302e77702e636f6d2f626c6f672e62697465726769612e636f6d2f77702d636f6e74656e742f75706c6f6164732f323032302f30322f736f7274696e676861742d6d756c7469706c652d6964656e7469746965732d657861706c652e706e673f726573697a653d3736382532433434382673736c3d31)
Create a sample profile with different identities and enrollments using the SortingHat UI.
Microtask 6:
Using the SortingHat GraphQL Console, create a query that fetches the data (identities, enrollments) of an individual profile.
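As a starting point, the general shape of such a request can be sketched in Python. Everything schema-specific here is an assumption made for illustration: the `individuals` query, its `filters` argument, and the field names inside `entities` should be checked against the schema documentation shown in the GraphQL console, and `build_payload` is a hypothetical helper. Only the `{"query": ..., "variables": ...}` JSON body is the common GraphQL-over-HTTP convention.

```python
import json

# ASSUMPTION: query and field names are illustrative, not confirmed by this
# thread. Verify them in the console's schema explorer before relying on them.
QUERY = """
query Individual($uuid: String) {
  individuals(filters: {uuid: $uuid}) {
    entities {
      mk
      identities { uuid name email username source }
      enrollments { start end }
    }
  }
}
"""


def build_payload(query: str, variables: dict) -> str:
    """Serialize a GraphQL request body as JSON."""
    return json.dumps({"query": query, "variables": variables})
```

In the console itself only the query text and the variables are needed; the JSON body matters when the same query is sent from a script.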
Microtask 7:
Create a script that can parse the gitdm developer affiliation files and load the data in a SortingHat database using GraphQL.
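The parsing half of this task could start from a sketch like the one below. It assumes the email-map style of gitdm files, one `email employer [< yyyy-mm-dd]` entry per line with `#` comments; other gitdm-derived affiliation formats exist and would need a different parser. The `Affiliation` class and the function names are hypothetical, and loading the results into SortingHat through GraphQL is left out.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Affiliation:
    email: str
    organization: str
    end: Optional[date] = None  # date until which the affiliation applies


def parse_line(line: str) -> Optional[Affiliation]:
    """Parse one 'email employer [< yyyy-mm-dd]' line (assumed format)."""
    line = line.split("#", 1)[0].strip()  # drop comments and blank lines
    if not line:
        return None
    end = None
    if "<" in line:
        line, raw_end = line.rsplit("<", 1)
        end = date.fromisoformat(raw_end.strip())
    parts = line.split(None, 1)  # email first, the rest is the employer
    if len(parts) != 2:
        raise ValueError(f"cannot parse affiliation line: {line!r}")
    email, organization = parts
    return Affiliation(email.strip(), organization.strip(), end)


def parse_affiliations(text: str) -> list:
    """Parse a whole file's content, skipping comments and empty lines."""
    entries = []
    for line in text.splitlines():
        entry = parse_line(line)
        if entry is not None:
            entries.append(entry)
    return entries
```

Keeping the parser separate from the GraphQL calls makes it easy to test against sample files before touching the database.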
Microtask 8:
Improve the visualization of the individualCards component. You need not send a PR; please update the work in your personal fork.
Microtask 9:
Submit a PR to any of the GrimoireLab components to increase the test coverage of one or more files of the source code.
Microtask 10:
Submit at least a PR to one of the GrimoireLab repositories to fix an issue, improve the documentation, etc. Some good-first-issues are: